[PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets

public inbox for gdb-patches@sourceware.org

* [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets
@ 2021-01-08  4:17 Simon Marchi
  2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches

This series is made of two sets of changes.  The first 4 patches are the
new version of this series:

  https://sourceware.org/pipermail/gdb-patches/2020-December/174271.html

The rationale remains the same.

The main change is in patch 1.  Previously, the remote target was only
tracking the resume state in non-stop, now it does it even in all-stop.
This was changed because it is useful for patch 5, which is the new
version of this patch:

  https://sourceware.org/pipermail/gdb-patches/2020-December/174277.html

I included patch 5 in this series because it uses the infrastructure
added by patch 1.

I also think that making the remote target track the resume state in all
modes, as opposed to just in non-stop mode, is less confusing in the
long run.

Since patches 3 and 4 may be a bit more controversial and require more
discussion, it would be possible to merge only 1, 2 and 5 and discuss
the others separately.  Or even just 1 and 5.

This series is called "v3" because what's now patch 5 was already at v2,
so I thought it was less confusing this way.

Andrew Burgess (1):
  gdb: better handling of 'S' packets

Simon Marchi (4):
  gdb: make the remote target track its own thread resume state
  gdb: remove target_ops::commit_resume implementation in
    record-{btrace,full}.c
  gdb: move commit_resume to process_stratum_target
  gdb: generalize commit_resume, avoid commit-resuming when threads have
    pending statuses

 gdb/infcmd.c                                  |   8 +
 gdb/infrun.c                                  | 126 ++++-
 gdb/infrun.h                                  |  41 ++
 gdb/linux-nat.c                               |   5 +
 gdb/mi/mi-main.c                              |   2 +
 gdb/process-stratum-target.c                  |  14 +
 gdb/process-stratum-target.h                  |  38 ++
 gdb/record-btrace.c                           |  11 -
 gdb/record-full.c                             |  20 +-
 gdb/remote.c                                  | 431 +++++++++++++-----
 gdb/target-delegates.c                        |  22 -
 gdb/target.c                                  |  22 -
 gdb/target.h                                  |  20 -
 .../gdb.server/stop-reply-no-thread-multi.c   |  77 ++++
 .../gdb.server/stop-reply-no-thread-multi.exp | 136 ++++++
 15 files changed, 740 insertions(+), 233 deletions(-)
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp

-- 
2.29.2


* [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
@ 2021-01-08  4:17 ` Simon Marchi
  2021-01-08 15:41   ` Pedro Alves
  2021-01-18  5:16   ` Sebastian Huber
  2021-01-08  4:17 ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace, full}.c Simon Marchi
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Pedro Alves, Simon Marchi

From: Simon Marchi <simon.marchi@efficios.com>

New in v3: track the state even in all-stop.

The next patch moves the target commit_resume method to be a
process_stratum_target-only method.  The only non-process targets that
currently implement the commit_resume method are the btrace and full
record targets.  The only reason they need to do so is to prevent a
commit resume from reaching the beneath (process) target if they are
currently replaying.

This is important if a record target is used on top of the remote target
(the only process target implementing the commit_resume method).
Currently, the remote target checks the `thread_info::executing` flag of
a thread to know if it should commit resume that thread:

    if (!tp->executing || remote_thr->vcont_resumed)
      continue;

The `tp->executing` flag is set by infrun when it has asked the target
stack to resume the thread, so it reflects whether the thread is
executing from infrun's point of view.  It is _not_ equivalent to
whether the remote target was asked to resume this thread.

Indeed, if infrun asks the target stack to resume some thread while the
record target is replaying, the record target won't forward the resume
request to the remote target beneath, because we don't actually want to
resume the thread on the execution target.  But the `tp->executing` flag
is still set, because from the point of view of infrun, the thread
executes.  So, if the commit_resume call wasn't intercepted by the
record target as it is today and did reach the remote target, the remote
target would say "Oh, this thread should be executing and I haven't
vCont-resumed it!  I must vCont-resume it!".  But that would be wrong,
because it was never asked to resume this thread, the resume request did
not reach it.  This is why the record targets currently need to
implement commit_resume: to prevent the beneath target from
commit_resuming threads it wasn't asked to resume.
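
To make the mismatch concrete, here is a minimal standalone sketch.  It
is not GDB code: the toy_thread, toy_remote and toy_record types and the
toy_* functions are hypothetical stand-ins, and the only assumption is
the behavior described above (a replaying record target does not forward
resume requests, while infrun still marks the thread as executing).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are GDB types.
struct toy_thread
{
  bool executing = false;         // Maintained by "infrun".
  bool resume_requested = false;  // Maintained by the "remote target".
};

struct toy_remote
{
  // Runs only when a resume request actually reaches this target.
  void resume (toy_thread &t) { t.resume_requested = true; }
};

struct toy_record
{
  bool replaying;
  toy_remote &beneath;

  // While replaying, the request is not forwarded to the target beneath.
  void resume (toy_thread &t)
  {
    if (!replaying)
      beneath.resume (t);
  }
};

// "infrun" asks the top of the stack to resume, then marks the thread as
// executing regardless of which target handled the request.
void toy_infrun_resume (toy_record &top, toy_thread &t)
{
  top.resume (t);
  t.executing = true;
}

// Number of threads a commit_resume would act on if it trusted infrun's
// flag.
int toy_count_by_executing (const std::vector<toy_thread> &threads)
{
  int n = 0;
  for (const toy_thread &t : threads)
    if (t.executing)
      n++;
  return n;
}

// Same, but relying only on the requests the remote target itself saw.
int toy_count_by_own_state (const std::vector<toy_thread> &threads)
{
  int n = 0;
  for (const toy_thread &t : threads)
    if (t.resume_requested)
      n++;
  return n;
}
```

With a replaying toy_record on top, resuming two threads through
toy_infrun_resume leaves both flagged as executing, yet the toy remote
target never saw a request for either of them, so only the second
counting function gives the answer the remote target should act on.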

Since commit_resume will become a method on process_stratum_target in
the following patch, record targets won't have a chance to intercept the
calls and that would result in the remote target commit_resuming threads
it shouldn't.  To avoid this, this patch makes the remote target track
its own thread resumption state.  That means, tracking which threads it
was asked to resume via target_ops::resume.  Regardless of the context
of this patch, I think this change makes it easier to understand how
resume / commit_resume works in the remote target.  It makes the target
more self-contained, as it only depends on what it gets asked to do via
the target methods, and not on tp->executing, which is a flag maintained
from the point of view of infrun.

I initially made it so this state was only used when the remote target
operates in non-stop mode, since commit_resume is only used when the
target is non-stop.  However, it's more consistent and it can be useful
to maintain this state in all-stop as well.  In all-stop, receiving a
stop notification for one thread means all threads of the target are
considered stopped.
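
As a rough illustration of that difference, the following standalone
sketch marks threads as no longer resumed when a stop is reported.  The
toy_thread type and toy_process_stop function are hypothetical, and the
sketch ignores everything else a real stop reply carries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread record; not a GDB type.
struct toy_thread
{
  bool resumed = true;
};

// Handle a stop reported for THREADS[STOPPED].  In non-stop mode only that
// thread is considered stopped; in all-stop mode every thread of the target
// is considered stopped.
void toy_process_stop (std::vector<toy_thread> &threads, std::size_t stopped,
                       bool non_stop)
{
  if (non_stop)
    threads.at (stopped).resumed = false;
  else
    for (toy_thread &t : threads)
      t.resumed = false;
}
```

The same stop notification therefore clears one entry in non-stop mode
and all entries in all-stop mode.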

From the point of view of the remote target, there are three states a
thread can be in:

 1. not resumed
 2. resumed but pending vCont-resume
 3. resumed

State 2 only exists when the target is non-stop.

As of this patch, valid state transitions are:

 - 1 -> 2 (through the target resume method if in non-stop)
 - 2 -> 3 (through the target commit_resume method if in non-stop)
 - 1 -> 3 (through the target resume method if in all-stop)
 - 3 -> 1 (through a remote stop notification / reporting an event to the
   event loop)
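
The transitions listed above can be sketched as a tiny standalone state
machine.  This is only an illustration with hypothetical names
(toy_resume_state, toy_remote_thread); it is deliberately stricter than
the real code in that it rejects any transition not in the list.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical miniature of the three states described above.
enum class toy_resume_state
{
  NOT_RESUMED,            // State 1.
  RESUMED_PENDING_VCONT,  // State 2, non-stop only.
  RESUMED,                // State 3.
};

struct toy_remote_thread
{
  toy_resume_state state = toy_resume_state::NOT_RESUMED;

  // Target resume method: 1 -> 2 in non-stop, 1 -> 3 in all-stop.
  void resume (bool non_stop)
  {
    if (state != toy_resume_state::NOT_RESUMED)
      throw std::logic_error ("resume: thread is already resumed");

    state = (non_stop
             ? toy_resume_state::RESUMED_PENDING_VCONT
             : toy_resume_state::RESUMED);
  }

  // Target commit_resume method: 2 -> 3.  Threads in other states are left
  // alone, since no action needs to be sent for them.
  void commit_resume ()
  {
    if (state == toy_resume_state::RESUMED_PENDING_VCONT)
      state = toy_resume_state::RESUMED;
  }

  // Stop notification: 3 -> 1.
  void stop_reported ()
  {
    if (state != toy_resume_state::RESUMED)
      throw std::logic_error ("stop_reported: thread is not resumed");

    state = toy_resume_state::NOT_RESUMED;
  }
};
```

A non-stop thread goes through resume, commit_resume and stop_reported
in that order, while an all-stop thread skips the pending state
entirely.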

A subsequent patch will make it possible to go from 2 to 1, in case
infrun asks to stop a thread that was resumed but not commit-resumed
yet.  I don't think it can happen as of now.

In terms of code, this patch replaces the vcont_resumed field with an
enumeration that explicitly represents the three states described above.
The last_resume_sig and last_resume_step fields are moved to a structure
which is clearly identified as only used when the thread is in the
"resumed but pending vCont-resume" state.

gdb/ChangeLog:

        * remote.c (enum class resume_state): New.
        (struct resumed_pending_vcont_info): New.
        (struct remote_thread_info) <resume_state, set_not_resumed,
	set_resumed_pending_vcont, resumed_pending_vcont_info,
	set_resumed, m_resume_state, m_resumed_pending_vcont_info>:
	New.
	<last_resume_step, last_resume_sig, vcont_resumed>: Remove.
        (remote_target::remote_add_thread): Adjust.
        (remote_target::process_initial_stop_replies): Adjust.
        (remote_target::resume): Adjust.
        (remote_target::commit_resume): Rely on state in
	remote_thread_info and not on tp->executing.
        (remote_target::process_stop_reply): Adjust.

Change-Id: I10480919ccb4552faa62575e447a36dbe7c2d523
---
 gdb/remote.c | 167 ++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 131 insertions(+), 36 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index 6dacc24307ea..f8150f39fb5c 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -1050,6 +1050,38 @@ static struct cmd_list_element *remote_show_cmdlist;
 
 static bool use_range_stepping = true;
 
+/* From the remote target's point of view, each thread is in one of these three
+   states.  */
+enum class resume_state
+{
+  /* Not resumed - we haven't been asked to resume this thread.  */
+  NOT_RESUMED,
+
+  /* We have been asked to resume this thread, but haven't sent a vCont action
+     for it yet.  We'll need to consider it next time commit_resume is
+     called.  */
+  RESUMED_PENDING_VCONT,
+
+  /* We have been asked to resume this thread, and we have sent a vCont action
+     for it.  */
+  RESUMED,
+};
+
+/* Information about a thread's pending vCont-resume.  Used when a thread is in
+   the resume_state::RESUMED_PENDING_VCONT state.  remote_target::resume
+   stores this information which is then picked up by
+   remote_target::commit_resume to know which is the proper action for this
+   thread to include in the vCont packet.  */
+struct resumed_pending_vcont_info
+{
+  /* True if the last resume call for this thread was a step request, false
+     if a continue request.  */
+  bool step;
+
+  /* The signal specified in the last resume call for this thread.  */
+  gdb_signal sig;
+};
+
 /* Private data that we'll store in (struct thread_info)->priv.  */
 struct remote_thread_info : public private_thread_info
 {
@@ -1068,23 +1100,61 @@ struct remote_thread_info : public private_thread_info
      to stop for a watchpoint.  */
   CORE_ADDR watch_data_address = 0;
 
-  /* Fields used by the vCont action coalescing implemented in
-     remote_resume / remote_commit_resume.  remote_resume stores each
-     thread's last resume request in these fields, so that a later
-     remote_commit_resume knows which is the proper action for this
-     thread to include in the vCont packet.  */
+  /* Get the thread's resume state.  */
+  enum resume_state resume_state () const
+  {
+    return m_resume_state;
+  }
 
-  /* True if the last target_resume call for this thread was a step
-     request, false if a continue request.  */
-  int last_resume_step = 0;
+  /* Put the thread in the NOT_RESUMED state.  */
+  void set_not_resumed ()
+  {
+    m_resume_state = resume_state::NOT_RESUMED;
+  }
 
-  /* The signal specified in the last target_resume call for this
-     thread.  */
-  gdb_signal last_resume_sig = GDB_SIGNAL_0;
+  /* Put the thread in the RESUMED_PENDING_VCONT state.  */
+  void set_resumed_pending_vcont (bool step, gdb_signal sig)
+  {
+    m_resume_state = resume_state::RESUMED_PENDING_VCONT;
+    m_resumed_pending_vcont_info.step = step;
+    m_resumed_pending_vcont_info.sig = sig;
+  }
+
+  /* Get the information about this thread's pending vCont-resumption.
 
-  /* Whether this thread was already vCont-resumed on the remote
-     side.  */
-  int vcont_resumed = 0;
+     Must only be called if the thread is in the RESUMED_PENDING_VCONT resume
+     state.  */
+  const struct resumed_pending_vcont_info &resumed_pending_vcont_info () const
+  {
+    gdb_assert (m_resume_state == resume_state::RESUMED_PENDING_VCONT);
+
+    return m_resumed_pending_vcont_info;
+  }
+
+  /* Put the thread in the RESUMED state.  */
+  void set_resumed ()
+  {
+    m_resume_state = resume_state::RESUMED;
+  }
+
+private:
+  /* Resume state for this thread.  This is used to implement vCont action
+     coalescing (only when the target operates in non-stop mode).
+
+     remote_target::resume moves the thread to the RESUMED_PENDING_VCONT state,
+     which notes that this thread must be considered in the next commit_resume
+     call.
+
+     remote_target::commit_resume sends a vCont packet with actions for the
+     threads in the RESUMED_PENDING_VCONT state and moves them to the
+     RESUMED state.
+
+     When reporting a stop to the core for a thread, that thread is moved back
+     to the NOT_RESUMED state.  */
+  enum resume_state m_resume_state = resume_state::NOT_RESUMED;
+
+  /* Extra info used if the thread is in the RESUMED_PENDING_VCONT state.  */
+  struct resumed_pending_vcont_info m_resumed_pending_vcont_info;
 };
 
 remote_state::remote_state ()
@@ -2443,7 +2513,10 @@ remote_target::remote_add_thread (ptid_t ptid, bool running, bool executing)
   else
     thread = add_thread (this, ptid);
 
-  get_remote_thread_info (thread)->vcont_resumed = executing;
+  /* We start by assuming threads are resumed.  That state then gets updated
+     when we process a matching stop reply.  */
+  get_remote_thread_info (thread)->set_resumed ();
+
   set_executing (this, ptid, executing);
   set_running (this, ptid, running);
 
@@ -4472,7 +4545,7 @@ remote_target::process_initial_stop_replies (int from_tty)
 
       set_executing (this, event_ptid, false);
       set_running (this, event_ptid, false);
-      get_remote_thread_info (evthread)->vcont_resumed = 0;
+      get_remote_thread_info (evthread)->set_not_resumed ();
     }
 
   /* "Notice" the new inferiors before anything related to
@@ -6307,9 +6380,9 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
      individually.  Resuming remote threads directly in target_resume
      would thus result in sending one packet per thread.  Instead, to
      minimize roundtrip latency, here we just store the resume
-     request; the actual remote resumption will be done in
-     target_commit_resume / remote_commit_resume, where we'll be able
-     to do vCont action coalescing.  */
+     request (put the thread in RESUMED_PENDING_VCONT state); the actual remote
+     resumption will be done in remote_target::commit_resume, where we'll be
+     able to do vCont action coalescing.  */
   if (target_is_non_stop_p () && ::execution_direction != EXEC_REVERSE)
     {
       remote_thread_info *remote_thr;
@@ -6319,8 +6392,11 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
       else
 	remote_thr = get_remote_thread_info (this, ptid);
 
-      remote_thr->last_resume_step = step;
-      remote_thr->last_resume_sig = siggnal;
+      /* We don't expect the core to ask to resume an already resumed (from
+         its point of view) thread.  */
+      gdb_assert (remote_thr->resume_state () == resume_state::NOT_RESUMED);
+
+      remote_thr->set_resumed_pending_vcont (step, siggnal);
       return;
     }
 
@@ -6339,6 +6415,10 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
   if (!remote_resume_with_vcont (ptid, step, siggnal))
     remote_resume_with_hc (ptid, step, siggnal);
 
+  /* Update resumed state tracked by the remote target.  */
+  for (thread_info *tp : all_non_exited_threads (this, ptid))
+    get_remote_thread_info (tp)->set_resumed ();
+
   /* We are about to start executing the inferior, let's register it
      with the event loop.  NOTE: this is the one place where all the
      execution commands end up.  We could alternatively do this in each
@@ -6562,9 +6642,11 @@ remote_target::commit_resume ()
 
   for (thread_info *tp : all_non_exited_threads (this))
     {
+      remote_thread_info *priv = get_remote_thread_info (tp);
+
       /* If a thread of a process is not meant to be resumed, then we
 	 can't wildcard that process.  */
-      if (!tp->executing)
+      if (priv->resume_state () == resume_state::NOT_RESUMED)
 	{
 	  get_remote_inferior (tp->inf)->may_wildcard_vcont = false;
 
@@ -6593,24 +6675,24 @@ remote_target::commit_resume ()
     {
       remote_thread_info *remote_thr = get_remote_thread_info (tp);
 
-      if (!tp->executing || remote_thr->vcont_resumed)
+      /* If the thread was previously vCont-resumed, no need to send a specific
+	 action for it.  If we didn't receive a resume request for it, don't
+	 send an action for it either.  */
+      if (remote_thr->resume_state () != resume_state::RESUMED_PENDING_VCONT)
 	continue;
 
       gdb_assert (!thread_is_in_step_over_chain (tp));
 
-      if (!remote_thr->last_resume_step
-	  && remote_thr->last_resume_sig == GDB_SIGNAL_0
-	  && get_remote_inferior (tp->inf)->may_wildcard_vcont)
-	{
-	  /* We'll send a wildcard resume instead.  */
-	  remote_thr->vcont_resumed = 1;
-	  continue;
-	}
+      const resumed_pending_vcont_info &info
+	= remote_thr->resumed_pending_vcont_info ();
 
-      vcont_builder.push_action (tp->ptid,
-				 remote_thr->last_resume_step,
-				 remote_thr->last_resume_sig);
-      remote_thr->vcont_resumed = 1;
+      /* Check if we need to send a specific action for this thread.  If not,
+         it will be included in a wildcard resume instead.  */
+      if (info.step || info.sig != GDB_SIGNAL_0
+	  || !get_remote_inferior (tp->inf)->may_wildcard_vcont)
+	vcont_builder.push_action (tp->ptid, info.step, info.sig);
+
+      remote_thr->set_resumed ();
     }
 
   /* Now check whether we can send any process-wide wildcard.  This is
@@ -7764,7 +7846,20 @@ remote_target::process_stop_reply (struct stop_reply *stop_reply,
       remote_thr->core = stop_reply->core;
       remote_thr->stop_reason = stop_reply->stop_reason;
       remote_thr->watch_data_address = stop_reply->watch_data_address;
-      remote_thr->vcont_resumed = 0;
+
+      if (target_is_non_stop_p ())
+	{
+	  /* If the target works in non-stop mode, a stop-reply indicates that
+	     only this thread stopped.  */
+	  remote_thr->set_not_resumed ();
+	}
+      else
+	{
+	  /* If the target works in all-stop mode, a stop-reply indicates that
+	     all the target's threads stopped.  */
+	  for (thread_info *tp : all_non_exited_threads (this))
+	    get_remote_thread_info (tp)->set_not_resumed ();
+	}
     }
 
   delete stop_reply;
-- 
2.29.2



* [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace, full}.c
  2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
  2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
@ 2021-01-08  4:17 ` Simon Marchi
  2021-01-08 15:43   ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace,full}.c Pedro Alves
  2021-01-08  4:17 ` [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target Simon Marchi
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Pedro Alves, Simon Marchi

From: Simon Marchi <simon.marchi@efficios.com>

The previous patch made the commit_resume implementations in the record
targets unnecessary, as the remote target's commit_resume implementation
won't commit-resume threads for which it didn't see a resume.  This
patch removes them.

gdb/ChangeLog:

        * record-btrace.c (class record_btrace_target)
        <commit_resume>: Remove.
        (record_btrace_target::commit_resume): Remove.
        * record-full.c (class record_full_target)
        <commit_resume>: Remove.
        (record_full_target::commit_resume): Remove.

Change-Id: I3a68d3d726fb09d8b7165b4edefc330d27803b27
---
 gdb/record-btrace.c | 11 -----------
 gdb/record-full.c   | 10 ----------
 2 files changed, 21 deletions(-)

diff --git a/gdb/record-btrace.c b/gdb/record-btrace.c
index 5924e83898a0..81686ee867b7 100644
--- a/gdb/record-btrace.c
+++ b/gdb/record-btrace.c
@@ -116,7 +116,6 @@ class record_btrace_target final : public target_ops
 
   const struct frame_unwind *get_tailcall_unwinder () override;
 
-  void commit_resume () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
 
@@ -2206,16 +2205,6 @@ record_btrace_target::resume (ptid_t ptid, int step, enum gdb_signal signal)
     }
 }
 
-/* The commit_resume method of target record-btrace.  */
-
-void
-record_btrace_target::commit_resume ()
-{
-  if ((::execution_direction != EXEC_REVERSE)
-      && !record_is_replaying (minus_one_ptid))
-    beneath ()->commit_resume ();
-}
-
 /* Cancel resuming TP.  */
 
 static void
diff --git a/gdb/record-full.c b/gdb/record-full.c
index 5ed9c1a428b1..22eaaa4bb1bc 100644
--- a/gdb/record-full.c
+++ b/gdb/record-full.c
@@ -267,7 +267,6 @@ class record_full_target final : public record_full_base_target
   const target_info &info () const override
   { return record_full_target_info; }
 
-  void commit_resume () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   void disconnect (const char *, int) override;
   void detach (inferior *, int) override;
@@ -1103,15 +1102,6 @@ record_full_target::resume (ptid_t ptid, int step, enum gdb_signal signal)
     target_async (1);
 }
 
-/* "commit_resume" method for process record target.  */
-
-void
-record_full_target::commit_resume ()
-{
-  if (!RECORD_FULL_IS_REPLAY)
-    beneath ()->commit_resume ();
-}
-
 static int record_full_get_sig = 0;
 
 /* SIGINT signal handler, registered by "wait" method.  */
-- 
2.29.2



* [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target
  2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
  2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
  2021-01-08  4:17 ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace, full}.c Simon Marchi
@ 2021-01-08  4:17 ` Simon Marchi
  2021-01-08 18:12   ` Andrew Burgess
  2021-01-09 20:29   ` Pedro Alves
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
  2021-01-08  4:17 ` [PATCH v3 5/5] gdb: better handling of 'S' packets Simon Marchi
  4 siblings, 2 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Pedro Alves, Simon Marchi

From: Simon Marchi <simon.marchi@efficios.com>

The following patch will change the commit_resume target method to
something stateful.  Because it would be difficult to track a state
replicated in the various targets of a target stack, and since for the
foreseeable future, only process stratum targets are going to use this
concept, this patch makes the commit resume concept specific to process
stratum targets.

So, move the method to process_stratum_target, and move helper functions
to process-stratum-target.h.
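
For readers unfamiliar with the deferral helpers, here is a minimal
standalone sketch of the pattern they implement: a global flag turns the
"maybe commit" function into a no-op, and an RAII guard sets the flag
and restores its previous value on scope exit.  All names
(toy_process_target, toy_scoped_defer, toy_maybe_commit_resume,
toy_resume_many) are hypothetical, and the guard is a hand-written
stand-in rather than GDB's scoped_restore.

```cpp
#include <cassert>

// Hypothetical stand-in for a process stratum target.
struct toy_process_target
{
  int commits = 0;

  void commit_resume () { commits++; }
};

// If true, toy_maybe_commit_resume is a no-op.
static bool toy_defer_commit_resume = false;

// RAII guard: defer commit-resumes for the lifetime of the object, then
// restore whatever value the flag had before.
class toy_scoped_defer
{
public:
  toy_scoped_defer ()
    : m_saved (toy_defer_commit_resume)
  {
    toy_defer_commit_resume = true;
  }

  ~toy_scoped_defer ()
  {
    toy_defer_commit_resume = m_saved;
  }

  toy_scoped_defer (const toy_scoped_defer &) = delete;
  toy_scoped_defer &operator= (const toy_scoped_defer &) = delete;

private:
  bool m_saved;
};

void toy_maybe_commit_resume (toy_process_target &target)
{
  if (toy_defer_commit_resume)
    return;

  target.commit_resume ();
}

// Simulate COUNT per-thread resumptions followed by a single commit.
// Returns the number of commits the target has seen so far.
int toy_resume_many (toy_process_target &target, int count)
{
  {
    toy_scoped_defer defer;

    for (int i = 0; i < count; i++)
      toy_maybe_commit_resume (target);  // Deferred, does nothing.
  }

  toy_maybe_commit_resume (target);  // The one real commit.
  return target.commits;
}
```

However many resumptions happen inside the deferred scope, the target
sees a single commit_resume call once the guard is gone.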

gdb/ChangeLog:

	* target.h (struct target_ops) <commit_resume>: Remove.
	(target_commit_resume): Remove.
	(make_scoped_defer_target_commit_resume): Remove.
	* target.c (defer_target_commit_resume): Remove.
	(target_commit_resume): Remove.
	(make_scoped_defer_target_commit_resume): Remove.
	* process-stratum-target.h (class process_stratum_target)
	<commit_resume>: New.
	(maybe_commit_resume_process_target): New.
	(make_scoped_defer_process_target_commit_resume): New.
	* process-stratum-target.c (defer_process_target_commit_resume):
	New.
	(maybe_commit_resume_process_target): New.
	(make_scoped_defer_process_target_commit_resume): New.
	* infrun.c (do_target_resume): Adjust.
	(commit_resume_all_targets): Rename into...
	(maybe_commit_resume_all_process_targets): ... this, adjust.
	(proceed): Adjust.
	* record-full.c (record_full_wait_1): Adjust.
	* target-delegates.c: Re-generate.

Change-Id: Ifc957817ac5b2303e22760ce3d14740b9598f02c
---
 gdb/infrun.c                 | 28 +++++++++-------------------
 gdb/process-stratum-target.c | 23 +++++++++++++++++++++++
 gdb/process-stratum-target.h | 29 +++++++++++++++++++++++++++++
 gdb/record-full.c            |  8 ++++----
 gdb/target-delegates.c       | 22 ----------------------
 gdb/target.c                 | 22 ----------------------
 gdb/target.h                 | 20 --------------------
 7 files changed, 65 insertions(+), 87 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 45bedf896419..1a27af51b7e9 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2172,7 +2172,7 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
 
   target_resume (resume_ptid, step, sig);
 
-  target_commit_resume ();
+  maybe_commit_resume_process_target (tp->inf->process_target ());
 
   if (target_can_async_p ())
     target_async (1);
@@ -2760,28 +2760,17 @@ schedlock_applies (struct thread_info *tp)
 					    execution_direction)));
 }
 
-/* Calls target_commit_resume on all targets.  */
+/* Calls maybe_commit_resume_process_target on all process targets.  */
 
 static void
-commit_resume_all_targets ()
+maybe_commit_resume_all_process_targets ()
 {
   scoped_restore_current_thread restore_thread;
 
-  /* Map between process_target and a representative inferior.  This
-     is to avoid committing a resume in the same target more than
-     once.  Resumptions must be idempotent, so this is an
-     optimization.  */
-  std::unordered_map<process_stratum_target *, inferior *> conn_inf;
-
-  for (inferior *inf : all_non_exited_inferiors ())
-    if (inf->has_execution ())
-      conn_inf[inf->process_target ()] = inf;
-
-  for (const auto &ci : conn_inf)
+  for (process_stratum_target *target : all_non_exited_process_targets ())
     {
-      inferior *inf = ci.second;
-      switch_to_inferior_no_thread (inf);
-      target_commit_resume ();
+      switch_to_target_no_thread (target);
+      maybe_commit_resume_process_target (target);
     }
 }
 
@@ -3005,7 +2994,8 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   cur_thr->prev_pc = regcache_read_pc_protected (regcache);
 
   {
-    scoped_restore save_defer_tc = make_scoped_defer_target_commit_resume ();
+    scoped_restore save_defer_tc
+      = make_scoped_defer_process_target_commit_resume ();
 
     started = start_step_over ();
 
@@ -3075,7 +3065,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
       }
   }
 
-  commit_resume_all_targets ();
+  maybe_commit_resume_all_process_targets ();
 
   finish_state.release ();
 
diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
index 719167803fff..1436a550ac04 100644
--- a/gdb/process-stratum-target.c
+++ b/gdb/process-stratum-target.c
@@ -108,3 +108,26 @@ switch_to_target_no_thread (process_stratum_target *target)
       break;
     }
 }
+
+/* If true, `maybe_commit_resume_process_target` is a no-op.  */
+
+static bool defer_process_target_commit_resume;
+
+/* See process-stratum-target.h.  */
+
+void
+maybe_commit_resume_process_target (process_stratum_target *proc_target)
+{
+  if (defer_process_target_commit_resume)
+    return;
+
+  proc_target->commit_resume ();
+}
+
+/* See process-stratum-target.h.  */
+
+scoped_restore_tmpl<bool>
+make_scoped_defer_process_target_commit_resume ()
+{
+  return make_scoped_restore (&defer_process_target_commit_resume, true);
+}
diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
index b513c26ffc2a..c8060c46be93 100644
--- a/gdb/process-stratum-target.h
+++ b/gdb/process-stratum-target.h
@@ -63,6 +63,20 @@ class process_stratum_target : public target_ops
   bool has_registers () override;
   bool has_execution (inferior *inf) override;
 
+  /* Commit a series of resumption requests previously prepared with
+     resume calls.
+
+     GDB always calls `commit_resume` on the process stratum target after
+     calling `resume` on a target stack.  A process stratum target may thus use
+     this method in coordination with its `resume` method to batch resumption
+     requests.  In that case, the target doesn't actually resume in its
+     `resume` implementation.  Instead, it takes note of resumption intent in
+     `resume`, and defers the actual resumption to `commit_resume`.
+
+     E.g., the remote target uses this to coalesce multiple resumption requests
+     in a single vCont packet.  */
+  virtual void commit_resume () {}
+
   /* True if any thread is, or may be executing.  We need to track
      this separately because until we fully sync the thread list, we
      won't know whether the target is fully stopped, even if we see
@@ -92,4 +106,19 @@ extern std::set<process_stratum_target *> all_non_exited_process_targets ();
 
 extern void switch_to_target_no_thread (process_stratum_target *target);
 
+/* Commit a series of resumption requests previously prepared with
+   target_resume calls.
+
+   This function is a no-op if commit resumes are deferred (see
+   `make_scoped_defer_process_target_commit_resume`).  */
+
+extern void maybe_commit_resume_process_target
+  (process_stratum_target *target);
+
+/* Set up to defer `commit_resume` calls, and re-set to the previous status on
+   destruction.  */
+
+extern scoped_restore_tmpl<bool>
+  make_scoped_defer_process_target_commit_resume ();
+
 #endif /* !defined (PROCESS_STRATUM_TARGET_H) */
diff --git a/gdb/record-full.c b/gdb/record-full.c
index 22eaaa4bb1bc..56ab29479874 100644
--- a/gdb/record-full.c
+++ b/gdb/record-full.c
@@ -1242,11 +1242,11 @@ record_full_wait_1 (struct target_ops *ops,
 			   break;
   			}
 
+		      process_stratum_target *proc_target
+			= current_inferior ()->process_target ();
+
 		      if (gdbarch_software_single_step_p (gdbarch))
 			{
-			  process_stratum_target *proc_target
-			    = current_inferior ()->process_target ();
-
 			  /* Try to insert the software single step breakpoint.
 			     If insert success, set step to 0.  */
 			  set_executing (proc_target, inferior_ptid, false);
@@ -1263,7 +1263,7 @@ record_full_wait_1 (struct target_ops *ops,
 					    "issuing one more step in the "
 					    "target beneath\n");
 		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
-		      ops->beneath ()->commit_resume ();
+		      proc_target->commit_resume ();
 		      continue;
 		    }
 		}
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index 437b19b8581c..8b933fdf82eb 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -14,7 +14,6 @@ struct dummy_target : public target_ops
   void detach (inferior *arg0, int arg1) override;
   void disconnect (const char *arg0, int arg1) override;
   void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
-  void commit_resume () override;
   ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
@@ -185,7 +184,6 @@ struct debug_target : public target_ops
   void detach (inferior *arg0, int arg1) override;
   void disconnect (const char *arg0, int arg1) override;
   void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
-  void commit_resume () override;
   ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
   void fetch_registers (struct regcache *arg0, int arg1) override;
   void store_registers (struct regcache *arg0, int arg1) override;
@@ -440,26 +438,6 @@ debug_target::resume (ptid_t arg0, int arg1, enum gdb_signal arg2)
   fputs_unfiltered (")\n", gdb_stdlog);
 }
 
-void
-target_ops::commit_resume ()
-{
-  this->beneath ()->commit_resume ();
-}
-
-void
-dummy_target::commit_resume ()
-{
-}
-
-void
-debug_target::commit_resume ()
-{
-  fprintf_unfiltered (gdb_stdlog, "-> %s->commit_resume (...)\n", this->beneath ()->shortname ());
-  this->beneath ()->commit_resume ();
-  fprintf_unfiltered (gdb_stdlog, "<- %s->commit_resume (", this->beneath ()->shortname ());
-  fputs_unfiltered (")\n", gdb_stdlog);
-}
-
 ptid_t
 target_ops::wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2)
 {
diff --git a/gdb/target.c b/gdb/target.c
index 3a03a0ad530e..3a5270e5a416 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -2062,28 +2062,6 @@ target_resume (ptid_t ptid, int step, enum gdb_signal signal)
   clear_inline_frame_state (curr_target, ptid);
 }
 
-/* If true, target_commit_resume is a nop.  */
-static int defer_target_commit_resume;
-
-/* See target.h.  */
-
-void
-target_commit_resume (void)
-{
-  if (defer_target_commit_resume)
-    return;
-
-  current_top_target ()->commit_resume ();
-}
-
-/* See target.h.  */
-
-scoped_restore_tmpl<int>
-make_scoped_defer_target_commit_resume ()
-{
-  return make_scoped_restore (&defer_target_commit_resume, 1);
-}
-
 void
 target_pass_signals (gdb::array_view<const unsigned char> pass_signals)
 {
diff --git a/gdb/target.h b/gdb/target.h
index e1a1d7a9226b..a252c29eafb4 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -478,8 +478,6 @@ struct target_ops
 			 int TARGET_DEBUG_PRINTER (target_debug_print_step),
 			 enum gdb_signal)
       TARGET_DEFAULT_NORETURN (noprocess ());
-    virtual void commit_resume ()
-      TARGET_DEFAULT_IGNORE ();
     /* See target_wait's description.  Note that implementations of
        this method must not assume that inferior_ptid on entry is
        pointing at the thread or inferior that ends up reporting an
@@ -1431,24 +1429,6 @@ extern void target_disconnect (const char *, int);
    target_commit_resume below.  */
 extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
 
-/* Commit a series of resumption requests previously prepared with
-   target_resume calls.
-
-   GDB always calls target_commit_resume after calling target_resume
-   one or more times.  A target may thus use this method in
-   coordination with the target_resume method to batch target-side
-   resumption requests.  In that case, the target doesn't actually
-   resume in its target_resume implementation.  Instead, it prepares
-   the resumption in target_resume, and defers the actual resumption
-   to target_commit_resume.  E.g., the remote target uses this to
-   coalesce multiple resumption requests in a single vCont packet.  */
-extern void target_commit_resume ();
-
-/* Setup to defer target_commit_resume calls, and reactivate
-   target_commit_resume on destruction, if it was previously
-   active.  */
-extern scoped_restore_tmpl<int> make_scoped_defer_target_commit_resume ();
-
 /* For target_read_memory see target/target.h.  */
 
 /* The default target_ops::to_wait implementation.  */
-- 
2.29.2


* [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
                   ` (2 preceding siblings ...)
  2021-01-08  4:17 ` [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target Simon Marchi
@ 2021-01-08  4:17 ` Simon Marchi
  2021-01-08 18:34   ` Andrew Burgess
                     ` (3 more replies)
  2021-01-08  4:17 ` [PATCH v3 5/5] gdb: better handling of 'S' packets Simon Marchi
  4 siblings, 4 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Pedro Alves, Simon Marchi

From: Simon Marchi <simon.marchi@efficios.com>

The rationale for this patch comes from the ROCm port [1], the goal
being to reduce the number of back and forths between GDB and the target
when doing successive operations.  I'll start with explaining the
rationale and then go over the implementation.  In the ROCm / GPU world,
the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
read it from a GPU standpoint, just s/thread/wave/.

ROCdbgapi, the library used by GDB [2] to communicate with the GPU
target, gives the illusion that it's possible for the debugger to
control (start and stop) individual threads.  But in reality, this is
not how it works.  Under the hood, all threads of a queue are controlled
as a group.  To stop one thread in a group of running ones, the state of
all threads is retrieved from the GPU, all threads are destroyed, and all
threads but the one we want to stop are re-created from the saved state.
The net result, from the point of view of GDB, is that the library
stopped one thread.  The same thing goes if we want to resume one thread
while others are running: the state of all running threads is retrieved
from the GPU, they are all destroyed, and they are all re-created,
including the thread we want to resume.

This leads to some inefficiencies when combined with how GDB works.
Here are two examples:

 - Stopping all threads: because the target operates in non-stop mode,
   when the user interface mode is all-stop, GDB must stop all threads
   individually when presenting a stop.  Let's suppose we have 1000
   threads and the user does ^C.  GDB asks the target to stop one
   thread.  Behind the scenes, the library retrieves 1000 thread states
   and restores the 999 other, still running ones.  GDB asks the target
   to stop another one.  The target retrieves 999 thread states and
   restores the 998 remaining ones.  That means that to stop 1000
   threads, we did 1000 back and forths with the GPU.  It would have
   been much better to just retrieve the states once and stop there.

 - Resuming with pending events: suppose the 1000 threads hit a
   breakpoint at the same time.  The breakpoint is conditional and
   evaluates to true for the first thread, to false for all others.  GDB
   pulls one event (for the first thread) from the target, decides that
   it should present a stop, so stops all threads using
   stop_all_threads.  All these other threads have a breakpoint event to
   report, which is saved in `thread_info::suspend::waitstatus` for
   later.  When the user does "continue", GDB resumes that one thread
   that did hit the breakpoint.  It then processes the pending events
   one by one as if they just arrived.  It picks one, evaluates the
   condition to false, and resumes the thread.  It picks another one,
   evaluates the condition to false, and resumes the thread.  And so on.
   In between each resumption, there is a full state retrieval and
   re-creation.  It would be much nicer if we could wait a little bit
   before sending those threads on the GPU, until GDB has processed all
   those pending events.

To address this kind of performance issue, ROCdbgapi has a concept
called "forward progress required", which is a boolean state that allows
its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
hold off putting the threads on the GPU until I'm done" (the "forward
progress not required" state).  Turning forward progress back on
indicates to the library that all threads that are supposed to be
running should now be really running on the GPU.

It turns out that GDB has a similar concept, though not as general,
commit_resume.  One difference is that commit_resume is not stateful: the
target can't look up "does the core need me to schedule resumed threads
for execution right now".  It is also specifically linked to the resume
method, it is not used in other contexts.  The target accumulates
resumption requests through target_ops::resume calls, and then commits
those resumptions when target_ops::commit_resume is called.  The target
has no way to check if it's ok to leave resumed threads stopped in other
target methods.

To bridge the gap, this patch generalizes the commit_resume concept in
GDB to match the forward progress concept of ROCdbgapi.  The current
name (commit_resume) can be interpreted as "commit the previous resume
calls".  I renamed the concept to "commit_resumed", as in "commit the
threads that are resumed".

In the new version, we have two things in process_stratum_target:

 - the commit_resumed_state field: indicates whether GDB requires this
   target to have resumed threads committed to the execution
   target/device.  If false, the target is allowed to leave resumed
   threads un-committed at the end of whatever method it is executing.

 - the commit_resumed method: called when commit_resumed_state
   transitions from false to true.  While commit_resumed_state was
   false, the target may have left some resumed threads un-committed.
   This method being called tells it that it should commit them back to
   the execution device.
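
As a rough illustration of how the field and the method fit together,
here is a minimal, standalone sketch.  It is not the code from this
patch: `toy_target`, `require_commit` and the integer thread ids are
hypothetical stand-ins, and plain assert plays the role of gdb_assert.

```cpp
#include <cassert>
#include <set>

/* Hypothetical stand-in for a process stratum target.  */
struct toy_target
{
  /* Mirrors the commit_resumed_state field described above.  */
  bool commit_resumed_state = false;

  /* Threads the core asked to resume.  */
  std::set<int> resumed;

  /* Threads actually running on the execution device.  */
  std::set<int> on_device;

  /* Number of times resumed threads were pushed to the device.  */
  int commits = 0;

  /* Only take note of the resumption, don't touch the device.  */
  void resume (int tid)
  {
    assert (!commit_resumed_state);
    resumed.insert (tid);
  }

  /* Push every un-committed resumed thread to the device.  */
  void commit_resumed ()
  {
    assert (commit_resumed_state);

    if (resumed == on_device)
      return;

    on_device = resumed;
    ++commits;
  }
};

/* Core side: the false -> true transition is always followed by a
   commit_resumed call.  */
void
require_commit (toy_target &target)
{
  target.commit_resumed_state = true;
  target.commit_resumed ();
}
```

In this sketch, any number of resume calls followed by one
require_commit call results in a single trip to the device.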

Let's take the "Stopping all threads" scenario from above and see how it
would work with the ROCm target with this change.  Before stopping all
threads, GDB would set the target's commit_resumed_state field to false.
It would then ask the target to stop the first thread.  The target would
retrieve all threads' state from the GPU and mark that one as stopped.
Since commit_resumed_state is false, it leaves all the other threads
(still resumed) stopped.  GDB would then proceed to call target_stop for
all the other threads.  Since resumed threads are not committed, this
doesn't do any back and forth with the GPU.
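
The difference in the number of round trips can be modeled with a toy
simulation.  This is only a sketch under simplifying assumptions:
`toy_queue_target` and `stop_all` are made-up names, one "round trip"
stands for one full state retrieval from the GPU, and the case where
commit_resumed_state is true models the behavior before this change.

```cpp
#include <cassert>
#include <set>

/* Hypothetical model of a target where all threads of a queue are
   controlled as a group.  */
struct toy_queue_target
{
  bool commit_resumed_state = true;

  /* True while the library holds the saved state of all threads, i.e.
     nothing is running on the device.  */
  bool state_retrieved = false;

  /* Threads the core considers resumed.  */
  std::set<int> resumed;

  /* Number of full state retrievals from the device.  */
  int round_trips = 0;

  explicit toy_queue_target (int num_threads)
  {
    for (int tid = 0; tid < num_threads; ++tid)
      resumed.insert (tid);
  }

  void stop (int tid)
  {
    if (!state_retrieved)
      {
	/* Retrieve the state of all threads and destroy them.  */
	++round_trips;
	state_retrieved = true;
      }

    resumed.erase (tid);

    /* If resumed threads must be committed, re-create the remaining
       ones on the device right away.  */
    if (commit_resumed_state && !resumed.empty ())
      state_retrieved = false;
  }
};

/* Stop threads 0..NUM_THREADS-1 one by one and return the number of
   round trips it took.  */
int
stop_all (toy_queue_target &target, int num_threads,
	  bool disable_commit_resumed)
{
  target.commit_resumed_state = !disable_commit_resumed;

  for (int tid = 0; tid < num_threads; ++tid)
    target.stop (tid);

  return target.round_trips;
}
```

With 1000 threads, this model does 1000 retrievals when resumed threads
must stay committed, and a single one when they don't.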

To simplify the implementation of targets, I made it so that when
calling certain target methods, the contract between the core and the
targets guarantees that commit_resumed_state is false.  This way, the
target doesn't need two paths, one for commit_resumed_state == true and
one for commit_resumed_state == false.  It can just assert that
commit_resumed_state is false and work with that assumption.  This also
helps catch places where we forgot to disable commit_resumed_state
before calling the method, which represents a probable optimization
opportunity.

To have some confidence that this contract between the core and the
targets is respected, I added assertions in the linux-nat target
methods, even though the linux-nat target doesn't actually use that
feature.  Since linux-nat is tested much more than other targets, this
will help catch these issues quicker.

To ensure that commit_resumed_state is always turned back on (only if
necessary, see below) and the commit_resumed method is called when doing
so, I introduced the scoped_disable_commit_resumed RAII object, which
replaces make_scoped_defer_process_target_commit_resume.  On
construction, it clears the commit_resumed_state flag of all process
targets.  On destruction, it turns it back on (if necessary) and calls
the commit_resumed method.  The nested case is handled by having a
"nesting" counter: only when the counter goes back to 0 is
commit_resumed_state turned back on.

On destruction, commit-resumed is not re-enabled for a given target if:

 1. this target has no threads resumed, or
 2. this target has at least one thread with a pending status known to
    the core (saved in thread_info::suspend::waitstatus).

The first point is not technically necessary, because a proper
commit_resumed implementation would be a no-op if the target has no
resumed threads.  But since we have a flag to do a quick check, I think
it doesn't hurt.

The second point is more important: together with the
scoped_disable_commit_resumed instance added in fetch_inferior_event, it
makes it so the "Resuming with pending events" scenario described above
is handled efficiently.  Here's what happens in that case:

 1. The user types "continue".
 2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
    function does not enable commit-resumed, as it sees other threads
    have pending statuses.
 3. fetch_inferior_event is called to handle another event, one thread
    is resumed.  Because there are still more threads with pending
    statuses, the destructor of scoped_disable_commit_resumed in
    fetch_inferior_event still doesn't enable commit-resumed.
 4. Rinse and repeat step 3, until the last pending status is handled by
    fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
    destructor sees there are no more threads with pending statuses, so
    it asks the target to commit resumed threads.

This allows us to avoid all unnecessary back and forths: there is a
single commit_resumed call.
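
The sequence above can be sketched with a standalone toy model.  It
assumes a single target, and every name in it (`toy_thread`,
`toy_pending_target`, `toy_disable_commit_resumed`, `toy_proceed`,
`toy_handle_one_event`) is hypothetical; the real logic lives in
scoped_disable_commit_resumed and
maybe_commit_resumed_all_process_targets.

```cpp
#include <cassert>
#include <vector>

struct toy_thread
{
  /* Resumed from the point of view of the core.  */
  bool resumed = false;

  /* Has a status that the core has not handled yet.  */
  bool pending_status = false;
};

struct toy_pending_target
{
  std::vector<toy_thread> threads;
  bool commit_resumed_state = false;
  int commits = 0;

  void commit_resumed () { ++commits; }
};

/* To track nesting of toy_disable_commit_resumed objects.  */
static int toy_depth = 0;

struct toy_disable_commit_resumed
{
  explicit toy_disable_commit_resumed (toy_pending_target &target)
    : m_target (target)
  {
    if (toy_depth++ == 0)
      m_target.commit_resumed_state = false;
  }

  ~toy_disable_commit_resumed ()
  {
    /* Only the outermost instance may re-enable commit-resumed.  */
    if (--toy_depth != 0)
      return;

    bool any_resumed = false;
    for (const toy_thread &thread : m_target.threads)
      {
	if (!thread.resumed)
	  continue;

	/* A pending status must be handled first.  */
	if (thread.pending_status)
	  return;

	any_resumed = true;
      }

    if (!any_resumed)
      return;

    m_target.commit_resumed_state = true;
    m_target.commit_resumed ();
  }

  toy_disable_commit_resumed (const toy_disable_commit_resumed &) = delete;
  toy_disable_commit_resumed &operator=
    (const toy_disable_commit_resumed &) = delete;

private:
  toy_pending_target &m_target;
};

/* The user types "continue": all threads become resumed.  */
void
toy_proceed (toy_pending_target &target)
{
  toy_disable_commit_resumed disable (target);

  for (toy_thread &thread : target.threads)
    thread.resumed = true;
}

/* Consume one pending status, as if the breakpoint condition evaluated
   to false, and leave the thread resumed.  Return false if there was
   nothing to handle.  */
bool
toy_handle_one_event (toy_pending_target &target)
{
  toy_disable_commit_resumed disable (target);

  for (toy_thread &thread : target.threads)
    if (thread.resumed && thread.pending_status)
      {
	thread.pending_status = false;
	return true;
      }

  return false;
}
```

With 1000 threads of which 999 have a pending status, this model calls
commit_resumed only after the last pending status is consumed.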

This change required remote_target::remote_stop_ns to learn how to
handle stopping threads that were resumed but pending vCont.  The
simplest example where that happens is when using the remote target in
all-stop, but with "maint set target-non-stop on", to force it to
operate in non-stop mode under the hood.  If two threads hit a
breakpoint at the same time, GDB will receive two stop replies.  It will
present the stop for one thread and save the other one in
thread_info::suspend::waitstatus.

Before this patch, when doing "continue", GDB first resumes the thread
without a pending status:

    Sending packet: $vCont;c:p172651.172676#f3

It then consumes the pending status in the next fetch_inferior_event
call:

    [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
    [infrun] target_wait (-1.0.0, status) =
    [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
    [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP

It then realizes it needs to stop all threads to present the stop, so
stops the thread it just resumed:

    [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
    [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
    remote_stop called
    Sending packet: $vCont;t:p172651.172676#04

This is an unnecessary resume/stop.  With this patch, we don't commit
resumed threads after proceeding, because of the pending status:

    [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus

When GDB handles the pending status and stop_all_threads runs, we stop a
resumed but pending vCont thread:

    remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)

That thread was never actually resumed on the remote stub / gdbserver.
This is why remote_stop_ns needed to learn this new trick of enqueueing
phony stop replies.
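
The idea can be sketched as follows.  This is a simplified, standalone
model and not the remote.c code: `toy_remote`, `toy_resume_state` and
the integer thread ids are hypothetical, and the packet strings are
abbreviated (no process ids, no checksums).

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

enum class toy_resume_state
{
  not_resumed,
  resumed_pending_vcont,
  resumed,
};

struct toy_remote
{
  std::map<int, toy_resume_state> threads;

  /* Packets that would go to the remote stub.  */
  std::vector<std::string> packets_sent;

  /* Thread ids with a queued stop reply.  */
  std::deque<int> stop_replies;

  /* Take note of the resumption, send nothing yet.  */
  void resume (int tid)
  {
    threads[tid] = toy_resume_state::resumed_pending_vcont;
  }

  /* Coalesce all pending resumptions in a single packet, if any.  */
  void commit_resumed ()
  {
    std::string packet = "vCont";
    bool any_pending = false;

    for (auto &entry : threads)
      if (entry.second == toy_resume_state::resumed_pending_vcont)
	{
	  packet += ";c:" + std::to_string (entry.first);
	  entry.second = toy_resume_state::resumed;
	  any_pending = true;
	}

    if (any_pending)
      packets_sent.push_back (packet);
  }

  void stop (int tid)
  {
    toy_resume_state &state = threads.at (tid);

    if (state == toy_resume_state::resumed_pending_vcont)
      {
	/* The stub never saw this thread resumed: enqueue a phony stop
	   reply instead of talking to the stub.  */
	stop_replies.push_back (tid);
	state = toy_resume_state::not_resumed;
      }
    else if (state == toy_resume_state::resumed)
      packets_sent.push_back ("vCont;t:" + std::to_string (tid));
  }
};
```

In this model, stopping a thread that is resumed but pending vCont
produces a stop reply without any packet being sent.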

Note that this patch only considers pending statuses known to the core
of GDB, that is the events that were pulled out of the target and stored
in `thread_info::suspend::waitstatus`.  In some cases, we could also
avoid unnecessary back and forth when the target has events that it has
not yet reported to the core.  I plan to implement this as a subsequent
patch, once this series has settled.

gdb/ChangeLog:

	* infrun.h (struct scoped_disable_commit_resumed): New.
	* infrun.c (do_target_resume): Remove
	maybe_commit_resume_process_target call.
	(maybe_commit_resume_all_process_targets): Rename to...
	(maybe_commit_resumed_all_process_targets): ... this.  Skip
	targets that have no executing threads or resumed threads with
	a pending status.
	(scoped_disable_commit_resumed_depth): New.
	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
	New.
	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
	New.
	(proceed): Use scoped_disable_commit_resumed.
	(fetch_inferior_event): Use scoped_disable_commit_resumed.
	* process-stratum-target.h (class process_stratum_target):
	<commit_resume>: Rename to...
	<commit_resumed>: ... this.
	<commit_resumed_state>: New.
	(all_process_targets): New.
	(maybe_commit_resume_process_target): Remove.
	(make_scoped_defer_process_target_commit_resume): Remove.
	* process-stratum-target.c (all_process_targets): New.
	(defer_process_target_commit_resume): Remove.
	(maybe_commit_resume_process_target): Remove.
	(make_scoped_defer_process_target_commit_resume): Remove.
	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
	(linux_nat_target::wait): Add gdb_assert.
	(linux_nat_target::stop): Add gdb_assert.
	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
	(attach_command): Use scoped_disable_commit_resumed.
	(detach_command): Use scoped_disable_commit_resumed.
	(interrupt_target_1): Use scoped_disable_commit_resumed.
	* mi/mi-main.c (exec_continue): Use
	scoped_disable_commit_resumed.
	* record-full.c (record_full_wait_1): Change
	commit_resumed_state around calling commit_resumed.
	* remote.c (class remote_target) <commit_resume>: Rename to...
	<commit_resumed>: ... this.
	(remote_target::resume): Add gdb_assert.
	(remote_target::commit_resume): Rename to...
	(remote_target::commit_resumed): ... this.  Check if there is
	any thread pending vCont resume.
	(struct stop_reply): Move up.
	(remote_target::remote_stop_ns): Generate stop replies for
	resumed but pending vCont threads.
	(remote_target::wait_ns): Add gdb_assert.

[1] https://github.com/ROCm-Developer-Tools/ROCgdb/
[2] https://github.com/ROCm-Developer-Tools/ROCdbgapi

Change-Id: I836135531a29214b21695736deb0a81acf8cf566
---
 gdb/infcmd.c                 |   8 +++
 gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
 gdb/infrun.h                 |  41 +++++++++++++
 gdb/linux-nat.c              |   5 ++
 gdb/mi/mi-main.c             |   2 +
 gdb/process-stratum-target.c |  37 +++++------
 gdb/process-stratum-target.h |  63 +++++++++++--------
 gdb/record-full.c            |   4 +-
 gdb/remote.c                 | 111 +++++++++++++++++++++++----------
 9 files changed, 292 insertions(+), 95 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 6f0ed952de67..b7595e42e265 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
       uiout->flush ();
     }
 
+  scoped_disable_commit_resumed disable_commit_resumed ("running");
+
   /* We call get_inferior_args() because we might need to compute
      the value now.  */
   run_target->create_inferior (exec_file,
@@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
   if (non_stop && !attach_target->supports_non_stop ())
     error (_("Cannot attach to this target in non-stop mode"));
 
+  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
+
   attach_target->attach (args, from_tty);
   /* to_attach should push the target, so after this point we
      shouldn't refer to attach_target again.  */
@@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
   if (inferior_ptid == null_ptid)
     error (_("The program is not being run."));
 
+  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
+
   query_if_trace_running (from_tty);
 
   disconnect_tracing ();
@@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
 void
 interrupt_target_1 (bool all_threads)
 {
+  scoped_disable_commit_resumed inhibit ("interrupting");
+
   if (non_stop)
     {
       if (all_threads)
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 1a27af51b7e9..92a1102cb595 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
 
   target_resume (resume_ptid, step, sig);
 
-  maybe_commit_resume_process_target (tp->inf->process_target ());
-
   if (target_can_async_p ())
     target_async (1);
 }
@@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
 					    execution_direction)));
 }
 
-/* Calls maybe_commit_resume_process_target on all process targets.  */
+/* Maybe require all process stratum targets to commit their resumed threads.
+
+   A specific process stratum target is not required to do so if:
+
+   - it has no resumed threads
+   - it has a thread with a pending status  */
 
 static void
-maybe_commit_resume_all_process_targets ()
+maybe_commit_resumed_all_process_targets ()
 {
-  scoped_restore_current_thread restore_thread;
+  /* This is an optional to avoid unnecessary thread switches. */
+  gdb::optional<scoped_restore_current_thread> restore_thread;
 
   for (process_stratum_target *target : all_non_exited_process_targets ())
     {
+      gdb_assert (!target->commit_resumed_state);
+
+      if (!target->threads_executing)
+	{
+	  infrun_debug_printf ("not re-enabling forward progress for target "
+			       "%s, no executing threads",
+			       target->shortname ());
+	  continue;
+	}
+
+      /* If a thread from this target has some status to report, we better
+	 handle it before requiring the target to commit its resumed threads:
+	 handling the status might lead to resuming more threads.  */
+      bool has_thread_with_pending_status = false;
+      for (thread_info *thread : all_non_exited_threads (target))
+	if (thread->resumed && thread->suspend.waitstatus_pending_p)
+	  {
+	    has_thread_with_pending_status = true;
+	    break;
+	  }
+
+      if (has_thread_with_pending_status)
+	{
+	  infrun_debug_printf ("not requesting commit-resumed for target %s, a "
+			       "thread has a pending waitstatus",
+			       target->shortname ());
+	  continue;
+	}
+
+      if (!restore_thread.has_value ())
+	restore_thread.emplace ();
+
       switch_to_target_no_thread (target);
-      maybe_commit_resume_process_target (target);
+      infrun_debug_printf ("enabling commit-resumed for target %s",
+			   target->shortname ());
+
+      target->commit_resumed_state = true;
+      target->commit_resumed ();
+    }
+}
+
+/* To track nesting of scoped_disable_commit_resumed objects.  */
+
+static int scoped_disable_commit_resumed_depth = 0;
+
+scoped_disable_commit_resumed::scoped_disable_commit_resumed
+  (const char *reason)
+  : m_reason (reason)
+{
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  for (process_stratum_target *target : all_process_targets ())
+    {
+      if (scoped_disable_commit_resumed_depth == 0)
+	{
+	  /* This is the outermost instance.  */
+	  target->commit_resumed_state = false;
+	}
+      else
+	{
+	  /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE
+	     to have been cleared by the outermost instance.  */
+	  gdb_assert (!target->commit_resumed_state);
+	}
+    }
+
+  ++scoped_disable_commit_resumed_depth;
+}
+
+scoped_disable_commit_resumed::~scoped_disable_commit_resumed ()
+{
+  infrun_debug_printf ("reason=%s", m_reason);
+
+  gdb_assert (scoped_disable_commit_resumed_depth > 0);
+
+  --scoped_disable_commit_resumed_depth;
+
+  if (scoped_disable_commit_resumed_depth == 0)
+    {
+      /* This is the outermost instance.  */
+      maybe_commit_resumed_all_process_targets ();
+    }
+  else
+    {
+      /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE to
+	 still be false.  */
+      for (process_stratum_target *target : all_process_targets ())
+	gdb_assert (!target->commit_resumed_state);
     }
 }
 
@@ -2994,8 +3084,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   cur_thr->prev_pc = regcache_read_pc_protected (regcache);
 
   {
-    scoped_restore save_defer_tc
-      = make_scoped_defer_process_target_commit_resume ();
+    scoped_disable_commit_resumed disable_commit_resumed ("proceeding");
 
     started = start_step_over ();
 
@@ -3065,8 +3154,6 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
       }
   }
 
-  maybe_commit_resume_all_process_targets ();
-
   finish_state.release ();
 
   /* If we've switched threads above, switch back to the previously
@@ -3819,8 +3906,15 @@ fetch_inferior_event ()
       = make_scoped_restore (&execution_direction,
 			     target_execution_direction ());
 
+    /* Allow process stratum targets to pause their resumed threads while we
+       handle the event.  */
+    scoped_disable_commit_resumed disable_commit_resumed ("handling event");
+
     if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
-      return;
+      {
+	infrun_debug_printf ("do_target_wait returned no event");
+	return;
+      }
 
     gdb_assert (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
 
diff --git a/gdb/infrun.h b/gdb/infrun.h
index 7160b60f1368..5c32c0c97f6e 100644
--- a/gdb/infrun.h
+++ b/gdb/infrun.h
@@ -269,4 +269,45 @@ extern void all_uis_check_sync_execution_done (void);
    started or re-started).  */
 extern void all_uis_on_sync_execution_starting (void);
 
+/* RAII object to temporarily disable the requirement for process stratum
+   targets to commit their resumed threads.
+
+   On construction, set process_stratum_target::commit_resumed_state to false
+   for all process stratum targets.
+
+   On destruction, call maybe_commit_resumed_all_process_targets.
+
+   In addition, track creation of nested scoped_disable_commit_resumed objects,
+   for cases like this:
+
+     void
+     inner_func ()
+     {
+       scoped_disable_commit_resumed disable;
+       // do stuff
+     }
+
+     void
+     outer_func ()
+     {
+       scoped_disable_commit_resumed disable;
+
+       for (... each thread ...)
+	 inner_func ();
+     }
+
+   In this case, we don't want the `disable` in `inner_func` to require targets
+   to commit resumed threads in its destructor.  */
+
+struct scoped_disable_commit_resumed
+{
+  scoped_disable_commit_resumed (const char *reason);
+  ~scoped_disable_commit_resumed ();
+
+  DISABLE_COPY_AND_ASSIGN (scoped_disable_commit_resumed);
+
+private:
+  const char *m_reason;
+};
+
 #endif /* INFRUN_H */
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index dc524cf10dc1..9adec81ba132 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1661,6 +1661,8 @@ linux_nat_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
 			   ? strsignal (gdb_signal_to_host (signo)) : "0"),
 			  target_pid_to_str (inferior_ptid).c_str ());
 
+  gdb_assert (!this->commit_resumed_state);
+
   /* A specific PTID means `step only this process id'.  */
   resume_many = (minus_one_ptid == ptid
 		 || ptid.is_pid ());
@@ -3406,6 +3408,8 @@ linux_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   linux_nat_debug_printf ("[%s], [%s]", target_pid_to_str (ptid).c_str (),
 			  target_options_to_string (target_options).c_str ());
 
+  gdb_assert (!this->commit_resumed_state);
+
   /* Flush the async file first.  */
   if (target_is_async_p ())
     async_file_flush ();
@@ -4166,6 +4170,7 @@ linux_nat_stop_lwp (struct lwp_info *lwp)
 void
 linux_nat_target::stop (ptid_t ptid)
 {
+  gdb_assert (!this->commit_resumed_state);
   iterate_over_lwps (ptid, linux_nat_stop_lwp);
 }
 
diff --git a/gdb/mi/mi-main.c b/gdb/mi/mi-main.c
index 9a14d78e1e27..e5653ea3e3f5 100644
--- a/gdb/mi/mi-main.c
+++ b/gdb/mi/mi-main.c
@@ -266,6 +266,8 @@ exec_continue (char **argv, int argc)
 {
   prepare_execution_command (current_top_target (), mi_async_p ());
 
+  scoped_disable_commit_resumed disable_commit_resumed ("mi continue");
+
   if (non_stop)
     {
       /* In non-stop mode, 'resume' always resumes a single thread.
diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
index 1436a550ac04..9877f0d81931 100644
--- a/gdb/process-stratum-target.c
+++ b/gdb/process-stratum-target.c
@@ -99,6 +99,20 @@ all_non_exited_process_targets ()
 
 /* See process-stratum-target.h.  */
 
+std::set<process_stratum_target *>
+all_process_targets ()
+{
+  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
+  std::set<process_stratum_target *> targets;
+  for (inferior *inf : all_inferiors ())
+    if (inf->process_target () != nullptr)
+      targets.insert (inf->process_target ());
+
+  return targets;
+}
+
+/* See process-stratum-target.h.  */
+
 void
 switch_to_target_no_thread (process_stratum_target *target)
 {
@@ -108,26 +122,3 @@ switch_to_target_no_thread (process_stratum_target *target)
       break;
     }
 }
-
-/* If true, `maybe_commit_resume_process_target` is a no-op.  */
-
-static bool defer_process_target_commit_resume;
-
-/* See target.h.  */
-
-void
-maybe_commit_resume_process_target (process_stratum_target *proc_target)
-{
-  if (defer_process_target_commit_resume)
-    return;
-
-  proc_target->commit_resume ();
-}
-
-/* See process-stratum-target.h.  */
-
-scoped_restore_tmpl<bool>
-make_scoped_defer_process_target_commit_resume ()
-{
-  return make_scoped_restore (&defer_process_target_commit_resume, true);
-}
diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
index c8060c46be93..3cea911dee09 100644
--- a/gdb/process-stratum-target.h
+++ b/gdb/process-stratum-target.h
@@ -63,19 +63,10 @@ class process_stratum_target : public target_ops
   bool has_registers () override;
   bool has_execution (inferior *inf) override;
 
-  /* Commit a series of resumption requests previously prepared with
-     resume calls.
+  /* Ensure that all resumed threads are committed to the target.
 
-     GDB always calls `commit_resume` on the process stratum target after
-     calling `resume` on a target stack.  A process stratum target may thus use
-     this method in coordination with its `resume` method to batch resumption
-     requests.  In that case, the target doesn't actually resume in its
-     `resume` implementation.  Instead, it takes note of resumption intent in
-     `resume`, and defers the actual resumption `commit_resume`.
-
-     E.g., the remote target uses this to coalesce multiple resumption requests
-     in a single vCont packet.  */
-  virtual void commit_resume () {}
+     See the description of COMMIT_RESUMED_STATE for more details.  */
+  virtual void commit_resumed () {}
 
   /* True if any thread is, or may be executing.  We need to track
      this separately because until we fully sync the thread list, we
@@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
 
   /* The connection number.  Visible in "info connections".  */
   int connection_number = 0;
+
+  /* Whether resumed threads must be committed to the target.
+
+     When true, resumed threads must be committed to the execution target.
+
+     When false, the process stratum target may leave resumed threads stopped
+     when it's convenient or efficient to do so.  When the core requires resumed
+     threads to be committed again, it sets this back to true and calls the
+     `commit_resumed` method to allow the target to do so.
+
+     To simplify the implementation of process stratum targets, the following
+     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
+     false:
+
+       - resume
+       - stop
+       - wait
+
+     Knowing this, the process stratum target doesn't need to implement
+     different behaviors depending on the COMMIT_RESUMED_STATE, and can
+     simply assert that it is false.
+
+     Process stratum targets can take advantage of this to batch resumption
+     requests, for example.  In that case, the target doesn't actually resume in
+     its `resume` implementation.  Instead, it takes note of the resumption
+     intent in `resume` and defers the actual resumption to `commit_resumed`.
+     For example, the remote target uses this to coalesce multiple resumption
+     requests in a single vCont packet.  */
+  bool commit_resumed_state = false;
 };
 
 /* Downcast TARGET to process_stratum_target.  */
@@ -101,24 +121,13 @@ as_process_stratum_target (target_ops *target)
 
 extern std::set<process_stratum_target *> all_non_exited_process_targets ();
 
+/* Return a collection of all existing process stratum targets.  */
+
+extern std::set<process_stratum_target *> all_process_targets ();
+
 /* Switch to the first inferior (and program space) of TARGET, and
    switch to no thread selected.  */
 
 extern void switch_to_target_no_thread (process_stratum_target *target);
 
-/* Commit a series of resumption requests previously prepared with
-   target_resume calls.
-
-   This function is a no-op if commit resumes are deferred (see
-   `make_scoped_defer_process_target_commit_resume`).  */
-
-extern void maybe_commit_resume_process_target
-  (process_stratum_target *target);
-
-/* Setup to defer `commit_resume` calls, and re-set to the previous status on
-   destruction.  */
-
-extern scoped_restore_tmpl<bool>
-  make_scoped_defer_process_target_commit_resume ();
-
 #endif /* !defined (PROCESS_STRATUM_TARGET_H) */
diff --git a/gdb/record-full.c b/gdb/record-full.c
index 56ab29479874..fad355afdf4f 100644
--- a/gdb/record-full.c
+++ b/gdb/record-full.c
@@ -1263,7 +1263,9 @@ record_full_wait_1 (struct target_ops *ops,
 					    "issuing one more step in the "
 					    "target beneath\n");
 		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
-		      proc_target->commit_resume ();
+		      proc_target->commit_resumed_state = true;
+		      proc_target->commit_resumed ();
+		      proc_target->commit_resumed_state = false;
 		      continue;
 		    }
 		}
diff --git a/gdb/remote.c b/gdb/remote.c
index f8150f39fb5c..be53886c1837 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -421,7 +421,7 @@ class remote_target : public process_stratum_target
   void detach (inferior *, int) override;
   void disconnect (const char *, int) override;
 
-  void commit_resume () override;
+  void commit_resumed () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
 
@@ -6376,6 +6376,8 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
 {
   struct remote_state *rs = get_remote_state ();
 
+  gdb_assert (!this->commit_resumed_state);
+
   /* When connected in non-stop mode, the core resumes threads
      individually.  Resuming remote threads directly in target_resume
      would thus result in sending one packet per thread.  Instead, to
@@ -6565,7 +6567,7 @@ vcont_builder::push_action (ptid_t ptid, bool step, gdb_signal siggnal)
 /* to_commit_resume implementation.  */
 
 void
-remote_target::commit_resume ()
+remote_target::commit_resumed ()
 {
   int any_process_wildcard;
   int may_global_wildcard_vcont;
@@ -6640,6 +6642,8 @@ remote_target::commit_resume ()
      disable process and global wildcard resumes appropriately.  */
   check_pending_events_prevent_wildcard_vcont (&may_global_wildcard_vcont);
 
+  bool any_pending_vcont_resume = false;
+
   for (thread_info *tp : all_non_exited_threads (this))
     {
       remote_thread_info *priv = get_remote_thread_info (tp);
@@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
 	  continue;
 	}
 
+      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
+	any_pending_vcont_resume = true;
+
       /* If a thread is the parent of an unfollowed fork, then we
 	 can't do a global wildcard, as that would resume the fork
 	 child.  */
@@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
 	may_global_wildcard_vcont = 0;
     }
 
+  /* We didn't have any resumed thread pending a vCont resume, so nothing to
+     do.  */
+  if (!any_pending_vcont_resume)
+    return;
+
   /* Now let's build the vCont packet(s).  Actions must be appended
      from narrower to wider scopes (thread -> process -> global).  If
      we end up with too many actions for a single packet vcont_builder
@@ -6735,7 +6747,35 @@ remote_target::commit_resume ()
   vcont_builder.flush ();
 }
 
-\f
+struct stop_reply : public notif_event
+{
+  ~stop_reply ();
+
+  /* The identifier of the thread about this event  */
+  ptid_t ptid;
+
+  /* The remote state this event is associated with.  When the remote
+     connection, represented by a remote_state object, is closed,
+     all the associated stop_reply events should be released.  */
+  struct remote_state *rs;
+
+  struct target_waitstatus ws;
+
+  /* The architecture associated with the expedited registers.  */
+  gdbarch *arch;
+
+  /* Expedited registers.  This makes remote debugging a bit more
+     efficient for those targets that provide critical registers as
+     part of their normal status mechanism (as another roundtrip to
+     fetch them is avoided).  */
+  std::vector<cached_reg_t> regcache;
+
+  enum target_stop_reason stop_reason;
+
+  CORE_ADDR watch_data_address;
+
+  int core;
+};
 
 /* Non-stop version of target_stop.  Uses `vCont;t' to stop a remote
    thread, all threads of a remote process, or all threads of all
@@ -6748,6 +6788,39 @@ remote_target::remote_stop_ns (ptid_t ptid)
   char *p = rs->buf.data ();
   char *endp = p + get_remote_packet_size ();
 
+  gdb_assert (!this->commit_resumed_state);
+
+  /* If any threads that need to stop are pending a vCont resume, generate
+     dummy stop_reply events.  */
+  for (thread_info *tp : all_non_exited_threads (this, ptid))
+    {
+      remote_thread_info *remote_thr = get_remote_thread_info (tp);
+
+      if (remote_thr->resume_state () == resume_state::RESUMED_PENDING_VCONT)
+	{
+	  if (remote_debug)
+	    {
+	      fprintf_unfiltered (gdb_stdlog,
+				  "remote_stop_ns: Enqueueing phony stop reply "
+				  "for thread pending vCont-resume "
+				  "(%d, %ld, %ld)\n",
+				  tp->ptid.pid(), tp->ptid.lwp (),
+				  tp->ptid.tid ());
+	    }
+
+	  stop_reply *sr = new stop_reply ();
+	  sr->ptid = tp->ptid;
+	  sr->rs = rs;
+	  sr->ws.kind = TARGET_WAITKIND_STOPPED;
+	  sr->ws.value.sig = GDB_SIGNAL_0;
+	  sr->arch = tp->inf->gdbarch;
+	  sr->stop_reason = TARGET_STOPPED_BY_NO_REASON;
+	  sr->watch_data_address = 0;
+	  sr->core = 0;
+	  this->push_stop_reply (sr);
+	}
+    }
+
   /* FIXME: This supports_vCont_probed check is a workaround until
      packet_support is per-connection.  */
   if (packet_support (PACKET_vCont) == PACKET_SUPPORT_UNKNOWN
@@ -6955,36 +7028,6 @@ remote_console_output (const char *msg)
   gdb_stdtarg->flush ();
 }
 
-struct stop_reply : public notif_event
-{
-  ~stop_reply ();
-
-  /* The identifier of the thread about this event  */
-  ptid_t ptid;
-
-  /* The remote state this event is associated with.  When the remote
-     connection, represented by a remote_state object, is closed,
-     all the associated stop_reply events should be released.  */
-  struct remote_state *rs;
-
-  struct target_waitstatus ws;
-
-  /* The architecture associated with the expedited registers.  */
-  gdbarch *arch;
-
-  /* Expedited registers.  This makes remote debugging a bit more
-     efficient for those targets that provide critical registers as
-     part of their normal status mechanism (as another roundtrip to
-     fetch them is avoided).  */
-  std::vector<cached_reg_t> regcache;
-
-  enum target_stop_reason stop_reason;
-
-  CORE_ADDR watch_data_address;
-
-  int core;
-};
-
 /* Return the length of the stop reply queue.  */
 
 int
@@ -7877,6 +7920,8 @@ remote_target::wait_ns (ptid_t ptid, struct target_waitstatus *status,
   int ret;
   int is_notif = 0;
 
+  gdb_assert (!this->commit_resumed_state);
+
   /* If in non-stop mode, get out of getpkt even if a
      notification is received.	*/
 
-- 
2.29.2



* [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
                   ` (3 preceding siblings ...)
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
@ 2021-01-08  4:17 ` Simon Marchi
  2021-01-08 18:19   ` Andrew Burgess
  2021-01-09 21:26   ` Pedro Alves
  4 siblings, 2 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08  4:17 UTC (permalink / raw)
  To: gdb-patches

From: Andrew Burgess <andrew.burgess@embecosm.com>

New in v3 (Simon Marchi): rely on the resume state saved in
remote_thread_info to find the first non-exited, resumed thread.  This
simplifies the code a bit, as we don't need to fall back on the first
non-exited thread on initial connection.

This commit builds on work started in the following two commits:

  commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
  Date:   Thu Jan 30 14:35:40 2020 +0000

      gdb/remote: Restore support for 'S' stop reply packet

  commit cada5fc921e39a1945c422eea055c8b326d8d353
  Date:   Wed Mar 11 12:30:13 2020 +0000

      gdb: Handle W and X remote packets without giving a warning

This is related to how GDB handles remote targets that send back 'S'
packets.

In the first of the above commits we fixed GDB's ability to handle a
single process, single threaded target that sends back 'S' packets.
Although the 'T' packet would always be preferred to 'S' these days,
there's nothing really wrong with 'S' for this situation.

The second commit above fixed an oversight in the first commit: a
single-process, multi-threaded target can send back a process-wide
event, for example the process exited event 'W', without including a
process-id.  This is also fine, as there is no ambiguity in this case.

In PR gdb/26819 we run into yet another problem with the above
commits.  In this case we have a single process with two threads, GDB
hits a breakpoint in thread 2 and then performs a stepi:

  (gdb) b main
  Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
  (gdb) c
  Continuing.

  Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
  10    in infinite_loop.S
  (gdb) set debug remote 1
  (gdb) stepi
  Sending packet: $vCont;s:2#24...Packet received: S05
  ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

What happens in this case is that on the RISC-V target displaced
stepping is not supported, so when the stepi is issued GDB steps just
thread 2.  As only a single thread was set running the target decides
that it can get away with sending back an 'S' packet without a
thread-id.  GDB then associates the stop with thread 1 (the first
non-exited thread), but as thread 1 was not previously set executing
the assertion seen above triggers.
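
To make the difference between the two packet types concrete, here is a
minimal standalone sketch (not GDB code; `stop_reply_thread_field` is a
hypothetical helper written only for this illustration).  It assumes the
usual "T AA n:r;..." layout of a 'T' stop reply and only looks for the
`thread:` field, ignoring everything else in the packet:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of GDB.  If PACKET is a 'T' stop reply
// carrying a "thread:" field, store the field's text in *THREAD_ID and
// return true.  Otherwise return false and leave *THREAD_ID untouched.
bool
stop_reply_thread_field (const std::string &packet, std::string *thread_id)
{
  // An 'S' packet is just "S" plus a two-digit hex signal number, so it
  // has no room for a thread-id.
  if (packet.empty () || packet[0] != 'T')
    return false;

  const std::string key = "thread:";

  // A 'T' packet is "T" plus the signal number, followed by a list of
  // "name:value;" pairs.  Walk the pairs one at a time.
  std::size_t pos = 3;
  while (pos < packet.size ())
    {
      std::size_t end = packet.find (';', pos);
      if (end == std::string::npos)
	end = packet.size ();

      std::string pair = packet.substr (pos, end - pos);
      if (pair.compare (0, key.size (), key) == 0)
	{
	  *thread_id = pair.substr (key.size ());
	  return true;
	}

      pos = end + 1;
    }

  return false;
}
```

With this sketch, "S05" reports no thread-id at all, while "T05thread:2;"
reports "2".  The first case is the one GDB has to resolve by itself.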

As an aside I am surprised that the target sends back 'S' in this
situation.  The target is happy to send back 'T' (including thread-id)
when multiple threads are set running, so (to me) it would seem easier
to just always use the 'T' packet when multiple threads are in use.
However, the target only uses 'T' when multiple threads are actually
executing, otherwise an 'S' packet is used.

Still, when looking at the above situation we can see that GDB should
be able to understand which thread the 'S' reply is referring to.

The problem is that in commit 24ed6739b699 (above) when a stop
reply comes in with no thread-id we look for the first non-exited
thread and select that as the thread the stop applies to.

What we should really do is select the first non-exited, resumed thread,
and associate the stop event with this thread.  In the above example
both thread 1 and 2 are non-exited, but only thread 2 is resumed, so
this is what we should use.
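
The selection rule can be sketched in isolation as follows.  This is a
simplified model under stated assumptions, not the code from the patch:
`toy_thread`, `toy_resume_state`, `toy_selection` and
`select_thread_for_stop` are hypothetical names standing in for GDB's
thread list and the resume state tracked in remote_thread_info.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for GDB's data structures (assumptions for this sketch).
enum class toy_resume_state { NOT_RESUMED, RESUMED_PENDING_VCONT, RESUMED };

struct toy_thread
{
  int pid;
  int tid;
  bool exited;
  toy_resume_state state;
};

struct toy_selection
{
  const toy_thread *thread;	// First non-exited, resumed thread, or null.
  bool ambiguous;		// True if the choice was a guess.
};

// Pick the thread an id-less stop reply is assumed to refer to.
// STOP_IS_PROCESS_WIDE is true for exit-like events ('W' / 'X'), which
// apply to a whole process rather than to a single thread.
toy_selection
select_thread_for_stop (const std::vector<toy_thread> &threads,
			bool stop_is_process_wide)
{
  toy_selection result { nullptr, false };

  for (const toy_thread &thr : threads)
    {
      // Only non-exited threads that were actually resumed are candidates.
      if (thr.exited || thr.state != toy_resume_state::RESUMED)
	continue;

      if (result.thread == nullptr)
	result.thread = &thr;
      else if (!stop_is_process_wide || result.thread->pid != thr.pid)
	// A second candidate exists, so the stop reply is ambiguous.
	result.ambiguous = true;
    }

  return result;
}
```

In the stepi scenario above, thread 1 is not resumed and thread 2 is, so
the sketch picks thread 2 and does not flag the choice as ambiguous.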

There's a test for this issue included which works with stock
gdbserver by disabling use of the 'T' packet, and enabling
'scheduler-locking' within GDB so only one thread is set running.

gdb/ChangeLog:

	PR gdb/26819
	* remote.c
	(remote_target::select_thread_for_ambiguous_stop_reply): New
	member function.
	(remote_target::process_stop_reply): Call
	select_thread_for_ambiguous_stop_reply.

gdb/testsuite/ChangeLog:

	PR gdb/26819
	* gdb.server/stop-reply-no-thread-multi.c: New file.
	* gdb.server/stop-reply-no-thread-multi.exp: New file.

Change-Id: I9b49d76c2a99063dcc76203fa0f5270a72825d15
---
 gdb/remote.c                                  | 153 +++++++++++-------
 .../gdb.server/stop-reply-no-thread-multi.c   |  77 +++++++++
 .../gdb.server/stop-reply-no-thread-multi.exp | 136 ++++++++++++++++
 3 files changed, 312 insertions(+), 54 deletions(-)
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp

diff --git a/gdb/remote.c b/gdb/remote.c
index be53886c1837..f12a86f66a14 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
   ptid_t process_stop_reply (struct stop_reply *stop_reply,
 			     target_waitstatus *status);
 
+  ptid_t select_thread_for_ambiguous_stop_reply
+    (const struct target_waitstatus *status);
+
   void remote_notice_new_inferior (ptid_t currthread, int executing);
 
   void process_initial_stop_replies (int from_tty);
@@ -7796,75 +7799,117 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
   remote->remote_notif_get_pending_events (nc);
 }
 
-/* Called when it is decided that STOP_REPLY holds the info of the
-   event that is to be returned to the core.  This function always
-   destroys STOP_REPLY.  */
+/* Called from process_stop_reply when the stop packet we are responding
+   to didn't include a process-id or thread-id.  STATUS is the stop event
+   we are responding to.
+
+   It is the task of this function to select a suitable thread (or process)
+   and return its ptid, this is the thread (or process) we will assume the
+   stop event came from.
+
+   In some cases there isn't really any choice about which thread (or
+   process) is selected, a basic remote with a single process containing a
+   single thread might choose not to send any process-id or thread-id in
+   its stop packets, this function will select and return the one and only
+   thread.
+
+   However, if a target supports multiple threads (or processes) and still
+   doesn't include a thread-id (or process-id) in its stop packet then
+   first, this is a badly behaving target, and second, we're going to have
+   to select a thread (or process) at random and use that.  This function
+   will print a warning to the user if it detects that there is the
+   possibility that GDB is guessing which thread (or process) to
+   report.  */
 
 ptid_t
-remote_target::process_stop_reply (struct stop_reply *stop_reply,
-				   struct target_waitstatus *status)
+remote_target::select_thread_for_ambiguous_stop_reply
+  (const struct target_waitstatus *status)
 {
-  ptid_t ptid;
+  /* Some stop events apply to all threads in an inferior, while others
+     only apply to a single thread.  */
+  bool is_stop_for_all_threads
+    = (status->kind == TARGET_WAITKIND_EXITED
+       || status->kind == TARGET_WAITKIND_SIGNALLED);
 
-  *status = stop_reply->ws;
-  ptid = stop_reply->ptid;
+  thread_info *first_resumed_thread = nullptr;
+  bool multiple_resumed_thread = false;
 
-  /* If no thread/process was reported by the stub then use the first
-     non-exited thread in the current target.  */
-  if (ptid == null_ptid)
+  /* Consider all non-exited threads of the target, find the first resumed
+     one.  */
+  for (thread_info *thr : all_non_exited_threads (this))
     {
-      /* Some stop events apply to all threads in an inferior, while others
-	 only apply to a single thread.  */
-      bool is_stop_for_all_threads
-	= (status->kind == TARGET_WAITKIND_EXITED
-	   || status->kind == TARGET_WAITKIND_SIGNALLED);
+      remote_thread_info *remote_thr = get_remote_thread_info (thr);
+
+      if (remote_thr->resume_state () != resume_state::RESUMED)
+	continue;
+
+      if (first_resumed_thread == nullptr)
+	first_resumed_thread = thr;
+      else if (!is_stop_for_all_threads
+	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
+	multiple_resumed_thread = true;
+    }
 
-      for (thread_info *thr : all_non_exited_threads (this))
+  gdb_assert (first_resumed_thread != nullptr);
+
+  /* Warn if the remote target is sending ambiguous stop replies.  */
+  if (multiple_resumed_thread)
+    {
+      static bool warned = false;
+
+      if (!warned)
 	{
-	  if (ptid != null_ptid
-	      && (!is_stop_for_all_threads
-		  || ptid.pid () != thr->ptid.pid ()))
-	    {
-	      static bool warned = false;
+	  /* If you are seeing this warning then the remote target has
+	     stopped without specifying a thread-id, but the target
+	     does have multiple threads (or inferiors), and so GDB is
+	     having to guess which thread stopped.
 
-	      if (!warned)
-		{
-		  /* If you are seeing this warning then the remote target
-		     has stopped without specifying a thread-id, but the
-		     target does have multiple threads (or inferiors), and
-		     so GDB is having to guess which thread stopped.
-
-		     Examples of what might cause this are the target
-		     sending and 'S' stop packet, or a 'T' stop packet and
-		     not including a thread-id.
-
-		     Additionally, the target might send a 'W' or 'X
-		     packet without including a process-id, when the target
-		     has multiple running inferiors.  */
-		  if (is_stop_for_all_threads)
-		    warning (_("multi-inferior target stopped without "
-			       "sending a process-id, using first "
-			       "non-exited inferior"));
-		  else
-		    warning (_("multi-threaded target stopped without "
-			       "sending a thread-id, using first "
-			       "non-exited thread"));
-		  warned = true;
-		}
-	      break;
-	    }
+	     Examples of what might cause this are the target sending
+	     an 'S' stop packet, or a 'T' stop packet and not
+	     including a thread-id.
 
-	  /* If this is a stop for all threads then don't use a particular
-	     threads ptid, instead create a new ptid where only the pid
-	     field is set.  */
+	     Additionally, the target might send a 'W' or 'X' packet
+	     without including a process-id, when the target has
+	     multiple running inferiors.  */
 	  if (is_stop_for_all_threads)
-	    ptid = ptid_t (thr->ptid.pid ());
+	    warning (_("multi-inferior target stopped without "
+		       "sending a process-id, using first "
+		       "non-exited inferior"));
 	  else
-	    ptid = thr->ptid;
+	    warning (_("multi-threaded target stopped without "
+		       "sending a thread-id, using first "
+		       "non-exited thread"));
+	  warned = true;
 	}
-      gdb_assert (ptid != null_ptid);
     }
 
+  /* If this is a stop for all threads then don't use a particular thread's
+     ptid, instead create a new ptid where only the pid field is set.  */
+  if (is_stop_for_all_threads)
+    return ptid_t (first_resumed_thread->ptid.pid ());
+  else
+    return first_resumed_thread->ptid;
+}
+
+/* Called when it is decided that STOP_REPLY holds the info of the
+   event that is to be returned to the core.  This function always
+   destroys STOP_REPLY.  */
+
+ptid_t
+remote_target::process_stop_reply (struct stop_reply *stop_reply,
+				   struct target_waitstatus *status)
+{
+  ptid_t ptid;
+
+  *status = stop_reply->ws;
+  ptid = stop_reply->ptid;
+
+  /* If no thread/process was reported by the stub then select a suitable
+     thread/process.  */
+  if (ptid == null_ptid)
+    ptid = select_thread_for_ambiguous_stop_reply (status);
+  gdb_assert (ptid != null_ptid);
+
   if (status->kind != TARGET_WAITKIND_EXITED
       && status->kind != TARGET_WAITKIND_SIGNALLED
       && status->kind != TARGET_WAITKIND_NO_RESUMED)
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
new file mode 100644
index 000000000000..01f6d3c07ff4
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
@@ -0,0 +1,77 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include <unistd.h>
+
+volatile int worker_blocked = 1;
+volatile int main_blocked = 1;
+
+void
+unlock_worker (void)
+{
+  worker_blocked = 0;
+}
+
+void
+unlock_main (void)
+{
+  main_blocked = 0;
+}
+
+void
+breakpt (void)
+{
+  /* Nothing.  */
+}
+
+static void *
+worker (void *data)
+{
+  unlock_main ();
+
+  while (worker_blocked)
+    ;
+
+  breakpt ();
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t thr;
+  void *retval;
+
+  /* Ensure the test doesn't run forever.  */
+  alarm (99);
+
+  if (pthread_create (&thr, NULL, worker, NULL) != 0)
+    abort ();
+
+  while (main_blocked)
+    ;
+
+  unlock_worker ();
+
+  if (pthread_join (thr, &retval) != 0)
+    abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
new file mode 100644
index 000000000000..f394ca8ed0c4
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
@@ -0,0 +1,136 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2020 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test how GDB handles the case where a target either doesn't use 'T'
+# packets at all or doesn't include a thread-id in a 'T' packet, AND,
+# where the test program contains multiple threads.
+#
+# In general if multiple threads are executing and the target doesn't
+# include a thread-id in its stop response then GDB will not be able
+# to correctly figure out which thread the stop applies to.
+#
+# However, this test covers a very specific case, there are multiple
+# threads but only a single thread is actually executing.  So, when
+# the stop comes from the target, without a thread-id, GDB should be
+# able to correctly figure out which thread has stopped.
+
+load_lib gdbserver-support.exp
+
+if { [skip_gdbserver_tests] } {
+    verbose "skipping gdbserver tests"
+    return -1
+}
+
+standard_testfile
+if { [build_executable "failed to prepare" $testfile $srcfile {debug pthreads}] == -1 } {
+    return -1
+}
+
+# Run the tests with different features of GDBserver disabled.
+proc run_test { disable_feature } {
+    global binfile gdb_prompt decimal hex
+
+    clean_restart ${binfile}
+
+    # Make sure we're disconnected, in case we're testing with an
+    # extended-remote board, therefore already connected.
+    gdb_test "disconnect" ".*"
+
+    set packet_arg ""
+    if { $disable_feature != "" } {
+	set packet_arg "--disable-packet=${disable_feature}"
+    }
+    set res [gdbserver_start $packet_arg $binfile]
+    set gdbserver_protocol [lindex $res 0]
+    set gdbserver_gdbport [lindex $res 1]
+
+    # Disable XML-based thread listing, and multi-process extensions.
+    gdb_test_no_output "set remote threads-packet off"
+    gdb_test_no_output "set remote multiprocess-feature-packet off"
+
+    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
+    if ![gdb_assert {$res == 0} "connect"] {
+	return
+    }
+
+    # There should be only one thread listed at this point.
+    gdb_test_multiple "info threads" "" {
+	-re "2 Thread.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "has terminated.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
+	    pass $gdb_test_name
+	}
+    }
+
+    gdb_breakpoint "unlock_worker"
+    gdb_continue_to_breakpoint "run to unlock_worker"
+
+    # There should be two threads at this point with thread 1 selected.
+    gdb_test "info threads" \
+	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should now exist"
+
+    # Switch threads.
+    gdb_test "thread 2" ".*" "switch to second thread"
+
+    # Now turn on scheduler-locking so that when we step thread 2 only
+    # that one thread will be set running.
+    gdb_test_no_output "set scheduler-locking on"
+
+    # Single step thread 2.  Only the one thread will step.  When the
+    # thread stops, if the stop packet doesn't include a thread-id
+    # then GDB should still understand which thread stopped.
+    gdb_test_multiple "stepi" "" {
+	-re "Thread 1 received signal SIGTRAP" {
+	    fail $gdb_test_name
+	}
+	-re -wrap "$hex.*$decimal.*while \\(worker_blocked\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check that thread 2 is still selected.
+    gdb_test "info threads" \
+	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should still be selected after stepi"
+
+    # Turn scheduler locking off again so that when we continue all
+    # threads will be set running.
+    gdb_test_no_output "set scheduler-locking off"
+
+    # Continue until exit.  The server sends a 'W' with no PID.
+    # Bad GDB gave an error like below when target is nonstop:
+    #  (gdb) c
+    #  Continuing.
+    #  No process or thread specified in stop reply: W00
+    gdb_continue_to_end "" continue 1
+}
+
+# Disable different features within gdbserver:
+#
+# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
+#          emulating old gdbservers when debugging single-threaded programs.
+#
+# T: Start GDBserver with the entire 'T' stop reply packet disabled,
+#    GDBserver will instead send the 'S' stop reply.
+foreach_with_prefix to_disable { "" Tthread T } {
+    run_test $to_disable
+}
-- 
2.29.2



* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
@ 2021-01-08 15:41   ` Pedro Alves
  2021-01-08 18:56     ` Simon Marchi
  2021-01-18  5:16   ` Sebastian Huber
  1 sibling, 1 reply; 33+ messages in thread
From: Pedro Alves @ 2021-01-08 15:41 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

Hi,

This patch LGTM.  A couple of tiny issues below.

> gdb/ChangeLog:
> 
>         * remote.c (enum class resume_state): New.
>         (struct resumed_pending_vcont_info): New.
>         (struct remote_thread_info) <resume_state, set_not_resumed,
> 	set_resumed_pending_vcont, resumed_pending_vcont_info,
> 	set_resumed, m_resume_state, m_resumed_pending_vcont_info>:
> 	New.
> 	<last_resume_step, last_resume_sig, vcont_resumed>: Remove.
>         (remote_target::remote_add_thread): Adjust.
>         (remote_target::process_initial_stop_replies): Adjust.
>         (remote_target::resume): Adjust.
>         (remote_target::commit_resume): Rely on state in
> 	remote_thread_info and not on tp->executing.
>         (remote_target::process_stop_reply): Adjust.

Mind spaces vs tabs in the ChangeLog entry.

> +/* Information about a thread's pending vCont-resume.  Used when a thread is in
> +   the remote_resume_state::RESUMED_PENDING_VCONT state.  remote_target::resume
> +   stores this information which is them picked up by

them -> then

That's it.  :-)


* Re: [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace,full}.c
  2021-01-08  4:17 ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace, full}.c Simon Marchi
@ 2021-01-08 15:43   ` Pedro Alves
  2021-01-08 19:00     ` Simon Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Pedro Alves @ 2021-01-08 15:43 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

On 08/01/21 04:17, Simon Marchi wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The previous patch made the commit_resume implementations in the record
> targets unnecessary, as the remote target's commit_resume implementation
> won't commit-resume threads for which it didn't see a resume.  This
> patch removes them.
> 
> gdb/ChangeLog:
> 
>         * record-btrace.c (class record_btrace_target):
>         (record_btrace_target::commit_resume):
>         * record-full.c (class record_full_target):
>         (record_full_target::commit_resume):

Incomplete entry.

Otherwise LGTM.  I like how these two patches result in clearer code.  Nice.


* Re: [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target
  2021-01-08  4:17 ` [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target Simon Marchi
@ 2021-01-08 18:12   ` Andrew Burgess
  2021-01-08 19:01     ` Simon Marchi
  2021-01-09 20:29   ` Pedro Alves
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Burgess @ 2021-01-08 18:12 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Pedro Alves, Simon Marchi

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:32 -0500]:

> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The following patch will change the commit_resume target method to
> something stateful.  Because it would be difficult to track a state
> replicated in the various targets of a target stack, and since for the
> foreseeable future, only process stratum targets are going to use this
> concept, this patch makes the commit resume concept specific to process
> stratum targets.
> 
> So, move the method to process_stratum_target, and move helper functions
> to process-stratum-target.h.
> 
> gdb/ChangeLog:
> 
> 	* target.h (struct target_ops) <commit_resume>: New.
> 	(target_commit_resume): Remove.
> 	(make_scoped_defer_target_commit_resume): Remove.
> 	* target.c (defer_target_commit_resume): Remove.
> 	(target_commit_resume): Remove.
> 	(make_scoped_defer_target_commit_resume): Remove.
> 	* process-stratum-target.h (class process_stratum_target)
> 	<commit_resume>: New.
> 	(maybe_commit_resume_all_process_targets): New.
> 	(make_scoped_defer_process_target_commit_resume): New.
> 	* process-stratum-target.c (defer_process_target_commit_resume):
> 	New.
> 	(maybe_commit_resume_process_target): New.
> 	(make_scoped_defer_process_target_commit_resume): New.
> 	* infrun.c (do_target_resume): Adjust.
> 	(commit_resume_all_targets): Rename into...
> 	(maybe_commit_resume_all_process_targets): ... this, adjust.
> 	(proceed): Adjust.
> 	* record-full.c (record_full_wait_1): Adjust.
> 	* target-delegates.c: Re-generate.
> 
> Change-Id: Ifc957817ac5b2303e22760ce3d14740b9598f02c
> ---
>  gdb/infrun.c                 | 28 +++++++++-------------------
>  gdb/process-stratum-target.c | 23 +++++++++++++++++++++++
>  gdb/process-stratum-target.h | 29 +++++++++++++++++++++++++++++
>  gdb/record-full.c            |  8 ++++----
>  gdb/target-delegates.c       | 22 ----------------------
>  gdb/target.c                 | 22 ----------------------
>  gdb/target.h                 | 20 --------------------
>  7 files changed, 65 insertions(+), 87 deletions(-)
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 45bedf896419..1a27af51b7e9 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2172,7 +2172,7 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>  
>    target_resume (resume_ptid, step, sig);
>  
> -  target_commit_resume ();
> +  maybe_commit_resume_process_target (tp->inf->process_target ());
>  
>    if (target_can_async_p ())
>      target_async (1);
> @@ -2760,28 +2760,17 @@ schedlock_applies (struct thread_info *tp)
>  					    execution_direction)));
>  }
>  
> -/* Calls target_commit_resume on all targets.  */
> +/* Calls maybe_commit_resume_process_target on all process targets.  */
>  
>  static void
> -commit_resume_all_targets ()
> +maybe_commit_resume_all_process_targets ()
>  {
>    scoped_restore_current_thread restore_thread;
>  
> -  /* Map between process_target and a representative inferior.  This
> -     is to avoid committing a resume in the same target more than
> -     once.  Resumptions must be idempotent, so this is an
> -     optimization.  */
> -  std::unordered_map<process_stratum_target *, inferior *> conn_inf;
> -
> -  for (inferior *inf : all_non_exited_inferiors ())
> -    if (inf->has_execution ())
> -      conn_inf[inf->process_target ()] = inf;
> -
> -  for (const auto &ci : conn_inf)
> +  for (process_stratum_target *target : all_non_exited_process_targets ())
>      {
> -      inferior *inf = ci.second;
> -      switch_to_inferior_no_thread (inf);
> -      target_commit_resume ();
> +      switch_to_target_no_thread (target);
> +      maybe_commit_resume_process_target (target);
>      }
>  }
>  
> @@ -3005,7 +2994,8 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>    cur_thr->prev_pc = regcache_read_pc_protected (regcache);
>  
>    {
> -    scoped_restore save_defer_tc = make_scoped_defer_target_commit_resume ();
> +    scoped_restore save_defer_tc
> +      = make_scoped_defer_process_target_commit_resume ();
>  
>      started = start_step_over ();
>  
> @@ -3075,7 +3065,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>        }
>    }
>  
> -  commit_resume_all_targets ();
> +  maybe_commit_resume_all_process_targets ();
>  
>    finish_state.release ();
>  
> diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
> index 719167803fff..1436a550ac04 100644
> --- a/gdb/process-stratum-target.c
> +++ b/gdb/process-stratum-target.c
> @@ -108,3 +108,26 @@ switch_to_target_no_thread (process_stratum_target *target)
>        break;
>      }
>  }
> +
> +/* If true, `maybe_commit_resume_process_target` is a no-op.  */
> +
> +static bool defer_process_target_commit_resume;
> +
> +/* See target.h.  */

Should be 'process-stratum-target.h' now.

Thanks,
Andrew

> +
> +void
> +maybe_commit_resume_process_target (process_stratum_target *proc_target)
> +{
> +  if (defer_process_target_commit_resume)
> +    return;
> +
> +  proc_target->commit_resume ();
> +}
> +
> +/* See process-stratum-target.h.  */
> +
> +scoped_restore_tmpl<bool>
> +make_scoped_defer_process_target_commit_resume ()
> +{
> +  return make_scoped_restore (&defer_process_target_commit_resume, true);
> +}
> diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
> index b513c26ffc2a..c8060c46be93 100644
> --- a/gdb/process-stratum-target.h
> +++ b/gdb/process-stratum-target.h
> @@ -63,6 +63,20 @@ class process_stratum_target : public target_ops
>    bool has_registers () override;
>    bool has_execution (inferior *inf) override;
>  
> +  /* Commit a series of resumption requests previously prepared with
> +     resume calls.
> +
> +     GDB always calls `commit_resume` on the process stratum target after
> +     calling `resume` on a target stack.  A process stratum target may thus use
> +     this method in coordination with its `resume` method to batch resumption
> +     requests.  In that case, the target doesn't actually resume in its
> +     `resume` implementation.  Instead, it takes note of resumption intent in
> +     `resume`, and defers the actual resumption to `commit_resume`.
> +
> +     E.g., the remote target uses this to coalesce multiple resumption requests
> +     in a single vCont packet.  */
> +  virtual void commit_resume () {}
> +
>    /* True if any thread is, or may be executing.  We need to track
>       this separately because until we fully sync the thread list, we
>       won't know whether the target is fully stopped, even if we see
> @@ -92,4 +106,19 @@ extern std::set<process_stratum_target *> all_non_exited_process_targets ();
>  
>  extern void switch_to_target_no_thread (process_stratum_target *target);
>  
> +/* Commit a series of resumption requests previously prepared with
> +   target_resume calls.
> +
> +   This function is a no-op if commit resumes are deferred (see
> +   `make_scoped_defer_process_target_commit_resume`).  */
> +
> +extern void maybe_commit_resume_process_target
> +  (process_stratum_target *target);
> +
> +/* Setup to defer `commit_resume` calls, and re-set to the previous status on
> +   destruction.  */
> +
> +extern scoped_restore_tmpl<bool>
> +  make_scoped_defer_process_target_commit_resume ();
> +
>  #endif /* !defined (PROCESS_STRATUM_TARGET_H) */
> diff --git a/gdb/record-full.c b/gdb/record-full.c
> index 22eaaa4bb1bc..56ab29479874 100644
> --- a/gdb/record-full.c
> +++ b/gdb/record-full.c
> @@ -1242,11 +1242,11 @@ record_full_wait_1 (struct target_ops *ops,
>  			   break;
>    			}
>  
> +		      process_stratum_target *proc_target
> +			= current_inferior ()->process_target ();
> +
>  		      if (gdbarch_software_single_step_p (gdbarch))
>  			{
> -			  process_stratum_target *proc_target
> -			    = current_inferior ()->process_target ();
> -
>  			  /* Try to insert the software single step breakpoint.
>  			     If insert success, set step to 0.  */
>  			  set_executing (proc_target, inferior_ptid, false);
> @@ -1263,7 +1263,7 @@ record_full_wait_1 (struct target_ops *ops,
>  					    "issuing one more step in the "
>  					    "target beneath\n");
>  		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
> -		      ops->beneath ()->commit_resume ();
> +		      proc_target->commit_resume ();
>  		      continue;
>  		    }
>  		}
> diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
> index 437b19b8581c..8b933fdf82eb 100644
> --- a/gdb/target-delegates.c
> +++ b/gdb/target-delegates.c
> @@ -14,7 +14,6 @@ struct dummy_target : public target_ops
>    void detach (inferior *arg0, int arg1) override;
>    void disconnect (const char *arg0, int arg1) override;
>    void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
> -  void commit_resume () override;
>    ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
>    void fetch_registers (struct regcache *arg0, int arg1) override;
>    void store_registers (struct regcache *arg0, int arg1) override;
> @@ -185,7 +184,6 @@ struct debug_target : public target_ops
>    void detach (inferior *arg0, int arg1) override;
>    void disconnect (const char *arg0, int arg1) override;
>    void resume (ptid_t arg0, int arg1, enum gdb_signal arg2) override;
> -  void commit_resume () override;
>    ptid_t wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2) override;
>    void fetch_registers (struct regcache *arg0, int arg1) override;
>    void store_registers (struct regcache *arg0, int arg1) override;
> @@ -440,26 +438,6 @@ debug_target::resume (ptid_t arg0, int arg1, enum gdb_signal arg2)
>    fputs_unfiltered (")\n", gdb_stdlog);
>  }
>  
> -void
> -target_ops::commit_resume ()
> -{
> -  this->beneath ()->commit_resume ();
> -}
> -
> -void
> -dummy_target::commit_resume ()
> -{
> -}
> -
> -void
> -debug_target::commit_resume ()
> -{
> -  fprintf_unfiltered (gdb_stdlog, "-> %s->commit_resume (...)\n", this->beneath ()->shortname ());
> -  this->beneath ()->commit_resume ();
> -  fprintf_unfiltered (gdb_stdlog, "<- %s->commit_resume (", this->beneath ()->shortname ());
> -  fputs_unfiltered (")\n", gdb_stdlog);
> -}
> -
>  ptid_t
>  target_ops::wait (ptid_t arg0, struct target_waitstatus *arg1, target_wait_flags arg2)
>  {
> diff --git a/gdb/target.c b/gdb/target.c
> index 3a03a0ad530e..3a5270e5a416 100644
> --- a/gdb/target.c
> +++ b/gdb/target.c
> @@ -2062,28 +2062,6 @@ target_resume (ptid_t ptid, int step, enum gdb_signal signal)
>    clear_inline_frame_state (curr_target, ptid);
>  }
>  
> -/* If true, target_commit_resume is a nop.  */
> -static int defer_target_commit_resume;
> -
> -/* See target.h.  */
> -
> -void
> -target_commit_resume (void)
> -{
> -  if (defer_target_commit_resume)
> -    return;
> -
> -  current_top_target ()->commit_resume ();
> -}
> -
> -/* See target.h.  */
> -
> -scoped_restore_tmpl<int>
> -make_scoped_defer_target_commit_resume ()
> -{
> -  return make_scoped_restore (&defer_target_commit_resume, 1);
> -}
> -
>  void
>  target_pass_signals (gdb::array_view<const unsigned char> pass_signals)
>  {
> diff --git a/gdb/target.h b/gdb/target.h
> index e1a1d7a9226b..a252c29eafb4 100644
> --- a/gdb/target.h
> +++ b/gdb/target.h
> @@ -478,8 +478,6 @@ struct target_ops
>  			 int TARGET_DEBUG_PRINTER (target_debug_print_step),
>  			 enum gdb_signal)
>        TARGET_DEFAULT_NORETURN (noprocess ());
> -    virtual void commit_resume ()
> -      TARGET_DEFAULT_IGNORE ();
>      /* See target_wait's description.  Note that implementations of
>         this method must not assume that inferior_ptid on entry is
>         pointing at the thread or inferior that ends up reporting an
> @@ -1431,24 +1429,6 @@ extern void target_disconnect (const char *, int);
>     target_commit_resume below.  */
>  extern void target_resume (ptid_t ptid, int step, enum gdb_signal signal);
>  
> -/* Commit a series of resumption requests previously prepared with
> -   target_resume calls.
> -
> -   GDB always calls target_commit_resume after calling target_resume
> -   one or more times.  A target may thus use this method in
> -   coordination with the target_resume method to batch target-side
> -   resumption requests.  In that case, the target doesn't actually
> -   resume in its target_resume implementation.  Instead, it prepares
> -   the resumption in target_resume, and defers the actual resumption
> -   to target_commit_resume.  E.g., the remote target uses this to
> -   coalesce multiple resumption requests in a single vCont packet.  */
> -extern void target_commit_resume ();
> -
> -/* Setup to defer target_commit_resume calls, and reactivate
> -   target_commit_resume on destruction, if it was previously
> -   active.  */
> -extern scoped_restore_tmpl<int> make_scoped_defer_target_commit_resume ();
> -
>  /* For target_read_memory see target/target.h.  */
>  
>  /* The default target_ops::to_wait implementation.  */
> -- 
> 2.29.2
> 

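For readers following the thread, the batching that the new `commit_resume` comment describes can be sketched as a small standalone program.  This is only an illustration under stated assumptions, not GDB code: `batching_target`, `scoped_defer_commit_resume`, `maybe_commit_resume` and `resume_one` are hypothetical names, and the "packet" is a toy string rather than the real vCont wire format.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a process stratum target that batches
// resumption requests.  Not the real remote target.
struct batching_target
{
  std::vector<int> pending;       // Threads noted by resume ().
  std::vector<std::string> sent;  // Toy "packets" actually sent.

  // Only take note of the resumption intent.
  void resume (int thread_id) { pending.push_back (thread_id); }

  // Flush all noted requests as one toy packet.
  void commit_resume ()
  {
    if (pending.empty ())
      return;

    std::string packet = "vCont";
    for (int id : pending)
      packet += ";c:" + std::to_string (id);

    sent.push_back (packet);
    pending.clear ();
  }
};

// If true, maybe_commit_resume is a no-op.
static bool defer_commit_resume = false;

// RAII helper: defer commits for the lifetime of the object, then
// restore the previous value (hypothetical analogue of a scoped_restore).
struct scoped_defer_commit_resume
{
  scoped_defer_commit_resume () : m_saved (defer_commit_resume)
  { defer_commit_resume = true; }

  ~scoped_defer_commit_resume () { defer_commit_resume = m_saved; }

  scoped_defer_commit_resume (const scoped_defer_commit_resume &) = delete;
  scoped_defer_commit_resume &operator=
    (const scoped_defer_commit_resume &) = delete;

private:
  bool m_saved;
};

void
maybe_commit_resume (batching_target &target)
{
  if (defer_commit_resume)
    return;

  target.commit_resume ();
}

// Resume one thread, then commit unless commits are deferred.
void
resume_one (batching_target &target, int thread_id)
{
  target.resume (thread_id);
  maybe_commit_resume (target);
}
```

With the defer object alive, several `resume_one` calls only accumulate requests; a single `maybe_commit_resume` call after it goes out of scope then sends them together.
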
* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-08  4:17 ` [PATCH v3 5/5] gdb: better handling of 'S' packets Simon Marchi
@ 2021-01-08 18:19   ` Andrew Burgess
  2021-01-08 19:11     ` Simon Marchi
  2021-01-09 21:26   ` Pedro Alves
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Burgess @ 2021-01-08 18:19 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Pedro Alves

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:34 -0500]:

> From: Andrew Burgess <andrew.burgess@embecosm.com>
> 
> New in v3 (Simon Marchi): rely on the resume state saved in
> remote_thread_info to find the first non-exited, resumed thread.  This
> simplifies the code a bit, as we don't need to fall back on the first
> non-exited thread on initial connection.
> 
> This commit builds on work started in the following two commits:
> 
>   commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
>   Date:   Thu Jan 30 14:35:40 2020 +0000
> 
>       gdb/remote: Restore support for 'S' stop reply packet
> 
>   commit cada5fc921e39a1945c422eea055c8b326d8d353
>   Date:   Wed Mar 11 12:30:13 2020 +0000
> 
>       gdb: Handle W and X remote packets without giving a warning
> 
> This is related to how GDB handles remote targets that send back 'S'
> packets.
> 
> In the first of the above commits we fixed GDB's ability to handle a
> single process, single threaded target that sends back 'S' packets.
> Although the 'T' packet would always be preferred to 'S' these days,
> there's nothing really wrong with 'S' for this situation.
> 
> The second commit above fixed an oversight in the first commit: a
> single-process, multi-threaded target can send back a process-wide
> event, for example the process exited event 'W', without including a
> process-id.  This is also fine, as there is no ambiguity in this case.
> 
> In PR gdb/26819 we run into yet another problem with the above
> commits.  In this case we have a single process with two threads, GDB
> hits a breakpoint in thread 2 and then performs a stepi:
> 
>   (gdb) b main
>   Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
>   (gdb) c
>   Continuing.
> 
>   Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
>   10    in infinite_loop.S
>   (gdb) set debug remote 1
>   (gdb) stepi
>   Sending packet: $vCont;s:2#24...Packet received: S05
>   ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> 
> What happens in this case is that on the RISC-V target displaced
> stepping is not supported, so when the stepi is issued GDB steps just
> thread 2.  As only a single thread was set running the target decides
> that it can get away with sending back an 'S' packet without a
> thread-id.  GDB then associates the stop with thread 1 (the first
> non-exited thread), but as thread 1 was not previously set executing
> the assertion seen above triggers.
> 
> As an aside I am surprised that the target sends back 'S' in this
> situation.  The target is happy to send back 'T' (including thread-id)
> when multiple threads are set running, so (to me) it would seem easier
> to just always use the 'T' packet when multiple threads are in use.
> However, the target only uses 'T' when multiple threads are actually
> executing, otherwise an 'S' packet is used.
> 
> Still, when looking at the above situation we can see that GDB should
> be able to understand which thread the 'S' reply is referring to.
> 
> The problem is that in commit 24ed6739b699 (above) when a stop
> reply comes in with no thread-id we look for the first non-exited
> thread and select that as the thread the stop applies to.
> 
> What we should really do is select the first non-exited, resumed thread,
> and associate the stop event with this thread.  In the above example
> both thread 1 and 2 are non-exited, but only thread 2 is resumed, so
> this is what we should use.
> 
> There's a test for this issue included which works with stock
> gdbserver by disabling use of the 'T' packet, and enabling
> 'scheduler-locking' within GDB so only one thread is set running.
> 
> gdb/ChangeLog:
> 
> 	PR gdb/26819
> 	* remote.c
> 	(remote_target::select_thread_for_ambiguous_stop_reply): New
> 	member function.
> 	(remote_target::process_stop_reply): Call
> 	select_thread_for_ambiguous_stop_reply.
> 
> gdb/testsuite/ChangeLog:
> 
> 	PR gdb/26819
> 	* gdb.server/stop-reply-no-thread-multi.c: New file.
> 	* gdb.server/stop-reply-no-thread-multi.exp: New file.

Simon,

Thanks for integrating this with your series.  I like that this patch
has gotten even simpler (and clearer) now.

I had a few minor nits, but otherwise I'm happy with this.

Thanks,
Andrew
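
To spell out the selection rule from the commit message (pick the first non-exited, resumed thread, and note when the choice was a guess), here is a minimal standalone sketch.  It is not the code from remote.c: `toy_thread`, `selection` and `select_thread` are hypothetical names, and a `tid` of -1 is an assumed convention for "the whole process".

```cpp
#include <cassert>
#include <vector>

enum class resume_state { NOT_RESUMED, RESUMED };

// Hypothetical, simplified view of a thread as tracked by the target.
struct toy_thread
{
  int pid;
  int tid;
  bool exited;
  resume_state state;
};

// Result of the selection.  TID is -1 when the stop applies to the
// whole process.  AMBIGUOUS is true when more than one candidate was
// resumed, i.e. when the choice was a guess.
struct selection
{
  int pid;
  int tid;
  bool ambiguous;
};

selection
select_thread (const std::vector<toy_thread> &threads,
	       bool stop_for_all_threads)
{
  const toy_thread *first_resumed = nullptr;
  bool multiple_resumed = false;

  for (const toy_thread &thr : threads)
    {
      // Only non-exited, resumed threads can have reported the stop.
      if (thr.exited || thr.state != resume_state::RESUMED)
	continue;

      if (first_resumed == nullptr)
	first_resumed = &thr;
      else if (!stop_for_all_threads || first_resumed->pid != thr.pid)
	multiple_resumed = true;
    }

  // A stop reply implies that at least one thread was resumed.
  assert (first_resumed != nullptr);

  if (stop_for_all_threads)
    return { first_resumed->pid, -1, multiple_resumed };

  return { first_resumed->pid, first_resumed->tid, multiple_resumed };
}
```

In the PR gdb/26819 scenario (threads 1 and 2 alive, only thread 2 resumed), this picks thread 2 and reports no ambiguity.
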

> 
> Change-Id: I9b49d76c2a99063dcc76203fa0f5270a72825d15
> ---
>  gdb/remote.c                                  | 153 +++++++++++-------
>  .../gdb.server/stop-reply-no-thread-multi.c   |  77 +++++++++
>  .../gdb.server/stop-reply-no-thread-multi.exp | 136 ++++++++++++++++
>  3 files changed, 312 insertions(+), 54 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
>  create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index be53886c1837..f12a86f66a14 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
>    ptid_t process_stop_reply (struct stop_reply *stop_reply,
>  			     target_waitstatus *status);
>  
> +  ptid_t select_thread_for_ambiguous_stop_reply
> +    (const struct target_waitstatus *status);
> +
>    void remote_notice_new_inferior (ptid_t currthread, int executing);
>  
>    void process_initial_stop_replies (int from_tty);
> @@ -7796,75 +7799,117 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>    remote->remote_notif_get_pending_events (nc);
>  }
>  
> -/* Called when it is decided that STOP_REPLY holds the info of the
> -   event that is to be returned to the core.  This function always
> -   destroys STOP_REPLY.  */
> +/* Called from process_stop_reply when the stop packet we are responding
> +   to didn't include a process-id or thread-id.  STATUS is the stop event
> +   we are responding to.
> +
> +   It is the task of this function to select a suitable thread (or process)
> +   and return its ptid, this is the thread (or process) we will assume the
> +   stop event came from.
> +
> +   In some cases there isn't really any choice about which thread (or
> +   process) is selected, a basic remote with a single process containing a
> +   single thread might choose not to send any process-id or thread-id in
> +   its stop packets, this function will select and return the one and only
> +   thread.
> +
> +   However, if a target supports multiple threads (or processes) and still
> +   doesn't include a thread-id (or process-id) in its stop packet then
> +   first, this is a badly behaving target, and second, we're going to have
> +   to select a thread (or process) at random and use that.  This function
> +   will print a warning to the user if it detects that there is the
> +   possibility that GDB is guessing which thread (or process) to
> +   report.  */
>  
>  ptid_t
> -remote_target::process_stop_reply (struct stop_reply *stop_reply,
> -				   struct target_waitstatus *status)
> +remote_target::select_thread_for_ambiguous_stop_reply
> +  (const struct target_waitstatus *status)
>  {
> -  ptid_t ptid;
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
>  
> -  *status = stop_reply->ws;
> -  ptid = stop_reply->ptid;
> +  thread_info *first_resumed_thread = nullptr;
> +  bool multiple_resumed_thread = false;

This might be better named 'multiple_resumed_threads'.  Apologies if
this was just copied from my original code.

>  
> -  /* If no thread/process was reported by the stub then use the first
> -     non-exited thread in the current target.  */
> -  if (ptid == null_ptid)
> +  /* Consider all non-exited threads of the target, find the first resumed
> +     one.  */
> +  for (thread_info *thr : all_non_exited_threads (this))
>      {
> -      /* Some stop events apply to all threads in an inferior, while others
> -	 only apply to a single thread.  */
> -      bool is_stop_for_all_threads
> -	= (status->kind == TARGET_WAITKIND_EXITED
> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
> +      remote_thread_info *remote_thr =get_remote_thread_info (thr);

Missing space after '='.

> +
> +      if (remote_thr->resume_state () != resume_state::RESUMED)
> +	continue;
> +
> +      if (first_resumed_thread == nullptr)
> +	first_resumed_thread = thr;
> +      else if (!is_stop_for_all_threads
> +	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
> +	multiple_resumed_thread = true;
> +    }
>  
> -      for (thread_info *thr : all_non_exited_threads (this))
> +  gdb_assert (first_resumed_thread != nullptr);
> +
> +  /* Warn if the remote target is sending ambiguous stop replies.  */
> +  if (multiple_resumed_thread)
> +    {
> +      static bool warned = false;
> +
> +      if (!warned)
>  	{
> -	  if (ptid != null_ptid
> -	      && (!is_stop_for_all_threads
> -		  || ptid.pid () != thr->ptid.pid ()))
> -	    {
> -	      static bool warned = false;
> +	  /* If you are seeing this warning then the remote target has
> +	     stopped without specifying a thread-id, but the target
> +	     does have multiple threads (or inferiors), and so GDB is
> +	     having to guess which thread stopped.
>  
> -	      if (!warned)
> -		{
> -		  /* If you are seeing this warning then the remote target
> -		     has stopped without specifying a thread-id, but the
> -		     target does have multiple threads (or inferiors), and
> -		     so GDB is having to guess which thread stopped.
> -
> -		     Examples of what might cause this are the target
> -		     sending and 'S' stop packet, or a 'T' stop packet and
> -		     not including a thread-id.
> -
> -		     Additionally, the target might send a 'W' or 'X
> -		     packet without including a process-id, when the target
> -		     has multiple running inferiors.  */
> -		  if (is_stop_for_all_threads)
> -		    warning (_("multi-inferior target stopped without "
> -			       "sending a process-id, using first "
> -			       "non-exited inferior"));
> -		  else
> -		    warning (_("multi-threaded target stopped without "
> -			       "sending a thread-id, using first "
> -			       "non-exited thread"));
> -		  warned = true;
> -		}
> -	      break;
> -	    }
> +	     Examples of what might cause this are the target sending
> +	     an 'S' stop packet, or a 'T' stop packet and not
> +	     including a thread-id.
>  
> -	  /* If this is a stop for all threads then don't use a particular
> -	     threads ptid, instead create a new ptid where only the pid
> -	     field is set.  */
> +	     Additionally, the target might send a 'W' or 'X' packet
> +	     without including a process-id, when the target has
> +	     multiple running inferiors.  */
>  	  if (is_stop_for_all_threads)
> -	    ptid = ptid_t (thr->ptid.pid ());
> +	    warning (_("multi-inferior target stopped without "
> +		       "sending a process-id, using first "
> +		       "non-exited inferior"));
>  	  else
> -	    ptid = thr->ptid;
> +	    warning (_("multi-threaded target stopped without "
> +		       "sending a thread-id, using first "
> +		       "non-exited thread"));
> +	  warned = true;
>  	}
> -      gdb_assert (ptid != null_ptid);
>      }
>  
> +  /* If this is a stop for all threads then don't use a particular threads
> +     ptid, instead create a new ptid where only the pid field is set.  */
> +  if (is_stop_for_all_threads)
> +    return ptid_t (first_resumed_thread->ptid.pid ());
> +  else
> +    return first_resumed_thread->ptid;
> +}
> +
> +/* Called when it is decided that STOP_REPLY holds the info of the
> +   event that is to be returned to the core.  This function always
> +   destroys STOP_REPLY.  */
> +
> +ptid_t
> +remote_target::process_stop_reply (struct stop_reply *stop_reply,
> +				   struct target_waitstatus *status)
> +{
> +  ptid_t ptid;

Shouldn't this just be inline below?

> +
> +  *status = stop_reply->ws;
> +  ptid = stop_reply->ptid;
> +
> +  /* If no thread/process was reported by the stub then select a suitable
> +     thread/process.  */
> +  if (ptid == null_ptid)
> +    ptid = select_thread_for_ambiguous_stop_reply (status);
> +  gdb_assert (ptid != null_ptid);
> +
>    if (status->kind != TARGET_WAITKIND_EXITED
>        && status->kind != TARGET_WAITKIND_SIGNALLED
>        && status->kind != TARGET_WAITKIND_NO_RESUMED)
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> new file mode 100644
> index 000000000000..01f6d3c07ff4
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
> @@ -0,0 +1,77 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020 Free Software Foundation, Inc.

The copyright year will need updating now.

> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdlib.h>
> +#include <pthread.h>
> +#include <unistd.h>
> +
> +volatile int worker_blocked = 1;
> +volatile int main_blocked = 1;
> +
> +void
> +unlock_worker (void)
> +{
> +  worker_blocked = 0;
> +}
> +
> +void
> +unlock_main (void)
> +{
> +  main_blocked = 0;
> +}
> +
> +void
> +breakpt (void)
> +{
> +  /* Nothing.  */
> +}
> +
> +static void *
> +worker (void *data)
> +{
> +  unlock_main ();
> +
> +  while (worker_blocked)
> +    ;
> +
> +  breakpt ();
> +
> +  return NULL;
> +}
> +
> +int
> +main (void)
> +{
> +  pthread_t thr;
> +  void *retval;
> +
> +  /* Ensure the test doesn't run forever.  */
> +  alarm (99);
> +
> +  if (pthread_create (&thr, NULL, worker, NULL) != 0)
> +    abort ();
> +
> +  while (main_blocked)
> +    ;
> +
> +  unlock_worker ();
> +
> +  if (pthread_join (thr, &retval) != 0)
> +    abort ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> new file mode 100644
> index 000000000000..f394ca8ed0c4
> --- /dev/null
> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
> @@ -0,0 +1,136 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2020 Free Software Foundation, Inc.

And again.

> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test how GDB handles the case where a target either doesn't use 'T'
> +# packets at all or doesn't include a thread-id in a 'T' packet, AND,
> +# where the test program contains multiple threads.
> +#
> +# In general if multiple threads are executing and the target doesn't
> +# include a thread-id in its stop response then GDB will not be able
> +# to correctly figure out which thread the stop applies to.
> +#
> +# However, this test covers a very specific case, there are multiple
> +# threads but only a single thread is actually executing.  So, when
> +# the stop comes from the target, without a thread-id, GDB should be
> +# able to correctly figure out which thread has stopped.
> +
> +load_lib gdbserver-support.exp
> +
> +if { [skip_gdbserver_tests] } {
> +    verbose "skipping gdbserver tests"
> +    return -1
> +}
> +
> +standard_testfile
> +if { [build_executable "failed to prepare" $testfile $srcfile {debug pthreads}] == -1 } {
> +    return -1
> +}
> +
> +# Run the tests with different features of GDBserver disabled.
> +proc run_test { disable_feature } {
> +    global binfile gdb_prompt decimal hex
> +
> +    clean_restart ${binfile}
> +
> +    # Make sure we're disconnected, in case we're testing with an
> +    # extended-remote board, therefore already connected.
> +    gdb_test "disconnect" ".*"
> +
> +    set packet_arg ""
> +    if { $disable_feature != "" } {
> +	set packet_arg "--disable-packet=${disable_feature}"
> +    }
> +    set res [gdbserver_start $packet_arg $binfile]
> +    set gdbserver_protocol [lindex $res 0]
> +    set gdbserver_gdbport [lindex $res 1]
> +
> +    # Disable XML-based thread listing, and multi-process extensions.
> +    gdb_test_no_output "set remote threads-packet off"
> +    gdb_test_no_output "set remote multiprocess-feature-packet off"
> +
> +    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
> +    if ![gdb_assert {$res == 0} "connect"] {
> +	return
> +    }
> +
> +    # There should be only one thread listed at this point.
> +    gdb_test_multiple "info threads" "" {
> +	-re "2 Thread.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "has terminated.*$gdb_prompt $" {
> +	    fail $gdb_test_name
> +	}
> +	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    gdb_breakpoint "unlock_worker"
> +    gdb_continue_to_breakpoint "run to unlock_worker"
> +
> +    # There should be two threads at this point with thread 1 selected.
> +    gdb_test "info threads" \
> +	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should now exist"
> +
> +    # Switch threads.
> +    gdb_test "thread 2" ".*" "switch to second thread"
> +
> +    # Now turn on scheduler-locking so that when we step thread 2 only
> +    # that one thread will be set running.
> +    gdb_test_no_output "set scheduler-locking on"
> +
> +    # Single step thread 2.  Only the one thread will step.  When the
> +    # thread stops, if the stop packet doesn't include a thread-id
> +    # then GDB should still understand which thread stopped.
> +    gdb_test_multiple "stepi" "" {
> +	-re "Thread 1 received signal SIGTRAP" {
> +	    fail $gdb_test_name
> +	}
> +	-re -wrap "$hex.*$decimal.*while \\(worker_blocked\\).*" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # Check that thread 2 is still selected.
> +    gdb_test "info threads" \
> +	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
> +	"second thread should still be selected after stepi"
> +
> +    # Turn scheduler locking off again so that when we continue all
> +    # threads will be set running.
> +    gdb_test_no_output "set scheduler-locking off"
> +
> +    # Continue until exit.  The server sends a 'W' with no PID.
> +    # Bad GDB gave an error like below when target is nonstop:
> +    #  (gdb) c
> +    #  Continuing.
> +    #  No process or thread specified in stop reply: W00
> +    gdb_continue_to_end "" continue 1
> +}
> +
> +# Disable different features within gdbserver:
> +#
> +# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
> +#          emulating old gdbservers when debugging single-threaded programs.
> +#
> +# T: Start GDBserver with the entire 'T' stop reply packet disabled,
> +#    GDBserver will instead send the 'S' stop reply.
> +foreach_with_prefix to_disable { "" Tthread T } {
> +    run_test $to_disable
> +}
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
@ 2021-01-08 18:34   ` Andrew Burgess
  2021-01-08 19:04     ` Simon Marchi
  2021-01-09 20:34   ` Pedro Alves
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Andrew Burgess @ 2021-01-08 18:34 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Pedro Alves, Simon Marchi

* Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:33 -0500]:

> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the target
> when doing successive operations.  I'll start with explaining the
> rationale and then go over the implementation.  In the ROCm / GPU world,
> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
> read it from a GPU standpoint, just s/thread/wave/.
> 
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are controlled
> as a group.  To stop one thread in a group of running ones, the state of
> all threads is retrieved from the GPU, all threads are destroyed, and all
> threads but the one we want to stop are re-created from the saved state.
> The net result, from the point of view of GDB, is that the library
> stopped one thread.  The same thing goes if we want to resume one thread
> while others are running: the state of all running threads is retrieved
> from the GPU, they are all destroyed, and they are all re-created,
> including the thread we want to resume.
> 
> This leads to some inefficiencies when combined with how GDB works, here
> are two examples:
> 
>  - Stopping all threads: because the target operates in non-stop mode,
>    when the user interface mode is all-stop, GDB must stop all threads
>    individually when presenting a stop.  Let's suppose we have 1000
>    threads and the user does ^C.  GDB asks the target to stop one
>    thread.  Behind the scenes, the library retrieves 1000 thread states
>    and restores the 999 other still-running ones.  GDB asks the target
>    to stop another one.  The target retrieves 999 thread states and
>    restores the 998 remaining ones.  That means that to stop 1000
>    threads, we did 1000 back and forths with the GPU.  It would have
>    been much better to just retrieve the states once and stop there.
> 
>  - Resuming with pending events: suppose the 1000 threads hit a
>    breakpoint at the same time.  The breakpoint is conditional and
>    evaluates to true for the first thread, to false for all others.  GDB
>    pulls one event (for the first thread) from the target, decides that
>    it should present a stop, so stops all threads using
>    stop_all_threads.  All these other threads have a breakpoint event to
>    report, which is saved in `thread_info::suspend::waitstatus` for
>    later.  When the user does "continue", GDB resumes that one thread
>    that did hit the breakpoint.  It then processes the pending events
>    one by one as if they just arrived.  It picks one, evaluates the
>    condition to false, and resumes the thread.  It picks another one,
>    evaluates the condition to false, and resumes the thread.  And so on.
>    In between each resumption, there is a full state retrieval and
>    re-creation.  It would be much nicer if we could wait a little bit
>    before sending those threads on the GPU, until it processed all those
>    pending events.
> 
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that allows
> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
> hold off putting the threads on the GPU until I'm done" (the "forward
> progress not required" state).  Turning forward progress back on
> indicates to the library that all threads that are supposed to be
> running should now be really running on the GPU.
> 
> It turns out that GDB has a similar concept, though not as general,
> commit_resume.  On difference is that commit_resume is not stateful: the

typo: 'On difference' ?

> target can't look up "does the core need me to schedule resumed threads
> for execution right now".  It is also specifically linked to the resume
> method, it is not used in other contexts.  The target accumulates
> resumption requests through target_ops::resume calls, and then commits
> those resumptions when target_ops::commit_resume is called.  The target
> has no way to check if it's ok to leave resumed threads stopped in other
> target methods.
> 
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".  I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".
> 
> In the new version, we have two things in process_stratum_target:
> 
>  - the commit_resumed_state field: indicates whether GDB requires this
>    target to have resumed threads committed to the execution
>    target/device.  If false, the target is allowed to leave resumed
>    threads un-committed at the end of whatever method it is executing.
> 
>  - the commit_resumed method: called when commit_resumed_state
>    transitions from false to true.  While commit_resumed_state was
>    false, the target may have left some resumed threads un-committed.
>    This method being called tells it that it should commit them back to
>    the execution device.
> 
> Let's take the "Stopping all threads" scenario from above and see how it
> would work with the ROCm target with this change.  Before stopping all
> threads, GDB would set the target's commit_resumed_state field to false.
> It would then ask the target to stop the first thread.  The target would
> retrieve all threads' state from the GPU and mark that one as stopped.
> Since commit_resumed_state is false, it leaves all the other threads
> (still resumed) stopped.  GDB would then proceed to call target_stop for
> all the other threads.  Since resumed threads are not committed, this
> doesn't do any back and forth with the GPU.
> 
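
The round-trip arithmetic in the "Stopping all threads" scenario can be
checked with a toy model.  This is only a sketch under stated assumptions:
fake_queue, stop_one_thread and count_round_trips are hypothetical names, not
part of GDB or ROCdbgapi, and one "round trip" is counted as one retrieval of
thread state from the GPU.

```cpp
#include <cassert>

// Hypothetical model of one GPU queue; this is not ROCdbgapi's interface.
struct fake_queue
{
  int running;                // threads that are still meant to be running
  bool on_device = true;      // are the resumed threads really on the GPU?
  bool commit_resumed_state;  // mirrors the flag described in the message
  int round_trips = 0;        // number of state retrievals from the GPU

  fake_queue (int n_threads, bool commit_state)
    : running (n_threads), commit_resumed_state (commit_state)
  {}

  // Stop one thread.  If the threads are on the device, all of their
  // states must be retrieved first, which costs one round trip.
  void stop_one_thread ()
  {
    assert (running > 0);

    if (on_device)
      {
        ++round_trips;
        on_device = false;
      }

    --running;

    // When commit_resumed_state is true, the remaining resumed threads
    // must be put back on the device before returning.
    if (commit_resumed_state && running > 0)
      on_device = true;
  }
};

// Stop N_THREADS threads one at a time and return the round trips used.
static int
count_round_trips (int n_threads, bool commit_state)
{
  fake_queue q (n_threads, commit_state);

  for (int i = 0; i < n_threads; ++i)
    q.stop_one_thread ();

  return q.round_trips;
}
```

In this model, count_round_trips (1000, true) gives 1000 and
count_round_trips (1000, false) gives 1, matching the numbers quoted in the
commit message.
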
> To simplify the implementation of targets, I made it so that when
> calling certain target methods, the contract between the core and the
> targets guarantees that commit_resumed_state is false.  This way, the
> target doesn't need two paths, one commit_resumed_state == true and one
> for commit_resumed_state == false.  It can just assert that
> commit_resumed_state is false and work with that assumption.  This also
> helps catch places where we forgot to disable commit_resumed_state
> before calling the method, which represents a probable optimization
> opportunity.
> 
> To have some confidence that this contract between the core and the
> targets is respected, I added assertions in the linux-nat target
> methods, even though the linux-nat target doesn't actually use that
> feature.  Since linux-nat is tested much more than other targets, this
> will help catch these issues quicker.
> 
> To ensure that commit_resumed_state is always turned back on (only if
> necessary, see below) and the commit_resumed method is called when doing
> so, I introduced the scoped_disable_commit_resumed RAII object, which
> replaces make_scoped_defer_process_target_commit_resume.  On
> construction, it clears the commit_resumed_state flag of all process
> targets.  On destruction, it turns it back on (if necessary) and calls
> the commit_resumed method.  The nested case is handled by having a
> "nesting" counter: only when the counter goes back to 0 is
> commit_resumed_state turned back on.
> 
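
The nesting behaviour described above can be sketched outside of GDB as
follows.  This is a simplified illustration, not the patch's code:
fake_target, fake_targets, guard_depth and scoped_disable_guard are
hypothetical stand-ins, and the "only if necessary" checks done on destruction
are left out, so the outermost guard always commits.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for process_stratum_target; only the two members
// discussed in the commit message are modelled.
struct fake_target
{
  bool commit_resumed_state = false;
  int commit_calls = 0;  // how many times commit_resumed () ran

  void commit_resumed () { ++commit_calls; }
};

// Hypothetical global list of targets.
static std::vector<fake_target *> fake_targets;

// Number of guard objects currently alive.
static int guard_depth = 0;

struct scoped_disable_guard
{
  scoped_disable_guard ()
  {
    for (fake_target *t : fake_targets)
      {
        if (guard_depth == 0)
          t->commit_resumed_state = false;     // outermost: clear the flag
        else
          assert (!t->commit_resumed_state);   // nested: already cleared
      }

    ++guard_depth;
  }

  ~scoped_disable_guard ()
  {
    assert (guard_depth > 0);

    // Only the outermost guard turns the flag back on and commits.
    if (--guard_depth != 0)
      return;

    for (fake_target *t : fake_targets)
      {
        t->commit_resumed_state = true;
        t->commit_resumed ();
      }
  }

  scoped_disable_guard (const scoped_disable_guard &) = delete;
  scoped_disable_guard &operator= (const scoped_disable_guard &) = delete;
};

// Create one outer guard and three nested ones, then return how many
// times the target was asked to commit.
static int
run_nested_example ()
{
  fake_target t;
  fake_targets = { &t };

  {
    scoped_disable_guard outer;

    for (int i = 0; i < 3; ++i)
      {
        scoped_disable_guard inner;
        assert (!t.commit_resumed_state);
      }

    // The inner guards are gone, but nothing was committed yet.
    assert (t.commit_calls == 0);
  }

  fake_targets.clear ();
  return t.commit_calls;
}
```

With these assumptions run_nested_example () returns 1: the three inner
guards never trigger a commit, only the outermost destructor does.
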
> On destruction, commit-resumed is not re-enabled for a given target if:
> 
>  1. this target has no threads resumed, or
>  2. this target at least one thread with a pending status known to the

Missing word: 'this target at least'?

>     core (saved in thread_info::suspend::waitstatus).
> 
> The first point is not technically necessary, because a proper
> commit_resumed implementation would be a no-op if the target has no
> resumed threads.  But since we have a flag to do a quick check, I think
> it doesn't hurt.
> 
> The second point is more important: together with the
> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
> makes it so the "Resuming with pending events" described above is
> handled efficiently.  Here's what happens in that case:
> 
>  1. The user types "continue".
>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>     function does not enable commit-resumed, as it sees other threads
>     have pending statuses.
>  3. fetch_inferior_event is called to handle another event, one thread
>     is resumed.  Because there are still more threads with pending
>     statuses, the destructor of scoped_disable_commit_resumed in
>     fetch_inferior_event still doesn't enable commit-resumed.
>  4. Rinse and repeat step 3, until the last pending status is handled by
>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>     destructor sees there are no more threads with pending statuses, so
>     it asks the target to commit resumed threads.
> 
> This allows us to avoid all unnecessary back and forths, there is a
> single commit_resumed call.
> 
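
The four steps above can be replayed with a small model to confirm that only
one commit happens.  This is a sketch with assumed names (fake_thread,
should_commit and simulate_continue are hypothetical); it keeps only the two
conditions from the commit message and assumes every breakpoint condition
evaluates to false, so each handled thread simply stays resumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread state, as seen by the core.
struct fake_thread
{
  bool resumed = false;
  bool pending_status = false;
};

// Return true if the target should be asked to commit its resumed
// threads: it needs at least one resumed thread, and no resumed thread
// may have a pending status.
static bool
should_commit (const std::vector<fake_thread> &threads)
{
  bool any_resumed = false;

  for (const fake_thread &t : threads)
    {
      if (!t.resumed)
        continue;

      if (t.pending_status)
        return false;

      any_resumed = true;
    }

  return any_resumed;
}

// Replay "continue" with N_THREADS threads where every thread but the
// first has a pending status.  Return the number of commits requested.
static int
simulate_continue (std::size_t n_threads)
{
  std::vector<fake_thread> threads (n_threads);
  int commits = 0;

  // Steps 1 and 2: the user continues, the check runs at the end of
  // the proceed step.
  for (std::size_t i = 0; i < n_threads; ++i)
    {
      threads[i].resumed = true;
      threads[i].pending_status = (i != 0);
    }

  if (should_commit (threads))
    ++commits;

  // Steps 3 and 4: each iteration handles one pending status, leaves
  // the thread resumed, and runs the check again.
  for (std::size_t i = 1; i < n_threads; ++i)
    {
      threads[i].pending_status = false;

      if (should_commit (threads))
        ++commits;
    }

  return commits;
}
```

In this model simulate_continue (1000) returns 1: the check fails after
each of the first 998 handled events and succeeds only once the last pending
status has been consumed.
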
> This change required remote_target::remote_stop_ns to learn how to
> handle stopping threads that were resumed but pending vCont.  The
> simplest example where that happens is when using the remote target in
> all-stop, but with "maint set target-non-stop on", to force it to
> operate in non-stop mode under the hood.  If two threads hit a
> breakpoint at the same time, GDB will receive two stop replies.  It will
> present the stop for one thread and save the other one in
> thread_info::suspend::waitstatus.
> 
> Before this patch, when doing "continue", GDB first resumes the thread
> without a pending status:
> 
>     Sending packet: $vCont;c:p172651.172676#f3
> 
> It then consumes the pending status in the next fetch_inferior_event
> call:
> 
>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>     [infrun] target_wait (-1.0.0, status) =
>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
> 
> It then realizes it needs to stop all threads to present the stop, so
> stops the thread it just resumed:
> 
>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>     remote_stop called
>     Sending packet: $vCont;t:p172651.172676#04
> 
> This is an unnecessary resume/stop.  With this patch, we don't commit
> resumed threads after proceeding, because of the pending status:
> 
>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
> 
> When GDB handles the pending status and stop_all_threads runs, we stop a
> resumed but pending vCont thread:
> 
>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
> 
> That thread was never actually resumed on the remote stub / gdbserver.
> This is why remote_stop_ns needed to learn this new trick of enqueueing
> phony stop replies.
> 
> Note that this patch only considers pending statuses known to the core
> of GDB, that is the events that were pulled out of the target and stored
> in `thread_info::suspend::waitstatus`.  In some cases, we could also
> avoid unnecessary back and forth when the target has events that it has
> not yet reported to the core.  I plan to implement this as a subsequent
> patch, once this series has settled.

I read through the commit message and convinced myself that it made
sense.  I ran out of time to look at the actual code.

Thanks,
Andrew


> 
> gdb/ChangeLog:
> 
> 	* infrun.h (struct scoped_disable_commit_resumed): New.
> 	* infrun.c (do_target_resume): Remove
> 	maybe_commit_resume_process_target call.
> 	(maybe_commit_resume_all_process_targets): Rename to...
> 	(maybe_commit_resumed_all_process_targets): ... this.  Skip
> 	targets that have no executing threads or resumed threads with
> 	a pending status.
> 	(scoped_disable_commit_resumed_depth): New.
> 	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
> 	New.
> 	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
> 	New.
> 	(proceed): Use scoped_disable_commit_resumed.
> 	(fetch_inferior_event): Use scoped_disable_commit_resumed.
> 	* process-stratum-target.h (class process_stratum_target):
> 	<commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	<commit_resumed_state>: New.
> 	(all_process_targets): New.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* process-stratum-target.c (all_process_targets): New.
> 	(defer_process_target_commit_resume): Remove.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
> 	(linux_nat_target::wait): Add gdb_assert.
> 	(linux_nat_target::stop): Add gdb_assert.
> 	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
> 	(attach_command): Use scoped_disable_commit_resumed.
> 	(detach_command): Use scoped_disable_commit_resumed.
> 	(interrupt_target_1): Use scoped_disable_commit_resumed.
> 	* mi/mi-main.c (exec_continue): Use
> 	scoped_disable_commit_resumed.
> 	* record-full.c (record_full_wait_1): Change
> 	commit_resumed_state around calling commit_resumed.
> 	* remote.c (class remote_target) <commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	(remote_target::resume): Add gdb_assert.
> 	(remote_target::commit_resume): Rename to...
> 	(remote_target::commit_resumed): ... this.  Check if there is
> 	any thread pending vCont resume.
> 	(struct stop_reply): Move up.
> 	(remote_target::remote_stop_ns): Generate stop replies for
> 	resumed but pending vCont threads.
> 	(remote_target::wait_ns): Add gdb_assert.
> 
> [1] https://github.com/ROCm-Developer-Tools/ROCgdb/
> [2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
> 
> Change-Id: I836135531a29214b21695736deb0a81acf8cf566
> ---
>  gdb/infcmd.c                 |   8 +++
>  gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
>  gdb/infrun.h                 |  41 +++++++++++++
>  gdb/linux-nat.c              |   5 ++
>  gdb/mi/mi-main.c             |   2 +
>  gdb/process-stratum-target.c |  37 +++++------
>  gdb/process-stratum-target.h |  63 +++++++++++--------
>  gdb/record-full.c            |   4 +-
>  gdb/remote.c                 | 111 +++++++++++++++++++++++----------
>  9 files changed, 292 insertions(+), 95 deletions(-)
> 
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index 6f0ed952de67..b7595e42e265 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
>        uiout->flush ();
>      }
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("running");
> +
>    /* We call get_inferior_args() because we might need to compute
>       the value now.  */
>    run_target->create_inferior (exec_file,
> @@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
>    if (non_stop && !attach_target->supports_non_stop ())
>      error (_("Cannot attach to this target in non-stop mode"));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
> +
>    attach_target->attach (args, from_tty);
>    /* to_attach should push the target, so after this point we
>       shouldn't refer to attach_target again.  */
> @@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
>    if (inferior_ptid == null_ptid)
>      error (_("The program is not being run."));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
> +
>    query_if_trace_running (from_tty);
>  
>    disconnect_tracing ();
> @@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
>  void
>  interrupt_target_1 (bool all_threads)
>  {
> +  scoped_disable_commit_resumed inhibit ("interrupting");
> +
>    if (non_stop)
>      {
>        if (all_threads)
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 1a27af51b7e9..92a1102cb595 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>  
>    target_resume (resume_ptid, step, sig);
>  
> -  maybe_commit_resume_process_target (tp->inf->process_target ());
> -
>    if (target_can_async_p ())
>      target_async (1);
>  }
> @@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
>  					    execution_direction)));
>  }
>  
> -/* Calls maybe_commit_resume_process_target on all process targets.  */
> +/* Maybe require all process stratum targets to commit their resumed threads.
> +
> +   A specific process stratum target is not required to do so if:
> +
> +   - it has no resumed threads
> +   - it has a thread with a pending status  */
>  
>  static void
> -maybe_commit_resume_all_process_targets ()
> +maybe_commit_resumed_all_process_targets ()
>  {
> -  scoped_restore_current_thread restore_thread;
> +  /* This is an optional to avoid unnecessary thread switches. */
> +  gdb::optional<scoped_restore_current_thread> restore_thread;
>  
>    for (process_stratum_target *target : all_non_exited_process_targets ())
>      {
> +      gdb_assert (!target->commit_resumed_state);
> +
> +      if (!target->threads_executing)
> +	{
> +	  infrun_debug_printf ("not re-enabling forward progress for target "
> +			       "%s, no executing threads",
> +			       target->shortname ());
> +	  continue;
> +	}
> +
> +      /* If a thread from this target has some status to report, we better
> +	 handle it before requiring the target to commit its resumed threads:
> +	 handling the status might lead to resuming more threads.  */
> +      bool has_thread_with_pending_status = false;
> +      for (thread_info *thread : all_non_exited_threads (target))
> +	if (thread->resumed && thread->suspend.waitstatus_pending_p)
> +	  {
> +	    has_thread_with_pending_status = true;
> +	    break;
> +	  }
> +
> +      if (has_thread_with_pending_status)
> +	{
> +	  infrun_debug_printf ("not requesting commit-resumed for target %s, "
> +			       "a thread has a pending waitstatus",
> +			       target->shortname ());
> +	  continue;
> +	}
> +
> +      if (!restore_thread.has_value ())
> +	restore_thread.emplace ();
> +
>        switch_to_target_no_thread (target);
> -      maybe_commit_resume_process_target (target);
> +      infrun_debug_printf ("enabling commit-resumed for target %s",
> +			   target->shortname());
> +
> +      target->commit_resumed_state = true;
> +      target->commit_resumed ();
> +    }
> +}
> +
> +/* To track nesting of scoped_disable_commit_resumed objects.  */
> +
> +static int scoped_disable_commit_resumed_depth = 0;
> +
> +scoped_disable_commit_resumed::scoped_disable_commit_resumed
> +  (const char *reason)
> +  : m_reason (reason)
> +{
> +  infrun_debug_printf ("reason=%s", m_reason);
> +
> +  for (process_stratum_target *target : all_process_targets ())
> +    {
> +      if (scoped_disable_commit_resumed_depth == 0)
> +	{
> +	  /* This is the outermost instance.  */
> +	  target->commit_resumed_state = false;
> +	}
> +      else
> +	{
> +	  /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE
> +	     to have been cleared by the outermost instance.  */
> +	  gdb_assert (!target->commit_resumed_state);
> +	}
> +    }
> +
> +  ++scoped_disable_commit_resumed_depth;
> +}
> +
> +scoped_disable_commit_resumed::~scoped_disable_commit_resumed ()
> +{
> +  infrun_debug_printf ("reason=%s", m_reason);
> +
> +  gdb_assert (scoped_disable_commit_resumed_depth > 0);
> +
> +  --scoped_disable_commit_resumed_depth;
> +
> +  if (scoped_disable_commit_resumed_depth == 0)
> +    {
> +      /* This is the outermost instance.  */
> +      maybe_commit_resumed_all_process_targets ();
> +    }
> +  else
> +    {
> +      /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE to
> +	 still be false.  */
> +      for (process_stratum_target *target : all_process_targets ())
> +	gdb_assert (!target->commit_resumed_state);
>      }
>  }
>  
> @@ -2994,8 +3084,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>    cur_thr->prev_pc = regcache_read_pc_protected (regcache);
>  
>    {
> -    scoped_restore save_defer_tc
> -      = make_scoped_defer_process_target_commit_resume ();
> +    scoped_disable_commit_resumed disable_commit_resumed ("proceeding");
>  
>      started = start_step_over ();
>  
> @@ -3065,8 +3154,6 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>        }
>    }
>  
> -  maybe_commit_resume_all_process_targets ();
> -
>    finish_state.release ();
>  
>    /* If we've switched threads above, switch back to the previously
> @@ -3819,8 +3906,15 @@ fetch_inferior_event ()
>        = make_scoped_restore (&execution_direction,
>  			     target_execution_direction ());
>  
> +    /* Allow process stratum targets to pause their resumed threads while we
> +       handle the event.  */
> +    scoped_disable_commit_resumed disable_commit_resumed ("handling event");
> +
>      if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
> -      return;
> +      {
> +	infrun_debug_printf ("do_target_wait returned no event");
> +	return;
> +      }
>  
>      gdb_assert (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
>  
> diff --git a/gdb/infrun.h b/gdb/infrun.h
> index 7160b60f1368..5c32c0c97f6e 100644
> --- a/gdb/infrun.h
> +++ b/gdb/infrun.h
> @@ -269,4 +269,45 @@ extern void all_uis_check_sync_execution_done (void);
>     started or re-started).  */
>  extern void all_uis_on_sync_execution_starting (void);
>  
> +/* RAII object to temporarily disable the requirement for process stratum
> +   targets to commit their resumed threads.
> +
> +   On construction, set process_stratum_target::commit_resumed_state to false
> +   for all process stratum targets.
> +
> +   On destruction, call maybe_commit_resumed_all_process_targets.
> +
> +   In addition, track creation of nested scoped_disable_commit_resumed objects,
> +   for cases like this:
> +
> +     void
> +     inner_func ()
> +     {
> +       scoped_disable_commit_resumed disable;
> +       // do stuff
> +     }
> +
> +     void
> +     outer_func ()
> +     {
> +       scoped_disable_commit_resumed disable;
> +
> +       for (... each thread ...)
> +	 inner_func ();
> +     }
> +
> +   In this case, we don't want the `disable` in `inner_func` to require targets
> +   to commit resumed threads in its destructor.  */
> +
> +struct scoped_disable_commit_resumed
> +{
> +  scoped_disable_commit_resumed (const char *reason);
> +  ~scoped_disable_commit_resumed ();
> +
> +  DISABLE_COPY_AND_ASSIGN (scoped_disable_commit_resumed);
> +
> +private:
> +  const char *m_reason;
> +};
> +
>  #endif /* INFRUN_H */
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index dc524cf10dc1..9adec81ba132 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1661,6 +1661,8 @@ linux_nat_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
>  			   ? strsignal (gdb_signal_to_host (signo)) : "0"),
>  			  target_pid_to_str (inferior_ptid).c_str ());
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* A specific PTID means `step only this process id'.  */
>    resume_many = (minus_one_ptid == ptid
>  		 || ptid.is_pid ());
> @@ -3406,6 +3408,8 @@ linux_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
>    linux_nat_debug_printf ("[%s], [%s]", target_pid_to_str (ptid).c_str (),
>  			  target_options_to_string (target_options).c_str ());
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* Flush the async file first.  */
>    if (target_is_async_p ())
>      async_file_flush ();
> @@ -4166,6 +4170,7 @@ linux_nat_stop_lwp (struct lwp_info *lwp)
>  void
>  linux_nat_target::stop (ptid_t ptid)
>  {
> +  gdb_assert (!this->commit_resumed_state);
>    iterate_over_lwps (ptid, linux_nat_stop_lwp);
>  }
>  
> diff --git a/gdb/mi/mi-main.c b/gdb/mi/mi-main.c
> index 9a14d78e1e27..e5653ea3e3f5 100644
> --- a/gdb/mi/mi-main.c
> +++ b/gdb/mi/mi-main.c
> @@ -266,6 +266,8 @@ exec_continue (char **argv, int argc)
>  {
>    prepare_execution_command (current_top_target (), mi_async_p ());
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("mi continue");
> +
>    if (non_stop)
>      {
>        /* In non-stop mode, 'resume' always resumes a single thread.
> diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
> index 1436a550ac04..9877f0d81931 100644
> --- a/gdb/process-stratum-target.c
> +++ b/gdb/process-stratum-target.c
> @@ -99,6 +99,20 @@ all_non_exited_process_targets ()
>  
>  /* See process-stratum-target.h.  */
>  
> +std::set<process_stratum_target *>
> +all_process_targets ()
> +{
> +  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
> +  std::set<process_stratum_target *> targets;
> +  for (inferior *inf : all_inferiors ())
> +    if (inf->process_target () != nullptr)
> +      targets.insert (inf->process_target ());
> +
> +  return targets;
> +}
> +
> +/* See process-stratum-target.h.  */
> +
>  void
>  switch_to_target_no_thread (process_stratum_target *target)
>  {
> @@ -108,26 +122,3 @@ switch_to_target_no_thread (process_stratum_target *target)
>        break;
>      }
>  }
> -
> -/* If true, `maybe_commit_resume_process_target` is a no-op.  */
> -
> -static bool defer_process_target_commit_resume;
> -
> -/* See target.h.  */
> -
> -void
> -maybe_commit_resume_process_target (process_stratum_target *proc_target)
> -{
> -  if (defer_process_target_commit_resume)
> -    return;
> -
> -  proc_target->commit_resume ();
> -}
> -
> -/* See process-stratum-target.h.  */
> -
> -scoped_restore_tmpl<bool>
> -make_scoped_defer_process_target_commit_resume ()
> -{
> -  return make_scoped_restore (&defer_process_target_commit_resume, true);
> -}
> diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
> index c8060c46be93..3cea911dee09 100644
> --- a/gdb/process-stratum-target.h
> +++ b/gdb/process-stratum-target.h
> @@ -63,19 +63,10 @@ class process_stratum_target : public target_ops
>    bool has_registers () override;
>    bool has_execution (inferior *inf) override;
>  
> -  /* Commit a series of resumption requests previously prepared with
> -     resume calls.
> +  /* Ensure that all resumed threads are committed to the target.
>  
> -     GDB always calls `commit_resume` on the process stratum target after
> -     calling `resume` on a target stack.  A process stratum target may thus use
> -     this method in coordination with its `resume` method to batch resumption
> -     requests.  In that case, the target doesn't actually resume in its
> -     `resume` implementation.  Instead, it takes note of resumption intent in
> -     `resume`, and defers the actual resumption `commit_resume`.
> -
> -     E.g., the remote target uses this to coalesce multiple resumption requests
> -     in a single vCont packet.  */
> -  virtual void commit_resume () {}
> +     See the description of COMMIT_RESUMED_STATE for more details.  */
> +  virtual void commit_resumed () {}
>  
>    /* True if any thread is, or may be executing.  We need to track
>       this separately because until we fully sync the thread list, we
> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>  
>    /* The connection number.  Visible in "info connections".  */
>    int connection_number = 0;
> +
> +  /* Whether resumed threads must be committed to the target.
> +
> +     When true, resumed threads must be committed to the execution target.
> +
> +     When false, the process stratum target may leave resumed threads stopped
> +     when it's convenient or efficient to do so.  When the core requires resumed
> +     threads to be committed again, this is set back to true and calls the
> +     `commit_resumed` method to allow the target to do so.
> +
> +     To simplify the implementation of process stratum targets, the following
> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
> +     false:
> +
> +       - resume
> +       - stop
> +       - wait
> +
> +     Knowing this, the process stratum target doesn't need to implement
> +     different behaviors depending on the COMMIT_RESUMED_STATE, and can
> +     simply assert that it is false.
> +
> +     Process stratum targets can take advantage of this to batch resumption
> +     requests, for example.  In that case, the target doesn't actually resume in
> +     its `resume` implementation.  Instead, it takes note of the resumption
> +     intent in `resume` and defers the actual resumption to `commit_resumed`.
> +     For example, the remote target uses this to coalesce multiple resumption
> +     requests in a single vCont packet.  */
> +  bool commit_resumed_state = false;
>  };
>  
>  /* Downcast TARGET to process_stratum_target.  */
> @@ -101,24 +121,13 @@ as_process_stratum_target (target_ops *target)
>  
>  extern std::set<process_stratum_target *> all_non_exited_process_targets ();
>  
> +/* Return a collection of all existing process stratum targets.  */
> +
> +extern std::set<process_stratum_target *> all_process_targets ();
> +
>  /* Switch to the first inferior (and program space) of TARGET, and
>     switch to no thread selected.  */
>  
>  extern void switch_to_target_no_thread (process_stratum_target *target);
>  
> -/* Commit a series of resumption requests previously prepared with
> -   target_resume calls.
> -
> -   This function is a no-op if commit resumes are deferred (see
> -   `make_scoped_defer_process_target_commit_resume`).  */
> -
> -extern void maybe_commit_resume_process_target
> -  (process_stratum_target *target);
> -
> -/* Setup to defer `commit_resume` calls, and re-set to the previous status on
> -   destruction.  */
> -
> -extern scoped_restore_tmpl<bool>
> -  make_scoped_defer_process_target_commit_resume ();
> -
>  #endif /* !defined (PROCESS_STRATUM_TARGET_H) */
> diff --git a/gdb/record-full.c b/gdb/record-full.c
> index 56ab29479874..fad355afdf4f 100644
> --- a/gdb/record-full.c
> +++ b/gdb/record-full.c
> @@ -1263,7 +1263,9 @@ record_full_wait_1 (struct target_ops *ops,
>  					    "issuing one more step in the "
>  					    "target beneath\n");
>  		      ops->beneath ()->resume (ptid, step, GDB_SIGNAL_0);
> -		      proc_target->commit_resume ();
> +		      proc_target->commit_resumed_state = true;
> +		      proc_target->commit_resumed ();
> +		      proc_target->commit_resumed_state = false;
>  		      continue;
>  		    }
>  		}
> diff --git a/gdb/remote.c b/gdb/remote.c
> index f8150f39fb5c..be53886c1837 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -421,7 +421,7 @@ class remote_target : public process_stratum_target
>    void detach (inferior *, int) override;
>    void disconnect (const char *, int) override;
>  
> -  void commit_resume () override;
> +  void commit_resumed () override;
>    void resume (ptid_t, int, enum gdb_signal) override;
>    ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
>  
> @@ -6376,6 +6376,8 @@ remote_target::resume (ptid_t ptid, int step, enum gdb_signal siggnal)
>  {
>    struct remote_state *rs = get_remote_state ();
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* When connected in non-stop mode, the core resumes threads
>       individually.  Resuming remote threads directly in target_resume
>       would thus result in sending one packet per thread.  Instead, to
> @@ -6565,7 +6567,7 @@ vcont_builder::push_action (ptid_t ptid, bool step, gdb_signal siggnal)
>  /* to_commit_resume implementation.  */
>  
>  void
> -remote_target::commit_resume ()
> +remote_target::commit_resumed ()
>  {
>    int any_process_wildcard;
>    int may_global_wildcard_vcont;
> @@ -6640,6 +6642,8 @@ remote_target::commit_resume ()
>       disable process and global wildcard resumes appropriately.  */
>    check_pending_events_prevent_wildcard_vcont (&may_global_wildcard_vcont);
>  
> +  bool any_pending_vcont_resume = false;
> +
>    for (thread_info *tp : all_non_exited_threads (this))
>      {
>        remote_thread_info *priv = get_remote_thread_info (tp);
> @@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
>  	  continue;
>  	}
>  
> +      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
> +	any_pending_vcont_resume = true;
> +
>        /* If a thread is the parent of an unfollowed fork, then we
>  	 can't do a global wildcard, as that would resume the fork
>  	 child.  */
> @@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
>  	may_global_wildcard_vcont = 0;
>      }
>  
> +  /* We didn't have any resumed thread pending a vCont resume, so nothing to
> +     do.  */
> +  if (!any_pending_vcont_resume)
> +    return;
> +
>    /* Now let's build the vCont packet(s).  Actions must be appended
>       from narrower to wider scopes (thread -> process -> global).  If
>       we end up with too many actions for a single packet vcont_builder
> @@ -6735,7 +6747,35 @@ remote_target::commit_resume ()
>    vcont_builder.flush ();
>  }
>  
> -\f
> +struct stop_reply : public notif_event
> +{
> +  ~stop_reply ();
> +
> +  /* The identifier of the thread about this event  */
> +  ptid_t ptid;
> +
> +  /* The remote state this event is associated with.  When the remote
> +     connection, represented by a remote_state object, is closed,
> +     all the associated stop_reply events should be released.  */
> +  struct remote_state *rs;
> +
> +  struct target_waitstatus ws;
> +
> +  /* The architecture associated with the expedited registers.  */
> +  gdbarch *arch;
> +
> +  /* Expedited registers.  This makes remote debugging a bit more
> +     efficient for those targets that provide critical registers as
> +     part of their normal status mechanism (as another roundtrip to
> +     fetch them is avoided).  */
> +  std::vector<cached_reg_t> regcache;
> +
> +  enum target_stop_reason stop_reason;
> +
> +  CORE_ADDR watch_data_address;
> +
> +  int core;
> +};
>  
>  /* Non-stop version of target_stop.  Uses `vCont;t' to stop a remote
>     thread, all threads of a remote process, or all threads of all
> @@ -6748,6 +6788,39 @@ remote_target::remote_stop_ns (ptid_t ptid)
>    char *p = rs->buf.data ();
>    char *endp = p + get_remote_packet_size ();
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
> +  /* If any threads that needs to stop are pending a vCont resume, generate
> +     dummy stop_reply events.  */
> +  for (thread_info *tp : all_non_exited_threads (this, ptid))
> +    {
> +      remote_thread_info *remote_thr = get_remote_thread_info (tp);
> +
> +      if (remote_thr->resume_state () == resume_state::RESUMED_PENDING_VCONT)
> +	{
> +	  if (remote_debug)
> +	    {
> +	      fprintf_unfiltered (gdb_stdlog,
> +				  "remote_stop_ns: Enqueueing phony stop reply "
> +				  "for thread pending vCont-resume "
> +				  "(%d, %ld, %ld)\n",
> +				  tp->ptid.pid(), tp->ptid.lwp (),
> +				  tp->ptid.tid ());
> +	    }
> +
> +	  stop_reply *sr = new stop_reply ();
> +	  sr->ptid = tp->ptid;
> +	  sr->rs = rs;
> +	  sr->ws.kind = TARGET_WAITKIND_STOPPED;
> +	  sr->ws.value.sig = GDB_SIGNAL_0;
> +	  sr->arch = tp->inf->gdbarch;
> +	  sr->stop_reason = TARGET_STOPPED_BY_NO_REASON;
> +	  sr->watch_data_address = 0;
> +	  sr->core = 0;
> +	  this->push_stop_reply (sr);
> +	}
> +    }
> +
>    /* FIXME: This supports_vCont_probed check is a workaround until
>       packet_support is per-connection.  */
>    if (packet_support (PACKET_vCont) == PACKET_SUPPORT_UNKNOWN
> @@ -6955,36 +7028,6 @@ remote_console_output (const char *msg)
>    gdb_stdtarg->flush ();
>  }
>  
> -struct stop_reply : public notif_event
> -{
> -  ~stop_reply ();
> -
> -  /* The identifier of the thread about this event  */
> -  ptid_t ptid;
> -
> -  /* The remote state this event is associated with.  When the remote
> -     connection, represented by a remote_state object, is closed,
> -     all the associated stop_reply events should be released.  */
> -  struct remote_state *rs;
> -
> -  struct target_waitstatus ws;
> -
> -  /* The architecture associated with the expedited registers.  */
> -  gdbarch *arch;
> -
> -  /* Expedited registers.  This makes remote debugging a bit more
> -     efficient for those targets that provide critical registers as
> -     part of their normal status mechanism (as another roundtrip to
> -     fetch them is avoided).  */
> -  std::vector<cached_reg_t> regcache;
> -
> -  enum target_stop_reason stop_reason;
> -
> -  CORE_ADDR watch_data_address;
> -
> -  int core;
> -};
> -
>  /* Return the length of the stop reply queue.  */
>  
>  int
> @@ -7877,6 +7920,8 @@ remote_target::wait_ns (ptid_t ptid, struct target_waitstatus *status,
>    int ret;
>    int is_notif = 0;
>  
> +  gdb_assert (!this->commit_resumed_state);
> +
>    /* If in non-stop mode, get out of getpkt even if a
>       notification is received.	*/
>  
> -- 
> 2.29.2
> 


* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-08 15:41   ` Pedro Alves
@ 2021-01-08 18:56     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08 18:56 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

On 2021-01-08 10:41 a.m., Pedro Alves wrote:
> Hi,
> 
> This patch LGTM.  A couple tiny issue below.
> 
>> gdb/ChangeLog:
>>
>>         * remote.c (enum class resume_state): New.
>>         (struct resumed_pending_vcont_info): New.
>>         (struct remote_thread_info) <resume_state, set_not_resumed,
>> 	set_resumed_pending_vcont, resumed_pending_vcont_info,
>> 	set_resumed, m_resume_state, m_resumed_pending_vcont_info>:
>> 	New.
>> 	<last_resume_step, last_resume_sig, vcont_resumed>: Remove.
>>         (remote_target::remote_add_thread): Adjust.
>>         (remote_target::process_initial_stop_replies): Adjust.
>>         (remote_target::resume): Adjust.
>>         (remote_target::commit_resume): Rely on state in
>> 	remote_thread_info and not on tp->executing.
>>         (remote_target::process_stop_reply): Adjust.
> 
> Mind spaces vs tabs in the ChangeLog entry.

Fixed.

> 
>> +/* Information about a thread's pending vCont-resume.  Used when a thread is in
>> +   the remote_resume_state::RESUMED_PENDING_VCONT state.  remote_target::resume
>> +   stores this information which is them picked up by
> 
> them -> then

Fixed.

> That's it.  :-)

Thanks!

Simon


* Re: [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace,full}.c
  2021-01-08 15:43   ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace,full}.c Pedro Alves
@ 2021-01-08 19:00     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08 19:00 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

On 2021-01-08 10:43 a.m., Pedro Alves wrote:
> On 08/01/21 04:17, Simon Marchi wrote:
>> From: Simon Marchi <simon.marchi@efficios.com>
>>
>> The previous patch made the commit_resume implementations in the record
>> targets unnecessary, as the remote target's commit_resume implementation
>> won't commit-resume threads for which it didn't see a resume.  This
>> patch removes them.
>>
>> gdb/ChangeLog:
>>
>>         * record-btrace.c (class record_btrace_target):
>>         (record_btrace_target::commit_resume):
>>         * record-full.c (class record_full_target):
>>         (record_full_target::commit_resume):
> 
> Incomplete entry.

Woops, fixed.  It's "Remove." everywhere.
 
> Otherwise LGTM.  I like how these two patches result in clearer code.  Nice.

Ok, thanks.  I'll keep reading the comments to see where the rest of the series
is going, but in any case I think we can at least push patches 1 and 2 on their
own then, if they are a good clean up on their own.

Simon


* Re: [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target
  2021-01-08 18:12   ` Andrew Burgess
@ 2021-01-08 19:01     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08 19:01 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches, Pedro Alves, Simon Marchi

>> diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
>> index 719167803fff..1436a550ac04 100644
>> --- a/gdb/process-stratum-target.c
>> +++ b/gdb/process-stratum-target.c
>> @@ -108,3 +108,26 @@ switch_to_target_no_thread (process_stratum_target *target)
>>        break;
>>      }
>>  }
>> +
>> +/* If true, `maybe_commit_resume_process_target` is a no-op.  */
>> +
>> +static bool defer_process_target_commit_resume;
>> +
>> +/* See target.h.  */
> 
> Should be 'process-stratum-target.h' now.

Fixed, thanks.  It would be simpler if we always wrote "See header file." :).

Simon


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08 18:34   ` Andrew Burgess
@ 2021-01-08 19:04     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08 19:04 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches, Pedro Alves, Simon Marchi

On 2021-01-08 1:34 p.m., Andrew Burgess wrote:
> * Simon Marchi <simon.marchi@polymtl.ca> [2021-01-07 23:17:33 -0500]:
> 
>> From: Simon Marchi <simon.marchi@efficios.com>
>>
>> The rationale for this patch comes from the ROCm port [1], the goal
>> being to reduce the number of back and forths between GDB and the target
>> when doing successive operations.  I'll start with explaining the
>> rationale and then go over the implementation.  In the ROCm / GPU world,
>> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
>> read if from a GPU stand point, just s/thread/wave/.
>>
>> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
>> target, gives the illusion that it's possible for the debugger to
>> control (start and stop) individual threads.  But in reality, this is
>> not how it works.  Under the hood, all threads of a queue are controlled
>> as a group.  To stop one thread in a group of running ones, the state of
>> all threads is retrieved from the GPU, all threads are destroyed, and all
>> threads but the one we want to stop are re-created from the saved state.
>> The net result, from the point of view of GDB, is that the library
>> stopped one thread.  The same thing goes if we want to resume one thread
>> while others are running: the state of all running threads is retrieved
>> from the GPU, they are all destroyed, and they are all re-created,
>> including the thread we want to resume.
>>
>> This leads to some inefficiencies when combined with how GDB works, here
>> are two examples:
>>
>>  - Stopping all threads: because the target operates in non-stop mode,
>>    when the user interface mode is all-stop, GDB must stop all threads
>>    individually when presenting a stop.  Let's suppose we have 1000
>>    threads and the user does ^C.  GDB asks the target to stop one
>>    thread.  Behind the scenes, the library retrieves 1000 thread states
>>    and restores the 999 others still running ones.  GDB asks the target
>>    to stop another one.  The target retrieves 999 thread states and
>>    restores the 998 remaining ones.  That means that to stop 1000
>>    threads, we did 1000 back and forths with the GPU.  It would have
>>    been much better to just retrieve the states once and stop there.
>>
>>  - Resuming with pending events: suppose the 1000 threads hit a
>>    breakpoint at the same time.  The breakpoint is conditional and
>>    evaluates to true for the first thread, to false for all others.  GDB
>>    pulls one event (for the first thread) from the target, decides that
>>    it should present a stop, so stops all threads using
>>    stop_all_threads.  All these other threads have a breakpoint event to
>>    report, which is saved in `thread_info::suspend::waitstatus` for
>>    later.  When the user does "continue", GDB resumes that one thread
>>    that did hit the breakpoint.  It then processes the pending events
>>    one by one as if they just arrived.  It picks one, evaluates the
>>    condition to false, and resumes the thread.  It picks another one,
>>    evaluates the condition to false, and resumes the thread.  And so on.
>>    In between each resumption, there is a full state retrieval and
>>    re-creation.  It would be much nicer if we could wait a little bit
>>    before sending those threads on the GPU, until it processed all those
>>    pending events.
>>
>> To address this kind of performance issue, ROCdbgapi has a concept
>> called "forward progress required", which is a boolean state that allows
>> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
>> hold off putting the threads on the GPU until I'm done" (the "forward
>> progress not required" state).  Turning forward progress back on
>> indicates to the library that all threads that are supposed to be
>> running should now be really running on the GPU.
>>
>> It turns out that GDB has a similar concept, though not as general,
>> commit_resume.  On difference is that commit_resume is not stateful: the
> 
> typo: 'On difference' ?

Fixed to "One difference".

> 
>> target can't look up "does the core need me to schedule resumed threads
>> for execution right now".  It is also specifically linked to the resume
>> method, it is not used in other contexts.  The target accumulates
>> resumption requests through target_ops::resume calls, and then commits
>> those resumptions when target_ops::commit_resume is called.  The target
>> has no way to check if it's ok to leave resumed threads stopped in other
>> target methods.
>>
>> To bridge the gap, this patch generalizes the commit_resume concept in
>> GDB to match the forward progress concept of ROCdbgapi.  The current
>> name (commit_resume) can be interpreted as "commit the previous resume
>> calls".  I renamed the concept to "commit_resumed", as in "commit the
>> threads that are resumed".
>>
>> In the new version, we have two things in process_stratum_target:
>>
>>  - the commit_resumed_state field: indicates whether GDB requires this
>>    target to have resumed threads committed to the execution
>>    target/device.  If false, the target is allowed to leave resumed
>>    threads un-committed at the end of whatever method it is executing.
>>
>>  - the commit_resumed method: called when commit_resumed_state
>>    transitions from false to true.  While commit_resumed_state was
>>    false, the target may have left some resumed threads un-committed.
>>    This method being called tells it that it should commit them back to
>>    the execution device.
>>
>> Let's take the "Stopping all threads" scenario from above and see how it
>> would work with the ROCm target with this change.  Before stopping all
>> threads, GDB would set the target's commit_resumed_state field to false.
>> It would then ask the target to stop the first thread.  The target would
>> retrieve all threads' state from the GPU and mark that one as stopped.
>> Since commit_resumed_state is false, it leaves all the other threads
>> (still resumed) stopped.  GDB would then proceed to call target_stop for
>> all the other threads.  Since resumed threads are not committed, this
>> doesn't do any back and forth with the GPU.
>>
>> To simplify the implementation of targets, I made it so that when
>> calling certain target methods, the contract between the core and the
>> targets guarantees that commit_resumed_state is false.  This way, the
>> target doesn't need two paths, one commit_resumed_state == true and one
>> for commit_resumed_state == false.  It can just assert that
>> commit_resumed_state is false and work with that assumption.  This also
>> helps catch places where we forgot to disable commit_resumed_state
>> before calling the method, which represents a probable optimization
>> opportunity.
>>
>> To have some confidence that this contract between the core and the
>> targets is respected, I added assertions in the linux-nat target
>> methods, even though the linux-nat target doesn't actually use that
>> feature.  Since linux-nat is tested much more than other targets, this
>> will help catch these issues quicker.
>>
>> To ensure that commit_resumed_state is always turned back on (only if
>> necessary, see below) and the commit_resumed method is called when doing
>> so, I introduced the scoped_disabled_commit_resumed RAII object, which
>> replaces make_scoped_defer_process_target_commit_resume.  On
>> construction, it clears the commit_resumed_state flag of all process
>> targets.  On destruction, it turns it back on (if necessary) and calls
>> the commit_resumed method.  The nested case is handled by having a
>> "nesting" counter: only when the counter goes back to 0 is
>> commit_resumed_state turned back on.
>>
>> On destruction, commit-resumed is not re-enabled for a given target if:
>>
>>  1. this target has no threads resumed, or
>>  2. this target at least one thread with a pending status known to the
> 
> Missing word: 'this target at least'?

Fixed to "this target has at least".

> I read through the commit message and convinced myself that it made
> sense.  I ran out of time to look at the actual code.

That's already very nice, thanks!

Simon


* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-08 18:19   ` Andrew Burgess
@ 2021-01-08 19:11     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-08 19:11 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches, Pedro Alves

>> +  bool multiple_resumed_thread = false;
> 
> This might be better named 'multiple_resumed_threads'.  Apologies if
> this was just copied from my original code.

Agreed, fixed.
 
>>  
>> -  /* If no thread/process was reported by the stub then use the first
>> -     non-exited thread in the current target.  */
>> -  if (ptid == null_ptid)
>> +  /* Consider all non-exited threads of the target, find the first resumed
>> +     one.  */
>> +  for (thread_info *thr : all_non_exited_threads (this))
>>      {
>> -      /* Some stop events apply to all threads in an inferior, while others
>> -	 only apply to a single thread.  */
>> -      bool is_stop_for_all_threads
>> -	= (status->kind == TARGET_WAITKIND_EXITED
>> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
>> +      remote_thread_info *remote_thr =get_remote_thread_info (thr);
> 
> Missing space after '='.

Fixed.
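
For readers following along, the scan quoted above can be modelled
roughly as below.  This is only a minimal sketch under assumptions: the
full patch is not quoted here, so toy_thread, resumed_scan and
scan_resumed are hypothetical names, and the real code works on
thread_info / remote_thread_info rather than on a plain vector.

```cpp
#include <vector>

// Toy thread record; GDB's thread_info / remote_thread_info are richer.
struct toy_thread
{
  int id;
  bool resumed;
};

// Result of one pass over the thread list.
struct resumed_scan
{
  // First resumed thread found, or nullptr if none is resumed.
  const toy_thread *first_resumed = nullptr;

  // True if at least two threads were found resumed.
  bool multiple_resumed_threads = false;
};

// Walk THREADS once, remembering the first resumed thread and noting
// whether more than one thread is resumed.
resumed_scan
scan_resumed (const std::vector<toy_thread> &threads)
{
  resumed_scan result;

  for (const toy_thread &thr : threads)
    {
      if (!thr.resumed)
	continue;

      if (result.first_resumed == nullptr)
	result.first_resumed = &thr;
      else
	result.multiple_resumed_threads = true;
    }

  return result;
}
```

When multiple_resumed_threads ends up true, a stop reply that names no
thread is ambiguous, which is the case the renamed flag is meant to
describe.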

>> +/* Called when it is decided that STOP_REPLY holds the info of the
>> +   event that is to be returned to the core.  This function always
>> +   destroys STOP_REPLY.  */
>> +
>> +ptid_t
>> +remote_target::process_stop_reply (struct stop_reply *stop_reply,
>> +				   struct target_waitstatus *status)
>> +{
>> +  ptid_t ptid;
> 
> Shouldn't this just be inline below?

Fixed.

>> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
>> new file mode 100644
>> index 000000000000..01f6d3c07ff4
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
>> @@ -0,0 +1,77 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> +   Copyright 2020 Free Software Foundation, Inc.
> 
> The copyright year will need updating now.

Oh, we now live in the future!

>> diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
>> new file mode 100644
>> index 000000000000..f394ca8ed0c4
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
>> @@ -0,0 +1,136 @@
>> +# This testcase is part of GDB, the GNU debugger.
>> +#
>> +# Copyright 2020 Free Software Foundation, Inc.
> 
> And again.

Fixed too.

Thanks for the comments.  I'll give a bit more time for people to
comment.  But at this point, patches 1, 2 and 5 were looked at and
agreed with, so I'll push these eventually at least if nothing else
comes up.

Simon


* Re: [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target
  2021-01-08  4:17 ` [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target Simon Marchi
  2021-01-08 18:12   ` Andrew Burgess
@ 2021-01-09 20:29   ` Pedro Alves
  1 sibling, 0 replies; 33+ messages in thread
From: Pedro Alves @ 2021-01-09 20:29 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

On 08/01/21 04:17, Simon Marchi wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The following patch will change the commit_resume target method to
> something stateful.  Because it would be difficult to track a state
> replicated in the various targets of a target stack, and since for the
> foreseeable future, only process stratum targets are going to use this
> concept, this patch makes the commit resume concept specific to process
> stratum targets.
> 
> So, move the method to process_stratum_target, and move helper functions
> to process-stratum-target.h.
> 
> gdb/ChangeLog:
> 
> 	* target.h (struct target_ops) <commit_resume>: New.
> 	(target_commit_resume): Remove.
> 	(make_scoped_defer_target_commit_resume): Remove.
> 	* target.c (defer_target_commit_resume): Remove.
> 	(target_commit_resume): Remove.
> 	(make_scoped_defer_target_commit_resume): Remove.
> 	* process-stratum-target.h (class process_stratum_target)
> 	<commit_resume>: New.
> 	(maybe_commit_resume_all_process_targets): New.
> 	(make_scoped_defer_process_target_commit_resume): New.
> 	* process-stratum-target.c (defer_process_target_commit_resume):
> 	New.
> 	(maybe_commit_resume_process_target): New.
> 	(make_scoped_defer_process_target_commit_resume): New.
> 	* infrun.c (do_target_resume): Adjust.
> 	(commit_resume_all_targets): Rename into...
> 	(maybe_commit_resume_all_process_targets): ... this, adjust.
> 	(proceed): Adjust.
> 	* record-full.c (record_full_wait_1): Adjust.
> 	* target-delegates.c: Re-generate.
> 

OK.


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
  2021-01-08 18:34   ` Andrew Burgess
@ 2021-01-09 20:34   ` Pedro Alves
  2021-01-11 20:28     ` Simon Marchi
  2021-01-12 17:14   ` Simon Marchi
  2021-01-15 19:17   ` Simon Marchi
  3 siblings, 1 reply; 33+ messages in thread
From: Pedro Alves @ 2021-01-09 20:34 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess, Simon Marchi

[-- Attachment #1: Type: text/plain, Size: 27687 bytes --]

On 08/01/21 04:17, Simon Marchi wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the target
> when doing successive operations.  I'll start with explaining the
> rationale and then go over the implementation.  In the ROCm / GPU world,
> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
> read if from a GPU stand point, just s/thread/wave/.
> 
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are controlled
> as a group.  To stop one thread in a group of running ones, the state of
> all threads is retrieved from the GPU, all threads are destroyed, and all
> threads but the one we want to stop are re-created from the saved state.
> The net result, from the point of view of GDB, is that the library
> stopped one thread.  The same thing goes if we want to resume one thread
> while others are running: the state of all running threads is retrieved
> from the GPU, they are all destroyed, and they are all re-created,
> including the thread we want to resume.
> 
> This leads to some inefficiencies when combined with how GDB works, here
> are two examples:
> 
>  - Stopping all threads: because the target operates in non-stop mode,
>    when the user interface mode is all-stop, GDB must stop all threads
>    individually when presenting a stop.  Let's suppose we have 1000
>    threads and the user does ^C.  GDB asks the target to stop one
>    thread.  Behind the scenes, the library retrieves 1000 thread states
>    and restores the 999 others still running ones.  GDB asks the target
>    to stop another one.  The target retrieves 999 thread states and
>    restores the 998 remaining ones.  That means that to stop 1000
>    threads, we did 1000 back and forths with the GPU.  It would have
>    been much better to just retrieve the states once and stop there.
> 
>  - Resuming with pending events: suppose the 1000 threads hit a
>    breakpoint at the same time.  The breakpoint is conditional and
>    evaluates to true for the first thread, to false for all others.  GDB
>    pulls one event (for the first thread) from the target, decides that
>    it should present a stop, so stops all threads using
>    stop_all_threads.  All these other threads have a breakpoint event to
>    report, which is saved in `thread_info::suspend::waitstatus` for
>    later.  When the user does "continue", GDB resumes that one thread
>    that did hit the breakpoint.  It then processes the pending events
>    one by one as if they just arrived.  It picks one, evaluates the
>    condition to false, and resumes the thread.  It picks another one,
>    evaluates the condition to false, and resumes the thread.  And so on.
>    In between each resumption, there is a full state retrieval and
>    re-creation.  It would be much nicer if we could wait a little bit
>    before sending those threads on the GPU, until it processed all those
>    pending events.

A potential downside of holding off in this latter scenario, with regular
host debugging, is that threads are currently resumed immediately, so the
inferior's threads potentially spend less time paused.  Deferring would lose
that, at least with the native target if we implemented commit_resume there.
With remote, the trade-off is probably more in favor of deferring, given the
higher latency.

However, since we don't implement commit_resume for the native target
currently, it shouldn't have any effect there.

To confirm this, I tried the testcase we used when debugging the
displaced stepping buffers series, with 100 threads continuously
stepping over a breakpoint, for 10 seconds.  Surprisingly, with the
native target, I see a consistent ~3% slowdown caused by this series.

I don't see any material difference with gdbserver.

(higher is better)

native, pristine

 avg              440.240000
 avg              436.670000
 avg              451.310000
 avg              432.840000
 avg              437.060000
 ===========================
 avg of avg       439.624000

native, patched

 avg              420.940000
 avg              428.130000
 avg              425.230000
 avg              428.080000
 avg              424.880000
 ===========================
 avg of avg       425.452000



gdbserver, pristine:

 avg              633.490000
 avg              639.910000
 avg              642.300000
 avg              626.160000
 avg              626.460000
 ===========================
 avg of avg       633.664000


gdbserver, patched

 avg              630.970000
 avg              628.960000
 avg              638.340000
 avg              627.030000
 avg              638.390000
 ===========================
 avg of avg       632.738000
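
As a cross-check of the "avg of avg" rows and of the ~3% figure, the
arithmetic can be reproduced with a few lines of C++.  This is just a
sketch: the helper names mean and slowdown_percent are made up for
illustration, and the numbers are copied from the tables above.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a set of per-run averages.
double
mean (const std::vector<double> &v)
{
  return std::accumulate (v.begin (), v.end (), 0.0) / v.size ();
}

// Relative slowdown of PATCHED versus PRISTINE, in percent.  Higher
// scores are better, so a positive result means the patched build is
// slower.
double
slowdown_percent (double pristine, double patched)
{
  return (pristine - patched) / pristine * 100.0;
}

// Per-run averages copied from the tables above.
const std::vector<double> native_pristine
  = {440.24, 436.67, 451.31, 432.84, 437.06};
const std::vector<double> native_patched
  = {420.94, 428.13, 425.23, 428.08, 424.88};
const std::vector<double> gdbserver_pristine
  = {633.49, 639.91, 642.30, 626.16, 626.46};
const std::vector<double> gdbserver_patched
  = {630.97, 628.96, 638.34, 627.03, 638.39};
```

With these inputs, the native slowdown comes out a little above 3%,
while the gdbserver difference stays well under 1%, which is smaller
than the spread between individual runs.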

tests run like this:

  $ gcc disp-step-buffers-test.c -o disp-step-buffers-test -g3 -O2 -pthread
  $ g="./gdb -data-directory=data-directory"
  $ time $g -q --batch disp-step-buffers-test -ex "b 16 if 0" -ex "r"
  $ time $g -q --batch disp-step-buffers-test -ex "set sysroot" -ex "target remote | ../gdbserver/gdbserver - disp-step-buffers-test" -ex "b 16 if 0" -ex "c" 

I've attached disp-step-buffers-test.c.

I'm surprised that native debugging is quite a bit slower here, compared to
gdbserver.  I don't recall observing that earlier.  Maybe I just missed
it then.

I wouldn't have thought we would be doing so much work that it
would be noticeable with the native target (pristine vs patched, the 3%
slowdown).  I wonder whether that is caused by the constant std::set allocation
in all_process_targets.  But then it's strange that we don't see that
same slowdown when remote debugging.  I'm surprised.

> 
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that allows
> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
> hold off putting the threads on the GPU until I'm done" (the "forward
> progress not required" state).  Turning forward progress back on
> indicates to the library that all threads that are supposed to be
> running should now be really running on the GPU.
> 
> It turns out that GDB has a similar concept, though not as general,
> commit_resume.  On difference is that commit_resume is not stateful: the
> target can't look up "does the core need me to schedule resumed threads
> for execution right now".  It is also specifically linked to the resume
> method, it is not used in other contexts.  The target accumulates
> resumption requests through target_ops::resume calls, and then commits
> those resumptions when target_ops::commit_resume is called.  The target
> has no way to check if it's ok to leave resumed threads stopped in other
> target methods.
> 
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".  I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".

Makes sense.

> 
> In the new version, we have two things in process_stratum_target:
> 
>  - the commit_resumed_state field: indicates whether GDB requires this
>    target to have resumed threads committed to the execution
>    target/device.  If false, the target is allowed to leave resumed
>    threads un-committed at the end of whatever method it is executing.
> 
>  - the commit_resumed method: called when commit_resumed_state
>    transitions from false to true.  While commit_resumed_state was
>    false, the target may have left some resumed threads un-committed.
>    This method being called tells it that it should commit them back to
>    the execution device.
> 
> Let's take the "Stopping all threads" scenario from above and see how it
> would work with the ROCm target with this change.  Before stopping all
> threads, GDB would set the target's commit_resumed_state field to false.
> It would then ask the target to stop the first thread.  The target would
> retrieve all threads' state from the GPU and mark that one as stopped.
> Since commit_resumed_state is false, it leaves all the other threads
> (still resumed) stopped.  GDB would then proceed to call target_stop for
> all the other threads.  Since resumed threads are not committed, this
> doesn't do any back and forth with the GPU.
> 
> To simplify the implementation of targets, I made it so that when
> calling certain target methods, the contract between the core and the
> targets guarantees that commit_resumed_state is false.  This way, the
> target doesn't need two paths, one commit_resumed_state == true and one
> for commit_resumed_state == false.  It can just assert that
> commit_resumed_state is false and work with that assumption.  This also
> helps catch places where we forgot to disable commit_resumed_state
> before calling the method, which represents a probable optimization
> opportunity.
> 
> To have some confidence that this contract between the core and the
> targets is respected, I added assertions in the linux-nat target
> methods, even though the linux-nat target doesn't actually use that
> feature.  Since linux-nat is tested much more than other targets, this
> will help catch these issues quicker.

Did you consider adding the assertions to target.c instead, in the
target_resume/target_wait/target_stop wrapper methods?  That would
cover all targets.
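
A minimal sketch of what I mean, using hypothetical stand-in types (fake_target and target_resume_checked are made-up names, not GDB's): the contract is checked once in the common wrapper, rather than in each target's method.

```cpp
#include <cassert>

// Hypothetical stand-in for process_stratum_target; illustrative only.
struct fake_target
{
  bool commit_resumed_state = false;
  int resume_calls = 0;

  virtual ~fake_target () = default;
  virtual void resume () { ++resume_calls; }
};

// Hypothetical stand-in for a target.c wrapper such as target_resume.
// The contract is checked once here, so every target is covered without
// repeating the assertion in each implementation.
void
target_resume_checked (fake_target &target)
{
  assert (!target.commit_resumed_state);
  target.resume ();
}
```

The same pattern would presumably apply to the wait and stop wrappers.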

> 
> To ensure that commit_resumed_state is always turned back on (only if
> necessary, see below) and the commit_resumed method is called when doing
> so, I introduced the scoped_disabled_commit_resumed RAII object, which
> replaces make_scoped_defer_process_target_commit_resume.  On
> construction, it clears the commit_resumed_state flag of all process
> targets.  On destruction, it turns it back on (if necessary) and calls
> the commit_resumed method.  

This part makes me nervous, and I think it will cause us problems.  I'm
really not sure it's a good idea.  The issue is that the commit_resumed
method can throw, and we'll be in a dtor, which means that we will need to
swallow the error: there's no way to propagate it out and abort the
current function.  That's why we currently have explicit commit calls, and
the scoped object just tweaks the "defer commit" flag.  Would it work to
build on the current design instead of moving the commit to the dtor?
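
To illustrate the shape of the current design, here is a rough sketch with made-up names (fake_target, scoped_defer_commit and resume_many are hypothetical, not the actual GDB code).  The scoped object only flips a flag, so its destructor cannot throw, and the commit that may throw is an ordinary call:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a process target; not GDB's real class.
struct fake_target
{
  bool defer_commit = false;
  bool fail_on_commit = false;
  int commits = 0;

  // May throw, like a real commit that talks to a remote stub.
  void commit_resume ()
  {
    if (defer_commit)
      return;
    if (fail_on_commit)
      throw std::runtime_error ("commit failed");
    ++commits;
  }
};

// RAII object that only flips the "defer" flag.  Its destructor restores
// the previous value and never calls anything that can throw.
class scoped_defer_commit
{
public:
  explicit scoped_defer_commit (fake_target &t)
    : m_target (t), m_prev (t.defer_commit)
  { t.defer_commit = true; }

  ~scoped_defer_commit () { m_target.defer_commit = m_prev; }

  scoped_defer_commit (const scoped_defer_commit &) = delete;
  scoped_defer_commit &operator= (const scoped_defer_commit &) = delete;

private:
  fake_target &m_target;
  bool m_prev;
};

// Batch N resumption requests, then commit explicitly once the scope has
// ended.  The commit runs in ordinary code, so an exception it throws can
// be caught here (as done below) or left to propagate to the caller.
bool
resume_many (fake_target &t, int n)
{
  {
    scoped_defer_commit defer (t);
    for (int i = 0; i < n; ++i)
      t.commit_resume ();	// No-op while deferred.
  }

  try
    {
      t.commit_resume ();	// The one real commit.
      return true;
    }
  catch (const std::runtime_error &)
    {
      return false;
    }
}
```

If the commit were done from the destructor instead, the error would have to be swallowed there.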

> The nested case is handled by having a
> "nesting" counter: only when the counter goes back to 0 is
> commit_resumed_state turned back on.

It wasn't obvious to me from the description why we need both
commit_resumed_state and a counter.  As in, wouldn't just the counter work?
Like, if the count is 0, the state is on; if >0, it is off.
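
Roughly, as a sketch with made-up names (disable_depth, commit_resumed_enabled and scoped_disable are hypothetical).  It deliberately ignores the per-target cases where the state is left off on destruction:

```cpp
#include <cassert>

// Hypothetical global nesting counter; the name is illustrative only.
int disable_depth = 0;

// Derived state: commit-resumed is on exactly when no scope is live.
bool
commit_resumed_enabled ()
{
  return disable_depth == 0;
}

// Hypothetical RAII object that only bumps the counter.
class scoped_disable
{
public:
  scoped_disable () { ++disable_depth; }

  ~scoped_disable ()
  {
    assert (disable_depth > 0);
    --disable_depth;
  }

  scoped_disable (const scoped_disable &) = delete;
  scoped_disable &operator= (const scoped_disable &) = delete;
};

// Returns true if the state is off inside both nesting levels and is
// still off after only the inner scope has ended.
bool
stays_off_while_nested ()
{
  scoped_disable outer;
  bool off_in_inner;
  {
    scoped_disable inner;
    off_in_inner = !commit_resumed_enabled ();
  }
  return off_in_inner && !commit_resumed_enabled ();
}
```

With this, there is no separate flag that can get out of sync with the counter.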

Can different targets ever have different commit resumed states?
The only spot I see that tweaks the flag outside of the scoped object,
is record-full.c, but I think that's only to avoid hitting the assertion?
Do you plan on adding more spots that would override the state even if
a scoped_disable_commit_resumed object is live?

> 
> On destruction, commit-resumed is not re-enabled for a given target if:
> 
>  1. this target has no threads resumed, or
>  2. this target has at least one thread with a pending status known to the
>     core (saved in thread_info::suspend::waitstatus).

Should also check whether the thread with the pending status is resumed.
/me reads patch, oh, you did that.  Good.  Please mention it here:
... one resumed thread ...

> 
> The first point is not technically necessary, because a proper
> commit_resumed implementation would be a no-op if the target has no
> resumed threads.  But since we have a flag to do a quick check, I think
> it doesn't hurt.
> 
> The second point is more important: together with the
> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
> makes it so the "Resuming with pending events" described above is
> handled efficiently.  Here's what happens in that case:
> 
>  1. The user types "continue".
>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>     function does not enable commit-resumed, as it sees other threads
>     have pending statuses.
>  3. fetch_inferior_event is called to handle another event, one thread
>     is resumed.  Because there are still more threads with pending
>     statuses, the destructor of scoped_disable_commit_resumed in
>     fetch_inferior_event still doesn't enable commit-resumed.
>  4. Rinse and repeat step 3, until the last pending status is handled by
>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>     destructor sees there are no more threads with pending statuses, so
>     it asks the target to commit resumed threads.
> 
> This allows us to avoid all unnecessary back and forths, there is a
> single commit_resumed call.
> 
> This change required remote_target::remote_stop_ns to learn how to
> handle stopping threads that were resumed but pending vCont.  The
> simplest example where that happens is when using the remote target in
> all-stop, but with "maint set target-non-stop on", to force it to
> operate in non-stop mode under the hood.  If two threads hit a
> breakpoint at the same time, GDB will receive two stop replies.  It will
> present the stop for one thread and save the other one in
> thread_info::suspend::waitstatus.
> 
> Before this patch, when doing "continue", GDB first resumes the thread
> without a pending status:
> 
>     Sending packet: $vCont;c:p172651.172676#f3
> 
> It then consumes the pending status in the next fetch_inferior_event
> call:
> 
>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>     [infrun] target_wait (-1.0.0, status) =
>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
> 
> It then realizes it needs to stop all threads to present the stop, so
> stops the thread it just resumed:
> 
>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>     remote_stop called
>     Sending packet: $vCont;t:p172651.172676#04
> 
> This is an unnecessary resume/stop.  With this patch, we don't commit
> resumed threads after proceeding, because of the pending status:
> 
>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
> 
> When GDB handles the pending status and stop_all_threads runs, we stop a
> resumed but pending vCont thread:
> 
>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
> 
> That thread was never actually resumed on the remote stub / gdbserver.
> This is why remote_stop_ns needed to learn this new trick of enqueueing
> phony stop replies.
> 
> Note that this patch only considers pending statuses known to the core
> of GDB, that is the events that were pulled out of the target and stored
> in `thread_info::suspend::waitstatus`.  In some cases, we could also
> avoid unnecessary back and forth when the target has events that it has
> not yet reported to the core.  I plan to implement this as a subsequent
> patch, once this series has settled.
> 
> gdb/ChangeLog:
> 
> 	* infrun.h (struct scoped_disable_commit_resumed): New.
> 	* infrun.c (do_target_resume): Remove
> 	maybe_commit_resume_process_target call.
> 	(maybe_commit_resume_all_process_targets): Rename to...
> 	(maybe_commit_resumed_all_process_targets): ... this.  Skip
> 	targets that have no executing threads or resumed threads with
> 	a pending status.
> 	(scoped_disable_commit_resumed_depth): New.
> 	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
> 	New.
> 	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
> 	New.
> 	(proceed): Use scoped_disable_commit_resumed.
> 	(fetch_inferior_event): Use scoped_disable_commit_resumed.
> 	* process-stratum-target.h (class process_stratum_target):
> 	<commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	<commit_resumed_state>: New.
> 	(all_process_targets): New.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* process-stratum-target.c (all_process_targets): New.
> 	(defer_process_target_commit_resume): Remove.
> 	(maybe_commit_resume_process_target): Remove.
> 	(make_scoped_defer_process_target_commit_resume): Remove.
> 	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
> 	(linux_nat_target::wait): Add gdb_assert.
> 	(linux_nat_target::stop): Add gdb_assert.
> 	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
> 	(attach_command): Use scoped_disable_commit_resumed.
> 	(detach_command): Use scoped_disable_commit_resumed.
> 	(interrupt_target_1): Use scoped_disable_commit_resumed.
> 	* mi/mi-main.c (exec_continue): Use
> 	scoped_disable_commit_resumed.
> 	* record-full.c (record_full_wait_1): Change
> 	commit_resumed_state around calling commit_resumed.
> 	* remote.c (class remote_target) <commit_resume>: Rename to...
> 	<commit_resumed>: ... this.
> 	(remote_target::resume): Add gdb_assert.
> 	(remote_target::commit_resume): Rename to...
> 	(remote_target::commit_resumed): ... this.  Check if there is
> 	any thread pending vCont resume.
> 	(struct stop_reply): Move up.
> 	(remote_target::remote_stop_ns): Generate stop replies for
> 	resumed but pending vCont threads.
> 	(remote_target::wait_ns): Add gdb_assert.
> 
> [1] https://github.com/ROCm-Developer-Tools/ROCgdb/
> [2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
> 
> Change-Id: I836135531a29214b21695736deb0a81acf8cf566
> ---
>  gdb/infcmd.c                 |   8 +++
>  gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
>  gdb/infrun.h                 |  41 +++++++++++++
>  gdb/linux-nat.c              |   5 ++
>  gdb/mi/mi-main.c             |   2 +
>  gdb/process-stratum-target.c |  37 +++++------
>  gdb/process-stratum-target.h |  63 +++++++++++--------
>  gdb/record-full.c            |   4 +-
>  gdb/remote.c                 | 111 +++++++++++++++++++++++----------
>  9 files changed, 292 insertions(+), 95 deletions(-)
> 
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index 6f0ed952de67..b7595e42e265 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
>        uiout->flush ();
>      }
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("running");
> +
>    /* We call get_inferior_args() because we might need to compute
>       the value now.  */
>    run_target->create_inferior (exec_file,
> @@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
>    if (non_stop && !attach_target->supports_non_stop ())
>      error (_("Cannot attach to this target in non-stop mode"));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
> +
>    attach_target->attach (args, from_tty);
>    /* to_attach should push the target, so after this point we
>       shouldn't refer to attach_target again.  */
> @@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
>    if (inferior_ptid == null_ptid)
>      error (_("The program is not being run."));
>  
> +  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
> +

This one looks incorrect -- target_detach -> prepare_for_detach
may need to finish off displaced steps, and resume the target
in the process.  This here will inhibit it.  I have some WIP patches
that will stop prepare_for_detach from doing that though, so it'll
end up being correct after.

>    query_if_trace_running (from_tty);
>  
>    disconnect_tracing ();
> @@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
>  void
>  interrupt_target_1 (bool all_threads)
>  {
> +  scoped_disable_commit_resumed inhibit ("interrupting");
> +
>    if (non_stop)
>      {
>        if (all_threads)
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 1a27af51b7e9..92a1102cb595 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>  
>    target_resume (resume_ptid, step, sig);
>  
> -  maybe_commit_resume_process_target (tp->inf->process_target ());
> -
>    if (target_can_async_p ())
>      target_async (1);
>  }
> @@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
>  					    execution_direction)));
>  }
>  
> -/* Calls maybe_commit_resume_process_target on all process targets.  */
> +/* Maybe require all process stratum targets to commit their resumed threads.
> +
> +   A specific process stratum target is not required to do so if:
> +
> +   - it has no resumed threads
> +   - it has a thread with a pending status  */
>  
>  static void
> -maybe_commit_resume_all_process_targets ()
> +maybe_commit_resumed_all_process_targets ()
>  {
> -  scoped_restore_current_thread restore_thread;
> +  /* This is an optional to avoid unnecessary thread switches. */

Missing double space after period.

But, just scoped_restore_current_thread itself doesn't switch the
thread.  Is this trying to save something else?  It seems pointless
to me offhand.

> +  gdb::optional<scoped_restore_current_thread> restore_thread;
>  
>    for (process_stratum_target *target : all_non_exited_process_targets ())
>      {
> +      gdb_assert (!target->commit_resumed_state);

Not sure I understand this assertion.  Isn't this another thing
showing that the per-target state isn't really necessary, and we
could just use the global state?

> +
> +      if (!target->threads_executing)
> +	{
> +	  infrun_debug_printf ("not re-enabling forward progress for target "
> +			       "%s, no executing threads",
> +			       target->shortname ());
> +	  continue;
> +	}

...

> diff --git a/gdb/infrun.h b/gdb/infrun.h
> index 7160b60f1368..5c32c0c97f6e 100644
> --- a/gdb/infrun.h
> +++ b/gdb/infrun.h

> +
> +struct scoped_disable_commit_resumed
> +{
> +  scoped_disable_commit_resumed (const char *reason);

explicit

> index 1436a550ac04..9877f0d81931 100644
> --- a/gdb/process-stratum-target.c
> +++ b/gdb/process-stratum-target.c
> @@ -99,6 +99,20 @@ all_non_exited_process_targets ()
>  
>  /* See process-stratum-target.h.  */
>  
> +std::set<process_stratum_target *>
> +all_process_targets ()
> +{
> +  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
> +  std::set<process_stratum_target *> targets;
> +  for (inferior *inf : all_inferiors ())
> +    if (inf->process_target () != nullptr)
> +      targets.insert (inf->process_target ());
> +
> +  return targets;
> +}

An alternative that would avoid creating this temporary std::set
(along with its internal heap allocations) on every call would be to expose
target-connection.c:process_targets.

> +
> +/* See process-stratum-target.h.  */
> +
>  void
>  switch_to_target_no_thread (process_stratum_target *target)
>  {
> @@ -108,26 +122,3 @@ switch_to_target_no_thread (process_stratum_target *target)
>        break;
>      }
>  }
> -
> -/* If true, `maybe_commit_resume_process_target` is a no-op.  */
> -
> -static bool defer_process_target_commit_resume;
> -
> -/* See target.h.  */
> -
> -void
> -maybe_commit_resume_process_target (process_stratum_target *proc_target)
> -{
> -  if (defer_process_target_commit_resume)
> -    return;
> -
> -  proc_target->commit_resume ();
> -}
> -
> -/* See process-stratum-target.h.  */
> -
> -scoped_restore_tmpl<bool>
> -make_scoped_defer_process_target_commit_resume ()
> -{
> -  return make_scoped_restore (&defer_process_target_commit_resume, true);
> -}
> diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
> index c8060c46be93..3cea911dee09 100644
> --- a/gdb/process-stratum-target.h
> +++ b/gdb/process-stratum-target.h
> @@ -63,19 +63,10 @@ class process_stratum_target : public target_ops
>    bool has_registers () override;
>    bool has_execution (inferior *inf) override;
>  
> -  /* Commit a series of resumption requests previously prepared with
> -     resume calls.
> +  /* Ensure that all resumed threads are committed to the target.
>  
> -     GDB always calls `commit_resume` on the process stratum target after
> -     calling `resume` on a target stack.  A process stratum target may thus use
> -     this method in coordination with its `resume` method to batch resumption
> -     requests.  In that case, the target doesn't actually resume in its
> -     `resume` implementation.  Instead, it takes note of resumption intent in
> -     `resume`, and defers the actual resumption `commit_resume`.
> -
> -     E.g., the remote target uses this to coalesce multiple resumption requests
> -     in a single vCont packet.  */
> -  virtual void commit_resume () {}
> +     See the description of COMMIT_RESUMED_STATE for more details.  */
> +  virtual void commit_resumed () {}
>  
>    /* True if any thread is, or may be executing.  We need to track
>       this separately because until we fully sync the thread list, we
> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>  
>    /* The connection number.  Visible in "info connections".  */
>    int connection_number = 0;
> +
> +  /* Whether resumed threads must be committed to the target.
> +
> +     When true, resumed threads must be committed to the execution target.
> +
> +     When false, the process stratum target may leave resumed threads stopped
> +     when it's convenient or efficient to do so.  When the core requires resumed
> +     threads to be committed again, this is set back to true and calls the
> +     `commit_resumed` method to allow the target to do so.
> +
> +     To simplify the implementation of process stratum targets, the following
> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
> +     false:
> +
> +       - resume
> +       - stop
> +       - wait

Should we mention this in the documentation of each of these methods?

> +
> +     Knowing this, the process stratum target doesn't need to implement
> +     different behaviors depending on the COMMIT_RESUMED_STATE, and can
> +     simply assert that it is false.
> +
> +     Process stratum targets can take advantage of this to batch resumption
> +     requests, for example.  In that case, the target doesn't actually resume in
> +     its `resume` implementation.  Instead, it takes note of the resumption
> +     intent in `resume` and defers the actual resumption to `commit_resumed`.
> +     For example, the remote target uses this to coalesce multiple resumption
> +     requests in a single vCont packet.  */
> +  bool commit_resumed_state = false;
>  };



> @@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
>  	  continue;
>  	}
>  
> +      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
> +	any_pending_vcont_resume = true;
> +
>        /* If a thread is the parent of an unfollowed fork, then we
>  	 can't do a global wildcard, as that would resume the fork
>  	 child.  */
> @@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
>  	may_global_wildcard_vcont = 0;
>      }
>  
> +  /* We didn't have any resumed thread pending a vCont resume, so nothing to
> +     do.  */
> +  if (!any_pending_vcont_resume)
> +    return;

Is this just an optimization you noticed, or something more related to
this patch?

[-- Attachment #2: disp-step-buffers-test.c --]
[-- Type: text/x-csrc, Size: 849 bytes --]

#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#define NUM_THREADS 100

static pthread_t child_thread[NUM_THREADS];
static unsigned long long counters[NUM_THREADS];
static volatile int done;

static void *
child_function (void *arg)
{
  while (!done)
    counters[(long) arg]++;   // set breakpoint here
  return NULL;
}

int
main (void)
{
  long i;

  for (i = 0; i < NUM_THREADS; i++)
    pthread_create (&child_thread[i], NULL, child_function, (void *) i);

  sleep (10);

  done = 1;

  for (i = 0; i < NUM_THREADS; i++)
    pthread_join (child_thread[i], NULL);

  double avg = 0;
  for (i = 0; i < NUM_THREADS; i++)
    {
      printf ("thread %02ld, count %llu\n", i, counters[i]);
      avg += counters[i];
    }

  double f = avg;
  f /= NUM_THREADS;

  printf ("avg              %f\n", f);

  return 0;
}

* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-08  4:17 ` [PATCH v3 5/5] gdb: better handling of 'S' packets Simon Marchi
  2021-01-08 18:19   ` Andrew Burgess
@ 2021-01-09 21:26   ` Pedro Alves
  2021-01-11 20:36     ` Simon Marchi
  1 sibling, 1 reply; 33+ messages in thread
From: Pedro Alves @ 2021-01-09 21:26 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 08/01/21 04:17, Simon Marchi wrote:

> @@ -7796,75 +7799,117 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>    remote->remote_notif_get_pending_events (nc);
>  }
>  
> -/* Called when it is decided that STOP_REPLY holds the info of the
> -   event that is to be returned to the core.  This function always
> -   destroys STOP_REPLY.  */
> +/* Called from process_stop_reply when the stop packet we are responding
> +   to didn't include a process-id or thread-id.  STATUS is the stop event
> +   we are responding to.
> +
> +   It is the task of this function to select a suitable thread (or process)
> +   and return its ptid, this is the thread (or process) we will assume the
> +   stop event came from.
> +
> +   In some cases there isn't really any choice about which thread (or
> +   process) is selected, a basic remote with a single process containing a
> +   single thread might choose not to send any process-id or thread-id in
> +   its stop packets, this function will select and return the one and only
> +   thread.
> +
> +   However, if a target supports multiple threads (or processes) and still
> +   doesn't include a thread-id (or process-id) in its stop packet then
> +   first, this is a badly behaving target, and second, we're going to have
> +   to select a thread (or process) at random and use that.  This function
> +   will print a warning to the user if it detects that there is the
> +   possibility that GDB is guessing which thread (or process) to
> +   report.  */
>  
>  ptid_t
> -remote_target::process_stop_reply (struct stop_reply *stop_reply,
> -				   struct target_waitstatus *status)
> +remote_target::select_thread_for_ambiguous_stop_reply
> +  (const struct target_waitstatus *status)

Note that this is called before gdb fetches the updated thread list,
so the stop reply may be ambiguous without gdb realizing, if
the inferior spawned new threads, but the stop is for the thread
that was resumed.  Maybe the comment should mention that.

For this reason, I see this patch more as being lenient to the stub
than as fixing a GDB bug in how it implements the remote protocol.

>  {
> -  ptid_t ptid;
> +  /* Some stop events apply to all threads in an inferior, while others
> +     only apply to a single thread.  */
> +  bool is_stop_for_all_threads
> +    = (status->kind == TARGET_WAITKIND_EXITED
> +       || status->kind == TARGET_WAITKIND_SIGNALLED);

I didn't mention this before, but I keep having the same thought, so I'd
better speak up.  :-)  I find "stop is for all threads" ambiguous with
all-stop vs non-stop.  I'd suggest something like "process_wide_stop",
I think it would work.

>  
> -  *status = stop_reply->ws;
> -  ptid = stop_reply->ptid;
> +  thread_info *first_resumed_thread = nullptr;
> +  bool multiple_resumed_thread = false;
>  
> -  /* If no thread/process was reported by the stub then use the first
> -     non-exited thread in the current target.  */
> -  if (ptid == null_ptid)
> +  /* Consider all non-exited threads of the target, find the first resumed
> +     one.  */
> +  for (thread_info *thr : all_non_exited_threads (this))
>      {
> -      /* Some stop events apply to all threads in an inferior, while others
> -	 only apply to a single thread.  */
> -      bool is_stop_for_all_threads
> -	= (status->kind == TARGET_WAITKIND_EXITED
> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
> +      remote_thread_info *remote_thr =get_remote_thread_info (thr);
> +
> +      if (remote_thr->resume_state () != resume_state::RESUMED)
> +	continue;
> +
> +      if (first_resumed_thread == nullptr)
> +	first_resumed_thread = thr;


> +      else if (!is_stop_for_all_threads
> +	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
> +	multiple_resumed_thread = true;

The connection between the condition and whether there are multiple
resumed threads seems mysterious and distracting to me.  For a variable
called multiple_resumed_thread(s), I would have expected instead:

      if (first_resumed_thread == nullptr)
	first_resumed_thread = thr;
      else
        multiple_resumed_threads = true;

Maybe something like "bool ambiguous;" would be more to the point?
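
For instance, a self-contained sketch of the loop with that naming, using simplified made-up types (fake_thread, selection and select_thread are hypothetical, not the patch's code), and "process_wide_stop" as suggested above:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified thread record; not GDB's thread_info.
struct fake_thread
{
  int pid;
  int tid;
  bool resumed;
};

struct selection
{
  const fake_thread *thread;	// First resumed thread, or nullptr.
  bool ambiguous;		// True if another thread could be the source.
};

// Pick the first resumed thread, and note whether the choice was a guess.
// For a process-wide stop (exit, termination by signal), other resumed
// threads of the same process do not make the choice ambiguous.
selection
select_thread (const std::vector<fake_thread> &threads,
	       bool process_wide_stop)
{
  selection sel { nullptr, false };

  for (const fake_thread &thr : threads)
    {
      if (!thr.resumed)
	continue;

      if (sel.thread == nullptr)
	sel.thread = &thr;
      else if (!process_wide_stop || sel.thread->pid != thr.pid)
	sel.ambiguous = true;
    }

  return sel;
}
```

The condition then reads as "is this choice a guess?", which is what the warning is about.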

> +    }
>  
> -      for (thread_info *thr : all_non_exited_threads (this))
> +  gdb_assert (first_resumed_thread != nullptr);
> +
> +  /* Warn if the remote target is sending ambiguous stop replies.  */
> +  if (multiple_resumed_thread)
> +    {
> +      static bool warned = false;
> +


> +    # Single step thread 2.  Only the one thread will step.  When the
> +    # thread stops, if the stop packet doesn't include a thread-id
> +    # then GDB should still understand which thread stopped.
> +    gdb_test_multiple "stepi" "" {
> +	-re "Thread 1 received signal SIGTRAP" {
> +	    fail $gdb_test_name
> +	}

This is still missing consuming the prompt.  I'll leave deciding whether
this -re needs to be here to Andrew, but if it is kept, it should consume
the prompt, since otherwise we will leave the prompt in the expect
buffer and confuse the next gdb_test.  Just adding -wrap would do, I think.

Otherwise this LGTM.


> +	-re -wrap "$hex.*$decimal.*while \\(worker_blocked\\).*" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +

* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-09 20:34   ` Pedro Alves
@ 2021-01-11 20:28     ` Simon Marchi
  2021-01-22  2:46       ` Simon Marchi
  2021-01-22 22:07       ` Simon Marchi
  0 siblings, 2 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-11 20:28 UTC (permalink / raw)
  To: Pedro Alves, Simon Marchi, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 29685 bytes --]

On 2021-01-09 3:34 p.m., Pedro Alves wrote:
> On 08/01/21 04:17, Simon Marchi wrote:
>>  - Resuming with pending events: suppose the 1000 threads hit a
>>    breakpoint at the same time.  The breakpoint is conditional and
>>    evaluates to true for the first thread, to false for all others.  GDB
>>    pulls one event (for the first thread) from the target, decides that
>>    it should present a stop, so stops all threads using
>>    stop_all_threads.  All these other threads have a breakpoint event to
>>    report, which is saved in `thread_info::suspend::waitstatus` for
>>    later.  When the user does "continue", GDB resumes that one thread
>>    that did hit the breakpoint.  It then processes the pending events
>>    one by one as if they just arrived.  It picks one, evaluates the
>>    condition to false, and resumes the thread.  It picks another one,
>>    evaluates the condition to false, and resumes the thread.  And so on.
>>    In between each resumption, there is a full state retrieval and
>>    re-creation.  It would be much nicer if we could wait a little bit
>>    before sending those threads on the GPU, until it processed all those
>>    pending events.
> 
> A potential downside of holding on in this latter scenario, with regular
> host debugging, is that currently, threads are resumed immediately, thus potentially
> the inferior process's threads spend less time paused, at least with the native target
> if we implemented commit_resume there.  With remote, the trade off is
> probably more in favor of deferring, given the higher latency.
> 
> However, since we don't implement commit_resume for native target
> currently, it shouldn't have any effect there.

Indeed.  If there were a way to ptrace-resume multiple threads in one
ptrace call, we could think of implementing it for linux-nat, for
example.  But even then it would be a trade-off: not implementing it
gets the first thread back and running faster, while implementing it
reduces the number of syscalls done.

> To confirm this, I tried the testcase we used when debugging the
> displaced stepping buffers series, with 100 threads continuously
> stepping over a breakpoint, for 10 seconds.  Surprisingly, with the
> native target, I see a consistent ~3% slowdown caused by this series.
> 
> I don't see any material difference with gdbserver.
> 
> (higher is better)
> 
> native, pristine
> 
>  avg              440.240000
>  avg              436.670000
>  avg              451.310000
>  avg              432.840000
>  avg              437.060000
>  ===========================
>  avg of avg       439.624000
> 
> native, patched
> 
>  avg              420.940000
>  avg              428.130000
>  avg              425.230000
>  avg              428.080000
>  avg              424.880000
>  ===========================
>  avg of avg       425.452000
> 
> 
> 
> gdbserver, pristine:
> 
>  avg              633.490000
>  avg              639.910000
>  avg              642.300000
>  avg              626.160000
>  avg              626.460000
>  ===========================
>  avg of avg       633.664000
> 
> 
> gdbserver, patched
> 
>  avg              630.970000
>  avg              628.960000
>  avg              638.340000
>  avg              627.030000
>  avg              638.390000
>  ===========================
>  avg of avg       632.738000
> 
> tests run like this:
> 
>   $ gcc disp-step-buffers-test.c -o disp-step-buffers-test -g3 -O2 -pthread
>   $ g="./gdb -data-directory=data-directory"
>   $ time $g -q --batch disp-step-buffers-test -ex "b 16 if 0" -ex "r"
>   $ time $g -q --batch disp-step-buffers-test -ex "set sysroot" -ex "target remote | ../gdbserver/gdbserver - disp-step-buffers-test" -ex "b 16 if 0" -ex "c" 
> 
> I've attached disp-step-buffers-test.c.
> 
> I'm surprised that native debugging is quite slower here, compared to
> gdbserver.  I don't recall observing that earlier.  Maybe I just missed
> it then.

Well, remember that GDBserver is doing the condition evaluation, so it all
happens GDBserver-side.  And given that the state machines and everything
else in GDBserver are simpler than in GDB, I can imagine that this explains
why GDBserver is faster.

I tried it on my side, and I also see the GDBserver-based test doing about
1.33 times as many steps.

When adding "set breakpoint condition-evaluation host", the GDBserver
test becomes slower, doing about 0.55 times the number of steps of the GDB
baseline.

> 
> I wouldn't have thought we would be doing that much work that it
> would be noticeable with the native target (pristine vs patched, the 3%
> slowdown).  I wonder whether that is caused by the constant std::set allocation
> in all_process_targets.  But then it's strange that we don't see that
> same slowdown when remote debugging.  I'm surprised.

I see more of a ~1.33 % slowdown, but it's consistent too.
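
For reference, the ~3% figure can be re-derived from the per-run averages you quoted.  Here is a small sketch of the arithmetic (the helper and variable names are mine, not from any GDB code):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Per-run averages copied from the tables quoted above (higher is better).
const std::vector<double> native_pristine
  { 440.24, 436.67, 451.31, 432.84, 437.06 };
const std::vector<double> native_patched
  { 420.94, 428.13, 425.23, 428.08, 424.88 };
const std::vector<double> gdbserver_pristine
  { 633.49, 639.91, 642.30, 626.16, 626.46 };
const std::vector<double> gdbserver_patched
  { 630.97, 628.96, 638.34, 627.03, 638.39 };

// Arithmetic mean of the runs.
double
mean (const std::vector<double> &runs)
{
  return std::accumulate (runs.begin (), runs.end (), 0.0) / runs.size ();
}

// Relative slowdown of PATCHED compared to BASELINE, in percent.
double
slowdown_percent (double baseline, double patched)
{
  return (baseline - patched) / baseline * 100.0;
}
```

With your numbers, this gives about 3.2 % for native and about 0.15 % for gdbserver.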

It could be the std::set allocation.  I changed all_process_targets to make it
return an array of 1 element, just to see what happens, but it didn't seem to
help.  See attached patch
"0001-Test-returning-something-else-than-std-set-in-all_pr.patch"
if you want to try it.

Maybe it's due to the fact that we now iterate on all threads at every handled
event?  fetch_inferior_event calls ~scoped_disable_commit_resumed, which calls
maybe_commit_resumed_all_process_targets, which iterates on all threads.  The
loop actually breaks when it finds a thread with a pending status, but that
still makes this function O(number of threads).

>>
>> In the new version, we have two things in process_stratum_target:
>>
>>  - the commit_resumed_state field: indicates whether GDB requires this
>>    target to have resumed threads committed to the execution
>>    target/device.  If false, the target is allowed to leave resumed
>>    threads un-committed at the end of whatever method it is executing.
>>
>>  - the commit_resumed method: called when commit_resumed_state
>>    transitions from false to true.  While commit_resumed_state was
>>    false, the target may have left some resumed threads un-committed.
>>    This method being called tells it that it should commit them back to
>>    the execution device.
>>
>> Let's take the "Stopping all threads" scenario from above and see how it
>> would work with the ROCm target with this change.  Before stopping all
>> threads, GDB would set the target's commit_resumed_state field to false.
>> It would then ask the target to stop the first thread.  The target would
>> retrieve all threads' state from the GPU and mark that one as stopped.
>> Since commit_resumed_state is false, it leaves all the other threads
>> (still resumed) stopped.  GDB would then proceed to call target_stop for
>> all the other threads.  Since resumed threads are not committed, this
>> doesn't do any back and forth with the GPU.
>>
>> To simplify the implementation of targets, I made it so that when
>> calling certain target methods, the contract between the core and the
>> targets guarantees that commit_resumed_state is false.  This way, the
>> target doesn't need two paths, one commit_resumed_state == true and one
>> for commit_resumed_state == false.  It can just assert that
>> commit_resumed_state is false and work with that assumption.  This also
>> helps catch places where we forgot to disable commit_resumed_state
>> before calling the method, which represents a probable optimization
>> opportunity.
>>
>> To have some confidence that this contract between the core and the
>> targets is respected, I added assertions in the linux-nat target
>> methods, even though the linux-nat target doesn't actually use that
>> feature.  Since linux-nat is tested much more than other targets, this
>> will help catch these issues quicker.
> 
> Did you consider adding the assertions to target.c instead, in the
> target_resume/target_wait/target_stop wrapper methods?  That would
> cover all targets.

No, but it would be a good idea.  That wouldn't cover the cases where
target_ops::<method> is called directly, not through target_<method>.
From what I can see, this is only ever done by target methods to call
the corresponding method in the beneath target.  So presumably, somewhere
up the call stack is target_<method>, which will have already made the
assertion. The only exception to that is record_full_wait_1 which calls
beneath ()->resume.  But since both wait and resume are guaranteed to
be called with commit_resumed_state false, we're fine.

We'd be in trouble if, for example, a target's fetch_registers called
the beneath target's wait method (for some weird reason), as
fetch_registers could be called with commit_resumed_state true.  So,
just something to keep in mind in the future.
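
As a rough sketch of what checking the contract in the wrappers could
look like (toy_target and toy_target_resume are made-up stand-ins, and
plain assert stands in for gdb_assert; this is not the real target.c
code):

```cpp
#include <cassert>

// Toy stand-in for a process stratum target (not GDB's real class).
struct toy_target
{
  bool commit_resumed_state = false;
  int resume_calls = 0;

  // Stand-in for target_ops::resume.
  void resume () { ++resume_calls; }
};

// Wrapper in the style of target_resume: the contract is checked once
// here, so every target implementation reached through the wrapper is
// covered without repeating the assertion in each target.
void
toy_target_resume (toy_target &target)
{
  assert (!target.commit_resumed_state);
  target.resume ();
}
```

Direct target_ops::<method> calls bypass such a wrapper, which is why
the beneath-target case still has to be reasoned about separately.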

>> To ensure that commit_resumed_state is always turned back on (only if
>> necessary, see below) and the commit_resumed method is called when doing
>> so, I introduced the scoped_disable_commit_resumed RAII object, which
>> replaces make_scoped_defer_process_target_commit_resume.  On
>> construction, it clears the commit_resumed_state flag of all process
>> targets.  On destruction, it turns it back on (if necessary) and calls
>> the commit_resumed method.  
> 
> This part makes me nervous and I think will cause us problems.  I'm
> really not sure it's a good idea.  The issue is that the commit_resumed method can
> throw, and we'll be in a dtor, which means that we will need to swallow the
> error, there's no way to propagate it out aborting the current function.
> That's why we currently have explicit commit calls, and the scoped object just
> tweaks the "defer commit" flag.  Would it work to build on the current
> design instead of moving the commit to the dtor?

Hmm, I'll give it a try.  The reason why I made it RAII is that I wanted
to be absolutely sure commit_resumed_state was turned back to true, even
in case of error.  Perhaps indeed the RAII can just flip back
commit_resumed_state to true (like the defer commit flag is today) and
the call to the commit_resumed can be made by hand after the scope.
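
A minimal sketch of that split, assuming toy stand-in types
(toy_target, toy_scoped_disable and toy_proceed are hypothetical names,
not what the patch will end up with): the destructor only restores the
flag and cannot throw, and the commit is an ordinary call after the
scope, so an error propagates out of the function normally.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for a process stratum target (not GDB's real class).
struct toy_target
{
  bool commit_resumed_state = true;
  bool fail_commit = false;	// knob to simulate a failing commit
  int commits = 0;

  void commit_resumed ()
  {
    if (fail_commit)
      throw std::runtime_error ("commit failed");
    ++commits;
  }
};

// RAII object that only flips the flag; its destructor cannot throw.
struct toy_scoped_disable
{
  explicit toy_scoped_disable (toy_target &target)
    : m_target (target), m_prev (target.commit_resumed_state)
  { target.commit_resumed_state = false; }

  ~toy_scoped_disable ()
  { m_target.commit_resumed_state = m_prev; }

  toy_scoped_disable (const toy_scoped_disable &) = delete;
  toy_scoped_disable &operator= (const toy_scoped_disable &) = delete;

private:
  toy_target &m_target;
  bool m_prev;
};

void
toy_proceed (toy_target &target)
{
  {
    toy_scoped_disable disable (target);
    assert (!target.commit_resumed_state);
    /* ... resume / stop / wait calls would go here ...  */
  }

  // The flag is restored at this point, even if the block above threw.
  // Committing by hand lets an exception propagate to the caller.
  if (target.commit_resumed_state)
    target.commit_resumed ();
}
```

The cost is that each user of the scoped object has to remember the
explicit commit call, which is what the RAII version was trying to
avoid.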

>> The nested case is handled by having a
>> "nesting" counter: only when the counter goes back to 0 is
>> commit_resumed_state turned back on.
> 
> It wasn't obvious to me from the description why do we need both commit_resumed_state
> and a counter.  As in, wouldn't just the counter work?  Like, if the count is 0,
> the state is on, if >0, it is off.  

The counter just counts how many scoped_disable_commit_resumed
instances are active right now, up the stack, to make sure only
the outermost actually tries to re-enable commit-resumed.  This
is explained (perhaps poorly) in the comment in infrun.h:

   In addition, track creation of nested scoped_disable_commit_resumed objects,
   for cases like this:

     void
     inner_func ()
     {
       scoped_disable_commit_resumed disable;
       // do stuff
     }

     void
     outer_func ()
     {
       scoped_disable_commit_resumed disable;

       for (... each thread ...)
         inner_func ();
     }

   In this case, we don't want the `disable` in `inner_func` to require targets
   to commit resumed threads in its destructor.  */


When a scoped_disable_commit_resumed gets destroyed and
the counter goes down to 0, it means it's the outermost
instance, and that means it will try to re-enable
commit-resumed on all targets.  But that doesn't mean
that commit-resumed will be re-enabled on all targets:
if a target has a pending status, commit-resumed will
stay disabled.

The counter basically just replaces the scoped_restore on
defer_process_target_commit_resume.  Using a counter,
where each object decrements the counter on destruction,
is a slightly more general solution to this problem than
using scoped_restore, where each object restores the
value it saw when it got constructed.

The scoped_restore solution works when the lifetimes of
all instances of the object are perfectly nested.  The
counter solution works when they are not.  But since our
scoped_disable_commit_resumed objects are all stack
allocated, their lifetimes should be perfectly nested,
so just using a scoped_restore should work.  I'll try to
replace the counter with just a boolean and a
scoped_restore, as we have now, that may simplify things
a bit.

But hopefully that clarifies why commit_resumed_state
and that counter thing are not the same thing.
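
To illustrate the two approaches side by side, here is a small sketch
with toy types (all the toy_* names are made up; the "re-enable" step
is just a counter increment standing in for
maybe_commit_resumed_all_process_targets).  With stack-allocated,
perfectly nested objects, both end up re-enabling exactly once, from
the outermost instance:

```cpp
#include <cassert>

// Counter-based version: only the outermost object re-enables.
int toy_depth = 0;
int toy_counter_reenables = 0;

struct toy_counted_disable
{
  toy_counted_disable () { ++toy_depth; }

  ~toy_counted_disable ()
  {
    assert (toy_depth > 0);
    if (--toy_depth == 0)
      ++toy_counter_reenables;
  }
};

// Boolean version: each object restores the value it saw when it was
// constructed, in the manner of a scoped_restore.
bool toy_disabled = false;
int toy_bool_reenables = 0;

struct toy_restoring_disable
{
  toy_restoring_disable () : m_prev (toy_disabled)
  { toy_disabled = true; }

  ~toy_restoring_disable ()
  {
    toy_disabled = m_prev;
    if (!toy_disabled)
      ++toy_bool_reenables;
  }

private:
  bool m_prev;
};

// Same nesting shape as inner_func / outer_func in the infrun.h comment.
template<typename Disable>
void
toy_inner_func ()
{
  Disable disable;
}

template<typename Disable>
void
toy_outer_func ()
{
  Disable disable;

  for (int i = 0; i < 3; ++i)
    toy_inner_func<Disable> ();
}
```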

> Can different targets ever have different commit resumed states?
> The only spot I see that tweaks the flag outside of the scoped object,
> is record-full.c, but I think that's only to avoid hitting the assertion?
> Do you plan on adding more spots that would override the state even if
> a scoped_disable_commit_resumed object is live?

No, I don't plan to add more such spots.  The record one is just an
annoying exception.

And yes, different targets can have different commit resumed states.

When a scoped_disable_commit_resumed object is created, it disables
commit-resumed for all targets.  When it is destructed, commit-resumed
is conditionally re-enabled for targets which have resumed threads
(from the point of view of infrun) and no pending status.

So if you have two targets with threads resumed, one of which has a
pending status, then that one will have commit-resumed off and the
other one will have commit-resumed on.  Or if you have two targets,
all threads stopped, and single step one thread.  Only the target
with the single-stepped thread will have its commit-resumed state
momentarily turned on.

It would be possible to implement different schemes, like a single
commit-resumed state for all targets.  If one target somewhere has
a pending status, we don't commit-resumed anybody.  But it seemed
to me like having things a bit more granular from the start would
help in the long run.
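
The per-target decision described above can be sketched like this,
assuming toy types (toy_thread, toy_target and toy_maybe_commit_resumed
are hypothetical and much simpler than the real infrun.c code):

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins, not GDB's thread_info / process_stratum_target.
struct toy_thread
{
  bool resumed;
  bool has_pending_status;
};

struct toy_target
{
  std::vector<toy_thread> threads;
  bool commit_resumed_state = false;
  int commits = 0;
};

// Re-enable commit-resumed only for targets that have at least one
// resumed thread and no resumed thread with a pending status.
void
toy_maybe_commit_resumed (std::vector<toy_target> &targets)
{
  for (toy_target &target : targets)
    {
      assert (!target.commit_resumed_state);

      bool any_resumed = false;
      bool pending = false;

      for (const toy_thread &thr : target.threads)
	{
	  if (!thr.resumed)
	    continue;

	  any_resumed = true;
	  if (thr.has_pending_status)
	    {
	      pending = true;
	      break;
	    }
	}

      if (!any_resumed || pending)
	continue;

      target.commit_resumed_state = true;
      ++target.commits;
    }
}
```

Each target is still O(number of threads) in the worst case here, since
the inner loop only exits early when it finds a pending status.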

>> On destruction, commit-resumed is not re-enabled for a given target if:
>>
>>  1. this target has no threads resumed, or
>>  2. this target has at least one thread with a pending status known to the
>>     core (saved in thread_info::suspend::waitstatus).
> 
> Should also check whether the thread with the pending status is resumed.
> /me reads patch, oh, you did that.  Good.  Please mention it here:
> ... one resumed thread ...

Will do.  I did not add this check from the start, I only realized it
was needed when debugging some testsuite regression.

>> The first point is not technically necessary, because a proper
>> commit_resumed implementation would be a no-op if the target has no
>> resumed threads.  But since we have a flag to do a quick check, I think
>> it doesn't hurt.
>>
>> The second point is more important: together with the
>> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
>> makes it so the "Resuming with pending events" described above is
>> handled efficiently.  Here's what happens in that case:
>>
>>  1. The user types "continue".
>>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>>     function does not enable commit-resumed, as it sees other threads
>>     have pending statuses.
>>  3. fetch_inferior_event is called to handle another event, one thread
>>     is resumed.  Because there are still more threads with pending
>>     statuses, the destructor of scoped_disable_commit_resumed in
>>     fetch_inferior_event still doesn't enable commit-resumed.
>>  4. Rinse and repeat step 3, until the last pending status is handled by
>>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>>     destructor sees there are no more threads with pending statuses, so
>>     it asks the target to commit resumed threads.
>>
>> This allows us to avoid all unnecessary back and forths, there is a
>> single commit_resumed call.
>>
>> This change required remote_target::remote_stop_ns to learn how to
>> handle stopping threads that were resumed but pending vCont.  The
>> simplest example where that happens is when using the remote target in
>> all-stop, but with "maint set target-non-stop on", to force it to
>> operate in non-stop mode under the hood.  If two threads hit a
>> breakpoint at the same time, GDB will receive two stop replies.  It will
>> present the stop for one thread and save the other one in
>> thread_info::suspend::waitstatus.
>>
>> Before this patch, when doing "continue", GDB first resumes the thread
>> without a pending status:
>>
>>     Sending packet: $vCont;c:p172651.172676#f3
>>
>> It then consumes the pending status in the next fetch_inferior_event
>> call:
>>
>>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>>     [infrun] target_wait (-1.0.0, status) =
>>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
>>
>> It then realizes it needs to stop all threads to present the stop, so
>> stops the thread it just resumed:
>>
>>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>>     remote_stop called
>>     Sending packet: $vCont;t:p172651.172676#04
>>
>> This is an unnecessary resume/stop.  With this patch, we don't commit
>> resumed threads after proceeding, because of the pending status:
>>
>>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
>>
>> When GDB handles the pending status and stop_all_threads runs, we stop a
>> resumed but pending vCont thread:
>>
>>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
>>
>> That thread was never actually resumed on the remote stub / gdbserver.
>> This is why remote_stop_ns needed to learn this new trick of enqueueing
>> phony stop replies.
>>
>> Note that this patch only considers pending statuses known to the core
>> of GDB, that is the events that were pulled out of the target and stored
>> in `thread_info::suspend::waitstatus`.  In some cases, we could also
>> avoid unnecessary back and forth when the target has events that it has
>> not yet reported to the core.  I plan to implement this as a subsequent
>> patch, once this series has settled.
>>
>> gdb/ChangeLog:
>>
>> 	* infrun.h (struct scoped_disable_commit_resumed): New.
>> 	* infrun.c (do_target_resume): Remove
>> 	maybe_commit_resume_process_target call.
>> 	(maybe_commit_resume_all_process_targets): Rename to...
>> 	(maybe_commit_resumed_all_process_targets): ... this.  Skip
>> 	targets that have no executing threads or resumed threads with
>> 	a pending status.
>> 	(scoped_disable_commit_resumed_depth): New.
>> 	(scoped_disable_commit_resumed::scoped_disable_commit_resumed):
>> 	New.
>> 	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed):
>> 	New.
>> 	(proceed): Use scoped_disable_commit_resumed.
>> 	(fetch_inferior_event): Use scoped_disable_commit_resumed.
>> 	* process-stratum-target.h (class process_stratum_target):
>> 	<commit_resume>: Rename to...
>> 	<commit_resumed>: ... this.
>> 	<commit_resumed_state>: New.
>> 	(all_process_targets): New.
>> 	(maybe_commit_resume_process_target): Remove.
>> 	(make_scoped_defer_process_target_commit_resume): Remove.
>> 	* process-stratum-target.c (all_process_targets): New.
>> 	(defer_process_target_commit_resume): Remove.
>> 	(maybe_commit_resume_process_target): Remove.
>> 	(make_scoped_defer_process_target_commit_resume): Remove.
>> 	* linux-nat.c (linux_nat_target::resume): Add gdb_assert.
>> 	(linux_nat_target::wait): Add gdb_assert.
>> 	(linux_nat_target::stop): Add gdb_assert.
>> 	* infcmd.c (run_command_1): Use scoped_disable_commit_resumed.
>> 	(attach_command): Use scoped_disable_commit_resumed.
>> 	(detach_command): Use scoped_disable_commit_resumed.
>> 	(interrupt_target_1): Use scoped_disable_commit_resumed.
>> 	* mi/mi-main.c (exec_continue): Use
>> 	scoped_disable_commit_resumed.
>> 	* record-full.c (record_full_wait_1): Change
>> 	commit_resumed_state around calling commit_resumed.
>> 	* remote.c (class remote_target) <commit_resume>: Rename to...
>> 	<commit_resumed>: ... this.
>> 	(remote_target::resume): Add gdb_assert.
>> 	(remote_target::commit_resume): Rename to...
>> 	(remote_target::commit_resumed): ... this.  Check if there is
>> 	any thread pending vCont resume.
>> 	(struct stop_reply): Move up.
>> 	(remote_target::remote_stop_ns): Generate stop replies for
>> 	resumed but pending vCont threads.
>> 	(remote_target::wait_ns): Add gdb_assert.
>>
>> [1] https://github.com/ROCm-Developer-Tools/ROCgdb/
>> [2] https://github.com/ROCm-Developer-Tools/ROCdbgapi
>>
>> Change-Id: I836135531a29214b21695736deb0a81acf8cf566
>> ---
>>  gdb/infcmd.c                 |   8 +++
>>  gdb/infrun.c                 | 116 +++++++++++++++++++++++++++++++----
>>  gdb/infrun.h                 |  41 +++++++++++++
>>  gdb/linux-nat.c              |   5 ++
>>  gdb/mi/mi-main.c             |   2 +
>>  gdb/process-stratum-target.c |  37 +++++------
>>  gdb/process-stratum-target.h |  63 +++++++++++--------
>>  gdb/record-full.c            |   4 +-
>>  gdb/remote.c                 | 111 +++++++++++++++++++++++----------
>>  9 files changed, 292 insertions(+), 95 deletions(-)
>>
>> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
>> index 6f0ed952de67..b7595e42e265 100644
>> --- a/gdb/infcmd.c
>> +++ b/gdb/infcmd.c
>> @@ -488,6 +488,8 @@ run_command_1 (const char *args, int from_tty, enum run_how run_how)
>>        uiout->flush ();
>>      }
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("running");
>> +
>>    /* We call get_inferior_args() because we might need to compute
>>       the value now.  */
>>    run_target->create_inferior (exec_file,
>> @@ -2591,6 +2593,8 @@ attach_command (const char *args, int from_tty)
>>    if (non_stop && !attach_target->supports_non_stop ())
>>      error (_("Cannot attach to this target in non-stop mode"));
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("attaching");
>> +
>>    attach_target->attach (args, from_tty);
>>    /* to_attach should push the target, so after this point we
>>       shouldn't refer to attach_target again.  */
>> @@ -2746,6 +2750,8 @@ detach_command (const char *args, int from_tty)
>>    if (inferior_ptid == null_ptid)
>>      error (_("The program is not being run."));
>>  
>> +  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
>> +
> 
> This one looks incorrect -- target_detach -> prepare_for_detach
> may need to finish off displaced steps, and resume the target
> in the process.  This here will inhibit it.  I have some WIP patches
> that will stop prepare_for_detach from doing that though, so it'll
> end up being correct after.

Ok, I will re-check that.

> 
>>    query_if_trace_running (from_tty);
>>  
>>    disconnect_tracing ();
>> @@ -2814,6 +2820,8 @@ stop_current_target_threads_ns (ptid_t ptid)
>>  void
>>  interrupt_target_1 (bool all_threads)
>>  {
>> +  scoped_disable_commit_resumed inhibit ("interrupting");
>> +
>>    if (non_stop)
>>      {
>>        if (all_threads)
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 1a27af51b7e9..92a1102cb595 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -2172,8 +2172,6 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
>>  
>>    target_resume (resume_ptid, step, sig);
>>  
>> -  maybe_commit_resume_process_target (tp->inf->process_target ());
>> -
>>    if (target_can_async_p ())
>>      target_async (1);
>>  }
>> @@ -2760,17 +2758,109 @@ schedlock_applies (struct thread_info *tp)
>>  					    execution_direction)));
>>  }
>>  
>> -/* Calls maybe_commit_resume_process_target on all process targets.  */
>> +/* Maybe require all process stratum targets to commit their resumed threads.
>> +
>> +   A specific process stratum target is not required to do so if:
>> +
>> +   - it has no resumed threads
>> +   - it has a thread with a pending status  */
>>  
>>  static void
>> -maybe_commit_resume_all_process_targets ()
>> +maybe_commit_resumed_all_process_targets ()
>>  {
>> -  scoped_restore_current_thread restore_thread;
>> +  /* This is an optional to avoid unnecessary thread switches. */
> 
> Missing double space after period.
> 
> But, just scoped_restore_current_thread itself doesn't switch the
> thread.  Is this trying to save something else?  It seems pointless
> to me offhand.

IIRC that was to avoid some regressions in annotation tests, where
we would suddenly generate some additional "registers changed" events
or something like that after the prompt.

> 
>> +  gdb::optional<scoped_restore_current_thread> restore_thread;
>>  
>>    for (process_stratum_target *target : all_non_exited_process_targets ())
>>      {
>> +      gdb_assert (!target->commit_resumed_state);
> 
> Not sure I understand this assertion.  Isn't this another thing
> showing that the per-target state isn't really necessary, and we
> could just use the global state?

I don't think so.  maybe_commit_resumed_all_process_targets is called
when destructing the last / outermost scoped_disable_commit_resumed
object.  When constructing that scoped_disable_commit_resumed object,
we explicitly set commit_resumed_state for all targets to false, to
disable commit-resumed for all targets for the duration of the scope.
So this verifies that it's still the case.  We don't expect any other
inner code to change that value.  Or if some code does
(like the record-full target's wait method), then it must make sure
to turn it back to false.

After maybe_commit_resumed_all_process_targets has run, the targets
that have threads resumed and no pending status will have their
commit-resumed state turned back to true, while other targets will
have theirs still false.

>> diff --git a/gdb/infrun.h b/gdb/infrun.h
>> index 7160b60f1368..5c32c0c97f6e 100644
>> --- a/gdb/infrun.h
>> +++ b/gdb/infrun.h
> 
>> +
>> +struct scoped_disable_commit_resumed
>> +{
>> +  scoped_disable_commit_resumed (const char *reason);
> 
> explicit

Fixed.

> 
>> index 1436a550ac04..9877f0d81931 100644
>> --- a/gdb/process-stratum-target.c
>> +++ b/gdb/process-stratum-target.c
>> @@ -99,6 +99,20 @@ all_non_exited_process_targets ()
>>  
>>  /* See process-stratum-target.h.  */
>>  
>> +std::set<process_stratum_target *>
>> +all_process_targets ()
>> +{
>> +  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
>> +  std::set<process_stratum_target *> targets;
>> +  for (inferior *inf : all_inferiors ())
>> +    if (inf->process_target () != nullptr)
>> +      targets.insert (inf->process_target ());
>> +
>> +  return targets;
>> +}
> 
> An alternative that would avoid creating this temporary std::set
> (along with its internal heap allocations) on every call would be to expose
> target-connection.c:process_targets.

That would make sense.
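
For illustration only, the general shape of that alternative: keep one
long-lived container that is updated when connections come and go, and
hand out a reference to it instead of building a new set on every
call.  All names below are hypothetical, and I'm assuming the existing
container can be exposed this way; I haven't checked how
target-connection.c maintains it.

```cpp
#include <cassert>
#include <set>

// Toy stand-in for a process stratum target (not GDB's real class).
struct toy_target
{
};

// Long-lived registry, updated when a connection is added or removed.
std::set<toy_target *> toy_registry;

void
toy_connection_list_add (toy_target *target)
{
  toy_registry.insert (target);
}

void
toy_connection_list_remove (toy_target *target)
{
  toy_registry.erase (target);
}

// Returns a reference: no temporary set, no allocation per call.
const std::set<toy_target *> &
all_toy_targets ()
{
  return toy_registry;
}
```

Callers that iterate with a range-for keep working unchanged, as long
as nothing adds or removes a target during the iteration.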

>> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>>  
>>    /* The connection number.  Visible in "info connections".  */
>>    int connection_number = 0;
>> +
>> +  /* Whether resumed threads must be committed to the target.
>> +
>> +     When true, resumed threads must be committed to the execution target.
>> +
>> +     When false, the process stratum target may leave resumed threads stopped
>> +     when it's convenient or efficient to do so.  When the core requires resumed
>> +     threads to be committed again, this is set back to true and calls the
>> +     `commit_resumed` method to allow the target to do so.
>> +
>> +     To simplify the implementation of process stratum targets, the following
>> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
>> +     false:
>> +
>> +       - resume
>> +       - stop
>> +       - wait
> 
> Should we mention this in the documentation of each of these methods?

Yeah that would be nice.  Would you mention it in both places
or just in those methods' documentation?

>> @@ -6656,6 +6660,9 @@ remote_target::commit_resume ()
>>  	  continue;
>>  	}
>>  
>> +      if (priv->resume_state () == resume_state::RESUMED_PENDING_VCONT)
>> +	any_pending_vcont_resume = true;
>> +
>>        /* If a thread is the parent of an unfollowed fork, then we
>>  	 can't do a global wildcard, as that would resume the fork
>>  	 child.  */
>> @@ -6663,6 +6670,11 @@ remote_target::commit_resume ()
>>  	may_global_wildcard_vcont = 0;
>>      }
>>  
>> +  /* We didn't have any resumed thread pending a vCont resume, so nothing to
>> +     do.  */
>> +  if (!any_pending_vcont_resume)
>> +    return;
> 
> Is this just an optimization you noticed, or something more related to
> this patch?

Damn, I knew you would ask :P.  I honestly can't remember.  I think
it's just an obvious-ish optimization.  With the scoped_disable_commit_resumed
in fetch_inferior_event, I am under the impression that we potentially call
commit_resumed more often when nothing actually requires commit-resuming.

Let's say you debug a remote program and a native program (so, one remote
target and the native target), both are running.  When the native target
generates an event, we will end up calling commit_resumed on the remote
target, although nothing needs to be done.  So an early exit sounds
beneficial to avoid the extra work.  But I think it's not necessary, the
remote target's commit_resumed function would otherwise do the right thing
(send nothing) if nothing needs to be done.

That means I could keep this change for later or make a preparatory patch
for it.

And to further optimize things (to avoid iterating on all of the target's
threads), we could maintain a flag in the target that indicates whether
any thread is in the RESUMED_PENDING_VCONT state.  If that flag is false,
we can early-return without doing any work.
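
A sketch of that idea, using a count rather than a plain flag so it
stays correct as individual threads change state.  Everything here is
a toy stand-in: toy_remote and toy_resume_state are made up, the
NOT_RESUMED enumerator is an assumption on my part, and packets_sent
just counts how many times a vCont would have been sent.

```cpp
#include <cassert>
#include <vector>

// Toy version of the remote target's per-thread resume state.
enum class toy_resume_state
{
  NOT_RESUMED,
  RESUMED_PENDING_VCONT,
  RESUMED,
};

struct toy_remote
{
  // Number of threads currently in RESUMED_PENDING_VCONT.
  int pending_vcont_count = 0;
  int packets_sent = 0;

  // All state changes go through here so the count stays in sync.
  void set_state (toy_resume_state &slot, toy_resume_state next)
  {
    if (slot == toy_resume_state::RESUMED_PENDING_VCONT)
      --pending_vcont_count;
    if (next == toy_resume_state::RESUMED_PENDING_VCONT)
      ++pending_vcont_count;
    slot = next;
  }

  void commit_resumed (std::vector<toy_resume_state> &threads)
  {
    // Nothing pending: return without looking at any thread.
    if (pending_vcont_count == 0)
      return;

    for (toy_resume_state &state : threads)
      if (state == toy_resume_state::RESUMED_PENDING_VCONT)
	set_state (state, toy_resume_state::RESUMED);

    ++packets_sent;
  }
};
```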

Simon


[-- Attachment #2: 0001-Test-returning-something-else-than-std-set-in-all_pr.patch --]
[-- Type: text/x-patch, Size: 2750 bytes --]

From 73b29de6b97acc477f131d4e26919395e8953f04 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@efficios.com>
Date: Mon, 11 Jan 2021 13:55:27 -0500
Subject: [PATCH] Test returning something else than std::set in
 all_process_targets

Change-Id: Ieb9ab1997824c8d7ef8e2bc998a4131b97b1b434
---
 gdb/infrun.c                 |  8 +++++++-
 gdb/process-stratum-target.c | 10 +++++-----
 gdb/process-stratum-target.h |  2 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 92a1102cb59..ab619814420 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2826,6 +2826,8 @@ scoped_disable_commit_resumed::scoped_disable_commit_resumed
 
   for (process_stratum_target *target : all_process_targets ())
     {
+      if (target == nullptr)
+	break;
       if (scoped_disable_commit_resumed_depth == 0)
 	{
 	  /* This is the outermost instance.  */
@@ -2860,7 +2862,11 @@ scoped_disable_commit_resumed::~scoped_disable_commit_resumed ()
       /* This is not the outermost instance, we expect COMMIT_RESUMED_STATE to
 	 still be false.  */
       for (process_stratum_target *target : all_process_targets ())
-	gdb_assert (!target->commit_resumed_state);
+	{
+	  if (target == nullptr)
+	    break;
+	  gdb_assert (!target->commit_resumed_state);
+	}
     }
 }
 
diff --git a/gdb/process-stratum-target.c b/gdb/process-stratum-target.c
index 9877f0d8193..184450e99f9 100644
--- a/gdb/process-stratum-target.c
+++ b/gdb/process-stratum-target.c
@@ -99,16 +99,16 @@ all_non_exited_process_targets ()
 
 /* See process-stratum-target.h.  */
 
-std::set<process_stratum_target *>
+std::array<process_stratum_target *, 1>
 all_process_targets ()
 {
-  /* Inferiors may share targets.  To eliminate duplicates, use a set.  */
-  std::set<process_stratum_target *> targets;
+  std::array<process_stratum_target *, 1> array;
+  array[0] = nullptr;
   for (inferior *inf : all_inferiors ())
     if (inf->process_target () != nullptr)
-      targets.insert (inf->process_target ());
+      array[0] = inf->process_target ();
 
-  return targets;
+  return array;
 }
 
 /* See process-stratum-target.h.  */
diff --git a/gdb/process-stratum-target.h b/gdb/process-stratum-target.h
index 3cea911dee0..c387c0ec11d 100644
--- a/gdb/process-stratum-target.h
+++ b/gdb/process-stratum-target.h
@@ -123,7 +123,7 @@ extern std::set<process_stratum_target *> all_non_exited_process_targets ();
 
 /* Return a collection of all existing process stratum targets.  */
 
-extern std::set<process_stratum_target *> all_process_targets ();
+extern std::array<process_stratum_target *, 1> all_process_targets ();
 
 /* Switch to the first inferior (and program space) of TARGET, and
    switch to no thread selected.  */
-- 
2.29.2



* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-09 21:26   ` Pedro Alves
@ 2021-01-11 20:36     ` Simon Marchi
  2021-01-12  3:07       ` Simon Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Simon Marchi @ 2021-01-11 20:36 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 2021-01-09 4:26 p.m., Pedro Alves wrote:
> On 08/01/21 04:17, Simon Marchi wrote:
> 
>> @@ -7796,75 +7799,117 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
>>    remote->remote_notif_get_pending_events (nc);
>>  }
>>  
>> -/* Called when it is decided that STOP_REPLY holds the info of the
>> -   event that is to be returned to the core.  This function always
>> -   destroys STOP_REPLY.  */
>> +/* Called from process_stop_reply when the stop packet we are responding
>> +   to didn't include a process-id or thread-id.  STATUS is the stop event
>> +   we are responding to.
>> +
>> +   It is the task of this function to select a suitable thread (or process)
>> +   and return its ptid, this is the thread (or process) we will assume the
>> +   stop event came from.
>> +
>> +   In some cases there isn't really any choice about which thread (or
>> +   process) is selected, a basic remote with a single process containing a
>> +   single thread might choose not to send any process-id or thread-id in
>> +   its stop packets, this function will select and return the one and only
>> +   thread.
>> +
>> +   However, if a target supports multiple threads (or processes) and still
>> +   doesn't include a thread-id (or process-id) in its stop packet then
>> +   first, this is a badly behaving target, and second, we're going to have
>> +   to select a thread (or process) at random and use that.  This function
>> +   will print a warning to the user if it detects that there is the
>> +   possibility that GDB is guessing which thread (or process) to
>> +   report.  */
>>  
>>  ptid_t
>> -remote_target::process_stop_reply (struct stop_reply *stop_reply,
>> -				   struct target_waitstatus *status)
>> +remote_target::select_thread_for_ambiguous_stop_reply
>> +  (const struct target_waitstatus *status)
> 
> Note that this is called before gdb fetches the updated thread list,
> so the stop reply may be ambiguous without gdb realizing, if
> the inferior spawned new threads, but the stop is for the thread
> that was resumed.  Maybe the comment should mention that.
> 
> For this reason, I see this patch more as being lenient to the stub,
> than fixing a GDB bug with misimplementing the remote protocol.

I don't really understand this.

> 
>>  {
>> -  ptid_t ptid;
>> +  /* Some stop events apply to all threads in an inferior, while others
>> +     only apply to a single thread.  */
>> +  bool is_stop_for_all_threads
>> +    = (status->kind == TARGET_WAITKIND_EXITED
>> +       || status->kind == TARGET_WAITKIND_SIGNALLED);
> 
> I didn't mention this before, but I keep having the same thought, so I'd
> better speak up.  :-)  I find "stop is for all threads" ambiguous with
> all-stop vs non-stop.  I'd suggest something like "process_wide_stop",
> I think it would work.

Agreed, will fix.

> 
>>  
>> -  *status = stop_reply->ws;
>> -  ptid = stop_reply->ptid;
>> +  thread_info *first_resumed_thread = nullptr;
>> +  bool multiple_resumed_thread = false;
>>  
>> -  /* If no thread/process was reported by the stub then use the first
>> -     non-exited thread in the current target.  */
>> -  if (ptid == null_ptid)
>> +  /* Consider all non-exited threads of the target, find the first resumed
>> +     one.  */
>> +  for (thread_info *thr : all_non_exited_threads (this))
>>      {
>> -      /* Some stop events apply to all threads in an inferior, while others
>> -	 only apply to a single thread.  */
>> -      bool is_stop_for_all_threads
>> -	= (status->kind == TARGET_WAITKIND_EXITED
>> -	   || status->kind == TARGET_WAITKIND_SIGNALLED);
>> +      remote_thread_info *remote_thr =get_remote_thread_info (thr);
>> +
>> +      if (remote_thr->resume_state () != resume_state::RESUMED)
>> +	continue;
>> +
>> +      if (first_resumed_thread == nullptr)
>> +	first_resumed_thread = thr;
> 
> 
>> +      else if (!is_stop_for_all_threads
>> +	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
>> +	multiple_resumed_thread = true;
> 
> The connection between the condition and whether there are multiple
> resumed threads seems mysterious and distracting to me.  For a variable
> called multiple_resumed_thread(s), I would have expected instead:
> 
>       if (first_resumed_thread == nullptr)
> 	first_resumed_thread = thr;
>       else
>         multiple_resumed_threads = true;
> 
> maybe something like "bool ambiguous;" would be more to the point?

Makes sense.

> 
>> +    }
>>  
>> -      for (thread_info *thr : all_non_exited_threads (this))
>> +  gdb_assert (first_resumed_thread != nullptr);
>> +
>> +  /* Warn if the remote target is sending ambiguous stop replies.  */
>> +  if (multiple_resumed_thread)
>> +    {
>> +      static bool warned = false;
>> +
> 
> 
>> +    # Single step thread 2.  Only the one thread will step.  When the
>> +    # thread stops, if the stop packet doesn't include a thread-id
>> +    # then GDB should still understand which thread stopped.
>> +    gdb_test_multiple "stepi" "" {
>> +	-re "Thread 1 received signal SIGTRAP" {
>> +	    fail $gdb_test_name
>> +	}
> 
> This is still missing consuming the prompt.  I'll leave deciding whether
> this -re needs to be here to Andrew, but if it is kept, it should consume
> the prompt, since otherwise we will leave the prompt in the expect
> buffer and confuse the next gdb_test.  Just adding -wrap would do, I think.


> Otherwise this LGTM.

Thanks, I'll address the comments and push patches 1, 2 and 5.

Simon


* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-11 20:36     ` Simon Marchi
@ 2021-01-12  3:07       ` Simon Marchi
  2021-01-13 20:17         ` Pedro Alves
  0 siblings, 1 reply; 33+ messages in thread
From: Simon Marchi @ 2021-01-12  3:07 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 2021-01-11 3:36 p.m., Simon Marchi via Gdb-patches wrote:
> On 2021-01-09 4:26 p.m., Pedro Alves wrote:
>> Note that this is called before gdb fetches the updated thread list,
>> so the stop reply may be ambiguous without gdb realizing, if
>> the inferior spawned new threads, but the stop is for the thread
>> that was resumed.  Maybe the comment should mention that.
>>
>> For this reason, I see this patch more as being lenient to the stub,
>> than fixing a GDB bug with misimplementing the remote protocol.
> 
> I don't really understand this.

After re-reading it, I think I get it.  Please see the updated patch
below, see if the modified comment makes sense.

>>> +    # Single step thread 2.  Only the one thread will step.  When the
>>> +    # thread stops, if the stop packet doesn't include a thread-id
>>> +    # then GDB should still understand which thread stopped.
>>> +    gdb_test_multiple "stepi" "" {
>>> +	-re "Thread 1 received signal SIGTRAP" {
>>> +	    fail $gdb_test_name
>>> +	}
>>
>> This is still missing consuming the prompt.  I'll leave deciding whether
>> this -re needs to be here to Andrew, but if it is kept, it should consume
>> the prompt, since otherwise we will leave the prompt in the expect
>> buffer and confuse the next gdb_test.  Just adding -wrap would do, I think.

Ok, I added -wrap here and a .* at the end.  I suppose the intent is that
even if GDB gets it wrong and displays it as a spurious SIGTRAP, it could
also match the second regexp (the pass one).  So that's why we need this
fail case before.

> Thanks, I'll address the comments and push patches 1, 2 and 5.

Please see the updated patch below, just to make sure I got things right.


From c407b5066e54e11028e73f27721a3993f3c027c1 Mon Sep 17 00:00:00 2001
From: Andrew Burgess <andrew.burgess@embecosm.com>
Date: Thu, 7 Jan 2021 23:17:34 -0500
Subject: [PATCH] gdb: better handling of 'S' packets

This commit builds on work started in the following two commits:

  commit 24ed6739b699f329c2c45aedee5f8c7d2f54e493
  Date:   Thu Jan 30 14:35:40 2020 +0000

      gdb/remote: Restore support for 'S' stop reply packet

  commit cada5fc921e39a1945c422eea055c8b326d8d353
  Date:   Wed Mar 11 12:30:13 2020 +0000

      gdb: Handle W and X remote packets without giving a warning

This is related to how GDB handles remote targets that send back 'S'
packets.

In the first of the above commits we fixed GDB's ability to handle a
single process, single threaded target that sends back 'S' packets.
Although the 'T' packet would always be preferred to 'S' these days,
there's nothing really wrong with 'S' for this situation.

The second commit above fixed an oversight in the first commit: a
single-process, multi-threaded target can send back a process-wide
event, for example the process exited event 'W', without including a
process-id.  This is also fine as there is no ambiguity in this case.

In PR gdb/26819 we run into yet another problem with the above
commits.  In this case we have a single process with two threads, GDB
hits a breakpoint in thread 2 and then performs a stepi:

  (gdb) b main
  Breakpoint 1 at 0x1212340830: file infinite_loop.S, line 10.
  (gdb) c
  Continuing.

  Thread 2 hit Breakpoint 1, main () at infinite_loop.S:10
  10    in infinite_loop.S
  (gdb) set debug remote 1
  (gdb) stepi
  Sending packet: $vCont;s:2#24...Packet received: S05
  ../binutils-gdb/gdb/infrun.c:5807: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

What happens in this case is that on the RISC-V target displaced
stepping is not supported, so when the stepi is issued GDB steps just
thread 2.  As only a single thread was set running the target decides
that it can get away with sending back an 'S' packet without a
thread-id.  GDB then associates the stop with thread 1 (the first
non-exited thread), but as thread 1 was not previously set executing
the assertion seen above triggers.

As an aside I am surprised that the target sends back 'S' in this
situation.  The target is happy to send back 'T' (including thread-id)
when multiple threads are set running, so (to me) it would seem easier
to just always use the 'T' packet when multiple threads are in use.
However, the target only uses 'T' when multiple threads are actually
executing, otherwise an 'S' packet is used.

Still, when looking at the above situation we can see that GDB should
be able to understand which thread the 'S' reply is referring to.

The problem is that in commit 24ed6739b699 (above) when a stop
reply comes in with no thread-id we look for the first non-exited
thread and select that as the thread the stop applies to.

What we should really do is select the first non-exited, resumed thread,
and associate the stop event with this thread.  In the above example
both thread 1 and 2 are non-exited, but only thread 2 is resumed, so
this is what we should use.
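
As a rough illustration of that selection rule, here is a minimal
standalone sketch.  It is not the code from remote.c: fake_thread,
selection and select_thread are hypothetical names made up for this
example, and the resume_state enum only mimics the per-thread resume
state tracked by the remote target.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the resume state kept per remote thread.
enum class resume_state { NOT_RESUMED, RESUMED_PENDING_VCONT, RESUMED };

// Hypothetical, simplified thread record (not GDB's thread_info).
struct fake_thread
{
  int pid;
  int tid;
  resume_state state;
};

struct selection
{
  const fake_thread *thread;  // First resumed thread, nullptr if none.
  bool ambiguous;             // True if the choice is only a guess.
};

// Pick the thread that a stop reply without a thread-id most plausibly
// refers to.  PROCESS_WIDE_STOP is true for exit-like events, which
// apply to a whole process rather than to a single thread.
selection
select_thread (const std::vector<fake_thread> &threads,
               bool process_wide_stop)
{
  selection sel { nullptr, false };

  for (const fake_thread &thr : threads)
    {
      // Threads that were not set running cannot have reported a stop.
      if (thr.state != resume_state::RESUMED)
        continue;

      if (sel.thread == nullptr)
        sel.thread = &thr;
      else if (!process_wide_stop || sel.thread->pid != thr.pid)
        sel.ambiguous = true;
    }

  return sel;
}
```

With thread 1 not resumed and thread 2 resumed, this picks thread 2
and does not flag the stop as ambiguous, which is the PR gdb/26819
situation.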

There's a test for this issue included which works with stock
gdbserver by disabling use of the 'T' packet, and enabling
'scheduler-locking' within GDB so only one thread is set running.

gdb/ChangeLog:

	PR gdb/26819
	* remote.c
	(remote_target::select_thread_for_ambiguous_stop_reply): New
	member function.
	(remote_target::process_stop_reply): Call
	select_thread_for_ambiguous_stop_reply.

gdb/testsuite/ChangeLog:

	PR gdb/26819
	* gdb.server/stop-reply-no-thread-multi.c: New file.
	* gdb.server/stop-reply-no-thread-multi.exp: New file.

Change-Id: I9b49d76c2a99063dcc76203fa0f5270a72825d15
---
 gdb/remote.c                                  | 163 ++++++++++++------
 .../gdb.server/stop-reply-no-thread-multi.c   |  77 +++++++++
 .../gdb.server/stop-reply-no-thread-multi.exp | 136 +++++++++++++++
 3 files changed, 321 insertions(+), 55 deletions(-)
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
 create mode 100644 gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp

diff --git a/gdb/remote.c b/gdb/remote.c
index a657902080d7..74ebbf9ab023 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -747,6 +747,9 @@ class remote_target : public process_stratum_target
   ptid_t process_stop_reply (struct stop_reply *stop_reply,
 			     target_waitstatus *status);
 
+  ptid_t select_thread_for_ambiguous_stop_reply
+    (const struct target_waitstatus *status);
+
   void remote_notice_new_inferior (ptid_t currthread, int executing);
 
   void process_initial_stop_replies (int from_tty);
@@ -7753,75 +7756,125 @@ remote_notif_get_pending_events (remote_target *remote, notif_client *nc)
   remote->remote_notif_get_pending_events (nc);
 }
 
-/* Called when it is decided that STOP_REPLY holds the info of the
-   event that is to be returned to the core.  This function always
-   destroys STOP_REPLY.  */
+/* Called from process_stop_reply when the stop packet we are responding
+   to didn't include a process-id or thread-id.  STATUS is the stop event
+   we are responding to.
+
+   It is the task of this function to select a suitable thread (or process)
+   and return its ptid, this is the thread (or process) we will assume the
+   stop event came from.
+
+   In some cases there isn't really any choice about which thread (or
+   process) is selected, a basic remote with a single process containing a
+   single thread might choose not to send any process-id or thread-id in
+   its stop packets, this function will select and return the one and only
+   thread.
+
+   However, if a target supports multiple threads (or processes) and still
+   doesn't include a thread-id (or process-id) in its stop packet then
+   first, this is a badly behaving target, and second, we're going to have
+   to select a thread (or process) at random and use that.  This function
+   will print a warning to the user if it detects that there is the
+   possibility that GDB is guessing which thread (or process) to
+   report.
+
+   Note that this is called before GDB fetches the updated thread list from the
+   target.  So it's possible for the stop reply to be ambiguous and for GDB to
+   not realize it.  For example, if there's initially one thread, the target
+   spawns a second thread, and then sends a stop reply without an id that
+   concerns the first thread.  GDB will assume the stop reply is about the
+   first thread - the only thread it knows about - without printing a warning.
+   Anyway, if the remote meant for the stop reply to be about the second thread,
+   then it would be really broken, because GDB doesn't know about that thread
+   yet.  */
 
 ptid_t
-remote_target::process_stop_reply (struct stop_reply *stop_reply,
-				   struct target_waitstatus *status)
+remote_target::select_thread_for_ambiguous_stop_reply
+  (const struct target_waitstatus *status)
 {
-  ptid_t ptid;
+  /* Some stop events apply to all threads in an inferior, while others
+     only apply to a single thread.  */
+  bool process_wide_stop
+    = (status->kind == TARGET_WAITKIND_EXITED
+       || status->kind == TARGET_WAITKIND_SIGNALLED);
 
-  *status = stop_reply->ws;
-  ptid = stop_reply->ptid;
+  thread_info *first_resumed_thread = nullptr;
+  bool ambiguous = false;
 
-  /* If no thread/process was reported by the stub then use the first
-     non-exited thread in the current target.  */
-  if (ptid == null_ptid)
+  /* Consider all non-exited threads of the target, find the first resumed
+     one.  */
+  for (thread_info *thr : all_non_exited_threads (this))
     {
-      /* Some stop events apply to all threads in an inferior, while others
-	 only apply to a single thread.  */
-      bool is_stop_for_all_threads
-	= (status->kind == TARGET_WAITKIND_EXITED
-	   || status->kind == TARGET_WAITKIND_SIGNALLED);
+      remote_thread_info *remote_thr = get_remote_thread_info (thr);
 
-      for (thread_info *thr : all_non_exited_threads (this))
+      if (remote_thr->resume_state () != resume_state::RESUMED)
+	continue;
+
+      if (first_resumed_thread == nullptr)
+	first_resumed_thread = thr;
+      else if (!process_wide_stop
+	       || first_resumed_thread->ptid.pid () != thr->ptid.pid ())
+	ambiguous = true;
+    }
+
+  gdb_assert (first_resumed_thread != nullptr);
+
+  /* Warn if the remote target is sending ambiguous stop replies.  */
+  if (ambiguous)
+    {
+      static bool warned = false;
+
+      if (!warned)
 	{
-	  if (ptid != null_ptid
-	      && (!is_stop_for_all_threads
-		  || ptid.pid () != thr->ptid.pid ()))
-	    {
-	      static bool warned = false;
+	  /* If you are seeing this warning then the remote target has
+	     stopped without specifying a thread-id, but the target
+	     does have multiple threads (or inferiors), and so GDB is
+	     having to guess which thread stopped.
 
-	      if (!warned)
-		{
-		  /* If you are seeing this warning then the remote target
-		     has stopped without specifying a thread-id, but the
-		     target does have multiple threads (or inferiors), and
-		     so GDB is having to guess which thread stopped.
-
-		     Examples of what might cause this are the target
-		     sending and 'S' stop packet, or a 'T' stop packet and
-		     not including a thread-id.
-
-		     Additionally, the target might send a 'W' or 'X
-		     packet without including a process-id, when the target
-		     has multiple running inferiors.  */
-		  if (is_stop_for_all_threads)
-		    warning (_("multi-inferior target stopped without "
-			       "sending a process-id, using first "
-			       "non-exited inferior"));
-		  else
-		    warning (_("multi-threaded target stopped without "
-			       "sending a thread-id, using first "
-			       "non-exited thread"));
-		  warned = true;
-		}
-	      break;
-	    }
+	     Examples of what might cause this are the target sending
+	     an 'S' stop packet, or a 'T' stop packet and not
+	     including a thread-id.
 
-	  /* If this is a stop for all threads then don't use a particular
-	     threads ptid, instead create a new ptid where only the pid
-	     field is set.  */
-	  if (is_stop_for_all_threads)
-	    ptid = ptid_t (thr->ptid.pid ());
+	     Additionally, the target might send a 'W' or 'X' packet
+	     without including a process-id, when the target has
+	     multiple running inferiors.  */
+	  if (process_wide_stop)
+	    warning (_("multi-inferior target stopped without "
+		       "sending a process-id, using first "
+		       "non-exited inferior"));
 	  else
-	    ptid = thr->ptid;
+	    warning (_("multi-threaded target stopped without "
+		       "sending a thread-id, using first "
+		       "non-exited thread"));
+	  warned = true;
 	}
-      gdb_assert (ptid != null_ptid);
     }
 
+  /* If this is a stop for all threads then don't use a particular thread's
+     ptid, instead create a new ptid where only the pid field is set.  */
+  if (process_wide_stop)
+    return ptid_t (first_resumed_thread->ptid.pid ());
+  else
+    return first_resumed_thread->ptid;
+}
+
+/* Called when it is decided that STOP_REPLY holds the info of the
+   event that is to be returned to the core.  This function always
+   destroys STOP_REPLY.  */
+
+ptid_t
+remote_target::process_stop_reply (struct stop_reply *stop_reply,
+				   struct target_waitstatus *status)
+{
+  *status = stop_reply->ws;
+  ptid_t ptid = stop_reply->ptid;
+
+  /* If no thread/process was reported by the stub then select a suitable
+     thread/process.  */
+  if (ptid == null_ptid)
+    ptid = select_thread_for_ambiguous_stop_reply (status);
+  gdb_assert (ptid != null_ptid);
+
   if (status->kind != TARGET_WAITKIND_EXITED
       && status->kind != TARGET_WAITKIND_SIGNALLED
       && status->kind != TARGET_WAITKIND_NO_RESUMED)
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
new file mode 100644
index 000000000000..40cc71a85bc5
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.c
@@ -0,0 +1,77 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <pthread.h>
+#include <unistd.h>
+
+volatile int worker_blocked = 1;
+volatile int main_blocked = 1;
+
+void
+unlock_worker (void)
+{
+  worker_blocked = 0;
+}
+
+void
+unlock_main (void)
+{
+  main_blocked = 0;
+}
+
+void
+breakpt (void)
+{
+  /* Nothing.  */
+}
+
+static void *
+worker (void *data)
+{
+  unlock_main ();
+
+  while (worker_blocked)
+    ;
+
+  breakpt ();
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t thr;
+  void *retval;
+
+  /* Ensure the test doesn't run forever.  */
+  alarm (99);
+
+  if (pthread_create (&thr, NULL, worker, NULL) != 0)
+    abort ();
+
+  while (main_blocked)
+    ;
+
+  unlock_worker ();
+
+  if (pthread_join (thr, &retval) != 0)
+    abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
new file mode 100644
index 000000000000..6350f5771e31
--- /dev/null
+++ b/gdb/testsuite/gdb.server/stop-reply-no-thread-multi.exp
@@ -0,0 +1,136 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2021 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test how GDB handles the case where a target either doesn't use 'T'
+# packets at all or doesn't include a thread-id in a 'T' packet, AND,
+# where the test program contains multiple threads.
+#
+# In general if multiple threads are executing and the target doesn't
+# include a thread-id in its stop response then GDB will not be able
+# to correctly figure out which thread the stop applies to.
+#
+# However, this test covers a very specific case, there are multiple
+# threads but only a single thread is actually executing.  So, when
+# the stop comes from the target, without a thread-id, GDB should be
+# able to correctly figure out which thread has stopped.
+
+load_lib gdbserver-support.exp
+
+if { [skip_gdbserver_tests] } {
+    verbose "skipping gdbserver tests"
+    return -1
+}
+
+standard_testfile
+if { [build_executable "failed to prepare" $testfile $srcfile {debug pthreads}] == -1 } {
+    return -1
+}
+
+# Run the tests with different features of GDBserver disabled.
+proc run_test { disable_feature } {
+    global binfile gdb_prompt decimal hex
+
+    clean_restart ${binfile}
+
+    # Make sure we're disconnected, in case we're testing with an
+    # extended-remote board, therefore already connected.
+    gdb_test "disconnect" ".*"
+
+    set packet_arg ""
+    if { $disable_feature != "" } {
+	set packet_arg "--disable-packet=${disable_feature}"
+    }
+    set res [gdbserver_start $packet_arg $binfile]
+    set gdbserver_protocol [lindex $res 0]
+    set gdbserver_gdbport [lindex $res 1]
+
+    # Disable XML-based thread listing, and multi-process extensions.
+    gdb_test_no_output "set remote threads-packet off"
+    gdb_test_no_output "set remote multiprocess-feature-packet off"
+
+    set res [gdb_target_cmd $gdbserver_protocol $gdbserver_gdbport]
+    if ![gdb_assert {$res == 0} "connect"] {
+	return
+    }
+
+    # There should be only one thread listed at this point.
+    gdb_test_multiple "info threads" "" {
+	-re "2 Thread.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "has terminated.*$gdb_prompt $" {
+	    fail $gdb_test_name
+	}
+	-re "\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n$gdb_prompt $" {
+	    pass $gdb_test_name
+	}
+    }
+
+    gdb_breakpoint "unlock_worker"
+    gdb_continue_to_breakpoint "run to unlock_worker"
+
+    # There should be two threads at this point with thread 1 selected.
+    gdb_test "info threads" \
+	"\\\* 1\[\t \]*Thread\[^\r\n\]*\r\n  2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should now exist"
+
+    # Switch threads.
+    gdb_test "thread 2" ".*" "switch to second thread"
+
+    # Now turn on scheduler-locking so that when we step thread 2 only
+    # that one thread will be set running.
+    gdb_test_no_output "set scheduler-locking on"
+
+    # Single step thread 2.  Only the one thread will step.  When the
+    # thread stops, if the stop packet doesn't include a thread-id
+    # then GDB should still understand which thread stopped.
+    gdb_test_multiple "stepi" "" {
+	-re -wrap "Thread 1 received signal SIGTRAP.*" {
+	    fail $gdb_test_name
+	}
+	-re -wrap "$hex.*$decimal.*while \\(worker_blocked\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Check that thread 2 is still selected.
+    gdb_test "info threads" \
+	"  1\[\t \]*Thread\[^\r\n\]*\r\n\\\* 2\[\t \]*Thread\[^\r\n\]*" \
+	"second thread should still be selected after stepi"
+
+    # Turn scheduler locking off again so that when we continue all
+    # threads will be set running.
+    gdb_test_no_output "set scheduler-locking off"
+
+    # Continue until exit.  The server sends a 'W' with no PID.
+    # Bad GDB gave an error like below when target is nonstop:
+    #  (gdb) c
+    #  Continuing.
+    #  No process or thread specified in stop reply: W00
+    gdb_continue_to_end "" continue 1
+}
+
+# Disable different features within gdbserver:
+#
+# Tthread: Start GDBserver, with ";thread:NNN" in T stop replies disabled,
+#          emulating old gdbservers when debugging single-threaded programs.
+#
+# T: Start GDBserver with the entire 'T' stop reply packet disabled,
+#    GDBserver will instead send the 'S' stop reply.
+foreach_with_prefix to_disable { "" Tthread T } {
+    run_test $to_disable
+}
-- 
2.29.2



* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
  2021-01-08 18:34   ` Andrew Burgess
  2021-01-09 20:34   ` Pedro Alves
@ 2021-01-12 17:14   ` Simon Marchi
  2021-01-12 18:04     ` Simon Marchi
  2021-01-15 19:17   ` Simon Marchi
  3 siblings, 1 reply; 33+ messages in thread
From: Simon Marchi @ 2021-01-12 17:14 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

On 2021-01-07 11:17 p.m., Simon Marchi via Gdb-patches wrote:
> From: Simon Marchi <simon.marchi@efficios.com>
> 
> The rationale for this patch comes from the ROCm port [1], the goal
> being to reduce the number of back and forths between GDB and the target
> when doing successive operations.  I'll start with explaining the
> rationale and then go over the implementation.  In the ROCm / GPU world,
> the term "wave" is somewhat equivalent to a "thread" in GDB.  So if you
> read it from a GPU standpoint, just s/thread/wave/.
> 
> ROCdbgapi, the library used by GDB [2] to communicate with the GPU
> target, gives the illusion that it's possible for the debugger to
> control (start and stop) individual threads.  But in reality, this is
> not how it works.  Under the hood, all threads of a queue are controlled
> as a group.  To stop one thread in a group of running ones, the state of
> all threads is retrieved from the GPU, all threads are destroyed, and all
> threads but the one we want to stop are re-created from the saved state.
> The net result, from the point of view of GDB, is that the library
> stopped one thread.  The same thing goes if we want to resume one thread
> while others are running: the state of all running threads is retrieved
> from the GPU, they are all destroyed, and they are all re-created,
> including the thread we want to resume.
> 
> This leads to some inefficiencies when combined with how GDB works, here
> are two examples:
> 
>  - Stopping all threads: because the target operates in non-stop mode,
>    when the user interface mode is all-stop, GDB must stop all threads
>    individually when presenting a stop.  Let's suppose we have 1000
>    threads and the user does ^C.  GDB asks the target to stop one
>    thread.  Behind the scenes, the library retrieves 1000 thread states
>    and restores the 999 other still-running ones.  GDB asks the target
>    to stop another one.  The target retrieves 999 thread states and
>    restores the 998 remaining ones.  That means that to stop 1000
>    threads, we did 1000 back and forths with the GPU.  It would have
>    been much better to just retrieve the states once and stop there.
> 
>  - Resuming with pending events: suppose the 1000 threads hit a
>    breakpoint at the same time.  The breakpoint is conditional and
>    evaluates to true for the first thread, to false for all others.  GDB
>    pulls one event (for the first thread) from the target, decides that
>    it should present a stop, so stops all threads using
>    stop_all_threads.  All these other threads have a breakpoint event to
>    report, which is saved in `thread_info::suspend::waitstatus` for
>    later.  When the user does "continue", GDB resumes that one thread
>    that did hit the breakpoint.  It then processes the pending events
>    one by one as if they just arrived.  It picks one, evaluates the
>    condition to false, and resumes the thread.  It picks another one,
>    evaluates the condition to false, and resumes the thread.  And so on.
>    In between each resumption, there is a full state retrieval and
>    re-creation.  It would be much nicer if we could wait a little bit
>    before sending those threads on the GPU, until it processed all those
>    pending events.
> 
> To address this kind of performance issue, ROCdbgapi has a concept
> called "forward progress required", which is a boolean state that allows
> its user (i.e. GDB) to say "I'm doing a bunch of operations, you can
> hold off putting the threads on the GPU until I'm done" (the "forward
> progress not required" state).  Turning forward progress back on
> indicates to the library that all threads that are supposed to be
> running should now be really running on the GPU.
> 
> It turns out that GDB has a similar concept, though not as general,
> commit_resume.  One difference is that commit_resume is not stateful: the
> target can't look up "does the core need me to schedule resumed threads
> for execution right now".  It is also specifically linked to the resume
> method, it is not used in other contexts.  The target accumulates
> resumption requests through target_ops::resume calls, and then commits
> those resumptions when target_ops::commit_resume is called.  The target
> has no way to check if it's ok to leave resumed threads stopped in other
> target methods.
> 
> To bridge the gap, this patch generalizes the commit_resume concept in
> GDB to match the forward progress concept of ROCdbgapi.  The current
> name (commit_resume) can be interpreted as "commit the previous resume
> calls".  I renamed the concept to "commit_resumed", as in "commit the
> threads that are resumed".
> 
> In the new version, we have two things in process_stratum_target:
> 
>  - the commit_resumed_state field: indicates whether GDB requires this
>    target to have resumed threads committed to the execution
>    target/device.  If false, the target is allowed to leave resumed
>    threads un-committed at the end of whatever method it is executing.
> 
>  - the commit_resumed method: called when commit_resumed_state
>    transitions from false to true.  While commit_resumed_state was
>    false, the target may have left some resumed threads un-committed.
>    This method being called tells it that it should commit them back to
>    the execution device.
> 
> Let's take the "Stopping all threads" scenario from above and see how it
> would work with the ROCm target with this change.  Before stopping all
> threads, GDB would set the target's commit_resumed_state field to false.
> It would then ask the target to stop the first thread.  The target would
> retrieve all threads' state from the GPU and mark that one as stopped.
> Since commit_resumed_state is false, it leaves all the other threads
> (still resumed) stopped.  GDB would then proceed to call target_stop for
> all the other threads.  Since resumed threads are not committed, this
> doesn't do any back and forth with the GPU.
> 
> To simplify the implementation of targets, I made it so that when
> calling certain target methods, the contract between the core and the
> targets guarantees that commit_resumed_state is false.  This way, the
> target doesn't need two paths, one for commit_resumed_state == true and one
> for commit_resumed_state == false.  It can just assert that
> commit_resumed_state is false and work with that assumption.  This also
> helps catch places where we forgot to disable commit_resumed_state
> before calling the method, which represents a probable optimization
> opportunity.
> 
> To have some confidence that this contract between the core and the
> targets is respected, I added assertions in the linux-nat target
> methods, even though the linux-nat target doesn't actually use that
> feature.  Since linux-nat is tested much more than other targets, this
> will help catch these issues quicker.
> 
> To ensure that commit_resumed_state is always turned back on (only if
> necessary, see below) and the commit_resumed method is called when doing
> so, I introduced the scoped_disable_commit_resumed RAII object, which
> replaces make_scoped_defer_process_target_commit_resume.  On
> construction, it clears the commit_resumed_state flag of all process
> targets.  On destruction, it turns it back on (if necessary) and calls
> the commit_resumed method.  The nested case is handled by having a
> "nesting" counter: only when the counter goes back to 0 is
> commit_resumed_state turned back on.
> 
> On destruction, commit-resumed is not re-enabled for a given target if:
> 
>  1. this target has no threads resumed, or
>  2. this target has at least one thread with a pending status known to the
>     core (saved in thread_info::suspend::waitstatus).
> 
> The first point is not technically necessary, because a proper
> commit_resumed implementation would be a no-op if the target has no
> resumed threads.  But since we have a flag to do a quick check, I think
> it doesn't hurt.
> 
> The second point is more important: together with the
> scoped_disable_commit_resumed instance added in fetch_inferior_event, it
> makes it so the "Resuming with pending events" scenario described above is
> handled efficiently.  Here's what happens in that case:
> 
>  1. The user types "continue".
>  2. Upon destruction, the scoped_disable_commit_resumed in the `proceed`
>     function does not enable commit-resumed, as it sees other threads
>     have pending statuses.
>  3. fetch_inferior_event is called to handle another event, one thread
>     is resumed.  Because there are still more threads with pending
>     statuses, the destructor of scoped_disable_commit_resumed in
>     fetch_inferior_event still doesn't enable commit-resumed.
>  4. Rinse and repeat step 3, until the last pending status is handled by
>     fetch_inferior_event.  In that case, scoped_disable_commit_resumed's
>     destructor sees there are no more threads with pending statuses, so
>     it asks the target to commit resumed threads.
> 
> This allows us to avoid all unnecessary back and forths, there is a
> single commit_resumed call.
> 
> This change required remote_target::remote_stop_ns to learn how to
> handle stopping threads that were resumed but pending vCont.  The
> simplest example where that happens is when using the remote target in
> all-stop, but with "maint set target-non-stop on", to force it to
> operate in non-stop mode under the hood.  If two threads hit a
> breakpoint at the same time, GDB will receive two stop replies.  It will
> present the stop for one thread and save the other one in
> thread_info::suspend::waitstatus.
> 
> Before this patch, when doing "continue", GDB first resumes the thread
> without a pending status:
> 
>     Sending packet: $vCont;c:p172651.172676#f3
> 
> It then consumes the pending status in the next fetch_inferior_event
> call:
> 
>     [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
>     [infrun] target_wait (-1.0.0, status) =
>     [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
>     [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP
> 
> It then realizes it needs to stop all threads to present the stop, so
> stops the thread it just resumed:
> 
>     [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
>     [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
>     remote_stop called
>     Sending packet: $vCont;t:p172651.172676#04
> 
> This is an unnecessary resume/stop.  With this patch, we don't commit
> resumed threads after proceeding, because of the pending status:
> 
>     [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus
> 
> When GDB handles the pending status and stop_all_threads runs, we stop a
> resumed but pending vCont thread:
> 
>     remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)
> 
> That thread was never actually resumed on the remote stub / gdbserver.
> This is why remote_stop_ns needed to learn this new trick of enqueueing
> phony stop replies.
> 
> Note that this patch only considers pending statuses known to the core
> of GDB, that is the events that were pulled out of the target and stored
> in `thread_info::suspend::waitstatus`.  In some cases, we could also
> avoid unnecessary back and forth when the target has events that it has
> not yet reported to the core.  I plan to implement this as a subsequent
> patch, once this series has settled.
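
The behaviour described in steps 3 and 4 of the quoted commit message can
be sketched as below.  This is only a toy model under my own assumptions,
not the code from the patch: toy_thread, toy_target,
toy_scoped_disable_commit_resumed and handle_all_pending_statuses are
hypothetical names, and the "commit" is reduced to a counter.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for GDB's structures; every name here is hypothetical.
struct toy_thread
{
  bool resumed = false;
  bool has_pending_status = false;
};

struct toy_target
{
  std::vector<toy_thread> threads;
  bool commit_resumed_state = false;
  int commit_resumed_calls = 0;  // Number of round trips to the target.

  bool any_resumed_thread_has_pending_status () const
  {
    for (const toy_thread &thr : threads)
      if (thr.resumed && thr.has_pending_status)
        return true;
    return false;
  }
};

// RAII guard: commit-resumed is disabled for the guard's lifetime and is
// re-enabled on destruction only if no resumed thread has a pending status.
class toy_scoped_disable_commit_resumed
{
public:
  explicit toy_scoped_disable_commit_resumed (toy_target &target)
    : m_target (target)
  {
    m_target.commit_resumed_state = false;
  }

  ~toy_scoped_disable_commit_resumed ()
  {
    if (m_target.any_resumed_thread_has_pending_status ())
      return;

    m_target.commit_resumed_state = true;
    m_target.commit_resumed_calls++;
  }

  toy_scoped_disable_commit_resumed
    (const toy_scoped_disable_commit_resumed &) = delete;
  toy_scoped_disable_commit_resumed &operator=
    (const toy_scoped_disable_commit_resumed &) = delete;

private:
  toy_target &m_target;
};

// Consume every pending status, one per simulated fetch_inferior_event
// call, and return how many times the target was asked to commit.
int
handle_all_pending_statuses (toy_target &target)
{
  for (toy_thread &thr : target.threads)
    {
      if (!thr.has_pending_status)
        continue;

      toy_scoped_disable_commit_resumed disable (target);
      thr.has_pending_status = false;
    }

  return target.commit_resumed_calls;
}
```

With three resumed threads that all have a pending status, the guard is
destroyed three times but the target is only asked to commit once, after
the last pending status has been consumed.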

I think this patch introduces some regressions, when running

$ while make check TESTS="gdb.threads/interrupt-while-step-over.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"; do :; done

I'll sometimes get:

/home/smarchi/src/binutils-gdb/gdb/inline-frame.c:383: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) 

Simon


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-12 17:14   ` Simon Marchi
@ 2021-01-12 18:04     ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-12 18:04 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 2021-01-12 12:14 p.m., Simon Marchi wrote:
> I think this patch introduces some regressions, when running
> 
> $ while make check TESTS="gdb.threads/interrupt-while-step-over.exp" RUNTESTFLAGS="--target_board=native-extended-gdbserver"; do :; done
> 
> I'll sometimes get:
> 
> /home/smarchi/src/binutils-gdb/gdb/inline-frame.c:383: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) 

Ok, I think I found the reason.  We are missing a little something when
enqueuing phony stop replies (stop replies for a thread that was resumed but
not commit-resumed, and is still in the RESUMED_PENDING_VCONT state).

Imagine the following sequence:

1. Thread is resumed (goes into RESUMED_PENDING_VCONT)
2. Thread is stopped (phony stop reply is enqueued)
3. Core calls commit_resumed, sending a vCont;c for the thread

We are now in a state where we have a stop reply that we are going to
report as a stop to the core, but the thread is also running on the
target -> bad.

I think the state where the thread is in the RESUMED_PENDING_VCONT
resume state but there is a stop reply enqueued for it is wrong
and should be avoided.  When we enqueue a phony stop reply, we want
to pretend that the thread has executed on the remote target, so I
think we should change the thread's state to "resumed".

If I stick a `remote_thr->set_resumed ();` at the place we enqueue
the phony stop replies, I no longer get the failure.

I think this is sufficient, but if we ever want to distinguish
threads that are currently resumed on the remote target vs threads
that are stopped and have a stop reply waiting to be processed,
we could add a fourth resume state, STOP_REPLY_PENDING.  When enqueuing
a stop reply (real or phony), we would move the matching thread(s)
to that state.  And when processing the stop reply / reporting the
event to the core, we would move the state to NOT_RESUMED, just like
we do now.  But to be clear, I don't think we need this today.
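
To make the sequence above concrete, here is a minimal sketch of the three
resume states and of the effect of marking the thread as resumed when the
phony stop reply is enqueued.  It is a toy model, not remote.c: toy_remote,
toy_remote_thread, mark_resumed_on_phony_stop and sequence_is_inconsistent
are hypothetical names, and a single thread with id 0 is assumed.

```cpp
#include <cassert>
#include <deque>

enum class resume_state
{
  NOT_RESUMED,
  RESUMED_PENDING_VCONT,
  RESUMED,
};

// Hypothetical single-thread model of the remote target's bookkeeping.
struct toy_remote_thread
{
  resume_state state = resume_state::NOT_RESUMED;
  bool running_on_target = false;
};

struct toy_remote
{
  toy_remote_thread thread;
  std::deque<int> stop_reply_queue;  // Thread ids; only id 0 exists here.
  bool mark_resumed_on_phony_stop = false;  // The proposed fix.

  // Core asks to resume: nothing is sent to the target yet.
  void resume ()
  {
    thread.state = resume_state::RESUMED_PENDING_VCONT;
  }

  // Core asks to stop a thread that was never actually resumed remotely.
  void stop ()
  {
    if (thread.state == resume_state::RESUMED_PENDING_VCONT)
      {
        stop_reply_queue.push_back (0);  // Phony stop reply.
        if (mark_resumed_on_phony_stop)
          thread.state = resume_state::RESUMED;
      }
  }

  // Core asks to commit: send vCont;c for threads still pending.
  void commit_resumed ()
  {
    if (thread.state == resume_state::RESUMED_PENDING_VCONT)
      {
        thread.running_on_target = true;
        thread.state = resume_state::RESUMED;
      }
  }

  // Bad state: a stop is about to be reported for a running thread.
  bool inconsistent () const
  {
    return thread.running_on_target && !stop_reply_queue.empty ();
  }
};

// Run steps 1 to 3 from the message and report whether we end up in the
// bad state.
bool
sequence_is_inconsistent (bool with_fix)
{
  toy_remote remote;
  remote.mark_resumed_on_phony_stop = with_fix;
  remote.resume ();
  remote.stop ();
  remote.commit_resumed ();
  return remote.inconsistent ();
}
```

In this model, sequence_is_inconsistent (false) reaches the bad state while
sequence_is_inconsistent (true) does not, because commit_resumed no longer
sees the thread as pending a vCont.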

Simon


* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-12  3:07       ` Simon Marchi
@ 2021-01-13 20:17         ` Pedro Alves
  2021-01-14  1:28           ` Simon Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Pedro Alves @ 2021-01-13 20:17 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 1/12/21 3:07 AM, Simon Marchi wrote:

> Please see the updated patch below, just to make sure I got things right.

Yup, looks great.  Thanks.

Pedro Alves


* Re: [PATCH v3 5/5] gdb: better handling of 'S' packets
  2021-01-13 20:17         ` Pedro Alves
@ 2021-01-14  1:28           ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-14  1:28 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 2021-01-13 3:17 p.m., Pedro Alves wrote:
> On 1/12/21 3:07 AM, Simon Marchi wrote:
> 
>> Please see the updated patch below, just to make sure I got things right.
> 
> Yup, looks great.  Thanks.

Thanks, I pushed patches 1, 2 and 5.

Patch 3, although it was OK'ed, does not help anything by itself, so I haven't
merged it yet; there's no rush.

Simon


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
                     ` (2 preceding siblings ...)
  2021-01-12 17:14   ` Simon Marchi
@ 2021-01-15 19:17   ` Simon Marchi
  3 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-15 19:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

I think I found another issue.  "run" while threads are running gives:

/home/simark/src/binutils-gdb/gdb/target.c:2001: internal-error: ptid_t target_wait(ptid_t, target_waitstatus*, target_wait_flags): Assertion `!proc_target->commit_resumed_state' failed.

This is because run_command_1 kills the existing inferior and then uses
scoped_disable_commit_resumed.  When scoped_disable_commit_resumed runs,
the Linux target is no longer pushed, so its commit_resumed_state doesn't
get turned back to false.

The same probably happens if you "attach" while the current inferior is
running.
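
A minimal sketch of how the flag can end up stale, assuming the scoped
object only visits targets that are still pushed on some inferior.  The
types and function names (toy_target, toy_inferior, disable_commit_resumed,
flag_after_kill_then_disable) are hypothetical and only model the ordering
problem described above.

```cpp
#include <cassert>
#include <vector>

struct toy_target
{
  bool commit_resumed_state = false;
};

struct toy_inferior
{
  // Process target currently pushed on this inferior, or nullptr if none.
  toy_target *process_target = nullptr;
};

// Models the constructor of the scoped object: it can only reach targets
// that are still pushed on some inferior.
void
disable_commit_resumed (const std::vector<toy_inferior> &inferiors)
{
  for (const toy_inferior &inf : inferiors)
    if (inf.process_target != nullptr)
      inf.process_target->commit_resumed_state = false;
}

// Return the flag value the target is left with once the inferior has
// been killed and commit-resumed has then been "disabled".
bool
flag_after_kill_then_disable ()
{
  toy_target linux_target;
  linux_target.commit_resumed_state = true;  // Threads are running.

  std::vector<toy_inferior> inferiors (1);
  inferiors[0].process_target = &linux_target;

  // Killing the inferior pops the target...
  inferiors[0].process_target = nullptr;

  // ... so this loop can no longer see linux_target.
  disable_commit_resumed (inferiors);

  return linux_target.commit_resumed_state;
}
```

Here flag_after_kill_then_disable () returns true: the flag is left set, so
a later wait on the same target object would trip an assertion like the one
shown above.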

So, adding this to my to-fix for this patch.

Simon


* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
  2021-01-08 15:41   ` Pedro Alves
@ 2021-01-18  5:16   ` Sebastian Huber
  2021-01-18  6:04     ` Simon Marchi
  1 sibling, 1 reply; 33+ messages in thread
From: Sebastian Huber @ 2021-01-18  5:16 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Simon Marchi

Hello Simon,

On 08/01/2021 05:17, Simon Marchi via Gdb-patches wrote:
> +/* From the remote target's point of view, each thread is in one of these three
> +   states.  */
> +enum class resume_state
> +{
> +  /* Not resumed - we haven't been asked to resume this thread.  */
> +  NOT_RESUMED,
> +
> +  /* We have been asked to resume this thread, but haven't sent a vCont action
> +     for it yet.  We'll need to consider it next time commit_resume is
> +     called.  */
> +  RESUMED_PENDING_VCONT,
> +
> +  /* We have been asked to resume this thread, and we have sent a vCont action
> +     for it.  */
> +  RESUMED,
> +};

there could be a problem with this "enum class" on CentOS 7.9:

../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1157:38: error: 'resume_state' is not a class, namespace, or enumeration
    enum resume_state m_resume_state = resume_state::NOT_RESUMED;
                                       ^
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_not_resumed()':
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1115:22: error: 'resume_state' is not a class, namespace, or enumeration
      m_resume_state = resume_state::NOT_RESUMED;
                       ^
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed_pending_vcont(bool, gdb_signal)':
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1121:22: error: 'resume_state' is not a class, namespace, or enumeration
      m_resume_state = resume_state::RESUMED_PENDING_VCONT;
                       ^
In file included from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_string_view.h:49:0,
                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-utils.h:46,
                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-defs.h:125,
                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/defs.h:28,
                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:22:
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'const resumed_pending_vcont_info& remote_thread_info::resumed_pending_vcont_info() const':
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1132:35: error: 'resume_state' is not a class, namespace, or enumeration
      gdb_assert (m_resume_state == resume_state::RESUMED_PENDING_VCONT);
                                    ^
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_assert.h:35:13: note: in definition of macro 'gdb_assert'
    ((void) ((expr) ? 0 :                                                       \
              ^
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed()':
../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1140:22: error: 'resume_state' is not a class, namespace, or enumeration
      m_resume_state = resume_state::RESUMED;
                       ^

-- 
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.huber@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/



* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-18  5:16   ` Sebastian Huber
@ 2021-01-18  6:04     ` Simon Marchi
  2021-01-18 10:36       ` Sebastian Huber
  0 siblings, 1 reply; 33+ messages in thread
From: Simon Marchi @ 2021-01-18  6:04 UTC (permalink / raw)
  To: Sebastian Huber, gdb-patches; +Cc: Simon Marchi



On 2021-01-18 12:16 a.m., Sebastian Huber wrote:
> Hello Simon,
> 
> On 08/01/2021 05:17, Simon Marchi via Gdb-patches wrote:
>> +/* From the remote target's point of view, each thread is in one of these three
>> +   states.  */
>> +enum class resume_state
>> +{
>> +  /* Not resumed - we haven't been asked to resume this thread.  */
>> +  NOT_RESUMED,
>> +
>> +  /* We have been asked to resume this thread, but haven't sent a vCont action
>> +     for it yet.  We'll need to consider it next time commit_resume is
>> +     called.  */
>> +  RESUMED_PENDING_VCONT,
>> +
>> +  /* We have been asked to resume this thread, and we have sent a vCont action
>> +     for it.  */
>> +  RESUMED,
>> +};
> 
> there could be a problem with this "enum class" on CentOS 7.9:
> 
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1157:38: error: 'resume_state' is not a class, namespace, or enumeration
>    enum resume_state m_resume_state = resume_state::NOT_RESUMED;
>                                       ^
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_not_resumed()':
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1115:22: error: 'resume_state' is not a class, namespace, or enumeration
>      m_resume_state = resume_state::NOT_RESUMED;
>                       ^
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed_pending_vcont(bool, gdb_signal)':
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1121:22: error: 'resume_state' is not a class, namespace, or enumeration
>      m_resume_state = resume_state::RESUMED_PENDING_VCONT;
>                       ^
> In file included from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_string_view.h:49:0,
>                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-utils.h:46,
>                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-defs.h:125,
>                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/defs.h:28,
>                  from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:22:
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'const resumed_pending_vcont_info& remote_thread_info::resumed_pending_vcont_info() const':
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1132:35: error: 'resume_state' is not a class, namespace, or enumeration
>      gdb_assert (m_resume_state == resume_state::RESUMED_PENDING_VCONT);
>                                    ^
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_assert.h:35:13: note: in definition of macro 'gdb_assert'
>    ((void) ((expr) ? 0 :                                                       \
>              ^
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed()':
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1140:22: error: 'resume_state' is not a class, namespace, or enumeration
>      m_resume_state = resume_state::RESUMED;
>                       ^
> 

Huh, maybe it gets confused because there's a method named "resume_state"
as well?  Does it work if you rename the method to something else?
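
Something along these lines is what I have in mind.  This is only a sketch
with a made-up class (toy_thread_info) and a made-up getter name
(get_resume_state), not the actual remote_thread_info; the point is just
that the member function no longer shares its name with the enum type, so
`resume_state::...` inside the class can only refer to the enum.

```cpp
#include <cassert>

enum class resume_state
{
  NOT_RESUMED,
  RESUMED_PENDING_VCONT,
  RESUMED,
};

// Hypothetical, reduced version of a per-thread info class.
class toy_thread_info
{
public:
  // The getter is deliberately not called resume_state ().  A member with
  // that name would hide the enum's name inside this class, and an older
  // compiler might then reject `resume_state::NOT_RESUMED` below.
  resume_state get_resume_state () const
  {
    return m_resume_state;
  }

  void set_not_resumed ()
  {
    m_resume_state = resume_state::NOT_RESUMED;
  }

  void set_resumed ()
  {
    m_resume_state = resume_state::RESUMED;
  }

private:
  resume_state m_resume_state = resume_state::NOT_RESUMED;
};
```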

Simon


* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-18  6:04     ` Simon Marchi
@ 2021-01-18 10:36       ` Sebastian Huber
  2021-01-18 13:53         ` Simon Marchi
  0 siblings, 1 reply; 33+ messages in thread
From: Sebastian Huber @ 2021-01-18 10:36 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Simon Marchi

On 18/01/2021 07:04, Simon Marchi wrote:

> On 2021-01-18 12:16 a.m., Sebastian Huber wrote:
>> Hello Simon,
>>
>> On 08/01/2021 05:17, Simon Marchi via Gdb-patches wrote:
>>> +/* From the remote target's point of view, each thread is in one of these three
>>> +   states.  */
>>> +enum class resume_state
>>> +{
>>> +  /* Not resumed - we haven't been asked to resume this thread.  */
>>> +  NOT_RESUMED,
>>> +
>>> +  /* We have been asked to resume this thread, but haven't sent a vCont action
>>> +     for it yet.  We'll need to consider it next time commit_resume is
>>> +     called.  */
>>> +  RESUMED_PENDING_VCONT,
>>> +
>>> +  /* We have been asked to resume this thread, and we have sent a vCont action
>>> +     for it.  */
>>> +  RESUMED,
>>> +};
>> there could be a problem with this "enum class" on CentOS 7.9:
>>
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1157:38: error: 'resume_state' is not a class, namespace, or enumeration
>>     enum resume_state m_resume_state = resume_state::NOT_RESUMED;
>>                                        ^
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_not_resumed()':
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1115:22: error: 'resume_state' is not a class, namespace, or enumeration
>>       m_resume_state = resume_state::NOT_RESUMED;
>>                        ^
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed_pending_vcont(bool, gdb_signal)':
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1121:22: error: 'resume_state' is not a class, namespace, or enumeration
>>       m_resume_state = resume_state::RESUMED_PENDING_VCONT;
>>                        ^
>> In file included from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_string_view.h:49:0,
>>                   from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-utils.h:46,
>>                   from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/common-defs.h:125,
>>                   from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/defs.h:28,
>>                   from ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:22:
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'const resumed_pending_vcont_info& remote_thread_info::resumed_pending_vcont_info() const':
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1132:35: error: 'resume_state' is not a class, namespace, or enumeration
>>       gdb_assert (m_resume_state == resume_state::RESUMED_PENDING_VCONT);
>>                                     ^
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/../gdbsupport/gdb_assert.h:35:13: note: in definition of macro 'gdb_assert'
>>     ((void) ((expr) ? 0 :                                                       \
>>               ^
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c: In member function 'void remote_thread_info::set_resumed()':
>> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:1140:22: error: 'resume_state' is not a class, namespace, or enumeration
>>       m_resume_state = resume_state::RESUMED;
>>                        ^
>>
> Huh, maybe it gets confused because there's a method named "resume_state"
> as well?  Does it work if you rename the method to something else?

Thanks for the hint, I will have a look at it, but it may take a while.

I noticed another issue on nios2-rtems built on openSUSE:

../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:7820: internal-error: ptid_t remote_target::select_thread_for_ambiguous_stop_reply(const target_waitstatus*): Assertion `first_resumed_thread != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

It worked with the GDB commit 4180301.



* Re: [PATCH v3 1/5] gdb: make the remote target track its own thread resume state
  2021-01-18 10:36       ` Sebastian Huber
@ 2021-01-18 13:53         ` Simon Marchi
  0 siblings, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-18 13:53 UTC (permalink / raw)
  To: Sebastian Huber, Simon Marchi, gdb-patches



On 2021-01-18 5:36 a.m., Sebastian Huber wrote:
> Thanks for the hint, I will have a look at it, but it may take a while.
> 
> I noticed another issue on nios2-rtems built on openSUSE:
> 
> ../../sourceware-mirror-binutils-gdb-edf0f28/gdb/remote.c:7820: internal-error: ptid_t remote_target::select_thread_for_ambiguous_stop_reply(const target_waitstatus*): Assertion `first_resumed_thread != nullptr' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> 
> It worked with the GDB commit 4180301.

Please see this comment (and proposed fix); I think you are hitting the same issue:

https://sourceware.org/bugzilla/show_bug.cgi?id=26819#c20

Simon


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-11 20:28     ` Simon Marchi
@ 2021-01-22  2:46       ` Simon Marchi
  2021-01-22 22:07       ` Simon Marchi
  1 sibling, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-22  2:46 UTC (permalink / raw)
  To: Simon Marchi, Pedro Alves, gdb-patches

On 2021-01-11 3:28 p.m., Simon Marchi wrote:
> I see more of a ~1.33% slowdown, but it's consistent too.
> 
> It could be the std::set allocation.  I changed all_process_targets to make it
> return an array of 1 element, just to see what happens, it didn't seem to help.
> See attached patch "0001-Test-returning-something-else-than-std-set-in-all_pr.patch"
> if you want to try it.
> 
> It's maybe due to the fact that we now iterate on all threads at every handled
> event? fetch_inferior_event calls ~scoped_disable_commit_resumed, which calls
> maybe_commit_resumed_all_process_targets, which iterates on all threads.  The
> loop actually breaks when it finds a thread with a pending status, but that
> still makes this function O(number of threads).

I did some more testing regarding this.

I am testing the following scenarios:

- baseline: up to the previous patch ("gdb: move commit_resume to
  process_stratum_target") applied.
- original patch: baseline + this patch applied
- always commit resume: I take the original patch, and I remove the loop over
  all threads to check if there are resumed threads with a pending status.
  Instead, I just let it do the commit-resume all the time.
- all no-op: I make these no-ops:

    - maybe_commit_resumed_all_process_targets
    - scoped_disable_commit_resumed::scoped_disable_commit_resumed
    - scoped_disable_commit_resumed::~scoped_disable_commit_resumed

  This essentially gets rid of all calls to all_process_targets and its std::set
  allocation.  It wouldn't be a correct patch, but the native target doesn't care,
  it's just to see the impact of these functions.

The results are:

baseline: 81,534
original patch: 78,926
always commit resume: 80,416
all no-op: 80,890

I don't really understand why "all no-op" would be slower than the baseline;
there isn't much left at this point.

Still, we see, with the difference between "original patch" and "always commit
resume", that iterating on all threads to see if there's one resumed with a
pending status has a non-negligible cost.  To speed this up, we could maintain
a per target or per inferior count of "number of resumed threads with a pending
status".  I think it would be a bit tricky to get completely right, to make sure
this count never gets out of sync with the reality.
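
A rough sketch of what such a count could look like, assuming every change
to a thread's "resumed" and "has pending status" flags goes through one
helper.  toy_inferior, toy_thread and the member names are hypothetical;
this is not how thread_info is structured today.

```cpp
#include <cassert>

struct toy_inferior
{
  // Number of this inferior's threads that are resumed and have a
  // pending status.
  int resumed_with_pending_status_count = 0;
};

class toy_thread
{
public:
  explicit toy_thread (toy_inferior &inf)
    : m_inf (inf)
  {}

  // A thread that goes away must stop being counted.
  ~toy_thread ()
  {
    update (false, false);
  }

  toy_thread (const toy_thread &) = delete;
  toy_thread &operator= (const toy_thread &) = delete;

  void set_resumed (bool resumed)
  {
    update (resumed, m_has_pending_status);
  }

  void set_has_pending_status (bool pending)
  {
    update (m_resumed, pending);
  }

private:
  // Single place where both flags change, to make it harder for the
  // count to drift away from reality.
  void update (bool resumed, bool pending)
  {
    bool was_counted = m_resumed && m_has_pending_status;
    bool is_counted = resumed && pending;

    m_inf.resumed_with_pending_status_count
      += (int) is_counted - (int) was_counted;

    m_resumed = resumed;
    m_has_pending_status = pending;
  }

  toy_inferior &m_inf;
  bool m_resumed = false;
  bool m_has_pending_status = false;
};
```

The check "is there a resumed thread with a pending status?" then becomes a
comparison of the count against zero instead of a walk over all threads.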

Another similar idea would be to keep a per inferior list / queue of resumed
threads with pending statuses.  That would serve the same purpose as the count
from above (checking quickly if there exists a resumed thread with pending status)
but could also make random_pending_event_thread more efficient.
random_pending_event_thread would just return the first thread in that list that
matches waiton_ptid (most of the time, that will be the first in the list because
we are looking for any thread).  It would no longer be random but FIFO, and
I think that would achieve the same goal of ensuring no thread starves
because it is never the chosen one.  There would be the same difficulty as with
the counter though: making sure that this list stays in sync with reality.
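
The list idea could be sketched like this.  toy_thread,
toy_pending_status_queue and first_matching are hypothetical names, the
thread is reduced to an integer id, and -1 is used as a stand-in for a
wildcard waiton_ptid.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for thread_info; only an id is modelled.
struct toy_thread
{
  int lwp;
};

// Per-inferior FIFO of resumed threads that have a pending status.
class toy_pending_status_queue
{
public:
  void push (toy_thread *thr)
  {
    m_threads.push_back (thr);
  }

  void remove (toy_thread *thr)
  {
    m_threads.remove (thr);
  }

  bool empty () const
  {
    return m_threads.empty ();
  }

  // Return the oldest queued thread whose id is LWP, or the oldest queued
  // thread overall if LWP is -1 (wildcard).  Return nullptr if none match.
  toy_thread *first_matching (int lwp) const
  {
    for (toy_thread *thr : m_threads)
      if (lwp == -1 || thr->lwp == lwp)
        return thr;
    return nullptr;
  }

private:
  std::list<toy_thread *> m_threads;
};
```

empty () answers the same question as the count, and first_matching (-1)
returns the thread that has been waiting the longest, which is what gives
the FIFO fairness.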

Simon


* Re: [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
  2021-01-11 20:28     ` Simon Marchi
  2021-01-22  2:46       ` Simon Marchi
@ 2021-01-22 22:07       ` Simon Marchi
  1 sibling, 0 replies; 33+ messages in thread
From: Simon Marchi @ 2021-01-22 22:07 UTC (permalink / raw)
  To: Simon Marchi, Pedro Alves, gdb-patches

On 2021-01-11 3:28 p.m., Simon Marchi wrote:
> On 2021-01-09 3:34 p.m., Pedro Alves wrote:
>>> @@ -86,6 +77,35 @@ class process_stratum_target : public target_ops
>>>  
>>>    /* The connection number.  Visible in "info connections".  */
>>>    int connection_number = 0;
>>> +
>>> +  /* Whether resumed threads must be committed to the target.
>>> +
>>> +     When true, resumed threads must be committed to the execution target.
>>> +
>>> +     When false, the process stratum target may leave resumed threads stopped
>>> +     when it's convenient or efficient to do so.  When the core requires resumed
>>> +     threads to be committed again, this is set back to true and calls the
>>> +     `commit_resumed` method to allow the target to do so.
>>> +
>>> +     To simplify the implementation of process stratum targets, the following
>>> +     methods are guaranteed to be called with COMMIT_RESUMED_STATE set to
>>> +     false:
>>> +
>>> +       - resume
>>> +       - stop
>>> +       - wait
>>
>> Should we mention this in the documentation of each of these methods?
> 
> Yeah that would be nice.  Would you mention it in both places
> or just in those methods' documentation?

I just remembered why I put it there and not on the target methods.
Since commit-resumed is a concept specific to process targets, I don't
think that information belongs in struct target_ops, since it doesn't
make sense for other target_ops implementers.

Simon


end of thread, other threads:[~2021-01-22 22:07 UTC | newest]

Thread overview: 33+ messages
2021-01-08  4:17 [PATCH v3 0/5] Reduce back and forth with target when threads have pending statuses + better handling of 'S' packets Simon Marchi
2021-01-08  4:17 ` [PATCH v3 1/5] gdb: make the remote target track its own thread resume state Simon Marchi
2021-01-08 15:41   ` Pedro Alves
2021-01-08 18:56     ` Simon Marchi
2021-01-18  5:16   ` Sebastian Huber
2021-01-18  6:04     ` Simon Marchi
2021-01-18 10:36       ` Sebastian Huber
2021-01-18 13:53         ` Simon Marchi
2021-01-08  4:17 ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace, full}.c Simon Marchi
2021-01-08 15:43   ` [PATCH v3 2/5] gdb: remove target_ops::commit_resume implementation in record-{btrace,full}.c Pedro Alves
2021-01-08 19:00     ` Simon Marchi
2021-01-08  4:17 ` [PATCH v3 3/5] gdb: move commit_resume to process_stratum_target Simon Marchi
2021-01-08 18:12   ` Andrew Burgess
2021-01-08 19:01     ` Simon Marchi
2021-01-09 20:29   ` Pedro Alves
2021-01-08  4:17 ` [PATCH v3 4/5] gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses Simon Marchi
2021-01-08 18:34   ` Andrew Burgess
2021-01-08 19:04     ` Simon Marchi
2021-01-09 20:34   ` Pedro Alves
2021-01-11 20:28     ` Simon Marchi
2021-01-22  2:46       ` Simon Marchi
2021-01-22 22:07       ` Simon Marchi
2021-01-12 17:14   ` Simon Marchi
2021-01-12 18:04     ` Simon Marchi
2021-01-15 19:17   ` Simon Marchi
2021-01-08  4:17 ` [PATCH v3 5/5] gdb: better handling of 'S' packets Simon Marchi
2021-01-08 18:19   ` Andrew Burgess
2021-01-08 19:11     ` Simon Marchi
2021-01-09 21:26   ` Pedro Alves
2021-01-11 20:36     ` Simon Marchi
2021-01-12  3:07       ` Simon Marchi
2021-01-13 20:17         ` Pedro Alves
2021-01-14  1:28           ` Simon Marchi
