[PATCH v5 0/5] Handling already-exited threads in 'stop_all

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v5 0/5] Handling already-exited threads in 'stop_all_threads'
@ 2020-04-06 15:45 Tankut Baris Aktemur
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

Hi,

This is the v5 of a series that aims to fix GDB going into an infinite
loop when it attempts to stop a thread in 'stop_all_threads', if the
thread has already died.  The problem is fixed by handling
waitstatuses that denote exiting.

For the most recent comment, please see

  https://sourceware.org/pipermail/gdb-patches/2020-April/167286.html

The v4 is available at

  https://sourceware.org/legacy-ml/gdb-patches/2020-02/msg00153.html

This revision, v5, makes the following changes:

* The exit event of the thread we've attempted to stop is saved as a
  pending event.  The event is processed only when the user resumes the
  process.

* Two additional predecessor patches are added to tolerate errors
  while reading registers of a dead process so that the 'proceed' flow
  of GDB can make progress.

* Another predecessor patch is added to not delete a thread if it has
  a pending event.  This was needed for when the target is a remote
  target, because remote_target::update_thread_list deletes threads
  that don't exist in the target anymore.

* The case for when the ptid of the exit event is not a full ptid but
  only a pid is handled (e.g.  "8262.0.0" instead of "8262.8262.0").

* A new test scenario is added to cover the case mentioned above (by
  using gdbserver as the target).

The series assumes that a fix to the infinite query problem explained
in the link below has been applied beforehand:

  https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html

Thanks.
Baris

Tankut Baris Aktemur (5):
  gdb: protect some 'regcache_read_pc' calls
  gdb/infrun: move a 'regcache_read_pc' call down to first use
  gdb/remote: do not delete a thread if it has a pending event
  gdb/infrun: extract out a code piece into 'mark_non_executing_threads'
    function
  gdb/infrun: handle already-exited threads when attempting to stop

 gdb/infrun.c                           | 146 ++++++++++++++++--------
 gdb/regcache.c                         |  18 +++
 gdb/remote.c                           |   3 +
 gdb/testsuite/gdb.multi/multi-exit.c   |  22 ++++
 gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
 gdb/testsuite/gdb.multi/multi-kill.c   |  34 ++++++
 gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
 gdbsupport/common-regcache.h           |   5 +
 8 files changed, 434 insertions(+), 45 deletions(-)
 create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
 create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
 create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
 create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp

-- 
2.17.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
@ 2020-04-06 15:45   ` Tankut Baris Aktemur
  2020-04-16 16:11     ` Pedro Alves
  2020-04-06 15:45   ` [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use Tankut Baris Aktemur
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

It possible that a thread whose PC we attempt to read is already dead.
In this case, 'regcache_read_pc' errors out.  This impacts the
"proceed" execution flow, where GDB quits early before having a chance
to check if there exists a pending event.  To remedy, keep going with
a 0 value for the PC if 'regcache_read_pc' fails.  Because the value
of PC before resuming a thread is mostly used for storing and checking
the next time the thread stops, this tolerance is expected to be
harmless for a dead thread/process.

gdb/ChangeLog:
2020-04-03  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* regcache.c (regcache_read_pc_protected): New function
	implementation that returns 0 if the PC cannot read via
	'regcache_read_pc'.
	* infrun.c (proceed): Call 'regcache_read_pc_protected'
	instead of 'regcache_read_pc'.
	(keep_going_pass_signal): Ditto.

gdbsupport/ChangeLog:
2020-04-03  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* common-regcache.h (regcache_read_pc_protected): New function
	declaration.
---
 gdb/infrun.c                 |  7 ++++---
 gdb/regcache.c               | 18 ++++++++++++++++++
 gdbsupport/common-regcache.h |  5 +++++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 8ff34c382d6..d6265c6df51 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2987,7 +2987,8 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   gdbarch = regcache->arch ();
   const address_space *aspace = regcache->aspace ();
 
-  pc = regcache_read_pc (regcache);
+  pc = regcache_read_pc_protected (regcache);
+
   thread_info *cur_thr = inferior_thread ();
 
   /* Fill in with reasonable starting values.  */
@@ -3114,7 +3115,7 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
      advanced.  Must do this before resuming any thread, as in
      all-stop/remote, once we resume we can't send any other packet
      until the target stops again.  */
-  cur_thr->prev_pc = regcache_read_pc (regcache);
+  cur_thr->prev_pc = regcache_read_pc_protected (regcache);
 
   {
     scoped_restore save_defer_tc = make_scoped_defer_target_commit_resume ();
@@ -7921,7 +7922,7 @@ keep_going_pass_signal (struct execution_control_state *ecs)
 
   /* Save the pc before execution, to compare with pc after stop.  */
   ecs->event_thread->prev_pc
-    = regcache_read_pc (get_thread_regcache (ecs->event_thread));
+    = regcache_read_pc_protected (get_thread_regcache (ecs->event_thread));
 
   if (ecs->event_thread->control.trap_expected)
     {
diff --git a/gdb/regcache.c b/gdb/regcache.c
index 4f079c91a7f..4a1cf552505 100644
--- a/gdb/regcache.c
+++ b/gdb/regcache.c
@@ -1220,6 +1220,24 @@ regcache_read_pc (struct regcache *regcache)
   return pc_val;
 }
 
+/* See gdbsupport/common-regcache.h.  */
+
+CORE_ADDR
+regcache_read_pc_protected (regcache *regcache)
+{
+  CORE_ADDR pc;
+  try
+    {
+      pc = regcache_read_pc (regcache);
+    }
+  catch (const gdb_exception &ex)
+    {
+      pc = 0;
+    }
+
+  return pc;
+}
+
 void
 regcache_write_pc (struct regcache *regcache, CORE_ADDR pc)
 {
diff --git a/gdbsupport/common-regcache.h b/gdbsupport/common-regcache.h
index 18446ff8416..650536e8a88 100644
--- a/gdbsupport/common-regcache.h
+++ b/gdbsupport/common-regcache.h
@@ -56,6 +56,11 @@ extern int regcache_register_size (const struct regcache *regcache, int n);
 
 extern CORE_ADDR regcache_read_pc (struct regcache *regcache);
 
+/* Read the PC register.  If PC cannot be read, return 0.
+   This is a wrapper around 'regcache_read_pc'.  */
+
+extern CORE_ADDR regcache_read_pc_protected (regcache *regcache);
+
 /* Read a raw register into a unsigned integer.  */
 extern enum register_status regcache_raw_read_unsigned
   (struct regcache *regcache, int regnum, ULONGEST *val);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
  2020-04-06 15:45   ` [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls Tankut Baris Aktemur
@ 2020-04-06 15:45   ` Tankut Baris Aktemur
  2020-04-16 16:12     ` Pedro Alves
  2020-04-06 15:45   ` [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event Tankut Baris Aktemur
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

In infrun.c's resume_1 function, move the definition of the local
variable PC down to its first use.  This is useful if the thread we want
to resume is already gone with a pending exit event, because we avoid
the error we would see otherwise when trying to read the PC.

gdb/ChangeLog:
2020-04-03  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* infrun.c (resume_1): Move a 'regcache_read_pc' call down to first
	use.
---
 gdb/infrun.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index d6265c6df51..f5d3b79e0d0 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2271,7 +2271,6 @@ resume_1 (enum gdb_signal sig)
   struct regcache *regcache = get_current_regcache ();
   struct gdbarch *gdbarch = regcache->arch ();
   struct thread_info *tp = inferior_thread ();
-  CORE_ADDR pc = regcache_read_pc (regcache);
   const address_space *aspace = regcache->aspace ();
   ptid_t resume_ptid;
   /* This represents the user's step vs continue request.  When
@@ -2350,6 +2349,8 @@ resume_1 (enum gdb_signal sig)
       step = 0;
     }
 
+  CORE_ADDR pc = regcache_read_pc (regcache);
+
   if (debug_infrun)
     fprintf_unfiltered (gdb_stdlog,
 			"infrun: resume (step=%d, signal=%s), "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
  2020-04-06 15:45   ` [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls Tankut Baris Aktemur
  2020-04-06 15:45   ` [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use Tankut Baris Aktemur
@ 2020-04-06 15:45   ` Tankut Baris Aktemur
  2020-04-16 16:24     ` Pedro Alves
  2020-04-06 15:45   ` [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function Tankut Baris Aktemur
  2020-04-06 15:45   ` [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop Tankut Baris Aktemur
  4 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

gdb/ChangeLog:
2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* remote.c (remote_target::update_thread_list): Do not delete
	a thread if it has a pending event.
---
 gdb/remote.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gdb/remote.c b/gdb/remote.c
index bfbc0bc21d3..12ac7cb9862 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -3821,6 +3821,9 @@ remote_target::update_thread_list ()
 	  if (tp->inf->process_target () != this)
 	    continue;
 
+	  if (tp->suspend.waitstatus_pending_p)
+	    continue;
+
 	  if (!context.contains_thread (tp->ptid))
 	    {
 	      /* Not found.  */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
                     ` (2 preceding siblings ...)
  2020-04-06 15:45   ` [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event Tankut Baris Aktemur
@ 2020-04-06 15:45   ` Tankut Baris Aktemur
  2020-04-16 16:24     ` Pedro Alves
  2020-04-06 15:45   ` [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop Tankut Baris Aktemur
  4 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

This is a refactoring.  The extracted function is placed deliberately
before 'stop_all_threads' because the function will be re-used there
in a subsequent patch for handling an exit status kind received from
a thread that GDB attempted to stop.

gdb/ChangeLog:
2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* infrun.c (handle_inferior_event): Extract out a piece of code
	into...
	(mark_non_executing_threads): ...this new function.

Change-Id: I2b088f4a724f4260cb37068264964525cf62a118
---
 gdb/infrun.c | 77 ++++++++++++++++++++++++++++------------------------
 1 file changed, 42 insertions(+), 35 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index f5d3b79e0d0..ce8b544ab8d 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4704,6 +4704,47 @@ save_waitstatus (struct thread_info *tp, const target_waitstatus *ws)
     }
 }
 
+/* Mark the non-executing threads accordingly.  In all-stop, all
+   threads of all processes are stopped when we get any event
+   reported.  In non-stop mode, only the event thread stops.  */
+
+static void
+mark_non_executing_threads (process_stratum_target *target,
+			    ptid_t event_ptid,
+			    struct target_waitstatus ws)
+{
+  ptid_t mark_ptid;
+
+  if (!target_is_non_stop_p ())
+    mark_ptid = minus_one_ptid;
+  else if (ws.kind == TARGET_WAITKIND_SIGNALLED
+	   || ws.kind == TARGET_WAITKIND_EXITED)
+    {
+      /* If we're handling a process exit in non-stop mode, even
+	 though threads haven't been deleted yet, one would think
+	 that there is nothing to do, as threads of the dead process
+	 will be soon deleted, and threads of any other process were
+	 left running.  However, on some targets, threads survive a
+	 process exit event.  E.g., for the "checkpoint" command,
+	 when the current checkpoint/fork exits, linux-fork.c
+	 automatically switches to another fork from within
+	 target_mourn_inferior, by associating the same
+	 inferior/thread to another fork.  We haven't mourned yet at
+	 this point, but we must mark any threads left in the
+	 process as not-executing so that finish_thread_state marks
+	 them stopped (in the user's perspective) if/when we present
+	 the stop to the user.  */
+      mark_ptid = ptid_t (event_ptid.pid ());
+    }
+  else
+    mark_ptid = event_ptid;
+
+  set_executing (target, mark_ptid, false);
+
+  /* Likewise the resumed flag.  */
+  set_resumed (target, mark_ptid, false);
+}
+
 /* See infrun.h.  */
 
 void
@@ -5137,41 +5178,7 @@ handle_inferior_event (struct execution_control_state *ecs)
 	}
     }
 
-  /* Mark the non-executing threads accordingly.  In all-stop, all
-     threads of all processes are stopped when we get any event
-     reported.  In non-stop mode, only the event thread stops.  */
-  {
-    ptid_t mark_ptid;
-
-    if (!target_is_non_stop_p ())
-      mark_ptid = minus_one_ptid;
-    else if (ecs->ws.kind == TARGET_WAITKIND_SIGNALLED
-	     || ecs->ws.kind == TARGET_WAITKIND_EXITED)
-      {
-	/* If we're handling a process exit in non-stop mode, even
-	   though threads haven't been deleted yet, one would think
-	   that there is nothing to do, as threads of the dead process
-	   will be soon deleted, and threads of any other process were
-	   left running.  However, on some targets, threads survive a
-	   process exit event.  E.g., for the "checkpoint" command,
-	   when the current checkpoint/fork exits, linux-fork.c
-	   automatically switches to another fork from within
-	   target_mourn_inferior, by associating the same
-	   inferior/thread to another fork.  We haven't mourned yet at
-	   this point, but we must mark any threads left in the
-	   process as not-executing so that finish_thread_state marks
-	   them stopped (in the user's perspective) if/when we present
-	   the stop to the user.  */
-	mark_ptid = ptid_t (ecs->ptid.pid ());
-      }
-    else
-      mark_ptid = ecs->ptid;
-
-    set_executing (ecs->target, mark_ptid, false);
-
-    /* Likewise the resumed flag.  */
-    set_resumed (ecs->target, mark_ptid, false);
-  }
+  mark_non_executing_threads (ecs->target, ecs->ptid, ecs->ws);
 
   switch (ecs->ws.kind)
     {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop
       [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
                     ` (3 preceding siblings ...)
  2020-04-06 15:45   ` [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function Tankut Baris Aktemur
@ 2020-04-06 15:45   ` Tankut Baris Aktemur
  2020-04-16 17:06     ` Pedro Alves
  4 siblings, 1 reply; 16+ messages in thread
From: Tankut Baris Aktemur @ 2020-04-06 15:45 UTC (permalink / raw)
  To: gdb-patches; +Cc: palves, tdevries

In stop_all_threads, GDB sends signals to other threads in an attempt
to stop them.  While in a typical scenario the expected wait status is
TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
to stop has already terminated.  If so, a waitstatus other than
TARGET_WAITKIND_STOPPED would be received.  Handle this case
appropriately.

If a wait status that denotes thread termination is ignored, GDB goes
into an infinite loop in stop_all_threads.
E.g.:

  $ gdb ./a.out
  (gdb) start
  ...
  (gdb) add-inferior -exec ./a.out
  ...
  (gdb) inferior 2
  ...
  (gdb) start
  ...
  (gdb) set schedule-multiple on
  (gdb) set debug infrun 2
  (gdb) continue
  Continuing.
  infrun: clear_proceed_status_thread (process 10449)
  infrun: clear_proceed_status_thread (process 10453)
  infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
  infrun: proceed: resuming process 10449
  infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
  infrun: infrun_async(1)
  infrun: prepare_to_wait
  infrun: proceed: resuming process 10453
  infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
  infrun: prepare_to_wait
  infrun: Found 2 inferiors, starting at #0
  infrun: target_wait (-1.0.0, status) =
  infrun:   10449.10449.0 [process 10449],
  infrun:   status->kind = exited, status = 0
  infrun: handle_inferior_event status->kind = exited, status = 0
  [Inferior 1 (process 10449) exited normally]
  infrun: stop_waiting
  infrun: stop_all_threads
  infrun: stop_all_threads, pass=0, iterations=0
  infrun:   process 10453 executing, need stop
  infrun: target_wait (-1.0.0, status) =
  infrun:   10453.10453.0 [process 10453],
  infrun:   status->kind = exited, status = 0
  infrun: stop_all_threads status->kind = exited, status = 0 process 10453
  infrun:   process 10453 executing, already stopping
  infrun: target_wait (-1.0.0, status) =
  infrun:   -1.0.0 [process -1],
  infrun:   status->kind = no-resumed
  infrun: infrun_async(0)
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  infrun: stop_all_threads status->kind = no-resumed process -1
  infrun:   process 10453 executing, already stopping
  ...

And this polling goes on forever.  This patch prevents the infinite
looping behavior.  For the same scenario above, we obtain the
following behavior:

  ...
  (gdb) continue
  Continuing.
  infrun: clear_proceed_status_thread (process 31229)
  infrun: clear_proceed_status_thread (process 31233)
  infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
  infrun: proceed: resuming process 31229
  infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
  infrun: infrun_async(1)
  infrun: prepare_to_wait
  infrun: proceed: resuming process 31233
  infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
  infrun: prepare_to_wait
  infrun: Found 2 inferiors, starting at #0
  infrun: target_wait (-1.0.0, status) =
  infrun:   31229.31229.0 [process 31229],
  infrun:   status->kind = exited, status = 0
  infrun: handle_inferior_event status->kind = exited, status = 0
  [Inferior 1 (process 31229) exited normally]
  infrun: stop_waiting
  infrun: stop_all_threads
  infrun: stop_all_threads, pass=0, iterations=0
  infrun:   process 31233 executing, need stop
  infrun: target_wait (-1.0.0, status) =
  infrun:   31233.31233.0 [process 31233],
  infrun:   status->kind = exited, status = 0
  infrun: stop_all_threads status->kind = exited, status = 0 process 31233
  infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
  infrun:   process 31233 not executing
  infrun: stop_all_threads, pass=1, iterations=1
  infrun:   process 31233 not executing
  infrun: stop_all_threads done
  (gdb)

The exit event from Inferior 1 is received and shown to the user.
The exit event from Inferior 2 is not displayed, but kept pending.

  (gdb) info inferiors
    Num  Description       Connection           Executable
  * 1    <null>                                 a.out
    2    process 31233     1 (native)           a.out
  (gdb) inferior 2
  [Switching to inferior 2 [process 31233] (a.out)]
  [Switching to thread 2.1 (process 31233)]
  Couldn't get registers: No such process.
  (gdb)

Note the "Couldn't get regsiters" error above.  As of writing this patch,
GDB normally goes into another hang, infinitely trying register access.
A patch such as

  https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html

eliminates the infinite polling.  Resuming Inferior 2 processes the
pending exit event.

  (gdb) continue
  Continuing.
  infrun: clear_proceed_status_thread (process 31233)
  infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
  infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
  infrun: proceed: resuming process 31233
  infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
  infrun: prepare_to_wait
  infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
  infrun: target_wait (-1.0.0, status) =
  infrun:   31233.31233.0 [process 31233],
  infrun:   status->kind = exited, status = 0
  infrun: handle_inferior_event status->kind = exited, status = 0
  [Inferior 2 (process 31233) exited normally]
  infrun: stop_waiting
  (gdb) info inferiors
    Num  Description       Connection           Executable
    1    <null>                                 a.out
  * 2    <null>                                 a.out
  (gdb)

Regression-tested on X86_64 Linux.

gdb/ChangeLog:
2020-02-05  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
	    Tom de Vries  <tdevries@suse.de>

	PR threads/25478
	* infrun.c (stop_all_threads): Do NOT ignore
	TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
	TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
	received from threads we attempt to stop.

gdb/testsuite/ChangeLog:
2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

	* gdb.multi/multi-exit.c: New file.
	* gdb.multi/multi-exit.exp: New file.
	* gdb.multi/multi-kill.c: New file.
	* gdb.multi/multi-kill.exp: New file.

Change-Id: I7cec98f40283773b79255d998511da434e9cd408
---
 gdb/infrun.c                           |  59 +++++++++-
 gdb/testsuite/gdb.multi/multi-exit.c   |  22 ++++
 gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
 gdb/testsuite/gdb.multi/multi-kill.c   |  34 ++++++
 gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
 5 files changed, 360 insertions(+), 6 deletions(-)
 create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
 create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
 create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
 create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp

diff --git a/gdb/infrun.c b/gdb/infrun.c
index ce8b544ab8d..ead60a0d152 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4854,13 +4854,60 @@ stop_all_threads (void)
 				  target_pid_to_str (event.ptid).c_str ());
 	    }
 
-	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED
-	      || event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
-	      || event.ws.kind == TARGET_WAITKIND_EXITED
-	      || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
+	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED)
+	    {
+	      /* All resumed threads exited.  */
+	    }
+	  else if (event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
+		   || event.ws.kind == TARGET_WAITKIND_EXITED
+		   || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
 	    {
-	      /* All resumed threads exited
-		 or one thread/process exited/signalled.  */
+	      /* One thread/process exited/signalled.  */
+
+	      thread_info *t = nullptr;
+
+	      /* The target may have reported just a pid.  If so, try
+		 the first non-exited thread.  */
+	      if (event.ptid.is_pid ())
+		{
+		  int pid  = event.ptid.pid ();
+		  inferior *inf = find_inferior_pid (event.target, pid);
+		  for (thread_info *tp : inf->non_exited_threads ())
+		    {
+		      t = tp;
+		      break;
+		    }
+
+		  /* FIXME: If there is no available thread, the event
+		     would have to be appended to a per-inferior event
+		     list, which, unfortunately, does not exist yet.  We
+		     assert here instead of going into an infinite loop.  */
+		  gdb_assert (t != nullptr);
+
+		  if (debug_infrun)
+		    fprintf_unfiltered (gdb_stdlog,
+					"infrun: stop_all_threads, using %s\n",
+					target_pid_to_str (t->ptid).c_str ());
+		}
+	      else
+		{
+		  t = find_thread_ptid (event.target, event.ptid);
+		  /* Check if this is the first time we see this thread.
+		     Don't bother adding if it individually exited.  */
+		  if (t == nullptr
+		      && event.ws.kind != TARGET_WAITKIND_THREAD_EXITED)
+		    t = add_thread (event.target, event.ptid);
+		}
+
+	      if (t != nullptr)
+		{
+		  /* Set the threads as non-executing to avoid
+		     another stop attempt on them.  */
+		  mark_non_executing_threads (event.target, event.ptid,
+					      event.ws);
+		  save_waitstatus (t, &event.ws);
+		  t->stop_requested = false;
+		}
 	    }
 	  else
 	    {
diff --git a/gdb/testsuite/gdb.multi/multi-exit.c b/gdb/testsuite/gdb.multi/multi-exit.c
new file mode 100644
index 00000000000..f4825c8a7c1
--- /dev/null
+++ b/gdb/testsuite/gdb.multi/multi-exit.c
@@ -0,0 +1,22 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+int
+main ()
+{
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.multi/multi-exit.exp b/gdb/testsuite/gdb.multi/multi-exit.exp
new file mode 100644
index 00000000000..2236243461d
--- /dev/null
+++ b/gdb/testsuite/gdb.multi/multi-exit.exp
@@ -0,0 +1,147 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test receiving TARGET_WAITKIND_EXITED events from multiple
+# inferiors.  In all stop-mode, upon receiving the exit event from one
+# of the inferiors, GDB will try to stop the other inferior, too.  So,
+# a stop request will be sent.  Receiving a TARGET_WAITKIND_EXITED
+# status kind as a response to that stop request instead of a
+# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
+
+load_lib gdbserver-support.exp
+
+if {[skip_gdbserver_tests]} {
+    return 0
+}
+
+standard_testfile
+
+# The plain remote target can't do multiple inferiors.
+if {[target_info gdb_protocol] != ""} {
+    return 0
+}
+
+if {[build_executable "failed to build" $testfile $srcfile {debug}] == -1} {
+    return -1
+}
+
+# Set up the current inferior with a gdbserver in multi mode as its
+# target, if TARGET is "extended-remote".  Otherwise the target is native.
+
+proc setup_inferior {target infnum} {
+    global binfile
+
+    gdb_test "file ${binfile}" ".*" "load file in inferior $infnum"
+
+    if {$target == "extended-remote"} {
+	gdb_test_no_output "set remote exec-file ${binfile}" \
+	    "set remote-exec file in inferior $infnum"
+	set res [gdbserver_start "--multi" ""]
+	set gdbserver_gdbport [lindex $res 1]
+	if {[gdb_target_cmd "extended-remote" $gdbserver_gdbport]} {
+	    return 0
+	}
+    }
+
+    if {![runto_main]} {
+	fail "starting inferior $infnum"
+	return 0
+    }
+    return 1
+}
+
+# Set up two inferiors and start the processes.  The underlying target
+# of each inferior is determined by the TARGET argument.
+
+proc setup {target} {
+    clean_restart
+
+    # This is a test specific for GDB's ability to stop all threads.
+    gdb_test_no_output "maint set target-non-stop on"
+
+    if {![setup_inferior $target 1]} {
+	return 0
+    }
+
+    gdb_test "add-inferior -no-connection" "Added inferior 2" \
+	"add the second inferior"
+    gdb_test "inferior 2" "Switching to inferior 2.*" \
+	"switch to inferior 2"
+
+    if {![setup_inferior $target 2]} {
+	return 0
+    }
+
+    # We want to continue both processes.
+    gdb_test_no_output "set schedule-multiple on"
+
+    return 1
+}
+
+# Run the test using TARGET as the target of the inferiors.
+
+proc test {target} {
+    if {![setup $target]} {
+	untested "could not do setup"
+	return
+    }
+
+    # We want GDB to complete the command and return the prompt
+    # instead of going into an infinite loop.
+    global decimal gdb_prompt inferior_exited_re
+    gdb_test_multiple "continue" "first continue" {
+	-re "Inferior ($decimal) \[^\n\r\]+ exited normally.*$gdb_prompt $" {
+	    set exited_inferior $expect_out(1,string)
+	    pass $gdb_test_name
+	}
+    }
+
+    if {![info exists exited_inferior]} {
+	fail "first continue"
+	return 0
+    }
+
+    if {$exited_inferior == 1} {
+	set other_inferior 2
+    } else {
+	set other_inferior 1
+    }
+
+    # Switch to the other inferior and check it, too, continues to the end.
+    gdb_test "inferior $other_inferior" \
+	".*Switching to inferior $other_inferior.*" \
+	"switch to the other inferior"
+
+    gdb_continue_to_end
+
+    # Finally, check if we can re-run both inferiors.
+    delete_breakpoints
+
+    gdb_test "run" "$inferior_exited_re normally\]" \
+	"re-run the other inferior"
+
+    gdb_test "inferior $exited_inferior" \
+	".*Switching to inferior $exited_inferior.*" \
+	"switch to the first exited inferior"
+
+    gdb_test "run" "$inferior_exited_re normally\]" \
+	"re-run the first exited inferior"
+}
+
+foreach_with_prefix target {"native" "extended-remote"} {
+    test $target
+}
diff --git a/gdb/testsuite/gdb.multi/multi-kill.c b/gdb/testsuite/gdb.multi/multi-kill.c
new file mode 100644
index 00000000000..3622c499202
--- /dev/null
+++ b/gdb/testsuite/gdb.multi/multi-kill.c
@@ -0,0 +1,34 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This program is intended to be started outside of gdb, and then
+   attached to by gdb.  */
+
+#include <unistd.h>
+
+int
+main ()
+{
+  /* Don't run forever in case GDB crashes and DejaGNU fails to kill
+     this program.  */
+  alarm (10);
+
+  while (1)
+    ;
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.multi/multi-kill.exp b/gdb/testsuite/gdb.multi/multi-kill.exp
new file mode 100644
index 00000000000..09de64c9d01
--- /dev/null
+++ b/gdb/testsuite/gdb.multi/multi-kill.exp
@@ -0,0 +1,104 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test receiving TARGET_WAITKIND_SIGNALLED events from multiple
+# inferiors.  In all stop-mode, upon receiving the exit event from one
+# of the inferiors, GDB will try to stop the other inferior, too.  So,
+# a stop request will be sent.  Receiving a TARGET_WAITKIND_SIGNALLED
+# status kind as a response to that stop request instead of a
+# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
+
+standard_testfile
+
+# The plain remote target can't do multiple inferiors.
+if {[target_info gdb_protocol] != ""} {
+    return
+}
+
+# This is a test specific for native GDB's ability to stop all
+# threads.
+if {![can_spawn_for_attach]} {
+    return
+}
+
+if {[prepare_for_testing "failed to prepare" $testfile $srcfile]} {
+    return -1
+}
+
+gdb_test "add-inferior -exec $binfile" "Added inferior 2.*" \
+    "add the second inferior"
+
+# We want both processes in a running state.
+gdb_test_no_output "set schedule-multiple on"
+
+# Start the programs, attach to them, then kill both from outside.
+set spawn_id_list [spawn_wait_for_attach [list $binfile $binfile]]
+set test_spawn_id1 [lindex $spawn_id_list 0]
+set test_spawn_id2 [lindex $spawn_id_list 1]
+set testpid1 [spawn_id_get_pid $test_spawn_id1]
+set testpid2 [spawn_id_get_pid $test_spawn_id2]
+
+gdb_test "inferior 1" ".*Switching to inferior 1.*"
+
+gdb_test "attach $testpid1" \
+    "Attaching to program: .*, process $testpid1.*(in|at).*" \
+    "attach to program 1"
+
+gdb_test "inferior 2" ".*Switching to inferior 2.*"
+
+gdb_test "attach $testpid2" \
+    "Attaching to program: .*, process $testpid2.*(in|at).*" \
+    "attach to program 2"
+
+gdb_test_multiple "continue" "continue processes" {
+    -re "Continuing.\[\r\n\]+" {
+	# Kill both processes at once.
+	remote_exec build "kill -9 ${testpid1} ${testpid2}"
+	exp_continue
+    }
+    -re "Program terminated with signal.*$gdb_prompt" {
+	pass $gdb_test_name
+    }
+}
+
+# Find the current inferior's id.
+set current_inferior 1
+gdb_test_multiple "info inferiors" "find the current inf id" {
+    -re "\\* 1 .*$gdb_prompt $" {
+        set current_inferior 1
+	set other_inferior 2
+	pass $gdb_test_name
+    }
+    -re "\\* 2 .*$gdb_prompt $" {
+        set current_inferior 2
+	set other_inferior 1
+	pass $gdb_test_name
+    }
+}
+
+# Switch to the other inferior and check it, too, continues to the end.
+gdb_test "inferior $other_inferior" \
+    ".*Switching to inferior $other_inferior.*" \
+    "switch to the other inferior"
+
+gdb_test "continue" \
+    "Program terminated with signal SIGKILL, .*" \
+    "continue the other inferior"
+
+# Make sure that the processes are gone.
+kill_wait_spawned_process $test_spawn_id1
+kill_wait_spawned_process $test_spawn_id2
-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls
  2020-04-06 15:45   ` [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls Tankut Baris Aktemur
@ 2020-04-16 16:11     ` Pedro Alves
  2020-04-20 20:13       ` Aktemur, Tankut Baris
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2020-04-16 16:11 UTC (permalink / raw)
  To: Tankut Baris Aktemur, gdb-patches

On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:

> --- a/gdb/regcache.c
> +++ b/gdb/regcache.c
> @@ -1220,6 +1220,24 @@ regcache_read_pc (struct regcache *regcache)
>    return pc_val;
>  }
>  
> +/* See gdbsupport/common-regcache.h.  */
> +
> +CORE_ADDR
> +regcache_read_pc_protected (regcache *regcache)
> +{
> +  CORE_ADDR pc;
> +  try
> +    {
> +      pc = regcache_read_pc (regcache);
> +    }
> +  catch (const gdb_exception &ex)

This swallows Ctrl-C/QUIT, which is usually not a good idea.
Let's default to catching gdb_exception_error instead.

Otherwise OK.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use
  2020-04-06 15:45   ` [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use Tankut Baris Aktemur
@ 2020-04-16 16:12     ` Pedro Alves
  0 siblings, 0 replies; 16+ messages in thread
From: Pedro Alves @ 2020-04-16 16:12 UTC (permalink / raw)
  To: Tankut Baris Aktemur, gdb-patches

On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> In infrun.c's resume_1 function, move the definition of the local
> variable PC down to its first use.  This is useful if the thread we want
> to resume is already gone with a pending exit event, because we avoid
> the error we would see otherwise when trying to read the PC.
> 
> gdb/ChangeLog:
> 2020-04-03  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 
> 	* infrun.c (resume_1): Move a 'regcache_read_pc' call down to first
> 	use.

OK.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event
  2020-04-06 15:45   ` [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event Tankut Baris Aktemur
@ 2020-04-16 16:24     ` Pedro Alves
  2020-04-20 20:35       ` Aktemur, Tankut Baris
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2020-04-16 16:24 UTC (permalink / raw)
  To: Tankut Baris Aktemur, gdb-patches

On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> gdb/ChangeLog:
> 2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 
> 	* remote.c (remote_target::update_thread_list): Do not delete
> 	a thread if it has a pending event.
> ---
>  gdb/remote.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/gdb/remote.c b/gdb/remote.c
> index bfbc0bc21d3..12ac7cb9862 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -3821,6 +3821,9 @@ remote_target::update_thread_list ()
>  	  if (tp->inf->process_target () != this)
>  	    continue;
>  
> +	  if (tp->suspend.waitstatus_pending_p)
> +	    continue;
> +

This patch makes me nervous.  :-/

Why doesn't prune_threads, via remote_target::thread_alive end
up removing the thread anyway?

I think it'd be better to extend the check to also check for
TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED.  Let threads
with other kinds of events pending be deleted.

Other targets are likely to need something like this, but I'm
OK with going with remote-specific test to start with, so we
can all this in, including the testcase, and then work on
something more generic if we find a need.

In any case, please add some comments.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function
  2020-04-06 15:45   ` [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function Tankut Baris Aktemur
@ 2020-04-16 16:24     ` Pedro Alves
  0 siblings, 0 replies; 16+ messages in thread
From: Pedro Alves @ 2020-04-16 16:24 UTC (permalink / raw)
  To: Tankut Baris Aktemur, gdb-patches

On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> This is a refactoring.  The extracted function is placed deliberately
> before 'stop_all_threads' because the function will be re-used there
> in a subsequent patch for handling an exit status kind received from
> a thread that GDB attempted to stop.
> 
> gdb/ChangeLog:
> 2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 
> 	* infrun.c (handle_inferior_event): Extract out a piece of code
> 	into...
> 	(mark_non_executing_threads): ...this new function.

OK.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop
  2020-04-06 15:45   ` [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop Tankut Baris Aktemur
@ 2020-04-16 17:06     ` Pedro Alves
  2020-04-20 20:43       ` Aktemur, Tankut Baris
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2020-04-16 17:06 UTC (permalink / raw)
  To: Tankut Baris Aktemur, gdb-patches

On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> In stop_all_threads, GDB sends signals to other threads in an attempt
> to stop them.  While in a typical scenario the expected wait status is
> TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
> to stop has already terminated.  If so, a waitstatus other than
> TARGET_WAITKIND_STOPPED would be received.  Handle this case
> appropriately.
> 
> If a wait status that denotes thread termination is ignored, GDB goes
> into an infinite loop in stop_all_threads.
> E.g.:
> 
>   $ gdb ./a.out
>   (gdb) start
>   ...
>   (gdb) add-inferior -exec ./a.out
>   ...
>   (gdb) inferior 2
>   ...
>   (gdb) start
>   ...
>   (gdb) set schedule-multiple on
>   (gdb) set debug infrun 2
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 10449)
>   infrun: clear_proceed_status_thread (process 10453)
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 10449
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
>   infrun: infrun_async(1)
>   infrun: prepare_to_wait
>   infrun: proceed: resuming process 10453
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
>   infrun: prepare_to_wait
>   infrun: Found 2 inferiors, starting at #0
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   10449.10449.0 [process 10449],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 1 (process 10449) exited normally]
>   infrun: stop_waiting
>   infrun: stop_all_threads
>   infrun: stop_all_threads, pass=0, iterations=0
>   infrun:   process 10453 executing, need stop
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   10453.10453.0 [process 10453],
>   infrun:   status->kind = exited, status = 0
>   infrun: stop_all_threads status->kind = exited, status = 0 process 10453
>   infrun:   process 10453 executing, already stopping
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   -1.0.0 [process -1],
>   infrun:   status->kind = no-resumed
>   infrun: infrun_async(0)
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   infrun: stop_all_threads status->kind = no-resumed process -1
>   infrun:   process 10453 executing, already stopping
>   ...
> 
> And this polling goes on forever.  This patch prevents the infinite
> looping behavior.  For the same scenario above, we obtain the
> following behavior:
> 
>   ...
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 31229)
>   infrun: clear_proceed_status_thread (process 31233)
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 31229
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
>   infrun: infrun_async(1)
>   infrun: prepare_to_wait
>   infrun: proceed: resuming process 31233
>   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
>   infrun: prepare_to_wait
>   infrun: Found 2 inferiors, starting at #0
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31229.31229.0 [process 31229],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 1 (process 31229) exited normally]
>   infrun: stop_waiting
>   infrun: stop_all_threads
>   infrun: stop_all_threads, pass=0, iterations=0
>   infrun:   process 31233 executing, need stop
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31233.31233.0 [process 31233],
>   infrun:   status->kind = exited, status = 0
>   infrun: stop_all_threads status->kind = exited, status = 0 process 31233
>   infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
>   infrun:   process 31233 not executing
>   infrun: stop_all_threads, pass=1, iterations=1
>   infrun:   process 31233 not executing
>   infrun: stop_all_threads done
>   (gdb)
> 
> The exit event from Inferior 1 is received and shown to the user.
> The exit event from Inferior 2 is not displayed, but kept pending.
> 
>   (gdb) info inferiors
>     Num  Description       Connection           Executable
>   * 1    <null>                                 a.out
>     2    process 31233     1 (native)           a.out
>   (gdb) inferior 2
>   [Switching to inferior 2 [process 31233] (a.out)]
>   [Switching to thread 2.1 (process 31233)]
>   Couldn't get registers: No such process.
>   (gdb)
> 
> Note the "Couldn't get regsiters" error above.  As of writing this patch,
> GDB normally goes into another hang, infinitely trying register access.
> A patch such as
> 
>   https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html
> 

I've reviewed that patch today.  Once that is in, remember to update
this commit log.

> eliminates the infinite polling.  Resuming Inferior 2 processes the
> pending exit event.
> 
>   (gdb) continue
>   Continuing.
>   infrun: clear_proceed_status_thread (process 31233)
>   infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
>   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
>   infrun: proceed: resuming process 31233
>   infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
>   infrun: prepare_to_wait
>   infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
>   infrun: target_wait (-1.0.0, status) =
>   infrun:   31233.31233.0 [process 31233],
>   infrun:   status->kind = exited, status = 0
>   infrun: handle_inferior_event status->kind = exited, status = 0
>   [Inferior 2 (process 31233) exited normally]
>   infrun: stop_waiting
>   (gdb) info inferiors
>     Num  Description       Connection           Executable
>     1    <null>                                 a.out
>   * 2    <null>                                 a.out
>   (gdb)
> 
> Regression-tested on X86_64 Linux.

Awesome, thanks much for tackling this.

> 
> gdb/ChangeLog:
> 2020-02-05  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 	    Tom de Vries  <tdevries@suse.de>
> 
> 	PR threads/25478
> 	* infrun.c (stop_all_threads): Do NOT ignore
> 	TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
> 	TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
> 	received from threads we attempt to stop.
> 
> gdb/testsuite/ChangeLog:
> 2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> 
> 	* gdb.multi/multi-exit.c: New file.
> 	* gdb.multi/multi-exit.exp: New file.
> 	* gdb.multi/multi-kill.c: New file.
> 	* gdb.multi/multi-kill.exp: New file.
> 
> Change-Id: I7cec98f40283773b79255d998511da434e9cd408
> ---
>  gdb/infrun.c                           |  59 +++++++++-
>  gdb/testsuite/gdb.multi/multi-exit.c   |  22 ++++
>  gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
>  gdb/testsuite/gdb.multi/multi-kill.c   |  34 ++++++
>  gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
>  5 files changed, 360 insertions(+), 6 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
>  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
>  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
>  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index ce8b544ab8d..ead60a0d152 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -4854,13 +4854,60 @@ stop_all_threads (void)
>  				  target_pid_to_str (event.ptid).c_str ());
>  	    }
>  
> -	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED
> -	      || event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> -	      || event.ws.kind == TARGET_WAITKIND_EXITED
> -	      || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> +	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED)
> +	    {
> +	      /* All resumed threads exited.  */
> +	    }
> +	  else if (event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> +		   || event.ws.kind == TARGET_WAITKIND_EXITED
> +		   || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
>  	    {
> -	      /* All resumed threads exited
> -		 or one thread/process exited/signalled.  */
> +	      /* One thread/process exited/signalled.  */
> +
> +	      thread_info *t = nullptr;
> +
> +	      /* The target may have reported just a pid.  If so, try
> +		 the first non-exited thread.  */
> +	      if (event.ptid.is_pid ())
> +		{
> +		  int pid  = event.ptid.pid ();
> +		  inferior *inf = find_inferior_pid (event.target, pid);
> +		  for (thread_info *tp : inf->non_exited_threads ())
> +		    {
> +		      t = tp;
> +		      break;
> +		    }
> +
> +		  /* FIXME: If there is no available thread, the event
> +		     would have to be appended to a per-inferior event
> +		     list, which, unfortunately, does not exist yet.  We
> +		     assert here instead of going into an infinite loop.  */
> +		  gdb_assert (t != nullptr);
> +
> +		  if (debug_infrun)
> +		    fprintf_unfiltered (gdb_stdlog,
> +					"infrun: stop_all_threads, using %s\n",
> +					target_pid_to_str (t->ptid).c_str ());
> +		}
> +	      else
> +		{
> +		  t = find_thread_ptid (event.target, event.ptid);
> +		  /* Check if this is the first time we see this thread.
> +		     Don't bother adding if it individually exited.  */
> +		  if (t == nullptr
> +		      && event.ws.kind != TARGET_WAITKIND_THREAD_EXITED)
> +		    t = add_thread (event.target, event.ptid);
> +		}
> +
> +	      if (t != nullptr)
> +		{
> +		  /* Set the threads as non-executing to avoid
> +		     another stop attempt on them.  */
> +		  mark_non_executing_threads (event.target, event.ptid,
> +					      event.ws);
> +		  save_waitstatus (t, &event.ws);
> +		  t->stop_requested = false;
> +		}
>  	    }
>  	  else
>  	    {
> diff --git a/gdb/testsuite/gdb.multi/multi-exit.c b/gdb/testsuite/gdb.multi/multi-exit.c
> new file mode 100644
> index 00000000000..f4825c8a7c1
> --- /dev/null
> +++ b/gdb/testsuite/gdb.multi/multi-exit.c
> @@ -0,0 +1,22 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +int
> +main ()
> +{
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.multi/multi-exit.exp b/gdb/testsuite/gdb.multi/multi-exit.exp
> new file mode 100644
> index 00000000000..2236243461d
> --- /dev/null
> +++ b/gdb/testsuite/gdb.multi/multi-exit.exp
> @@ -0,0 +1,147 @@
> +# This testcase is part of GDB, the GNU debugger.
> +
> +# Copyright 2020 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test receiving TARGET_WAITKIND_EXITED events from multiple
> +# inferiors.  In all stop-mode, upon receiving the exit event from one
> +# of the inferiors, GDB will try to stop the other inferior, too.  So,
> +# a stop request will be sent.  Receiving a TARGET_WAITKIND_EXITED
> +# status kind as a response to that stop request instead of a
> +# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
> +
> +load_lib gdbserver-support.exp
> +
> +if {[skip_gdbserver_tests]} {
> +    return 0
> +}
> +
> +standard_testfile
> +
> +# The plain remote target can't do multiple inferiors.
> +if {[target_info gdb_protocol] != ""} {

Use:

 if [use_gdb_stub] {

instead.

multi-target.exp is a bit special, in that it really needs to
control which targets are run.  It is not the case in this
testcase, AFAICT.  See more below.


> +    return 0
> +}
> +
> +if {[build_executable "failed to build" $testfile $srcfile {debug}] == -1} {
> +    return -1
> +}
> +
> +# Set up the current inferior with a gdbserver in multi mode as its
> +# target, if TARGET is "extended-remote".  Otherwise the target is native.
> +
> +proc setup_inferior {target infnum} {
> +    global binfile
> +
> +    gdb_test "file ${binfile}" ".*" "load file in inferior $infnum"
> +
> +    if {$target == "extended-remote"} {
> +	gdb_test_no_output "set remote exec-file ${binfile}" \
> +	    "set remote-exec file in inferior $infnum"
> +	set res [gdbserver_start "--multi" ""]
> +	set gdbserver_gdbport [lindex $res 1]
> +	if {[gdb_target_cmd "extended-remote" $gdbserver_gdbport]} {
> +	    return 0
> +	}
> +    }
> +
> +    if {![runto_main]} {
> +	fail "starting inferior $infnum"
> +	return 0
> +    }
> +    return 1
> +}
> +
> +# Set up two inferiors and start the processes.  The underlying target
> +# of each inferior is determined by the TARGET argument.
> +
> +proc setup {target} {
> +    clean_restart
> +
> +    # This is a test specific for GDB's ability to stop all threads.
> +    gdb_test_no_output "maint set target-non-stop on"
> +
> +    if {![setup_inferior $target 1]} {
> +	return 0
> +    }
> +
> +    gdb_test "add-inferior -no-connection" "Added inferior 2" \
> +	"add the second inferior"
> +    gdb_test "inferior 2" "Switching to inferior 2.*" \
> +	"switch to inferior 2"
> +
> +    if {![setup_inferior $target 2]} {
> +	return 0
> +    }
> +
> +    # We want to continue both processes.
> +    gdb_test_no_output "set schedule-multiple on"
> +
> +    return 1
> +}
> +
> +# Run the test using TARGET as the target of the inferiors.
> +
> +proc test {target} {
> +    if {![setup $target]} {
> +	untested "could not do setup"
> +	return
> +    }
> +
> +    # We want GDB to complete the command and return the prompt
> +    # instead of going into an infinite loop.
> +    global decimal gdb_prompt inferior_exited_re
> +    gdb_test_multiple "continue" "first continue" {
> +	-re "Inferior ($decimal) \[^\n\r\]+ exited normally.*$gdb_prompt $" {
> +	    set exited_inferior $expect_out(1,string)
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    if {![info exists exited_inferior]} {
> +	fail "first continue"
> +	return 0
> +    }
> +
> +    if {$exited_inferior == 1} {
> +	set other_inferior 2
> +    } else {
> +	set other_inferior 1
> +    }
> +
> +    # Switch to the other inferior and check it, too, continues to the end.
> +    gdb_test "inferior $other_inferior" \
> +	".*Switching to inferior $other_inferior.*" \
> +	"switch to the other inferior"
> +
> +    gdb_continue_to_end
> +
> +    # Finally, check if we can re-run both inferiors.
> +    delete_breakpoints
> +
> +    gdb_test "run" "$inferior_exited_re normally\]" \
> +	"re-run the other inferior"
> +
> +    gdb_test "inferior $exited_inferior" \
> +	".*Switching to inferior $exited_inferior.*" \
> +	"switch to the first exited inferior"
> +
> +    gdb_test "run" "$inferior_exited_re normally\]" \
> +	"re-run the first exited inferior"
> +}
> +
> +foreach_with_prefix target {"native" "extended-remote"} {

This is really usually not the right thing to do.  Better write
the testcase in a target-neutral form, and then let 

 make check RUNTESTFLAGS="--target_board=native-extended-gdbserver"

cover testing with extended-remote.

AFAICT, when testing with extended-remote, you're spawning
a gdbserver for each inferior.  You don't really need that,
right?  A single gdbserver with both inferiors should
trigger the problem as well.

I.e., remove this specific target stuff, and just use regular
commands.

Instead of issuing the "file" + "set remote exec-file", use
gdb_load.

Instead of the "run" command, try gdb_run_cmd.

> +gdb_test_multiple "continue" "continue processes" {
> +    -re "Continuing.\[\r\n\]+" {
> +	# Kill both processes at once.
> +	remote_exec build "kill -9 ${testpid1} ${testpid2}"

Pedantically, "remote_exec target"

> +# Find the current inferior's id.
> +set current_inferior 1
> +gdb_test_multiple "info inferiors" "find the current inf id" {
> +    -re "\\* 1 .*$gdb_prompt $" {
> +        set current_inferior 1
> +	set other_inferior 2
> +	pass $gdb_test_name
> +    }
> +    -re "\\* 2 .*$gdb_prompt $" {
> +        set current_inferior 2
> +	set other_inferior 1
> +	pass $gdb_test_name
> +    }

Watch out for tabs vs spaces above.

Does multi-kill.exp really need to attach to processes instead of
spawning them via GDB?  If we do the latter, then the testcase
will run on more configurations.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls
  2020-04-16 16:11     ` Pedro Alves
@ 2020-04-20 20:13       ` Aktemur, Tankut Baris
  0 siblings, 0 replies; 16+ messages in thread
From: Aktemur, Tankut Baris @ 2020-04-20 20:13 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On Thursday, April 16, 2020 6:12 PM Pedro Alves wrote:
> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> 
> > --- a/gdb/regcache.c
> > +++ b/gdb/regcache.c
> > @@ -1220,6 +1220,24 @@ regcache_read_pc (struct regcache *regcache)
> >    return pc_val;
> >  }
> >
> > +/* See gdbsupport/common-regcache.h.  */
> > +
> > +CORE_ADDR
> > +regcache_read_pc_protected (regcache *regcache)
> > +{
> > +  CORE_ADDR pc;
> > +  try
> > +    {
> > +      pc = regcache_read_pc (regcache);
> > +    }
> > +  catch (const gdb_exception &ex)
> 
> This swallows Ctrl-C/QUIT, which is usually not a good idea.
> Let's default to catching gdb_exception_error instead.

I replaced gdb_exception with gdb_exception_error.  I'm not sending another revision
for this.

Thanks.
Baris


 

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event
  2020-04-16 16:24     ` Pedro Alves
@ 2020-04-20 20:35       ` Aktemur, Tankut Baris
  2020-04-21 18:54         ` Pedro Alves
  0 siblings, 1 reply; 16+ messages in thread
From: Aktemur, Tankut Baris @ 2020-04-20 20:35 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On Thursday, April 16, 2020 6:24 PM, Pedro Alves wrote:
> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> > gdb/ChangeLog:
> > 2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> >
> > 	* remote.c (remote_target::update_thread_list): Do not delete
> > 	a thread if it has a pending event.
> > ---
> >  gdb/remote.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/gdb/remote.c b/gdb/remote.c
> > index bfbc0bc21d3..12ac7cb9862 100644
> > --- a/gdb/remote.c
> > +++ b/gdb/remote.c
> > @@ -3821,6 +3821,9 @@ remote_target::update_thread_list ()
> >  	  if (tp->inf->process_target () != this)
> >  	    continue;
> >
> > +	  if (tp->suspend.waitstatus_pending_p)
> > +	    continue;
> > +
> 
> This patch makes me nervous.  :-/
> 
> Why doesn't prune_threads, via remote_target::thread_alive end
> up removing the thread anyway?

As far as I see, prune_threads is called only if gdbserver does not
send a thread list.  The comment in the function suggests that this
could be the case for older servers.  I'm not sure how old it should
be, though, for me to be able to test it.
 
> I think it'd be better to extend the check to also check for
> TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED.  Let threads
> with other kinds of events pending be deleted.
> 
> Other targets are likely to need something like this, but I'm
> OK with going with remote-specific test to start with, so we
> can all this in, including the testcase, and then work on
> something more generic if we find a need.
> 
> In any case, please add some comments.

First I did this change, but when I started working on Patch #5, interesting
things have come up.

Having both of the inferiors on a single extended-remote target is supposed
to trigger the same stop-all-threads scenario that this series attempts to fix.
But it turned out to be slightly different.

The 'update_thread_list' at the beginning of the stop_all_threads pass
removes all the threads because the remote target sends an empty list of threads
(well both inferiors exited and no thread remained).  So, before we even attempt
to stop threads, they are already gone when the list is updated.  The problem is,
inferior 2 (i.e. the inferior whose thread we would attempt to stop but would
receive an EXITED waitstatus for) is then left in a seemingly live state (pid != 0),
but with no threads.  Hence, we cannot even switch to it after inferior 1's
exit event is shown.

Therefore I updated this patch as below.  Although no regression was detected,
I don't feel much safe with this new version.  Comments welcome.

Thanks,
-Baris

From f9c9a7c1bb15cb99925744e25026bd2b09b74194 Mon Sep 17 00:00:00 2001
From: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
Date: Mon, 6 Apr 2020 13:54:35 +0200
Subject: [PATCH 3/5] gdb/remote: do not delete a thread if this could leave
 its inferior empty

gdb/ChangeLog:
2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>

        * remote.c (remote_target::update_thread_list): Do not delete
        a thread if this could leave its inferior empty.
---
 gdb/remote.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/gdb/remote.c b/gdb/remote.c
index 5db406e045c..681957c012b 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -3822,6 +3822,18 @@ remote_target::update_thread_list ()
          if (tp->inf->process_target () != this)
            continue;

+         /* Do not delete the thread if it belongs to another inferior
+            that has only one non-exited thread, because otherwise we
+            would end up with a seemingly live inferior with no threads.  */
+         if (tp->inf != current_inferior ())
+           {
+             int num_threads = 0;
+             for (thread_info *t ATTRIBUTE_UNUSED : tp->inf->non_exited_threads ())
+               num_threads++;
+             if (num_threads == 1)
+               continue;
+           }
+
          if (!context.contains_thread (tp->ptid))
            {
              /* Not found.  */
--
2.17.1

 


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop
  2020-04-16 17:06     ` Pedro Alves
@ 2020-04-20 20:43       ` Aktemur, Tankut Baris
  0 siblings, 0 replies; 16+ messages in thread
From: Aktemur, Tankut Baris @ 2020-04-20 20:43 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On Thursday, April 16, 2020 7:07 PM, Pedro Alves wrote:
> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> > In stop_all_threads, GDB sends signals to other threads in an attempt
> > to stop them.  While in a typical scenario the expected wait status is
> > TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
> > to stop has already terminated.  If so, a waitstatus other than
> > TARGET_WAITKIND_STOPPED would be received.  Handle this case
> > appropriately.
> >
> > If a wait status that denotes thread termination is ignored, GDB goes
> > into an infinite loop in stop_all_threads.
> > E.g.:
> >
> >   $ gdb ./a.out
> >   (gdb) start
> >   ...
> >   (gdb) add-inferior -exec ./a.out
> >   ...
> >   (gdb) inferior 2
> >   ...
> >   (gdb) start
> >   ...
> >   (gdb) set schedule-multiple on
> >   (gdb) set debug infrun 2
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 10449)
> >   infrun: clear_proceed_status_thread (process 10453)
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 10449
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
> >   infrun: infrun_async(1)
> >   infrun: prepare_to_wait
> >   infrun: proceed: resuming process 10453
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
> >   infrun: prepare_to_wait
> >   infrun: Found 2 inferiors, starting at #0
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   10449.10449.0 [process 10449],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 1 (process 10449) exited normally]
> >   infrun: stop_waiting
> >   infrun: stop_all_threads
> >   infrun: stop_all_threads, pass=0, iterations=0
> >   infrun:   process 10453 executing, need stop
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   10453.10453.0 [process 10453],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: stop_all_threads status->kind = exited, status = 0 process 10453
> >   infrun:   process 10453 executing, already stopping
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   -1.0.0 [process -1],
> >   infrun:   status->kind = no-resumed
> >   infrun: infrun_async(0)
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   infrun: stop_all_threads status->kind = no-resumed process -1
> >   infrun:   process 10453 executing, already stopping
> >   ...
> >
> > And this polling goes on forever.  This patch prevents the infinite
> > looping behavior.  For the same scenario above, we obtain the
> > following behavior:
> >
> >   ...
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 31229)
> >   infrun: clear_proceed_status_thread (process 31233)
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 31229
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
> >   infrun: infrun_async(1)
> >   infrun: prepare_to_wait
> >   infrun: proceed: resuming process 31233
> >   infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
> >   infrun: prepare_to_wait
> >   infrun: Found 2 inferiors, starting at #0
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31229.31229.0 [process 31229],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 1 (process 31229) exited normally]
> >   infrun: stop_waiting
> >   infrun: stop_all_threads
> >   infrun: stop_all_threads, pass=0, iterations=0
> >   infrun:   process 31233 executing, need stop
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31233.31233.0 [process 31233],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: stop_all_threads status->kind = exited, status = 0 process 31233
> >   infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
> >   infrun:   process 31233 not executing
> >   infrun: stop_all_threads, pass=1, iterations=1
> >   infrun:   process 31233 not executing
> >   infrun: stop_all_threads done
> >   (gdb)
> >
> > The exit event from Inferior 1 is received and shown to the user.
> > The exit event from Inferior 2 is not displayed, but kept pending.
> >
> >   (gdb) info inferiors
> >     Num  Description       Connection           Executable
> >   * 1    <null>                                 a.out
> >     2    process 31233     1 (native)           a.out
> >   (gdb) inferior 2
> >   [Switching to inferior 2 [process 31233] (a.out)]
> >   [Switching to thread 2.1 (process 31233)]
> >   Couldn't get registers: No such process.
> >   (gdb)
> >
> > Note the "Couldn't get regsiters" error above.  As of writing this patch,
> > GDB normally goes into another hang, infinitely trying register access.
> > A patch such as
> >
> >   https://sourceware.org/pipermail/gdb-patches/2020-March/166982.html
> >
> 
> I've reviewed that patch today.  Once that is in, remember to update
> this commit log.

Done.
 
> > eliminates the infinite polling.  Resuming Inferior 2 processes the
> > pending exit event.
> >
> >   (gdb) continue
> >   Continuing.
> >   infrun: clear_proceed_status_thread (process 31233)
> >   infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status
> = 0 (currently_stepping=0).
> >   infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> >   infrun: proceed: resuming process 31233
> >   infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0
> (currently_stepping=0).
> >   infrun: prepare_to_wait
> >   infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
> >   infrun: target_wait (-1.0.0, status) =
> >   infrun:   31233.31233.0 [process 31233],
> >   infrun:   status->kind = exited, status = 0
> >   infrun: handle_inferior_event status->kind = exited, status = 0
> >   [Inferior 2 (process 31233) exited normally]
> >   infrun: stop_waiting
> >   (gdb) info inferiors
> >     Num  Description       Connection           Executable
> >     1    <null>                                 a.out
> >   * 2    <null>                                 a.out
> >   (gdb)
> >
> > Regression-tested on X86_64 Linux.
> 
> Awesome, thanks much for tackling this.

Thanks for your review.  I repeated regression testing for v6 using
the default board and native-extended-gdbserver.
 
> >
> > gdb/ChangeLog:
> > 2020-02-05  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> > 	    Tom de Vries  <tdevries@suse.de>
> >
> > 	PR threads/25478
> > 	* infrun.c (stop_all_threads): Do NOT ignore
> > 	TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
> > 	TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
> > 	received from threads we attempt to stop.
> >
> > gdb/testsuite/ChangeLog:
> > 2019-11-04  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> >
> > 	* gdb.multi/multi-exit.c: New file.
> > 	* gdb.multi/multi-exit.exp: New file.
> > 	* gdb.multi/multi-kill.c: New file.
> > 	* gdb.multi/multi-kill.exp: New file.
> >
> > Change-Id: I7cec98f40283773b79255d998511da434e9cd408
> > ---
> >  gdb/infrun.c                           |  59 +++++++++-
> >  gdb/testsuite/gdb.multi/multi-exit.c   |  22 ++++
> >  gdb/testsuite/gdb.multi/multi-exit.exp | 147 +++++++++++++++++++++++++
> >  gdb/testsuite/gdb.multi/multi-kill.c   |  34 ++++++
> >  gdb/testsuite/gdb.multi/multi-kill.exp | 104 +++++++++++++++++
> >  5 files changed, 360 insertions(+), 6 deletions(-)
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.c
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-exit.exp
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.c
> >  create mode 100644 gdb/testsuite/gdb.multi/multi-kill.exp
> >
> > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > index ce8b544ab8d..ead60a0d152 100644
> > --- a/gdb/infrun.c
> > +++ b/gdb/infrun.c
> > @@ -4854,13 +4854,60 @@ stop_all_threads (void)
> >  				  target_pid_to_str (event.ptid).c_str ());
> >  	    }
> >
> > -	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED
> > -	      || event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > -	      || event.ws.kind == TARGET_WAITKIND_EXITED
> > -	      || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> > +	  if (event.ws.kind == TARGET_WAITKIND_NO_RESUMED)
> > +	    {
> > +	      /* All resumed threads exited.  */
> > +	    }
> > +	  else if (event.ws.kind == TARGET_WAITKIND_THREAD_EXITED
> > +		   || event.ws.kind == TARGET_WAITKIND_EXITED
> > +		   || event.ws.kind == TARGET_WAITKIND_SIGNALLED)
> >  	    {
> > -	      /* All resumed threads exited
> > -		 or one thread/process exited/signalled.  */
> > +	      /* One thread/process exited/signalled.  */
> > +
> > +	      thread_info *t = nullptr;
> > +
> > +	      /* The target may have reported just a pid.  If so, try
> > +		 the first non-exited thread.  */
> > +	      if (event.ptid.is_pid ())
> > +		{
> > +		  int pid  = event.ptid.pid ();
> > +		  inferior *inf = find_inferior_pid (event.target, pid);
> > +		  for (thread_info *tp : inf->non_exited_threads ())
> > +		    {
> > +		      t = tp;
> > +		      break;
> > +		    }
> > +
> > +		  /* FIXME: If there is no available thread, the event
> > +		     would have to be appended to a per-inferior event
> > +		     list, which, unfortunately, does not exist yet.  We
> > +		     assert here instead of going into an infinite loop.  */
> > +		  gdb_assert (t != nullptr);
> > +
> > +		  if (debug_infrun)
> > +		    fprintf_unfiltered (gdb_stdlog,
> > +					"infrun: stop_all_threads, using %s\n",
> > +					target_pid_to_str (t->ptid).c_str ());
> > +		}
> > +	      else
> > +		{
> > +		  t = find_thread_ptid (event.target, event.ptid);
> > +		  /* Check if this is the first time we see this thread.
> > +		     Don't bother adding if it individually exited.  */
> > +		  if (t == nullptr
> > +		      && event.ws.kind != TARGET_WAITKIND_THREAD_EXITED)
> > +		    t = add_thread (event.target, event.ptid);
> > +		}
> > +
> > +	      if (t != nullptr)
> > +		{
> > +		  /* Set the threads as non-executing to avoid
> > +		     another stop attempt on them.  */
> > +		  mark_non_executing_threads (event.target, event.ptid,
> > +					      event.ws);
> > +		  save_waitstatus (t, &event.ws);
> > +		  t->stop_requested = false;
> > +		}
> >  	    }
> >  	  else
> >  	    {
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.c b/gdb/testsuite/gdb.multi/multi-exit.c
> > new file mode 100644
> > index 00000000000..f4825c8a7c1
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.c
> > @@ -0,0 +1,22 @@
> > +/* This testcase is part of GDB, the GNU debugger.
> > +
> > +   Copyright 2020 Free Software Foundation, Inc.
> > +
> > +   This program is free software; you can redistribute it and/or modify
> > +   it under the terms of the GNU General Public License as published by
> > +   the Free Software Foundation; either version 3 of the License, or
> > +   (at your option) any later version.
> > +
> > +   This program is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +   GNU General Public License for more details.
> > +
> > +   You should have received a copy of the GNU General Public License
> > +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> > +
> > +int
> > +main ()
> > +{
> > +  return 0;
> > +}
> > diff --git a/gdb/testsuite/gdb.multi/multi-exit.exp b/gdb/testsuite/gdb.multi/multi-exit.exp
> > new file mode 100644
> > index 00000000000..2236243461d
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.multi/multi-exit.exp
> > @@ -0,0 +1,147 @@
> > +# This testcase is part of GDB, the GNU debugger.
> > +
> > +# Copyright 2020 Free Software Foundation, Inc.
> > +
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 3 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +# Test receiving TARGET_WAITKIND_EXITED events from multiple
> > +# inferiors.  In all stop-mode, upon receiving the exit event from one
> > +# of the inferiors, GDB will try to stop the other inferior, too.  So,
> > +# a stop request will be sent.  Receiving a TARGET_WAITKIND_EXITED
> > +# status kind as a response to that stop request instead of a
> > +# TARGET_WAITKIND_STOPPED should be handled by GDB without problems.
> > +
> > +load_lib gdbserver-support.exp
> > +
> > +if {[skip_gdbserver_tests]} {
> > +    return 0
> > +}
> > +
> > +standard_testfile
> > +
> > +# The plain remote target can't do multiple inferiors.
> > +if {[target_info gdb_protocol] != ""} {
> 
> Use:
> 
>  if [use_gdb_stub] {
> 
> instead.
> 
> multi-target.exp is a bit special, in that it really needs to
> control which targets are run.  It is not the case in this
> testcase, AFAICT.  See more below.

Done.

> 
> > +    return 0
> > +}
> > +
> > +if {[build_executable "failed to build" $testfile $srcfile {debug}] == -1} {
> > +    return -1
> > +}
> > +
> > +# Set up the current inferior with a gdbserver in multi mode as its
> > +# target, if TARGET is "extended-remote".  Otherwise the target is native.
> > +
> > +proc setup_inferior {target infnum} {
> > +    global binfile
> > +
> > +    gdb_test "file ${binfile}" ".*" "load file in inferior $infnum"
> > +
> > +    if {$target == "extended-remote"} {
> > +	gdb_test_no_output "set remote exec-file ${binfile}" \
> > +	    "set remote-exec file in inferior $infnum"
> > +	set res [gdbserver_start "--multi" ""]
> > +	set gdbserver_gdbport [lindex $res 1]
> > +	if {[gdb_target_cmd "extended-remote" $gdbserver_gdbport]} {
> > +	    return 0
> > +	}
> > +    }
> > +
> > +    if {![runto_main]} {
> > +	fail "starting inferior $infnum"
> > +	return 0
> > +    }
> > +    return 1
> > +}
> > +
> > +# Set up two inferiors and start the processes.  The underlying target
> > +# of each inferior is determined by the TARGET argument.
> > +
> > +proc setup {target} {
> > +    clean_restart
> > +
> > +    # This is a test specific for GDB's ability to stop all threads.
> > +    gdb_test_no_output "maint set target-non-stop on"
> > +
> > +    if {![setup_inferior $target 1]} {
> > +	return 0
> > +    }
> > +
> > +    gdb_test "add-inferior -no-connection" "Added inferior 2" \
> > +	"add the second inferior"
> > +    gdb_test "inferior 2" "Switching to inferior 2.*" \
> > +	"switch to inferior 2"
> > +
> > +    if {![setup_inferior $target 2]} {
> > +	return 0
> > +    }
> > +
> > +    # We want to continue both processes.
> > +    gdb_test_no_output "set schedule-multiple on"
> > +
> > +    return 1
> > +}
> > +
> > +# Run the test using TARGET as the target of the inferiors.
> > +
> > +proc test {target} {
> > +    if {![setup $target]} {
> > +	untested "could not do setup"
> > +	return
> > +    }
> > +
> > +    # We want GDB to complete the command and return the prompt
> > +    # instead of going into an infinite loop.
> > +    global decimal gdb_prompt inferior_exited_re
> > +    gdb_test_multiple "continue" "first continue" {
> > +	-re "Inferior ($decimal) \[^\n\r\]+ exited normally.*$gdb_prompt $" {
> > +	    set exited_inferior $expect_out(1,string)
> > +	    pass $gdb_test_name
> > +	}
> > +    }
> > +
> > +    if {![info exists exited_inferior]} {
> > +	fail "first continue"
> > +	return 0
> > +    }
> > +
> > +    if {$exited_inferior == 1} {
> > +	set other_inferior 2
> > +    } else {
> > +	set other_inferior 1
> > +    }
> > +
> > +    # Switch to the other inferior and check it, too, continues to the end.
> > +    gdb_test "inferior $other_inferior" \
> > +	".*Switching to inferior $other_inferior.*" \
> > +	"switch to the other inferior"
> > +
> > +    gdb_continue_to_end
> > +
> > +    # Finally, check if we can re-run both inferiors.
> > +    delete_breakpoints
> > +
> > +    gdb_test "run" "$inferior_exited_re normally\]" \
> > +	"re-run the other inferior"
> > +
> > +    gdb_test "inferior $exited_inferior" \
> > +	".*Switching to inferior $exited_inferior.*" \
> > +	"switch to the first exited inferior"
> > +
> > +    gdb_test "run" "$inferior_exited_re normally\]" \
> > +	"re-run the first exited inferior"
> > +}
> > +
> > +foreach_with_prefix target {"native" "extended-remote"} {
> 
> This is really usually not the right thing to do.  Better write
> the testcase in a target-neutral form, and then let
> 
>  make check RUNTESTFLAGS="--target_board=native-extended-gdbserver"
> 
> cover testing with extended-remote.

I believe this comment is addressed in v6.
 
> AFAICT, when testing with extended-remote, you're spawning
> a gdbserver for each inferior.  You don't really need that,
> right?  A single gdbserver with both inferiors should
> trigger the problem as well.

Correct.  New problems arose, though, that I explained in the reply
to Patch #3.
 
> I.e., remove this specific target stuff, and just use regular
> commands.
> 
> Instead of issuing the "file" + "set remote exec-file", use
> gdb_load.

Done.

> Instead of the "run" command, try gdb_run_cmd.

Done.
 
> > +gdb_test_multiple "continue" "continue processes" {
> > +    -re "Continuing.\[\r\n\]+" {
> > +	# Kill both processes at once.
> > +	remote_exec build "kill -9 ${testpid1} ${testpid2}"
> 
> Pedantically, "remote_exec target"

Done.
 
> > +# Find the current inferior's id.
> > +set current_inferior 1
> > +gdb_test_multiple "info inferiors" "find the current inf id" {
> > +    -re "\\* 1 .*$gdb_prompt $" {
> > +        set current_inferior 1
> > +	set other_inferior 2
> > +	pass $gdb_test_name
> > +    }
> > +    -re "\\* 2 .*$gdb_prompt $" {
> > +        set current_inferior 2
> > +	set other_inferior 1
> > +	pass $gdb_test_name
> > +    }
> 
> Watch out for tabs vs spaces above.

Fixed.
 
> Does multi-kill.exp really need to attach to processes instead of
> spawning them via GDB?  If we do the latter, then the testcase
> will run on more configurations.

No, it does not have to attach.  I revised the test and it can now
run with native-extended-gdbserver, too.

Thanks.
Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event
  2020-04-20 20:35       ` Aktemur, Tankut Baris
@ 2020-04-21 18:54         ` Pedro Alves
  2020-04-22 14:57           ` Aktemur, Tankut Baris
  0 siblings, 1 reply; 16+ messages in thread
From: Pedro Alves @ 2020-04-21 18:54 UTC (permalink / raw)
  To: Aktemur, Tankut Baris, gdb-patches

On 4/20/20 9:35 PM, Aktemur, Tankut Baris wrote:
> On Thursday, April 16, 2020 6:24 PM, Pedro Alves wrote:
>> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
>>> gdb/ChangeLog:
>>> 2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
>>>
>>> 	* remote.c (remote_target::update_thread_list): Do not delete
>>> 	a thread if it has a pending event.
>>> ---
>>>  gdb/remote.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/gdb/remote.c b/gdb/remote.c
>>> index bfbc0bc21d3..12ac7cb9862 100644
>>> --- a/gdb/remote.c
>>> +++ b/gdb/remote.c
>>> @@ -3821,6 +3821,9 @@ remote_target::update_thread_list ()
>>>  	  if (tp->inf->process_target () != this)
>>>  	    continue;
>>>
>>> +	  if (tp->suspend.waitstatus_pending_p)
>>> +	    continue;
>>> +
>>
>> This patch makes me nervous.  :-/
>>
>> Why doesn't prune_threads, via remote_target::thread_alive end
>> up removing the thread anyway?
> 
> As far as I see, prune_threads is called only if gdbserver does not
> send a thread list.  The comment in the function suggests that this
> could be the case for older servers.  I'm not sure how old it should
> be, though, for me to be able to test it.
>  
>> I think it'd be better to extend the check to also check for
>> TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED.  Let threads
>> with other kinds of events pending be deleted.
>>
>> Other targets are likely to need something like this, but I'm
>> OK with going with remote-specific test to start with, so we
>> can all this in, including the testcase, and then work on
>> something more generic if we find a need.
>>
>> In any case, please add some comments.
> 
> First I did this change, but when I started working on Patch #5, interesting
> things have come up.
> 
> Having both of the inferiors on a single extended-remote target is supposed
> to trigger the same stop-all-threads scenario that this series attempts to fix.
> But it turned out to be slightly different.
> 
> The 'update_thread_list' at the beginning of the stop_all_threads pass
> removes all the threads because the remote target sends an empty list of threads
> (well both inferiors exited and no thread remained).  So, before we even attempt
> to stop threads, they are already gone when the list is updated.  The problem is,
> inferior 2 (i.e. the inferior whose thread we would attempt to stop but would
> receive an EXITED waitstatus for) is then left in a seemingly live state (pid != 0),
> but with no threads.  Hence, we cannot even switch to it after inferior 1's
> exit event is shown.
> 
> Therefore I updated this patch as below.  Although no regression was detected,
> I don't feel much safe with this new version.  Comments welcome.

Hmm.  The current_inferior() check seems like a red flag, IMO.
Why is that needed?

How does it happen that all threads are deleted?  I've seen
this before, but I'll need to refresh my memory.  Let's see.

We have this in gdbserver/linux-low.cc:

/* Return true if LWP has exited already, and has a pending exit event
   to report to GDB.  */

static int
lwp_is_marked_dead (struct lwp_info *lwp)
{
  return (lwp->status_pending_p
	  && (WIFEXITED (lwp->status_pending)
	      || WIFSIGNALED (lwp->status_pending)));
}

/* Return true if the given thread is still alive.  */

bool
linux_process_target::thread_alive (ptid_t ptid)
{
  struct lwp_info *lwp = find_lwp_pid (ptid);

  /* We assume we always know if a thread exits.  If a whole process
     exited but we still haven't been able to report it to GDB, we'll
     hold on to the last lwp of the dead process.  */
  if (lwp != NULL)
    return !lwp_is_marked_dead (lwp);
  else
    return 0;
}

If gdbserver deleted all threads, it seems like that would
mean that the linux-low.cc backend has already reported the
pending exit event to gdbserver's core, and the process exit event
is thus guaranteed to already be in transit, somewhere, maybe
queued in gdbserver/server.cc's notif_stop queue, or maybe even
already in gdb, but queued in gdb/remote.c's stop_reply_queue queue,
or at whatever other level we queue events.

As I mentioned, I've ran into something like this before.
See infrun.c:handle_no_resumed where it reads:

  /* Note however that we may find no resumed thread because the whole
     process exited meanwhile (thus updating the thread list results
     in an empty thread list).  In this case we know we'll be getting
     a process exit event shortly.  */

Maybe we should make sure that stop_all_threads ends up in wait_one
if the inferior has a pid != 0, and no threads?

If we still need to keep one thread in the inferior's thread list,
it feels like that should be handled by common code, like e.g.,
from within delete_thread.  

But if we do that, then we don't need to change stop_all_threads.
And that bit I quoted in handle_no_resumed becomes dead, and
seems like could trick gdb into reporting a TARGET_WAITKIND_NO_RESUMED
to the user.  :-/

It also feels like we're missing a thread state in enum thread_state,
like a THREAD_ZOMBIE state, for a thread that has exited, but that
should still be listed, and only exists because it needs to be waited
on to consume the exit event.  We can probably do without it, though
I've been having this feeling for a long while.  Maybe
THREAD_EXITED + pending status would be the same thing.

Or maybe we should be able to select an inferior and resume it
even if it has no threads but still has pid != 0, which indicates
that the inferior has execution and thus has a pending exit wait
status to collect.  That would go along with storing that
pending exit waitstatus on a separate list, instead of storing
it on a thread.  That might be tougher to hack in.  Not sure it's
worth it if we can do without it.  All the continue/resume code
in infcmd.c/infrun.c assumes there's a current thread.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event
  2020-04-21 18:54         ` Pedro Alves
@ 2020-04-22 14:57           ` Aktemur, Tankut Baris
  0 siblings, 0 replies; 16+ messages in thread
From: Aktemur, Tankut Baris @ 2020-04-22 14:57 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On Tuesday, April 21, 2020 8:54 PM, Pedro Alves wrote:
> On 4/20/20 9:35 PM, Aktemur, Tankut Baris wrote:
> > On Thursday, April 16, 2020 6:24 PM, Pedro Alves wrote:
> >> On 4/6/20 4:45 PM, Tankut Baris Aktemur via Gdb-patches wrote:
> >>> gdb/ChangeLog:
> >>> 2020-04-06  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
> >>>
> >>> 	* remote.c (remote_target::update_thread_list): Do not delete
> >>> 	a thread if it has a pending event.
> >>> ---
> >>>  gdb/remote.c | 3 +++
> >>>  1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/gdb/remote.c b/gdb/remote.c
> >>> index bfbc0bc21d3..12ac7cb9862 100644
> >>> --- a/gdb/remote.c
> >>> +++ b/gdb/remote.c
> >>> @@ -3821,6 +3821,9 @@ remote_target::update_thread_list ()
> >>>  	  if (tp->inf->process_target () != this)
> >>>  	    continue;
> >>>
> >>> +	  if (tp->suspend.waitstatus_pending_p)
> >>> +	    continue;
> >>> +
> >>
> >> This patch makes me nervous.  :-/
> >>
> >> Why doesn't prune_threads, via remote_target::thread_alive end
> >> up removing the thread anyway?
> >
> > As far as I see, prune_threads is called only if gdbserver does not
> > send a thread list.  The comment in the function suggests that this
> > could be the case for older servers.  I'm not sure how old it should
> > be, though, for me to be able to test it.
> >
> >> I think it'd be better to extend the check to also check for
> >> TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED.  Let threads
> >> with other kinds of events pending be deleted.
> >>
> >> Other targets are likely to need something like this, but I'm
> >> OK with going with remote-specific test to start with, so we
> >> can all this in, including the testcase, and then work on
> >> something more generic if we find a need.
> >>
> >> In any case, please add some comments.
> >
> > First I did this change, but when I started working on Patch #5, interesting
> > things have come up.

So, I've converted this patch to your previous suggestion of also checking
TARGET_WAITKIND_EXITED/TARGET_WAITKIND_SIGNALLED.  Below are more details.
I'll submit the revised version shortly.

> >
> > Having both of the inferiors on a single extended-remote target is supposed
> > to trigger the same stop-all-threads scenario that this series attempts to fix.
> > But it turned out to be slightly different.
> >
> > The 'update_thread_list' at the beginning of the stop_all_threads pass
> > removes all the threads because the remote target sends an empty list of threads
> > (well both inferiors exited and no thread remained).  So, before we even attempt
> > to stop threads, they are already gone when the list is updated.  The problem is,
> > inferior 2 (i.e. the inferior whose thread we would attempt to stop but would
> > receive an EXITED waitstatus for) is then left in a seemingly live state (pid != 0),
> > but with no threads.  Hence, we cannot even switch to it after inferior 1's
> > exit event is shown.
> >
> > Therefore I updated this patch as below.  Although no regression was detected,
> > I don't feel much safe with this new version.  Comments welcome.
> 
> Hmm.  The current_inferior() check seems like a red flag, IMO.
> Why is that needed?

That was an attempt to constrain the potential side effects, but I've reverted
back to checking a pending exit event.
 
> How does it happen that all threads are deleted?  I've seen
> this before, but I'll need to refresh my memory.  Let's see.
> 
> We have this in gdbserver/linux-low.cc:
> 
> /* Return true if LWP has exited already, and has a pending exit event
>    to report to GDB.  */
> 
> static int
> lwp_is_marked_dead (struct lwp_info *lwp)
> {
>   return (lwp->status_pending_p
> 	  && (WIFEXITED (lwp->status_pending)
> 	      || WIFSIGNALED (lwp->status_pending)));
> }
> 
> /* Return true if the given thread is still alive.  */
> 
> bool
> linux_process_target::thread_alive (ptid_t ptid)
> {
>   struct lwp_info *lwp = find_lwp_pid (ptid);
> 
>   /* We assume we always know if a thread exits.  If a whole process
>      exited but we still haven't been able to report it to GDB, we'll
>      hold on to the last lwp of the dead process.  */
>   if (lwp != NULL)
>     return !lwp_is_marked_dead (lwp);
>   else
>     return 0;
> }
> 
> If gdbserver deleted all threads, it seems like that would
> mean that the linux-low.cc backend has already reported the
> pending exit event to gdbserver's core, and the process exit event
> is thus guaranteed to already be in transit, somewhere, maybe
> queued in gdbserver/server.cc's notif_stop queue, or maybe even
> already in gdb, but queued in gdb/remote.c's stop_reply_queue queue,
> or at whatever other level we queue events.

Based on the tests I made, I can see that gdbserver has sent two exit events
(one per each process) to GDB.  GDB is processing the first one.  So, yes,
the second must be somewhere in the queue.

> As I mentioned, I've ran into something like this before.
> See infrun.c:handle_no_resumed where it reads:
> 
>   /* Note however that we may find no resumed thread because the whole
>      process exited meanwhile (thus updating the thread list results
>      in an empty thread list).  In this case we know we'll be getting
>      a process exit event shortly.  */
> 
> Maybe we should make sure that stop_all_threads ends up in wait_one
> if the inferior has a pid != 0, and no threads?

This will be my proposal.  Here is the flow:

The exit event is somewhere in the event queue.  Thus, we end up with
an inferior with pid != 0 and no threads after update_thread_list.  We 
silently add a thread to this inferior in stop_all_threads, so that the
iteration over threads calls wait_one, picks the exit event, and attaches
it to the thread.

Now that the exit event has been picked from the event queue and attached to
a thread as a pending event, the check in remote.c's update_thread_list
does not delete it the next time we update threads (i.e. the next pass).

When "adding a new thread silently", there are two cases to consider:
delete_thread() may have not physically deleted the thread object, but only
marked is as THREAD_EXITED.  In this case, we "resurrect" the thread and make
it non-exited to be picked up by the loop that iterates over threads.
If it's gone completely, we simply add_thread_silently().

A side note:
update_thread_list() updates the list for the current target.  I think
it should have updated the list for all targets.  In v7, I've made this
change.  In fact, the different behavior that I had seen before between
using a single remote target vs. two remote targets was due to this.  When
using two targets, GDB was updating only the target of the first-exiting
inferior.  So, the initial call to update_thread_list was not deleting the
thread of the second inferior.  When using a single target, the thread of
the second inferior was being deleted, because it shares the same target with
the first inferior, and this had led to the pid != 0 and no threads case.
 
> If we still need to keep one thread in the inferior's thread list,
> it feels like that should be handled by common code, like e.g.,
> from within delete_thread.
> 
> But if we do that, then we don't need to change stop_all_threads.
> And that bit I quoted in handle_no_resumed becomes dead, and
> seems like could trick gdb into reporting a TARGET_WAITKIND_NO_RESUMED
> to the user.  :-/
> 
> It also feels like we're missing a thread state in enum thread_state,
> like a THREAD_ZOMBIE state, for a thread that has exited, but that
> should still be listed, and only exists because it needs to be waited
> on to consume the exit event.  We can probably do without it, though
> I've been having this feeling for a long while.  Maybe
> THREAD_EXITED + pending status would be the same thing.
> 
> Or maybe we should be able to select an inferior and resume it
> even if it has no threads but still has pid != 0, which indicates
> that the inferior has execution and thus has a pending exit wait
> status to collect.  That would go along with storing that
> pending exit waitstatus on a separate list, instead of storing
> it on a thread.  That might be tougher to hack in.  Not sure it's
> worth it if we can do without it.  All the continue/resume code
> in infcmd.c/infrun.c assumes there's a current thread.

I'm afraid introducing a per-inferior event list or the THREAD_ZOMBIE
state will be intrusive.  I hope v7 will fulfill the "if we can do without
it" case.

> 
> Thanks,
> Pedro Alves

Regards.
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-04-22 14:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-06 15:45 [PATCH v5 0/5] Handling already-exited threads in 'stop_all_threads' Tankut Baris Aktemur
     [not found] ` <cover.1586187408.git.tankut.baris.aktemur@intel.com>
2020-04-06 15:45   ` [PATCH v5 1/5] gdb: protect some 'regcache_read_pc' calls Tankut Baris Aktemur
2020-04-16 16:11     ` Pedro Alves
2020-04-20 20:13       ` Aktemur, Tankut Baris
2020-04-06 15:45   ` [PATCH v5 2/5] gdb/infrun: move a 'regcache_read_pc' call down to first use Tankut Baris Aktemur
2020-04-16 16:12     ` Pedro Alves
2020-04-06 15:45   ` [PATCH v5 3/5] gdb/remote: do not delete a thread if it has a pending event Tankut Baris Aktemur
2020-04-16 16:24     ` Pedro Alves
2020-04-20 20:35       ` Aktemur, Tankut Baris
2020-04-21 18:54         ` Pedro Alves
2020-04-22 14:57           ` Aktemur, Tankut Baris
2020-04-06 15:45   ` [PATCH v5 4/5] gdb/infrun: extract out a code piece into 'mark_non_executing_threads' function Tankut Baris Aktemur
2020-04-16 16:24     ` Pedro Alves
2020-04-06 15:45   ` [PATCH v5 5/5] gdb/infrun: handle already-exited threads when attempting to stop Tankut Baris Aktemur
2020-04-16 17:06     ` Pedro Alves
2020-04-20 20:43       ` Aktemur, Tankut Baris

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).