[FYI/pushed v4 00/25] Step over thread clone and thread exit

* [FYI/pushed v4 00/25] Step over thread clone and thread exit
@ 2023-11-13 15:04 Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 01/25] Add "maint info linux-lwps" command Pedro Alves
                   ` (25 more replies)
  0 siblings, 26 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches

Here's v4 of the series I previously posted here:

  [PATCH 00/31] Step over thread clone and thread exit
  https://inbox.sourceware.org/gdb-patches/20221212203101.1034916-1-pedro@palves.net/

(That was v3; I had forgotten to update the subject line back then.)

As I mentioned earlier today, I've addressed all the review comments
in the remaining patches in the v3 series, and since none required any
invasive/design change, I've gone ahead with merging this.  If there's
anything that doesn't look right, please don't hesitate to raise it
and I'll try to address it.

New in v4:

  - Patches 1-2, 29-30 from v3 were already merged a while ago.

  - Addressed Andrew's comments throughout.

  - The stepi-over-clone.exp testcase is now merged into the patch
    where the test first works on native GNU/Linux.

  - Patch "[PATCH 25/31] Ignore failure to read PC when resuming" from
    v3 was dropped, for now.

Here's the series description, updated for v4:

This is a new series that replaces two different series from a couple
of years ago.

The first is this series Simon and I wrote, here:

  [PATCH 00/10] Step over thread exit (PR gdb/27338)
  https://sourceware.org/pipermail/gdb-patches/2021-July/180567.html

The other is a series that coincidentally, back then, Andrew posted at
about the same time, and that addressed problems in kind of the mirror
scenario.  His patch series was about stepping over clone (creating
new threads), instead of stepping over thread exit:

  [PATCH 0/3] Stepping over clone syscall
  https://sourceware.org/pipermail/gdb-patches/2021-June/180517.html

My & Simon's solution back then involved adding a new contract between
GDB and GDBserver -- if a thread is single stepping, and it exits, the
server was supposed to report back the thread's exit to GDB.  One of
the motivations for this approach was to be able to control the
enablement of thread exit events per thread, to avoid creating
thread-exit event traffic unnecessarily, as done by
target_thread_events()/QThreadEvents.

Andrew's solution involves using the QThreadEvents mechanism, which
tells the server to report thread create and thread exit events for
all threads.  This would conflict with the desire to avoid unnecessary
traffic in the step over thread exit series.

The step over clone fixes back then also weren't yet fully complete,
as Andrew's series only addressed inline step overs.  Displaced
stepping over a clone syscall would still remain broken.

This new series fixes all of stepping over thread exit and clone, for
both displaced stepping and inline step overs.  It:

- Merges both Andrew's and my/Simon's series, and then reworks both
  parts in different ways.

- Introduces clone events at the GDB core and remote protocol level.

- Gets rid of the idea of "reporting thread exit if thread is
  single-stepping", and replaces it with a new mechanism GDB can use
  to explicitly enable thread clone and/or thread exit events, and
  other events in the future.  The old mechanism also only worked
  when the remote server supported hardware single-stepping.  This
  new approach has the advantage of also working on software
  single-step targets.

- Uses the new clone events to fix displaced stepping over clone
  syscalls too.

- Addresses an issue that Andrew alluded to in his series, and that
  coincidentally, we/AMD also ran into with AMDGPU debugging --
  currently, with "set scheduler-locking on", if you step over a
  function that spawns a thread, that thread runs free, for a bit at
  least, and then may stop or not, basically in an unspecified manner.

- Cancels next/step/until/etc. commands on thread exit event, like so:

     (gdb) n
     [Thread 0x7ffff7d89700 (LWP 3961883) exited]
     Command aborted, thread exited.
     (gdb)

- Non-trivial documentation changes have already been approved by Eli.
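
The explicit, per-thread enablement of clone and exit events described
in the list above can be pictured as a bit mask attached to each
thread.  This is only a minimal sketch, not the code in this series:
the option names, their values, and the should_report helper are
hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-thread option bits.  Names and values are
// illustrative assumptions, not GDB's definitions.
enum thread_option : std::uint32_t
{
  OPTION_CLONE = 1u << 0,  // report clone events for this thread
  OPTION_EXIT = 1u << 1,   // report exit events for this thread
};

enum class event_kind { cloned, exited };

// Return true if a thread whose enabled options are OPTIONS should
// have an event of kind KIND reported to the debugger core.
bool
should_report (std::uint32_t options, event_kind kind)
{
  switch (kind)
    {
    case event_kind::cloned:
      return (options & OPTION_CLONE) != 0;
    case event_kind::exited:
      return (options & OPTION_EXIT) != 0;
    }

  assert (false);  // not reached: all event kinds are handled above
  return false;
}
```

With a mask like this, a thread that has neither bit set produces no
extra event traffic, which is the property the older per-thread
approach was after.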

Tested on x86-64 Ubuntu 20.04, native and gdbserver.

Andrew Burgess (1):
  Add "maint info linux-lwps" command

Pedro Alves (23):
  gdb/linux: Delete all other LWPs immediately on ptrace exec event
  Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  Support clone events in the remote protocol
  Avoid duplicate QThreadEvents packets
  Thread options & clone events (core + remote)
  Thread options & clone events (native Linux)
  Thread options & clone events (Linux GDBserver)
  gdbserver: Hide and don't detach pending clone children
  Remove gdb/19675 kfails (displaced stepping + clone)
  all-stop/synchronous RSP support thread-exit events
  gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE
  Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core
  Introduce GDB_THREAD_OPTION_EXIT thread option, fix
    step-over-thread-exit
  Implement GDB_THREAD_OPTION_EXIT support for Linux GDBserver
  Implement GDB_THREAD_OPTION_EXIT support for native Linux
  gdb: clear step over information on thread exit (PR gdb/27338)
  stop_all_threads: (re-)enable async before waiting for stops
  gdbserver: Queue no-resumed event after thread exit
  Don't resume new threads if scheduler-locking is in effect
  Report thread exit event for leader if reporting thread exit events
  gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro
  Document remote clone events, and QThreadOptions packet
  Cancel execution command on thread exit, when stepping, nexting, etc.

Simon Marchi (1):
  Testcases for stepping over thread exit syscall (PR gdb/27338)

 gdb/NEWS                                      |  32 +
 gdb/displaced-stepping.c                      |   7 +
 gdb/doc/gdb.texinfo                           | 136 +++-
 gdb/gdbarch-gen.h                             |   6 +-
 gdb/gdbarch_components.py                     |   4 +
 gdb/gdbthread.h                               |  16 +
 gdb/infrun.c                                  | 588 +++++++++++++++---
 gdb/linux-nat.c                               | 383 +++++++-----
 gdb/linux-nat.h                               |   4 +
 gdb/remote.c                                  | 304 +++++++--
 gdb/target-debug.h                            |   2 +
 gdb/target-delegates.c                        |  52 ++
 gdb/target.c                                  |  16 +
 gdb/target.h                                  |  15 +
 gdb/target/target.c                           |  12 +
 gdb/target/target.h                           |  20 +
 gdb/target/waitstatus.c                       |   1 +
 gdb/target/waitstatus.h                       |  31 +-
 gdb/testsuite/gdb.base/step-over-syscall.exp  |  44 +-
 .../gdb.threads/schedlock-new-thread.c        |  54 ++
 .../gdb.threads/schedlock-new-thread.exp      |  67 ++
 ...-over-thread-exit-while-stop-all-threads.c |  77 +++
 ...ver-thread-exit-while-stop-all-threads.exp |  69 ++
 .../gdb.threads/step-over-thread-exit.c       |  52 ++
 .../gdb.threads/step-over-thread-exit.exp     | 155 +++++
 gdb/testsuite/gdb.threads/stepi-over-clone.c  |  90 +++
 .../gdb.threads/stepi-over-clone.exp          | 389 ++++++++++++
 .../gdb.threads/threads-after-exec.c          |  56 ++
 .../gdb.threads/threads-after-exec.exp        |  57 ++
 gdb/testsuite/lib/my-syscalls.S               |  54 +-
 gdb/testsuite/lib/my-syscalls.h               |   5 +
 gdb/thread.c                                  |  18 +
 gdbserver/gdbthread.h                         |   3 +
 gdbserver/linux-low.cc                        | 417 ++++++++-----
 gdbserver/linux-low.h                         |  78 ++-
 gdbserver/remote-utils.cc                     |  26 +-
 gdbserver/server.cc                           | 158 ++++-
 gdbserver/target.cc                           |  15 +-
 gdbserver/target.h                            |  28 +-
 39 files changed, 2994 insertions(+), 547 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/schedlock-new-thread.c
 create mode 100644 gdb/testsuite/gdb.threads/schedlock-new-thread.exp
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.c
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit.c
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit.exp
 create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.c
 create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.exp
 create mode 100644 gdb/testsuite/gdb.threads/threads-after-exec.c
 create mode 100644 gdb/testsuite/gdb.threads/threads-after-exec.exp


base-commit: 6b682bbf86f37982ce1d270fb47f363413490bda
-- 
2.34.1



* [FYI/pushed v4 01/25] Add "maint info linux-lwps" command
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 02/25] gdb/linux: Delete all other LWPs immediately on ptrace exec event Pedro Alves
                   ` (24 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Pedro Alves

From: Andrew Burgess <aburgess@redhat.com>

This adds a maintenance command that lets you list all the LWPs under
control of the linux-nat target.

For example:

 (gdb) maint info linux-lwps
 LWP Ptid        Thread ID
 560948.561047.0 None
 560948.560948.0 1.1

This shows that the "560948.561047.0" LWP doesn't map to any thread_info
object, which is bogus.  We'll be using this in a testcase in a
following patch.
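
The column layout in the example above can be sketched outside of GDB
as follows.  This is an illustrative model only: lwp_row, pad_right
and format_lwp_table are hypothetical names, and plain strings stand
in for GDB's ui_out table machinery.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one table row: the LWP's ptid as text,
// and the GDB thread ID it maps to (empty if there is none).
struct lwp_row
{
  std::string ptid;
  std::string thread_id;
};

// Pad S with spaces on the right, up to WIDTH characters.
static std::string
pad_right (const std::string &s, std::size_t width)
{
  return s.size () >= width ? s : s + std::string (width - s.size (), ' ');
}

// Render the table as text.  The first column starts at the width of
// its heading and grows to fit the widest ptid string.
std::string
format_lwp_table (const std::vector<lwp_row> &rows)
{
  if (rows.empty ())
    return "No Linux LWPs\n";

  const std::string heading = "LWP Ptid";
  std::size_t ptid_width = heading.size ();
  for (const lwp_row &row : rows)
    ptid_width = std::max (ptid_width, row.ptid.size ());

  std::string out = pad_right (heading, ptid_width) + " Thread ID\n";
  for (const lwp_row &row : rows)
    {
      // An LWP with no matching thread is shown as "None".
      const std::string id
        = row.thread_id.empty () ? std::string ("None") : row.thread_id;
      out += pad_right (row.ptid, ptid_width) + " " + id + "\n";
    }
  return out;
}
```

Feeding it the two rows from the example reproduces the same layout,
with the "None" entry marking the LWP that has no thread.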

Co-Authored-By: Pedro Alves <pedro@palves.net>
Change-Id: Ic4e9e123385976e5cd054391990124b7a20fb3f5
---
 gdb/NEWS            |  5 +++++
 gdb/doc/gdb.texinfo |  4 ++++
 gdb/linux-nat.c     | 46 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/gdb/NEWS b/gdb/NEWS
index 3851114a9f7..d85a13b64fe 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -24,6 +24,11 @@ disassemble
   ** New read/write attribute gdb.Value.bytes that contains a bytes
      object holding the contents of this value.
 
+* New commands
+
+maintenance info linux-lwps
+  List all LWPs under control of the linux-nat target.
+
 *** Changes in GDB 14
 
 * GDB now supports the AArch64 Scalable Matrix Extension 2 (SME2), which
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 2cd565ed5b4..4cbaaa6804f 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -41189,6 +41189,10 @@ module (@pxref{Disassembly In Python}), and will only be present after
 that module has been imported.  To force the module to be imported do
 the following:
 
+@kindex maint info linux-lwps
+@item maint info linux-lwps
+Print information about LWPs under control of the Linux native target.
+
 @smallexample
 (@value{GDBP}) python import gdb.disassembler
 @end smallexample
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 1c9756c18bd..f73e52f9617 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -4503,6 +4503,49 @@ current_lwp_ptid (void)
   return inferior_ptid;
 }
 
+/* Implement 'maintenance info linux-lwps'.  Displays some basic
+   information about all the current lwp_info objects.  */
+
+static void
+maintenance_info_lwps (const char *arg, int from_tty)
+{
+  if (all_lwps ().size () == 0)
+    {
+      gdb_printf ("No Linux LWPs\n");
+      return;
+    }
+
+  /* Start the width at 8 to match the column heading below, then
+     figure out the widest ptid string.  We'll use this to build our
+     output table below.  */
+  size_t ptid_width = 8;
+  for (lwp_info *lp : all_lwps ())
+    ptid_width = std::max (ptid_width, lp->ptid.to_string ().size ());
+
+  /* Setup the table headers.  */
+  struct ui_out *uiout = current_uiout;
+  ui_out_emit_table table_emitter (uiout, 2, -1, "linux-lwps");
+  uiout->table_header (ptid_width, ui_left, "lwp-ptid", _("LWP Ptid"));
+  uiout->table_header (9, ui_left, "thread-info", _("Thread ID"));
+  uiout->table_body ();
+
+  /* Display one table row for each lwp_info.  */
+  for (lwp_info *lp : all_lwps ())
+    {
+      ui_out_emit_tuple tuple_emitter (uiout, "lwp-entry");
+
+      thread_info *th = linux_target->find_thread (lp->ptid);
+
+      uiout->field_string ("lwp-ptid", lp->ptid.to_string ().c_str ());
+      if (th == nullptr)
+	uiout->field_string ("thread-info", "None");
+      else
+	uiout->field_string ("thread-info", print_full_thread_id (th));
+
+      uiout->message ("\n");
+    }
+}
+
 void _initialize_linux_nat ();
 void
 _initialize_linux_nat ()
@@ -4540,6 +4583,9 @@ Enables printf debugging output."),
   sigemptyset (&blocked_mask);
 
   lwp_lwpid_htab_create ();
+
+  add_cmd ("linux-lwps", class_maintenance, maintenance_info_lwps,
+	 _("List the Linux LWPS."), &maintenanceinfolist);
 }
 \f
 
-- 
2.34.1



* [FYI/pushed v4 02/25] gdb/linux: Delete all other LWPs immediately on ptrace exec event
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 01/25] Add "maint info linux-lwps" command Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

I noticed that on an Ubuntu 20.04 system, after a following patch
("Step over clone syscall w/ breakpoint,
TARGET_WAITKIND_THREAD_CLONED"), the gdb.threads/step-over-exec.exp
was passing cleanly, but still, we'd end up with four new unexpected
GDB core dumps:

		 === gdb Summary ===

 # of unexpected core files      4
 # of expected passes            48

Said patch makes the pre-existing
gdb.threads/step-over-exec.exp testcase (almost silently) expose a
latent problem in gdb/linux-nat.c, resulting in a GDB crash when:

 #1 - a non-leader thread execs
 #2 - the post-exec program stops somewhere
 #3 - you kill the inferior

Instead of #3 directly, the testcase just returns, which ends up in
gdb_exit, tearing down GDB, which kills the inferior, and is thus
equivalent to #3 above.

Viz. (after said patch is applied):

 $ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true
 ...
 (top-gdb) r
 ...
 (gdb) b main
 ...
 (gdb) r
 ...
 Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69
 69        argv0 = argv[0];
 (gdb) c
 Continuing.
 [New Thread 0x7ffff7d89700 (LWP 2506975)]
 Other going in exec.
 Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
 process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd

 Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28
 28        foo ();
 (gdb) k
 ...
 Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
 393         return m_suspend.waitstatus_pending_p;
 (top-gdb) bt
 #0  0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
 #1  0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345
 #2  0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564
 #3  0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284
 #4  0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278
 #5  0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247
 #6  0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864
 #7  0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590
 #8  0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911
 ...

The root of the problem is that when a non-leader LWP execs, it just
changes its tid to the tgid, replacing the pre-exec leader thread,
becoming the new leader.  There's no thread exit event for the execing
thread.  It's as if the old pre-exec LWP vanishes without trace.  The
ptrace man page says:

 "PTRACE_O_TRACEEXEC (since Linux 2.5.46)
	Stop the tracee at the next execve(2).  A waitpid(2) by the
	tracer will return a status value such that

	  status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))

	If the execing thread is not a thread group leader, the thread
	ID is reset to thread group leader's ID before this stop.
	Since Linux 3.0, the former thread ID can be retrieved with
	PTRACE_GETEVENTMSG."
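
The status comparison quoted from the man page can be written as a
small predicate.  This is a sketch for Linux only; is_exec_event and
make_exec_stop_status are hypothetical helper names, not functions
from GDB.

```cpp
#include <cassert>
#include <csignal>
#include <sys/ptrace.h>

// Return true if STATUS, as returned by waitpid for a traced thread,
// reports a PTRACE_EVENT_EXEC stop.  This is the comparison quoted
// from the ptrace man page.
bool
is_exec_event (int status)
{
  return (status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXEC << 8));
}

// Build a status word of the form reported for such a stop: the low
// byte 0x7f marks "stopped", the bits above it carry the signal and
// the event code.  Only useful for exercising is_exec_event.
int
make_exec_stop_status ()
{
  return ((SIGTRAP | (PTRACE_EVENT_EXEC << 8)) << 8) | 0x7f;
}
```

A plain SIGTRAP stop, with no event code in the upper bits, does not
satisfy the predicate.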

When the core of GDB processes an exec event, it deletes all the
threads of the inferior.  But that is too late -- deleting the thread
does not delete the corresponding LWP, so we end up leaving the
pre-exec non-leader LWP stale in the LWP list.  That's what leads to
the crash above -- linux_nat_target::kill iterates over all LWPs, and
after the patch in question, that code will look for the corresponding
thread_info for each LWP.  For the pre-exec non-leader LWP still
listed, it won't find one.

This patch fixes it, by deleting the pre-exec non-leader LWP (and
thread) from the LWP/thread lists as soon as we get an exec event out
of ptrace.
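
The effect of the fix can be modelled on a plain list of (pid, lwp)
pairs.  This is a simplified sketch: lwp_entry and
prune_lwps_after_exec are hypothetical, whereas the real change walks
GDB's own LWP list and calls exit_lwp on each stale entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, minimal model of an entry in the LWP list.
struct lwp_entry
{
  int pid;  // process (thread group) ID
  int lwp;  // kernel thread ID
};

// Drop every LWP of process PID except the one that reported the exec
// event, EVENT_LWP.  LWPs of other processes are left alone.  Returns
// the number of entries removed.
std::size_t
prune_lwps_after_exec (std::vector<lwp_entry> &lwps, int pid, int event_lwp)
{
  const std::size_t before = lwps.size ();

  auto stale = [=] (const lwp_entry &e)
    { return e.pid == pid && e.lwp != event_lwp; };
  lwps.erase (std::remove_if (lwps.begin (), lwps.end (), stale),
              lwps.end ());

  return before - lwps.size ();
}
```

After the call, the only LWP left for the exec'ing process is the one
the event was reported for, so a later walk over the list has no stale
entry without a matching thread.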

GDBserver does not need an equivalent fix, because it is already doing
this, as a side effect of mourning the pre-exec process, in
gdbserver/linux-low.cc:

  else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events)
    {
...
      /* Delete the execing process and all its threads.  */
      mourn (proc);
      switch_to_thread (nullptr);


The crash with gdb.threads/step-over-exec.exp is not observable on
newer systems, which postdate the glibc change to move "libpthread.so"
internals to "libc.so.6", because right after the exec, GDB traps a
load event for "libc.so.6", which leads to GDB trying to open
libthread_db for the post-exec inferior, and, on such systems that
succeeds.  When we load libthread_db, we call
linux_stop_and_wait_all_lwps, which, as the name suggests, stops all
lwps, and then waits to see their stops.  While doing this, GDB
detects that the pre-exec stale LWP is gone, and deletes it.

If we use "catch exec" to stop right at the exec before the
"libc.so.6" load event ever happens, and issue "kill" right there,
then GDB crashes on newer systems as well.  So instead of tweaking
gdb.threads/step-over-exec.exp to cover the fix, add a new
gdb.threads/threads-after-exec.exp testcase that uses "catch exec".
The test also uses the new "maint info linux-lwps" command if testing
on Linux native, which also exposes the stale LWP problem with an
unfixed GDB.

Also tweak a comment in infrun.c:follow_exec referring to how
linux-nat.c used to behave, as it would become stale otherwise.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I21ec18072c7750f3a972160ae6b9e46590376643
---
 gdb/infrun.c                                  |  8 +--
 gdb/linux-nat.c                               | 15 +++++
 .../gdb.threads/threads-after-exec.c          | 56 ++++++++++++++++++
 .../gdb.threads/threads-after-exec.exp        | 57 +++++++++++++++++++
 4 files changed, 131 insertions(+), 5 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/threads-after-exec.c
 create mode 100644 gdb/testsuite/gdb.threads/threads-after-exec.exp

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 4c7eb9be792..c60cfc07aa7 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1245,13 +1245,11 @@ follow_exec (ptid_t ptid, const char *exec_file_target)
      some other thread does the exec, and even if the main thread was
      stopped or already gone.  We may still have non-leader threads of
      the process on our list.  E.g., on targets that don't have thread
-     exit events (like remote); or on native Linux in non-stop mode if
-     there were only two threads in the inferior and the non-leader
-     one is the one that execs (and nothing forces an update of the
-     thread list up to here).  When debugging remotely, it's best to
+     exit events (like remote) and nothing forces an update of the
+     thread list up to here.  When debugging remotely, it's best to
      avoid extra traffic, when possible, so avoid syncing the thread
      list with the target, and instead go ahead and delete all threads
-     of the process but one that reported the event.  Note this must
+     of the process but the one that reported the event.  Note this must
      be done before calling update_breakpoints_after_exec, as
      otherwise clearing the threads' resources would reference stale
      thread breakpoints -- it may have been one of these threads that
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index f73e52f9617..97d80053c6f 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2001,6 +2001,21 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
 	 thread execs, it changes its tid to the tgid, and the old
 	 tgid thread might have not been resumed.  */
       lp->resumed = 1;
+
+      /* All other LWPs are gone now.  We'll have received a thread
+	 exit notification for all threads other than the execing one.
+	 That one, if it wasn't the leader, just silently changes its
+	 tid to the tgid, and the previous leader vanishes.  Since
+	 Linux 3.0, the former thread ID can be retrieved with
+	 PTRACE_GETEVENTMSG, but since we support older kernels, don't
+	 bother with it, and just walk the LWP list.  Even with
+	 PTRACE_GETEVENTMSG, we'd still need to lookup the
+	 corresponding LWP object, and it would be an extra ptrace
+	 syscall, so this way may even be more efficient.  */
+      for (lwp_info *other_lp : all_lwps_safe ())
+	if (other_lp != lp && other_lp->ptid.pid () == lp->ptid.pid ())
+	  exit_lwp (other_lp);
+
       return 0;
     }
 
diff --git a/gdb/testsuite/gdb.threads/threads-after-exec.c b/gdb/testsuite/gdb.threads/threads-after-exec.c
new file mode 100644
index 00000000000..b3ed7ec5f69
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/threads-after-exec.c
@@ -0,0 +1,56 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdio.h>
+
+static char *program_name;
+
+void *
+thread_execler (void *arg)
+{
+  /* Exec ourselves again, but with an extra argument, to avoid
+     infinite recursion.  */
+  if (execl (program_name, program_name, "1", NULL) == -1)
+    {
+      perror ("execl");
+      abort ();
+    }
+
+  return NULL;
+}
+
+int
+main (int argc, char **argv)
+{
+  pthread_t thread;
+
+  if (argc > 1)
+    {
+      /* Getting here via execl.  */
+      return 0;
+    }
+
+  program_name = argv[0];
+
+  pthread_create (&thread, NULL, thread_execler, NULL);
+  pthread_join (thread, NULL);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/threads-after-exec.exp b/gdb/testsuite/gdb.threads/threads-after-exec.exp
new file mode 100644
index 00000000000..cd8adf900d9
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/threads-after-exec.exp
@@ -0,0 +1,57 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that after an exec of a non-leader thread, we don't leave the
+# non-leader thread listed in internal thread lists, causing problems.
+
+standard_testfile .c
+
+proc do_test { } {
+    if [prepare_for_testing "failed to prepare" $::testfile $::srcfile {debug pthreads}] {
+	return -1
+    }
+
+    if ![runto_main] {
+	return
+    }
+
+    gdb_test "catch exec" "Catchpoint $::decimal \\(exec\\)"
+
+    gdb_test "continue" "Catchpoint $::decimal .*" "continue until exec"
+
+    # Confirm we only have one thread in the thread list.
+    gdb_test "info threads" "\\* 1\[ \t\]+\[^\r\n\]+.*"
+
+    if {[istarget *-*-linux*] && [gdb_is_target_native]} {
+	# Confirm there's only one LWP in the list as well, and that
+	# it is bound to thread 1.1.
+	set inf_pid [get_inferior_pid]
+	gdb_test_multiple "maint info linux-lwps" "" {
+	    -wrap -re "Thread ID *\r\n$inf_pid\.$inf_pid\.0\[ \t\]+1\.1 *" {
+		pass $gdb_test_name
+	    }
+	}
+    }
+
+    # Test that GDB is able to kill the inferior.  This used to crash
+    # on native Linux as GDB did not dispose of the pre-exec LWP for
+    # the non-leader (and that LWP did not have a matching thread in
+    # the core thread list).
+    gdb_test "with confirm off -- kill" \
+	"\\\[Inferior 1 (.*) killed\\\]" \
+	"kill inferior"
+}
+
+do_test
-- 
2.34.1



* [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 01/25] Add "maint info linux-lwps" command Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 02/25] gdb/linux: Delete all other LWPs immediately on ptrace exec event Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-14 12:55   ` Guinevere Larsen
  2023-11-13 15:04 ` [FYI/pushed v4 04/25] Support clone events in the remote protocol Pedro Alves
                   ` (22 subsequent siblings)
  25 siblings, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too.  The testcase is mostly Andrew's too.)

This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction.  One is
about displaced stepping, and the other about in-line stepping.

Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running.  With 'continue' this makes sense
(assuming no schedlock):

 - all-stop mode, user issues 'continue', all threads are set running,
   a newly created thread should also be set running.

 - non-stop mode, user issues 'continue', other pre-existing threads
   are not affected, but as the new thread is (sort-of) a child of the
   thread the user asked to run, it makes sense that the new thread
   should be created in the running state.

Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:

 - all-stop mode, user issues 'stepi', stepping will be done in place
   (as there's no breakpoint to step over).  While stepping the thread
   of interest all the other threads will be allowed to continue.  A
   newly created thread will be set running, and then stopped once the
   thread of interest has completed its step.

 - non-stop mode, user issues 'stepi', stepping will be done in place
   (as there's no breakpoint to step over).  Other threads might be
   running or stopped, but as with the continue case above, the new
   thread will be created running.  The only possible issue here is
   that the new thread will be left running after the initial thread
   has completed its stepi.  The user would need to manually select
   the thread and interrupt it, which might not be what the user
   expects.  However, this is not something this commit tries to
   change.

The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.

- For both all-stop and non-stop modes, with in-line stepping:

   + user issues 'stepi',
   + [non-stop mode only] GDB stops all threads.  In all-stop mode all
     threads are already stopped.
   + GDB removes s/w breakpoint at syscall address,
   + GDB single steps just the thread of interest, all other threads
     are left stopped,
   + New thread is created running,
   + Initial thread completes its step,
   + [non-stop mode only] GDB resumes all threads that it previously
     stopped.

There are two problems in the in-line stepping scenario above:

  1. The new thread might pass through the same code that the initial
     thread is in (i.e. the clone syscall code), in which case it will
     fail to hit the breakpoint in clone as this was removed so the
     first thread can single step,

  2. The new thread might trigger some other stop event before the
     initial thread reports its step completion.  If this happens we
     end up triggering an assertion as GDB assumes that only the
     thread being stepped should stop.  The assert looks like this:

     infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.

- For both all-stop and non-stop modes, with displaced stepping:

   + user issues 'stepi',
   + GDB starts the displaced step, moves thread's PC to the
     out-of-line scratch pad, maybe adjusts registers,
   + GDB single steps the thread of interest, [non-stop mode only] all
     other threads are left as they were, either running or stopped.
     In all-stop, all other threads are left stopped.
   + New thread is created running,
   + Initial thread completes its step, GDB re-adjusts its PC,
     restores/releases scratchpad,
   + [non-stop mode only] GDB resumes the thread, now past its
     breakpoint.
   + [all-stop mode only] GDB resumes all threads.

There is one problem with the displaced stepping scenario above:

  3. When the parent thread completed its step, GDB adjusted its PC,
     but did not adjust the child's PC, thus that new child thread
     will continue execution in the scratch pad, invoking undefined
     behavior.  If you're lucky, you see a crash.  If unlucky, the
     inferior gets silently corrupted.

What is needed is for GDB to have more control over whether the new
thread is created running or not.  Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted.  The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed.  Issue #3
above requires that GDB be informed when a thread clones itself, and
what the child's ptid is, so that GDB can fix up both the parent and
the child.

When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues.  The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB.  Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.

Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED).  However, those are reported
for the new/child thread.  That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3).  To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent.  TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child.  In fact, for some targets, e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".

The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event.  This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process.  Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is held stopped
until GDB explicitly resumes it.  This addresses the in-line stepping
case (issues #1 and #2).
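
As a rough illustration of the shape of the new event, here is a
standalone sketch of a wait status that carries a child ptid.  The
names (ptid, wait_kind, wait_status, is_new_child_kind) are simplified
stand-ins invented for this example, not GDB's real types.  Only the
idea that fork, vfork and clone statuses share one child-ptid accessor
is taken from the patch.

```cpp
#include <cassert>

/* Simplified stand-in for GDB's ptid_t: a process id plus an LWP id.  */
struct ptid
{
  int pid;
  long lwp;
};

/* Hypothetical, reduced set of wait kinds.  */
enum class wait_kind
{
  ignore,
  forked,
  vforked,
  thread_cloned,
  thread_created,
};

/* True if KIND describes an event that comes with a new child.  */
inline bool
is_new_child_kind (wait_kind kind)
{
  return (kind == wait_kind::forked
          || kind == wait_kind::vforked
          || kind == wait_kind::thread_cloned);
}

/* Minimal wait status.  The child ptid is only meaningful for the
   kinds accepted by is_new_child_kind.  */
class wait_status
{
public:
  wait_status &set_thread_cloned (ptid child)
  {
    m_kind = wait_kind::thread_cloned;
    m_child = child;
    return *this;
  }

  wait_status &set_thread_created ()
  {
    m_kind = wait_kind::thread_created;
    return *this;
  }

  wait_kind kind () const { return m_kind; }

  /* Asserts if the current kind has no child, as a guard against
     reading a stale value.  */
  ptid child_ptid () const
  {
    assert (is_new_child_kind (m_kind));
    return m_child;
  }

private:
  wait_kind m_kind = wait_kind::ignore;
  ptid m_child {0, 0};
};
```

In this model the event is reported for the parent, and the consumer
reads the child's ids from the status, which is the property that
thread-created events lack.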

The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONED, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).
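
The child fixup can be pictured with a small standalone model.  This
is only a sketch under simplifying assumptions: threads are plain
integers, a "regcache" is just a PC in a map, and the parent fixup is
a simple relocation from the scratch pad back to the original address
(real architecture fixups can do more than that).  pc_table and
finish_displaced_clone are hypothetical names, not GDB functions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using addr = std::uint64_t;

/* Hypothetical per-thread state: just a program counter, keyed by
   thread id.  */
using pc_table = std::map<int, addr>;

/* Model of finishing a displaced step over a clone syscall.  The
   parent executed a copy of the instruction at SCRATCH, so right
   after the clone both parent and child have a PC inside the scratch
   pad.  ORIGINAL is the address the instruction really lives at.  */
inline void
finish_displaced_clone (pc_table &pcs, int parent, int child,
                        addr original, addr scratch)
{
  /* Parent fixup: keep the distance it advanced, but relative to
     ORIGINAL instead of SCRATCH.  */
  addr advanced = pcs.at (parent) - scratch;
  pcs[parent] = original + advanced;

  /* Child fixup: it must resume where the parent resumes, not in the
     scratch pad.  */
  pcs[child] = pcs.at (parent);
}
```

For example, with the syscall instruction at 0x401000, a scratch pad
at 0x7000 and a two-byte instruction, both threads start at 0x7002 and
both end up at 0x401002 after the call.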

The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.
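
For readers less familiar with the ptrace side, the sketch below shows
how a clone event can be recognized in a waitpid status on Linux, and
how the child's ids relate to the parent's.  It assumes a Linux system
with <sys/ptrace.h>.  extended_event and child_ids are hypothetical
helper names for this example.  The pid/lwp convention follows what
the patch does when it builds the child ptid.

```cpp
#include <cassert>
#include <csignal>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <utility>

/* Extract the ptrace extended event code from a waitpid status.  For
   PTRACE_EVENT_* stops the kernel stores the code above the low 16
   bits of the status.  */
inline int
extended_event (int status)
{
  return status >> 16;
}

/* Return the (pid, lwp) pair of the new child.  A clone child is a
   new thread in the parent's process, so it shares the parent's pid.
   A fork/vfork child is the leader of a new process.  */
inline std::pair<int, long>
child_ids (int event, int parent_pid, long new_lwp)
{
  assert (event == PTRACE_EVENT_FORK
          || event == PTRACE_EVENT_VFORK
          || event == PTRACE_EVENT_CLONE);

  if (event == PTRACE_EVENT_CLONE)
    return {parent_pid, new_lwp};

  return {static_cast<int> (new_lwp), new_lwp};
}
```

A real tracer would get the status from waitpid after enabling
PTRACE_O_TRACECLONE, and the new LWP id from PTRACE_GETEVENTMSG; the
helpers above only cover the decoding step.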

Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.

Subsequent patches will add clone events support to the remote
protocol and gdbserver.

displaced_step_in_progress_thread becomes unused with this patch, but
a new use will reappear later in the series.  To avoid deleting it and
re-adding it, this patch marks it with attribute unused, and the later
patch removes the attribute again.  We need to do this because the
function is static, and with no callers, the compiler would warn
(error with -Werror), breaking the build.
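
A minimal standalone illustration of the attribute trick, assuming GCC
or Clang.  EXAMPLE_ATTRIBUTE_UNUSED and thread_is_stepping are
hypothetical names for this sketch, not the macro or function touched
by the patch.

```cpp
#include <cassert>

/* Hypothetical local macro.  GCC and Clang both accept this
   attribute spelling.  */
#define EXAMPLE_ATTRIBUTE_UNUSED __attribute__ ((unused))

/* A file-local helper that may have no callers in this translation
   unit.  Without the attribute, -Wunused-function (part of -Wall)
   would flag it, and -Werror would turn that into a build failure.  */
static bool
EXAMPLE_ATTRIBUTE_UNUSED
thread_is_stepping (int state)
{
  return state != 0;
}
```

The attribute only silences the warning; once a caller exists again
it can be dropped without any other change.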

This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop.  We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough.  The testcase passes on native GNU/Linux, but fails
against GDBserver.  GDBserver will be fixed by a later patch in the
series.

Co-authored-by: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
---
 gdb/infrun.c                                  | 158 +++----
 gdb/linux-nat.c                               | 252 +++++------
 gdb/linux-nat.h                               |   2 +
 gdb/target-delegates.c                        |  24 ++
 gdb/target.c                                  |   7 +
 gdb/target.h                                  |   7 +
 gdb/target/waitstatus.c                       |   1 +
 gdb/target/waitstatus.h                       |  31 +-
 gdb/testsuite/gdb.threads/stepi-over-clone.c  |  90 ++++
 .../gdb.threads/stepi-over-clone.exp          | 395 ++++++++++++++++++
 10 files changed, 775 insertions(+), 192 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.c
 create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.exp

diff --git a/gdb/infrun.c b/gdb/infrun.c
index c60cfc07aa7..e3157f86aff 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1606,6 +1606,7 @@ step_over_info_valid_p (void)
 /* Return true if THREAD is doing a displaced step.  */
 
 static bool
+ATTRIBUTE_UNUSED
 displaced_step_in_progress_thread (thread_info *thread)
 {
   gdb_assert (thread != nullptr);
@@ -1967,6 +1968,31 @@ static displaced_step_finish_status
 displaced_step_finish (thread_info *event_thread,
 		       const target_waitstatus &event_status)
 {
+  /* Check whether the parent is displaced stepping.  */
+  struct regcache *regcache = get_thread_regcache (event_thread);
+  struct gdbarch *gdbarch = regcache->arch ();
+  inferior *parent_inf = event_thread->inf;
+
+  /* If this was a fork/vfork/clone, this event indicates that the
+     displaced stepping of the syscall instruction has been done, so
+     we perform cleanup for parent here.  Also note that this
+     operation also cleans up the child for vfork, because their pages
+     are shared.  */
+
+  /* If this is a fork (child gets its own address space copy) and
+     some displaced step buffers were in use at the time of the fork,
+     restore the displaced step buffer bytes in the child process.
+
+     Architectures which support displaced stepping and fork events
+     must supply an implementation of
+     gdbarch_displaced_step_restore_all_in_ptid.  This is not enforced
+     during gdbarch validation to support architectures which support
+     displaced stepping but not forks.  */
+  if (event_status.kind () == TARGET_WAITKIND_FORKED
+      && gdbarch_supports_displaced_stepping (gdbarch))
+    gdbarch_displaced_step_restore_all_in_ptid
+      (gdbarch, parent_inf, event_status.child_ptid ());
+
   displaced_step_thread_state *displaced = &event_thread->displaced_step_state;
 
   /* Was this thread performing a displaced step?  */
@@ -1986,8 +2012,39 @@ displaced_step_finish (thread_info *event_thread,
 
   /* Do the fixup, and release the resources acquired to do the displaced
      step. */
-  return gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
-					event_thread, event_status);
+  displaced_step_finish_status status
+    = gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
+				     event_thread, event_status);
+
+  if (event_status.kind () == TARGET_WAITKIND_FORKED
+      || event_status.kind () == TARGET_WAITKIND_VFORKED
+      || event_status.kind () == TARGET_WAITKIND_THREAD_CLONED)
+    {
+      /* Since the vfork/fork/clone syscall instruction was executed
+	 in the scratchpad, the child's PC is also within the
+	 scratchpad.  Set the child's PC to the parent's PC value,
+	 which has already been fixed up.  Note: we use the parent's
+	 aspace here, although we're touching the child, because the
+	 child hasn't been added to the inferior list yet at this
+	 point.  */
+
+      struct regcache *child_regcache
+	= get_thread_arch_aspace_regcache (parent_inf,
+					   event_status.child_ptid (),
+					   gdbarch,
+					   parent_inf->aspace);
+      /* Read PC value of parent.  */
+      CORE_ADDR parent_pc = regcache_read_pc (regcache);
+
+      displaced_debug_printf ("write child pc from %s to %s",
+			      paddress (gdbarch,
+					regcache_read_pc (child_regcache)),
+			      paddress (gdbarch, parent_pc));
+
+      regcache_write_pc (child_regcache, parent_pc);
+    }
+
+  return status;
 }
 
 /* Data to be passed around while handling an event.  This data is
@@ -5854,67 +5911,13 @@ handle_inferior_event (struct execution_control_state *ecs)
 
     case TARGET_WAITKIND_FORKED:
     case TARGET_WAITKIND_VFORKED:
-      /* Check whether the inferior is displaced stepping.  */
-      {
-	struct regcache *regcache = get_thread_regcache (ecs->event_thread);
-	struct gdbarch *gdbarch = regcache->arch ();
-	inferior *parent_inf = find_inferior_ptid (ecs->target, ecs->ptid);
-
-	/* If this is a fork (child gets its own address space copy)
-	   and some displaced step buffers were in use at the time of
-	   the fork, restore the displaced step buffer bytes in the
-	   child process.
-
-	   Architectures which support displaced stepping and fork
-	   events must supply an implementation of
-	   gdbarch_displaced_step_restore_all_in_ptid.  This is not
-	   enforced during gdbarch validation to support architectures
-	   which support displaced stepping but not forks.  */
-	if (ecs->ws.kind () == TARGET_WAITKIND_FORKED
-	    && gdbarch_supports_displaced_stepping (gdbarch))
-	  gdbarch_displaced_step_restore_all_in_ptid
-	    (gdbarch, parent_inf, ecs->ws.child_ptid ());
-
-	/* If displaced stepping is supported, and thread ecs->ptid is
-	   displaced stepping.  */
-	if (displaced_step_in_progress_thread (ecs->event_thread))
-	  {
-	    struct regcache *child_regcache;
-	    CORE_ADDR parent_pc;
-
-	    /* GDB has got TARGET_WAITKIND_FORKED or TARGET_WAITKIND_VFORKED,
-	       indicating that the displaced stepping of syscall instruction
-	       has been done.  Perform cleanup for parent process here.  Note
-	       that this operation also cleans up the child process for vfork,
-	       because their pages are shared.  */
-	    displaced_step_finish (ecs->event_thread, ecs->ws);
-	    /* Start a new step-over in another thread if there's one
-	       that needs it.  */
-	    start_step_over ();
-
-	    /* Since the vfork/fork syscall instruction was executed in the scratchpad,
-	       the child's PC is also within the scratchpad.  Set the child's PC
-	       to the parent's PC value, which has already been fixed up.
-	       FIXME: we use the parent's aspace here, although we're touching
-	       the child, because the child hasn't been added to the inferior
-	       list yet at this point.  */
-
-	    child_regcache
-	      = get_thread_arch_aspace_regcache (parent_inf,
-						 ecs->ws.child_ptid (),
-						 gdbarch,
-						 parent_inf->aspace);
-	    /* Read PC value of parent process.  */
-	    parent_pc = regcache_read_pc (regcache);
-
-	    displaced_debug_printf ("write child pc from %s to %s",
-				    paddress (gdbarch,
-					      regcache_read_pc (child_regcache)),
-				    paddress (gdbarch, parent_pc));
-
-	    regcache_write_pc (child_regcache, parent_pc);
-	  }
-      }
+    case TARGET_WAITKIND_THREAD_CLONED:
+
+      displaced_step_finish (ecs->event_thread, ecs->ws);
+
+      /* Start a new step-over in another thread if there's one that
+	 needs it.  */
+      start_step_over ();
 
       context_switch (ecs);
 
@@ -5930,7 +5933,7 @@ handle_inferior_event (struct execution_control_state *ecs)
 	 need to unpatch at follow/detach time instead to be certain
 	 that new breakpoints added between catchpoint hit time and
 	 vfork follow are detached.  */
-      if (ecs->ws.kind () != TARGET_WAITKIND_VFORKED)
+      if (ecs->ws.kind () == TARGET_WAITKIND_FORKED)
 	{
 	  /* This won't actually modify the breakpoint list, but will
 	     physically remove the breakpoints from the child.  */
@@ -5962,14 +5965,24 @@ handle_inferior_event (struct execution_control_state *ecs)
       if (!bpstat_causes_stop (ecs->event_thread->control.stop_bpstat))
 	{
 	  bool follow_child
-	    = (follow_fork_mode_string == follow_fork_mode_child);
+	    = (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+	       && follow_fork_mode_string == follow_fork_mode_child);
 
 	  ecs->event_thread->set_stop_signal (GDB_SIGNAL_0);
 
 	  process_stratum_target *targ
 	    = ecs->event_thread->inf->process_target ();
 
-	  bool should_resume = follow_fork ();
+	  bool should_resume;
+	  if (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED)
+	    should_resume = follow_fork ();
+	  else
+	    {
+	      should_resume = true;
+	      inferior *inf = ecs->event_thread->inf;
+	      inf->top_target ()->follow_clone (ecs->ws.child_ptid ());
+	      ecs->event_thread->pending_follow.set_spurious ();
+	    }
 
 	  /* Note that one of these may be an invalid pointer,
 	     depending on detach_fork.  */
@@ -5980,16 +5993,21 @@ handle_inferior_event (struct execution_control_state *ecs)
 	     child is marked stopped.  */
 
 	  /* If not resuming the parent, mark it stopped.  */
-	  if (follow_child && !detach_fork && !non_stop && !sched_multi)
+	  if (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+	      && follow_child && !detach_fork && !non_stop && !sched_multi)
 	    parent->set_running (false);
 
 	  /* If resuming the child, mark it running.  */
-	  if (follow_child || (!detach_fork && (non_stop || sched_multi)))
+	  if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
+	      || (follow_child || (!detach_fork && (non_stop || sched_multi))))
 	    child->set_running (true);
 
 	  /* In non-stop mode, also resume the other branch.  */
-	  if (!detach_fork && (non_stop
-			       || (sched_multi && target_is_non_stop_p ())))
+	  if ((ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
+	       && target_is_non_stop_p ())
+	      || (!detach_fork && (non_stop
+				   || (sched_multi
+				       && target_is_non_stop_p ()))))
 	    {
 	      if (follow_child)
 		switch_to_thread (parent);
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 97d80053c6f..da870e84922 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1280,69 +1280,85 @@ get_detach_signal (struct lwp_info *lp)
   return 0;
 }
 
-/* Detach from LP.  If SIGNO_P is non-NULL, then it points to the
-   signal number that should be passed to the LWP when detaching.
-   Otherwise pass any pending signal the LWP may have, if any.  */
+/* If LP has a pending fork/vfork/clone status, return it.  */
 
-static void
-detach_one_lwp (struct lwp_info *lp, int *signo_p)
+static gdb::optional<target_waitstatus>
+get_pending_child_status (lwp_info *lp)
 {
   LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
 
   linux_nat_debug_printf ("lwp %s (stopped = %d)",
 			  lp->ptid.to_string ().c_str (), lp->stopped);
 
-  int lwpid = lp->ptid.lwp ();
-  int signo;
-
-  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
-
-  /* If the lwp/thread we are about to detach has a pending fork event,
-     there is a process GDB is attached to that the core of GDB doesn't know
-     about.  Detach from it.  */
-
   /* Check in lwp_info::status.  */
   if (WIFSTOPPED (lp->status) && linux_is_extended_waitstatus (lp->status))
     {
       int event = linux_ptrace_get_extended_event (lp->status);
 
-      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
+      if (event == PTRACE_EVENT_FORK
+	  || event == PTRACE_EVENT_VFORK
+	  || event == PTRACE_EVENT_CLONE)
 	{
 	  unsigned long child_pid;
 	  int ret = ptrace (PTRACE_GETEVENTMSG, lp->ptid.lwp (), 0, &child_pid);
 	  if (ret == 0)
-	    detach_one_pid (child_pid, 0);
+	    {
+	      target_waitstatus ws;
+
+	      if (event == PTRACE_EVENT_FORK)
+		ws.set_forked (ptid_t (child_pid, child_pid));
+	      else if (event == PTRACE_EVENT_VFORK)
+		ws.set_vforked (ptid_t (child_pid, child_pid));
+	      else if (event == PTRACE_EVENT_CLONE)
+		ws.set_thread_cloned (ptid_t (lp->ptid.pid (), child_pid));
+	      else
+		gdb_assert_not_reached ("unhandled");
+
+	      return ws;
+	    }
 	  else
-	    perror_warning_with_name (_("Failed to detach fork child"));
+	    {
+	      perror_warning_with_name (_("Failed to retrieve event msg"));
+	      return {};
+	    }
 	}
     }
 
   /* Check in lwp_info::waitstatus.  */
-  if (lp->waitstatus.kind () == TARGET_WAITKIND_VFORKED
-      || lp->waitstatus.kind () == TARGET_WAITKIND_FORKED)
-    detach_one_pid (lp->waitstatus.child_ptid ().pid (), 0);
-
+  if (is_new_child_status (lp->waitstatus.kind ()))
+    return lp->waitstatus;
 
-  /* Check in thread_info::pending_waitstatus.  */
   thread_info *tp = linux_target->find_thread (lp->ptid);
-  if (tp->has_pending_waitstatus ())
-    {
-      const target_waitstatus &ws = tp->pending_waitstatus ();
 
-      if (ws.kind () == TARGET_WAITKIND_VFORKED
-	  || ws.kind () == TARGET_WAITKIND_FORKED)
-	detach_one_pid (ws.child_ptid ().pid (), 0);
-    }
+  /* Check in thread_info::pending_waitstatus.  */
+  if (tp->has_pending_waitstatus ()
+      && is_new_child_status (tp->pending_waitstatus ().kind ()))
+    return tp->pending_waitstatus ();
 
   /* Check in thread_info::pending_follow.  */
-  if (tp->pending_follow.kind () == TARGET_WAITKIND_VFORKED
-      || tp->pending_follow.kind () == TARGET_WAITKIND_FORKED)
-    detach_one_pid (tp->pending_follow.child_ptid ().pid (), 0);
+  if (is_new_child_status (tp->pending_follow.kind ()))
+    return tp->pending_follow;
 
-  if (lp->status != 0)
-    linux_nat_debug_printf ("Pending %s for %s on detach.",
-			    strsignal (WSTOPSIG (lp->status)),
-			    lp->ptid.to_string ().c_str ());
+  return {};
+}
+
+/* Detach from LP.  If SIGNO_P is non-NULL, then it points to the
+   signal number that should be passed to the LWP when detaching.
+   Otherwise pass any pending signal the LWP may have, if any.  */
+
+static void
+detach_one_lwp (struct lwp_info *lp, int *signo_p)
+{
+  int lwpid = lp->ptid.lwp ();
+  int signo;
+
+  /* If the lwp/thread we are about to detach has a pending fork/clone
+     event, there is a process/thread GDB is attached to that the core
+     of GDB doesn't know about.  Detach from it.  */
+
+  gdb::optional<target_waitstatus> ws = get_pending_child_status (lp);
+  if (ws.has_value ())
+    detach_one_pid (ws->child_ptid ().lwp (), 0);
 
   /* If there is a pending SIGSTOP, get rid of it.  */
   if (lp->signalled)
@@ -1836,6 +1852,55 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
   return 1;
 }
 
+/* See target.h.  */
+
+void
+linux_nat_target::follow_clone (ptid_t child_ptid)
+{
+  lwp_info *new_lp = add_lwp (child_ptid);
+  new_lp->stopped = 1;
+
+  /* If the thread_db layer is active, let it record the user
+     level thread id and status, and add the thread to GDB's
+     list.  */
+  if (!thread_db_notice_clone (inferior_ptid, new_lp->ptid))
+    {
+      /* The process is not using thread_db.  Add the LWP to
+	 GDB's list.  */
+      add_thread (linux_target, new_lp->ptid);
+    }
+
+  /* We just created NEW_LP so it cannot yet contain STATUS.  */
+  gdb_assert (new_lp->status == 0);
+
+  if (!pull_pid_from_list (&stopped_pids, child_ptid.lwp (), &new_lp->status))
+    internal_error (_("no saved status for clone lwp"));
+
+  if (WSTOPSIG (new_lp->status) != SIGSTOP)
+    {
+      /* This can happen if someone starts sending signals to
+	 the new thread before it gets a chance to run, which
+	 have a lower number than SIGSTOP (e.g. SIGUSR1).
+	 This is an unlikely case, and harder to handle for
+	 fork / vfork than for clone, so we do not try - but
+	 we handle it for clone events here.  */
+
+      new_lp->signalled = 1;
+
+      /* Save the wait status to report later.  */
+      linux_nat_debug_printf
+	("waitpid of new LWP %ld, saving status %s",
+	 (long) new_lp->ptid.lwp (), status_to_str (new_lp->status).c_str ());
+    }
+  else
+    {
+      new_lp->status = 0;
+
+      if (report_thread_events)
+	new_lp->waitstatus.set_thread_created ();
+    }
+}
+
 /* Handle a GNU/Linux extended wait response.  If we see a clone
    event, we need to add the new LWP to our list (and not report the
    trap to higher layers).  This function returns non-zero if the
@@ -1876,11 +1941,9 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
 	    internal_error (_("wait returned unexpected status 0x%x"), status);
 	}
 
-      ptid_t child_ptid (new_pid, new_pid);
-
       if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
 	{
-	  open_proc_mem_file (child_ptid);
+	  open_proc_mem_file (ptid_t (new_pid, new_pid));
 
 	  /* The arch-specific native code may need to know about new
 	     forks even if those end up never mapped to an
@@ -1917,66 +1980,18 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
 	}
 
       if (event == PTRACE_EVENT_FORK)
-	ourstatus->set_forked (child_ptid);
+	ourstatus->set_forked (ptid_t (new_pid, new_pid));
       else if (event == PTRACE_EVENT_VFORK)
-	ourstatus->set_vforked (child_ptid);
+	ourstatus->set_vforked (ptid_t (new_pid, new_pid));
       else if (event == PTRACE_EVENT_CLONE)
 	{
-	  struct lwp_info *new_lp;
-
-	  ourstatus->set_ignore ();
-
 	  linux_nat_debug_printf
 	    ("Got clone event from LWP %d, new child is LWP %ld", pid, new_pid);
 
-	  new_lp = add_lwp (ptid_t (lp->ptid.pid (), new_pid));
-	  new_lp->stopped = 1;
-	  new_lp->resumed = 1;
-
-	  /* If the thread_db layer is active, let it record the user
-	     level thread id and status, and add the thread to GDB's
-	     list.  */
-	  if (!thread_db_notice_clone (lp->ptid, new_lp->ptid))
-	    {
-	      /* The process is not using thread_db.  Add the LWP to
-		 GDB's list.  */
-	      add_thread (linux_target, new_lp->ptid);
-	    }
-
-	  /* Even if we're stopping the thread for some reason
-	     internal to this module, from the perspective of infrun
-	     and the user/frontend, this new thread is running until
-	     it next reports a stop.  */
-	  set_running (linux_target, new_lp->ptid, true);
-	  set_executing (linux_target, new_lp->ptid, true);
-
-	  if (WSTOPSIG (status) != SIGSTOP)
-	    {
-	      /* This can happen if someone starts sending signals to
-		 the new thread before it gets a chance to run, which
-		 have a lower number than SIGSTOP (e.g. SIGUSR1).
-		 This is an unlikely case, and harder to handle for
-		 fork / vfork than for clone, so we do not try - but
-		 we handle it for clone events here.  */
-
-	      new_lp->signalled = 1;
+	  /* Save the status again, we'll use it in follow_clone.  */
+	  add_to_pid_list (&stopped_pids, new_pid, status);
 
-	      /* We created NEW_LP so it cannot yet contain STATUS.  */
-	      gdb_assert (new_lp->status == 0);
-
-	      /* Save the wait status to report later.  */
-	      linux_nat_debug_printf
-		("waitpid of new LWP %ld, saving status %s",
-		 (long) new_lp->ptid.lwp (), status_to_str (status).c_str ());
-	      new_lp->status = status;
-	    }
-	  else if (report_thread_events)
-	    {
-	      new_lp->waitstatus.set_thread_created ();
-	      new_lp->status = status;
-	    }
-
-	  return 1;
+	  ourstatus->set_thread_cloned (ptid_t (lp->ptid.pid (), new_pid));
 	}
 
       return 0;
@@ -3562,59 +3577,56 @@ kill_wait_callback (struct lwp_info *lp)
   return 0;
 }
 
-/* Kill the fork children of any threads of inferior INF that are
-   stopped at a fork event.  */
+/* Kill the fork/clone child of LP if it has an unfollowed child.  */
 
-static void
-kill_unfollowed_fork_children (struct inferior *inf)
+static int
+kill_unfollowed_child_callback (lwp_info *lp)
 {
-  for (thread_info *thread : inf->non_exited_threads ())
+  gdb::optional<target_waitstatus> ws = get_pending_child_status (lp);
+  if (ws.has_value ())
     {
-      struct target_waitstatus *ws = &thread->pending_follow;
-
-      if (ws->kind () == TARGET_WAITKIND_FORKED
-	  || ws->kind () == TARGET_WAITKIND_VFORKED)
-	{
-	  ptid_t child_ptid = ws->child_ptid ();
-	  int child_pid = child_ptid.pid ();
-	  int child_lwp = child_ptid.lwp ();
+      ptid_t child_ptid = ws->child_ptid ();
+      int child_pid = child_ptid.pid ();
+      int child_lwp = child_ptid.lwp ();
 
-	  kill_one_lwp (child_lwp);
-	  kill_wait_one_lwp (child_lwp);
+      kill_one_lwp (child_lwp);
+      kill_wait_one_lwp (child_lwp);
 
-	  /* Let the arch-specific native code know this process is
-	     gone.  */
-	  linux_target->low_forget_process (child_pid);
-	}
+      /* Let the arch-specific native code know this process is
+	 gone.  */
+      if (ws->kind () != TARGET_WAITKIND_THREAD_CLONED)
+	linux_target->low_forget_process (child_pid);
     }
+
+  return 0;
 }
 
 void
 linux_nat_target::kill ()
 {
-  /* If we're stopped while forking and we haven't followed yet,
-     kill the other task.  We need to do this first because the
+  ptid_t pid_ptid (inferior_ptid.pid ());
+
+  /* If we're stopped while forking/cloning and we haven't followed
+     yet, kill the child task.  We need to do this first because the
      parent will be sleeping if this is a vfork.  */
-  kill_unfollowed_fork_children (current_inferior ());
+  iterate_over_lwps (pid_ptid, kill_unfollowed_child_callback);
 
   if (forks_exist_p ())
     linux_fork_killall ();
   else
     {
-      ptid_t ptid = ptid_t (inferior_ptid.pid ());
-
       /* Stop all threads before killing them, since ptrace requires
 	 that the thread is stopped to successfully PTRACE_KILL.  */
-      iterate_over_lwps (ptid, stop_callback);
+      iterate_over_lwps (pid_ptid, stop_callback);
       /* ... and wait until all of them have reported back that
 	 they're no longer running.  */
-      iterate_over_lwps (ptid, stop_wait_callback);
+      iterate_over_lwps (pid_ptid, stop_wait_callback);
 
       /* Kill all LWP's ...  */
-      iterate_over_lwps (ptid, kill_callback);
+      iterate_over_lwps (pid_ptid, kill_callback);
 
       /* ... and wait until we've flushed all events.  */
-      iterate_over_lwps (ptid, kill_wait_callback);
+      iterate_over_lwps (pid_ptid, kill_wait_callback);
     }
 
   target_mourn_inferior (inferior_ptid);
diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h
index 770fe924427..1cdbeafd4f3 100644
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -129,6 +129,8 @@ class linux_nat_target : public inf_ptrace_target
 
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
 
+  void follow_clone (ptid_t) override;
+
   std::vector<static_tracepoint_marker>
     static_tracepoint_markers_by_strid (const char *id) override;
 
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index 580fc768dd1..eae96e2daba 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -76,6 +76,7 @@ struct dummy_target : public target_ops
   int insert_vfork_catchpoint (int arg0) override;
   int remove_vfork_catchpoint (int arg0) override;
   void follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bool arg3, bool arg4) override;
+  void follow_clone (ptid_t arg0) override;
   int insert_exec_catchpoint (int arg0) override;
   int remove_exec_catchpoint (int arg0) override;
   void follow_exec (inferior *arg0, ptid_t arg1, const char *arg2) override;
@@ -251,6 +252,7 @@ struct debug_target : public target_ops
   int insert_vfork_catchpoint (int arg0) override;
   int remove_vfork_catchpoint (int arg0) override;
   void follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bool arg3, bool arg4) override;
+  void follow_clone (ptid_t arg0) override;
   int insert_exec_catchpoint (int arg0) override;
   int remove_exec_catchpoint (int arg0) override;
   void follow_exec (inferior *arg0, ptid_t arg1, const char *arg2) override;
@@ -1547,6 +1549,28 @@ debug_target::follow_fork (inferior *arg0, ptid_t arg1, target_waitkind arg2, bo
   gdb_puts (")\n", gdb_stdlog);
 }
 
+void
+target_ops::follow_clone (ptid_t arg0)
+{
+  this->beneath ()->follow_clone (arg0);
+}
+
+void
+dummy_target::follow_clone (ptid_t arg0)
+{
+  default_follow_clone (this, arg0);
+}
+
+void
+debug_target::follow_clone (ptid_t arg0)
+{
+  gdb_printf (gdb_stdlog, "-> %s->follow_clone (...)\n", this->beneath ()->shortname ());
+  this->beneath ()->follow_clone (arg0);
+  gdb_printf (gdb_stdlog, "<- %s->follow_clone (", this->beneath ()->shortname ());
+  target_debug_print_ptid_t (arg0);
+  gdb_puts (")\n", gdb_stdlog);
+}
+
 int
 target_ops::insert_exec_catchpoint (int arg0)
 {
diff --git a/gdb/target.c b/gdb/target.c
index f688ff33e3b..bf82649ed98 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -2685,6 +2685,13 @@ default_follow_fork (struct target_ops *self, inferior *child_inf,
   internal_error (_("could not find a target to follow fork"));
 }
 
+static void
+default_follow_clone (struct target_ops *self, ptid_t child_ptid)
+{
+  /* Some target returned a clone event, but did not know how to follow it.  */
+  internal_error (_("could not find a target to follow clone"));
+}
+
 /* See target.h.  */
 
 void
diff --git a/gdb/target.h b/gdb/target.h
index 68b269fb3e6..d4d81e727e9 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -642,6 +642,13 @@ struct target_ops
       TARGET_DEFAULT_RETURN (1);
     virtual void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool)
       TARGET_DEFAULT_FUNC (default_follow_fork);
+
+    /* Add CHILD_PTID to the thread list, after handling a
+       TARGET_WAITKIND_THREAD_CLONED event for the clone parent.  The
+       parent is inferior_ptid.  */
+    virtual void follow_clone (ptid_t child_ptid)
+      TARGET_DEFAULT_FUNC (default_follow_clone);
+
     virtual int insert_exec_catchpoint (int)
       TARGET_DEFAULT_RETURN (1);
     virtual int remove_exec_catchpoint (int)
diff --git a/gdb/target/waitstatus.c b/gdb/target/waitstatus.c
index 2b8404fb75b..a8edbb17d60 100644
--- a/gdb/target/waitstatus.c
+++ b/gdb/target/waitstatus.c
@@ -45,6 +45,7 @@ DIAGNOSTIC_ERROR_SWITCH
 
     case TARGET_WAITKIND_FORKED:
     case TARGET_WAITKIND_VFORKED:
+    case TARGET_WAITKIND_THREAD_CLONED:
       return string_appendf (str, ", child_ptid = %s",
 			     this->child_ptid ().to_string ().c_str ());
 
diff --git a/gdb/target/waitstatus.h b/gdb/target/waitstatus.h
index 4d23f1cbff4..3d3a0cf9d02 100644
--- a/gdb/target/waitstatus.h
+++ b/gdb/target/waitstatus.h
@@ -95,6 +95,13 @@ enum target_waitkind
   /* There are no resumed children left in the program.  */
   TARGET_WAITKIND_NO_RESUMED,
 
+  /* The thread was cloned.  The event's ptid corresponds to the
+     cloned parent.  The cloned child is held stopped at its entry
+     point, and its ptid is in the event's m_child_ptid.  The target
+     must not add the cloned child to GDB's thread list until
+     target_ops::follow_clone() is called.  */
+  TARGET_WAITKIND_THREAD_CLONED,
+
   /* The thread was created.  */
   TARGET_WAITKIND_THREAD_CREATED,
 
@@ -102,6 +109,17 @@ enum target_waitkind
   TARGET_WAITKIND_THREAD_EXITED,
 };
 
+/* Determine if KIND represents an event with a new child - a fork,
+   vfork, or clone.  */
+
+static inline bool
+is_new_child_status (target_waitkind kind)
+{
+  return (kind == TARGET_WAITKIND_FORKED
+	  || kind == TARGET_WAITKIND_VFORKED
+	  || kind == TARGET_WAITKIND_THREAD_CLONED);
+}
+
 /* Return KIND as a string.  */
 
 static inline const char *
@@ -125,6 +143,8 @@ DIAGNOSTIC_ERROR_SWITCH
       return "FORKED";
     case TARGET_WAITKIND_VFORKED:
       return "VFORKED";
+    case TARGET_WAITKIND_THREAD_CLONED:
+      return "THREAD_CLONED";
     case TARGET_WAITKIND_EXECD:
       return "EXECD";
     case TARGET_WAITKIND_VFORK_DONE:
@@ -325,6 +345,14 @@ struct target_waitstatus
     return *this;
   }
 
+  target_waitstatus &set_thread_cloned (ptid_t child_ptid)
+  {
+    this->reset ();
+    m_kind = TARGET_WAITKIND_THREAD_CLONED;
+    m_value.child_ptid = child_ptid;
+    return *this;
+  }
+
   target_waitstatus &set_thread_created ()
   {
     this->reset ();
@@ -369,8 +397,7 @@ struct target_waitstatus
 
   ptid_t child_ptid () const
   {
-    gdb_assert (m_kind == TARGET_WAITKIND_FORKED
-		|| m_kind == TARGET_WAITKIND_VFORKED);
+    gdb_assert (is_new_child_status (m_kind));
     return m_value.child_ptid;
   }
 
diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.c b/gdb/testsuite/gdb.threads/stepi-over-clone.c
new file mode 100644
index 00000000000..12909161c4c
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/stepi-over-clone.c
@@ -0,0 +1,90 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021-2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <pthread.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdlib.h>
+
+/* Set this to non-zero from GDB to start a third worker thread.  */
+volatile int start_third_thread = 0;
+
+void *
+thread_worker_2 (void *arg)
+{
+  int i;
+
+  printf ("Hello from the third thread.\n");
+  fflush (stdout);
+
+  for (i = 0; i < 300; ++i)
+    sleep (1);
+
+  return NULL;
+}
+
+void *
+thread_worker_1 (void *arg)
+{
+  int i;
+  pthread_t thr;
+  void *val;
+
+  if (start_third_thread)
+    pthread_create (&thr, NULL, thread_worker_2, NULL);
+
+  printf ("Hello from the first thread.\n");
+  fflush (stdout);
+
+  for (i = 0; i < 300; ++i)
+    sleep (1);
+
+  if (start_third_thread)
+    pthread_join (thr, &val);
+
+  return NULL;
+}
+
+void *
+thread_idle_loop (void *arg)
+{
+  int i;
+
+  for (i = 0; i < 300; ++i)
+    sleep (1);
+
+  return NULL;
+}
+
+int
+main ()
+{
+  pthread_t thr, thr_idle;
+  void *val;
+
+  if (getenv ("MAKE_EXTRA_THREAD") != NULL)
+    pthread_create (&thr_idle, NULL, thread_idle_loop, NULL);
+
+  pthread_create (&thr, NULL, thread_worker_1, NULL);
+  pthread_join (thr, &val);
+
+  if (getenv ("MAKE_EXTRA_THREAD") != NULL)
+    pthread_join (thr_idle, &val);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
new file mode 100644
index 00000000000..e580f2248ac
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
@@ -0,0 +1,395 @@
+# Copyright 2021-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test performing a 'stepi' over a clone syscall instruction.
+
+# This test relies on us being able to spot syscall instructions in
+# disassembly output.  For now this is only implemented for x86-64.
+require {istarget x86_64-*-*}
+
+# Test only on native targets, for now.
+proc is_native_target {} {
+    return [expr {[target_info gdb_protocol] == ""}]
+}
+require is_native_target
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile \
+	  {debug pthreads additional_flags=-static}] } {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+# Arrange to catch the 'clone' syscall, run until we catch the
+# syscall, and try to figure out the address of the actual syscall
+# instruction so we can place a breakpoint at this address.
+
+gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
+    -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" {
+	unsupported $gdb_test_name
+	return
+    }
+    -re ".*$gdb_prompt $" {
+	pass $gdb_test_name
+    }
+}
+
+gdb_test "continue" \
+    "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*"
+
+# Return true if INSN is a syscall instruction.
+
+proc is_syscall_insn { insn } {
+    if [istarget x86_64-*-* ] {
+	return { $insn == "syscall" }
+    } else {
+	error "port me"
+    }
+}
+
+# A list of addresses with syscall instructions.
+set syscall_addrs {}
+
+# Get list of addresses with syscall instructions.
+gdb_test_multiple "disassemble" "" {
+    -re "Dump of assembler code for function \[^\r\n\]+:\r\n" {
+	exp_continue
+    }
+    -re "^(?:=>)?\\s+(${hex})\\s+<\\+${decimal}>:\\s+(\[^\r\n\]+)\r\n" {
+	set addr $expect_out(1,string)
+	set insn [string trim $expect_out(2,string)]
+	if [is_syscall_insn $insn] {
+	    verbose -log "Found a syscall at: $addr"
+	    lappend syscall_addrs $addr
+	}
+	exp_continue
+    }
+    -re "^End of assembler dump\\.\r\n$gdb_prompt $" {
+	if { [llength $syscall_addrs] == 0 } {
+	    unsupported "no syscalls found"
+	    return -1
+	}
+    }
+}
+
+# The test proc.  NON_STOP and DISPLACED are either 'on' or 'off', and are
+# used to configure how GDB starts up.  THIRD_THREAD is either true or false,
+# and is used to configure the inferior.
+proc test {non_stop displaced third_thread} {
+    global binfile srcfile
+    global syscall_addrs
+    global GDBFLAGS
+    global gdb_prompt hex decimal
+
+    for { set i 0 } { $i < 3 } { incr i } {
+	with_test_prefix "i=$i" {
+
+	    # Arrange to start GDB in the correct mode.
+	    save_vars { GDBFLAGS } {
+		append GDBFLAGS " -ex \"set non-stop $non_stop\""
+		append GDBFLAGS " -ex \"set displaced $displaced\""
+		clean_restart $binfile
+	    }
+
+	    runto_main
+
+	    # Setup breakpoints at all the syscall instructions we
+	    # might hit.  Only issue one pass/fail to make tests more
+	    # comparable between systems.
+	    set test "break at syscall insns"
+	    foreach addr $syscall_addrs {
+		if {[gdb_test -nopass "break *$addr" \
+			 ".*" \
+			 $test] != 0} {
+		    return
+		}
+	    }
+	    # If we got here, all breakpoints were set successfully.
+	    # We used -nopass above, so issue a pass now.
+	    pass $test
+
+	    # Continue until we hit the syscall.
+	    gdb_test "continue"
+
+	    if { $third_thread } {
+		gdb_test_no_output "set start_third_thread=1"
+	    }
+
+	    set stepi_error_count 0
+	    set stepi_new_thread_count 0
+	    set thread_1_stopped false
+	    set thread_2_stopped false
+	    set seen_prompt false
+	    set hello_first_thread false
+
+	    # The program is now stopped at main, but if testing
+	    # against GDBserver, inferior_spawn_id is GDBserver's
+	    # spawn_id, and the GDBserver output emitted before the
+	    # program stopped isn't flushed unless we explicitly do
+	    # so, because it is on a different spawn_id.  We could try
+	    # flushing it now, to avoid confusing the following tests,
+	    # but that would have to be done under a timeout, and
+	    # would thus slow down the testcase.  Instead, if inferior
+	    # output goes to a different spawn id, then we don't need
+	    # to wait for the first message from the inferior with an
+	    # anchor, as we know consuming inferior output won't
+	    # consume GDB output.  OTOH, if inferior output is coming
+	    # out on GDB's terminal, then we must use an anchor,
+	    # otherwise matching inferior output without one could
+	    # consume GDB output that we are waiting for in regular
+	    # expressions that are written after the inferior output
+	    # regular expression match.
+	    if {$::inferior_spawn_id != $::gdb_spawn_id} {
+		set anchor ""
+	    } else {
+		set anchor "^"
+	    }
+
+	    gdb_test_multiple "stepi" "" {
+		-re "^stepi\r\n" {
+		    verbose -log "XXX: Consume the initial command"
+		    exp_continue
+		}
+		-re "^\\\[New Thread\[^\r\n\]+\\\]\r\n" {
+		    verbose -log "XXX: Consume new thread line"
+		    incr stepi_new_thread_count
+		    exp_continue
+		}
+		-re "^\\\[Switching to Thread\[^\r\n\]+\\\]\r\n" {
+		    verbose -log "XXX: Consume switching to thread line"
+		    exp_continue
+		}
+		-re "^\\s*\r\n" {
+		    verbose -log "XXX: Consume blank line"
+		    exp_continue
+		}
+
+		-i $::inferior_spawn_id
+
+		-re "${anchor}Hello from the first thread\\.\r\n" {
+		    set hello_first_thread true
+
+		    verbose -log "XXX: Consume first worker thread message"
+		    if { $third_thread } {
+			# If we are going to start a third thread then GDB
+			# should hit the breakpoint in clone before printing
+			# this message.
+			incr stepi_error_count
+		    }
+		    if { !$seen_prompt } {
+			exp_continue
+		    }
+		}
+		-re "^Hello from the third thread\\.\r\n" {
+		    # We should never see this message.
+		    verbose -log "XXX: Consume third worker thread message"
+		    incr stepi_error_count
+		    if { !$seen_prompt } {
+			exp_continue
+		    }
+		}
+
+		-i $::gdb_spawn_id
+
+		-re "^$hex in clone\[23\]? \\(\\)\r\n" {
+		    verbose -log "XXX: Consume stop location line"
+		    set thread_1_stopped true
+		    if { !$seen_prompt } {
+			verbose -log "XXX: Continuing to look for the prompt"
+			exp_continue
+		    }
+		}
+		-re "^$gdb_prompt " {
+		    verbose -log "XXX: Consume the final prompt"
+		    gdb_assert { $stepi_error_count == 0 }
+		    gdb_assert { $stepi_new_thread_count == 1 }
+		    set seen_prompt true
+		    if { $third_thread } {
+			if { $non_stop } {
+			    # In non-stop mode if we are trying to start a
+			    # third thread (from the second thread), then the
+			    # second thread should hit the breakpoint in clone
+			    # before actually starting the third thread.  And
+			    # so, at this point both thread 1, and thread 2
+			    # should now be stopped.
+			    if { !$thread_1_stopped || !$thread_2_stopped } {
+				verbose -log "XXX: Continue looking for an additional stop event"
+				exp_continue
+			    }
+			} else {
+			    # All stop mode.  Something should have stopped
+			    # by now otherwise we shouldn't have a prompt, but
+			    # we can't know which thread will have stopped as
+			    # that is a race condition.
+			    gdb_assert { $thread_1_stopped || $thread_2_stopped }
+			}
+		    }
+
+		    if {$non_stop && !$hello_first_thread} {
+			exp_continue
+		    }
+
+		}
+		-re "^Thread 2\[^\r\n\]+ hit Breakpoint $decimal, $hex in clone\[23\]? \\(\\)\r\n" {
+		    verbose -log "XXX: Consume thread 2 hit breakpoint"
+		    set thread_2_stopped true
+		    if { !$seen_prompt } {
+			verbose -log "XXX: Continuing to look for the prompt"
+			exp_continue
+		    }
+		}
+		-re "^PC register is not available\r\n" {
+		    # This is the error we'd see for remote targets.
+		    verbose -log "XXX: Consume error line"
+		    incr stepi_error_count
+		    exp_continue
+		}
+		-re "^Couldn't get registers: No such process\\.\r\n" {
+		    # This is the error we'd see for native Linux
+		    # targets.
+		    verbose -log "XXX: Consume error line"
+		    incr stepi_error_count
+		    exp_continue
+		}
+	    }
+
+	    # Ensure we are back at a GDB prompt, resynchronise.
+	    verbose -log "XXX: Have completed scanning the 'stepi' output"
+	    gdb_test "p 1 + 2 + 3" " = 6"
+
+	    # Check the number of threads we have, it should be exactly two.
+	    set thread_count 0
+	    set bad_threads 0
+
+	    # Build up our expectations for what the current thread state
+	    # should be.  Thread 1 is the easiest, this is the thread we are
+	    # stepping, so this thread should always be stopped, and should
+	    # always still be in clone.
+	    set match_code {}
+	    lappend match_code {
+		-re "\\*?\\s+1\\s+Thread\[^\r\n\]+clone\[23\]? \\(\\)\r\n" {
+		    incr thread_count
+		    exp_continue
+		}
+	    }
+
+	    # What state should thread 2 be in?
+	    if { $non_stop == "on" } {
+		if { $third_thread } {
+		    # With non-stop mode on, and creation of a third thread
+		    # having been requested, we expect Thread 2 to exist, and
+		    # be stopped at the breakpoint in clone (just before the
+		    # third thread is actually created).
+		    lappend match_code {
+			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+$hex in clone\[23\]? \\(\\)\r\n" {
+			    incr thread_count
+			    exp_continue
+			}
+			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
+			    incr thread_count
+			    incr bad_threads
+			    exp_continue
+			}
+			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
+			    verbose -log "XXX: thread 2 is bad, unknown state"
+			    incr thread_count
+			    incr bad_threads
+			    exp_continue
+			}
+		    }
+
+		} else {
+		    # With non-stop mode on, and no third thread having been
+		    # requested, then we expect Thread 2 to exist, and still
+		    # be running.
+		    lappend match_code {
+			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
+			    incr thread_count
+			    exp_continue
+			}
+			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
+			    verbose -log "XXX: thread 2 is bad, unknown state"
+			    incr thread_count
+			    incr bad_threads
+			    exp_continue
+			}
+		    }
+		}
+	    } else {
+		# With non-stop mode off then we expect Thread 2 to exist, and
+		# be stopped.  We don't have any guarantee about where the
+		# thread will have stopped though, so we need to be vague.
+		lappend match_code {
+		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
+			verbose -log "XXX: thread 2 is bad, unexpectedly running"
+			incr thread_count
+			incr bad_threads
+			exp_continue
+		    }
+		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+_start\[^\r\n\]+\r\n" {
+			# We know that the thread shouldn't be stopped
+			# at _start, though.  This is the location of
+			# the scratch pad on Linux at the time of
+			# writing.
+			verbose -log "XXX: thread 2 is bad, stuck in scratchpad"
+			incr thread_count
+			incr bad_threads
+			exp_continue
+		    }
+		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
+			incr thread_count
+			exp_continue
+		    }
+		}
+	    }
+
+	    # We don't expect to ever see a thread 3.  Even when we are
+	    # requesting that this third thread be created, thread 2, the
+	    # thread that creates thread 3, should stop before executing the
+	    # clone syscall.  So, if we do ever see this then something has
+	    # gone wrong.
+	    lappend match_code {
+		-re "\\s+3\\s+Thread\[^\r\n\]+\r\n" {
+		    incr thread_count
+		    incr bad_threads
+		    exp_continue
+		}
+	    }
+
+	    lappend match_code {
+		-re "$gdb_prompt $" {
+		    gdb_assert { $thread_count == 2 }
+		    gdb_assert { $bad_threads == 0 }
+		}
+	    }
+
+	    set match_code [join $match_code]
+	    gdb_test_multiple "info threads" "" $match_code
+	}
+    }
+}
+
+# Run the test in all suitable configurations.
+foreach_with_prefix third_thread { false true } {
+    foreach_with_prefix non-stop { "on" "off" } {
+	foreach_with_prefix displaced { "off" "on" } {
+	    test ${non-stop} ${displaced} ${third_thread}
+	}
+    }
+}
-- 
2.34.1



* [FYI/pushed v4 04/25] Support clone events in the remote protocol
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (2 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 05/25] Avoid duplicate QThreadEvents packets Pedro Alves
                   ` (21 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

The previous patch taught GDB about a new
TARGET_WAITKIND_THREAD_CLONED event kind, and made the Linux target
report clone events.

A following patch will teach Linux GDBserver to do the same thing.

But before we get there, we need to teach the remote protocol about
TARGET_WAITKIND_THREAD_CLONED.  That's what this patch does.  Clone is
very similar to vfork and fork, and the new stop reply is handled in
much the same way.  The stub reports "T05clone:...".
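
As a rough illustration of the stop reply shape, here is a minimal
standalone sketch that mirrors the "T%02x%s:" formatting used in
prepare_resume_reply.  The waitkind enum and make_stop_reply_prefix
are hypothetical names, not GDBserver code, and the "pPID.TID;" child
thread-id encoding (hex, multi-process syntax) is an assumption based
on how fork/vfork replies are usually written.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-ins for the relevant target_waitkind values.
enum class waitkind { forked, vforked, thread_cloned };

// Map an event kind to the name used in the stop reply.
static const char *
kind_remote_str (waitkind kind)
{
  switch (kind)
    {
    case waitkind::forked:
      return "fork";
    case waitkind::vforked:
      return "vfork";
    case waitkind::thread_cloned:
      return "clone";
    }
  return "";
}

// Build the leading part of a stop reply: 'T', the signal number as
// two hex digits, the event name, ':' and the child's thread id.
// The "pPID.TID;" layout is assumed, not taken from this patch.
static std::string
make_stop_reply_prefix (waitkind kind, int signal, unsigned pid, unsigned tid)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "T%02x%s:p%x.%x;", signal,
		 kind_remote_str (kind), pid, tid);
  return buf;
}
```

With signal 5 (GDB_SIGNAL_TRAP) and a child thread 0x2346 in process
0x1000, make_stop_reply_prefix (waitkind::thread_cloned, 5, 0x1000,
0x2346) yields "T05clone:p1000.2346;".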

GDBserver core is taught to handle TARGET_WAITKIND_THREAD_CLONED and
forward it to GDB in this patch, but no backend actually emits it yet.
That will be done in a following patch.

Documentation for this new remote protocol feature is included in a
documentation patch later in the series.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: If271f20320d864f074d8ac0d531cc1a323da847f
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
---
 gdb/remote.c              | 98 ++++++++++++++++++++++++++-------------
 gdbserver/remote-utils.cc | 26 +++++++++--
 gdbserver/server.cc       |  3 +-
 3 files changed, 89 insertions(+), 38 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index 69271048da2..8c7979ce6a6 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -1019,6 +1019,7 @@ class remote_target : public process_stratum_target
   const struct btrace_config *btrace_conf (const struct btrace_target_info *) override;
   bool augmented_libraries_svr4_read () override;
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
+  void follow_clone (ptid_t child_ptid) override;
   void follow_exec (inferior *, ptid_t, const char *) override;
   int insert_fork_catchpoint (int) override;
   int remove_fork_catchpoint (int) override;
@@ -1112,7 +1113,7 @@ class remote_target : public process_stratum_target
 
   void remote_btrace_maybe_reopen ();
 
-  void remove_new_fork_children (threads_listing_context *context);
+  void remove_new_children (threads_listing_context *context);
   void kill_new_fork_children (inferior *inf);
   void discard_pending_stop_replies (struct inferior *inf);
   int stop_reply_queue_length ();
@@ -2792,10 +2793,8 @@ remote_target::remote_add_thread (ptid_t ptid, bool running, bool executing,
   else
     thread = add_thread (this, ptid);
 
-  /* We start by assuming threads are resumed.  That state then gets updated
-     when we process a matching stop reply.  */
-  get_remote_thread_info (thread)->set_resumed ();
-
+  if (executing)
+    get_remote_thread_info (thread)->set_resumed ();
   set_executing (this, ptid, executing);
   set_running (this, ptid, running);
 
@@ -4194,10 +4193,11 @@ remote_target::update_thread_list ()
 	    }
 	}
 
-      /* Remove any unreported fork child threads from CONTEXT so
-	 that we don't interfere with follow fork, which is where
-	 creation of such threads is handled.  */
-      remove_new_fork_children (&context);
+      /* Remove any unreported fork/vfork/clone child threads from
+	 CONTEXT so that we don't interfere with follow
+	 fork/vfork/clone, which is where creation of such threads is
+	 handled.  */
+      remove_new_children (&context);
 
       /* And now add threads we don't know about yet to our list.  */
       for (thread_item &item : context.items)
@@ -5161,6 +5161,8 @@ remote_target::start_remote_1 (int from_tty, int extended_p)
 	    }
 	  else
 	    switch_to_thread (this->find_thread (curr_thread));
+
+	  get_remote_thread_info (inferior_thread ())->set_resumed ();
 	}
 
       /* init_wait_for_inferior should be called before get_offsets in order
@@ -6120,16 +6122,25 @@ is_fork_status (target_waitkind kind)
 	  || kind == TARGET_WAITKIND_VFORKED);
 }
 
-/* Return THREAD's pending status if it is a pending fork parent, else
-   return nullptr.  */
+/* Return a reference to the field where a pending child status, if
+   there's one, is recorded.  If there's no child event pending, the
+   returned waitstatus has TARGET_WAITKIND_IGNORE kind.  */
+
+static const target_waitstatus &
+thread_pending_status (struct thread_info *thread)
+{
+  return (thread->has_pending_waitstatus ()
+	  ? thread->pending_waitstatus ()
+	  : thread->pending_follow);
+}
+
+/* Return THREAD's pending status if it is a pending fork/vfork (but
+   not clone) parent, else return nullptr.  */
 
 static const target_waitstatus *
 thread_pending_fork_status (struct thread_info *thread)
 {
-  const target_waitstatus &ws
-    = (thread->has_pending_waitstatus ()
-       ? thread->pending_waitstatus ()
-       : thread->pending_follow);
+  const target_waitstatus &ws = thread_pending_status (thread);
 
   if (!is_fork_status (ws.kind ()))
     return nullptr;
@@ -6137,6 +6148,20 @@ thread_pending_fork_status (struct thread_info *thread)
   return &ws;
 }
 
+/* Return THREAD's pending status if it is a pending fork/vfork/clone
+   event, else return nullptr.  */
+
+static const target_waitstatus *
+thread_pending_child_status (thread_info *thread)
+{
+  const target_waitstatus &ws = thread_pending_status (thread);
+
+  if (!is_new_child_status (ws.kind ()))
+    return nullptr;
+
+  return &ws;
+}
+
 /* Detach the specified process.  */
 
 void
@@ -6327,6 +6352,12 @@ remote_target::follow_fork (inferior *child_inf, ptid_t child_ptid,
     }
 }
 
+void
+remote_target::follow_clone (ptid_t child_ptid)
+{
+  remote_add_thread (child_ptid, false, false, false);
+}
+
 /* Target follow-exec function for remote targets.  Save EXECD_PATHNAME
    in the program space of the new inferior.  */
 
@@ -7054,10 +7085,10 @@ remote_target::commit_resumed ()
       if (priv->get_resume_state () == resume_state::RESUMED_PENDING_VCONT)
 	any_pending_vcont_resume = true;
 
-      /* If a thread is the parent of an unfollowed fork, then we
-	 can't do a global wildcard, as that would resume the fork
-	 child.  */
-      if (thread_pending_fork_status (tp) != nullptr)
+      /* If a thread is the parent of an unfollowed fork/vfork/clone,
+	 then we can't do a global wildcard, as that would resume the
+	 pending child.  */
+      if (thread_pending_child_status (tp) != nullptr)
 	may_global_wildcard_vcont = false;
     }
 
@@ -7517,22 +7548,22 @@ const notif_client notif_client_stop =
   REMOTE_NOTIF_STOP,
 };
 
-/* If CONTEXT contains any fork child threads that have not been
-   reported yet, remove them from the CONTEXT list.  If such a
-   thread exists it is because we are stopped at a fork catchpoint
-   and have not yet called follow_fork, which will set up the
-   host-side data structures for the new process.  */
+/* If CONTEXT contains any fork/vfork/clone child threads that have
+   not been reported yet, remove them from the CONTEXT list.  If such
+   a thread exists it is because we are stopped at a fork/vfork/clone
+   catchpoint and have not yet called follow_fork/follow_clone, which
+   will set up the host-side data structures for the new child.  */
 
 void
-remote_target::remove_new_fork_children (threads_listing_context *context)
+remote_target::remove_new_children (threads_listing_context *context)
 {
   const notif_client *notif = &notif_client_stop;
 
-  /* For any threads stopped at a fork event, remove the corresponding
-     fork child threads from the CONTEXT list.  */
+  /* For any threads stopped at a (v)fork/clone event, remove the
+     corresponding child threads from the CONTEXT list.  */
   for (thread_info *thread : all_non_exited_threads (this))
     {
-      const target_waitstatus *ws = thread_pending_fork_status (thread);
+      const target_waitstatus *ws = thread_pending_child_status (thread);
 
       if (ws == nullptr)
 	continue;
@@ -7540,13 +7571,12 @@ remote_target::remove_new_fork_children (threads_listing_context *context)
       context->remove_thread (ws->child_ptid ());
     }
 
-  /* Check for any pending fork events (not reported or processed yet)
-     in process PID and remove those fork child threads from the
-     CONTEXT list as well.  */
+  /* Check for any pending (v)fork/clone events (not reported or
+     processed yet) in process PID and remove those child threads from
+     the CONTEXT list as well.  */
   remote_notif_get_pending_events (notif);
   for (auto &event : get_remote_state ()->stop_reply_queue)
-    if (event->ws.kind () == TARGET_WAITKIND_FORKED
-	|| event->ws.kind () == TARGET_WAITKIND_VFORKED)
+    if (is_new_child_status (event->ws.kind ()))
       context->remove_thread (event->ws.child_ptid ());
     else if (event->ws.kind () == TARGET_WAITKIND_THREAD_EXITED)
       context->remove_thread (event->ptid);
@@ -7877,6 +7907,8 @@ Packet: '%s'\n"),
 	    event->ws.set_forked (read_ptid (++p1, &p));
 	  else if (strprefix (p, p1, "vfork"))
 	    event->ws.set_vforked (read_ptid (++p1, &p));
+	  else if (strprefix (p, p1, "clone"))
+	    event->ws.set_thread_cloned (read_ptid (++p1, &p));
 	  else if (strprefix (p, p1, "vforkdone"))
 	    {
 	      event->ws.set_vfork_done ();
diff --git a/gdbserver/remote-utils.cc b/gdbserver/remote-utils.cc
index fb5c38d4522..49dcd986c1c 100644
--- a/gdbserver/remote-utils.cc
+++ b/gdbserver/remote-utils.cc
@@ -1063,6 +1063,7 @@ prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
     case TARGET_WAITKIND_FORKED:
     case TARGET_WAITKIND_VFORKED:
     case TARGET_WAITKIND_VFORK_DONE:
+    case TARGET_WAITKIND_THREAD_CLONED:
     case TARGET_WAITKIND_EXECD:
     case TARGET_WAITKIND_THREAD_CREATED:
     case TARGET_WAITKIND_SYSCALL_ENTRY:
@@ -1071,13 +1072,30 @@ prepare_resume_reply (char *buf, ptid_t ptid, const target_waitstatus &status)
 	struct regcache *regcache;
 	char *buf_start = buf;
 
-	if ((status.kind () == TARGET_WAITKIND_FORKED && cs.report_fork_events)
+	if ((status.kind () == TARGET_WAITKIND_FORKED
+	     && cs.report_fork_events)
 	    || (status.kind () == TARGET_WAITKIND_VFORKED
-		&& cs.report_vfork_events))
+		&& cs.report_vfork_events)
+	    || status.kind () == TARGET_WAITKIND_THREAD_CLONED)
 	  {
 	    enum gdb_signal signal = GDB_SIGNAL_TRAP;
-	    const char *event = (status.kind () == TARGET_WAITKIND_FORKED
-				 ? "fork" : "vfork");
+
+	    auto kind_remote_str = [] (target_waitkind kind)
+	    {
+	      switch (kind)
+		{
+		case TARGET_WAITKIND_FORKED:
+		  return "fork";
+		case TARGET_WAITKIND_VFORKED:
+		  return "vfork";
+		case TARGET_WAITKIND_THREAD_CLONED:
+		  return "clone";
+		default:
+		  gdb_assert_not_reached ("unhandled kind");
+		}
+	    };
+
+	    const char *event = kind_remote_str (status.kind ());
 
 	    sprintf (buf, "T%02x%s:", signal, event);
 	    buf += strlen (buf);
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 5f2032c37c1..38f084bb035 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -241,7 +241,8 @@ in_queued_stop_replies_ptid (struct notif_event *event, ptid_t filter_ptid)
 
   /* Don't resume fork children that GDB does not know about yet.  */
   if ((vstop_event->status.kind () == TARGET_WAITKIND_FORKED
-       || vstop_event->status.kind () == TARGET_WAITKIND_VFORKED)
+       || vstop_event->status.kind () == TARGET_WAITKIND_VFORKED
+       || vstop_event->status.kind () == TARGET_WAITKIND_THREAD_CLONED)
       && vstop_event->status.child_ptid ().matches (filter_ptid))
     return true;
 
-- 
2.34.1



* [FYI/pushed v4 05/25] Avoid duplicate QThreadEvents packets
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (3 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 04/25] Support clone events in the remote protocol Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 06/25] Thread options & clone events (core + remote) Pedro Alves
                   ` (20 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Similarly to QProgramSignals and QPassSignals, avoid sending duplicate
QThreadEvents packets.
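
The idea can be sketched in isolation as follows.  This is not the
remote.c code: remote_state_sketch and its "sent" log are hypothetical,
and the sketch assumes the target always replies "OK".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-connection remote state.
struct remote_state_sketch
{
  // The last QThreadEvents state the target acknowledged.
  bool last_thread_events = false;

  // Packets "sent" so far, recorded here only for illustration.
  std::vector<std::string> sent;

  void thread_events (bool enable)
  {
    // Nothing to do if the target is already in the requested state.
    if (last_thread_events == enable)
      return;

    sent.push_back (enable ? "QThreadEvents:1" : "QThreadEvents:0");

    // Assume the target replied "OK"; only then remember the state.
    last_thread_events = enable;
  }
};
```

Calling thread_events (true) twice in a row records a single packet,
and a later thread_events (false) records one more.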

Approved-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Iaf5babb0b64e1527ba4db31aac8674d82b17e8b4
---
 gdb/remote.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/gdb/remote.c b/gdb/remote.c
index 8c7979ce6a6..2a460f6f57a 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -529,6 +529,10 @@ class remote_state
      the target know about program signals list changes.  */
   char *last_program_signals_packet = nullptr;
 
+  /* Similarly, the last QThreadEvents state we sent to the
+     target.  */
+  bool last_thread_events = false;
+
   gdb_signal last_sent_signal = GDB_SIGNAL_0;
 
   bool last_sent_step = false;
@@ -15010,6 +15014,9 @@ remote_target::thread_events (int enable)
   if (m_features.packet_support (PACKET_QThreadEvents) == PACKET_DISABLE)
     return;
 
+  if (rs->last_thread_events == enable)
+    return;
+
   xsnprintf (rs->buf.data (), size, "QThreadEvents:%x", enable ? 1 : 0);
   putpkt (rs->buf);
   getpkt (&rs->buf);
@@ -15019,6 +15026,7 @@ remote_target::thread_events (int enable)
     case PACKET_OK:
       if (strcmp (rs->buf.data (), "OK") != 0)
 	error (_("Remote refused setting thread events: %s"), rs->buf.data ());
+      rs->last_thread_events = enable;
       break;
     case PACKET_ERROR:
       warning (_("Remote failure reply: %s"), rs->buf.data ());
-- 
2.34.1



* [FYI/pushed v4 06/25] Thread options & clone events (core + remote)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (4 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 05/25] Avoid duplicate QThreadEvents packets Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 07/25] Thread options & clone events (native Linux) Pedro Alves
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED
event kind, and made the Linux target report clone events.

A following patch will teach Linux GDBserver to do the same thing.

However, for remote debugging, it wouldn't be ideal for GDBserver to
report every clone event to GDB, when GDB only cares about such events
in some specific situations.  Reporting clone events all the time
would be potentially chatty.  We don't enable thread create/exit
events all the time for the same reason.  Instead we have the
QThreadEvents packet.  QThreadEvents is target-wide, though.

Instead, this patch makes GDB explicitly request, on a per-thread
basis, whether the target should report clone events.

In order to be able to do that with GDBserver, we need a new remote
protocol feature.  Since a following patch will want to enable thread
exit events on a per-thread basis too, the packet introduced here is
more generic than just for clone events.  It lets you enable/disable a
set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS.

IOW, this commit introduces a new QThreadOptions packet, that lets you
specify a set of per-thread event options you want to enable.  The
packet accepts a list of options/thread-id pairs, similarly to vCont,
processed left to right, with the options field being a number
interpreted as a bit mask of options.  The only option defined in this
commit is GDB_THREAD_OPTION_CLONE (0x1), which asks the remote target
to report clone events.  Another patch later in the series will
introduce another option.

For example, this packet sets option "1" (clone events) on thread
p1000.2345:

  QThreadOptions;1:p1000.2345

and this clears options for all threads of process 1000, and then sets
option "1" (clone events) on thread p1000.2345:

  QThreadOptions;0:p1000.-1;1:p1000.2345

This clears options of all threads of all processes:

  QThreadOptions;0
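
The three packets above could be assembled by something like the
sketch below.  thread_options_entry and make_qthreadoptions are
hypothetical helpers for illustration, not functions from remote.c,
and printing the options field in hex is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One options/thread-id pair.  An empty THREAD_ID means the pair
// carries no thread-id, i.e. it applies to all threads.
struct thread_options_entry
{
  unsigned options;
  std::string thread_id;
};

// Build a QThreadOptions packet from ENTRIES, in order.
static std::string
make_qthreadoptions (const std::vector<thread_options_entry> &entries)
{
  std::string packet = "QThreadOptions";

  for (const thread_options_entry &e : entries)
    {
      char num[16];
      // Assumption: the options bit mask is written in hex.
      std::snprintf (num, sizeof num, "%x", e.options);

      packet += ';';
      packet += num;
      if (!e.thread_id.empty ())
	{
	  packet += ':';
	  packet += e.thread_id;
	}
    }

  return packet;
}
```

Passing the pairs (0, "p1000.-1") and (1, "p1000.2345") in that order
gives "QThreadOptions;0:p1000.-1;1:p1000.2345", matching the second
example.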

The target reports the set of supported options by including
"QThreadOptions=<supported options>" in its qSupported response.
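
A minimal sketch of how a client might extract that value from a
qSupported reply, assuming the reply is a ';'-separated feature list
and the value is a hex bit mask.  The function name
advertised_thread_options is hypothetical, and the sketch returns 0
for a missing, empty, or malformed value instead of warning as GDB
does:

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Return the option mask advertised as "QThreadOptions=<hex>" in
// REPLY, or 0 if the feature is absent or its value is malformed.
std::uint64_t
advertised_thread_options (const std::string &reply)
{
  const std::string key = "QThreadOptions=";
  std::size_t start = 0;

  while (start <= reply.size ())
    {
      std::size_t end = reply.find (';', start);
      if (end == std::string::npos)
	end = reply.size ();

      std::string item = reply.substr (start, end - start);
      if (item.compare (0, key.size (), key) == 0)
	{
	  std::string value = item.substr (key.size ());
	  if (value.empty ())
	    return 0;

	  std::uint64_t mask = 0;
	  for (char c : value)
	    {
	      if (!std::isxdigit ((unsigned char) c))
		return 0;
	      unsigned digit = (std::isdigit ((unsigned char) c)
				? c - '0'
				: std::tolower ((unsigned char) c) - 'a' + 10);
	      mask = mask * 16 + digit;
	    }
	  return mask;
	}

      start = end + 1;
    }

  return 0;
}
```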

infrun is then tweaked to enable GDB_THREAD_OPTION_CLONE when stepping
over a breakpoint.

Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit
their parent's thread options.  This is so that GDB can send e.g.,
"QThreadOptions;0;1:TID" without worrying about threads it doesn't
know about yet.
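
To illustrate that "clear everything, then set what is needed" shape,
here is a simplified sketch of building such a packet.  ThreadRequest
and build_thread_options_packet are hypothetical names; the sketch
assumes the multi-process "pPID.TID" thread-id form and ignores the
maximum packet size, which real code would have to respect by
splitting the request across several packets:

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-thread request, for illustration only.
struct ThreadRequest
{
  long pid;
  long tid;
  std::uint64_t options;
};

// Build a packet that first clears the options of all threads, then
// sets the non-zero options of each thread that wants any.  Later
// pairs override earlier ones, so the leading ";0" is safe.
std::string
build_thread_options_packet (const std::vector<ThreadRequest> &threads)
{
  std::ostringstream out;
  out << std::hex << "QThreadOptions;0";

  for (const ThreadRequest &t : threads)
    {
      if (t.options == 0)
	continue;
      out << ';' << t.options << ":p" << t.pid << '.' << t.tid;
    }

  return out.str ();
}
```

Because children do not inherit options, the leading ";0" never has
to account for threads the sender has not learned about yet.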

Documentation for this new remote protocol feature is included in a
documentation patch later in the series.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie41e5093b2573f14cf6ac41b0b5804eba75be37e
---
 gdb/gdbthread.h        |  16 ++++
 gdb/infrun.c           |  63 +++++++++++++-
 gdb/remote.c           | 182 ++++++++++++++++++++++++++++++++++++++++-
 gdb/target-debug.h     |   2 +
 gdb/target-delegates.c |  28 +++++++
 gdb/target.c           |   9 ++
 gdb/target.h           |   8 ++
 gdb/target/target.c    |  11 +++
 gdb/target/target.h    |  16 ++++
 gdb/thread.c           |  15 ++++
 gdbserver/gdbthread.h  |   3 +
 gdbserver/server.cc    | 130 +++++++++++++++++++++++++++++
 gdbserver/target.cc    |   6 ++
 gdbserver/target.h     |   6 ++
 14 files changed, 493 insertions(+), 2 deletions(-)

diff --git a/gdb/gdbthread.h b/gdb/gdbthread.h
index 48f32bb3a0b..938a47ff012 100644
--- a/gdb/gdbthread.h
+++ b/gdb/gdbthread.h
@@ -28,6 +28,7 @@ struct symtab;
 #include "ui-out.h"
 #include "btrace.h"
 #include "target/waitstatus.h"
+#include "target/target.h"
 #include "cli/cli-utils.h"
 #include "gdbsupport/refcounted-object.h"
 #include "gdbsupport/common-gdbthread.h"
@@ -473,6 +474,17 @@ class thread_info : public refcounted_object,
     m_thread_fsm = std::move (fsm);
   }
 
+  /* Record the thread options last set for this thread.  */
+
+  void set_thread_options (gdb_thread_options thread_options);
+
+  /* Get the thread options last set for this thread.  */
+
+  gdb_thread_options thread_options () const
+  {
+    return m_thread_options;
+  }
+
   int current_line = 0;
   struct symtab *current_symtab = NULL;
 
@@ -580,6 +592,10 @@ class thread_info : public refcounted_object,
      left to do for the thread's execution command after the target
      stops.  Several execution commands use it.  */
   std::unique_ptr<struct thread_fsm> m_thread_fsm;
+
+  /* The thread options as last set with a call to
+     set_thread_options.  */
+  gdb_thread_options m_thread_options;
 };
 
 using thread_info_resumed_with_pending_wait_status_node
diff --git a/gdb/infrun.c b/gdb/infrun.c
index e3157f86aff..03eb32a68c4 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1606,7 +1606,6 @@ step_over_info_valid_p (void)
 /* Return true if THREAD is doing a displaced step.  */
 
 static bool
-ATTRIBUTE_UNUSED
 displaced_step_in_progress_thread (thread_info *thread)
 {
   gdb_assert (thread != nullptr);
@@ -1955,6 +1954,28 @@ displaced_step_prepare (thread_info *thread)
   return status;
 }
 
+/* Maybe disable thread-{cloned,created,exited} event reporting after
+   a step-over (either in-line or displaced) finishes.  */
+
+static void
+update_thread_events_after_step_over (thread_info *event_thread)
+{
+  if (target_supports_set_thread_options (0))
+    {
+      /* We can control per-thread options.  Disable events for the
+	 event thread.  */
+      event_thread->set_thread_options (0);
+    }
+  else
+    {
+      /* We can only control the target-wide target_thread_events
+	 setting.  Disable it, but only if other threads don't need it
+	 enabled.  */
+      if (!displaced_step_in_progress_any_thread ())
+	target_thread_events (false);
+    }
+}
+
 /* If we displaced stepped an instruction successfully, adjust registers and
    memory to yield the same effect the instruction would have had if we had
    executed it at its original address, and return
@@ -1999,6 +2020,8 @@ displaced_step_finish (thread_info *event_thread,
   if (!displaced->in_progress ())
     return DISPLACED_STEP_FINISH_STATUS_OK;
 
+  update_thread_events_after_step_over (event_thread);
+
   gdb_assert (event_thread->inf->displaced_step_state.in_progress_count > 0);
   event_thread->inf->displaced_step_state.in_progress_count--;
 
@@ -2493,6 +2516,42 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
   else
     target_pass_signals (signal_pass);
 
+  /* Request that the target report thread-{created,cloned} events in
+     the following situations:
+
+     - If we are performing an in-line step-over-breakpoint, then we
+       will remove a breakpoint from the target and only run the
+       current thread.  We don't want any new thread (spawned by the
+       step) to start running, as it might miss the breakpoint.
+
+     - If we are stepping over a breakpoint out of line (displaced
+       stepping) then we won't remove a breakpoint from the target,
+       but, if the step spawns a new clone thread, then we will need
+       to fixup the $pc address in the clone child too, so we need it
+       to start stopped.
+  */
+  if (step_over_info_valid_p ()
+      || displaced_step_in_progress_thread (tp))
+    {
+      gdb_thread_options options = GDB_THREAD_OPTION_CLONE;
+      if (target_supports_set_thread_options (options))
+	tp->set_thread_options (options);
+      else
+	target_thread_events (true);
+    }
+
+  /* If we're resuming more than one thread simultaneously, then any
+     thread other than the leader is being set to run free.  Clear any
+     previous thread option for those threads.  */
+  if (resume_ptid != inferior_ptid && target_supports_set_thread_options (0))
+    {
+      process_stratum_target *resume_target = tp->inf->process_target ();
+      for (thread_info *thr_iter : all_non_exited_threads (resume_target,
+							   resume_ptid))
+	if (thr_iter != tp)
+	  thr_iter->set_thread_options (0);
+    }
+
   infrun_debug_printf ("resume_ptid=%s, step=%d, sig=%s",
 		       resume_ptid.to_string ().c_str (),
 		       step, gdb_signal_to_symbol_string (sig));
@@ -6281,6 +6340,8 @@ finish_step_over (struct execution_control_state *ecs)
 	 back an event.  */
       gdb_assert (ecs->event_thread->control.trap_expected);
 
+      update_thread_events_after_step_over (ecs->event_thread);
+
       clear_step_over_info ();
     }
 
diff --git a/gdb/remote.c b/gdb/remote.c
index 2a460f6f57a..991a4344c7f 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -248,6 +248,9 @@ enum {
   /* Support for the QThreadEvents packet.  */
   PACKET_QThreadEvents,
 
+  /* Support for the QThreadOptions packet.  */
+  PACKET_QThreadOptions,
+
   /* Support for multi-process extensions.  */
   PACKET_multiprocess_feature,
 
@@ -596,6 +599,10 @@ class remote_state
      this can go away.  */
   bool wait_forever_enabled_p = true;
 
+  /* The set of thread options the target reported it supports, via
+     qSupported.  */
+  gdb_thread_options supported_thread_options = 0;
+
 private:
   /* Asynchronous signal handle registered as event loop source for
      when we have pending events ready to be passed to the core.  */
@@ -760,6 +767,8 @@ class remote_target : public process_stratum_target
   void detach (inferior *, int) override;
   void disconnect (const char *, int) override;
 
+  void commit_requested_thread_options ();
+
   void commit_resumed () override;
   void resume (ptid_t, int, enum gdb_signal) override;
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
@@ -886,6 +895,8 @@ class remote_target : public process_stratum_target
 
   void thread_events (int) override;
 
+  bool supports_set_thread_options (gdb_thread_options) override;
+
   int can_do_single_step () override;
 
   void terminal_inferior () override;
@@ -1184,6 +1195,9 @@ class remote_target : public process_stratum_target
 
   void remote_packet_size (const protocol_feature *feature,
 			   packet_support support, const char *value);
+  void remote_supported_thread_options (const protocol_feature *feature,
+					enum packet_support support,
+					const char *value);
 
   void remote_serial_quit_handler ();
 
@@ -5506,7 +5520,8 @@ remote_supported_packet (remote_target *remote,
 
 void
 remote_target::remote_packet_size (const protocol_feature *feature,
-				   enum packet_support support, const char *value)
+				   enum packet_support support,
+				   const char *value)
 {
   struct remote_state *rs = get_remote_state ();
 
@@ -5543,6 +5558,49 @@ remote_packet_size (remote_target *remote, const protocol_feature *feature,
   remote->remote_packet_size (feature, support, value);
 }
 
+void
+remote_target::remote_supported_thread_options (const protocol_feature *feature,
+						enum packet_support support,
+						const char *value)
+{
+  struct remote_state *rs = get_remote_state ();
+
+  m_features.m_protocol_packets[feature->packet].support = support;
+
+  if (support != PACKET_ENABLE)
+    return;
+
+  if (value == nullptr || *value == '\0')
+    {
+      warning (_("Remote target reported \"%s\" without supported options."),
+	       feature->name);
+      return;
+    }
+
+  ULONGEST options = 0;
+  const char *p = unpack_varlen_hex (value, &options);
+
+  if (*p != '\0')
+    {
+      warning (_("Remote target reported \"%s\" with "
+		 "bad thread options: \"%s\"."),
+	       feature->name, value);
+      return;
+    }
+
+  /* Record the set of supported options.  */
+  rs->supported_thread_options = (gdb_thread_option) options;
+}
+
+static void
+remote_supported_thread_options (remote_target *remote,
+				 const protocol_feature *feature,
+				 enum packet_support support,
+				 const char *value)
+{
+  remote->remote_supported_thread_options (feature, support, value);
+}
+
 static const struct protocol_feature remote_protocol_features[] = {
   { "PacketSize", PACKET_DISABLE, remote_packet_size, -1 },
   { "qXfer:auxv:read", PACKET_DISABLE, remote_supported_packet,
@@ -5645,6 +5703,8 @@ static const struct protocol_feature remote_protocol_features[] = {
     PACKET_Qbtrace_conf_pt_size },
   { "vContSupported", PACKET_DISABLE, remote_supported_packet, PACKET_vContSupported },
   { "QThreadEvents", PACKET_DISABLE, remote_supported_packet, PACKET_QThreadEvents },
+  { "QThreadOptions", PACKET_DISABLE, remote_supported_thread_options,
+    PACKET_QThreadOptions },
   { "no-resumed", PACKET_DISABLE, remote_supported_packet, PACKET_no_resumed },
   { "memory-tagging", PACKET_DISABLE, remote_supported_packet,
     PACKET_memory_tagging_feature },
@@ -5747,6 +5807,10 @@ remote_target::remote_query_supported ()
 	  != AUTO_BOOLEAN_FALSE)
 	remote_query_supported_append (&q, "QThreadEvents+");
 
+      if (m_features.packet_set_cmd_state (PACKET_QThreadOptions)
+	  != AUTO_BOOLEAN_FALSE)
+	remote_query_supported_append (&q, "QThreadOptions+");
+
       if (m_features.packet_set_cmd_state (PACKET_no_resumed)
 	  != AUTO_BOOLEAN_FALSE)
 	remote_query_supported_append (&q, "no-resumed+");
@@ -6843,6 +6907,8 @@ remote_target::resume (ptid_t scope_ptid, int step, enum gdb_signal siggnal)
       return;
     }
 
+  commit_requested_thread_options ();
+
   /* In all-stop, we can't mark REMOTE_ASYNC_GET_PENDING_EVENTS_TOKEN
      (explained in remote-notif.c:handle_notification) so
      remote_notif_process is not called.  We need find a place where
@@ -7005,6 +7071,8 @@ remote_target::commit_resumed ()
   if (!target_is_non_stop_p () || ::execution_direction == EXEC_REVERSE)
     return;
 
+  commit_requested_thread_options ();
+
   /* Try to send wildcard actions ("vCont;c" or "vCont;c:pPID.-1")
      instead of resuming all threads of each process individually.
      However, if any thread of a process must remain halted, we can't
@@ -15036,6 +15104,115 @@ remote_target::thread_events (int enable)
     }
 }
 
+/* Implementation of the supports_set_thread_options target
+   method.  */
+
+bool
+remote_target::supports_set_thread_options (gdb_thread_options options)
+{
+  remote_state *rs = get_remote_state ();
+  return (m_features.packet_support (PACKET_QThreadOptions) == PACKET_ENABLE
+	  && (rs->supported_thread_options & options) == options);
+}
+
+/* For coalescing reasons, actually sending the options to the target
+   happens at resume time, via this function.  See target_resume for
+   all-stop, and target_commit_resumed for non-stop.  */
+
+void
+remote_target::commit_requested_thread_options ()
+{
+  struct remote_state *rs = get_remote_state ();
+
+  if (m_features.packet_support (PACKET_QThreadOptions) != PACKET_ENABLE)
+    return;
+
+  char *p = rs->buf.data ();
+  char *endp = p + get_remote_packet_size ();
+
+  /* Clear options for all threads by default.  Note that unlike
+     vCont, the rightmost options that match a thread apply, so we
+     don't have to worry about whether we can use wildcard ptids.  */
+  strcpy (p, "QThreadOptions;0");
+  p += strlen (p);
+
+  /* Send the QThreadOptions packet stored in P.  */
+  auto flush = [&] ()
+    {
+      *p++ = '\0';
+
+      putpkt (rs->buf);
+      getpkt (&rs->buf, 0);
+
+      switch (m_features.packet_ok (rs->buf, PACKET_QThreadOptions))
+	{
+	case PACKET_OK:
+	  if (strcmp (rs->buf.data (), "OK") != 0)
+	    error (_("Remote refused setting thread options: %s"), rs->buf.data ());
+	  break;
+	case PACKET_ERROR:
+	  error (_("Remote failure reply: %s"), rs->buf.data ());
+	case PACKET_UNKNOWN:
+	  gdb_assert_not_reached ("PACKET_UNKNOWN");
+	  break;
+	}
+    };
+
+  /* Prepare P for another QThreadOptions packet.  */
+  auto restart = [&] ()
+    {
+      p = rs->buf.data ();
+      strcpy (p, "QThreadOptions");
+      p += strlen (p);
+    };
+
+  /* Now set non-zero options for threads that need them.  We don't
+     bother with the case of all threads of a process wanting the same
+     non-zero options as that's not an expected scenario.  */
+  for (thread_info *tp : all_non_exited_threads (this))
+    {
+      gdb_thread_options options = tp->thread_options ();
+
+      if (options == 0)
+	continue;
+
+      /* It might be possible that we have more threads with options
+	 than can fit a single QThreadOptions packet.  So build each
+	 options/thread pair in this separate buffer to make sure it
+	 fits.  */
+      constexpr size_t max_options_size = 100;
+      char obuf[max_options_size];
+      char *obuf_p = obuf;
+      char *obuf_endp = obuf + max_options_size;
+
+      *obuf_p++ = ';';
+      obuf_p += xsnprintf (obuf_p, obuf_endp - obuf_p, "%s",
+			   phex_nz (options, sizeof (options)));
+      if (tp->ptid != magic_null_ptid)
+	{
+	  *obuf_p++ = ':';
+	  obuf_p = write_ptid (obuf_p, obuf_endp, tp->ptid);
+	}
+
+      size_t osize = obuf_p - obuf;
+      if (osize > endp - p)
+	{
+	  /* This new options/thread pair doesn't fit the packet
+	     buffer.  Send what we have already.  */
+	  flush ();
+	  restart ();
+
+	  /* Should now fit.  */
+	  gdb_assert (osize <= endp - p);
+	}
+
+      memcpy (p, obuf, osize);
+      p += osize;
+    }
+
+  flush ();
+}
+
 static void
 show_remote_cmd (const char *args, int from_tty)
 {
@@ -15784,6 +15961,9 @@ Show the maximum size of the address (in bits) in a memory packet."), NULL,
   add_packet_config_cmd (PACKET_QThreadEvents, "QThreadEvents", "thread-events",
 			 0);
 
+  add_packet_config_cmd (PACKET_QThreadOptions, "QThreadOptions",
+			 "thread-options", 0);
+
   add_packet_config_cmd (PACKET_no_resumed, "N stop reply",
 			 "no-resumed-stop-reply", 0);
 
diff --git a/gdb/target-debug.h b/gdb/target-debug.h
index 3663ec21740..431f99ed3b1 100644
--- a/gdb/target-debug.h
+++ b/gdb/target-debug.h
@@ -176,6 +176,8 @@
   target_debug_do_print (X.get ())
 #define target_debug_print_target_waitkind(X) \
   target_debug_do_print (pulongest (X))
+#define target_debug_print_gdb_thread_options(X) \
+  target_debug_do_print (to_string (X).c_str ())
 
 static void
 target_debug_print_target_waitstatus_p (struct target_waitstatus *status)
diff --git a/gdb/target-delegates.c b/gdb/target-delegates.c
index eae96e2daba..c5540c366e4 100644
--- a/gdb/target-delegates.c
+++ b/gdb/target-delegates.c
@@ -106,6 +106,7 @@ struct dummy_target : public target_ops
   int async_wait_fd () override;
   bool has_pending_events () override;
   void thread_events (int arg0) override;
+  bool supports_set_thread_options (gdb_thread_options arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
   int find_memory_regions (find_memory_region_ftype arg0, void *arg1) override;
@@ -282,6 +283,7 @@ struct debug_target : public target_ops
   int async_wait_fd () override;
   bool has_pending_events () override;
   void thread_events (int arg0) override;
+  bool supports_set_thread_options (gdb_thread_options arg0) override;
   bool supports_non_stop () override;
   bool always_non_stop_p () override;
   int find_memory_regions (find_memory_region_ftype arg0, void *arg1) override;
@@ -2274,6 +2276,32 @@ debug_target::thread_events (int arg0)
   gdb_puts (")\n", gdb_stdlog);
 }
 
+bool
+target_ops::supports_set_thread_options (gdb_thread_options arg0)
+{
+  return this->beneath ()->supports_set_thread_options (arg0);
+}
+
+bool
+dummy_target::supports_set_thread_options (gdb_thread_options arg0)
+{
+  return false;
+}
+
+bool
+debug_target::supports_set_thread_options (gdb_thread_options arg0)
+{
+  bool result;
+  gdb_printf (gdb_stdlog, "-> %s->supports_set_thread_options (...)\n", this->beneath ()->shortname ());
+  result = this->beneath ()->supports_set_thread_options (arg0);
+  gdb_printf (gdb_stdlog, "<- %s->supports_set_thread_options (", this->beneath ()->shortname ());
+  target_debug_print_gdb_thread_options (arg0);
+  gdb_puts (") = ", gdb_stdlog);
+  target_debug_print_bool (result);
+  gdb_puts ("\n", gdb_stdlog);
+  return result;
+}
+
 bool
 target_ops::supports_non_stop ()
 {
diff --git a/gdb/target.c b/gdb/target.c
index bf82649ed98..a6ca7fc4f07 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -4329,6 +4329,15 @@ target_thread_events (int enable)
   current_inferior ()->top_target ()->thread_events (enable);
 }
 
+/* See target.h.  */
+
+bool
+target_supports_set_thread_options (gdb_thread_options options)
+{
+  inferior *inf = current_inferior ();
+  return inf->top_target ()->supports_set_thread_options (options);
+}
+
 /* Controls if targets can report that they can/are async.  This is
    just for maintainers to use when debugging gdb.  */
 bool target_async_permitted = true;
diff --git a/gdb/target.h b/gdb/target.h
index d4d81e727e9..558be463755 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -741,6 +741,10 @@ struct target_ops
       TARGET_DEFAULT_RETURN (false);
     virtual void thread_events (int)
       TARGET_DEFAULT_IGNORE ();
+    /* Returns true if the target supports setting thread options
+       OPTIONS, false otherwise.  */
+    virtual bool supports_set_thread_options (gdb_thread_options options)
+      TARGET_DEFAULT_RETURN (false);
     /* This method must be implemented in some situations.  See the
        comment on 'can_run'.  */
     virtual bool supports_non_stop ()
@@ -1895,6 +1899,10 @@ extern void target_async (bool enable);
 /* Enables/disables thread create and exit events.  */
 extern void target_thread_events (int enable);
 
+/* Returns true if the target supports setting thread options
+   OPTIONS.  */
+extern bool target_supports_set_thread_options (gdb_thread_options options);
+
 /* Whether support for controlling the target backends always in
    non-stop mode is enabled.  */
 extern enum auto_boolean target_non_stop_enabled;
diff --git a/gdb/target/target.c b/gdb/target/target.c
index 8089918f1d0..3af7d73df5a 100644
--- a/gdb/target/target.c
+++ b/gdb/target/target.c
@@ -188,3 +188,14 @@ target_read_string (CORE_ADDR memaddr, int len, int *bytes_read)
 
   return gdb::unique_xmalloc_ptr<char> ((char *) buffer.release ());
 }
+
+/* See target/target.h.  */
+
+std::string
+to_string (gdb_thread_options options)
+{
+  static constexpr gdb_thread_options::string_mapping mapping[] = {
+    MAP_ENUM_FLAG (GDB_THREAD_OPTION_CLONE),
+  };
+  return options.to_string (mapping);
+}
diff --git a/gdb/target/target.h b/gdb/target/target.h
index d1a18ee2212..2691f92e4ef 100644
--- a/gdb/target/target.h
+++ b/gdb/target/target.h
@@ -22,9 +22,25 @@
 
 #include "target/waitstatus.h"
 #include "target/wait.h"
+#include "gdbsupport/enum-flags.h"
 
 /* This header is a stopgap until more code is shared.  */
 
+/* Available thread options.  Keep this in sync with to_string, in
+   target.c.  */
+
+enum gdb_thread_option : unsigned
+{
+  /* Tell the target to report TARGET_WAITKIND_THREAD_CLONED events
+     for the thread.  */
+  GDB_THREAD_OPTION_CLONE = 1 << 0,
+};
+
+DEF_ENUM_FLAGS_TYPE (enum gdb_thread_option, gdb_thread_options);
+
+/* Convert gdb_thread_option to a string.  */
+extern std::string to_string (gdb_thread_options options);
+
 /* Read LEN bytes of target memory at address MEMADDR, placing the
    results in GDB's memory at MYADDR.  Return zero for success,
    nonzero if any error occurs.  This function must be provided by
diff --git a/gdb/thread.c b/gdb/thread.c
index 0660589abbf..ca0466f35ec 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -431,6 +431,21 @@ thread_info::clear_pending_waitstatus ()
 
 /* See gdbthread.h.  */
 
+void
+thread_info::set_thread_options (gdb_thread_options thread_options)
+{
+  if (m_thread_options == thread_options)
+    return;
+
+  m_thread_options = thread_options;
+
+  infrun_debug_printf ("[options for %s are now %s]",
+		       this->ptid.to_string ().c_str (),
+		       to_string (thread_options).c_str ());
+}
+
+/* See gdbthread.h.  */
+
 int
 thread_is_in_step_over_chain (struct thread_info *tp)
 {
diff --git a/gdbserver/gdbthread.h b/gdbserver/gdbthread.h
index 493e1dbf6cb..a4dff0fe1a2 100644
--- a/gdbserver/gdbthread.h
+++ b/gdbserver/gdbthread.h
@@ -80,6 +80,9 @@ struct thread_info
 
   /* Branch trace target information for this thread.  */
   struct btrace_target_info *btrace = nullptr;
+
+  /* Thread options GDB requested with QThreadOptions.  */
+  gdb_thread_options thread_options = 0;
 };
 
 extern std::list<thread_info *> all_threads;
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 38f084bb035..c24a5c9fb96 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -36,6 +36,7 @@
 #include "dll.h"
 #include "hostio.h"
 #include <vector>
+#include <unordered_map>
 #include "gdbsupport/common-inferior.h"
 #include "gdbsupport/job-control.h"
 #include "gdbsupport/environ.h"
@@ -616,6 +617,17 @@ parse_store_memtags_request (char *request, CORE_ADDR *addr, size_t *len,
   return true;
 }
 
+/* Parse thread options starting at *P and return them.  On exit,
+   advance *P past the options.  */
+
+static gdb_thread_options
+parse_gdb_thread_options (const char **p)
+{
+  ULONGEST options = 0;
+  *p = unpack_varlen_hex (*p, &options);
+  return (gdb_thread_option) options;
+}
+
 /* Handle all of the extended 'Q' packets.  */
 
 static void
@@ -897,6 +909,114 @@ handle_general_set (char *own_buf)
       return;
     }
 
+  if (startswith (own_buf, "QThreadOptions;"))
+    {
+      const char *p = own_buf + strlen ("QThreadOptions");
+
+      gdb_thread_options supported_options = target_supported_thread_options ();
+      if (supported_options == 0)
+	{
+	  /* Something went wrong -- we don't support any option, but
+	     GDB sent the packet anyway.  */
+	  write_enn (own_buf);
+	  return;
+	}
+
+      /* We could store the options directly in thread->thread_options
+	 without this map, but that would mean that a QThreadOptions
+	 packet with a wildcard like "QThreadOptions;0;3:TID" would
+	 result in the debug logs showing:
+
+	   [options for TID are now 0x0]
+	   [options for TID are now 0x3]
+
+	 It's nicer if we only print the final options for each TID,
+	 and if we only print about it if the options changed compared
+	 to the options that were previously set on the thread.  */
+      std::unordered_map<thread_info *, gdb_thread_options> set_options;
+
+      while (*p != '\0')
+	{
+	  if (p[0] != ';')
+	    {
+	      write_enn (own_buf);
+	      return;
+	    }
+	  p++;
+
+	  /* Read the options.  */
+
+	  gdb_thread_options options = parse_gdb_thread_options (&p);
+
+	  if ((options & ~supported_options) != 0)
+	    {
+	      /* GDB asked for an unknown or unsupported option, so
+		 error out.  */
+	      std::string err
+		= string_printf ("E.Unknown thread options requested: %s\n",
+				 to_string (options).c_str ());
+	      strcpy (own_buf, err.c_str ());
+	      return;
+	    }
+
+	  ptid_t ptid;
+
+	  if (p[0] == ';' || p[0] == '\0')
+	    ptid = minus_one_ptid;
+	  else if (p[0] == ':')
+	    {
+	      const char *q;
+
+	      ptid = read_ptid (p + 1, &q);
+
+	      if (p == q)
+		{
+		  write_enn (own_buf);
+		  return;
+		}
+	      p = q;
+	      if (p[0] != ';' && p[0] != '\0')
+		{
+		  write_enn (own_buf);
+		  return;
+		}
+	    }
+	  else
+	    {
+	      write_enn (own_buf);
+	      return;
+	    }
+
+	  /* Convert PID.-1 => PID.0 for ptid.matches.  */
+	  if (ptid.lwp () == -1)
+	    ptid = ptid_t (ptid.pid ());
+
+	  for_each_thread ([&] (thread_info *thread)
+	    {
+	      if (ptid_of (thread).matches (ptid))
+		set_options[thread] = options;
+	    });
+	}
+
+      for (const auto &iter : set_options)
+	{
+	  thread_info *thread = iter.first;
+	  gdb_thread_options options = iter.second;
+
+	  if (thread->thread_options != options)
+	    {
+	      threads_debug_printf ("[options for %s are now %s]\n",
+				    target_pid_to_str (ptid_of (thread)).c_str (),
+				    to_string (options).c_str ());
+
+	      thread->thread_options = options;
+	    }
+	}
+
+      write_ok (own_buf);
+      return;
+    }
+
   if (startswith (own_buf, "QStartupWithShell:"))
     {
       const char *value = own_buf + strlen ("QStartupWithShell:");
@@ -2348,6 +2468,8 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 		cs.vCont_supported = 1;
 	      else if (feature == "QThreadEvents+")
 		;
+	      else if (feature == "QThreadOptions+")
+		;
 	      else if (feature == "no-resumed+")
 		{
 		  /* GDB supports and wants TARGET_WAITKIND_NO_RESUMED
@@ -2474,6 +2596,14 @@ handle_query (char *own_buf, int packet_len, int *new_packet_len_p)
 
       strcat (own_buf, ";vContSupported+");
 
+      gdb_thread_options supported_options = target_supported_thread_options ();
+      if (supported_options != 0)
+	{
+	  char *end_buf = own_buf + strlen (own_buf);
+	  sprintf (end_buf, ";QThreadOptions=%s",
+		   phex_nz (supported_options, sizeof (supported_options)));
+	}
+
       strcat (own_buf, ";QThreadEvents+");
 
       strcat (own_buf, ";no-resumed+");
diff --git a/gdbserver/target.cc b/gdbserver/target.cc
index f8e592d20c3..1c740bbf583 100644
--- a/gdbserver/target.cc
+++ b/gdbserver/target.cc
@@ -532,6 +532,12 @@ process_stratum_target::supports_vfork_events ()
   return false;
 }
 
+gdb_thread_options
+process_stratum_target::supported_thread_options ()
+{
+  return 0;
+}
+
 bool
 process_stratum_target::supports_exec_events ()
 {
diff --git a/gdbserver/target.h b/gdbserver/target.h
index f13ee40489f..8893e0a6a8b 100644
--- a/gdbserver/target.h
+++ b/gdbserver/target.h
@@ -276,6 +276,9 @@ class process_stratum_target
   /* Returns true if vfork events are supported.  */
   virtual bool supports_vfork_events ();
 
+  /* Returns the set of supported thread options.  */
+  virtual gdb_thread_options supported_thread_options ();
+
   /* Returns true if exec events are supported.  */
   virtual bool supports_exec_events ();
 
@@ -531,6 +534,9 @@ int kill_inferior (process_info *proc);
 #define target_supports_vfork_events() \
   the_target->supports_vfork_events ()
 
+#define target_supported_thread_options(options) \
+  the_target->supported_thread_options (options)
+
 #define target_supports_exec_events() \
   the_target->supports_exec_events ()
 
-- 
2.34.1



* [FYI/pushed v4 07/25] Thread options & clone events (native Linux)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (5 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 06/25] Thread options & clone events (core + remote) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver) Pedro Alves
                   ` (18 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit teaches the native Linux target about the
GDB_THREAD_OPTION_CLONE thread option.  It's actually simpler to just
continue reporting all clone events unconditionally to the core.
There's never any harm in reporting a clone event when the option is
disabled.  All we need to do is to report support for the option,
otherwise GDB falls back to using target_thread_events().
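
The support check amounts to a subset test on bit masks.  A
standalone sketch of that test, with hypothetical names
(OPTION_CLONE, options_supported) rather than GDB's own types:

```cpp
#include <cstdint>

// Illustrative stand-in for GDB_THREAD_OPTION_CLONE.
constexpr std::uint64_t OPTION_CLONE = 1u << 0;

// Return true if every bit in REQUESTED is also set in SUPPORTED.
constexpr bool
options_supported (std::uint64_t supported, std::uint64_t requested)
{
  return (requested & supported) == requested;
}
```

With this test, asking about the empty set (0) always succeeds, so a
query with 0 can be read as "does this target handle thread options
at all?".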

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: If90316e2dcd0c61d0fefa0d463c046011698acf9
---
 gdb/linux-nat.c | 7 +++++++
 gdb/linux-nat.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index da870e84922..5bbdabc241a 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -4503,6 +4503,13 @@ linux_nat_target::thread_events (int enable)
   report_thread_events = enable;
 }
 
+bool
+linux_nat_target::supports_set_thread_options (gdb_thread_options options)
+{
+  constexpr gdb_thread_options supported_options = GDB_THREAD_OPTION_CLONE;
+  return ((options & supported_options) == options);
+}
+
 linux_nat_target::linux_nat_target ()
 {
   /* We don't change the stratum; this target will sit at
diff --git a/gdb/linux-nat.h b/gdb/linux-nat.h
index 1cdbeafd4f3..cf236160b4a 100644
--- a/gdb/linux-nat.h
+++ b/gdb/linux-nat.h
@@ -82,6 +82,8 @@ class linux_nat_target : public inf_ptrace_target
 
   void thread_events (int) override;
 
+  bool supports_set_thread_options (gdb_thread_options options) override;
+
   bool can_async_p () override;
 
   bool supports_non_stop () override;
-- 
2.34.1



* [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (6 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 07/25] Thread options & clone events (native Linux) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2024-02-06 11:04   ` Luis Machado
  2023-11-13 15:04 ` [FYI/pushed v4 09/25] gdbserver: Hide and don't detach pending clone children Pedro Alves
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This patch teaches the Linux GDBserver backend to report clone events
to GDB, when GDB has requested them with the GDB_THREAD_OPTION_CLONE
thread option, via the new QThreadOptions packet.

This shuffles code in linux_process_target::handle_extended_wait
around to a more logical order, now that we have to handle and
potentially report all of fork/vfork/clone.

Rename lwp_info::fork_relative -> lwp_info::relative as the field is
no longer only about (v)fork.

With this, gdb.threads/stepi-over-clone.exp now cleanly passes against
GDBserver, so remove the native-target-only requirement from that
testcase.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I3a19bc98801ec31e5c6fdbe1ebe17df855142bb2
---
 .../gdb.threads/stepi-over-clone.exp          |   6 -
 gdbserver/linux-low.cc                        | 253 ++++++++++--------
 gdbserver/linux-low.h                         |  47 ++--
 3 files changed, 160 insertions(+), 146 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
index e580f2248ac..4c496429632 100644
--- a/gdb/testsuite/gdb.threads/stepi-over-clone.exp
+++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
@@ -19,12 +19,6 @@
 # disassembly output.  For now this is only implemented for x86-64.
 require {istarget x86_64-*-*}
 
-# Test only on native targets, for now.
-proc is_native_target {} {
-    return [expr {[target_info gdb_protocol] == ""}]
-}
-require is_native_target
-
 standard_testfile
 
 if { [prepare_for_testing "failed to prepare" $testfile $srcfile \
diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 40b6a907ad9..136a8b6c9a1 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -491,7 +491,6 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
   struct lwp_info *event_lwp = *orig_event_lwp;
   int event = linux_ptrace_get_extended_event (wstat);
   struct thread_info *event_thr = get_lwp_thread (event_lwp);
-  struct lwp_info *new_lwp;
 
   gdb_assert (event_lwp->waitstatus.kind () == TARGET_WAITKIND_IGNORE);
 
@@ -503,7 +502,6 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
   if ((event == PTRACE_EVENT_FORK) || (event == PTRACE_EVENT_VFORK)
       || (event == PTRACE_EVENT_CLONE))
     {
-      ptid_t ptid;
       unsigned long new_pid;
       int ret, status;
 
@@ -527,61 +525,65 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
 	    warning ("wait returned unexpected status 0x%x", status);
 	}
 
-      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
+      if (debug_threads)
 	{
-	  struct process_info *parent_proc;
-	  struct process_info *child_proc;
-	  struct lwp_info *child_lwp;
-	  struct thread_info *child_thr;
+	  debug_printf ("HEW: Got %s event from LWP %ld, new child is %ld\n",
+			(event == PTRACE_EVENT_FORK ? "fork"
+			 : event == PTRACE_EVENT_VFORK ? "vfork"
+			 : event == PTRACE_EVENT_CLONE ? "clone"
+			 : "???"),
+			ptid_of (event_thr).lwp (),
+			new_pid);
+	}
+
+      ptid_t child_ptid = (event != PTRACE_EVENT_CLONE
+			   ? ptid_t (new_pid, new_pid)
+			   : ptid_t (ptid_of (event_thr).pid (), new_pid));
 
-	  ptid = ptid_t (new_pid, new_pid);
+      lwp_info *child_lwp = add_lwp (child_ptid);
+      gdb_assert (child_lwp != NULL);
+      child_lwp->stopped = 1;
+      if (event != PTRACE_EVENT_CLONE)
+	child_lwp->must_set_ptrace_flags = 1;
+      child_lwp->status_pending_p = 0;
 
-	  threads_debug_printf ("Got fork event from LWP %ld, "
-				"new child is %d",
-				ptid_of (event_thr).lwp (),
-				ptid.pid ());
+      thread_info *child_thr = get_lwp_thread (child_lwp);
 
+      /* If we're suspending all threads, leave this one suspended
+	 too.  If the fork/clone parent is stepping over a breakpoint,
+	 all other threads have been suspended already.  Leave the
+	 child suspended too.  */
+      if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
+	  || event_lwp->bp_reinsert != 0)
+	{
+	  threads_debug_printf ("leaving child suspended");
+	  child_lwp->suspended = 1;
+	}
+
+      if (event_lwp->bp_reinsert != 0
+	  && supports_software_single_step ()
+	  && event == PTRACE_EVENT_VFORK)
+	{
+	  /* If we leave single-step breakpoints there, child will
+	     hit it, so uninsert single-step breakpoints from parent
+	     (and child).  Once vfork child is done, reinsert
+	     them back to parent.  */
+	  uninsert_single_step_breakpoints (event_thr);
+	}
+
+      if (event != PTRACE_EVENT_CLONE)
+	{
 	  /* Add the new process to the tables and clone the breakpoint
 	     lists of the parent.  We need to do this even if the new process
 	     will be detached, since we will need the process object and the
 	     breakpoints to remove any breakpoints from memory when we
 	     detach, and the client side will access registers.  */
-	  child_proc = add_linux_process (new_pid, 0);
+	  process_info *child_proc = add_linux_process (new_pid, 0);
 	  gdb_assert (child_proc != NULL);
-	  child_lwp = add_lwp (ptid);
-	  gdb_assert (child_lwp != NULL);
-	  child_lwp->stopped = 1;
-	  child_lwp->must_set_ptrace_flags = 1;
-	  child_lwp->status_pending_p = 0;
-	  child_thr = get_lwp_thread (child_lwp);
-	  child_thr->last_resume_kind = resume_stop;
-	  child_thr->last_status.set_stopped (GDB_SIGNAL_0);
-
-	  /* If we're suspending all threads, leave this one suspended
-	     too.  If the fork/clone parent is stepping over a breakpoint,
-	     all other threads have been suspended already.  Leave the
-	     child suspended too.  */
-	  if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
-	      || event_lwp->bp_reinsert != 0)
-	    {
-	      threads_debug_printf ("leaving child suspended");
-	      child_lwp->suspended = 1;
-	    }
 
-	  parent_proc = get_thread_process (event_thr);
+	  process_info *parent_proc = get_thread_process (event_thr);
 	  child_proc->attached = parent_proc->attached;
 
-	  if (event_lwp->bp_reinsert != 0
-	      && supports_software_single_step ()
-	      && event == PTRACE_EVENT_VFORK)
-	    {
-	      /* If we leave single-step breakpoints there, child will
-		 hit it, so uninsert single-step breakpoints from parent
-		 (and child).  Once vfork child is done, reinsert
-		 them back to parent.  */
-	      uninsert_single_step_breakpoints (event_thr);
-	    }
-
 	  clone_all_breakpoints (child_thr, event_thr);
 
 	  target_desc_up tdesc = allocate_target_description ();
@@ -590,88 +592,97 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
 
 	  /* Clone arch-specific process data.  */
 	  low_new_fork (parent_proc, child_proc);
+	}
 
-	  /* Save fork info in the parent thread.  */
-	  if (event == PTRACE_EVENT_FORK)
-	    event_lwp->waitstatus.set_forked (ptid);
-	  else if (event == PTRACE_EVENT_VFORK)
-	    event_lwp->waitstatus.set_vforked (ptid);
-
+      /* Save fork/clone info in the parent thread.  */
+      if (event == PTRACE_EVENT_FORK)
+	event_lwp->waitstatus.set_forked (child_ptid);
+      else if (event == PTRACE_EVENT_VFORK)
+	event_lwp->waitstatus.set_vforked (child_ptid);
+      else if (event == PTRACE_EVENT_CLONE
+	       && (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) != 0)
+	event_lwp->waitstatus.set_thread_cloned (child_ptid);
+
+      if (event != PTRACE_EVENT_CLONE
+	  || (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) != 0)
+	{
 	  /* The status_pending field contains bits denoting the
-	     extended event, so when the pending event is handled,
-	     the handler will look at lwp->waitstatus.  */
+	     extended event, so when the pending event is handled, the
+	     handler will look at lwp->waitstatus.  */
 	  event_lwp->status_pending_p = 1;
 	  event_lwp->status_pending = wstat;
 
-	  /* Link the threads until the parent event is passed on to
-	     higher layers.  */
-	  event_lwp->fork_relative = child_lwp;
-	  child_lwp->fork_relative = event_lwp;
-
-	  /* If the parent thread is doing step-over with single-step
-	     breakpoints, the list of single-step breakpoints are cloned
-	     from the parent's.  Remove them from the child process.
-	     In case of vfork, we'll reinsert them back once vforked
-	     child is done.  */
-	  if (event_lwp->bp_reinsert != 0
-	      && supports_software_single_step ())
-	    {
-	      /* The child process is forked and stopped, so it is safe
-		 to access its memory without stopping all other threads
-		 from other processes.  */
-	      delete_single_step_breakpoints (child_thr);
-
-	      gdb_assert (has_single_step_breakpoints (event_thr));
-	      gdb_assert (!has_single_step_breakpoints (child_thr));
-	    }
-
-	  /* Report the event.  */
-	  return 0;
+	  /* Link the threads until the parent's event is passed on to
+	     GDB.  */
+	  event_lwp->relative = child_lwp;
+	  child_lwp->relative = event_lwp;
 	}
 
-      threads_debug_printf
-	("Got clone event from LWP %ld, new child is LWP %ld",
-	 lwpid_of (event_thr), new_pid);
-
-      ptid = ptid_t (pid_of (event_thr), new_pid);
-      new_lwp = add_lwp (ptid);
-
-      /* Either we're going to immediately resume the new thread
-	 or leave it stopped.  resume_one_lwp is a nop if it
-	 thinks the thread is currently running, so set this first
-	 before calling resume_one_lwp.  */
-      new_lwp->stopped = 1;
+      /* If the parent thread is doing step-over with single-step
+	 breakpoints, the list of single-step breakpoints are cloned
+	 from the parent's.  Remove them from the child process.
+	 In case of vfork, we'll reinsert them back once vforked
+	 child is done.  */
+      if (event_lwp->bp_reinsert != 0
+	  && supports_software_single_step ())
+	{
+	  /* The child process is forked and stopped, so it is safe
+	     to access its memory without stopping all other threads
+	     from other processes.  */
+	  delete_single_step_breakpoints (child_thr);
 
-      /* If we're suspending all threads, leave this one suspended
-	 too.  If the fork/clone parent is stepping over a breakpoint,
-	 all other threads have been suspended already.  Leave the
-	 child suspended too.  */
-      if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
-	  || event_lwp->bp_reinsert != 0)
-	new_lwp->suspended = 1;
+	  gdb_assert (has_single_step_breakpoints (event_thr));
+	  gdb_assert (!has_single_step_breakpoints (child_thr));
+	}
 
       /* Normally we will get the pending SIGSTOP.  But in some cases
 	 we might get another signal delivered to the group first.
 	 If we do get another signal, be sure not to lose it.  */
       if (WSTOPSIG (status) != SIGSTOP)
 	{
-	  new_lwp->stop_expected = 1;
-	  new_lwp->status_pending_p = 1;
-	  new_lwp->status_pending = status;
+	  child_lwp->stop_expected = 1;
+	  child_lwp->status_pending_p = 1;
+	  child_lwp->status_pending = status;
 	}
-      else if (cs.report_thread_events)
+      else if (event == PTRACE_EVENT_CLONE && cs.report_thread_events)
 	{
-	  new_lwp->waitstatus.set_thread_created ();
-	  new_lwp->status_pending_p = 1;
-	  new_lwp->status_pending = status;
+	  child_lwp->waitstatus.set_thread_created ();
+	  child_lwp->status_pending_p = 1;
+	  child_lwp->status_pending = status;
 	}
 
+      if (event == PTRACE_EVENT_CLONE)
+	{
 #ifdef USE_THREAD_DB
-      thread_db_notice_clone (event_thr, ptid);
+	  thread_db_notice_clone (event_thr, child_ptid);
 #endif
+	}
 
-      /* Don't report the event.  */
-      return 1;
+      if (event == PTRACE_EVENT_CLONE
+	  && (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) == 0)
+	{
+	  threads_debug_printf
+	    ("not reporting clone event from LWP %ld, new child is %ld\n",
+	     ptid_of (event_thr).lwp (),
+	     new_pid);
+	  return 1;
+	}
+
+      /* Leave the child stopped until GDB processes the parent
+	 event.  */
+      child_thr->last_resume_kind = resume_stop;
+      child_thr->last_status.set_stopped (GDB_SIGNAL_0);
+
+      /* Report the event.  */
+      threads_debug_printf
+	("reporting %s event from LWP %ld, new child is %ld\n",
+	 (event == PTRACE_EVENT_FORK ? "fork"
+	  : event == PTRACE_EVENT_VFORK ? "vfork"
+	  : event == PTRACE_EVENT_CLONE ? "clone"
+	  : "???"),
+	 ptid_of (event_thr).lwp (),
+	 new_pid);
+      return 0;
     }
   else if (event == PTRACE_EVENT_VFORK_DONE)
     {
@@ -3531,15 +3542,14 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
 
   if (event_child->waitstatus.kind () != TARGET_WAITKIND_IGNORE)
     {
-      /* If the reported event is an exit, fork, vfork or exec, let
-	 GDB know.  */
+      /* If the reported event is an exit, fork, vfork, clone or exec,
+	 let GDB know.  */
 
-      /* Break the unreported fork relationship chain.  */
-      if (event_child->waitstatus.kind () == TARGET_WAITKIND_FORKED
-	  || event_child->waitstatus.kind () == TARGET_WAITKIND_VFORKED)
+      /* Break the unreported fork/vfork/clone relationship chain.  */
+      if (is_new_child_status (event_child->waitstatus.kind ()))
 	{
-	  event_child->fork_relative->fork_relative = NULL;
-	  event_child->fork_relative = NULL;
+	  event_child->relative->relative = NULL;
+	  event_child->relative = NULL;
 	}
 
       *ourstatus = event_child->waitstatus;
@@ -4272,15 +4282,14 @@ linux_set_resume_request (thread_info *thread, thread_resume *resume, size_t n)
 	      continue;
 	    }
 
-	  /* Don't let wildcard resumes resume fork children that GDB
-	     does not yet know are new fork children.  */
-	  if (lwp->fork_relative != NULL)
+	  /* Don't let wildcard resumes resume fork/vfork/clone
+	     children that GDB does not yet know are new children.  */
+	  if (lwp->relative != NULL)
 	    {
-	      struct lwp_info *rel = lwp->fork_relative;
+	      struct lwp_info *rel = lwp->relative;
 
 	      if (rel->status_pending_p
-		  && (rel->waitstatus.kind () == TARGET_WAITKIND_FORKED
-		      || rel->waitstatus.kind () == TARGET_WAITKIND_VFORKED))
+		  && is_new_child_status (rel->waitstatus.kind ()))
 		{
 		  threads_debug_printf
 		    ("not resuming LWP %ld: has queued stop reply",
@@ -5907,6 +5916,14 @@ linux_process_target::supports_vfork_events ()
   return true;
 }
 
+/* Return the set of supported thread options.  */
+
+gdb_thread_options
+linux_process_target::supported_thread_options ()
+{
+  return GDB_THREAD_OPTION_CLONE;
+}
+
 /* Check if exec events are supported.  */
 
 bool
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index f7cedf6706b..94093dd4ed8 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -234,6 +234,8 @@ class linux_process_target : public process_stratum_target
 
   bool supports_vfork_events () override;
 
+  gdb_thread_options supported_thread_options () override;
+
   bool supports_exec_events () override;
 
   void handle_new_gdb_connection () override;
@@ -732,48 +734,47 @@ struct pending_signal
 
 struct lwp_info
 {
-  /* If this LWP is a fork child that wasn't reported to GDB yet, return
-     its parent, else nullptr.  */
+  /* If this LWP is a fork/vfork/clone child that wasn't reported to
+     GDB yet, return its parent, else nullptr.  */
   lwp_info *pending_parent () const
   {
-    if (this->fork_relative == nullptr)
+    if (this->relative == nullptr)
       return nullptr;
 
-    gdb_assert (this->fork_relative->fork_relative == this);
+    gdb_assert (this->relative->relative == this);
 
-    /* In a fork parent/child relationship, the parent has a status pending and
+    /* In a parent/child relationship, the parent has a status pending and
        the child does not, and a thread can only be in one such relationship
        at most.  So we can recognize who is the parent based on which one has
        a pending status.  */
     gdb_assert (!!this->status_pending_p
-		!= !!this->fork_relative->status_pending_p);
+		!= !!this->relative->status_pending_p);
 
-    if (!this->fork_relative->status_pending_p)
+    if (!this->relative->status_pending_p)
       return nullptr;
 
     const target_waitstatus &ws
-      = this->fork_relative->waitstatus;
+      = this->relative->waitstatus;
     gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
 		|| ws.kind () == TARGET_WAITKIND_VFORKED);
 
-    return this->fork_relative;
-  }
+    return this->relative; }
 
-  /* If this LWP is the parent of a fork child we haven't reported to GDB yet,
-     return that child, else nullptr.  */
+  /* If this LWP is the parent of a fork/vfork/clone child we haven't
+     reported to GDB yet, return that child, else nullptr.  */
   lwp_info *pending_child () const
   {
-    if (this->fork_relative == nullptr)
+    if (this->relative == nullptr)
       return nullptr;
 
-    gdb_assert (this->fork_relative->fork_relative == this);
+    gdb_assert (this->relative->relative == this);
 
-    /* In a fork parent/child relationship, the parent has a status pending and
+    /* In a parent/child relationship, the parent has a status pending and
        the child does not, and a thread can only be in one such relationship
        at most.  So we can recognize who is the parent based on which one has
        a pending status.  */
     gdb_assert (!!this->status_pending_p
-		!= !!this->fork_relative->status_pending_p);
+		!= !!this->relative->status_pending_p);
 
     if (!this->status_pending_p)
       return nullptr;
@@ -782,7 +783,7 @@ struct lwp_info
     gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
 		|| ws.kind () == TARGET_WAITKIND_VFORKED);
 
-    return this->fork_relative;
+    return this->relative;
   }
 
   /* Backlink to the parent object.  */
@@ -820,11 +821,13 @@ struct lwp_info
      information or exit status until it can be reported to GDB.  */
   struct target_waitstatus waitstatus;
 
-  /* A pointer to the fork child/parent relative.  Valid only while
-     the parent fork event is not reported to higher layers.  Used to
-     avoid wildcard vCont actions resuming a fork child before GDB is
-     notified about the parent's fork event.  */
-  struct lwp_info *fork_relative = nullptr;
+  /* A pointer to the fork/vfork/clone child/parent relative (like
+     people, LWPs have relatives).  Valid only while the parent
+     fork/vfork/clone event is not reported to higher layers.  Used to
+     avoid wildcard vCont actions resuming a fork/vfork/clone child
+     before GDB is notified about the parent's fork/vfork/clone
+     event.  */
+  struct lwp_info *relative = nullptr;
 
   /* When stopped is set, this is where the lwp last stopped, with
      decr_pc_after_break already accounted for.  If the LWP is
-- 
2.34.1



* [FYI/pushed v4 09/25] gdbserver: Hide and don't detach pending clone children
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (7 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 10/25] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit extends the logic added by these two commits from a while
ago:

 #1  7b961964f866  (gdbserver: hide fork child threads from GDB),
 #2  df5ad102009c  (gdb, gdbserver: detach fork child when detaching from fork parent)

... to handle thread clone events, which are very similar to (v)fork
events.

For #1, we want to hide clone children as well, so just update the
comments.

For #2, unlike (v)fork children, pending clone children aren't full
processes; they're just threads, so don't detach them in
handle_detach.  linux-low.cc will take care of detaching them along
with all other threads of the process, so nothing special needs to be
done.
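
The check that handle_detach ends up making can be sketched in
isolation as follows.  This is only an illustration: the waitkind enum,
the pending_child_info struct and must_detach_pending_child are
hypothetical names, not gdbserver's:

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant target_waitkind values.
enum class waitkind { forked, vforked, thread_cloned };

// Hypothetical result of asking a thread for its pending child.
struct pending_child_info
{
  bool has_child;  // false when there is no unreported child
  waitkind kind;   // only meaningful when has_child is true
};

// Return true if detaching from the parent must also explicitly
// detach the pending child.  Only (v)fork children qualify, since
// they are separate processes; a clone child is just another thread
// of the process being detached.
bool
must_detach_pending_child (const pending_child_info &info)
{
  if (!info.has_child)
    return false;

  return info.kind != waitkind::thread_cloned;
}
```

This is why thread_pending_child now also fills in the waitkind: the
caller needs it to tell a pending thread from a pending process.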

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I7f5901d07efda576a2522d03e183994e071b8ffc
---
 gdbserver/linux-low.cc |  5 +++--
 gdbserver/linux-low.h  | 15 ++++++++++-----
 gdbserver/server.cc    | 12 +++++++-----
 gdbserver/target.cc    |  3 ++-
 gdbserver/target.h     | 16 +++++++++-------
 5 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 136a8b6c9a1..7a4f8758ae9 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -6951,9 +6951,10 @@ linux_process_target::thread_pending_parent (thread_info *thread)
 }
 
 thread_info *
-linux_process_target::thread_pending_child (thread_info *thread)
+linux_process_target::thread_pending_child (thread_info *thread,
+					    target_waitkind *kind)
 {
-  lwp_info *child = get_thread_lwp (thread)->pending_child ();
+  lwp_info *child = get_thread_lwp (thread)->pending_child (kind);
 
   if (child == nullptr)
     return nullptr;
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index 94093dd4ed8..b5ff9391198 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -315,7 +315,8 @@ class linux_process_target : public process_stratum_target
 #endif
 
   thread_info *thread_pending_parent (thread_info *thread) override;
-  thread_info *thread_pending_child (thread_info *thread) override;
+  thread_info *thread_pending_child (thread_info *thread,
+				     target_waitkind *kind) override;
 
   bool supports_catch_syscall () override;
 
@@ -756,13 +757,15 @@ struct lwp_info
     const target_waitstatus &ws
       = this->relative->waitstatus;
     gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
-		|| ws.kind () == TARGET_WAITKIND_VFORKED);
+		|| ws.kind () == TARGET_WAITKIND_VFORKED
+		|| ws.kind () == TARGET_WAITKIND_THREAD_CLONED);
 
     return this->relative; }
 
   /* If this LWP is the parent of a fork/vfork/clone child we haven't
-     reported to GDB yet, return that child, else nullptr.  */
-  lwp_info *pending_child () const
+     reported to GDB yet, return that child and fill in KIND with the
+     matching waitkind, otherwise nullptr.  */
+  lwp_info *pending_child (target_waitkind *kind) const
   {
     if (this->relative == nullptr)
       return nullptr;
@@ -781,8 +784,10 @@ struct lwp_info
 
     const target_waitstatus &ws = this->waitstatus;
     gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
-		|| ws.kind () == TARGET_WAITKIND_VFORKED);
+		|| ws.kind () == TARGET_WAITKIND_VFORKED
+		|| ws.kind () == TARGET_WAITKIND_THREAD_CLONED);
 
+    *kind = ws.kind ();
     return this->relative;
   }
 
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index c24a5c9fb96..2a70ca63cbd 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -1349,8 +1349,9 @@ handle_detach (char *own_buf)
 	continue;
 
       /* Only threads that have a pending fork event.  */
-      thread_info *child = target_thread_pending_child (thread);
-      if (child == nullptr)
+      target_waitkind kind;
+      thread_info *child = target_thread_pending_child (thread, &kind);
+      if (child == nullptr || kind == TARGET_WAITKIND_THREAD_CLONED)
 	continue;
 
       process_info *fork_child_process = get_thread_process (child);
@@ -1771,9 +1772,10 @@ handle_qxfer_threads_worker (thread_info *thread, std::string *buffer)
   gdb_byte *handle;
   bool handle_status = target_thread_handle (ptid, &handle, &handle_len);
 
-  /* If this is a fork or vfork child (has a fork parent), GDB does not yet
-     know about this process, and must not know about it until it gets the
-     corresponding (v)fork event.  Exclude this thread from the list.  */
+  /* If this is a (v)fork/clone child (has a (v)fork/clone parent),
+     GDB does not yet know about this thread, and must not know about
+     it until it gets the corresponding (v)fork/clone event.  Exclude
+     this thread from the list.  */
   if (target_thread_pending_parent (thread) != nullptr)
     return;
 
diff --git a/gdbserver/target.cc b/gdbserver/target.cc
index 1c740bbf583..dbb4e2d9024 100644
--- a/gdbserver/target.cc
+++ b/gdbserver/target.cc
@@ -816,7 +816,8 @@ process_stratum_target::thread_pending_parent (thread_info *thread)
 }
 
 thread_info *
-process_stratum_target::thread_pending_child (thread_info *thread)
+process_stratum_target::thread_pending_child (thread_info *thread,
+					      target_waitkind *kind)
 {
   return nullptr;
 }
diff --git a/gdbserver/target.h b/gdbserver/target.h
index 8893e0a6a8b..0f1fd5906fb 100644
--- a/gdbserver/target.h
+++ b/gdbserver/target.h
@@ -478,13 +478,15 @@ class process_stratum_target
   virtual bool thread_handle (ptid_t ptid, gdb_byte **handle,
 			      int *handle_len);
 
-  /* If THREAD is a fork child that was not reported to GDB, return its parent
-     else nullptr.  */
+  /* If THREAD is a fork/vfork/clone child that was not reported to
+     GDB, return its parent else nullptr.  */
   virtual thread_info *thread_pending_parent (thread_info *thread);
 
-  /* If THREAD is the parent of a fork child that was not reported to GDB,
-     return this child, else nullptr.  */
-  virtual thread_info *thread_pending_child (thread_info *thread);
+  /* If THREAD is the parent of a fork/vfork/clone child that was not
+     reported to GDB, return this child and fill in KIND with the
+     matching waitkind, otherwise nullptr.  */
+  virtual thread_info *thread_pending_child (thread_info *thread,
+					     target_waitkind *kind);
 
   /* Returns true if the target can software single step.  */
   virtual bool supports_software_single_step ();
@@ -700,9 +702,9 @@ target_thread_pending_parent (thread_info *thread)
 }
 
 static inline thread_info *
-target_thread_pending_child (thread_info *thread)
+target_thread_pending_child (thread_info *thread, target_waitkind *kind)
 {
-  return the_target->thread_pending_child (thread);
+  return the_target->thread_pending_child (thread, kind);
 }
 
 /* Read LEN bytes from MEMADDR in the buffer MYADDR.  Return 0 if the read
-- 
2.34.1



* [FYI/pushed v4 10/25] Remove gdb/19675 kfails (displaced stepping + clone)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (8 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 09/25] gdbserver: Hide and don't detach pending clone children Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 11/25] all-stop/synchronous RSP support thread-exit events Pedro Alves
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Now that gdb/19675 is fixed for both native and gdbserver GNU/Linux,
remove the gdb/19675 kfails.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I95c1c38ca370100675d303cd3c8995860bef465d
---
 gdb/testsuite/gdb.base/step-over-syscall.exp | 44 ++------------------
 1 file changed, 3 insertions(+), 41 deletions(-)

diff --git a/gdb/testsuite/gdb.base/step-over-syscall.exp b/gdb/testsuite/gdb.base/step-over-syscall.exp
index 1f1e62296c6..a03ed2d455c 100644
--- a/gdb/testsuite/gdb.base/step-over-syscall.exp
+++ b/gdb/testsuite/gdb.base/step-over-syscall.exp
@@ -42,46 +42,17 @@ if { [istarget "i\[34567\]86-*-linux*"] || [istarget "x86_64-*-linux*"] } {
 }
 
 proc_with_prefix check_pc_after_cross_syscall { displaced syscall syscall_insn_next_addr } {
-    global gdb_prompt
-
     set syscall_insn_next_addr_found [get_hexadecimal_valueof "\$pc" "0"]
 
     # After the 'stepi' we expect thread 1 to still be selected.
-    # However, when displaced stepping over a clone bug gdb/19675
-    # means this might not be the case.
-    #
-    # Which thread we end up in depends on a race between the original
-    # thread-1, and the new thread (created by the clone), so we can't
-    # guarantee which thread we will be in at this point.
-    #
-    # For the fork/vfork syscalls, which are correctly handled by
-    # displaced stepping we will always be in thread-1 or the original
-    # process at this point.
     set curr_thread "unknown"
-    gdb_test_multiple "info threads" "" {
-	-re "Id\\s+Target Id\\s+Frame\\s*\r\n" {
-	    exp_continue
-	}
-	-re "^\\* (\\d+)\\s+\[^\r\n\]+\r\n" {
+    gdb_test_multiple "thread" "" {
+	-re -wrap "Current thread is (\\d+) .*" {
 	    set curr_thread $expect_out(1,string)
-	    exp_continue
-	}
-	-re "^\\s+\\d+\\s+\[^\r\n\]+\r\n" {
-	    exp_continue
-	}
-	-re "$gdb_prompt " {
+	    pass $gdb_test_name
 	}
     }
 
-    # If we are displaced stepping over a clone, and we ended up in
-    # the wrong thread then the following check of the $pc value will
-    # fail.
-    if { $displaced == "on" && $syscall == "clone" && $curr_thread != 1 } {
-	# GDB doesn't support stepping over clone syscall with
-	# displaced stepping.
-	setup_kfail "*-*-*" "gdb/19675"
-    }
-
     gdb_assert {$syscall_insn_next_addr != 0 \
       && $syscall_insn_next_addr == $syscall_insn_next_addr_found \
       && $curr_thread == 1} \
@@ -299,15 +270,6 @@ proc step_over_syscall { syscall } {
 
 	    gdb_test "break marker" "Breakpoint.*at.* file .*${testfile}.c, line.*"
 
-	    # If we are displaced stepping over a clone syscall then
-	    # we expect the following check to fail.  See also the
-	    # code in check_pc_after_cross_syscall.
-	    if { $displaced == "on" && $syscall == "clone" } {
-		# GDB doesn't support stepping over clone syscall with
-		# displaced stepping.
-		setup_kfail "*-*-*" "gdb/19675"
-	    }
-
 	    gdb_test "continue" "Continuing\\..*Breakpoint \[0-9\]+, marker \\(\\) at.*" \
 		"continue to marker ($syscall)"
 	}
-- 
2.34.1



* [FYI/pushed v4 11/25] all-stop/synchronous RSP support thread-exit events
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (9 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 10/25] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 12/25] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches

Currently, GDB does not understand the THREAD_EXITED stop reply in
remote all-stop mode.  There's no good reason for this; it just
happened that THREAD_EXITED was only ever reported in non-stop mode so
far.  This patch teaches GDB to parse that event in all-stop RSP too.
There is no need to add a qSupported feature for this, because the
server won't send a THREAD_EXITED event unless GDB explicitly asks for
it, with QThreadEvents, or with the GDB_THREAD_OPTION_EXIT
QThreadOptions option added in the next patch.
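
As a rough standalone illustration of the dispatch change (not the
remote.c code itself), the set of leading packet letters treated as a
stop reply in all-stop mode can be sketched like this.  The
looks_like_stop_reply name is hypothetical, and the sketch only covers
the letters listed in the switch touched by this patch:

```cpp
#include <cassert>
#include <string>

// Return true if PACKET starts with one of the stop-reply letters
// accepted by the all-stop wait path after this patch.
bool
looks_like_stop_reply (const std::string &packet)
{
  if (packet.empty ())
    return false;

  switch (packet[0])
    {
    case 'N': case 'T': case 'S': case 'X': case 'W':
    case 'w':  // thread exited; newly accepted in all-stop mode
      return true;
    default:
      return false;
    }
}
```

Note the case sensitivity: 'W' and 'w' are distinct stop replies, and
only the lowercase one is new here.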

Change-Id: Ide5d12391adf432779fe4c79526801c4a5630966
---
 gdb/remote.c        | 7 ++++++-
 gdbserver/server.cc | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index 991a4344c7f..eb537fc48a5 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -8415,6 +8415,11 @@ remote_target::process_stop_reply (struct stop_reply *stop_reply,
       /* Expedited registers.  */
       if (!stop_reply->regcache.empty ())
 	{
+	  /* 'w' stop replies don't carry expedited registers (which
+	     wouldn't make any sense for a thread that is gone
+	     already).  */
+	  gdb_assert (status->kind () != TARGET_WAITKIND_THREAD_EXITED);
+
 	  struct regcache *regcache
 	    = get_thread_arch_regcache (this, ptid, stop_reply->arch);
 
@@ -8599,7 +8604,7 @@ remote_target::wait_as (ptid_t ptid, target_waitstatus *status,
 	     again.  Keep waiting for events.  */
 	  rs->waiting_for_stop_reply = 1;
 	  break;
-	case 'N': case 'T': case 'S': case 'X': case 'W':
+	case 'N': case 'T': case 'S': case 'X': case 'W': case 'w':
 	  {
 	    /* There is a stop reply to handle.  */
 	    rs->waiting_for_stop_reply = 0;
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 2a70ca63cbd..4a312da40bc 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -3045,6 +3045,7 @@ resume (struct thread_resume *actions, size_t num_actions)
 
       if (cs.last_status.kind () != TARGET_WAITKIND_EXITED
 	  && cs.last_status.kind () != TARGET_WAITKIND_SIGNALLED
+	  && cs.last_status.kind () != TARGET_WAITKIND_THREAD_EXITED
 	  && cs.last_status.kind () != TARGET_WAITKIND_NO_RESUMED)
 	current_thread->last_status = cs.last_status;
 
-- 
2.34.1



* [FYI/pushed v4 12/25] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (10 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 11/25] all-stop/synchronous RSP support thread-exit events Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 13/25] Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core Pedro Alves
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches

gdbserver's linux_process_target::wait loops if:

 - called in sync mode, and,
 - wait_1 returns TARGET_WAITKIND_IGNORE, _and_,
 - wait_1 also returns null_ptid.

However, the null_ptid check fails when this path is taken:

   ptid_t
   linux_process_target::filter_exit_event (lwp_info *event_child,
					    target_waitstatus *ourstatus)
   {
   ...
     if (!is_leader (thread))
       {
	 if (report_exit_events_for (thread))
	   ourstatus->set_thread_exited (0);
	 else
	   ourstatus->set_ignore ();            <<<<<<<

	 delete_lwp (event_child);
       }
     return ptid;
   }

This makes linux_process_target::wait return TARGET_WAITKIND_IGNORE in
sync mode, which is unexpected by the core and fails an assertion.

This commit fixes it by just making linux_process_target::wait loop if
it got a TARGET_WAITKIND_IGNORE, irrespective of event_ptid.
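
To make the difference concrete, here is a minimal standalone sketch of
the old and new loop conditions.  It is not gdbserver code: wait_kind,
wait_result and the two should_retry_* functions are names invented for
this illustration, and ptid_is_null merely models the
"event_ptid == null_ptid" comparison.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not gdbserver's types.
enum class wait_kind { ignore, stopped, thread_exited };

struct wait_result
{
  wait_kind kind;
  bool ptid_is_null;   // models "event_ptid == null_ptid"
};

// Old condition: loop only if the status is "ignore" AND the ptid is
// null.  A non-leader exit that got filtered to "ignore" carries a
// real ptid, so it escaped the loop.
bool
should_retry_old (bool wnohang, const wait_result &r)
{
  return !wnohang && r.ptid_is_null && r.kind == wait_kind::ignore;
}

// New condition: loop whenever the status is "ignore", whatever the
// ptid is.
bool
should_retry_new (bool wnohang, const wait_result &r)
{
  return !wnohang && r.kind == wait_kind::ignore;
}
```

With a filtered exit event (kind "ignore", non-null ptid) in sync mode,
the old predicate stops looping and the ignore status leaks to the
caller, while the new predicate keeps waiting.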

Change-Id: I39776908a6c75cbd68aa04139ffcf7be334868cf
---
 gdbserver/linux-low.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 7a4f8758ae9..ca2b7aa1e1b 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -3643,7 +3643,6 @@ linux_process_target::wait (ptid_t ptid,
       event_ptid = wait_1 (ptid, ourstatus, target_options);
     }
   while ((target_options & TARGET_WNOHANG) == 0
-	 && event_ptid == null_ptid
 	 && ourstatus->kind () == TARGET_WAITKIND_IGNORE);
 
   /* If at least one stop was reported, there may be more.  A single
-- 
2.34.1


* [FYI/pushed v4 13/25] Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (11 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 12/25] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 14/25] Introduce GDB_THREAD_OPTION_EXIT thread option, fix step-over-thread-exit Pedro Alves
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Currently, infrun assumes that when TARGET_WAITKIND_THREAD_EXITED is
reported, the corresponding GDB thread has already been removed from
the GDB thread list.

Later in the series, that will no longer work, as infrun will need to
refer to the thread's thread_info when it processes
TARGET_WAITKIND_THREAD_EXITED.

As preparation, this patch makes deleting the GDB thread the
responsibility of infrun, instead of the target.
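
The ownership change can be pictured with a small standalone model.
This is only a sketch under made-up names (thread_model, thread_list,
target_report_thread_exit, core_handle_thread_exit); it does not use
GDB's thread_info or target APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-thread state the core wants to inspect on exit.
struct thread_model
{
  std::string step_state;
};

// Hypothetical thread list, keyed by LWP id.
std::map<int, thread_model> thread_list;

// Target side: report the exit, but leave the thread entry alone.
int
target_report_thread_exit (int lwp)
{
  return lwp;
}

// Core side: the entry must still exist when the event arrives.  Read
// what is needed from it, and only then delete it.
std::string
core_handle_thread_exit (int lwp)
{
  auto it = thread_list.find (lwp);
  assert (it != thread_list.end ());
  std::string state = it->second.step_state;
  thread_list.erase (it);
  return state;
}
```

If the target had erased the entry before reporting, the lookup in the
core-side function would have nothing to read, which is the situation
this patch prepares to avoid.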

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I013d87f61ffc9aaca49f0d6ce2a43e3ea69274de
---
 gdb/infrun.c    | 32 +++++++++++++++++++++++++++-----
 gdb/linux-nat.c | 21 ++++++++++++++-------
 2 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 03eb32a68c4..e1e761cf48f 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4352,7 +4352,12 @@ reinstall_readline_callback_handler_cleanup ()
 }
 
 /* Clean up the FSMs of threads that are now stopped.  In non-stop,
-   that's just the event thread.  In all-stop, that's all threads.  */
+   that's just the event thread.  In all-stop, that's all threads.  In
+   all-stop, threads that had a pending exit no longer have a reason
+   to be around, as their FSMs/commands are canceled, so we delete
+   them.  This avoids "info threads" listing such threads as if they
+   were alive (and failing to read their registers), the user being
+   able to select and resume them (and that failing), etc.  */
 
 static void
 clean_up_just_stopped_threads_fsms (struct execution_control_state *ecs)
@@ -4370,15 +4375,29 @@ clean_up_just_stopped_threads_fsms (struct execution_control_state *ecs)
     {
       scoped_restore_current_thread restore_thread;
 
-      for (thread_info *thr : all_non_exited_threads ())
+      for (thread_info *thr : all_threads_safe ())
 	{
-	  if (thr->thread_fsm () == nullptr)
+	  if (thr->state == THREAD_EXITED)
 	    continue;
+
 	  if (thr == ecs->event_thread)
 	    continue;
 
-	  switch_to_thread (thr);
-	  thr->thread_fsm ()->clean_up (thr);
+	  if (thr->thread_fsm () != nullptr)
+	    {
+	      switch_to_thread (thr);
+	      thr->thread_fsm ()->clean_up (thr);
+	    }
+
+	  /* As we are cancelling the command/FSM of this thread,
+	     whatever was the reason we needed to report a thread
+	     exited event to the user, that reason is gone.  Delete
+	     the thread, so that the user doesn't see it in the thread
+	     list, the next proceed doesn't try to resume it, etc.  */
+	  if (thr->has_pending_waitstatus ()
+	      && (thr->pending_waitstatus ().kind ()
+		  == TARGET_WAITKIND_THREAD_EXITED))
+	    delete_thread (thr);
 	}
     }
 }
@@ -5728,6 +5747,9 @@ handle_inferior_event (struct execution_control_state *ecs)
 
   if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_EXITED)
     {
+      ecs->event_thread = ecs->target->find_thread (ecs->ptid);
+      gdb_assert (ecs->event_thread != nullptr);
+      delete_thread (ecs->event_thread);
       prepare_to_wait (ecs);
       return;
     }
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 5bbdabc241a..1e224e03ecb 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -898,14 +898,15 @@ linux_nat_switch_fork (ptid_t new_ptid)
   registers_changed ();
 }
 
-/* Handle the exit of a single thread LP.  */
+/* Handle the exit of a single thread LP.  If DEL_THREAD is true,
+   delete the thread_info associated to LP, if it exists.  */
 
 static void
-exit_lwp (struct lwp_info *lp)
+exit_lwp (struct lwp_info *lp, bool del_thread = true)
 {
   struct thread_info *th = linux_target->find_thread (lp->ptid);
 
-  if (th)
+  if (th != nullptr && del_thread)
     delete_thread (th);
 
   delete_lwp (lp->ptid);
@@ -3155,11 +3156,17 @@ filter_exit_event (struct lwp_info *event_child,
   if (!is_leader (event_child))
     {
       if (report_thread_events)
-	ourstatus->set_thread_exited (0);
+	{
+	  ourstatus->set_thread_exited (0);
+	  /* Delete lwp, but not thread_info, infrun will need it to
+	     process the event.  */
+	  exit_lwp (event_child, false);
+	}
       else
-	ourstatus->set_ignore ();
-
-      exit_lwp (event_child);
+	{
+	  ourstatus->set_ignore ();
+	  exit_lwp (event_child);
+	}
     }
 
   return ptid;
-- 
2.34.1


* [FYI/pushed v4 14/25] Introduce GDB_THREAD_OPTION_EXIT thread option, fix step-over-thread-exit
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (12 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 13/25] Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 15/25] Implement GDB_THREAD_OPTION_EXIT support for Linux GDBserver Pedro Alves
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When stepping over a breakpoint with displaced stepping, GDB needs to
be informed if the stepped thread exits, otherwise the displaced
stepping buffer that was allocated to that thread leaks, and this can
result in deadlock, with other threads waiting for their turn to
displaced step, but their turn never comes.

Similarly, when stepping over a breakpoint in line, GDB also needs to
be informed if the stepped thread exits, so that it can clear the step
over state and re-resume threads.

This commit makes it possible for GDB to ask the target to report
thread exit events for a given thread, using the new "thread options"
mechanism introduced by a previous patch.

This only adds the core bits.  Following patches in the series will
teach the Linux backends (native & gdbserver) to handle the
GDB_THREAD_OPTION_EXIT option, and then a later patch will make use of
these thread exit events to clean up displaced stepping and inline
stepping state properly.
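
For readers unfamiliar with the thread options mechanism, the bit-flag
logic boils down to something like the sketch below.  It is a
simplified model, not GDB code: the enumerator and function names are
invented here, plain unsigned values replace GDB's enum-flags type, and
returning 0 stands in for "fall back to enabling thread events for all
threads".

```cpp
#include <cassert>

// Simplified model of per-thread option bits.
enum thread_option : unsigned
{
  OPTION_CLONE = 1u << 0,
  OPTION_EXIT = 1u << 1,
};

// True if every bit in REQUESTED is also set in SUPPORTED.
bool
supports_options (unsigned supported, unsigned requested)
{
  return (requested & supported) == requested;
}

// Options to set on a thread that is about to step over a breakpoint.
// Returns 0 if the target cannot honor both options, in which case
// the caller would have to use the global thread-events switch.
unsigned
options_for_step_over (unsigned supported)
{
  unsigned wanted = OPTION_CLONE | OPTION_EXIT;
  return supports_options (supported, wanted) ? wanted : 0;
}
```

Note that asking whether an empty option set is supported always
succeeds in this model.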

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I96b719fdf7fee94709e98bb3a90751d8134f3a38
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
---
 gdb/infrun.c        | 15 ++++++++++-----
 gdb/remote.c        |  9 +++++++++
 gdb/target/target.c |  1 +
 gdb/target/target.h |  4 ++++
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index e1e761cf48f..0a189d0a485 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -2516,24 +2516,29 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
   else
     target_pass_signals (signal_pass);
 
-  /* Request that the target report thread-{created,cloned} events in
-     the following situations:
+  /* Request that the target report thread-{created,cloned,exited}
+     events in the following situations:
 
      - If we are performing an in-line step-over-breakpoint, then we
        will remove a breakpoint from the target and only run the
        current thread.  We don't want any new thread (spawned by the
-       step) to start running, as it might miss the breakpoint.
+       step) to start running, as it might miss the breakpoint.  We
+       need to clear the step-over state if the stepped thread exits,
+       so we also enable thread-exit events.
 
      - If we are stepping over a breakpoint out of line (displaced
        stepping) then we won't remove a breakpoint from the target,
        but, if the step spawns a new clone thread, then we will need
        to fixup the $pc address in the clone child too, so we need it
-       to start stopped.
+       to start stopped.  We need to release the displaced stepping
+       buffer if the stepped thread exits, so we also enable
+       thread-exit events.
   */
   if (step_over_info_valid_p ()
       || displaced_step_in_progress_thread (tp))
     {
-      gdb_thread_options options = GDB_THREAD_OPTION_CLONE;
+      gdb_thread_options options
+	= GDB_THREAD_OPTION_CLONE | GDB_THREAD_OPTION_EXIT;
       if (target_supports_set_thread_options (options))
 	tp->set_thread_options (options);
       else
diff --git a/gdb/remote.c b/gdb/remote.c
index eb537fc48a5..ce5addade6f 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -4206,6 +4206,15 @@ remote_target::update_thread_list ()
 	      if (has_single_non_exited_thread (tp->inf))
 		continue;
 
+	      /* Do not remove the thread if we've requested to be
+		 notified of its exit.  For example, the thread may be
+		 displaced stepping, infrun will need to handle the
+		 exit event, and displaced stepping info is recorded
+		 in the thread object.  If we deleted the thread now,
+		 we'd lose that info.  */
+	      if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
+		continue;
+
 	      /* Not found.  */
 	      delete_thread (tp);
 	    }
diff --git a/gdb/target/target.c b/gdb/target/target.c
index 3af7d73df5a..58d0f63c872 100644
--- a/gdb/target/target.c
+++ b/gdb/target/target.c
@@ -196,6 +196,7 @@ to_string (gdb_thread_options options)
 {
   static constexpr gdb_thread_options::string_mapping mapping[] = {
     MAP_ENUM_FLAG (GDB_THREAD_OPTION_CLONE),
+    MAP_ENUM_FLAG (GDB_THREAD_OPTION_EXIT),
   };
   return options.to_string (mapping);
 }
diff --git a/gdb/target/target.h b/gdb/target/target.h
index 2691f92e4ef..bad4daa22a1 100644
--- a/gdb/target/target.h
+++ b/gdb/target/target.h
@@ -34,6 +34,10 @@ enum gdb_thread_option : unsigned
   /* Tell the target to report TARGET_WAITKIND_THREAD_CLONED events
      for the thread.  */
   GDB_THREAD_OPTION_CLONE = 1 << 0,
+
+  /* Tell the target to report TARGET_WAITKIND_THREAD_EXITED events
+     for the thread.  */
+  GDB_THREAD_OPTION_EXIT = 1 << 1,
 };
 
 DEF_ENUM_FLAGS_TYPE (enum gdb_thread_option, gdb_thread_options);
-- 
2.34.1


* [FYI/pushed v4 15/25] Implement GDB_THREAD_OPTION_EXIT support for Linux GDBserver
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (13 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 14/25] Introduce GDB_THREAD_OPTION_EXIT thread option, fix step-over-thread-exit Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 16/25] Implement GDB_THREAD_OPTION_EXIT support for native Linux Pedro Alves
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This implements support for the new GDB_THREAD_OPTION_EXIT thread
option for Linux GDBserver.
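
The decision that filter_exit_event makes after this patch can be
summarized with the following standalone sketch.  It is a model only:
event_kind and filter_exit_event_model are names made up for the
illustration, and report_exits stands for the combination of the global
thread-events setting and the per-thread exit option.

```cpp
#include <cassert>

// Hypothetical event kinds for illustration only.
enum class event_kind { exited, signalled, stopped, thread_exited, ignore };

event_kind
filter_exit_event_model (event_kind kind, bool is_leader, bool report_exits)
{
  // Anything that is not an exit or a termination by signal passes
  // through unmodified.
  if (kind != event_kind::exited && kind != event_kind::signalled)
    return kind;

  // The leader's event is left as is.
  if (is_leader)
    return kind;

  // A non-leader exit becomes either a thread-exit event or nothing.
  return report_exits ? event_kind::thread_exited : event_kind::ignore;
}
```

In particular, a non-leader thread killed by a signal is turned into a
thread-exit event (or suppressed) rather than being reported as if the
whole process had been signalled.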

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I96b719fdf7fee94709e98bb3a90751d8134f3a38
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
---
 gdbserver/linux-low.cc | 38 +++++++++++++++++++++++++-------------
 gdbserver/linux-low.h  |  9 +++++----
 2 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index ca2b7aa1e1b..b4a1191b5b4 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -144,6 +144,18 @@ is_leader (thread_info *thread)
   return ptid.pid () == ptid.lwp ();
 }
 
+/* Return true if we should report thread exit events to GDB, for
+   THR.  */
+
+static bool
+report_exit_events_for (thread_info *thr)
+{
+  client_state &cs = get_client_state ();
+
+  return (cs.report_thread_events
+	  || (thr->thread_options & GDB_THREAD_OPTION_EXIT) != 0);
+}
+
 /* LWP accessors.  */
 
 /* See nat/linux-nat.h.  */
@@ -2233,7 +2245,6 @@ linux_low_ptrace_options (int attached)
 void
 linux_process_target::filter_event (int lwpid, int wstat)
 {
-  client_state &cs = get_client_state ();
   struct lwp_info *child;
   struct thread_info *thread;
   int have_stop_pc = 0;
@@ -2320,7 +2331,7 @@ linux_process_target::filter_event (int lwpid, int wstat)
       /* If this is not the leader LWP, then the exit signal was not
 	 the end of the debugged application and should be ignored,
 	 unless GDB wants to hear about thread exits.  */
-      if (cs.report_thread_events || is_leader (thread))
+      if (report_exit_events_for (thread) || is_leader (thread))
 	{
 	  /* Since events are serialized to GDB core, and we can't
 	     report this one right now.  Leave the status pending for
@@ -2888,13 +2899,20 @@ ptid_t
 linux_process_target::filter_exit_event (lwp_info *event_child,
 					 target_waitstatus *ourstatus)
 {
-  client_state &cs = get_client_state ();
   struct thread_info *thread = get_lwp_thread (event_child);
   ptid_t ptid = ptid_of (thread);
 
+  /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
+     if a non-leader thread exits with a signal, we'd report it to the
+     core which would interpret it as the whole-process exiting.
+     There is no TARGET_WAITKIND_THREAD_SIGNALLED event kind.  */
+  if (ourstatus->kind () != TARGET_WAITKIND_EXITED
+      && ourstatus->kind () != TARGET_WAITKIND_SIGNALLED)
+    return ptid;
+
   if (!is_leader (thread))
     {
-      if (cs.report_thread_events)
+      if (report_exit_events_for (thread))
 	ourstatus->set_thread_exited (0);
       else
 	ourstatus->set_ignore ();
@@ -3037,10 +3055,7 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
 	     WTERMSIG (w));
 	}
 
-      if (ourstatus->kind () == TARGET_WAITKIND_EXITED)
-	return filter_exit_event (event_child, ourstatus);
-
-      return ptid_of (current_thread);
+      return filter_exit_event (event_child, ourstatus);
     }
 
   /* If step-over executes a breakpoint instruction, in the case of a
@@ -3607,10 +3622,7 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
 			target_pid_to_str (ptid_of (current_thread)).c_str (),
 			ourstatus->to_string ().c_str ());
 
-  if (ourstatus->kind () == TARGET_WAITKIND_EXITED)
-    return filter_exit_event (event_child, ourstatus);
-
-  return ptid_of (current_thread);
+  return filter_exit_event (event_child, ourstatus);
 }
 
 /* Get rid of any pending event in the pipe.  */
@@ -5920,7 +5932,7 @@ linux_process_target::supports_vfork_events ()
 gdb_thread_options
 linux_process_target::supported_thread_options ()
 {
-  return GDB_THREAD_OPTION_CLONE;
+  return GDB_THREAD_OPTION_CLONE | GDB_THREAD_OPTION_EXIT;
 }
 
 /* Check if exec events are supported.  */
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index b5ff9391198..3597e33289c 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -575,10 +575,11 @@ class linux_process_target : public process_stratum_target
      exited.  */
   void check_zombie_leaders ();
 
-  /* Convenience function that is called when the kernel reports an exit
-     event.  This decides whether to report the event to GDB as a
-     process exit event, a thread exit event, or to suppress the
-     event.  */
+  /* Convenience function that is called when we're about to return an
+     event to the core.  If the event is an exit or signalled event,
+     then this decides whether to report it as process-wide event, as
+     a thread exit event, or to suppress it.  All other event kinds
+     are passed through unmodified.  */
   ptid_t filter_exit_event (lwp_info *event_child,
 			    target_waitstatus *ourstatus);
 
-- 
2.34.1


* [FYI/pushed v4 16/25] Implement GDB_THREAD_OPTION_EXIT support for native Linux
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (14 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 15/25] Implement GDB_THREAD_OPTION_EXIT support for Linux GDBserver Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 17/25] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This implements support for the new GDB_THREAD_OPTION_EXIT thread
option for native Linux.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ia69fc0b9b96f9af7de7cefc1ddb1fba9bbb0bb90
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
---
 gdb/linux-nat.c | 44 +++++++++++++++++++++++++++++++-------------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 1e224e03ecb..8951b34e192 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -268,6 +268,18 @@ pending_status_str (lwp_info *lp)
     return status_to_str (lp->status);
 }
 
+/* Return true if we should report exit events for LP.  */
+
+static bool
+report_exit_events_for (lwp_info *lp)
+{
+  thread_info *thr = linux_target->find_thread (lp->ptid);
+  gdb_assert (thr != nullptr);
+
+  return (report_thread_events
+	  || (thr->thread_options () & GDB_THREAD_OPTION_EXIT) != 0);
+}
+
 \f
 /* LWP accessors.  */
 
@@ -2148,8 +2160,7 @@ wait_lwp (struct lwp_info *lp)
       /* Check if the thread has exited.  */
       if (WIFEXITED (status) || WIFSIGNALED (status))
 	{
-	  if (report_thread_events
-	      || lp->ptid.pid () == lp->ptid.lwp ())
+	  if (report_exit_events_for (lp) || is_leader (lp))
 	    {
 	      linux_nat_debug_printf ("LWP %d exited.", lp->ptid.pid ());
 
@@ -2932,7 +2943,7 @@ linux_nat_filter_event (int lwpid, int status)
   /* Check if the thread has exited.  */
   if (WIFEXITED (status) || WIFSIGNALED (status))
     {
-      if (!report_thread_events && !is_leader (lp))
+      if (!report_exit_events_for (lp) && !is_leader (lp))
 	{
 	  linux_nat_debug_printf ("%s exited.",
 				  lp->ptid.to_string ().c_str ());
@@ -3142,10 +3153,11 @@ check_zombie_leaders (void)
     }
 }
 
-/* Convenience function that is called when the kernel reports an exit
-   event.  This decides whether to report the event to GDB as a
-   process exit event, a thread exit event, or to suppress the
-   event.  */
+/* Convenience function that is called when we're about to return an
+   event to the core.  If the event is an exit or signalled event,
+   then this decides whether to report it as process-wide event, as a
+   thread exit event, or to suppress it.  All other event kinds are
+   passed through unmodified.  */
 
 static ptid_t
 filter_exit_event (struct lwp_info *event_child,
@@ -3153,9 +3165,17 @@ filter_exit_event (struct lwp_info *event_child,
 {
   ptid_t ptid = event_child->ptid;
 
+  /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
+     if a non-leader thread exits with a signal, we'd report it to the
+     core which would interpret it as the whole-process exiting.
+     There is no TARGET_WAITKIND_THREAD_SIGNALLED event kind.  */
+  if (ourstatus->kind () != TARGET_WAITKIND_EXITED
+      && ourstatus->kind () != TARGET_WAITKIND_SIGNALLED)
+    return ptid;
+
   if (!is_leader (event_child))
     {
-      if (report_thread_events)
+      if (report_exit_events_for (event_child))
 	{
 	  ourstatus->set_thread_exited (0);
 	  /* Delete lwp, but not thread_info, infrun will need it to
@@ -3388,10 +3408,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     lp->core = linux_common_core_of_thread (lp->ptid);
 
-  if (ourstatus->kind () == TARGET_WAITKIND_EXITED)
-    return filter_exit_event (lp, ourstatus);
-
-  return lp->ptid;
+  return filter_exit_event (lp, ourstatus);
 }
 
 /* Resume LWPs that are currently stopped without any pending status
@@ -4513,7 +4530,8 @@ linux_nat_target::thread_events (int enable)
 bool
 linux_nat_target::supports_set_thread_options (gdb_thread_options options)
 {
-  constexpr gdb_thread_options supported_options = GDB_THREAD_OPTION_CLONE;
+  constexpr gdb_thread_options supported_options
+    = GDB_THREAD_OPTION_CLONE | GDB_THREAD_OPTION_EXIT;
   return ((options & supported_options) == options);
 }
 
-- 
2.34.1


* [FYI/pushed v4 17/25] gdb: clear step over information on thread exit (PR gdb/27338)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (15 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 16/25] Implement GDB_THREAD_OPTION_EXIT support for native Linux Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 18/25] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

GDB doesn't correctly handle the case where a thread steps over a
breakpoint (using either in-line or displaced stepping), and the
executed instruction causes the thread to exit.

Using the test program included later in the series, this is what it
looks like with displaced-stepping, on x86-64 Linux, where we have two
displaced-step buffers:

  $ ./gdb -q -nx --data-directory=data-directory build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit -ex "b my_exit_syscall" -ex r
  Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
  Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
  Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
  [New Thread 0x7ffff7c5f640 (LWP 2915510)]
  [Switching to Thread 0x7ffff7c5f640 (LWP 2915510)]

  Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
  68              syscall
  (gdb) c
  Continuing.
  [New Thread 0x7ffff7c5f640 (LWP 2915524)]
  [Thread 0x7ffff7c5f640 (LWP 2915510) exited]
  [Switching to Thread 0x7ffff7c5f640 (LWP 2915524)]

  Thread 3 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
  68              syscall
  (gdb) c
  Continuing.
  [New Thread 0x7ffff7c5f640 (LWP 2915616)]
  [Thread 0x7ffff7c5f640 (LWP 2915524) exited]
  [Switching to Thread 0x7ffff7c5f640 (LWP 2915616)]

  Thread 4 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
  68              syscall
  (gdb) c
  Continuing.
  ... hangs ...

The first two times we do "continue", we displaced-step the syscall
instruction that causes the thread to exit.  When the thread exits,
the main thread, waiting on pthread_join, is unblocked.  It spawns a
new thread, which hits the breakpoint on the syscall again.  However,
infrun was never notified that the displaced-stepping threads are done
using the displaced-step buffer, so now both buffers are marked as
used.  So when we do the third continue, there are no buffers
available to displaced-step the syscall, so the thread waits forever
for its turn.
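
The leak can be reproduced in miniature with a toy pool of two buffers.
This is a hedged sketch, not the displaced-stepping.c implementation:
buffer_pool and third_step_gets_buffer are hypothetical names, and
thread ids 2, 3 and 4 simply mirror the session above.

```cpp
#include <array>
#include <cassert>

// Hypothetical model of a fixed pool of displaced-step buffers.
struct buffer_pool
{
  // Owner thread id per buffer; 0 means the buffer is free.
  std::array<int, 2> owner {};

  // Try to claim a free buffer for THREAD.  Returns false if none is
  // free, in which case the thread would have to wait for its turn.
  bool prepare (int thread)
  {
    for (int &o : owner)
      if (o == 0)
	{
	  o = thread;
	  return true;
	}
    return false;
  }

  // Release whatever buffer THREAD holds.
  void finish (int thread)
  {
    for (int &o : owner)
      if (o == thread)
	o = 0;
  }
};

// Threads 2 and 3 each claim a buffer and then exit.  Report whether
// thread 4 can still get a buffer afterwards.
bool
third_step_gets_buffer (bool finish_on_exit)
{
  buffer_pool pool;
  for (int thread = 2; thread <= 3; ++thread)
    {
      pool.prepare (thread);
      if (finish_on_exit)
	pool.finish (thread);
    }
  return pool.prepare (4);
}
```

Without the finish step on exit, both buffers stay owned by dead
threads and the third prepare fails, which corresponds to the hang
shown above.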

When trying the same but with in-line step over (displaced-stepping
disabled):

  $ ./gdb -q -nx --data-directory=data-directory \
  build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit \
    -ex "b my_exit_syscall" -ex "set displaced-stepping off" -ex r
  Reading symbols from build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit...
  Breakpoint 1 at 0x123c: file src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S, line 68.
  Starting program: build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-over-thread-exit/step-over-thread-exit
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
  [New Thread 0x7ffff7c5f640 (LWP 2928290)]
  [Switching to Thread 0x7ffff7c5f640 (LWP 2928290)]

  Thread 2 "step-over-threa" hit Breakpoint 1, my_exit_syscall () at src/binutils-gdb/gdb/testsuite/lib/my-syscalls.S:68
  68              syscall
  (gdb) c
  Continuing.
  [Thread 0x7ffff7c5f640 (LWP 2928290) exited]
  No unwaited-for children left.
  (gdb) i th
    Id   Target Id                                             Frame
    1    Thread 0x7ffff7c60740 (LWP 2928285) "step-over-threa" 0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0

  The current thread <Thread ID 2> has terminated.  See `help thread'.
  (gdb) thread 1
  [Switching to thread 1 (Thread 0x7ffff7c60740 (LWP 2928285))]
  #0  0x00007ffff7f7c9b7 in __pthread_clockjoin_ex () from /usr/lib/libpthread.so.0
  (gdb) c
  Continuing.
  ^C^C
  ... hangs ...

The "continue" causes an in-line step to occur, meaning the main
thread is stopped while we step the syscall.  The stepped thread exits
when executing the syscall, the linux-nat target notices there are no
more resumed threads to be waited for, so returns
TARGET_WAITKIND_NO_RESUMED, which causes the prompt to return.  But
infrun never clears the in-line step over info.  So if we try
continuing the main thread, GDB doesn't resume it, because it thinks
there's an in-line step in progress that we need to wait for to
finish, and we are stuck there.
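
The stuck state can be modelled in a few lines.  Again this is only an
illustrative sketch with invented names (step_over_model and
main_thread_resumable_after_exit); infrun's real step-over bookkeeping
is more involved.

```cpp
#include <cassert>

// Hypothetical model of the global in-line step-over state.
struct step_over_model
{
  // Thread currently doing an in-line step-over; 0 means none.
  int stepping_thread = 0;

  void start (int thread)
  {
    stepping_thread = thread;
  }

  // Called when THREAD exits.  CLEAR_ON_EXIT selects between the
  // behavior before the fix (false) and after it (true).
  void on_thread_exit (int thread, bool clear_on_exit)
  {
    if (clear_on_exit && stepping_thread == thread)
      stepping_thread = 0;
  }

  // Other threads may only be resumed when no step-over is recorded.
  bool can_resume_others () const
  {
    return stepping_thread == 0;
  }
};

// Thread 2 starts an in-line step-over and exits while stepping.
bool
main_thread_resumable_after_exit (bool clear_on_exit)
{
  step_over_model m;
  m.start (2);
  m.on_thread_exit (2, clear_on_exit);
  return m.can_resume_others ();
}
```

When the exit does not clear the recorded step-over, the model refuses
to resume any other thread, just like the second "continue" in the
session above.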

To fix this, infrun needs to be informed when a thread doing a
displaced or in-line step over exits.  We can do that with the new
target_set_thread_options mechanism, which is optimal as it enables exit
events only for the thread that needs them; or, if that is not supported,
by using target_thread_events, which enables thread exit events for
all threads.  This is done by this commit.

This patch then modifies handle_inferior_event in infrun.c to clean up
any step-over the exiting thread might have been doing at the time of
the exit.  The cases to consider are:

 - the exiting thread was doing an in-line step-over with an all-stop
   target
 - the exiting thread was doing an in-line step-over with a non-stop
   target
 - the exiting thread was doing a displaced step-over with a non-stop
   target

The displaced-stepping buffer implementation in displaced-stepping.c
is modified to account for the fact that it's possible that we
"finish" a displaced step after a thread exit event.  The buffer that
the exiting thread was using is marked as available again and the
original instructions under the scratch pad are restored.  However, it
skips applying the fixup, which wouldn't make sense since the thread
does not exist anymore.

Another case that needs handling is if a displaced-stepping thread
exits, and the event is reported while we are in stop_all_threads.  We
should call displaced_step_finish in the handle_one function, in that
case.  It was already called in other code paths, just not the "thread
exited" path.

This commit doesn't make infrun ask the target to report the
TARGET_WAITKIND_THREAD_EXITED events yet; that'll be done later in the
series.

Note that the "stop_print_frame = false;" line is moved to normal_stop,
because TARGET_WAITKIND_THREAD_EXITED can also end up with the event
transformed into TARGET_WAITKIND_NO_RESUMED.  Moving it to
normal_stop keeps it centralized.

Co-authored-by: Simon Marchi <simon.marchi@efficios.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I745c6955d7ef90beb83bcf0ff1d1ac8b9b6285a5
---
 gdb/displaced-stepping.c  |   7 ++
 gdb/gdbarch-gen.h         |   6 +-
 gdb/gdbarch_components.py |   4 +
 gdb/infrun.c              | 171 ++++++++++++++++++++++++++++++++++----
 gdb/thread.c              |   3 +
 5 files changed, 174 insertions(+), 17 deletions(-)

diff --git a/gdb/displaced-stepping.c b/gdb/displaced-stepping.c
index 41c3c999d1e..6d5e6cddb9b 100644
--- a/gdb/displaced-stepping.c
+++ b/gdb/displaced-stepping.c
@@ -258,6 +258,13 @@ displaced_step_buffers::finish (gdbarch *arch, thread_info *thread,
 			  thread->ptid.to_string ().c_str (),
 			  paddress (arch, buffer->addr));
 
+  /* If the thread exited while stepping, we are done.  The code above
+     made the buffer available again, and we restored the bytes in the
+     buffer.  We don't want to run the fixup: since the thread is now
+     dead there's nothing to adjust.  */
+  if (status.kind () == TARGET_WAITKIND_THREAD_EXITED)
+    return DISPLACED_STEP_FINISH_STATUS_OK;
+
   regcache *rc = get_thread_regcache (thread);
 
   bool instruction_executed_successfully
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index 1d169c6e4f4..9f468bd1f61 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1138,7 +1138,11 @@ typedef displaced_step_prepare_status (gdbarch_displaced_step_prepare_ftype) (st
 extern displaced_step_prepare_status gdbarch_displaced_step_prepare (struct gdbarch *gdbarch, thread_info *thread, CORE_ADDR &displaced_pc);
 extern void set_gdbarch_displaced_step_prepare (struct gdbarch *gdbarch, gdbarch_displaced_step_prepare_ftype *displaced_step_prepare);
 
-/* Clean up after a displaced step of THREAD. */
+/* Clean up after a displaced step of THREAD.
+
+   It is possible for the displaced-stepped instruction to have caused
+   the thread to exit.  The implementation can detect this case by
+   checking if WS.kind is TARGET_WAITKIND_THREAD_EXITED. */
 
 typedef displaced_step_finish_status (gdbarch_displaced_step_finish_ftype) (struct gdbarch *gdbarch, thread_info *thread, const target_waitstatus &ws);
 extern displaced_step_finish_status gdbarch_displaced_step_finish (struct gdbarch *gdbarch, thread_info *thread, const target_waitstatus &ws);
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 592d301ed35..694ac366023 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -1880,6 +1880,10 @@ Throw an exception if any unexpected error happens.
 Method(
     comment="""
 Clean up after a displaced step of THREAD.
+
+It is possible for the displaced-stepped instruction to have caused
+the thread to exit.  The implementation can detect this case by
+checking if WS.kind is TARGET_WAITKIND_THREAD_EXITED.
 """,
     type="displaced_step_finish_status",
     name="displaced_step_finish",
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 0a189d0a485..e0125e11b32 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1958,13 +1958,15 @@ displaced_step_prepare (thread_info *thread)
    a step-over (either in-line or displaced) finishes.  */
 
 static void
-update_thread_events_after_step_over (thread_info *event_thread)
+update_thread_events_after_step_over (thread_info *event_thread,
+				      const target_waitstatus &event_status)
 {
   if (target_supports_set_thread_options (0))
     {
       /* We can control per-thread options.  Disable events for the
-	 event thread.  */
-      event_thread->set_thread_options (0);
+	 event thread, unless the thread is gone.  */
+      if (event_status.kind () != TARGET_WAITKIND_THREAD_EXITED)
+	event_thread->set_thread_options (0);
     }
   else
     {
@@ -2020,7 +2022,7 @@ displaced_step_finish (thread_info *event_thread,
   if (!displaced->in_progress ())
     return DISPLACED_STEP_FINISH_STATUS_OK;
 
-  update_thread_events_after_step_over (event_thread);
+  update_thread_events_after_step_over (event_thread, event_status);
 
   gdb_assert (event_thread->inf->displaced_step_state.in_progress_count > 0);
   event_thread->inf->displaced_step_state.in_progress_count--;
@@ -4168,6 +4170,7 @@ struct wait_one_event
 };
 
 static bool handle_one (const wait_one_event &event);
+static int finish_step_over (struct execution_control_state *ecs);
 
 /* Prepare and stabilize the inferior for detaching it.  E.g.,
    detaching while a thread is displaced stepping is a recipe for
@@ -5372,6 +5375,16 @@ handle_one (const wait_one_event &event)
 				      event.ws);
 	  save_waitstatus (t, event.ws);
 	  t->stop_requested = false;
+
+	  if (event.ws.kind () == TARGET_WAITKIND_THREAD_EXITED)
+	    {
+	      if (displaced_step_finish (t, event.ws)
+		  != DISPLACED_STEP_FINISH_STATUS_OK)
+		{
+		  gdb_assert_not_reached ("displaced_step_finish on "
+					  "exited thread failed");
+		}
+	    }
 	}
     }
   else
@@ -5584,7 +5597,9 @@ stop_all_threads (const char *reason, inferior *inf)
     }
 }
 
-/* Handle a TARGET_WAITKIND_NO_RESUMED event.  */
+/* Handle a TARGET_WAITKIND_NO_RESUMED event.  Return true if we
+   handled the event and should continue waiting.  Return false if we
+   should stop and report the event to the user.  */
 
 static bool
 handle_no_resumed (struct execution_control_state *ecs)
@@ -5712,6 +5727,125 @@ handle_no_resumed (struct execution_control_state *ecs)
   return false;
 }
 
+/* Handle a TARGET_WAITKIND_THREAD_EXITED event.  Return true if we
+   handled the event and should continue waiting.  Return false if we
+   should stop and report the event to the user.  */
+
+static bool
+handle_thread_exited (execution_control_state *ecs)
+{
+  context_switch (ecs);
+
+  /* Clear these so we don't re-start the thread stepping over a
+     breakpoint/watchpoint.  */
+  ecs->event_thread->stepping_over_breakpoint = 0;
+  ecs->event_thread->stepping_over_watchpoint = 0;
+
+  /* Maybe the thread was doing a step-over, if so release
+     resources and start any further pending step-overs.
+
+     If we are on a non-stop target and the thread was doing an
+     in-line step, this also restarts the other threads.  */
+  int ret = finish_step_over (ecs);
+
+  /* finish_step_over returns true if it moves ecs' wait status
+     back into the thread, so that we go handle another pending
+     event before this one.  But we know it never does that if
+     the event thread has exited.  */
+  gdb_assert (ret == 0);
+
+  /* If finish_step_over started a new in-line step-over, don't
+     try to restart anything else.  */
+  if (step_over_info_valid_p ())
+    {
+      delete_thread (ecs->event_thread);
+      return true;
+    }
+
+  /* Maybe we are on an all-stop target and we got this event
+     while doing a step-like command on another thread.  If so,
+     go back to doing that.  If this thread was stepping,
+     switch_back_to_stepped_thread will consider that the thread
+     was interrupted mid-step and will try keep stepping it.  We
+     don't want that, the thread is gone.  So clear the proceed
+     status so it doesn't do that.  */
+  clear_proceed_status_thread (ecs->event_thread);
+  if (switch_back_to_stepped_thread (ecs))
+    {
+      delete_thread (ecs->event_thread);
+      return true;
+    }
+
+  inferior *inf = ecs->event_thread->inf;
+  bool slock_applies = schedlock_applies (ecs->event_thread);
+
+  delete_thread (ecs->event_thread);
+  ecs->event_thread = nullptr;
+
+  /* Continue handling the event as if we had gotten a
+     TARGET_WAITKIND_NO_RESUMED.  */
+  auto handle_as_no_resumed = [ecs] ()
+  {
+    /* handle_no_resumed doesn't really look at the event kind, but
+       normal_stop does.  */
+    ecs->ws.set_no_resumed ();
+    ecs->event_thread = nullptr;
+    ecs->ptid = minus_one_ptid;
+
+    /* Re-record the last target status.  */
+    set_last_target_status (ecs->target, ecs->ptid, ecs->ws);
+
+    return handle_no_resumed (ecs);
+  };
+
+  /* If we are on an all-stop target, the target has stopped all
+     threads to report the event.  We don't actually want to
+     stop, so restart the threads.  */
+  if (!target_is_non_stop_p ())
+    {
+      if (slock_applies)
+	{
+	  /* Since the target is !non-stop, then everything is stopped
+	     at this point, and we can't assume we'll get further
+	     events until we resume the target again.  Handle this
+	     event like if it were a TARGET_WAITKIND_NO_RESUMED.  Note
+	     this refreshes the thread list and checks whether there
+	     are other resumed threads before deciding whether to
+	     print "no-unwaited-for left".  This is important because
+	     the user could have done:
+
+	      (gdb) set scheduler-locking on
+	      (gdb) thread 1
+	      (gdb) c&
+	      (gdb) thread 2
+	      (gdb) c
+
+	     ... and only one of the threads exited.  */
+	  return handle_as_no_resumed ();
+	}
+      else
+	{
+	  /* Switch to the first non-exited thread we can find, and
+	     resume.  */
+	  auto range = inf->non_exited_threads ();
+	  if (range.begin () == range.end ())
+	    {
+	      /* Looks like the target reported a
+		 TARGET_WAITKIND_THREAD_EXITED for its last known
+		 thread.  */
+	      return handle_as_no_resumed ();
+	    }
+	  thread_info *non_exited_thread = *range.begin ();
+	  switch_to_thread (non_exited_thread);
+	  insert_breakpoints ();
+	  resume (GDB_SIGNAL_0);
+	}
+    }
+
+  prepare_to_wait (ecs);
+  return true;
+}
+
 /* Given an execution control state that has been freshly filled in by
    an event from the inferior, figure out what it means and take
    appropriate action.
@@ -5750,15 +5884,6 @@ handle_inferior_event (struct execution_control_state *ecs)
       return;
     }
 
-  if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_EXITED)
-    {
-      ecs->event_thread = ecs->target->find_thread (ecs->ptid);
-      gdb_assert (ecs->event_thread != nullptr);
-      delete_thread (ecs->event_thread);
-      prepare_to_wait (ecs);
-      return;
-    }
-
   if (ecs->ws.kind () == TARGET_WAITKIND_NO_RESUMED
       && handle_no_resumed (ecs))
     return;
@@ -5773,7 +5898,6 @@ handle_inferior_event (struct execution_control_state *ecs)
     {
       /* No unwaited-for children left.  IOW, all resumed children
 	 have exited.  */
-      stop_print_frame = false;
       stop_waiting (ecs);
       return;
     }
@@ -5922,6 +6046,12 @@ handle_inferior_event (struct execution_control_state *ecs)
 	keep_going (ecs);
       return;
 
+    case TARGET_WAITKIND_THREAD_EXITED:
+      if (handle_thread_exited (ecs))
+	return;
+      stop_waiting (ecs);
+      break;
+
     case TARGET_WAITKIND_EXITED:
     case TARGET_WAITKIND_SIGNALLED:
       {
@@ -6367,7 +6497,7 @@ finish_step_over (struct execution_control_state *ecs)
 	 back an event.  */
       gdb_assert (ecs->event_thread->control.trap_expected);
 
-      update_thread_events_after_step_over (ecs->event_thread);
+      update_thread_events_after_step_over (ecs->event_thread, ecs->ws);
 
       clear_step_over_info ();
     }
@@ -6413,6 +6543,13 @@ finish_step_over (struct execution_control_state *ecs)
       if (ecs->event_thread->stepping_over_watchpoint)
 	return 0;
 
+      /* The code below is meant to avoid one thread hogging the event
+	 loop by doing constant in-line step overs.  If the stepping
+	 thread exited, there's no risk for this to happen, so we can
+	 safely let our caller process the event immediately.  */
+      if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_EXITED)
+       return 0;
+
       pending = iterate_over_threads (resumed_thread_with_pending_status,
 				      nullptr);
       if (pending != nullptr)
@@ -9105,6 +9242,8 @@ normal_stop ()
 
   if (last.kind () == TARGET_WAITKIND_NO_RESUMED)
     {
+      stop_print_frame = false;
+
       SWITCH_THRU_ALL_UIS ()
 	if (current_ui->prompt_state == PROMPT_BLOCKED)
 	  {
diff --git a/gdb/thread.c b/gdb/thread.c
index ca0466f35ec..47cc5c9cd14 100644
--- a/gdb/thread.c
+++ b/gdb/thread.c
@@ -434,6 +434,9 @@ thread_info::clear_pending_waitstatus ()
 void
 thread_info::set_thread_options (gdb_thread_options thread_options)
 {
+  gdb_assert (this->state != THREAD_EXITED);
+  gdb_assert (!this->executing ());
+
   if (m_thread_options == thread_options)
     return;
 
-- 
2.34.1



* [FYI/pushed v4 18/25] stop_all_threads: (re-)enable async before waiting for stops
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (16 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 17/25] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 19/25] gdbserver: Queue no-resumed event after thread exit Pedro Alves
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Running the
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase
added later in the series against gdbserver, after the
TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run
into an infinite loop in stop_all_threads, leading to a timeout:

  FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout)

This is really a latent bug, and it is about the fact that
stop_all_threads stops listening to events from a target as soon as it
sees a TARGET_WAITKIND_NO_RESUMED, ignoring that
TARGET_WAITKIND_NO_RESUMED may be delayed.  handle_no_resumed knows
how to handle delayed no-resumed events, but stop_all_threads was
never taught to.

In more detail, here's what happens with that testcase:

#1 - Multiple threads report breakpoint hits to gdb.

#2 - gdb picks one event, and it's for thread 1.  All other stops are
     left pending.  thread 1 needs to move past a breakpoint, so gdb
     stops all threads to start an inline step over for thread 1.
     While stopping threads, some of the threads that were still
     running report events that are also left pending.

#3 - gdb steps thread 1

#4 - Thread 1 exits while stepping (it steps over an exit syscall),
     gdbserver reports thread exit for thread 1

#5 - Thread 1 was the last resumed thread, so gdbserver also reports
     no-resumed:

    [remote]   Notification received: Stop:w0;p3445d0.3445d3
    [remote] Sending packet: $vStopped#55
    [remote] Packet received: N
    [remote] Sending packet: $vStopped#55
    [remote] Packet received: OK

#6 - gdb processes the thread exit for thread 1, finishes the step
     over and restarts threads.

#7 - gdb picks the next event to process out of one of the resumed
     threads with pending events:

    [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11

#8 - This is again a breakpoint hit and the breakpoint needs to be
     stepped over too, so gdb starts a step-over dance again.

#9 - We reach stop_all_threads, which finds that some threads need to
     be stopped.

#10 - wait_one finally consumes the no-resumed event queued by #5.
      Seeing this, wait_one disables target async, to stop listening
      for events out of the remote target.

#11 - We still haven't seen all the stops expected, so
      stop_all_threads tries another iteration.

#12 - Because the remote target is no longer async, and there are no
      other targets, wait_one returns no-resumed immediately without
      polling the remote target.

#13 - We still haven't seen all the stops expected, so
      stop_all_threads tries another iteration.  goto #12, looping
      forever.

Fix this by explicitly enabling/re-enabling target async on targets
that can async, before waiting for stops.
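
The loop described above can be reproduced with a small standalone
model.  This is only a sketch, not GDB code: ToyTarget, toy_wait_one
and toy_stop_all_threads are hypothetical names, the toy never blocks,
and the cap on the number of passes exists only so that the "loops
forever" case terminates in the model.

```cpp
#include <cassert>
#include <deque>

// Toy event kinds; they only mirror the names used in the text above.
enum class EventKind { Stopped, NoResumed };

// Hypothetical stand-in for a remote target and its stop reply queue.
struct ToyTarget
{
  std::deque<EventKind> queue;  // events not yet consumed by wait_one
  bool async = true;            // whether wait_one still polls this target
};

// Model of wait_one: a target that is not async is not polled at all,
// and consuming a no-resumed event switches async off.
EventKind
toy_wait_one (ToyTarget &target)
{
  if (!target.async || target.queue.empty ())
    return EventKind::NoResumed;

  EventKind kind = target.queue.front ();
  target.queue.pop_front ();
  if (kind == EventKind::NoResumed)
    target.async = false;
  return kind;
}

// Model of the wait loop in stop_all_threads.  Returns true if all
// EXPECTED_STOPS were collected within MAX_PASSES passes; false stands
// in for the infinite loop.
bool
toy_stop_all_threads (ToyTarget &target, int expected_stops,
                      bool reenable_async, int max_passes = 10)
{
  assert (expected_stops >= 0);

  int seen = 0;
  for (int pass = 0; pass < max_passes && seen < expected_stops; pass++)
    {
      // The fix: turn async back on before waiting for stops.
      if (reenable_async)
        target.async = true;

      int waits_needed = expected_stops - seen;
      for (int i = 0; i < waits_needed; i++)
        if (toy_wait_one (target) == EventKind::Stopped)
          seen++;
    }
  return seen == expected_stops;
}

// A stale no-resumed event queued ahead of two real stops.
ToyTarget
make_target_with_delayed_no_resumed ()
{
  ToyTarget target;
  target.queue = {EventKind::NoResumed, EventKind::Stopped,
                  EventKind::Stopped};
  return target;
}
```

In this model, calling toy_stop_all_threads with reenable_async set to
false never collects the two stops, while setting it to true collects
them on the second pass.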

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f
---
 gdb/infrun.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index e0125e11b32..409c2249b13 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -5202,6 +5202,8 @@ wait_one ()
       if (nfds == 0)
 	{
 	  /* No waitable targets left.  All must be stopped.  */
+	  infrun_debug_printf ("no waitable targets left");
+
 	  target_waitstatus ws;
 	  ws.set_no_resumed ();
 	  return {nullptr, minus_one_ptid, std::move (ws)};
@@ -5461,6 +5463,83 @@ handle_one (const wait_one_event &event)
   return false;
 }
 
+/* Helper for stop_all_threads.  wait_one waits for events until it
+   sees a TARGET_WAITKIND_NO_RESUMED event.  When it sees one, it
+   disables target_async for the target to stop waiting for events
+   from it.  TARGET_WAITKIND_NO_RESUMED can be delayed though,
+   consider, debugging against gdbserver:
+
+    #1 - Threads 1-5 are running, and thread 1 hits a breakpoint.
+
+    #2 - gdb processes the breakpoint hit for thread 1, stops all
+	 threads, and steps thread 1 over the breakpoint.  while
+	 stopping threads, some other threads reported interesting
+	 events, which were left pending in the thread's objects
+	 (infrun's queue).
+
+    #2 - Thread 1 exits (it stepped an exit syscall), and gdbserver
+	 reports the thread exit for thread 1.	The event ends up in
+	 remote's stop reply queue.
+
+    #3 - That was the last resumed thread, so gdbserver reports
+	 no-resumed, and that event also ends up in remote's stop
+	 reply queue, queued after the thread exit from #2.
+
+    #4 - gdb processes the thread exit event, which finishes the
+	 step-over, and so gdb restarts all threads (threads with
+	 pending events are left marked resumed, but aren't set
+	 executing).  The no-resumed event is still left pending in
+	 the remote stop reply queue.
+
+    #5 - Since there are now resumed threads with pending breakpoint
+	 hits, gdb picks one at random to process next.
+
+    #5 - gdb picks the breakpoint hit for thread 2 this time, and that
+	 breakpoint also needs to be stepped over, so gdb stops all
+	 threads again.
+
+    #6 - stop_all_threads counts number of expected stops and calls
+	 wait_one once for each.
+
+    #7 - The first wait_one call collects the no-resumed event from #3
+	 above.
+
+    #9 - Seeing the no-resumed event, wait_one disables target async
+	 for the remote target, to stop waiting for events from it.
+	 wait_one from here on always return no-resumed directly
+	 without reaching the target.
+
+    #10 - stop_all_threads still hasn't seen all the stops it expects,
+	  so it does another pass.
+
+    #11 - Since the remote target is not async (disabled in #9),
+	  wait_one doesn't wait on it, so it won't see the expected
+	  stops, and instead returns no-resumed directly.
+
+    #12 - stop_all_threads still haven't seen all the stops, so it
+	  does another pass.  goto #11, looping forever.
+
+   To handle this, we explicitly (re-)enable target async on all
+   targets that can async every time stop_all_threads goes wait for
+   the expected stops.  */
+
+static void
+reenable_target_async ()
+{
+  for (inferior *inf : all_inferiors ())
+    {
+      process_stratum_target *target = inf->process_target ();
+      if (target != nullptr
+	  && target->threads_executing
+	  && target->can_async_p ()
+	  && !target->is_async_p ())
+	{
+	  switch_to_inferior_no_thread (inf);
+	  target_async (1);
+	}
+    }
+}
+
 /* See infrun.h.  */
 
 void
@@ -5587,6 +5666,8 @@ stop_all_threads (const char *reason, inferior *inf)
 	  if (pass > 0)
 	    pass = -1;
 
+	  reenable_target_async ();
+
 	  for (int i = 0; i < waits_needed; i++)
 	    {
 	      wait_one_event event = wait_one ();
-- 
2.34.1



* [FYI/pushed v4 19/25] gdbserver: Queue no-resumed event after thread exit
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (17 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 18/25] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 20/25] Don't resume new threads if scheduler-locking is in effect Pedro Alves
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Normally, if the last resumed thread on the target exits, the server
sends a no-resumed event to GDB.  If, however, GDB enables the
GDB_THREAD_OPTION_EXIT option on a thread and that thread exits, the
server sends a thread exit event for that thread instead.

In all-stop RSP mode, since events can only be forwarded to GDB one at
a time, and the whole target stops whenever an event is reported, GDB
resumes the target again after getting a THREAD_EXITED event, and then
the server finally reports back a no-resumed event if/when
appropriate.

For non-stop RSP though, events are asynchronous, and if the server
sends a thread-exit event for the last resumed thread, the no-resumed
event is never sent.  This patch makes sure that in non-stop mode, the
server queues a no-resumed event after the thread-exit event if it was
the last resumed thread that exited.
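
As a rough illustration of the queuing rule (a standalone sketch, not
gdbserver code; ToyServer and toy_report_thread_exit are hypothetical
names, and the resumed-thread counter stands in for the real
any_resumed check):

```cpp
#include <cassert>
#include <vector>

// Toy event kinds; they only mirror the names used in the text above.
enum class WaitKind { ThreadExited, NoResumed };

// Hypothetical stand-in for the server side of a non-stop connection.
struct ToyServer
{
  int resumed_threads = 0;              // threads currently resumed
  std::vector<WaitKind> notifications;  // queued stop notifications
};

// Report that one resumed thread exited.  With QUEUE_NO_RESUMED set,
// a no-resumed event is queued right after the thread-exit event if
// the exiting thread was the last resumed one.
void
toy_report_thread_exit (ToyServer &server, bool queue_no_resumed)
{
  assert (server.resumed_threads > 0);

  server.resumed_threads--;
  server.notifications.push_back (WaitKind::ThreadExited);

  if (queue_no_resumed && server.resumed_threads == 0)
    server.notifications.push_back (WaitKind::NoResumed);
}
```

With queue_no_resumed left false, the model shows the old behavior: the
last thread-exit event is the final notification, and nothing tells the
client that no resumed threads are left.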

Without this, we'd see failures in step-over-thread-exit testcases
added later in the series, like so:

   continue
   Continuing.
 - No unwaited-for children left.
 - (gdb) PASS: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits
 + FAIL: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits (timeout)

(and other similar ones)

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I927d78b30f88236dbd5634b051a716f72420e7c7
---
 gdbserver/linux-low.cc | 47 +++++++++++++++++++++++++-----------------
 gdbserver/linux-low.h  |  2 ++
 gdbserver/server.cc    | 12 ++++++++++-
 gdbserver/target.cc    |  6 ++++++
 gdbserver/target.h     |  6 ++++++
 5 files changed, 53 insertions(+), 20 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index b4a1191b5b4..44d0fe38030 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -2972,7 +2972,6 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
   int report_to_gdb;
   int trace_event;
   int in_step_range;
-  int any_resumed;
 
   threads_debug_printf ("[%s]", target_pid_to_str (ptid).c_str ());
 
@@ -2986,23 +2985,7 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
   in_step_range = 0;
   ourstatus->set_ignore ();
 
-  auto status_pending_p_any = [&] (thread_info *thread)
-    {
-      return status_pending_p_callback (thread, minus_one_ptid);
-    };
-
-  auto not_stopped = [&] (thread_info *thread)
-    {
-      return not_stopped_callback (thread, minus_one_ptid);
-    };
-
-  /* Find a resumed LWP, if any.  */
-  if (find_thread (status_pending_p_any) != NULL)
-    any_resumed = 1;
-  else if (find_thread (not_stopped) != NULL)
-    any_resumed = 1;
-  else
-    any_resumed = 0;
+  bool was_any_resumed = any_resumed ();
 
   if (step_over_bkpt == null_ptid)
     pid = wait_for_event (ptid, &w, options);
@@ -3013,7 +2996,7 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
       pid = wait_for_event (step_over_bkpt, &w, options & ~WNOHANG);
     }
 
-  if (pid == 0 || (pid == -1 && !any_resumed))
+  if (pid == 0 || (pid == -1 && !was_any_resumed))
     {
       gdb_assert (target_options & TARGET_WNOHANG);
 
@@ -6177,6 +6160,32 @@ linux_process_target::thread_stopped (thread_info *thread)
   return get_thread_lwp (thread)->stopped;
 }
 
+bool
+linux_process_target::any_resumed ()
+{
+  bool any_resumed;
+
+  auto status_pending_p_any = [&] (thread_info *thread)
+    {
+      return status_pending_p_callback (thread, minus_one_ptid);
+    };
+
+  auto not_stopped = [&] (thread_info *thread)
+    {
+      return not_stopped_callback (thread, minus_one_ptid);
+    };
+
+  /* Find a resumed LWP, if any.  */
+  if (find_thread (status_pending_p_any) != NULL)
+    any_resumed = 1;
+  else if (find_thread (not_stopped) != NULL)
+    any_resumed = 1;
+  else
+    any_resumed = 0;
+
+  return any_resumed;
+}
+
 /* This exposes stop-all-threads functionality to other modules.  */
 
 void
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index 3597e33289c..d46ea5aa3ec 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -259,6 +259,8 @@ class linux_process_target : public process_stratum_target
 
   bool thread_stopped (thread_info *thread) override;
 
+  bool any_resumed () override;
+
   void pause_all (bool freeze) override;
 
   void unpause_all (bool unfreeze) override;
diff --git a/gdbserver/server.cc b/gdbserver/server.cc
index 4a312da40bc..a8e23561dcb 100644
--- a/gdbserver/server.cc
+++ b/gdbserver/server.cc
@@ -4731,7 +4731,17 @@ handle_target_event (int err, gdb_client_data client_data)
 	    }
 	}
       else
-	push_stop_notification (cs.last_ptid, cs.last_status);
+	{
+	  push_stop_notification (cs.last_ptid, cs.last_status);
+
+	  if (cs.last_status.kind () == TARGET_WAITKIND_THREAD_EXITED
+	      && !target_any_resumed ())
+	    {
+	      target_waitstatus ws;
+	      ws.set_no_resumed ();
+	      push_stop_notification (null_ptid, ws);
+	    }
+	}
     }
 
   /* Be sure to not change the selected thread behind GDB's back.
diff --git a/gdbserver/target.cc b/gdbserver/target.cc
index dbb4e2d9024..81edff41268 100644
--- a/gdbserver/target.cc
+++ b/gdbserver/target.cc
@@ -614,6 +614,12 @@ process_stratum_target::thread_stopped (thread_info *thread)
   gdb_assert_not_reached ("target op thread_stopped not supported");
 }
 
+bool
+process_stratum_target::any_resumed ()
+{
+  return true;
+}
+
 bool
 process_stratum_target::supports_get_tib_address ()
 {
diff --git a/gdbserver/target.h b/gdbserver/target.h
index 0f1fd5906fb..28d134e7915 100644
--- a/gdbserver/target.h
+++ b/gdbserver/target.h
@@ -319,6 +319,9 @@ class process_stratum_target
   /* Return true if THREAD is known to be stopped now.  */
   virtual bool thread_stopped (thread_info *thread);
 
+  /* Return true if any thread is known to be resumed.  */
+  virtual bool any_resumed ();
+
   /* Return true if the get_tib_address op is supported.  */
   virtual bool supports_get_tib_address ();
 
@@ -683,6 +686,9 @@ target_read_btrace_conf (struct btrace_target_info *tinfo,
 #define target_supports_software_single_step() \
   the_target->supports_software_single_step ()
 
+#define target_any_resumed() \
+  the_target->any_resumed ()
+
 ptid_t mywait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	       target_wait_flags options, int connected_wait);
 
-- 
2.34.1



* [FYI/pushed v4 20/25] Don't resume new threads if scheduler-locking is in effect
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (18 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 19/25] gdbserver: Queue no-resumed event after thread exit Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 21/25] Report thread exit event for leader if reporting thread exit events Pedro Alves
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Eli Zaretskii, Andrew Burgess

If scheduler-locking is in effect, e.g., with "set scheduler-locking
on", and you step over a function that spawns a new thread, the new
thread is allowed to run free, at least until some event is hit, at
which point, whether the new thread is re-resumed depends on a number
of seemingly random factors.  E.g., if the target is all-stop, and the
parent thread hits a breakpoint, and GDB decides the breakpoint isn't
interesting to report to the user, then the parent thread is resumed,
but the new thread is left stopped.

I think that letting the new threads run with scheduler-locking
enabled is a defect.  This commit fixes that, making use of the new
clone events on Linux, and of target_thread_events() on targets where
new threads have no connection to the thread that spawned them.
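
The intended rule for a newly cloned thread can be sketched as below.
This is a simplified standalone model, not the infrun.c code:
SchedLockMode, toy_schedlock_applies and toy_resume_cloned_child are
hypothetical names, the "replay" scheduler-locking mode is left out,
and "step" is modeled as the manual describes it (like "on" when
stepping, like "off" otherwise).

```cpp
#include <cassert>

// Simplified scheduler-locking modes ("replay" is not modeled here).
enum class SchedLockMode { Off, On, Step };

// Toy version of the scheduler-locking check: "on" always applies,
// "step" applies only while the thread is running a stepping command.
bool
toy_schedlock_applies (SchedLockMode mode, bool stepping_command)
{
  return (mode == SchedLockMode::On
          || (mode == SchedLockMode::Step && stepping_command));
}

// Decide whether a thread just created by the resumed thread should
// be set running.  If scheduler-locking applies to the parent, the
// new thread is held stopped.
bool
toy_resume_cloned_child (SchedLockMode mode, bool parent_stepping)
{
  return !toy_schedlock_applies (mode, parent_stepping);
}
```

For example, with "set scheduler-locking step", the model holds the new
thread stopped while the parent is being stepped, and lets it run when
the parent is continued.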

Testcase and documentation changes included.

Approved-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie12140138b37534b7fc1d904da34f0f174aa11ce
---
 gdb/NEWS                                      |  7 ++
 gdb/doc/gdb.texinfo                           |  4 +-
 gdb/infrun.c                                  | 41 +++++++++---
 .../gdb.threads/schedlock-new-thread.c        | 54 +++++++++++++++
 .../gdb.threads/schedlock-new-thread.exp      | 67 +++++++++++++++++++
 5 files changed, 164 insertions(+), 9 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/schedlock-new-thread.c
 create mode 100644 gdb/testsuite/gdb.threads/schedlock-new-thread.exp

diff --git a/gdb/NEWS b/gdb/NEWS
index d85a13b64fe..f2861b1ace1 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -392,6 +392,13 @@ show tui mouse-events
   from the current process state.  GDB will show this additional information
   automatically, or through one of the memory-tag subcommands.
 
+* Scheduler-locking and new threads
+
+  When scheduler-locking is in effect, only the current thread may run
+  when the inferior is resumed.  However, previously, new threads
+  created by the resumed thread would still be able to run free.  Now,
+  they are held stopped.
+
 * "info breakpoints" now displays enabled breakpoint locations of
   disabled breakpoints as in the "y-" state.  For example:
 
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 4cbaaa6804f..79b7431dd78 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -7123,7 +7123,9 @@ the following:
 There is no locking and any thread may run at any time.
 
 @item on
-Only the current thread may run when the inferior is resumed.
+Only the current thread may run when the inferior is resumed.  New
+threads created by the resumed thread are held stopped at their entry
+point, before they execute any instruction.
 
 @item step
 Behaves like @code{on} when stepping, and @code{off} otherwise.
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 409c2249b13..943ea88538c 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -107,6 +107,8 @@ static bool start_step_over (void);
 
 static bool step_over_info_valid_p (void);
 
+static bool schedlock_applies (struct thread_info *tp);
+
 /* Asynchronous signal handler registered as event loop source for
    when we have pending events ready to be passed to the core.  */
 static struct async_event_handler *infrun_async_inferior_event_token;
@@ -1961,7 +1963,13 @@ static void
 update_thread_events_after_step_over (thread_info *event_thread,
 				      const target_waitstatus &event_status)
 {
-  if (target_supports_set_thread_options (0))
+  if (schedlock_applies (event_thread))
+    {
+      /* If scheduler-locking applies, continue reporting
+	 thread-created/thread-cloned events.  */
+      return;
+    }
+  else if (target_supports_set_thread_options (0))
     {
       /* We can control per-thread options.  Disable events for the
 	 event thread, unless the thread is gone.  */
@@ -2535,9 +2543,14 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
        to start stopped.  We need to release the displaced stepping
        buffer if the stepped thread exits, so we also enable
        thread-exit events.
+
+     - If scheduler-locking applies, threads that the current thread
+       spawns should remain halted.  It's not strictly necessary to
+       enable thread-exit events in this case, but it doesn't hurt.
   */
   if (step_over_info_valid_p ()
-      || displaced_step_in_progress_thread (tp))
+      || displaced_step_in_progress_thread (tp)
+      || schedlock_applies (tp))
     {
       gdb_thread_options options
 	= GDB_THREAD_OPTION_CLONE | GDB_THREAD_OPTION_EXIT;
@@ -2546,6 +2559,13 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
       else
 	target_thread_events (true);
     }
+  else
+    {
+      if (target_supports_set_thread_options (0))
+	tp->set_thread_options (0);
+      else if (!displaced_step_in_progress_any_thread ())
+	target_thread_events (false);
+    }
 
   /* If we're resuming more than one thread simultaneously, then any
      thread other than the leader is being set to run free.  Clear any
@@ -6295,16 +6315,21 @@ handle_inferior_event (struct execution_control_state *ecs)
 	    parent->set_running (false);
 
 	  /* If resuming the child, mark it running.  */
-	  if (ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
-	      || (follow_child || (!detach_fork && (non_stop || sched_multi))))
+	  if ((ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
+	       && !schedlock_applies (ecs->event_thread))
+	      || (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+		  && (follow_child
+		      || (!detach_fork && (non_stop || sched_multi)))))
 	    child->set_running (true);
 
 	  /* In non-stop mode, also resume the other branch.  */
 	  if ((ecs->ws.kind () == TARGET_WAITKIND_THREAD_CLONED
-	       && target_is_non_stop_p ())
-	      || (!detach_fork && (non_stop
-				   || (sched_multi
-				       && target_is_non_stop_p ()))))
+	       && target_is_non_stop_p ()
+	       && !schedlock_applies (ecs->event_thread))
+	      || (ecs->ws.kind () != TARGET_WAITKIND_THREAD_CLONED
+		  && (!detach_fork && (non_stop
+				       || (sched_multi
+					   && target_is_non_stop_p ())))))
 	    {
 	      if (follow_child)
 		switch_to_thread (parent);
diff --git a/gdb/testsuite/gdb.threads/schedlock-new-thread.c b/gdb/testsuite/gdb.threads/schedlock-new-thread.c
new file mode 100644
index 00000000000..4fe776906c6
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/schedlock-new-thread.c
@@ -0,0 +1,54 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021-2023 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <assert.h>
+#include <unistd.h>
+
+static void *
+thread_func (void *arg)
+{
+#if !SCHEDLOCK
+  while (1)
+    sleep (1);
+#endif
+
+  return NULL;
+}
+
+int
+main (void)
+{
+  pthread_t thread;
+  int ret;
+
+  ret = pthread_create (&thread, NULL, thread_func, NULL); /* set break 1 here */
+  assert (ret == 0);
+
+#if SCHEDLOCK
+  /* When testing with schedlock enabled, the new thread won't run, so
+     we can't join it, as that would hang forever.  Instead, sleep for
+     a bit, enough that if the spawned thread is scheduled, it hits
+     the thread_func breakpoint before the main thread reaches the
+     "return 0" line below.  */
+  sleep (3);
+#else
+  pthread_join (thread, NULL);
+#endif
+
+  return 0; /* set break 2 here */
+}
diff --git a/gdb/testsuite/gdb.threads/schedlock-new-thread.exp b/gdb/testsuite/gdb.threads/schedlock-new-thread.exp
new file mode 100644
index 00000000000..ecaeea58f82
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/schedlock-new-thread.exp
@@ -0,0 +1,67 @@
+# Copyright 2021-2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test continuing over a thread spawn with scheduler-locking on.
+
+standard_testfile .c
+
+foreach_with_prefix schedlock {off on} {
+    set sl [expr $schedlock == "on" ? 1 : 0]
+    if { [build_executable "failed to prepare" $testfile-$sl \
+	      $srcfile \
+	      [list debug pthreads additional_flags=-DSCHEDLOCK=$sl]] \
+	     == -1 } {
+	return
+    }
+}
+
+proc test {non-stop schedlock} {
+    save_vars ::GDBFLAGS {
+	append ::GDBFLAGS " -ex \"set non-stop ${non-stop}\""
+	set sl [expr $schedlock == "on" ? 1 : 0]
+	clean_restart $::binfile-$sl
+    }
+
+    set linenum1 [gdb_get_line_number "set break 1 here"]
+
+    if { ![runto $::srcfile:$linenum1] } {
+	return
+    }
+
+    delete_breakpoints
+
+    set linenum2 [gdb_get_line_number "set break 2 here"]
+    gdb_breakpoint $linenum2
+
+    gdb_breakpoint "thread_func"
+
+    gdb_test_no_output "set scheduler-locking $schedlock"
+
+    if {$schedlock} {
+	gdb_test "continue" \
+	    "return 0.*set break 2 here .*" \
+	    "continue does not stop in new thread"
+    } else {
+	gdb_test "continue" \
+	    "thread_func .*" \
+	    "continue stops in new thread"
+    }
+}
+
+foreach_with_prefix non-stop {off on} {
+    foreach_with_prefix schedlock {off on} {
+	test ${non-stop} ${schedlock}
+    }
+}
-- 
2.34.1



* [FYI/pushed v4 21/25] Report thread exit event for leader if reporting thread exit events
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (19 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 20/25] Don't resume new threads if scheduler-locking is in effect Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 22/25] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
thread disappears from the thread list, GDB expects to shortly see a
thread exit event for it.  See, for example, this code in
remote_target::update_thread_list():

    /* Do not remove the thread if we've requested to be
       notified of its exit.  For example, the thread may be
       displaced stepping, infrun will need to handle the
       exit event, and displaced stepping info is recorded
       in the thread object.  If we deleted the thread now,
       we'd lose that info.  */
    if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
      continue;
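
To make the effect of that special case concrete, here is a minimal,
self-contained sketch of the pruning rule.  It is a toy model, not
GDB's code: toy_thread, OPTION_EXIT and prune_thread_list are
hypothetical names invented for illustration, and the bit value is
arbitrary.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for GDB_THREAD_OPTION_EXIT; the value is
// arbitrary in this sketch.
constexpr unsigned OPTION_EXIT = 1u << 1;

// Hypothetical miniature of a GDB-side thread object.
struct toy_thread
{
  int id;
  unsigned options;
};

// Return the threads GDB would keep after refreshing its list from
// the remote: a thread survives if the remote still lists it, or if
// an exit event was requested for it (the event is expected later).
std::vector<toy_thread>
prune_thread_list (const std::vector<toy_thread> &local,
		   const std::vector<int> &remote_ids)
{
  std::vector<toy_thread> kept;

  for (const toy_thread &tp : local)
    {
      bool listed = std::find (remote_ids.begin (), remote_ids.end (),
			       tp.id) != remote_ids.end ();

      if (listed || (tp.options & OPTION_EXIT) != 0)
	kept.push_back (tp);
    }

  return kept;
}
```

In this model, a thread with OPTION_EXIT set that is missing from the
remote's list stays on the local list until something reports its
exit, which is why a missing exit event leaves a stale thread behind.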

There's one scenario that deletes a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event, though: check_zombie_leaders.  This can lead to
GDB getting confused.  For example, with a following patch that
enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
this regression:

 $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
 ...
 Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
 ... some more cascading FAILs ...

gdb.log shows:

 (gdb) continue
 Continuing.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)

A passing run would have resulted in:

 (gdb) continue
 Continuing.
 No unwaited-for children left.
 (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

Note how, in the failing run, the leader thread is not listed in the
remote-reported XML thread list below:

 (gdb) set debug remote 1
 (gdb) set debug infrun 1
 (gdb) info threads
   Id   Target Id                                Frame
 * 1    Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
   3    Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
 (gdb) c
 Continuing.
 [infrun] clear_proceed_status_thread: 1163850.1163850.0
 ...
     [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
     [remote] Sending packet: $QPassSignals:#f3
     [remote] Packet received: OK
     [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
     [remote] Packet received: OK
 ...
     [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
 ...
   [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
   [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
   [infrun] prepare_to_wait: prepare_to_wait
   [infrun] reset: reason=handling event
   [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
   [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
 [infrun] fetch_inferior_event: exit
 [infrun] fetch_inferior_event: enter
   [infrun] scoped_disable_commit_resumed: reason=handling event
   [infrun] random_pending_event_thread: None found.
   [remote] wait: enter
     [remote] Packet received: N
   [remote] wait: exit
   [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
   [infrun] print_target_wait_results:   -1.0.0 [process -1],
   [infrun] print_target_wait_results:   status->kind = NO_RESUMED
   [infrun] handle_inferior_event: status->kind = NO_RESUMED
   [remote] Sending packet: $Hgp0.0#ad
   [remote] Packet received: OK
   [remote] Sending packet: $qXfer:threads:read::0,1000#92
   [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
   [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
 ...

... however, infrun decided there was still a resumed thread, so it
ignored the TARGET_WAITKIND_NO_RESUMED event.  Debugging GDB, we see
that the "found resumed" thread is the leader thread.  Even though
that thread is not on the remote-reported thread list, it is still on
the GDB thread list, due to the special case in remote.c mentioned
above.

This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
invariant.  Doing that takes a bit more code than one would imagine
offhand, because we currently always report LWP
exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
the event to GDBserver core.  For the zombie leader scenario
described, we need to record early on that we want to report a
THREAD_EXITED event, and then make sure that decision isn't lost along
the way to reporting the event to GDBserver core.
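
The "record early, don't lose the decision" idea can be sketched in
isolation as follows.  This is only an illustrative model under
assumed names (wait_kind, toy_lwp, toy_mark_dead and toy_report_kind
are all hypothetical), not the GDBserver implementation:

```cpp
#include <cassert>

// Hypothetical miniature of the waitstatus kinds involved.
enum class wait_kind { ignore, exited, thread_exited };

// Hypothetical miniature of an LWP with a pending exit status.
struct toy_lwp
{
  bool status_pending = false;
  wait_kind recorded = wait_kind::ignore;
  int exit_code = 0;
};

// Leave an exit event pending on LWP, deciding up front whether it
// should later be reported as a thread exit or as a process exit.
void
toy_mark_dead (toy_lwp &lwp, int exit_code, bool thread_event)
{
  lwp.status_pending = true;
  lwp.exit_code = exit_code;
  lwp.recorded = (thread_event
		  ? wait_kind::thread_exited
		  : wait_kind::exited);
}

// Pick the kind to report.  If a kind was recorded earlier, reuse it
// so the earlier decision survives; otherwise default to a process
// exit, as for any freshly seen exit status.
wait_kind
toy_report_kind (const toy_lwp &lwp)
{
  if (lwp.recorded != wait_kind::ignore)
    {
      assert (lwp.recorded == wait_kind::exited
	      || lwp.recorded == wait_kind::thread_exited);
      return lwp.recorded;
    }

  return wait_kind::exited;
}
```

With this shape, the zombie-leader path would call the marking
function with thread_event set to true, and the reporting path needs
no extra knowledge of how the exit was detected.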

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b
---
 gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------
 gdbserver/linux-low.h  |  5 +--
 2 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 44d0fe38030..f9001e2fa17 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -279,7 +279,8 @@ int using_threads = 1;
 static int stabilizing_threads;
 
 static void unsuspend_all_lwps (struct lwp_info *except);
-static void mark_lwp_dead (struct lwp_info *lwp, int wstat);
+static void mark_lwp_dead (struct lwp_info *lwp, int wstat,
+			   bool thread_event);
 static int lwp_is_marked_dead (struct lwp_info *lwp);
 static int kill_lwp (unsigned long lwpid, int signo);
 static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info);
@@ -1803,10 +1804,12 @@ iterate_over_lwps (ptid_t filter,
   return get_thread_lwp (thread);
 }
 
-void
+bool
 linux_process_target::check_zombie_leaders ()
 {
-  for_each_process ([this] (process_info *proc)
+  bool new_pending_event = false;
+
+  for_each_process ([&] (process_info *proc)
     {
       pid_t leader_pid = pid_of (proc);
       lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid));
@@ -1875,9 +1878,19 @@ linux_process_target::check_zombie_leaders ()
 				"(it exited, or another thread execd), "
 				"deleting it.",
 				leader_pid);
-	  delete_lwp (leader_lp);
+
+	  thread_info *leader_thread = get_lwp_thread (leader_lp);
+	  if (report_exit_events_for (leader_thread))
+	    {
+	      mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true);
+	      new_pending_event = true;
+	    }
+	  else
+	    delete_lwp (leader_lp);
 	}
     });
+
+  return new_pending_event;
 }
 
 /* Callback for `find_thread'.  Returns the first LWP that is not
@@ -2336,7 +2349,7 @@ linux_process_target::filter_event (int lwpid, int wstat)
 	  /* Since events are serialized to GDB core, and we can't
 	     report this one right now.  Leave the status pending for
 	     the next time we're able to report it.  */
-	  mark_lwp_dead (child, wstat);
+	  mark_lwp_dead (child, wstat, false);
 	  return;
 	}
       else
@@ -2655,7 +2668,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid,
 
       /* Check for zombie thread group leaders.  Those can't be reaped
 	 until all other threads in the thread group are.  */
-      check_zombie_leaders ();
+      if (check_zombie_leaders ())
+	goto retry;
 
       auto not_stopped = [&] (thread_info *thread)
 	{
@@ -2902,6 +2916,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child,
   struct thread_info *thread = get_lwp_thread (event_child);
   ptid_t ptid = ptid_of (thread);
 
+  if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED)
+    {
+      /* We're reporting a thread exit for the leader.  The exit was
+	 detected by check_zombie_leaders.  */
+      gdb_assert (is_leader (thread));
+      gdb_assert (report_exit_events_for (thread));
+
+      delete_lwp (event_child);
+      return ptid;
+    }
+
   /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise
      if a non-leader thread exits with a signal, we'd report it to the
      core which would interpret it as the whole-process exiting.
@@ -3021,7 +3046,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
     {
       if (WIFEXITED (w))
 	{
-	  ourstatus->set_exited (WEXITSTATUS (w));
+	  /* If we already have the exit recorded in waitstatus, use
+	     it.  This will happen when we detect a zombie leader,
+	     when we had GDB_THREAD_OPTION_EXIT enabled for it.  We
+	     want to report its exit as TARGET_WAITKIND_THREAD_EXITED,
+	     as the whole process hasn't exited yet.  */
+	  const target_waitstatus &ws = event_child->waitstatus;
+	  if (ws.kind () != TARGET_WAITKIND_IGNORE)
+	    {
+	      gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED
+			  || ws.kind () == TARGET_WAITKIND_THREAD_EXITED);
+	      *ourstatus = ws;
+	    }
+	  else
+	    ourstatus->set_exited (WEXITSTATUS (w));
 
 	  threads_debug_printf
 	    ("ret = %s, exited with retcode %d",
@@ -3727,8 +3765,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except)
   send_sigstop (thread, except);
 }
 
+/* Mark LWP dead, with WSTAT as exit status pending to report later.
+   If THREAD_EVENT is true, interpret WSTAT as a thread exit event
+   instead of a process exit event.  This is meaningful for the leader
+   thread, as we normally report a process-wide exit event when we see
+   the leader exit, and a thread exit event when we see any other
+   thread exit.  */
+
 static void
-mark_lwp_dead (struct lwp_info *lwp, int wstat)
+mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event)
 {
   /* Store the exit status for later.  */
   lwp->status_pending_p = 1;
@@ -3737,9 +3782,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat)
   /* Store in waitstatus as well, as there's nothing else to process
      for this event.  */
   if (WIFEXITED (wstat))
-    lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
+    {
+      if (thread_event)
+	lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat));
+      else
+	lwp->waitstatus.set_exited (WEXITSTATUS (wstat));
+    }
   else if (WIFSIGNALED (wstat))
-    lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
+    {
+      gdb_assert (!thread_event);
+      lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat)));
+    }
+  else
+    gdb_assert_not_reached ("unknown status kind");
 
   /* Prevent trying to stop it.  */
   lwp->stopped = 1;
diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
index d46ea5aa3ec..51d1899893a 100644
--- a/gdbserver/linux-low.h
+++ b/gdbserver/linux-low.h
@@ -574,8 +574,9 @@ class linux_process_target : public process_stratum_target
 
   /* Detect zombie thread group leaders, and "exit" them.  We can't
      reap their exits until all other threads in the group have
-     exited.  */
-  void check_zombie_leaders ();
+     exited.  Returns true if we left any new event pending, false
+     otherwise.  */
+  bool check_zombie_leaders ();
 
   /* Convenience function that is called when we're about to return an
      event to the core.  If the event is an exit or signalled event,
-- 
2.34.1



* [FYI/pushed v4 22/25] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (20 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 21/25] Report thread exit event for leader if reporting thread exit events Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 23/25] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Refactor the syscall assembly code in gdb/testsuite/lib/my-syscalls.S
behind a SYSCALL macro so that it's easy to add new syscalls without
duplicating code.

Note that, as implemented, the macro only works correctly for
syscalls with up to 3 arguments, and only if the syscall doesn't
return (the macro doesn't bother to save/restore callee-saved
registers).

The following patch will want to use the macro to define a wrapper for
the "exit" syscall, so these limitations remain acceptable.

Change-Id: I8acf1463b11a084d6b4579aaffb49b5d0dea3bba
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
---
 gdb/testsuite/lib/my-syscalls.S | 50 +++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/gdb/testsuite/lib/my-syscalls.S b/gdb/testsuite/lib/my-syscalls.S
index c0dbd0dffea..38299e45284 100644
--- a/gdb/testsuite/lib/my-syscalls.S
+++ b/gdb/testsuite/lib/my-syscalls.S
@@ -21,38 +21,52 @@
 
 #include <asm/unistd.h>
 
-/* int my_execve (const char *file, char *argv[], char *envp[]);  */
-
-.global my_execve
-my_execve:
+/* The SYSCALL macro below currently supports calling syscalls with up
+   to 3 arguments, and assumes the syscall never returns, like exec
+   and exit.  If you need to call syscalls with more arguments or you
+   need to call syscalls that actually return, you'll need to update
+   the macros.  We don't bother with optimizing setting up fewer
+   arguments for syscalls that take fewer arguments, as we're not
+   optimizing for speed or space, but for maintainability.  */
 
 #if defined(__x86_64__)
 
-	mov $__NR_execve, %rax
-	/* rdi, rsi and rdx already contain the right arguments.  */
-my_execve_syscall:
-	syscall
-	ret
+#define SYSCALL(NAME, NR)	\
+.global NAME			;\
+NAME:				;\
+	mov $NR, %rax		;\
+	/* rdi, rsi and rdx already contain the right arguments.  */ \
+NAME ## _syscall:		;\
+	syscall			;\
+	ret			;
 
 #elif defined(__i386__)
 
-	mov $__NR_execve, %eax
-	mov 4(%esp), %ebx
-	mov 8(%esp), %ecx
-	mov 12(%esp), %edx
-my_execve_syscall:
-	int $0x80
+#define SYSCALL(NAME, NR)	\
+.global NAME			;\
+NAME:				;\
+	mov $NR, %eax		;\
+	mov 4(%esp), %ebx	;\
+	mov 8(%esp), %ecx	;\
+	mov 12(%esp), %edx	;\
+NAME ## _syscall:		;\
+	int $0x80		;\
 	ret
 
 #elif defined(__aarch64__)
 
-	mov x8, #__NR_execve
-	/* x0, x1 and x2 already contain the right arguments.  */
-my_execve_syscall:
+#define SYSCALL(NAME, NR)	\
+.global NAME			;\
+NAME:				;\
+	mov x8, NR		;\
+	/* x0, x1 and x2 already contain the right arguments.  */ \
+NAME ## _syscall:		;\
 	svc #0
 
 #else
 # error "Unsupported architecture"
 #endif
 
+SYSCALL (my_execve, __NR_execve)
+
 	.section	.note.GNU-stack,"",@progbits
-- 
2.34.1



* [FYI/pushed v4 23/25] Testcases for stepping over thread exit syscall (PR gdb/27338)
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (21 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 22/25] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 24/25] Document remote clone events, and QThreadOptions packet Pedro Alves
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Pedro Alves, Andrew Burgess

From: Simon Marchi <simon.marchi@efficios.com>

Add new gdb.threads/step-over-thread-exit.exp and
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
testcases, exercising stepping over the thread exit syscall.  These make
use of lib/my-syscalls.S to define the exit syscall.
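
For readers unfamiliar with the distinction the tests rely on, the
following Linux-only sketch shows a thread ending itself with the raw
"exit" syscall while the rest of the process keeps running.  It uses
syscall(2) with SYS_exit rather than the my_exit wrapper from
lib/my-syscalls.S, and the function names are hypothetical; treat it
as an approximation of what the test programs do, not a copy of them.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdlib>

// Thread body: terminate only this thread with the raw exit syscall
// (as opposed to exit_group, which would end the whole process).
static void *
exiting_thread (void *)
{
  syscall (SYS_exit, 0);

  // The syscall above is not expected to return.
  abort ();
}

// Spawn a thread that exits via the raw syscall, then join it.
// Returns true if both steps succeeded, which also shows that the
// process survived the thread's exit.
bool
spawn_and_join_once ()
{
  pthread_t thread;

  if (pthread_create (&thread, nullptr, exiting_thread, nullptr) != 0)
    return false;

  return pthread_join (thread, nullptr) == 0;
}
```

Building this needs pthread support at link time (e.g., -pthread).
The tests use an assembly wrapper instead so that a breakpoint can be
placed on the exact syscall instruction, via the my_exit_syscall
label.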

Co-authored-by: Pedro Alves <pedro@palves.net>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: Ie8b2c5747db99b7023463a897a8390d9e814a9c9
---
 ...-over-thread-exit-while-stop-all-threads.c |  77 +++++++++++
 ...ver-thread-exit-while-stop-all-threads.exp |  69 ++++++++++
 .../gdb.threads/step-over-thread-exit.c       |  52 ++++++++
 .../gdb.threads/step-over-thread-exit.exp     | 126 ++++++++++++++++++
 gdb/testsuite/lib/my-syscalls.S               |   4 +
 gdb/testsuite/lib/my-syscalls.h               |   5 +
 6 files changed, 333 insertions(+)
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.c
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit.c
 create mode 100644 gdb/testsuite/gdb.threads/step-over-thread-exit.exp

diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.c b/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.c
new file mode 100644
index 00000000000..2699ad5d714
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.c
@@ -0,0 +1,77 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021-2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <unistd.h>
+#include <stdlib.h>
+#include <pthread.h>
+#include "../lib/my-syscalls.h"
+
+#define NUM_THREADS 32
+
+static void *
+stepper_over_exit_thread (void *v)
+{
+  my_exit (0);
+
+  /* my_exit above should exit the thread, we don't expect to reach
+     here.  */
+  abort ();
+}
+
+static void *
+spawner_thread (void *v)
+{
+  for (;;)
+    {
+      pthread_t threads[NUM_THREADS];
+      int i;
+
+      for (i = 0; i < NUM_THREADS; i++)
+	pthread_create (&threads[i], NULL, stepper_over_exit_thread, NULL);
+
+      for (i = 0; i < NUM_THREADS; i++)
+	pthread_join (threads[i], NULL);
+    }
+}
+
+static void
+break_here (void)
+{
+}
+
+static void *
+breakpoint_hitter_thread (void *v)
+{
+  for (;;)
+    break_here ();
+}
+
+int
+main ()
+{
+  pthread_t breakpoint_hitter;
+  pthread_t spawner;
+
+  alarm (60);
+
+  pthread_create (&spawner, NULL, spawner_thread, NULL);
+  pthread_create (&breakpoint_hitter, NULL, breakpoint_hitter_thread, NULL);
+
+  pthread_join (spawner, NULL);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.exp b/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
new file mode 100644
index 00000000000..6a46aff700e
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
@@ -0,0 +1,69 @@
+# Copyright 2021-2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test stepping over a breakpoint installed on an instruction that
+# exits the thread, while another thread is repeatedly hitting a
+# breakpoint, causing GDB to stop all threads.
+
+standard_testfile .c
+
+set syscalls_src $srcdir/lib/my-syscalls.S
+
+if { [build_executable "failed to prepare" $testfile \
+	  [list $srcfile $syscalls_src] {debug pthreads}] == -1 } {
+    return
+}
+
+proc test {displaced-stepping target-non-stop} {
+    save_vars ::GDBFLAGS {
+	append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
+	clean_restart $::binfile
+    }
+
+    gdb_test_no_output "set displaced-stepping ${displaced-stepping}"
+
+    if { ![runto_main] } {
+	return
+    }
+
+    # The "stepper over exit" threads will step over an instruction
+    # that causes them to exit.
+    gdb_test "break my_exit_syscall if 0"
+
+    # The "breakpoint hitter" thread will repeatedly hit this
+    # breakpoint, causing GDB to stop all threads.
+    gdb_test "break break_here"
+
+    # To avoid flooding the log with thread created/exited messages.
+    gdb_test_no_output "set print thread-events off"
+
+    # Make sure the target reports the breakpoint stops.
+    gdb_test_no_output "set breakpoint condition-evaluation host"
+
+    for { set i 0 } { $i < 30 } { incr i } {
+	with_test_prefix "iter $i" {
+	    if { [gdb_test "continue" "hit Breakpoint $::decimal, break_here .*"] != 0 } {
+		# Exit if there's a failure to avoid lengthy timeouts.
+		break
+	    }
+	}
+    }
+}
+
+foreach_with_prefix displaced-stepping {off auto} {
+    foreach_with_prefix target-non-stop {off on} {
+	test ${displaced-stepping} ${target-non-stop}
+    }
+}
diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit.c b/gdb/testsuite/gdb.threads/step-over-thread-exit.c
new file mode 100644
index 00000000000..878e5924c5c
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-over-thread-exit.c
@@ -0,0 +1,52 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021-2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <assert.h>
+#include <stdlib.h>
+#include "../lib/my-syscalls.h"
+
+static void *
+thread_func (void *arg)
+{
+  my_exit (0);
+
+  /* my_exit above should exit the thread, we don't expect to reach
+     here.  */
+  abort ();
+}
+
+int
+main (void)
+{
+  int i;
+
+  /* Spawn and join a thread, 100 times.  */
+  for (i = 0; i < 100; i++)
+    {
+      pthread_t thread;
+      int ret;
+
+      ret = pthread_create (&thread, NULL, thread_func, NULL);
+      assert (ret == 0);
+
+      ret = pthread_join (thread, NULL);
+      assert (ret == 0);
+    }
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit.exp b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp
new file mode 100644
index 00000000000..ed8534cf518
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp
@@ -0,0 +1,126 @@
+# Copyright 2021-2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test stepping over a breakpoint installed on an instruction that
+# exits the thread.
+
+standard_testfile .c
+
+set syscalls_src $srcdir/lib/my-syscalls.S
+
+if { [build_executable "failed to prepare" $testfile \
+	  [list $srcfile $syscalls_src] {debug pthreads}] == -1 } {
+    return
+}
+
+# Each argument is a different testing axis, most of them obvious.
+# NS_STOP_ALL is only used if testing "set non-stop on", and indicates
+# whether to have GDB explicitly stop all threads before continuing to
+# thread exit.
+proc test {displaced-stepping non-stop target-non-stop schedlock ns_stop_all} {
+    if {${non-stop} == "off" && $ns_stop_all} {
+	error "invalid arguments"
+    }
+
+    save_vars ::GDBFLAGS {
+	append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
+	append ::GDBFLAGS " -ex \"set non-stop ${non-stop}\""
+	clean_restart $::binfile
+    }
+
+    gdb_test_no_output "set displaced-stepping ${displaced-stepping}"
+
+    if { ![runto_main] } {
+	return
+    }
+
+    gdb_breakpoint "my_exit_syscall"
+
+    if {$schedlock
+	|| (${non-stop} == "on" && $ns_stop_all)} {
+	gdb_test "continue" \
+	    "Thread 2 .*hit Breakpoint $::decimal.* my_exit_syscall .*" \
+	    "continue until syscall"
+
+	if {${non-stop} == "on"} {
+	    # The test only spawns one thread at a time, so this just
+	    # stops the main thread.
+	    gdb_test_multiple "interrupt -a" "" {
+		-re "$::gdb_prompt " {
+		    gdb_test_multiple "" $gdb_test_name {
+			-re "Thread 1 \[^\r\n\]*stopped." {
+			    pass $gdb_test_name
+			}
+		    }
+		}
+	    }
+	}
+
+	gdb_test "thread 2" "Switching to thread 2 .*"
+
+	gdb_test_no_output "set scheduler-locking ${schedlock}"
+
+	gdb_test "continue" \
+	    "No unwaited-for children left." \
+	    "continue stops when thread exits"
+    } else {
+	gdb_test_no_output "set scheduler-locking ${schedlock}"
+
+	for { set i 0 } { $i < 100 } { incr i } {
+	    with_test_prefix "iter $i" {
+		set ok 0
+		set thread "<unknown>"
+		gdb_test_multiple "continue" "" {
+		    -re -wrap "Thread ($::decimal) .*hit Breakpoint $::decimal.* my_exit_syscall .*" {
+			set thread $expect_out(1,string)
+			set ok 1
+		    }
+		}
+		if {!${ok}} {
+		    # Exit if there's a failure to avoid lengthy
+		    # timeouts.
+		    break
+		}
+
+		if {${non-stop}} {
+		    gdb_test "thread $thread" "Switching to thread .*" \
+			"switch to event thread"
+		}
+	    }
+	}
+    }
+}
+
+foreach_with_prefix displaced-stepping {off auto} {
+    foreach_with_prefix non-stop {off on} {
+	foreach_with_prefix target-non-stop {off on} {
+	    if {${non-stop} == "on" && ${target-non-stop} == "off"} {
+		# Invalid combination.
+		continue
+	    }
+
+	    foreach_with_prefix schedlock {off on} {
+		if {${non-stop} == "on"} {
+		    foreach_with_prefix ns_stop_all {0 1} {
+			test ${displaced-stepping} ${non-stop} ${target-non-stop} \
+			    ${schedlock} ${ns_stop_all}
+		    }
+		} else {
+		    test ${displaced-stepping} ${non-stop} ${target-non-stop} ${schedlock} 0
+		}
+	    }
+	}
+    }
+}
diff --git a/gdb/testsuite/lib/my-syscalls.S b/gdb/testsuite/lib/my-syscalls.S
index 38299e45284..02196dd9555 100644
--- a/gdb/testsuite/lib/my-syscalls.S
+++ b/gdb/testsuite/lib/my-syscalls.S
@@ -69,4 +69,8 @@ NAME ## _syscall:		;\
 
 SYSCALL (my_execve, __NR_execve)
 
+/* void my_exit (int code);  */
+
+SYSCALL (my_exit, __NR_exit)
+
 	.section	.note.GNU-stack,"",@progbits
diff --git a/gdb/testsuite/lib/my-syscalls.h b/gdb/testsuite/lib/my-syscalls.h
index cdce05058f9..7f9ae387427 100644
--- a/gdb/testsuite/lib/my-syscalls.h
+++ b/gdb/testsuite/lib/my-syscalls.h
@@ -22,4 +22,9 @@
 
 int my_execve (const char *file, char *argv[], char *envp[]);
 
+/* `exit` syscall, which makes the thread exit (as opposed to
+   `exit_group`, which makes the process exit).  */
+
+void my_exit (int code);
+
 #endif /* MY_SYSCALLS_H */
-- 
2.34.1



* [FYI/pushed v4 24/25] Document remote clone events, and QThreadOptions packet
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (22 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 23/25] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 15:04 ` [FYI/pushed v4 25/25] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
  2023-11-13 19:28 ` [FYI/pushed v4 00/25] Step over thread clone and thread exit Tom de Vries
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Eli Zaretskii

This commit documents in both manual and NEWS:

 - the new remote clone event stop reply,
 - the new QThreadOptions packet and its current defined options,
 - the associated "set/show remote thread-options-packet" command,
 - and the associated QThreadOptions qSupported feature.
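
As a rough illustration of the QThreadOptions request format documented
in the gdb.texinfo hunk below, here is a minimal sketch of how a packet
body could be assembled.  The helper names to_hex and
build_qthreadoptions are hypothetical and are not GDB's actual remote.c
code; only the option values and the
"QThreadOptions[;options[:thread-id]]..." grammar come from the
documentation in this patch.

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Option bits, as listed in the gdb.texinfo hunk of this patch.
constexpr unsigned GDB_THREAD_OPTION_CLONE = 0x1;
constexpr unsigned GDB_THREAD_OPTION_EXIT = 0x2;

// Hypothetical helper: format VALUE as lowercase hex, without a prefix.
static std::string
to_hex (unsigned long value)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "%lx", value);
  return buf;
}

// Hypothetical helper: build a QThreadOptions packet body.  Each entry
// pairs an options mask with a thread-id string that is already in
// thread-id syntax (e.g. "p1a2b.-1").  An empty thread-id means the
// options apply to all threads.
static std::string
build_qthreadoptions
  (const std::vector<std::pair<unsigned, std::string>> &entries)
{
  std::string packet = "QThreadOptions";
  for (const auto &entry : entries)
    {
      packet += ";" + to_hex (entry.first);
      if (!entry.second.empty ())
        packet += ":" + entry.second;
    }
  return packet;
}
```

With this sketch, passing {{0u, ""}, {GDB_THREAD_OPTION_CLONE |
GDB_THREAD_OPTION_EXIT, "p1a2b.-1"}} yields
"QThreadOptions;0;3:p1a2b.-1".  Since the last matching entry wins for
each thread, this clears the options everywhere and then enables both
events for all threads of process 0x1a2b.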

Approved-By: Eli Zaretskii <eliz@gnu.org>
Change-Id: Ic1c8de1fefba95729bbd242969284265de42427e
---
 gdb/NEWS            |  20 +++++++
 gdb/doc/gdb.texinfo | 128 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 145 insertions(+), 3 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index f2861b1ace1..682def44ce0 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -29,6 +29,26 @@ disassemble
 maintenance info linux-lwps
   List all LWPs under control of the linux-nat target.
 
+set remote thread-options-packet
+show remote thread-options-packet
+  Set/show the use of the thread options packet.
+
+* New remote packets
+
+New stop reason: clone
+  Indicates that a clone system call was executed.
+
+QThreadOptions
+  Enable/disable optional event reporting, on a per-thread basis.
+  Currently supported options are GDB_THREAD_OPTION_CLONE, to enable
+  clone event reporting, and GDB_THREAD_OPTION_EXIT to enable thread
+  exit event reporting.
+
+QThreadOptions in qSupported
+  The qSupported packet allows GDB to inform the stub it supports the
+  QThreadOptions packet, and the qSupported response can contain the
+  set of thread options the remote stub supports.
+
 *** Changes in GDB 14
 
 * GDB now supports the AArch64 Scalable Matrix Extension 2 (SME2), which
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 79b7431dd78..e4c00143fd1 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -24356,6 +24356,10 @@ future connections is shown.  The available settings are:
 @tab @code{QThreadEvents}
 @tab Tracking thread lifetime.
 
+@item @code{thread-options}
+@tab @code{QThreadOptions}
+@tab Set thread event reporting options.
+
 @item @code{no-resumed-stop-reply}
 @tab @code{no resumed thread left stop reply}
 @tab Tracking thread lifetime.
@@ -43280,6 +43284,17 @@ appropriate @samp{qSupported} feature (@pxref{qSupported}).  The
 remote stub must also supply the appropriate @samp{qSupported} feature
 indicating support.
 
+@cindex thread clone events, remote reply
+@anchor{thread clone event}
+@item clone
+The packet indicates that @code{clone} was called, and @var{r} is the
+thread ID of the new child thread, as specified in @ref{thread-id
+syntax}.  This packet is only applicable to targets that support clone
+events.
+
+This packet should not be sent by default; @value{GDBN} requests it
+with the @ref{QThreadOptions} packet.
+
 @cindex thread create event, remote reply
 @anchor{thread create event}
 @item create
@@ -43318,9 +43333,10 @@ hex strings.
 @item w @var{AA} ; @var{tid}
 
 The thread exited, and @var{AA} is the exit status.  This response
-should not be sent by default; @value{GDBN} requests it with the
-@ref{QThreadEvents} packet.  See also @ref{thread create event} above.
-@var{AA} is formatted as a big-endian hex string.
+should not be sent by default; @value{GDBN} requests it with either
+the @ref{QThreadEvents} or @ref{QThreadOptions} packets.  See also
+@ref{thread create event} above.  @var{AA} is formatted as a
+big-endian hex string.
 
 @item N
 There are no resumed threads left in the target.  In other words, even
@@ -44045,6 +44061,11 @@ same thread.  @value{GDBN} does not enable this feature unless the
 stub reports that it supports it by including @samp{QThreadEvents+} in
 its @samp{qSupported} reply.
 
+This packet always enables/disables event reporting for all threads of
+all processes under control of the remote stub.  For per-thread
+control of optional event reporting, see the @ref{QThreadOptions}
+packet.
+
 Reply:
 @table @samp
 @item OK
@@ -44061,6 +44082,95 @@ the stub.
 Use of this packet is controlled by the @code{set remote thread-events}
 command (@pxref{Remote Configuration, set remote thread-events}).
 
+@anchor{QThreadOptions}
+@item QThreadOptions@r{[};@var{options}@r{[}:@var{thread-id}@r{]]}@dots{}
+@cindex thread options, remote request
+@cindex @samp{QThreadOptions} packet
+
+For each inferior thread, the last @var{options} in the list with a
+matching @var{thread-id} are applied.  Any options previously set on a
+thread are discarded and replaced by the new options specified.
+Threads that do not match any @var{thread-id} retain their
+previously-set options.  Thread IDs are specified using the syntax
+described in @ref{thread-id syntax}.  If multiprocess extensions
+(@pxref{multiprocess extensions}) are supported, options can be
+specified to apply to all threads of a process by using the
+@samp{p@var{pid}.-1} form of @var{thread-id}.  Options with no
+@var{thread-id} apply to all threads.  Specifying no options value is
+an error.  Zero is a valid value.
+
+@var{options} is a hexadecimal integer specifying the enabled thread
+options, and is the bitwise @code{OR} of the following values.  All
+values are given in hexadecimal representation.
+
+@table @code
+@item GDB_THREAD_OPTION_CLONE (0x1)
+Report thread clone events (@pxref{thread clone event}).  This is only
+meaningful for targets that support clone events (e.g., GNU/Linux
+systems).
+
+@item GDB_THREAD_OPTION_EXIT (0x2)
+Report thread exit events (@pxref{thread exit event}).
+@end table
+
+@noindent
+
+For example, @value{GDBN} enables the @code{GDB_THREAD_OPTION_EXIT}
+and @code{GDB_THREAD_OPTION_CLONE} options when single-stepping a
+thread past a breakpoint, for the following reasons:
+
+@itemize @bullet
+@item
+If the single-stepped thread exits (e.g., it executes a thread exit
+system call), enabling @code{GDB_THREAD_OPTION_EXIT} prevents
+@value{GDBN} from waiting forever, not knowing that it should no
+longer expect a stop for that same thread, and blocking other threads
+from progressing.
+
+@item
+If the single-stepped thread spawns a new clone child (i.e., it
+executes a clone system call), enabling @code{GDB_THREAD_OPTION_CLONE}
+halts the cloned thread before it executes any instructions, and thus
+prevents the following problematic situations:
+
+@itemize @minus
+@item
+If the breakpoint is stepped-over in-line, the spawned thread would
+incorrectly run free while the breakpoint being stepped over is not
+inserted, and thus the cloned thread may potentially run past the
+breakpoint without stopping for it;
+
+@item
+If displaced (out-of-line) stepping is used, the cloned thread starts
+running at the out-of-line PC, leading to undefined behavior, usually
+crashing or corrupting data.
+@end itemize
+
+@end itemize
+
+New threads start with thread options cleared.
+
+@value{GDBN} does not enable this feature unless the stub reports that
+it supports it by including
+@samp{QThreadOptions=@var{supported_options}} in its @samp{qSupported}
+reply.
+
+Reply:
+@table @samp
+@item OK
+The request succeeded.
+
+@item E @var{nn}
+An error occurred.  The error number @var{nn} is given as hex digits.
+
+@item @w{}
+An empty reply indicates that @samp{QThreadOptions} is not supported by
+the stub.
+@end table
+
+Use of this packet is controlled by the @code{set remote thread-options}
+command (@pxref{Remote Configuration, set remote thread-options}).
+
 @item qRcmd,@var{command}
 @cindex execute remote command, remote request
 @cindex @samp{qRcmd} packet
@@ -44506,6 +44616,11 @@ These are the currently defined stub features and their properties:
 @tab @samp{-}
 @tab No
 
+@item @samp{QThreadOptions}
+@tab Yes
+@tab @samp{-}
+@tab No
+
 @item @samp{no-resumed}
 @tab No
 @tab @samp{-}
@@ -44727,6 +44842,13 @@ The remote stub reports the supported actions in the reply to
 @item QThreadEvents
 The remote stub understands the @samp{QThreadEvents} packet.
 
+@item QThreadOptions=@var{supported_options}
+The remote stub understands the @samp{QThreadOptions} packet.
+@var{supported_options} indicates the set of thread options the remote
+stub supports.  @var{supported_options} has the same format as the
+@var{options} parameter of the @code{QThreadOptions} packet, described
+at @ref{QThreadOptions}.
+
 @item no-resumed
 The remote stub reports the @samp{N} stop reply.
 
-- 
2.34.1



* [FYI/pushed v4 25/25] Cancel execution command on thread exit, when stepping, nexting, etc.
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (23 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 24/25] Document remote clone events, and QThreadOptions packet Pedro Alves
@ 2023-11-13 15:04 ` Pedro Alves
  2023-11-13 19:28 ` [FYI/pushed v4 00/25] Step over thread clone and thread exit Tom de Vries
  25 siblings, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-13 15:04 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

If your target has no support for TARGET_WAITKIND_NO_RESUMED events
(and no way to support them, such as the yet-unsubmitted AMDGPU
target), and you step over thread exit with scheduler-locking on, this
is what you get:

 (gdb) n
 [Thread ... exited]
 *hang*

Getting back the prompt by typing Ctrl-C may not even work, since no
inferior thread is running to receive the SIGINT.  Even if it works,
it seems unnecessarily harsh.  If you started an execution command for
which there's a clear thread of interest (step, next, until, etc.),
and that thread disappears, then I think it's more user friendly if
GDB just detects the situation and aborts the command, giving back the
prompt.

That is what this commit implements.  It does this by explicitly
requesting the target to report thread exit events whenever the main
resumed thread has a thread_fsm.  Note that unlike stepping over a
breakpoint, we don't need to enable clone events in this case.
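
To make the rule above concrete, here is a simplified sketch of which
thread options get requested at resume time.  The thread_state struct
and the wanted_thread_options function are hypothetical stand-ins for
illustration; the real logic lives in do_target_resume in the diff
below and consults more state than this.

```cpp
// Option bits, as documented for the QThreadOptions packet.
constexpr unsigned GDB_THREAD_OPTION_CLONE = 0x1;
constexpr unsigned GDB_THREAD_OPTION_EXIT = 0x2;

// Hypothetical, simplified stand-in for the per-thread state consulted
// when resuming; this is not GDB's real thread_info.
struct thread_state
{
  // A step-over of a breakpoint (in-line or displaced) is in progress.
  bool stepping_over_breakpoint;
  // An execution command (step, next, until, ...) owns this thread.
  bool has_thread_fsm;
};

// Return the set of thread options to request for TP.
static unsigned
wanted_thread_options (const thread_state &tp)
{
  // Stepping over a breakpoint needs both clone and exit events.
  if (tp.stepping_over_breakpoint)
    return GDB_THREAD_OPTION_CLONE | GDB_THREAD_OPTION_EXIT;

  // A command with a thread of interest only needs exit events, so
  // that it can be aborted if the thread disappears.
  if (tp.has_thread_fsm)
    return GDB_THREAD_OPTION_EXIT;

  // Otherwise no optional events are needed.
  return 0;
}
```

For example, a plain "next" that is not stepping over a breakpoint
maps to {false, true} here and requests only GDB_THREAD_OPTION_EXIT.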

With this patch, we get:

 (gdb) n
 [Thread 0x7ffff7d89700 (LWP 3961883) exited]
 Command aborted, thread exited.
 (gdb)

Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: I901ab64c91d10830590b2dac217b5264635a2b95
---
 gdb/infrun.c                                  | 73 ++++++++++++++---
 .../gdb.threads/step-over-thread-exit.exp     | 81 +++++++++++++------
 2 files changed, 118 insertions(+), 36 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 943ea88538c..62b306ff347 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -1956,6 +1956,22 @@ displaced_step_prepare (thread_info *thread)
   return status;
 }
 
+/* True if any thread of TARGET that matches RESUME_PTID requires
+   target_thread_events enabled.  This assumes TARGET does not support
+   target thread options.  */
+
+static bool
+any_thread_needs_target_thread_events (process_stratum_target *target,
+				       ptid_t resume_ptid)
+{
+  for (thread_info *tp : all_non_exited_threads (target, resume_ptid))
+    if (displaced_step_in_progress_thread (tp)
+	|| schedlock_applies (tp)
+	|| tp->thread_fsm () != nullptr)
+      return true;
+  return false;
+}
+
 /* Maybe disable thread-{cloned,created,exited} event reporting after
    a step-over (either in-line or displaced) finishes.  */
 
@@ -1979,9 +1995,10 @@ update_thread_events_after_step_over (thread_info *event_thread,
   else
     {
       /* We can only control the target-wide target_thread_events
-	 setting.  Disable it, but only if other threads don't need it
-	 enabled.  */
-      if (!displaced_step_in_progress_any_thread ())
+	 setting.  Disable it, but only if other threads in the target
+	 don't need it enabled.  */
+      process_stratum_target *target = event_thread->inf->process_target ();
+      if (!any_thread_needs_target_thread_events (target, minus_one_ptid))
 	target_thread_events (false);
     }
 }
@@ -2559,12 +2576,25 @@ do_target_resume (ptid_t resume_ptid, bool step, enum gdb_signal sig)
       else
 	target_thread_events (true);
     }
+  else if (tp->thread_fsm () != nullptr)
+    {
+      gdb_thread_options options = GDB_THREAD_OPTION_EXIT;
+      if (target_supports_set_thread_options (options))
+	tp->set_thread_options (options);
+      else
+	target_thread_events (true);
+    }
   else
     {
       if (target_supports_set_thread_options (0))
 	tp->set_thread_options (0);
-      else if (!displaced_step_in_progress_any_thread ())
-	target_thread_events (false);
+      else
+	{
+	  process_stratum_target *resume_target = tp->inf->process_target ();
+	  if (!any_thread_needs_target_thread_events (resume_target,
+						      resume_ptid))
+	    target_thread_events (false);
+	}
     }
 
   /* If we're resuming more than one thread simultaneously, then any
@@ -5842,6 +5872,13 @@ handle_thread_exited (execution_control_state *ecs)
   ecs->event_thread->stepping_over_breakpoint = 0;
   ecs->event_thread->stepping_over_watchpoint = 0;
 
+  /* If the thread had an FSM, then abort the command.  But only after
+     finishing the step over, as in non-stop mode, aborting this
+     thread's command should not interfere with other threads.  We
+     must check this before finish_step_over, however, which may
+     update the thread list and delete the event thread.  */
+  bool abort_cmd = (ecs->event_thread->thread_fsm () != nullptr);
+
   /* Maybe the thread was doing a step-over, if so release
      resources and start any further pending step-overs.
 
@@ -5855,6 +5892,13 @@ handle_thread_exited (execution_control_state *ecs)
      the event thread has exited.  */
   gdb_assert (ret == 0);
 
+  if (abort_cmd)
+    {
+      delete_thread (ecs->event_thread);
+      ecs->event_thread = nullptr;
+      return false;
+    }
+
   /* If finish_step_over started a new in-line step-over, don't
      try to restart anything else.  */
   if (step_over_info_valid_p ())
@@ -9287,7 +9331,8 @@ normal_stop ()
       if (inferior_ptid != null_ptid)
 	finish_ptid = ptid_t (inferior_ptid.pid ());
     }
-  else if (last.kind () != TARGET_WAITKIND_NO_RESUMED)
+  else if (last.kind () != TARGET_WAITKIND_NO_RESUMED
+	   && last.kind () != TARGET_WAITKIND_THREAD_EXITED)
     finish_ptid = inferior_ptid;
 
   gdb::optional<scoped_finish_thread_state> maybe_finish_thread_state;
@@ -9330,7 +9375,8 @@ normal_stop ()
     {
       if ((last.kind () != TARGET_WAITKIND_SIGNALLED
 	   && last.kind () != TARGET_WAITKIND_EXITED
-	   && last.kind () != TARGET_WAITKIND_NO_RESUMED)
+	   && last.kind () != TARGET_WAITKIND_NO_RESUMED
+	   && last.kind () != TARGET_WAITKIND_THREAD_EXITED)
 	  && target_has_execution ()
 	  && previous_thread != inferior_thread ())
 	{
@@ -9346,7 +9392,8 @@ normal_stop ()
       update_previous_thread ();
     }
 
-  if (last.kind () == TARGET_WAITKIND_NO_RESUMED)
+  if (last.kind () == TARGET_WAITKIND_NO_RESUMED
+      || last.kind () == TARGET_WAITKIND_THREAD_EXITED)
     {
       stop_print_frame = false;
 
@@ -9354,7 +9401,12 @@ normal_stop ()
 	if (current_ui->prompt_state == PROMPT_BLOCKED)
 	  {
 	    target_terminal::ours_for_output ();
-	    gdb_printf (_("No unwaited-for children left.\n"));
+	    if (last.kind () == TARGET_WAITKIND_NO_RESUMED)
+	      gdb_printf (_("No unwaited-for children left.\n"));
+	    else if (last.kind () == TARGET_WAITKIND_THREAD_EXITED)
+	      gdb_printf (_("Command aborted, thread exited.\n"));
+	    else
+	      gdb_assert_not_reached ("unhandled");
 	  }
     }
 
@@ -9437,7 +9489,8 @@ normal_stop ()
     {
       if (last.kind () != TARGET_WAITKIND_SIGNALLED
 	  && last.kind () != TARGET_WAITKIND_EXITED
-	  && last.kind () != TARGET_WAITKIND_NO_RESUMED)
+	  && last.kind () != TARGET_WAITKIND_NO_RESUMED
+	  && last.kind () != TARGET_WAITKIND_THREAD_EXITED)
 	/* Delete the breakpoint we stopped at, if it wants to be deleted.
 	   Delete any breakpoint that is to be deleted at the next stop.  */
 	breakpoint_auto_delete (inferior_thread ()->control.stop_bpstat);
diff --git a/gdb/testsuite/gdb.threads/step-over-thread-exit.exp b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp
index ed8534cf518..615bd838763 100644
--- a/gdb/testsuite/gdb.threads/step-over-thread-exit.exp
+++ b/gdb/testsuite/gdb.threads/step-over-thread-exit.exp
@@ -29,7 +29,7 @@ if { [build_executable "failed to prepare" $testfile \
 # NS_STOP_ALL is only used if testing "set non-stop on", and indicates
 # whether to have GDB explicitly stop all threads before continuing to
 # thread exit.
-proc test {displaced-stepping non-stop target-non-stop schedlock ns_stop_all} {
+proc test {displaced-stepping non-stop target-non-stop schedlock cmd ns_stop_all} {
     if {${non-stop} == "off" && $ns_stop_all} {
 	error "invalid arguments"
     }
@@ -72,31 +72,58 @@ proc test {displaced-stepping non-stop target-non-stop schedlock ns_stop_all} {
 
 	gdb_test_no_output "set scheduler-locking ${schedlock}"
 
-	gdb_test "continue" \
-	    "No unwaited-for children left." \
-	    "continue stops when thread exits"
+	if {$cmd == "continue"} {
+	    gdb_test "continue" \
+		"No unwaited-for children left." \
+		"continue stops when thread exits"
+	} else {
+	    gdb_test_multiple $cmd "command aborts when thread exits" {
+		-re "Command aborted, thread exited\\.\r\n$::gdb_prompt " {
+		    pass $gdb_test_name
+		}
+	    }
+	}
     } else {
 	gdb_test_no_output "set scheduler-locking ${schedlock}"
 
-	for { set i 0 } { $i < 100 } { incr i } {
-	    with_test_prefix "iter $i" {
-		set ok 0
-		set thread "<unknown>"
-		gdb_test_multiple "continue" "" {
-		    -re -wrap "Thread ($::decimal) .*hit Breakpoint $::decimal.* my_exit_syscall .*" {
-			set thread $expect_out(1,string)
-			set ok 1
-		    }
+	if {$cmd != "continue"} {
+	    set thread "<unknown>"
+	    gdb_test_multiple "continue" "" {
+		-re -wrap "Thread ($::decimal) .*hit Breakpoint $::decimal.* my_exit_syscall .*" {
+		    set thread $expect_out(1,string)
 		}
-		if {!${ok}} {
-		    # Exit if there's a failure to avoid lengthy
-		    # timeouts.
-		    break
+	    }
+	    if {${non-stop}} {
+		gdb_test -nopass "thread $thread" "Switching to thread .*" \
+		    "switch to event thread"
+	    }
+
+	    gdb_test_multiple $cmd "command aborts when thread exits" {
+		-re "Command aborted, thread exited\\.\r\n$::gdb_prompt " {
+		    pass $gdb_test_name
 		}
+	    }
+	} else {
+	    for { set i 0 } { $i < 100 } { incr i } {
+		with_test_prefix "iter $i" {
+		    set ok 0
+		    set thread "<unknown>"
+		    gdb_test_multiple "continue" "" {
+			-re -wrap "Thread ($::decimal) .*hit Breakpoint $::decimal.* my_exit_syscall .*" {
+			    set thread $expect_out(1,string)
+			    set ok 1
+			}
+		    }
+		    if {!${ok}} {
+			# Exit if there's a failure to avoid lengthy
+			# timeouts.
+			break
+		    }
 
-		if {${non-stop}} {
-		    gdb_test "thread $thread" "Switching to thread .*" \
-			"switch to event thread"
+		    if {${non-stop}} {
+			gdb_test -nopass "thread $thread" "Switching to thread .*" \
+			    "switch to event thread"
+		    }
 		}
 	    }
 	}
@@ -112,13 +139,15 @@ foreach_with_prefix displaced-stepping {off auto} {
 	    }
 
 	    foreach_with_prefix schedlock {off on} {
-		if {${non-stop} == "on"} {
-		    foreach_with_prefix ns_stop_all {0 1} {
-			test ${displaced-stepping} ${non-stop} ${target-non-stop} \
-			    ${schedlock} ${ns_stop_all}
+		foreach_with_prefix cmd {"next" "continue"} {
+		    if {${non-stop} == "on"} {
+			foreach_with_prefix ns_stop_all {0 1} {
+			    test ${displaced-stepping} ${non-stop} ${target-non-stop} \
+				${schedlock} ${cmd} ${ns_stop_all}
+			}
+		    } else {
+			test ${displaced-stepping} ${non-stop} ${target-non-stop} ${schedlock} ${cmd} 0
 		    }
-		} else {
-		    test ${displaced-stepping} ${non-stop} ${target-non-stop} ${schedlock} 0
 		}
 	    }
 	}
-- 
2.34.1



* Re: [FYI/pushed v4 00/25] Step over thread clone and thread exit
  2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
                   ` (24 preceding siblings ...)
  2023-11-13 15:04 ` [FYI/pushed v4 25/25] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
@ 2023-11-13 19:28 ` Tom de Vries
  2023-11-14 10:51   ` Pedro Alves
  25 siblings, 1 reply; 49+ messages in thread
From: Tom de Vries @ 2023-11-13 19:28 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 11/13/23 16:04, Pedro Alves wrote:
> Here's v4 of the series I previously posted here:

I'm seeing new FAILs:
...
FAIL: gdb.threads/stepi-over-clone.exp: continue
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=on: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=off: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=off: displaced=on: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=0: stepi (timeout)
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=0: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=1: stepi (timeout)
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=1: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=off: i=2: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=0: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=1: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=on: displaced=on: i=2: $bad_threads == 0
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=off: i=2: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=0: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=0: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=1: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=1: $thread_count == 2
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=2: stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=true: non-stop=off: displaced=on: i=2: $thread_count == 2
...

First in more detail:
...
(gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
continue^M
Continuing.^M
^M
Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M
warning: 78     ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory^M
(gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue
...

Thanks,
- Tom


* Re: [FYI/pushed v4 00/25] Step over thread clone and thread exit
  2023-11-13 19:28 ` [FYI/pushed v4 00/25] Step over thread clone and thread exit Tom de Vries
@ 2023-11-14 10:51   ` Pedro Alves
  2023-11-14 13:39     ` Tom de Vries
  0 siblings, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2023-11-14 10:51 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

Hi Tom,

On 2023-11-13 19:28, Tom de Vries wrote:

> I'm seeing new FAILs:
> ...
> FAIL: gdb.threads/stepi-over-clone.exp: continue
> FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: stepi
> FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: $thread_count == 2

...

> ...
> 
> First in more detail:
> ...
> (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
> continue^M
> Continuing.^M
> ^M
> Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M
> warning: 78     ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory^M
> (gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue
> ...
> 

Thanks.  I think the patch below would fix this one.  The others are hopefully something similar,
but I wasn't able to spot anything wrong by inspection.  I'd have to see the relevant part of the
gdb.log to hazard a better guess.


--- 8< ---
From: Pedro Alves <pedro@palves.net>
Subject: [PATCH] Fix gdb.threads/stepi-over-clone.exp regexp

Tom de Vries reported this FAIL:

 (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
 continue^M
 Continuing.^M
 ^M
 Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M
 warning: 78     ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory^M
 (gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue

All but one of the regexps in the .exp file use "clone\[23\]?", with
"?", to also accept plain "clone".  The failing case is the one that
lacks the "?".  This commit fixes that case to also use "?".
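
The difference between the two patterns can be checked in isolation.
Below is a small sketch using std::regex; the matches helper is
hypothetical, and ECMAScript syntax is used instead of Tcl's, on the
assumption that the bracket expression and "?" behave the same way in
both for this pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return true if NAME is matched in full by
// PATTERN, interpreted with std::regex's default ECMAScript grammar.
static bool
matches (const std::string &pattern, const std::string &name)
{
  return std::regex_match (name, std::regex (pattern));
}
```

With this helper, matches ("clone[23]", "clone") is false, while
matches ("clone[23]?", "clone") is true.  Both patterns still accept
"clone2" and "clone3".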

Change-Id: I74ca9e7d4cfe6af294fd50e8c509fcbad289b78c
---
 gdb/testsuite/gdb.threads/stepi-over-clone.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
index 4c496429632..18cfec19ffa 100644
--- a/gdb/testsuite/gdb.threads/stepi-over-clone.exp
+++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
@@ -45,7 +45,7 @@ gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
 }
 
 gdb_test "continue" \
-    "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*"
+    "Catchpoint $decimal \\(call to syscall clone\[23\]?\\), .*"
 
 # Return true if INSN is a syscall instruction.
 

base-commit: 319b460545dc79280e2904dcc280057cf71fb753
-- 
2.34.1




* Re: [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-13 15:04 ` [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
@ 2023-11-14 12:55   ` Guinevere Larsen
  2023-11-14 13:26     ` Pedro Alves
  2023-11-14 13:28     ` Pedro Alves
  0 siblings, 2 replies; 49+ messages in thread
From: Guinevere Larsen @ 2023-11-14 12:55 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Andrew Burgess

On 13/11/2023 16:04, Pedro Alves wrote:
> (A good chunk of the problem statement in the commit log below is
> Andrew's, adjusted for a different solution, and for covering
> displaced stepping too.  The testcase is mostly Andrew's too.)
>
> This commit addresses bugs gdb/19675 and gdb/27830, which are about
> stepping over a breakpoint set at a clone syscall instruction.  One is
> about displaced stepping, and the other about in-line stepping.
>
> Currently, when a new thread is created through a clone syscall, GDB
> sets the new thread running.  With 'continue' this makes sense
> (assuming no schedlock):
>
>   - all-stop mode, user issues 'continue', all threads are set running,
>     a newly created thread should also be set running.
>
>   - non-stop mode, user issues 'continue', other pre-existing threads
>     are not affected, but as the new thread is (sort-of) a child of the
>     thread the user asked to run, it makes sense that the new threads
>     should be created in the running state.
>
> Similarly, if we are stopped at the clone syscall, and there's no
> software breakpoint at this address, then the current behaviour is
> fine:
>
>   - all-stop mode, user issues 'stepi', stepping will be done in place
>     (as there's no breakpoint to step over).  While stepping the thread
>     of interest all the other threads will be allowed to continue.  A
>     newly created thread will be set running, and then stopped once the
>     thread of interest has completed its step.
>
>   - non-stop mode, user issues 'stepi', stepping will be done in place
>     (as there's no breakpoint to step over).  Other threads might be
>     running or stopped, but as with the continue case above, the new
>     thread will be created running.  The only possible issue here is
>     that the new thread will be left running after the initial thread
>     has completed its stepi.  The user would need to manually select
>     the thread and interrupt it, this might not be what the user
>     expects.  However, this is not something this commit tries to
>     change.
>
> The problem then is what happens when we try to step over a clone
> syscall if there is a breakpoint at the syscall address.
>
> - For both all-stop and non-stop modes, with in-line stepping:
>
>     + user issues 'stepi',
>     + [non-stop mode only] GDB stops all threads.  In all-stop mode all
>       threads are already stopped.
>     + GDB removes s/w breakpoint at syscall address,
>     + GDB single steps just the thread of interest, all other threads
>       are left stopped,
>     + New thread is created running,
>     + Initial thread completes its step,
>     + [non-stop mode only] GDB resumes all threads that it previously
>       stopped.
>
> There are two problems in the in-line stepping scenario above:
>
>    1. The new thread might pass through the same code that the initial
>       thread is in (i.e. the clone syscall code), in which case it will
>       fail to hit the breakpoint in clone as this was removed so the
>       first thread can single step,
>
>    2. The new thread might trigger some other stop event before the
>       initial thread reports its step completion.  If this happens we
>       end up triggering an assertion as GDB assumes that only the
>       thread being stepped should stop.  The assert looks like this:
>
>       infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
>
> - For both all-stop and non-stop modes, with displaced stepping:
>
>     + user issues 'stepi',
>     + GDB starts the displaced step, moves thread's PC to the
>       out-of-line scratch pad, maybe adjusts registers,
>     + GDB single steps the thread of interest, [non-stop mode only] all
>       other threads are left as they were, either running or stopped.
>       In all-stop, all other threads are left stopped.
>     + New thread is created running,
>     + Initial thread completes its step, GDB re-adjusts its PC,
>       restores/releases scratchpad,
>     + [non-stop mode only] GDB resumes the thread, now past its
>       breakpoint.
>     + [all-stop mode only] GDB resumes all threads.
>
> There is one problem with the displaced stepping scenario above:
>
>    3. When the parent thread completed its step, GDB adjusted its PC,
>       but did not adjust the child's PC, thus that new child thread
>       will continue execution in the scratch pad, invoking undefined
>       behavior.  If you're lucky, you see a crash.  If unlucky, the
>       inferior gets silently corrupted.
>
> What is needed is for GDB to have more control over whether the new
> thread is created running or not.  Issue #1 above requires that the
> new thread not be allowed to run until the breakpoint has been
> reinserted.  The only way to guarantee this is if the new thread is
> held in a stopped state until the single step has completed.  Issue #3
> above requires that GDB is informed of when a thread clones itself,
> and of what is the child's ptid, so that GDB can fixup both the parent
> and the child.
>
> When looking for solutions to this problem I considered how GDB
> handles fork/vfork as these have some of the same issues.  The main
> difference between fork/vfork and clone is that the clone events are
> not reported back to core GDB.  Instead, the clone event is handled
> automatically in the target code and the child thread is immediately
> set running.
>
> Note we have support for requesting thread creation events out of the
> target (TARGET_WAITKIND_THREAD_CREATED).  However, those are reported
> for the new/child thread.  That would be sufficient to address in-line
> stepping (issue #1), but not for displaced-stepping (issue #3).  To
> handle displaced-stepping, we need an event that is reported to the
> _parent_ of the clone, as the information about the displaced step is
> associated with the clone parent.  TARGET_WAITKIND_THREAD_CREATED
> includes no indication of which thread is the parent that spawned the
> new child.  In fact, for some targets, like e.g., Windows, it would be
> impossible to know which thread that was, as thread creation there
> doesn't work by "cloning".
>
> The solution implemented here is to model clone on fork/vfork, and
> introduce a new TARGET_WAITKIND_THREAD_CLONED event.  This event is
> similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
> that we end up with a new thread in the same process, instead of a new
> thread of a new process.  Like FORKED and VFORKED, THREAD_CLONED
> waitstatuses have a child_ptid property, and the child is held stopped
> until GDB explicitly resumes it.  This addresses the in-line stepping
> case (issues #1 and #2).
>
> The infrun code that handles displaced stepping fixup for the child
> after a fork/vfork event is thus reused for THREAD_CLONE, with some
> minimal conditions added, addressing the displaced stepping case
> (issue #3).
>
> The native Linux backend is adjusted to unconditionally report
> TARGET_WAITKIND_THREAD_CLONED events to the core.
>
> Following the follow_fork model in core GDB, we introduce a
> target_follow_clone target method, which is responsible for making the
> new clone child visible to the rest of GDB.
>
> Subsequent patches will add clone events support to the remote
> protocol and gdbserver.
>
> displaced_step_in_progress_thread becomes unused with this patch, but
> a new use will reappear later in the series.  To avoid deleting it and
> readding it back, this patch marks it with attribute unused, and the
> latter patch removes the attribute again.  We need to do this because
> the function is static, and with no callers, the compiler would warn,
> (error with -Werror), breaking the build.
>
> This adds a new gdb.threads/stepi-over-clone.exp testcase, which
> exercises stepping over a clone syscall, with displaced stepping vs
> inline stepping, and all-stop vs non-stop.  We already test stepping
> over clone syscalls with gdb.base/step-over-syscall.exp, but this test
> uses pthreads, while the other test uses raw clone, and this one is
> more thorough.  The testcase passes on native GNU/Linux, but fails
> against GDBserver.  GDBserver will be fixed by a later patch in the
> series.
>
> Co-authored-by: Andrew Burgess <aburgess@redhat.com>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
> Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
> Reviewed-By: Andrew Burgess <aburgess@redhat.com>
> ---
>   gdb/infrun.c                                  | 158 +++----
>   gdb/linux-nat.c                               | 252 +++++------
>   gdb/linux-nat.h                               |   2 +
>   gdb/target-delegates.c                        |  24 ++
>   gdb/target.c                                  |   7 +
>   gdb/target.h                                  |   7 +
>   gdb/target/waitstatus.c                       |   1 +
>   gdb/target/waitstatus.h                       |  31 +-
>   gdb/testsuite/gdb.threads/stepi-over-clone.c  |  90 ++++
>   .../gdb.threads/stepi-over-clone.exp          | 395 ++++++++++++++++++
>   10 files changed, 775 insertions(+), 192 deletions(-)
>   create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.c
>   create mode 100644 gdb/testsuite/gdb.threads/stepi-over-clone.exp
>
The test introduced by this patch does not work with clang: it adds 110
FAILs, and almost all of the PASSes seem to be incorrect. I left inline
thoughts on the incorrect PASSes below; the FAILs are pretty
self-evident based on those thoughts, if they are correct.
Unfortunately, something is up with my local setup and I can't run the
test locally to double check, so I'm stuck looking at the buildbot log.

For a full log, check: 
https://builder.sourceware.org/testrun/9907246dce7004c32f7dabc8bbce8e5c9788b7ca?dgexpfile=gdb.threads%2Fstepi-over-clone.exp

> diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> new file mode 100644
> index 00000000000..e580f2248ac
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> @@ -0,0 +1,395 @@
> +# Copyright 2021-2023 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test performing a 'stepi' over a clone syscall instruction.
> +
> +# This test relies on us being able to spot syscall instructions in
> +# disassembly output.  For now this is only implemented for x86-64.
> +require {istarget x86_64-*-*}
> +
> +# Test only on native targets, for now.
> +proc is_native_target {} {
> +    return [expr {[target_info gdb_protocol] == ""}]
> +}
> +require is_native_target
> +
> +standard_testfile
> +
> +if { [prepare_for_testing "failed to prepare" $testfile $srcfile \
> +	  {debug pthreads additional_flags=-static}] } {
> +    return
> +}
> +
> +if {![runto_main]} {
> +    return
> +}
> +
> +# Arrange to catch the 'clone' syscall, run until we catch the
> +# syscall, and try to figure out the address of the actual syscall
> +# instruction so we can place a breakpoint at this address.
> +
> +gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
> +    -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" {
> +	unsupported $gdb_test_name
> +	return
> +    }
> +    -re ".*$gdb_prompt $" {
> +	pass $gdb_test_name
> +    }
> +}
In the clang buildbot we're getting this output:

(gdb) catch syscall group:process
warning: Can not parse XML syscalls information; XML support was disabled at compile time.
Unknown syscall group 'process'.
(gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls

This should be a failure, and should likely skip the next few tests.

I don't know

> +
> +gdb_test "continue" \
> +    "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*"
> +
> +# Return true if INSN is a syscall instruction.
> +
> +proc is_syscall_insn { insn } {
> +    if [istarget x86_64-*-* ] {
> +	return { $insn == "syscall" }
> +    } else {
> +	error "port me"
> +    }
> +}
> +
> +# A list of addresses with syscall instructions.
> +set syscall_addrs {}
> +
> +# Get list of addresses with syscall instructions.
> +gdb_test_multiple "disassemble" "" {
> +    -re "Dump of assembler code for function \[^\r\n\]+:\r\n" {
> +	exp_continue
> +    }
> +    -re "^(?:=>)?\\s+(${hex})\\s+<\\+${decimal}>:\\s+(\[^\r\n\]+)\r\n" {
> +	set addr $expect_out(1,string)
> +	set insn [string trim $expect_out(2,string)]
> +	if [is_syscall_insn $insn] {
> +	    verbose -log "Found a syscall at: $addr"
> +	    lappend syscall_addrs $addr
> +	}
> +	exp_continue
> +    }
> +    -re "^End of assembler dump\\.\r\n$gdb_prompt $" {
> +	if { [llength $syscall_addrs] == 0 } {
> +	    unsupported "no syscalls found"
> +	    return -1
> +	}
> +    }
> +}
> +
> +# The test proc.  NON_STOP and DISPLACED are either 'on' or 'off', and are
> +# used to configure how GDB starts up.  THIRD_THREAD is either true or false,
> +# and is used to configure the inferior.
> +proc test {non_stop displaced third_thread} {
> +    global binfile srcfile
> +    global syscall_addrs
> +    global GDBFLAGS
> +    global gdb_prompt hex decimal
> +
> +    for { set i 0 } { $i < 3 } { incr i } {
> +	with_test_prefix "i=$i" {
> +
> +	    # Arrange to start GDB in the correct mode.
> +	    save_vars { GDBFLAGS } {
> +		append GDBFLAGS " -ex \"set non-stop $non_stop\""
> +		append GDBFLAGS " -ex \"set displaced $displaced\""
> +		clean_restart $binfile
> +	    }
> +
> +	    runto_main
> +
> +	    # Setup breakpoints at all the syscall instructions we
> +	    # might hit.  Only issue one pass/fail to make tests more
> +	    # comparable between systems.
> +	    set test "break at syscall insns"
> +	    foreach addr $syscall_addrs {
> +		if {[gdb_test -nopass "break *$addr" \
> +			 ".*" \
> +			 $test] != 0} {
> +		    return
> +		}
> +	    }

I may be misinterpreting this part of the test and log, but it seems
that we set no breakpoints at all when testing in the buildbot, which
seems to be the reason for all the failures. This is what the log says
in this part:

(gdb) run
Starting program: /home/buildbot/buildbot/binutils-gdb-clang-fedrawhide-x86_64/build/gdb/testsuite/outputs/gdb.threads/stepi-over-clone/stepi-over-clone

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, main () at /home/buildbot/buildbot/binutils-gdb-clang-fedrawhide-x86_64/build/gdb/testsuite/gdb.threads/stepi-over-clone.c:80
80      if (getenv ("MAKE_EXTRA_THREAD") != NULL)
(gdb) PASS: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: break at syscall insns
continue
Continuing.
[New Thread 0x7ffff7ff86c0 (LWP 1302753)]
Hello from the first thread.
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: continue (timeout)
stepi
FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: stepi (timeout)
XXX: Have completed scanning the 'stepi' output
p 1 + 2 + 3


> +	    # If we got here, all breakpoints were set successfully.
> +	    # We used -nopass above, so issue a pass now.
> +	    pass $test
> +
> +	    # Continue until we hit the syscall.
> +	    gdb_test "continue"
> +
> +	    if { $third_thread } {
> +		gdb_test_no_output "set start_third_thread=1"
> +	    }
> +
> +	    set stepi_error_count 0
> +	    set stepi_new_thread_count 0
> +	    set thread_1_stopped false
> +	    set thread_2_stopped false
> +	    set seen_prompt false
> +	    set hello_first_thread false
> +
> +	    # The program is now stopped at main, but if testing
> +	    # against GDBserver, inferior_spawn_id is GDBserver's
> +	    # spawn_id, and the GDBserver output emitted before the
> +	    # program stopped isn't flushed unless we explicitly do
> +	    # so, because it is on a different spawn_id.  We could try
> +	    # flushing it now, to avoid confusing the following tests,
> +	    # but that would have to be done under a timeout, and
> +	    # would thus slow down the testcase.  Instead, if inferior
> +	    # output goes to a different spawn id, then we don't need
> +	    # to wait for the first message from the inferior with an
> +	    # anchor, as we know consuming inferior output won't
> +	    # consume GDB output.  OTOH, if inferior output is coming
> +	    # out on GDB's terminal, then we must use an anchor,
> +	    # otherwise matching inferior output without one could
> +	    # consume GDB output that we are waiting for in regular
> +	    # expressions that are written after the inferior output
> +	    # regular expression match.
> +	    if {$::inferior_spawn_id != $::gdb_spawn_id} {
> +		set anchor ""
> +	    } else {
> +		set anchor "^"
> +	    }
> +
> +	    gdb_test_multiple "stepi" "" {
> +		-re "^stepi\r\n" {
> +		    verbose -log "XXX: Consume the initial command"
> +		    exp_continue
> +		}
> +		-re "^\\\[New Thread\[^\r\n\]+\\\]\r\n" {
> +		    verbose -log "XXX: Consume new thread line"
> +		    incr stepi_new_thread_count
> +		    exp_continue
> +		}
> +		-re "^\\\[Switching to Thread\[^\r\n\]+\\\]\r\n" {
> +		    verbose -log "XXX: Consume switching to thread line"
> +		    exp_continue
> +		}
> +		-re "^\\s*\r\n" {
> +		    verbose -log "XXX: Consume blank line"
> +		    exp_continue
> +		}
> +
> +		-i $::inferior_spawn_id
> +
> +		-re "${anchor}Hello from the first thread\\.\r\n" {
> +		    set hello_first_thread true
> +
> +		    verbose -log "XXX: Consume first worker thread message"
> +		    if { $third_thread } {
> +			# If we are going to start a third thread then GDB
> +			# should hit the breakpoint in clone before printing
> +			# this message.
> +			incr stepi_error_count
> +		    }
> +		    if { !$seen_prompt } {
> +			exp_continue
> +		    }
> +		}
> +		-re "^Hello from the third thread\\.\r\n" {
> +		    # We should never see this message.
> +		    verbose -log "XXX: Consume third worker thread message"
> +		    incr stepi_error_count
> +		    if { !$seen_prompt } {
> +			exp_continue
> +		    }
> +		}
> +
> +		-i $::gdb_spawn_id
> +
> +		-re "^$hex in clone\[23\]? \\(\\)\r\n" {
> +		    verbose -log "XXX: Consume stop location line"
> +		    set thread_1_stopped true
> +		    if { !$seen_prompt } {
> +			verbose -log "XXX: Continuing to look for the prompt"
> +			exp_continue
> +		    }
> +		}
> +		-re "^$gdb_prompt " {
> +		    verbose -log "XXX: Consume the final prompt"
> +		    gdb_assert { $stepi_error_count == 0 }
> +		    gdb_assert { $stepi_new_thread_count == 1 }
> +		    set seen_prompt true
> +		    if { $third_thread } {
> +			if { $non_stop } {
> +			    # In non-stop mode if we are trying to start a
> +			    # third thread (from the second thread), then the
> +			    # second thread should hit the breakpoint in clone
> +			    # before actually starting the third thread.  And
> +			    # so, at this point both thread 1, and thread 2
> +			    # should now be stopped.
> +			    if { !$thread_1_stopped || !$thread_2_stopped } {
> +				verbose -log "XXX: Continue looking for an additional stop event"
> +				exp_continue
> +			    }
> +			} else {
> +			    # All stop mode.  Something should have stoppped
> +			    # by now otherwise we shouldn't have a prompt, but
> +			    # we can't know which thread will have stopped as
> +			    # that is a race condition.
> +			    gdb_assert { $thread_1_stopped || $thread_2_stopped }
> +			}
> +		    }
> +
> +		    if {$non_stop && !$hello_first_thread} {
> +			exp_continue
> +		    }
> +
> +		}
> +		-re "^Thread 2\[^\r\n\]+ hit Breakpoint $decimal, $hex in clone\[23\]? \\(\\)\r\n" {
> +		    verbose -log "XXX: Consume thread 2 hit breakpoint"
> +		    set thread_2_stopped true
> +		    if { !$seen_prompt } {
> +			verbose -log "XXX: Continuing to look for the prompt"
> +			exp_continue
> +		    }
> +		}
> +		-re "^PC register is not available\r\n" {
> +		    # This is the error we'd see for remote targets.
> +		    verbose -log "XXX: Consume error line"
> +		    incr stepi_error_count
> +		    exp_continue
> +		}
> +		-re "^Couldn't get registers: No such process\\.\r\n" {
> +		    # This is the error we see'd for native linux
> +		    # targets.
> +		    verbose -log "XXX: Consume error line"
> +		    incr stepi_error_count
> +		    exp_continue
> +		}
> +	    }
> +
> +	    # Ensure we are back at a GDB prompt, resynchronise.
> +	    verbose -log "XXX: Have completed scanning the 'stepi' output"
> +	    gdb_test "p 1 + 2 + 3" " = 6"
> +
> +	    # Check the number of threads we have, it should be exactly two.
> +	    set thread_count 0
> +	    set bad_threads 0
> +
> +	    # Build up our expectations for what the current thread state
> +	    # should be.  Thread 1 is the easiest, this is the thread we are
> +	    # stepping, so this thread should always be stopped, and should
> +	    # always still be in clone.
> +	    set match_code {}
> +	    lappend match_code {
> +		-re "\\*?\\s+1\\s+Thread\[^\r\n\]+clone\[23\]? \\(\\)\r\n" {
> +		    incr thread_count
> +		    exp_continue
> +		}
> +	    }
> +
> +	    # What state should thread 2 be in?
> +	    if { $non_stop == "on" } {
> +		if { $third_thread } {
> +		    # With non-stop mode on, and creation of a third thread
> +		    # having been requested, we expect Thread 2 to exist, and
> +		    # be stopped at the breakpoint in clone (just before the
> +		    # third thread is actually created).
> +		    lappend match_code {
> +			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+$hex in clone\[23\]? \\(\\)\r\n" {
> +			    incr thread_count
> +			    exp_continue
> +			}
> +			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
> +			    incr thread_count
> +			    incr bad_threads
> +			    exp_continue
> +			}
> +			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
> +			    verbose -log "XXX: thread 2 is bad, unknown state"
> +			    incr thread_count
> +			    incr bad_threads
> +			    exp_continue
> +			}
> +		    }
> +
> +		} else {
> +		    # With non-stop mode on, and no third thread having been
> +		    # requested, then we expect Thread 2 to exist, and still
> +		    # be running.
> +		    lappend match_code {
> +			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
> +			    incr thread_count
> +			    exp_continue
> +			}
> +			-re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
> +			    verbose -log "XXX: thread 2 is bad, unknown state"
> +			    incr thread_count
> +			    incr bad_threads
> +			    exp_continue
> +			}
> +		    }
> +		}
> +	    } else {
> +		# With non-stop mode off then we expect Thread 2 to exist, and
> +		# be stopped.  We don't have any guarantee about where the
> +		# thread will have stopped though, so we need to be vague.
> +		lappend match_code {
> +		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\\(running\\)\r\n" {
> +			verbose -log "XXX: thread 2 is bad, unexpectedly running"
> +			incr thread_count
> +			incr bad_threads
> +			exp_continue
> +		    }
> +		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+_start\[^\r\n\]+\r\n" {
> +			# We know that the thread shouldn't be stopped
> +			# at _start, though.  This is the location of
> +			# the scratch pad on Linux at the time of
> +			# writting.
> +			verbose -log "XXX: thread 2 is bad, stuck in scratchpad"
> +			incr thread_count
> +			incr bad_threads
> +			exp_continue
> +		    }
> +		    -re "\\*?\\s+2\\s+Thread\[^\r\n\]+\r\n" {
> +			incr thread_count
> +			exp_continue
> +		    }
> +		}
> +	    }
> +
> +	    # We don't expect to ever see a thread 3.  Even when we are
> +	    # requesting that this third thread be created, thread 2, the
> +	    # thread that creates thread 3, should stop before executing the
> +	    # clone syscall.  So, if we do ever see this then something has
> +	    # gone wrong.
> +	    lappend match_code {
> +		-re "\\s+3\\s+Thread\[^\r\n\]+\r\n" {
> +		    incr thread_count
> +		    incr bad_threads
> +		    exp_continue
> +		}
> +	    }
> +
> +	    lappend match_code {
> +		-re "$gdb_prompt $" {
> +		    gdb_assert { $thread_count == 2 }
> +		    gdb_assert { $bad_threads == 0 }
> +		}
> +	    }
> +
> +	    set match_code [join $match_code]
> +	    gdb_test_multiple "info threads" "" $match_code
> +	}
> +    }
> +}
> +
> +# Run the test in all suitable configurations.
> +foreach_with_prefix third_thread { false true } {
> +    foreach_with_prefix non-stop { "on" "off" } {
> +	foreach_with_prefix displaced { "off" "on" } {
> +	    test ${non-stop} ${displaced} ${third_thread}
> +	}
> +    }
> +}


-- 
Cheers,
Guinevere Larsen
She/Her/Hers


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-14 12:55   ` Guinevere Larsen
@ 2023-11-14 13:26     ` Pedro Alves
  2023-11-14 16:29       ` Guinevere Larsen
  2023-11-14 13:28     ` Pedro Alves
  1 sibling, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2023-11-14 13:26 UTC (permalink / raw)
  To: Guinevere Larsen, gdb-patches; +Cc: Andrew Burgess

Hi!

On 2023-11-14 12:55, Guinevere Larsen wrote:
>>
>> +gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
>> +    -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" {
>> +    unsupported $gdb_test_name
>> +    return
>> +    }
>> +    -re ".*$gdb_prompt $" {
>> +    pass $gdb_test_name
>> +    }
>> +}
> In the clang buildbot we're getting this output:
> 
> (gdb) catch syscall group:process
> warning: Can not parse XML syscalls information; XML support was disabled at compile time.
> Unknown syscall group 'process'.
> (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
> 
> This should be a failure, and should likely skip the next few tests.
> 
> I don't know

Meh.  I guess we should make it UNSUPPORTED and bail.  But, really that is just
likely to make us not notice GDB wasn't built with XML support.  There's really no
good reason for that nowadays.  It's not like expat is a complicated dependency.
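
For illustration only, here is a rough sketch of the "make it UNSUPPORTED
and bail" idea, written so that it runs under plain tclsh instead of
inside the testsuite.  classify_catch_output is a hypothetical helper, and
gdb_prompt below is a stand-in assumed to match what the testsuite
defines.  The "Unknown syscall group" text is copied from the buildbot
output quoted above.  In the real test this would presumably be one more
-re branch in the gdb_test_multiple call, listed ahead of the catch-all.

```tcl
# Stand-in for the testsuite's $gdb_prompt (assumed value).
set gdb_prompt {\(gdb\)}

# Output copied from the buildbot log quoted above.
set output "warning: Can not parse XML syscalls information; XML support was disabled at compile time.\r\nUnknown syscall group 'process'.\r\n(gdb) "

# Hypothetical helper: classify OUTPUT the way gdb_test_multiple would
# if a more specific branch were listed before the catch-all one.
proc classify_catch_output {output} {
    global gdb_prompt
    # Existing branch: catch syscall is not supported at all.
    if {[regexp "The feature 'catch syscall' is not supported.*\r\n$gdb_prompt $" $output]} {
        return unsupported
    }
    # Assumed new branch: GDB does not know the syscall group.
    if {[regexp "Unknown syscall group \[^\r\n\]+\r\n$gdb_prompt $" $output]} {
        return unsupported
    }
    # Existing catch-all branch: anything that ends in a prompt passes.
    if {[regexp ".*$gdb_prompt $" $output]} {
        return pass
    }
    return fail
}

puts [classify_catch_output $output]
```

Without the second branch, the buildbot output falls through to the
catch-all, which is why the log shows a PASS.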

That buildbot should be fixed to configure gdb with libexpat available, IMO.

Or could it be that expat is supposedly available but it fails to build with Clang?
That would be very surprising, though.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-14 12:55   ` Guinevere Larsen
  2023-11-14 13:26     ` Pedro Alves
@ 2023-11-14 13:28     ` Pedro Alves
  1 sibling, 0 replies; 49+ messages in thread
From: Pedro Alves @ 2023-11-14 13:28 UTC (permalink / raw)
  To: Guinevere Larsen, gdb-patches; +Cc: Andrew Burgess

Sorry, hit reply too soon previously, before responding to the rest.

On 2023-11-14 12:55, Guinevere Larsen wrote:
> 
> I may be mis-interpreting this part of the test and log, but it seems that we set no breakpoints at all when testing in the buildbot, which seems to be the reason for all the failures. This is what the log says in this part:

If we didn't manage to set a syscall catchpoint, then we weren't able to
figure out the address of the syscall instruction, and then everything
breaks down.  Everything past that first failure is just cascading failures.
If the "catch syscall" didn't work, there's no point in continuing.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 00/25] Step over thread clone and thread exit
  2023-11-14 10:51   ` Pedro Alves
@ 2023-11-14 13:39     ` Tom de Vries
  0 siblings, 0 replies; 49+ messages in thread
From: Tom de Vries @ 2023-11-14 13:39 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches

On 11/14/23 11:51, Pedro Alves wrote:
> Hi Tom,
> 
> On 2023-11-13 19:28, Tom de Vries wrote:
> 
>> I'm seeing new FAILs:
>> ...
>> FAIL: gdb.threads/stepi-over-clone.exp: continue
>> FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: stepi
>> FAIL: gdb.threads/stepi-over-clone.exp: third_thread=false: non-stop=on: displaced=off: i=0: $thread_count == 2
> 
> ...
> 
>> ...
>>
>> First in more detail:
>> ...
>> (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
>> continue^M
>> Continuing.^M
>> ^M
>> Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M
>> warning: 78     ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory^M
>> (gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue
>> ...
>>
> 
> Thanks.  I think the patch below would fix this one.  

It does, thanks.
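
For reference, a small standalone check of the two regexps against the
line from my log.  This is only a sketch for plain tclsh: decimal below is
a stand-in assumed to match the testsuite's definition, and the old/new
variable names are just for this example.

```tcl
# Stand-in for the testsuite's $decimal (assumed value).
set decimal {[0-9]+}

# The line GDB printed in the failing run.
set line "Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78"

# Regexp before the fix: requires a "2" or "3" after "clone".
set old "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*"
# Regexp after the fix: the "?" makes the digit optional.
set new "Catchpoint $decimal \\(call to syscall clone\[23\]?\\), .*"

puts "old: [regexp $old $line]"
puts "new: [regexp $new $line]"
```

The old pattern only accepts "clone2" or "clone3", so it rejects the plain
"clone" line, while the new one accepts all three spellings.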

> The others are hopefully something similar,
> but I wasn't able to spot anything wrong by inspection.  I'd have to see the relevant part of the
> gdb.log to hazard a better guess.
> 
> 

I've managed to fix those as well.  Posted here (
https://sourceware.org/pipermail/gdb-patches/2023-November/204118.html ).

[ I've submitted it as a regular patch rather than attaching it here to
make sure git-pw will pick it up. ]

Thanks,
- Tom

> --- 8< ---
> From: Pedro Alves <pedro@palves.net>
> Subject: [PATCH] Fix gdb.threads/stepi-over-clone.exp regexp
> 
> Tom de Vries reported this FAIL:
> 
>   (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
>   continue^M
>   Continuing.^M
>   ^M
>   Catchpoint 2 (call to syscall clone), clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:78^M
>   warning: 78     ../sysdeps/unix/sysv/linux/x86_64/clone.S: No such file or directory^M
>   (gdb) FAIL: gdb.threads/stepi-over-clone.exp: continue
> 
> All but one regexps in the .exp file use "clone\[23\]?" with "?" to
> also accept "clone", except the failing case.  This commit fixes that
> case to also use "?".
> 
> Change-Id: I74ca9e7d4cfe6af294fd50e8c509fcbad289b78c
> ---
>   gdb/testsuite/gdb.threads/stepi-over-clone.exp | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> index 4c496429632..18cfec19ffa 100644
> --- a/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> +++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> @@ -45,7 +45,7 @@ gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
>   }
>   
>   gdb_test "continue" \
> -    "Catchpoint $decimal \\(call to syscall clone\[23\]\\), .*"
> +    "Catchpoint $decimal \\(call to syscall clone\[23\]?\\), .*"
>   
>   # Return true if INSN is a syscall instruction.
>   
> 
> base-commit: 319b460545dc79280e2904dcc280057cf71fb753


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-14 13:26     ` Pedro Alves
@ 2023-11-14 16:29       ` Guinevere Larsen
  2023-11-14 16:44         ` Luis Machado
  0 siblings, 1 reply; 49+ messages in thread
From: Guinevere Larsen @ 2023-11-14 16:29 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Andrew Burgess

On 14/11/2023 14:26, Pedro Alves wrote:
> Hi!
>
> On 2023-11-14 12:55, Guinevere Larsen wrote:
>>> +gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
>>> +    -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" {
>>> +    unsupported $gdb_test_name
>>> +    return
>>> +    }
>>> +    -re ".*$gdb_prompt $" {
>>> +    pass $gdb_test_name
>>> +    }
>>> +}
>> In the clang buildbot we're getting this output:
>>
>> (gdb) catch syscall group:process
>> warning: Can not parse XML syscalls information; XML support was disabled at compile time.
>> Unknown syscall group 'process'.
>> (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
>>
>> This should be a failure, and should likely skip the next few tests.
>>
>> I don't know
> Meh.  I guess we should make it UNSUPPORTED and bail.  But, really that is just
> likely to make us not notice GDB wasn't built with XML support.  There's really no
> good reason for that nowadays.  It's not like expat is a complicated dependency.

That's fair. I still think that test should read FAIL if we get an XML
parse failure, but maybe this isn't the place to leave early.

I gave it some more thought, and I think the best place would be after
the test builds the list of syscalls: if that list is empty there is no
point in further testing, since no breakpoints will ever be set. I think
a better point to bail would be right before the main loop, since
it's unlikely we get an empty list at that point without a previous
failure. As a bonus, we don't get tons of timeout-based errors, which
speeds up testing.

>
> That buildbot should be fixed to configure gdb with libexpat available, IMO.

You're right, and I'll try to do that. Do we need some specific 
configure option? Or is it just that the container (probably) doesn't 
have expat and thus configure is automatically skipping it?

I think the test could be improved either way.

>
> Or could it be that expat is supposedly available but it fails to build with Clang?
> That would be very surprising, though.
>
More of an FYI than anything: the "clang" buildbot still builds GDB with 
gcc, it just uses clang for the testsuite, so it's not about expat 
problems with clang.

-- 
Cheers,
Guinevere Larsen
She/Her/Hers


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
  2023-11-14 16:29       ` Guinevere Larsen
@ 2023-11-14 16:44         ` Luis Machado
  0 siblings, 0 replies; 49+ messages in thread
From: Luis Machado @ 2023-11-14 16:44 UTC (permalink / raw)
  To: Guinevere Larsen, Pedro Alves, gdb-patches; +Cc: Andrew Burgess

On 11/14/23 16:29, Guinevere Larsen wrote:
> On 14/11/2023 14:26, Pedro Alves wrote:
>> Hi!
>>
>> On 2023-11-14 12:55, Guinevere Larsen wrote:
>>>> +gdb_test_multiple "catch syscall group:process" "catch process syscalls" {
>>>> +    -re "The feature \'catch syscall\' is not supported.*\r\n$gdb_prompt $" {
>>>> +    unsupported $gdb_test_name
>>>> +    return
>>>> +    }
>>>> +    -re ".*$gdb_prompt $" {
>>>> +    pass $gdb_test_name
>>>> +    }
>>>> +}
>>> In the clang buildbot we're getting this output:
>>>
>>> (gdb) catch syscall group:process
>>> warning: Can not parse XML syscalls information; XML support was disabled at compile time.
>>> Unknown syscall group 'process'.
>>> (gdb) PASS: gdb.threads/stepi-over-clone.exp: catch process syscalls
>>>
>>> This should be a failure, and should likely skip the next few tests.
>>>
>>> I don't know
>> Meh.  I guess we should make it UNSUPPORTED and bail.  But, really that is just
>> likely to make us not notice GDB wasn't built with XML support.  There's really no
>> good reason for that nowadays.  It's not like expat is a complicated dependency.
> 
> That's fair. I still think that test should read FAIL if we get an XML parse fail, but maybe this isn't the place to leave early.
> 
> I gave it some more thought, and I think the best place would be after the test builds the list of syscalls: if that list is empty there is no point in further testing, since no breakpoints will ever be set. I think a better point to bail would be right before the main loop, since it's unlikely we get an empty list at that point without a previous failure. As a bonus, we don't get tons of timeout-based errors, which speeds up testing.
> 
>>
>> That buildbot should be fixed to configure gdb with libexpat available, IMO.

Before that, should we agree on making libexpat a required dependency for gdb? Some targets (aarch64-linux for one) won't work properly without XML support.

Unfortunately I don't think that is clear at the moment. Those of us developing gdb know what the message means, but less familiar users may not.

> 
> You're right, and I'll try to do that. Do we need some specific configure option? Or is it just that the container (probably) doesn't have expat and thus configure is automatically skipping it?
> 
> I think the test could be improved either way.
> 
>>
>> Or could it be that expat is supposedly available but it fails to build with Clang?
>> That would be very surprising, though.
>>
> More of an FYI than anything: the "clang" buildbot still builds GDB with gcc, it just uses clang for the testsuite, so it's not about expat problems with clang.
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2023-11-13 15:04 ` [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver) Pedro Alves
@ 2024-02-06 11:04   ` Luis Machado
  2024-02-06 14:57     ` Tom Tromey
  0 siblings, 1 reply; 49+ messages in thread
From: Luis Machado @ 2024-02-06 11:04 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches, Tom Tromey; +Cc: Andrew Burgess

Hi,

On 11/13/23 15:04, Pedro Alves wrote:
> This patch teaches the Linux GDBserver backend to report clone events
> to GDB, when GDB has requested them with the GDB_THREAD_OPTION_CLONE
> thread option, via the new QThreadOptions packet.
> 
> This shuffles code in linux_process_target::handle_extended_wait
> around to a more logical order when we now have to handle and
> potentially report all of fork/vfork/clone.
> 
> Rename lwp_info::fork_relative -> lwp_info::relative as the field is
> no longer only about (v)fork.
> 
> With this, gdb.threads/stepi-over-clone.exp now cleanly passes against
> GDBserver, so remove the native-target-only requirement from that
> testcase.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
> Reviewed-By: Andrew Burgess <aburgess@redhat.com>
> Change-Id: I3a19bc98801ec31e5c6fdbe1ebe17df855142bb2
> ---
>  .../gdb.threads/stepi-over-clone.exp          |   6 -
>  gdbserver/linux-low.cc                        | 253 ++++++++++--------
>  gdbserver/linux-low.h                         |  47 ++--
>  3 files changed, 160 insertions(+), 146 deletions(-)
> 
> diff --git a/gdb/testsuite/gdb.threads/stepi-over-clone.exp b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> index e580f2248ac..4c496429632 100644
> --- a/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> +++ b/gdb/testsuite/gdb.threads/stepi-over-clone.exp
> @@ -19,12 +19,6 @@
>  # disassembly output.  For now this is only implemented for x86-64.
>  require {istarget x86_64-*-*}
>  
> -# Test only on native targets, for now.
> -proc is_native_target {} {
> -    return [expr {[target_info gdb_protocol] == ""}]
> -}
> -require is_native_target
> -
>  standard_testfile
>  
>  if { [prepare_for_testing "failed to prepare" $testfile $srcfile \
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index 40b6a907ad9..136a8b6c9a1 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -491,7 +491,6 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>    struct lwp_info *event_lwp = *orig_event_lwp;
>    int event = linux_ptrace_get_extended_event (wstat);
>    struct thread_info *event_thr = get_lwp_thread (event_lwp);
> -  struct lwp_info *new_lwp;
>  
>    gdb_assert (event_lwp->waitstatus.kind () == TARGET_WAITKIND_IGNORE);
>  
> @@ -503,7 +502,6 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>    if ((event == PTRACE_EVENT_FORK) || (event == PTRACE_EVENT_VFORK)
>        || (event == PTRACE_EVENT_CLONE))
>      {
> -      ptid_t ptid;
>        unsigned long new_pid;
>        int ret, status;
>  
> @@ -527,61 +525,65 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>  	    warning ("wait returned unexpected status 0x%x", status);
>  	}
>  
> -      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
> +      if (debug_threads)
>  	{
> -	  struct process_info *parent_proc;
> -	  struct process_info *child_proc;
> -	  struct lwp_info *child_lwp;
> -	  struct thread_info *child_thr;
> +	  debug_printf ("HEW: Got %s event from LWP %ld, new child is %ld\n",
> +			(event == PTRACE_EVENT_FORK ? "fork"
> +			 : event == PTRACE_EVENT_VFORK ? "vfork"
> +			 : event == PTRACE_EVENT_CLONE ? "clone"
> +			 : "???"),
> +			ptid_of (event_thr).lwp (),
> +			new_pid);
> +	}
> +
> +      ptid_t child_ptid = (event != PTRACE_EVENT_CLONE
> +			   ? ptid_t (new_pid, new_pid)
> +			   : ptid_t (ptid_of (event_thr).pid (), new_pid));
>  
> -	  ptid = ptid_t (new_pid, new_pid);
> +      lwp_info *child_lwp = add_lwp (child_ptid);
> +      gdb_assert (child_lwp != NULL);
> +      child_lwp->stopped = 1;
> +      if (event != PTRACE_EVENT_CLONE)
> +	child_lwp->must_set_ptrace_flags = 1;
> +      child_lwp->status_pending_p = 0;
>  
> -	  threads_debug_printf ("Got fork event from LWP %ld, "
> -				"new child is %d",
> -				ptid_of (event_thr).lwp (),
> -				ptid.pid ());
> +      thread_info *child_thr = get_lwp_thread (child_lwp);
>  
> +      /* If we're suspending all threads, leave this one suspended
> +	 too.  If the fork/clone parent is stepping over a breakpoint,
> +	 all other threads have been suspended already.  Leave the
> +	 child suspended too.  */
> +      if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
> +	  || event_lwp->bp_reinsert != 0)
> +	{
> +	  threads_debug_printf ("leaving child suspended");
> +	  child_lwp->suspended = 1;
> +	}
> +
> +      if (event_lwp->bp_reinsert != 0
> +	  && supports_software_single_step ()
> +	  && event == PTRACE_EVENT_VFORK)
> +	{
> +	  /* If we leave single-step breakpoints there, child will
> +	     hit it, so uninsert single-step breakpoints from parent
> +	     (and child).  Once vfork child is done, reinsert
> +	     them back to parent.  */
> +	  uninsert_single_step_breakpoints (event_thr);
> +	}
> +
> +      if (event != PTRACE_EVENT_CLONE)
> +	{
>  	  /* Add the new process to the tables and clone the breakpoint
>  	     lists of the parent.  We need to do this even if the new process
>  	     will be detached, since we will need the process object and the
>  	     breakpoints to remove any breakpoints from memory when we
>  	     detach, and the client side will access registers.  */
> -	  child_proc = add_linux_process (new_pid, 0);
> +	  process_info *child_proc = add_linux_process (new_pid, 0);
>  	  gdb_assert (child_proc != NULL);
> -	  child_lwp = add_lwp (ptid);
> -	  gdb_assert (child_lwp != NULL);
> -	  child_lwp->stopped = 1;
> -	  child_lwp->must_set_ptrace_flags = 1;
> -	  child_lwp->status_pending_p = 0;
> -	  child_thr = get_lwp_thread (child_lwp);
> -	  child_thr->last_resume_kind = resume_stop;
> -	  child_thr->last_status.set_stopped (GDB_SIGNAL_0);
> -
> -	  /* If we're suspending all threads, leave this one suspended
> -	     too.  If the fork/clone parent is stepping over a breakpoint,
> -	     all other threads have been suspended already.  Leave the
> -	     child suspended too.  */
> -	  if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
> -	      || event_lwp->bp_reinsert != 0)
> -	    {
> -	      threads_debug_printf ("leaving child suspended");
> -	      child_lwp->suspended = 1;
> -	    }
>  
> -	  parent_proc = get_thread_process (event_thr);
> +	  process_info *parent_proc = get_thread_process (event_thr);
>  	  child_proc->attached = parent_proc->attached;
>  
> -	  if (event_lwp->bp_reinsert != 0
> -	      && supports_software_single_step ()
> -	      && event == PTRACE_EVENT_VFORK)
> -	    {
> -	      /* If we leave single-step breakpoints there, child will
> -		 hit it, so uninsert single-step breakpoints from parent
> -		 (and child).  Once vfork child is done, reinsert
> -		 them back to parent.  */
> -	      uninsert_single_step_breakpoints (event_thr);
> -	    }
> -
>  	  clone_all_breakpoints (child_thr, event_thr);
>  
>  	  target_desc_up tdesc = allocate_target_description ();
> @@ -590,88 +592,97 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>  
>  	  /* Clone arch-specific process data.  */
>  	  low_new_fork (parent_proc, child_proc);
> +	}
>  
> -	  /* Save fork info in the parent thread.  */
> -	  if (event == PTRACE_EVENT_FORK)
> -	    event_lwp->waitstatus.set_forked (ptid);
> -	  else if (event == PTRACE_EVENT_VFORK)
> -	    event_lwp->waitstatus.set_vforked (ptid);
> -
> +      /* Save fork/clone info in the parent thread.  */
> +      if (event == PTRACE_EVENT_FORK)
> +	event_lwp->waitstatus.set_forked (child_ptid);
> +      else if (event == PTRACE_EVENT_VFORK)
> +	event_lwp->waitstatus.set_vforked (child_ptid);
> +      else if (event == PTRACE_EVENT_CLONE
> +	       && (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) != 0)
> +	event_lwp->waitstatus.set_thread_cloned (child_ptid);
> +
> +      if (event != PTRACE_EVENT_CLONE
> +	  || (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) != 0)
> +	{
>  	  /* The status_pending field contains bits denoting the
> -	     extended event, so when the pending event is handled,
> -	     the handler will look at lwp->waitstatus.  */
> +	     extended event, so when the pending event is handled, the
> +	     handler will look at lwp->waitstatus.  */
>  	  event_lwp->status_pending_p = 1;
>  	  event_lwp->status_pending = wstat;
>  
> -	  /* Link the threads until the parent event is passed on to
> -	     higher layers.  */
> -	  event_lwp->fork_relative = child_lwp;
> -	  child_lwp->fork_relative = event_lwp;
> -
> -	  /* If the parent thread is doing step-over with single-step
> -	     breakpoints, the list of single-step breakpoints are cloned
> -	     from the parent's.  Remove them from the child process.
> -	     In case of vfork, we'll reinsert them back once vforked
> -	     child is done.  */
> -	  if (event_lwp->bp_reinsert != 0
> -	      && supports_software_single_step ())
> -	    {
> -	      /* The child process is forked and stopped, so it is safe
> -		 to access its memory without stopping all other threads
> -		 from other processes.  */
> -	      delete_single_step_breakpoints (child_thr);
> -
> -	      gdb_assert (has_single_step_breakpoints (event_thr));
> -	      gdb_assert (!has_single_step_breakpoints (child_thr));
> -	    }
> -
> -	  /* Report the event.  */
> -	  return 0;
> +	  /* Link the threads until the parent's event is passed on to
> +	     GDB.  */
> +	  event_lwp->relative = child_lwp;
> +	  child_lwp->relative = event_lwp;
>  	}
>  
> -      threads_debug_printf
> -	("Got clone event from LWP %ld, new child is LWP %ld",
> -	 lwpid_of (event_thr), new_pid);
> -
> -      ptid = ptid_t (pid_of (event_thr), new_pid);
> -      new_lwp = add_lwp (ptid);
> -
> -      /* Either we're going to immediately resume the new thread
> -	 or leave it stopped.  resume_one_lwp is a nop if it
> -	 thinks the thread is currently running, so set this first
> -	 before calling resume_one_lwp.  */
> -      new_lwp->stopped = 1;
> +      /* If the parent thread is doing step-over with single-step
> +	 breakpoints, the list of single-step breakpoints are cloned
> +	 from the parent's.  Remove them from the child process.
> +	 In case of vfork, we'll reinsert them back once vforked
> +	 child is done.  */
> +      if (event_lwp->bp_reinsert != 0
> +	  && supports_software_single_step ())
> +	{
> +	  /* The child process is forked and stopped, so it is safe
> +	     to access its memory without stopping all other threads
> +	     from other processes.  */
> +	  delete_single_step_breakpoints (child_thr);
>  
> -      /* If we're suspending all threads, leave this one suspended
> -	 too.  If the fork/clone parent is stepping over a breakpoint,
> -	 all other threads have been suspended already.  Leave the
> -	 child suspended too.  */
> -      if (stopping_threads == STOPPING_AND_SUSPENDING_THREADS
> -	  || event_lwp->bp_reinsert != 0)
> -	new_lwp->suspended = 1;
> +	  gdb_assert (has_single_step_breakpoints (event_thr));
> +	  gdb_assert (!has_single_step_breakpoints (child_thr));
> +	}
>  
>        /* Normally we will get the pending SIGSTOP.  But in some cases
>  	 we might get another signal delivered to the group first.
>  	 If we do get another signal, be sure not to lose it.  */
>        if (WSTOPSIG (status) != SIGSTOP)
>  	{
> -	  new_lwp->stop_expected = 1;
> -	  new_lwp->status_pending_p = 1;
> -	  new_lwp->status_pending = status;
> +	  child_lwp->stop_expected = 1;
> +	  child_lwp->status_pending_p = 1;
> +	  child_lwp->status_pending = status;
>  	}
> -      else if (cs.report_thread_events)
> +      else if (event == PTRACE_EVENT_CLONE && cs.report_thread_events)
>  	{
> -	  new_lwp->waitstatus.set_thread_created ();
> -	  new_lwp->status_pending_p = 1;
> -	  new_lwp->status_pending = status;
> +	  child_lwp->waitstatus.set_thread_created ();
> +	  child_lwp->status_pending_p = 1;
> +	  child_lwp->status_pending = status;
>  	}
>  
> +      if (event == PTRACE_EVENT_CLONE)
> +	{
>  #ifdef USE_THREAD_DB
> -      thread_db_notice_clone (event_thr, ptid);
> +	  thread_db_notice_clone (event_thr, child_ptid);
>  #endif
> +	}
>  
> -      /* Don't report the event.  */
> -      return 1;
> +      if (event == PTRACE_EVENT_CLONE
> +	  && (event_thr->thread_options & GDB_THREAD_OPTION_CLONE) == 0)
> +	{
> +	  threads_debug_printf
> +	    ("not reporting clone event from LWP %ld, new child is %ld\n",
> +	     ptid_of (event_thr).lwp (),
> +	     new_pid);
> +	  return 1;
> +	}
> +
> +      /* Leave the child stopped until GDB processes the parent
> +	 event.  */
> +      child_thr->last_resume_kind = resume_stop;
> +      child_thr->last_status.set_stopped (GDB_SIGNAL_0);
> +
> +      /* Report the event.  */
> +      threads_debug_printf
> +	("reporting %s event from LWP %ld, new child is %ld\n",
> +	 (event == PTRACE_EVENT_FORK ? "fork"
> +	  : event == PTRACE_EVENT_VFORK ? "vfork"
> +	  : event == PTRACE_EVENT_CLONE ? "clone"
> +	  : "???"),
> +	 ptid_of (event_thr).lwp (),
> +	 new_pid);
> +      return 0;
>      }
>    else if (event == PTRACE_EVENT_VFORK_DONE)
>      {
> @@ -3531,15 +3542,14 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
>  
>    if (event_child->waitstatus.kind () != TARGET_WAITKIND_IGNORE)
>      {
> -      /* If the reported event is an exit, fork, vfork or exec, let
> -	 GDB know.  */
> +      /* If the reported event is an exit, fork, vfork, clone or exec,
> +	 let GDB know.  */
>  
> -      /* Break the unreported fork relationship chain.  */
> -      if (event_child->waitstatus.kind () == TARGET_WAITKIND_FORKED
> -	  || event_child->waitstatus.kind () == TARGET_WAITKIND_VFORKED)
> +      /* Break the unreported fork/vfork/clone relationship chain.  */
> +      if (is_new_child_status (event_child->waitstatus.kind ()))
>  	{
> -	  event_child->fork_relative->fork_relative = NULL;
> -	  event_child->fork_relative = NULL;
> +	  event_child->relative->relative = NULL;
> +	  event_child->relative = NULL;
>  	}
>  
>        *ourstatus = event_child->waitstatus;
> @@ -4272,15 +4282,14 @@ linux_set_resume_request (thread_info *thread, thread_resume *resume, size_t n)
>  	      continue;
>  	    }
>  
> -	  /* Don't let wildcard resumes resume fork children that GDB
> -	     does not yet know are new fork children.  */
> -	  if (lwp->fork_relative != NULL)
> +	  /* Don't let wildcard resumes resume fork/vfork/clone
> +	     children that GDB does not yet know are new children.  */
> +	  if (lwp->relative != NULL)
>  	    {
> -	      struct lwp_info *rel = lwp->fork_relative;
> +	      struct lwp_info *rel = lwp->relative;
>  
>  	      if (rel->status_pending_p
> -		  && (rel->waitstatus.kind () == TARGET_WAITKIND_FORKED
> -		      || rel->waitstatus.kind () == TARGET_WAITKIND_VFORKED))
> +		  && is_new_child_status (rel->waitstatus.kind ()))
>  		{
>  		  threads_debug_printf
>  		    ("not resuming LWP %ld: has queued stop reply",
> @@ -5907,6 +5916,14 @@ linux_process_target::supports_vfork_events ()
>    return true;
>  }
>  
> +/* Return the set of supported thread options.  */
> +
> +gdb_thread_options
> +linux_process_target::supported_thread_options ()
> +{
> +  return GDB_THREAD_OPTION_CLONE;
> +}
> +
>  /* Check if exec events are supported.  */
>  
>  bool
> diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
> index f7cedf6706b..94093dd4ed8 100644
> --- a/gdbserver/linux-low.h
> +++ b/gdbserver/linux-low.h
> @@ -234,6 +234,8 @@ class linux_process_target : public process_stratum_target
>  
>    bool supports_vfork_events () override;
>  
> +  gdb_thread_options supported_thread_options () override;
> +
>    bool supports_exec_events () override;
>  
>    void handle_new_gdb_connection () override;
> @@ -732,48 +734,47 @@ struct pending_signal
>  
>  struct lwp_info
>  {
> -  /* If this LWP is a fork child that wasn't reported to GDB yet, return
> -     its parent, else nullptr.  */
> +  /* If this LWP is a fork/vfork/clone child that wasn't reported to
> +     GDB yet, return its parent, else nullptr.  */
>    lwp_info *pending_parent () const
>    {
> -    if (this->fork_relative == nullptr)
> +    if (this->relative == nullptr)
>        return nullptr;
>  
> -    gdb_assert (this->fork_relative->fork_relative == this);
> +    gdb_assert (this->relative->relative == this);
>  
> -    /* In a fork parent/child relationship, the parent has a status pending and
> +    /* In a parent/child relationship, the parent has a status pending and
>         the child does not, and a thread can only be in one such relationship
>         at most.  So we can recognize who is the parent based on which one has
>         a pending status.  */
>      gdb_assert (!!this->status_pending_p
> -		!= !!this->fork_relative->status_pending_p);
> +		!= !!this->relative->status_pending_p);
>  
> -    if (!this->fork_relative->status_pending_p)
> +    if (!this->relative->status_pending_p)
>        return nullptr;
>  
>      const target_waitstatus &ws
> -      = this->fork_relative->waitstatus;
> +      = this->relative->waitstatus;
>      gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
>  		|| ws.kind () == TARGET_WAITKIND_VFORKED);
>  
> -    return this->fork_relative;
> -  }
> +    return this->relative; }
>  
> -  /* If this LWP is the parent of a fork child we haven't reported to GDB yet,
> -     return that child, else nullptr.  */
> +  /* If this LWP is the parent of a fork/vfork/clone child we haven't
> +     reported to GDB yet, return that child, else nullptr.  */
>    lwp_info *pending_child () const
>    {
> -    if (this->fork_relative == nullptr)
> +    if (this->relative == nullptr)
>        return nullptr;
>  
> -    gdb_assert (this->fork_relative->fork_relative == this);
> +    gdb_assert (this->relative->relative == this);
>  
> -    /* In a fork parent/child relationship, the parent has a status pending and
> +    /* In a parent/child relationship, the parent has a status pending and
>         the child does not, and a thread can only be in one such relationship
>         at most.  So we can recognize who is the parent based on which one has
>         a pending status.  */
>      gdb_assert (!!this->status_pending_p
> -		!= !!this->fork_relative->status_pending_p);
> +		!= !!this->relative->status_pending_p);
>  
>      if (!this->status_pending_p)
>        return nullptr;
> @@ -782,7 +783,7 @@ struct lwp_info
>      gdb_assert (ws.kind () == TARGET_WAITKIND_FORKED
>  		|| ws.kind () == TARGET_WAITKIND_VFORKED);
>  
> -    return this->fork_relative;
> +    return this->relative;
>    }
>  
>    /* Backlink to the parent object.  */
> @@ -820,11 +821,13 @@ struct lwp_info
>       information or exit status until it can be reported to GDB.  */
>    struct target_waitstatus waitstatus;
>  
> -  /* A pointer to the fork child/parent relative.  Valid only while
> -     the parent fork event is not reported to higher layers.  Used to
> -     avoid wildcard vCont actions resuming a fork child before GDB is
> -     notified about the parent's fork event.  */
> -  struct lwp_info *fork_relative = nullptr;
> +  /* A pointer to the fork/vfork/clone child/parent relative (like
> +     people, LWPs have relatives).  Valid only while the parent
> +     fork/vfork/clone event is not reported to higher layers.  Used to
> +     avoid wildcard vCont actions resuming a fork/vfork/clone child
> +     before GDB is notified about the parent's fork/vfork/clone
> +     event.  */
> +  struct lwp_info *relative = nullptr;
>  
>    /* When stopped is set, this is where the lwp last stopped, with
>       decr_pc_after_break already accounted for.  If the LWP is

Tromey had pointed out, on IRC, that gdbserver was crashing when stepping over a fork on aarch64. I went to investigate it and noticed the testsuite
run for --target_board=native-gdbserver was really bad in terms of FAILs (over 700). This is Ubuntu 20.04.

I bisected the FAILs for at least one testcase (gdb.threads/next-fork-other-thread.exp) to this particular commit, but the series is large, so the cause could
be something else in the series.

I haven't fully investigated the crashes yet, but thought I'd mention it for the record and to see if any bells rang.
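
For context while bisecting, the decision that the quoted handle_extended_wait hunk makes can be modeled in isolation. This is only a simplified sketch, not gdbserver code: Event, Ptid, child_ptid_for and should_report are hypothetical names, and kThreadOptionClone merely mirrors the role of GDB_THREAD_OPTION_CLONE (its value here is an assumption).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the ptrace event kinds handled in the hunk.
enum class Event { Fork, Vfork, Clone };

// Hypothetical stand-in for ptid_t: (process id, lwp id).
struct Ptid
{
  long pid;
  long lwp;
};

// Mirrors the role of GDB_THREAD_OPTION_CLONE; the value is an assumption.
constexpr std::uint32_t kThreadOptionClone = 1u << 0;

// A fork/vfork child starts a new process, so pid == lwp == new_pid.
// A clone child is a new thread inside the parent's process.
Ptid
child_ptid_for (Event event, Ptid parent, long new_pid)
{
  assert (new_pid > 0);
  if (event != Event::Clone)
    return Ptid {new_pid, new_pid};
  return Ptid {parent.pid, new_pid};
}

// Fork and vfork events are always reported; a clone event is reported
// only when the parent thread has the clone option enabled.
bool
should_report (Event event, std::uint32_t parent_thread_options)
{
  return (event != Event::Clone
          || (parent_thread_options & kThreadOptionClone) != 0);
}
```

In this model a clone child shares the parent's pid, so any per-process lookup for it has to go through the parent's process entry rather than a new one.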

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-06 11:04   ` Luis Machado
@ 2024-02-06 14:57     ` Tom Tromey
  2024-02-06 15:12       ` Luis Machado
  2024-02-07  8:59       ` Luis Machado
  0 siblings, 2 replies; 49+ messages in thread
From: Tom Tromey @ 2024-02-06 14:57 UTC (permalink / raw)
  To: Luis Machado; +Cc: Pedro Alves, gdb-patches, Tom Tromey, Andrew Burgess

>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:

Luis> Tromey had pointed out, on IRC, that gdbserver was crashing when
Luis> stepping over a fork on aarch64. I went to investigate it and
Luis> noticed the testsuite run for --target_board=native-gdbserver was
Luis> really bad in terms of FAILs (over 700). This is Ubuntu 20.04.

I didn't try this, but I do have a fix for the fork bug.
I'll send it in a few days, I hope.

Tom

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-06 14:57     ` Tom Tromey
@ 2024-02-06 15:12       ` Luis Machado
  2024-02-07  8:59       ` Luis Machado
  1 sibling, 0 replies; 49+ messages in thread
From: Luis Machado @ 2024-02-06 15:12 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Pedro Alves, gdb-patches, Andrew Burgess

On 2/6/24 14:57, Tom Tromey wrote:
>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
> 
> Luis> Tromey had pointed out, on IRC, that gdbserver was crashing when
> Luis> stepping over a fork on aarch64. I went to investigate it and
> Luis> noticed the testsuite run for --target_board=native-gdbserver was
> Luis> really bad in terms of FAILs (over 700). This is Ubuntu 20.04.
> 
> I didn't try this, but I do have a fix for the fork bug.
> I'll send it in a few days, I hope.
> 
> Tom

I'll keep an eye out for it, so I can do some testing on our end.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-06 14:57     ` Tom Tromey
  2024-02-06 15:12       ` Luis Machado
@ 2024-02-07  8:59       ` Luis Machado
  2024-02-07 15:43         ` Tom Tromey
  1 sibling, 1 reply; 49+ messages in thread
From: Luis Machado @ 2024-02-07  8:59 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Pedro Alves, gdb-patches, Andrew Burgess

On 2/6/24 14:57, Tom Tromey wrote:
>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
> 
> Luis> Tromey had pointed out, on IRC, that gdbserver was crashing when
> Luis> stepping over a fork on aarch64. I went to investigate it and
> Luis> noticed the testsuite run for --target_board=native-gdbserver was
> Luis> really bad in terms of FAILs (over 700). This is Ubuntu 20.04.
> 
> I didn't try this, but I do have a fix for the fork bug.
> I'll send it in a few days, I hope.
> 
> Tom

I was checking this today. Turns out we're looking up the process by its PID
in this function, at line 405:

402     struct aarch64_debug_reg_state *
403     aarch64_get_debug_reg_state (pid_t pid)
404     {
405       struct process_info *proc = find_process_pid (pid);
406
407       return &proc->priv->arch_private->debug_reg_state;
408     }

But find_process_pid returns nullptr. I wonder if it is one of those cases
where we have to deal with the tid rather than the pid.

Does this look like the same case you were chasing?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07  8:59       ` Luis Machado
@ 2024-02-07 15:43         ` Tom Tromey
  2024-02-07 17:10           ` Simon Marchi
  0 siblings, 1 reply; 49+ messages in thread
From: Tom Tromey @ 2024-02-07 15:43 UTC (permalink / raw)
  To: Luis Machado; +Cc: Tom Tromey, Pedro Alves, gdb-patches, Andrew Burgess

>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:

Luis> But find_process_pid returns nullptr. I wonder if it is one of those cases
Luis> where we have to deal with the tid rather than the pid.

Luis> Does this look like the same case you were chasing?

Yes.  The issue is that the new inferior isn't created until after the
new thread -- but the order can't really be reversed in the caller.
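
To make the ordering problem concrete, here is a minimal model of it. It assumes nothing about gdbserver's real data structures: ProcessTable, DebugRegState, get_debug_reg_state (with this signature) and needs_full_resync are hypothetical names used only for illustration. The lookup can run before the process entry exists, so it returns nullptr, and the caller conservatively treats that as "mark everything changed".

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the per-process debug register state.
struct DebugRegState
{
  bool any_set = false;
};

// Hypothetical process table keyed by pid.
using ProcessTable = std::map<int, DebugRegState>;

// Return the state for PID, or nullptr if the process is not (yet) known.
DebugRegState *
get_debug_reg_state (ProcessTable &table, int pid)
{
  auto it = table.find (pid);
  if (it == table.end ())
    return nullptr;
  return &it->second;
}

// Decide whether a new thread's debug registers must all be rewritten.
// With no process state available, assume that they must.
bool
needs_full_resync (const DebugRegState *state)
{
  return state == nullptr || state->any_set;
}
```

The conservative default costs at most a redundant register write for the new thread, whereas dereferencing the missing entry is what crashes.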

I've appended the patch.  I put off sending it because for internal
reasons it hasn't been through the AdaCore automated testing yet.
However, I did test it (using the AdaCore test suite -- not gdb's)
myself.

Let me know what you think.

Tom

commit 5464152cb1145bc1df108eb6904a642d8bc73b8c
Author: Tom Tromey <tromey@adacore.com>
Date:   Mon Feb 5 13:18:51 2024 -0700

    Fix crash in aarch64-linux gdbserver
    
    We noticed that aarch64-linux gdbserver will crash when the inferior
    vforks.  This happens in aarch64_get_debug_reg_state:
    
      struct process_info *proc = find_process_pid (pid);
    
      return &proc->priv->arch_private->debug_reg_state;
    
    Here, find_process_pid returns nullptr -- the new inferior hasn't yet
    been created in linux_process_target::handle_extended_wait.
    
    This patch fixes the problem by having aarch64_get_debug_reg_state
    return nullptr in this case, and then updating
    aarch64_linux_new_thread to check for this.

diff --git a/gdb/nat/aarch64-linux.c b/gdb/nat/aarch64-linux.c
index 5ebbc9b81f8..894de8aa3eb 100644
--- a/gdb/nat/aarch64-linux.c
+++ b/gdb/nat/aarch64-linux.c
@@ -81,9 +81,9 @@ aarch64_linux_new_thread (struct lwp_info *lwp)
   /* If there are hardware breakpoints/watchpoints in the process then mark that
      all the hardware breakpoint/watchpoint register pairs for this thread need
      to be initialized (with data from aarch_process_info.debug_reg_state).  */
-  if (aarch64_any_set_debug_regs_state (state, false))
+  if (state == nullptr || aarch64_any_set_debug_regs_state (state, false))
     DR_MARK_ALL_CHANGED (info->dr_changed_bp, aarch64_num_bp_regs);
-  if (aarch64_any_set_debug_regs_state (state, true))
+  if (state == nullptr || aarch64_any_set_debug_regs_state (state, true))
     DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);
 
   lwp_set_arch_private_info (lwp, info);
diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
index 28d75d035dc..2a4f01a54da 100644
--- a/gdbserver/linux-aarch64-low.cc
+++ b/gdbserver/linux-aarch64-low.cc
@@ -403,7 +403,8 @@ struct aarch64_debug_reg_state *
 aarch64_get_debug_reg_state (pid_t pid)
 {
   struct process_info *proc = find_process_pid (pid);
-
+  if (proc == nullptr)
+    return nullptr;
   return &proc->priv->arch_private->debug_reg_state;
 }
 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 15:43         ` Tom Tromey
@ 2024-02-07 17:10           ` Simon Marchi
  2024-02-07 18:05             ` Luis Machado
  2024-02-07 18:06             ` Tom Tromey
  0 siblings, 2 replies; 49+ messages in thread
From: Simon Marchi @ 2024-02-07 17:10 UTC (permalink / raw)
  To: Tom Tromey, Luis Machado; +Cc: Pedro Alves, gdb-patches, Andrew Burgess

On 2/7/24 10:43, Tom Tromey wrote:
>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
> 
> Luis> But find_process_pid returns nullptr. I wonder if it is one of those cases
> Luis> where we have to deal with the tid rather than the pid.
> 
> Luis> Does this look like the same case you were chasing?
> 
> Yes.  The issue is that the new inferior isn't created until after the
> new thread -- but the order can't really be reversed in the caller.
> 
> I've appended the patch.  I put off sending it because for internal
> reasons it hasn't been through the AdaCore automated testing yet.
> However, I did test it (using the AdaCore test suite -- not gdb's)
> myself.
> 
> Let me know what you think.
> 
> Tom
> 
> commit 5464152cb1145bc1df108eb6904a642d8bc73b8c
> Author: Tom Tromey <tromey@adacore.com>
> Date:   Mon Feb 5 13:18:51 2024 -0700
> 
>     Fix crash in aarch64-linux gdbserver
>     
>     We noticed that aarch64-linux gdbserver will crash when the inferior
>     vforks.  This happens in aarch64_get_debug_reg_state:
>     
>       struct process_info *proc = find_process_pid (pid);
>     
>       return &proc->priv->arch_private->debug_reg_state;
>     
>     Here, find_process_pid returns nullptr -- the new inferior hasn't yet
>     been created in linux_process_target::handle_extended_wait.
>     
>     This patch fixes the problem by having aarch64_get_debug_reg_state
>     return nullptr in this case, and then updating
>     aarch64_linux_new_thread to check for this.
> 
> diff --git a/gdb/nat/aarch64-linux.c b/gdb/nat/aarch64-linux.c
> index 5ebbc9b81f8..894de8aa3eb 100644
> --- a/gdb/nat/aarch64-linux.c
> +++ b/gdb/nat/aarch64-linux.c
> @@ -81,9 +81,9 @@ aarch64_linux_new_thread (struct lwp_info *lwp)
>    /* If there are hardware breakpoints/watchpoints in the process then mark that
>       all the hardware breakpoint/watchpoint register pairs for this thread need
>       to be initialized (with data from aarch_process_info.debug_reg_state).  */
> -  if (aarch64_any_set_debug_regs_state (state, false))
> +  if (state == nullptr || aarch64_any_set_debug_regs_state (state, false))
>      DR_MARK_ALL_CHANGED (info->dr_changed_bp, aarch64_num_bp_regs);
> -  if (aarch64_any_set_debug_regs_state (state, true))
> +  if (state == nullptr || aarch64_any_set_debug_regs_state (state, true))
>      DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);

I don't really understand all of this, but I'm wondering if the
condition should be:

  if (state != nullptr && aarch64_any_set_debug_regs_state (state, ...))

If we have no existing aarch64_debug_reg_state, do we really need to
mark the breakpoints as needing to be updated?
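
To make the difference between the two conditions concrete, here is a
minimal standalone sketch.  The struct and helper names are made up for
illustration and are not the real GDB types; any_set merely stands in
for aarch64_any_set_debug_regs_state.

```cpp
#include <cassert>

// Hypothetical stand-in for aarch64_debug_reg_state: it only records
// whether any debug register is currently in use.
struct debug_reg_state_sketch
{
  bool any_set = false;
};

// Stand-in for aarch64_any_set_debug_regs_state.  Like the real
// function, it must not be called with a null pointer.
static bool
any_set (const debug_reg_state_sketch *state)
{
  return state->any_set;
}

// Condition as written in the patch: a missing state is treated as
// "everything changed", so the registers get marked for update.
bool
mark_changed_or (const debug_reg_state_sketch *state)
{
  return state == nullptr || any_set (state);
}

// Alternative condition: a missing state is treated as "nothing to
// do", so the registers are left alone.
bool
mark_changed_and (const debug_reg_state_sketch *state)
{
  return state != nullptr && any_set (state);
}
```

The two variants only disagree when state is nullptr: the first marks
the registers as changed, the second does not.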

>    lwp_set_arch_private_info (lwp, info);
> diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
> index 28d75d035dc..2a4f01a54da 100644
> --- a/gdbserver/linux-aarch64-low.cc
> +++ b/gdbserver/linux-aarch64-low.cc
> @@ -403,7 +403,8 @@ struct aarch64_debug_reg_state *
>  aarch64_get_debug_reg_state (pid_t pid)
>  {
>    struct process_info *proc = find_process_pid (pid);
> -
> +  if (proc == nullptr)
> +    return nullptr;
>    return &proc->priv->arch_private->debug_reg_state;
>  }

I was wondering if the GDB version of this function needed to get
updated too.  It works differently:

    /* See aarch64-nat.h.  */

    struct aarch64_debug_reg_state *
    aarch64_get_debug_reg_state (pid_t pid)
    {
      return &aarch64_debug_process_state[pid];
    }

Here, aarch64_debug_process_state is an unordered_map<pid_t,
aarch64_debug_reg_state>, meaning that if pid isn't currently in the
map, a default aarch64_debug_reg_state will be constructed (is it going
to be initialized properly?).

So we end up with two different semantics for the two versions of the
function, which might become a source of confusion later.
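
A small sketch of the two lookup semantics, for comparison.  The names
are hypothetical, int stands in for pid_t, and the struct is a
placeholder rather than the real aarch64_debug_reg_state:

```cpp
#include <cassert>
#include <unordered_map>

// Placeholder for the per-process debug register state.
struct reg_state_sketch
{
  int bp_count = 0;
};

static std::unordered_map<int, reg_state_sketch> state_by_pid;

// GDB-style lookup: operator[] inserts a value-initialized entry when
// the key is missing, so this never returns nullptr.
reg_state_sketch *
get_state_inserting (int pid)
{
  return &state_by_pid[pid];
}

// gdbserver-style lookup with the patch above: an unknown pid yields
// nullptr and the table is left untouched.
reg_state_sketch *
get_state_or_null (int pid)
{
  auto it = state_by_pid.find (pid);
  return it == state_by_pid.end () ? nullptr : &it->second;
}
```

With the first variant a caller can never observe "no state", while
with the second every caller has to handle nullptr.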

Simon


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 17:10           ` Simon Marchi
@ 2024-02-07 18:05             ` Luis Machado
  2024-02-07 18:18               ` Tom Tromey
  2024-02-07 18:06             ` Tom Tromey
  1 sibling, 1 reply; 49+ messages in thread
From: Luis Machado @ 2024-02-07 18:05 UTC (permalink / raw)
  To: Simon Marchi, Tom Tromey; +Cc: Pedro Alves, gdb-patches, Andrew Burgess

Replying to both Tom's and Simon's comments.

On 2/7/24 17:10, Simon Marchi wrote:
> On 2/7/24 10:43, Tom Tromey wrote:
>>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
>>
>> Luis> But find_process_pid returns nullptr. I wonder if it is one of those cases
>> Luis> where we have to deal with the tid rather than the pid.
>>
>> Luis> Does this look like the same case you were chasing?
>>
>> Yes.  The issue is that the new inferior isn't created until after the
>> new thread -- but the order can't really be reversed in the caller.
>>

I see. Is this logic expected? Naturally I'd expect a process to exist before a thread can exist.

I haven't followed the patch series closely though, so there may be a reason for it.

>> I've appended the patch.  I put off sending it because for internal
>> reasons it hasn't been through the AdaCore automated testing yet.
>> However, I did test it (using the AdaCore test suite -- not gdb's)
>> myself.
>>
>> Let me know what you think.

It does fix the regressions I was seeing, but Simon made some good points as well.

>>
>> Tom
>>
>> commit 5464152cb1145bc1df108eb6904a642d8bc73b8c
>> Author: Tom Tromey <tromey@adacore.com>
>> Date:   Mon Feb 5 13:18:51 2024 -0700
>>
>>     Fix crash in aarch64-linux gdbserver
>>     
>>     We noticed that aarch64-linux gdbserver will crash when the inferior
>>     vforks.  This happens in aarch64_get_debug_reg_state:
>>     
>>       struct process_info *proc = find_process_pid (pid);
>>     
>>       return &proc->priv->arch_private->debug_reg_state;
>>     
>>     Here, find_process_pid returns nullptr -- the new inferior hasn't yet
>>     been created in linux_process_target::handle_extended_wait.
>>     
>>     This patch fixes the problem by having aarch64_get_debug_reg_state
>>     return nullptr in this case, and then updating
>>     aarch64_linux_new_thread to check for this.
>>
>> diff --git a/gdb/nat/aarch64-linux.c b/gdb/nat/aarch64-linux.c
>> index 5ebbc9b81f8..894de8aa3eb 100644
>> --- a/gdb/nat/aarch64-linux.c
>> +++ b/gdb/nat/aarch64-linux.c
>> @@ -81,9 +81,9 @@ aarch64_linux_new_thread (struct lwp_info *lwp)
>>    /* If there are hardware breakpoints/watchpoints in the process then mark that
>>       all the hardware breakpoint/watchpoint register pairs for this thread need
>>       to be initialized (with data from aarch_process_info.debug_reg_state).  */
>> -  if (aarch64_any_set_debug_regs_state (state, false))
>> +  if (state == nullptr || aarch64_any_set_debug_regs_state (state, false))
>>      DR_MARK_ALL_CHANGED (info->dr_changed_bp, aarch64_num_bp_regs);
>> -  if (aarch64_any_set_debug_regs_state (state, true))
>> +  if (state == nullptr || aarch64_any_set_debug_regs_state (state, true))
>>      DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);
> 
> I don't really understand all of this, but I'm wondering if the
> condition should be:
> 
>   if (state != nullptr && aarch64_any_set_debug_regs_state (state, ...))
> 
> If we have no existing aarch64_debug_reg_state, do we really need to
> mark the breakpoints as needing to be updated?
> 

I think as long as we have a thread, we should always have the state for the debug registers,
so changing the approach to always initialize the state if there isn't one seems reasonable.

See below.

>>    lwp_set_arch_private_info (lwp, info);
>> diff --git a/gdbserver/linux-aarch64-low.cc b/gdbserver/linux-aarch64-low.cc
>> index 28d75d035dc..2a4f01a54da 100644
>> --- a/gdbserver/linux-aarch64-low.cc
>> +++ b/gdbserver/linux-aarch64-low.cc
>> @@ -403,7 +403,8 @@ struct aarch64_debug_reg_state *
>>  aarch64_get_debug_reg_state (pid_t pid)
>>  {
>>    struct process_info *proc = find_process_pid (pid);
>> -
>> +  if (proc == nullptr)
>> +    return nullptr;
>>    return &proc->priv->arch_private->debug_reg_state;
>>  }
> 
> I was wondering if the GDB version of this function needed to get
> updated too.  It works differently:
> 
>     /* See aarch64-nat.h.  */
> 
>     struct aarch64_debug_reg_state *
>     aarch64_get_debug_reg_state (pid_t pid)
>     {
>       return &aarch64_debug_process_state[pid];
>     }
> 
> Here, aarch64_debug_process_state is an unordered_map<pid_t,
> aarch64_debug_reg_state>, meaning that if pid isn't currently in the
> map, a default aarch64_debug_reg_state will be constructed (is it going
> to be initialized properly?).
> 
> So we end up with two different semantics for the two versions of the
> function, which might become a source of confusion later.

It would also sync the behavior of the gdb and gdbserver nat layers.

I can put together a patch to do that. I wasn't aware there was this discrepancy
between gdb and gdbserver.

> 
> Simon



* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 17:10           ` Simon Marchi
  2024-02-07 18:05             ` Luis Machado
@ 2024-02-07 18:06             ` Tom Tromey
  1 sibling, 0 replies; 49+ messages in thread
From: Tom Tromey @ 2024-02-07 18:06 UTC (permalink / raw)
  To: Simon Marchi
  Cc: Tom Tromey, Luis Machado, Pedro Alves, gdb-patches, Andrew Burgess

>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

>> -  if (aarch64_any_set_debug_regs_state (state, true))
>> +  if (state == nullptr || aarch64_any_set_debug_regs_state (state, true))
>>      DR_MARK_ALL_CHANGED (info->dr_changed_wp, aarch64_num_wp_regs);

Simon> I don't really understand all of this, but I'm wondering if the
Simon> condition should be:

Simon>   if (state != nullptr && aarch64_any_set_debug_regs_state (state, ...))

Simon> If we have no existing aarch64_debug_reg_state, do we really need to
Simon> mark the breakpoints as needing to be updated?

I wasn't sure but I followed what I understood x86 to do, see
nat/x86-linux.c:lwp_set_debug_registers_changed.

Simon> Here, aarch64_debug_process_state is an unordered_map<pid_t,
Simon> aarch64_debug_reg_state>, meaning that if pid isn't currently in the
Simon> map, a default aarch64_debug_reg_state will be constructed (is it going
Simon> to be initialized properly?).

Simon> So we end up with two different semantics for the two versions of the
Simon> function, which might become a source of confusion later.

Yeah, I don't know the answer here.  I personally don't find it super
confusing, or at least not any more than the way that gdb and gdbserver
randomly do things differently already.

Tom


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 18:05             ` Luis Machado
@ 2024-02-07 18:18               ` Tom Tromey
  2024-02-07 18:56                 ` Pedro Alves
  0 siblings, 1 reply; 49+ messages in thread
From: Tom Tromey @ 2024-02-07 18:18 UTC (permalink / raw)
  To: Luis Machado
  Cc: Simon Marchi, Tom Tromey, Pedro Alves, gdb-patches, Andrew Burgess

>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:

Luis> I see. Is this logic expected? Naturally I'd expect a process to
Luis> exist before a thread can exist.

Me too but you can see it in
linux-low.cc:linux_process_target::handle_extended_wait.

      lwp_info *child_lwp = add_lwp (child_ptid);
[...]
      if (event != PTRACE_EVENT_CLONE)
	{
	  /* Add the new process to the tables and clone the breakpoint
	     lists of the parent.  We need to do this even if the new process
	     will be detached, since we will need the process object and the
	     breakpoints to remove any breakpoints from memory when we
	     detach, and the client side will access registers.  */
	  process_info *child_proc = add_linux_process (new_pid, 0);
[...]

Tom


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 18:18               ` Tom Tromey
@ 2024-02-07 18:56                 ` Pedro Alves
  2024-02-07 20:11                   ` Pedro Alves
  0 siblings, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2024-02-07 18:56 UTC (permalink / raw)
  To: Tom Tromey, Luis Machado; +Cc: Simon Marchi, gdb-patches, Andrew Burgess

Hi!

On 2024-02-07 18:18, Tom Tromey wrote:
>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
> 
> Luis> I see. Is this logic expected? Naturally I'd expect a process to
> Luis> exist before a thread can exist.
> 
> Me too but you can see it in
> linux-low.cc:linux_process_target::handle_extended_wait.
> 
>       lwp_info *child_lwp = add_lwp (child_ptid);
> [...]
>       if (event != PTRACE_EVENT_CLONE)
> 	{
> 	  /* Add the new process to the tables and clone the breakpoint
> 	     lists of the parent.  We need to do this even if the new process
> 	     will be detached, since we will need the process object and the
> 	     breakpoints to remove any breakpoints from memory when we
> 	     detach, and the client side will access registers.  */
> 	  process_info *child_proc = add_linux_process (new_pid, 0);
> [...]
> 

I don't recall off hand a reason that prevents us from tweaking this code a little to
create the child process before the child lwp is created.  I think that was how it was
done before my changes, and I just reordered code to make it end up with fewer lines.
I think we can create the child process earlier.

I'll send a patch in a sec, once I test it.

Pedro Alves


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 18:56                 ` Pedro Alves
@ 2024-02-07 20:11                   ` Pedro Alves
  2024-02-08  8:57                     ` Luis Machado
  2024-02-08 14:58                     ` Tom Tromey
  0 siblings, 2 replies; 49+ messages in thread
From: Pedro Alves @ 2024-02-07 20:11 UTC (permalink / raw)
  To: Tom Tromey, Luis Machado; +Cc: Simon Marchi, gdb-patches, Andrew Burgess

On 2024-02-07 18:56, Pedro Alves wrote:
> Hi!
> 
> On 2024-02-07 18:18, Tom Tromey wrote:
>>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
>>
>> Luis> I see. Is this logic expected? Naturally I'd expect a process to
>> Luis> exist before a thread can exist.
>>
>> Me too but you can see it in
>> linux-low.cc:linux_process_target::handle_extended_wait.
>>
>>       lwp_info *child_lwp = add_lwp (child_ptid);
>> [...]
>>       if (event != PTRACE_EVENT_CLONE)
>> 	{
>> 	  /* Add the new process to the tables and clone the breakpoint
>> 	     lists of the parent.  We need to do this even if the new process
>> 	     will be detached, since we will need the process object and the
>> 	     breakpoints to remove any breakpoints from memory when we
>> 	     detach, and the client side will access registers.  */
>> 	  process_info *child_proc = add_linux_process (new_pid, 0);
>> [...]
>>
> 
> I don't recall off hand a reason that prevents us from tweaking this code a little to
> create the child process before the child lwp is created.  I think that was how it was
> done before my changes, and I just reordered code to make it end up with fewer lines.
> I think we can create the child process earlier.
> 
> I'll send a patch in a sec, once I test it.

Like so?  Does it fix the crash?

From 0c308ac13c4537c885491305cee7215fbfdf04c0 Mon Sep 17 00:00:00 2001
From: Pedro Alves <pedro@palves.net>
Date: Wed, 7 Feb 2024 18:48:16 +0000
Subject: [PATCH] Fix crash in aarch64-linux gdbserver

Since commit 393a6b5947d0 ("Thread options & clone events (Linux
GDBserver)"), aarch64-linux gdbserver crashes when the inferior
vforks.  This happens in aarch64_get_debug_reg_state:

  struct process_info *proc = find_process_pid (pid);

  return &proc->priv->arch_private->debug_reg_state;

Here, find_process_pid returns nullptr -- the new inferior hasn't yet
been created in linux_process_target::handle_extended_wait.

This patch fixes the problem by having
linux_process_target::handle_extended_wait create the child process
earlier, before the child LWP is created.  This is what the function
did before it was reorganized by the commit referred to above.

Change-Id: Ib8b3a2e6048c3ad2b91a92ea4430da507db03c50
Co-Authored-By: Tom Tromey <tromey@adacore.com>
---
 gdbserver/linux-low.cc | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 444eebc6bbe..9d5a6242803 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -555,6 +555,16 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
 			   ? ptid_t (new_pid, new_pid)
 			   : ptid_t (ptid_of (event_thr).pid (), new_pid));
 
+      process_info *child_proc = nullptr;
+
+      if (event != PTRACE_EVENT_CLONE)
+	{
+	  /* Add the new process to the tables before we add the LWP.
+	     We need to do this even if the new process will be
+	     detached.  See breakpoint cloning code further below.  */
+	  child_proc = add_linux_process (new_pid, 0);
+	}
+
       lwp_info *child_lwp = add_lwp (child_ptid);
       gdb_assert (child_lwp != NULL);
       child_lwp->stopped = 1;
@@ -588,12 +598,11 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
 
       if (event != PTRACE_EVENT_CLONE)
 	{
-	  /* Add the new process to the tables and clone the breakpoint
-	     lists of the parent.  We need to do this even if the new process
-	     will be detached, since we will need the process object and the
-	     breakpoints to remove any breakpoints from memory when we
-	     detach, and the client side will access registers.  */
-	  process_info *child_proc = add_linux_process (new_pid, 0);
+	  /* Clone the breakpoint lists of the parent.  We need to do
+	     this even if the new process will be detached, since we
+	     will need the process object and the breakpoints to
+	     remove any breakpoints from memory when we detach, and
+	     the client side will access registers.  */
 	  gdb_assert (child_proc != NULL);
 
 	  process_info *parent_proc = get_thread_process (event_thr);

base-commit: 6fb99666f4bbc79708acb8efb2d80e57de67b80b
-- 
2.43.0



* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 20:11                   ` Pedro Alves
@ 2024-02-08  8:57                     ` Luis Machado
  2024-02-08 10:53                       ` Pedro Alves
  2024-02-08 14:58                     ` Tom Tromey
  1 sibling, 1 reply; 49+ messages in thread
From: Luis Machado @ 2024-02-08  8:57 UTC (permalink / raw)
  To: Pedro Alves, Tom Tromey; +Cc: Simon Marchi, gdb-patches, Andrew Burgess

Hi!

On 2/7/24 20:11, Pedro Alves wrote:
> On 2024-02-07 18:56, Pedro Alves wrote:
>> Hi!
>>
>> On 2024-02-07 18:18, Tom Tromey wrote:
>>>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
>>>
>>> Luis> I see. Is this logic expected? Naturally I'd expect a process to
>>> Luis> exist before a thread can exist.
>>>
>>> Me too but you can see it in
>>> linux-low.cc:linux_process_target::handle_extended_wait.
>>>
>>>       lwp_info *child_lwp = add_lwp (child_ptid);
>>> [...]
>>>       if (event != PTRACE_EVENT_CLONE)
>>> 	{
>>> 	  /* Add the new process to the tables and clone the breakpoint
>>> 	     lists of the parent.  We need to do this even if the new process
>>> 	     will be detached, since we will need the process object and the
>>> 	     breakpoints to remove any breakpoints from memory when we
>>> 	     detach, and the client side will access registers.  */
>>> 	  process_info *child_proc = add_linux_process (new_pid, 0);
>>> [...]
>>>
>>
>> I don't recall off hand a reason that prevents us from tweaking this code a little to
>> create the child process before the child lwp is created.  I think that was how it was
>> done before my changes, and I just reordered code to make it end up with fewer lines.
>> I think we can create the child process earlier.
>>
>> I'll send a patch in a sec, once I test it.
> 
> Like so?  Does it fix the crash?

It does, thanks for the quick patch.

Maybe before this series we were relying on some other path eventually creating a process first, and
the new code somehow caused an (indirect?) change.

I'm putting this through the gdbserver testsuite on my end. I'll let you know what comes out of it.

> 
> From 0c308ac13c4537c885491305cee7215fbfdf04c0 Mon Sep 17 00:00:00 2001
> From: Pedro Alves <pedro@palves.net>
> Date: Wed, 7 Feb 2024 18:48:16 +0000
> Subject: [PATCH] Fix crash in aarch64-linux gdbserver
> 
> Since commit 393a6b5947d0 ("Thread options & clone events (Linux
> GDBserver)"), aarch64-linux gdbserver crashes when the inferior
> vforks.  This happens in aarch64_get_debug_reg_state:
> 
>   struct process_info *proc = find_process_pid (pid);
> 
>   return &proc->priv->arch_private->debug_reg_state;
> 
> Here, find_process_pid returns nullptr -- the new inferior hasn't yet
> been created in linux_process_target::handle_extended_wait.
> 
> This patch fixes the problem by having
> linux_process_target::handle_extended_wait create the child process
> earlier, before the child LWP is created.  This is what the function
> did before it was reorganized by the commit referred to above.
> 
> Change-Id: Ib8b3a2e6048c3ad2b91a92ea4430da507db03c50
> Co-Authored-By: Tom Tromey <tromey@adacore.com>
> ---
>  gdbserver/linux-low.cc | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index 444eebc6bbe..9d5a6242803 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -555,6 +555,16 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>  			   ? ptid_t (new_pid, new_pid)
>  			   : ptid_t (ptid_of (event_thr).pid (), new_pid));
>  
> +      process_info *child_proc = nullptr;
> +
> +      if (event != PTRACE_EVENT_CLONE)
> +	{
> +	  /* Add the new process to the tables before we add the LWP.
> +	     We need to do this even if the new process will be
> +	     detached.  See breakpoint cloning code further below.  */
> +	  child_proc = add_linux_process (new_pid, 0);
> +	}
> +
>        lwp_info *child_lwp = add_lwp (child_ptid);
>        gdb_assert (child_lwp != NULL);
>        child_lwp->stopped = 1;
> @@ -588,12 +598,11 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>  
>        if (event != PTRACE_EVENT_CLONE)
>  	{
> -	  /* Add the new process to the tables and clone the breakpoint
> -	     lists of the parent.  We need to do this even if the new process
> -	     will be detached, since we will need the process object and the
> -	     breakpoints to remove any breakpoints from memory when we
> -	     detach, and the client side will access registers.  */
> -	  process_info *child_proc = add_linux_process (new_pid, 0);
> +	  /* Clone the breakpoint lists of the parent.  We need to do
> +	     this even if the new process will be detached, since we
> +	     will need the process object and the breakpoints to
> +	     remove any breakpoints from memory when we detach, and
> +	     the client side will access registers.  */
>  	  gdb_assert (child_proc != NULL);
>  
>  	  process_info *parent_proc = get_thread_process (event_thr);
> 
> base-commit: 6fb99666f4bbc79708acb8efb2d80e57de67b80b



* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-08  8:57                     ` Luis Machado
@ 2024-02-08 10:53                       ` Pedro Alves
  2024-02-08 11:47                         ` Luis Machado
  0 siblings, 1 reply; 49+ messages in thread
From: Pedro Alves @ 2024-02-08 10:53 UTC (permalink / raw)
  To: Luis Machado, Tom Tromey; +Cc: Simon Marchi, gdb-patches, Andrew Burgess



On 2024-02-08 08:57, Luis Machado wrote:
> Hi!
> 
> On 2/7/24 20:11, Pedro Alves wrote:
>> On 2024-02-07 18:56, Pedro Alves wrote:
>>> Hi!
>>>
>>> On 2024-02-07 18:18, Tom Tromey wrote:
>>>>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
>>>>
>>>> Luis> I see. Is this logic expected? Naturally I'd expect a process to
>>>> Luis> exist before a thread can exist.
>>>>
>>>> Me too but you can see it in
>>>> linux-low.cc:linux_process_target::handle_extended_wait.
>>>>
>>>>       lwp_info *child_lwp = add_lwp (child_ptid);
>>>> [...]
>>>>       if (event != PTRACE_EVENT_CLONE)
>>>> 	{
>>>> 	  /* Add the new process to the tables and clone the breakpoint
>>>> 	     lists of the parent.  We need to do this even if the new process
>>>> 	     will be detached, since we will need the process object and the
>>>> 	     breakpoints to remove any breakpoints from memory when we
>>>> 	     detach, and the client side will access registers.  */
>>>> 	  process_info *child_proc = add_linux_process (new_pid, 0);
>>>> [...]
>>>>
>>>
>>> I don't recall off hand a reason that prevents us from tweaking this code a little to
>>> create the child process before the child lwp is created.  I think that was how it was
>>> done before my changes, and I just reordered code to make it end up with fewer lines.
>>> I think we can create the child process earlier.
>>>
>>> I'll send a patch in a sec, once I test it.
>>
>> Like so?  Does it fix the crash?
> 
> It does, thanks for the quick patch.
> 
> Maybe before this series we were relying on some other path eventually creating a process first, and
> the new code somehow caused an (indirect?) change.

Right.  It was really a direct change in commit 393a6b5947d0 ("Thread options & clone events (Linux GDBserver)").
Before that change, we had, early in handle_extended_wait:

int
linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
                                            int wstat)
{
...
     if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
        {
...
          /* Add the new process to the tables and clone the breakpoint
             lists of the parent.  We need to do this even if the new process
             will be detached, since we will need the process object and the
             breakpoints to remove any breakpoints from memory when we
             detach, and the client side will access registers.  */
          child_proc = add_linux_process (new_pid, 0);
          gdb_assert (child_proc != NULL);
          child_lwp = add_lwp (ptid);
          gdb_assert (child_lwp != NULL);


So we used to add the process before the LWP.  393a6b5947d0 reordered things, as mentioned in the commit log:

    ...
    This shuffles code in linux_process_target::handle_extended_wait
    around to a more logical order when we now have to handle and
    potentially report all of fork/vfork/clone.
    ...

That shuffling made us create the process _after_ creating the LWP.  I had missed that this could
have a consequence, back then.
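
The ordering dependency can be shown with a small standalone sketch.
It assumes, as the crash suggests, that adding an LWP runs an arch hook
which looks up the LWP's process.  All names ending in _sketch are made
up for illustration, and int stands in for pid_t:

```cpp
#include <cassert>
#include <map>

// Placeholder for process_info.
struct process_sketch
{
  int pid;
};

static std::map<int, process_sketch> process_table;

// Stand-in for find_process_pid: nullptr when the pid is unknown.
process_sketch *
find_process_sketch (int pid)
{
  auto it = process_table.find (pid);
  return it == process_table.end () ? nullptr : &it->second;
}

// Stand-in for add_linux_process: registers the process.
process_sketch *
add_process_sketch (int pid)
{
  return &process_table.emplace (pid, process_sketch {pid}).first->second;
}

// Stand-in for add_lwp plus the arch new-thread hook.  It returns
// whether the process lookup succeeded; the code that crashed
// dereferenced the lookup result unconditionally instead.
bool
add_lwp_sketch (int pid)
{
  return find_process_sketch (pid) != nullptr;
}
```

Calling add_lwp_sketch before add_process_sketch for the same pid makes
the lookup fail, which is the situation the reordering introduced for
fork/vfork children.  Registering the process first avoids it.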

> 
> I'm putting this through the gdbserver testsuite on my end. I'll let you know what comes out of it.
> 

Thanks!

Pedro Alves


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-08 10:53                       ` Pedro Alves
@ 2024-02-08 11:47                         ` Luis Machado
  0 siblings, 0 replies; 49+ messages in thread
From: Luis Machado @ 2024-02-08 11:47 UTC (permalink / raw)
  To: Pedro Alves, Tom Tromey; +Cc: Simon Marchi, gdb-patches, Andrew Burgess

On 2/8/24 10:53, Pedro Alves wrote:
> 
> 
> On 2024-02-08 08:57, Luis Machado wrote:
>> Hi!
>>
>> On 2/7/24 20:11, Pedro Alves wrote:
>>> On 2024-02-07 18:56, Pedro Alves wrote:
>>>> Hi!
>>>>
>>>> On 2024-02-07 18:18, Tom Tromey wrote:
>>>>>>>>>> "Luis" == Luis Machado <luis.machado@arm.com> writes:
>>>>>
>>>>> Luis> I see. Is this logic expected? Naturally I'd expect a process to
>>>>> Luis> exist before a thread can exist.
>>>>>
>>>>> Me too but you can see it in
>>>>> linux-low.cc:linux_process_target::handle_extended_wait.
>>>>>
>>>>>       lwp_info *child_lwp = add_lwp (child_ptid);
>>>>> [...]
>>>>>       if (event != PTRACE_EVENT_CLONE)
>>>>> 	{
>>>>> 	  /* Add the new process to the tables and clone the breakpoint
>>>>> 	     lists of the parent.  We need to do this even if the new process
>>>>> 	     will be detached, since we will need the process object and the
>>>>> 	     breakpoints to remove any breakpoints from memory when we
>>>>> 	     detach, and the client side will access registers.  */
>>>>> 	  process_info *child_proc = add_linux_process (new_pid, 0);
>>>>> [...]
>>>>>
>>>>
>>>> I don't recall off hand a reason that prevents us from tweaking this code a little to
>>>> create the child process before the child lwp is created.  I think that was how it was
>>>> done before my changes, and I just reordered code to make it end up with fewer lines.
>>>> I think we can create the child process earlier.
>>>>
>>>> I'll send a patch in a sec, once I test it.
>>>
>>> Like so?  Does it fix the crash?
>>
>> It does, thanks for the quick patch.
>>
>> Maybe before this series we were relying on some other path eventually creating a process first, and
>> the new code somehow caused an (indirect?) change.
> 
> Right.  It was really a direct change in commit 393a6b5947d0 ("Thread options & clone events (Linux GDBserver)").
> Before that change, we had, early in handle_extended_wait:
> 
> int
> linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>                                             int wstat)
> {
> ...
>      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
>         {
> ...
>           /* Add the new process to the tables and clone the breakpoint
>              lists of the parent.  We need to do this even if the new process
>              will be detached, since we will need the process object and the
>              breakpoints to remove any breakpoints from memory when we
>              detach, and the client side will access registers.  */
>           child_proc = add_linux_process (new_pid, 0);
>           gdb_assert (child_proc != NULL);
>           child_lwp = add_lwp (ptid);
>           gdb_assert (child_lwp != NULL);
> 
> 
> So we used to add the process before the LWP.  393a6b5947d0 reordered things, as mentioned in the commit log:
> 
>     ...
>     This shuffles code in linux_process_target::handle_extended_wait
>     around to a more logical order when we now have to handle and
>     potentially report all of fork/vfork/clone.
>     ...
> 
> That shuffling made us create the process _after_ creating the LWP.  I had missed that this could
> have a consequence, back then.
> 
>>
>> I'm putting this through the gdbserver testsuite on my end. I'll let you know what comes out of it.
>>

I can confirm the testsuite run against gdbserver (native/extended) looks much better with the above patch applied.


* Re: [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver)
  2024-02-07 20:11                   ` Pedro Alves
  2024-02-08  8:57                     ` Luis Machado
@ 2024-02-08 14:58                     ` Tom Tromey
  1 sibling, 0 replies; 49+ messages in thread
From: Tom Tromey @ 2024-02-08 14:58 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Tom Tromey, Luis Machado, Simon Marchi, gdb-patches, Andrew Burgess

>>>>> "Pedro" == Pedro Alves <pedro@palves.net> writes:

Pedro> Like so?  Does it fix the crash?

Thanks for doing this.

Tom


end of thread, other threads:[~2024-02-08 14:58 UTC | newest]

Thread overview: 49+ messages
2023-11-13 15:04 [FYI/pushed v4 00/25] Step over thread clone and thread exit Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 01/25] Add "maint info linux-lwps" command Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 02/25] gdb/linux: Delete all other LWPs immediately on ptrace exec event Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 03/25] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2023-11-14 12:55   ` Guinevere Larsen
2023-11-14 13:26     ` Pedro Alves
2023-11-14 16:29       ` Guinevere Larsen
2023-11-14 16:44         ` Luis Machado
2023-11-14 13:28     ` Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 04/25] Support clone events in the remote protocol Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 05/25] Avoid duplicate QThreadEvents packets Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 06/25] Thread options & clone events (core + remote) Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 07/25] Thread options & clone events (native Linux) Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 08/25] Thread options & clone events (Linux GDBserver) Pedro Alves
2024-02-06 11:04   ` Luis Machado
2024-02-06 14:57     ` Tom Tromey
2024-02-06 15:12       ` Luis Machado
2024-02-07  8:59       ` Luis Machado
2024-02-07 15:43         ` Tom Tromey
2024-02-07 17:10           ` Simon Marchi
2024-02-07 18:05             ` Luis Machado
2024-02-07 18:18               ` Tom Tromey
2024-02-07 18:56                 ` Pedro Alves
2024-02-07 20:11                   ` Pedro Alves
2024-02-08  8:57                     ` Luis Machado
2024-02-08 10:53                       ` Pedro Alves
2024-02-08 11:47                         ` Luis Machado
2024-02-08 14:58                     ` Tom Tromey
2024-02-07 18:06             ` Tom Tromey
2023-11-13 15:04 ` [FYI/pushed v4 09/25] gdbserver: Hide and don't detach pending clone children Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 10/25] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 11/25] all-stop/synchronous RSP support thread-exit events Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 12/25] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 13/25] Move deleting thread on TARGET_WAITKIND_THREAD_EXITED to core Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 14/25] Introduce GDB_THREAD_OPTION_EXIT thread option, fix step-over-thread-exit Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 15/25] Implement GDB_THREAD_OPTION_EXIT support for Linux GDBserver Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 16/25] Implement GDB_THREAD_OPTION_EXIT support for native Linux Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 17/25] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 18/25] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 19/25] gdbserver: Queue no-resumed event after thread exit Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 20/25] Don't resume new threads if scheduler-locking is in effect Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 21/25] Report thread exit event for leader if reporting thread exit events Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 22/25] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 23/25] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 24/25] Document remote clone events, and QThreadOptions packet Pedro Alves
2023-11-13 15:04 ` [FYI/pushed v4 25/25] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
2023-11-13 19:28 ` [FYI/pushed v4 00/25] Step over thread clone and thread exit Tom de Vries
2023-11-14 10:51   ` Pedro Alves
2023-11-14 13:39     ` Tom de Vries
