public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Pedro Alves <pedro@palves.net>
To: gdb-patches@sourceware.org
Subject: [PATCH v2 03/29] gdb/linux: Delete all other LWPs immediately on ptrace exec event
Date: Wed, 13 Jul 2022 23:24:07 +0100	[thread overview]
Message-ID: <20220713222433.374898-4-pedro@palves.net> (raw)
In-Reply-To: <20220713222433.374898-1-pedro@palves.net>

I noticed that after a following patch ("Step over clone syscall w/
breakpoint, TARGET_WAITKIND_THREAD_CLONED"), the
gdb.threads/step-over-exec.exp was passing cleanly, but still, we'd
end up with four new unexpected GDB core dumps:

		 === gdb Summary ===

 # of unexpected core files      4
 # of expected passes            48

That said patch is making the pre-existing
gdb.threads/step-over-exec.exp testcase (almost silently) expose a
latent problem in gdb/linux-nat.c, resulting in a GDB crash when:

 #1 - a non-leader thread execs
 #2 - the post-exec program stops somewhere
 #3 - you kill the inferior

Instead of #3 directly, the testcase just returns, which ends up in
gdb_exit, tearing down GDB, which kills the inferior, and is thus
equivalent to #3 above.

Vis:

 $ gdb --args ./gdb /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true
 ...
 (top-gdb) r
 ...
 (gdb) b main
 ...
 (gdb) r
 ...
 Breakpoint 1, main (argc=1, argv=0x7fffffffdb88) at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec.c:69
 69        argv0 = argv[0];
 (gdb) c
 Continuing.
 [New Thread 0x7ffff7d89700 (LWP 2506975)]
 Other going in exec.
 Exec-ing /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd
 process 2506769 is executing new program: /home/pedro/gdb/build/gdb/testsuite/outputs/gdb.threads/step-over-exec/step-over-exec-execr-thread-other-diff-text-segs-true-execd

 Thread 1 "step-over-exec-" hit Breakpoint 1, main () at /home/pedro/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/step-over-exec-execd.c:28
 28        foo ();
 (gdb) k
 ...
 Thread 1 "gdb" received signal SIGSEGV, Segmentation fault.
 0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
 393         return m_suspend.waitstatus_pending_p;
 (top-gdb) bt
 #0  0x000055555574444c in thread_info::has_pending_waitstatus (this=0x0) at ../../src/gdb/gdbthread.h:393
 #1  0x0000555555a884d1 in get_pending_child_status (lp=0x5555579b8230, ws=0x7fffffffd130) at ../../src/gdb/linux-nat.c:1345
 #2  0x0000555555a8e5e6 in kill_unfollowed_child_callback (lp=0x5555579b8230) at ../../src/gdb/linux-nat.c:3564
 #3  0x0000555555a92a26 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::operator()(gdb::fv_detail::erased_callable, lwp_info*) const (this=0x0, ecall=..., args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:284
 #4  0x0000555555a92a51 in gdb::function_view<int (lwp_info*)>::bind<int, lwp_info*>(int (*)(lwp_info*))::{lambda(gdb::fv_detail::erased_callable, lwp_info*)#1}::_FUN(gdb::fv_detail::erased_callable, lwp_info*) () at ../../src/gdb/../gdbsupport/function-view.h:278
 #5  0x0000555555a91f84 in gdb::function_view<int (lwp_info*)>::operator()(lwp_info*) const (this=0x7fffffffd210, args#0=0x5555579b8230) at ../../src/gdb/../gdbsupport/function-view.h:247
 #6  0x0000555555a87072 in iterate_over_lwps(ptid_t, gdb::function_view<int (lwp_info*)>) (filter=..., callback=...) at ../../src/gdb/linux-nat.c:864
 #7  0x0000555555a8e732 in linux_nat_target::kill (this=0x55555653af40 <the_amd64_linux_nat_target>) at ../../src/gdb/linux-nat.c:3590
 #8  0x0000555555cfdc11 in target_kill () at ../../src/gdb/target.c:911
 ...

The root of the problem is that when a non-leader LWP execs, it just
changes its tid to the tgid, replacing the pre-exec leader thread,
becoming the new leader.  There's no thread exit event for the execing
thread.  It's as if the old pre-exec LWP vanishes without trace.  The
ptrace man page says:

"PTRACE_O_TRACEEXEC (since Linux 2.5.46)
	Stop the tracee at the next execve(2).  A waitpid(2) by the
	tracer will return a status value such that

	  status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))

	If the execing thread is not a thread group leader, the thread
	ID is reset to thread group leader's ID before this stop.
	Since Linux 3.0, the former thread ID can be retrieved with
	PTRACE_GETEVENTMSG."

When the core of GDB processes an exec events, it deletes all the
threads of the inferior.  But, that is too late -- deleting the thread
does not delete the corresponding LWP, so we end leaving the pre-exec
non-leader LWP stale in the LWP list.  That's what leads to the crash
above -- linux_nat_target::kill iterates over all LWPs, and after the
patch in question, that code will look for the corresponding
thread_info for each LWP.  For the pre-exec non-leader LWP still
listed, won't find one.

This patch fixes it, by deleting the pre-exec non-leader LWP (and
thread) from the LWP/thread lists as seen as we get an exec event out
of ptrace.

GDBserver does not need an equivalent fix, because it is already doing
this, as side effect of mourning the pre-exec process, in
gdbserver/linux-low.cc:

  else if (event == PTRACE_EVENT_EXEC && cs.report_exec_events)
    {
...
      /* Delete the execing process and all its threads.  */
      mourn (proc);
      switch_to_thread (nullptr);

Change-Id: I21ec18072c7750f3a972160ae6b9e46590376643
---
 gdb/linux-nat.c                              | 15 +++++++++++++++
 gdb/testsuite/gdb.threads/step-over-exec.exp |  6 ++++++
 2 files changed, 21 insertions(+)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 57cab4ce50a..e27cc890ff5 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1987,6 +1987,21 @@ linux_handle_extended_wait (struct lwp_info *lp, int status)
 	 thread execs, it changes its tid to the tgid, and the old
 	 tgid thread might have not been resumed.  */
       lp->resumed = 1;
+
+      /* All other LWPs are gone now.  We'll have received a thread
+	 exit notification for all threads other the execing one.
+	 That one, if it wasn't the leader, just silently changes its
+	 tid to the tgid, and the previous leader vanishes.  Since
+	 Linux 3.0, the former thread ID can be retrieved with
+	 PTRACE_GETEVENTMSG, but since we support older kernels, don't
+	 bother with it, and just walk the LWP list.  Even with
+	 PTRACE_GETEVENTMSG, we'd still need to lookup the
+	 corresponding LWP object, and it would be an extra ptrace
+	 syscall, so this way may even be more efficient.  */
+      for (lwp_info *other_lp : all_lwps_safe ())
+	if (other_lp != lp && other_lp->ptid.pid () == lp->ptid.pid ())
+	  exit_lwp (other_lp);
+
       return 0;
     }
 
diff --git a/gdb/testsuite/gdb.threads/step-over-exec.exp b/gdb/testsuite/gdb.threads/step-over-exec.exp
index 783f865585c..a8b01f8aeda 100644
--- a/gdb/testsuite/gdb.threads/step-over-exec.exp
+++ b/gdb/testsuite/gdb.threads/step-over-exec.exp
@@ -102,6 +102,12 @@ proc do_test { execr_thread different_text_segments displaced_stepping } {
     gdb_breakpoint foo
     gdb_test "continue" "Breakpoint $decimal, foo .*" \
 	"continue to foo"
+
+    # Test that GDB is able to kill the inferior.  This may fail if
+    # e.g., GDB does not dispose of the pre-exec threads properly.
+    gdb_test "with confirm off -- kill" \
+	"\\\[Inferior 1 (.*) killed\\\]" \
+	"kill inferior"
 }
 
 foreach_with_prefix displaced_stepping {auto off} {
-- 
2.36.0


  parent reply	other threads:[~2022-07-13 22:24 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 22:24 [PATCH v2 00/29] Step over thread clone and thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 01/29] displaced step: pass down target_waitstatus instead of gdb_signal Pedro Alves
2022-07-20 20:21   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 02/29] linux-nat: introduce pending_status_str Pedro Alves
2022-07-20 20:38   ` Simon Marchi
2022-07-13 22:24 ` Pedro Alves [this message]
2022-07-21  0:45   ` [PATCH v2 03/29] gdb/linux: Delete all other LWPs immediately on ptrace exec event Simon Marchi
2022-07-13 22:24 ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-21  1:35   ` Simon Marchi
2022-10-17 18:54     ` Pedro Alves
2022-10-18 12:40     ` [PATCH] Don't explicitly set clone child ptrace options (was: Re: [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED) Pedro Alves
2022-10-28 14:50       ` Simon Marchi
2022-12-12 20:13     ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-13 22:24 ` [PATCH v2 05/29] Support clone events in the remote protocol Pedro Alves
2022-07-21  2:25   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 06/29] Avoid duplicate QThreadEvents packets Pedro Alves
2022-07-21  2:30   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-21  3:14   ` Simon Marchi
2022-10-27 19:55     ` [PATCH] enum_flags to_string (was: Re: [PATCH v2 07/29] Thread options & clone events (core + remote)) Pedro Alves
2022-10-28 10:26       ` [PATCH v2] " Pedro Alves
2022-10-28 11:08         ` [PATCH v3] " Pedro Alves
2022-10-28 15:59           ` Simon Marchi
2022-10-28 18:23             ` Pedro Alves
2022-10-31 12:47               ` Simon Marchi
2022-11-07 17:26                 ` [PATCH v5] " Pedro Alves
2022-11-07 18:29                   ` Simon Marchi
2022-11-08 14:56                     ` Pedro Alves
2022-12-12 20:15     ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 08/29] Thread options & clone events (native Linux) Pedro Alves
2022-07-21 12:38   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 09/29] Thread options & clone events (Linux GDBserver) Pedro Alves
2022-07-21 13:11   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 10/29] gdbserver: Hide and don't detach pending clone children Pedro Alves
2022-07-13 22:24 ` [PATCH v2 11/29] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 12/29] Add test for stepping over clone syscall Pedro Alves
2022-07-13 22:24 ` [PATCH v2 13/29] all-stop/synchronous RSP support thread-exit events Pedro Alves
2022-07-13 22:24 ` [PATCH v2 14/29] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
2022-07-13 22:24 ` [PATCH v2 15/29] Introduce GDB_TO_EXIT thread option, fix step-over-thread-exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 16/29] Implement GDB_TO_EXIT support for Linux GDBserver Pedro Alves
2022-07-13 22:24 ` [PATCH v2 17/29] Implement GDB_TO_EXIT support for native Linux Pedro Alves
2022-07-21 15:26   ` Simon Marchi
2022-12-12 20:17     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 18/29] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
2022-07-21 18:12   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 19/29] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
2022-07-21 18:21   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 20/29] gdbserver: Queue no-resumed event after thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 21/29] Don't resume new threads if scheduler-locking is in effect Pedro Alves
2022-07-14  5:28   ` Eli Zaretskii
2022-07-21 18:49   ` Simon Marchi
2022-12-12 20:19     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 22/29] Report thread exit event for leader if reporting thread exit events Pedro Alves
2022-07-13 22:24 ` [PATCH v2 23/29] Ignore failure to read PC when resuming Pedro Alves
2022-07-13 22:24 ` [PATCH v2 24/29] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
2022-07-13 22:24 ` [PATCH v2 25/29] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 26/29] Document remote clone events, and QThreadOptions packet Pedro Alves
2022-07-14  5:27   ` Eli Zaretskii
2022-07-13 22:24 ` [PATCH v2 27/29] inferior::clear_thread_list always silent Pedro Alves
2022-07-13 22:24 ` [PATCH v2 28/29] Centralize "[Thread ...exited]" notifications Pedro Alves
2022-07-13 22:24 ` [PATCH v2 29/29] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
2022-07-21 19:28 ` [PATCH v2 00/29] Step over thread clone and thread exit Simon Marchi
2022-10-03 13:46 ` Tom de Vries
2022-10-03 18:37   ` Tom de Vries
2022-12-12 20:20     ` Pedro Alves
2022-12-12 20:19   ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220713222433.374898-4-pedro@palves.net \
    --to=pedro@palves.net \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).