[PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275)

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275)
@ 2022-11-17 19:42 Simon Marchi
  2022-11-17 19:42 ` [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp Simon Marchi
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

This series fixes some assertion failures related to the
commit_resumed_state flag being set while target_{wait,kill,stop} is
called.

Patch 3 comes from Andrew Burgess, it fixes the case where GDB detaches
an inferior while quitting, while the inferior is doing a step-over.

Patch 8 fixes the main issue reported by PR 28275.

The other patches are cleanups or fixes for other problems found while
working on this.

Andrew Burgess (2):
  gdb/testsuite: refactor gdb.threads/detach-step-over.exp
  gdb: fix assert when quitting GDB while a thread is stepping

Simon Marchi (6):
  gdb/testsuite: remove global declarations in
    gdb.threads/detach-step-over.exp
  gdbserver/linux: take condition out of callback in find_lwp_pid
  gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter
  gdbserver: use current_process in ps_getpid
  gdbserver: switch to right process in find_one_thread
  gdb: disable commit resumed in target_kill

 gdb/target.c                                  |  13 +-
 .../gdb.base/run-control-while-bg-execution.c |  33 ++
 .../run-control-while-bg-execution.exp        | 118 +++++++
 .../gdb.threads/detach-step-over.exp          | 322 +++++++++++-------
 gdbserver/linux-low.cc                        |   4 +-
 gdbserver/linux-x86-low.cc                    |  27 +-
 gdbserver/proc-service.cc                     |   2 +-
 gdbserver/thread-db.cc                        |  29 +-
 8 files changed, 400 insertions(+), 148 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp

-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18  8:30   ` Aktemur, Tankut Baris
  2022-11-17 19:42 ` [PATCH 2/8] gdb/testsuite: refactor gdb.threads/detach-step-over.exp Simon Marchi
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

Before doing further changes to this file, change to use the :: notation
instead of declaring global variables with the `global` keyword.

Change-Id: I72301fd8f4693fea61aac054ba17245a1f4442fb
---
 .../gdb.threads/detach-step-over.exp          | 40 ++++++++-----------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/detach-step-over.exp b/gdb/testsuite/gdb.threads/detach-step-over.exp
index 15af7e0e723..917be2ef378 100644
--- a/gdb/testsuite/gdb.threads/detach-step-over.exp
+++ b/gdb/testsuite/gdb.threads/detach-step-over.exp
@@ -58,24 +58,18 @@ set bp_lineno [gdb_get_line_number "Set breakpoint here"]
 
 # The test proper.  See description above.
 proc test {condition_eval target_non_stop non_stop displaced} {
-    global binfile srcfile
-    global gdb_prompt
-    global decimal
-    global bp_lineno
-    global GDBFLAGS
-
     # Number of threads started by the program.
     set n_threads 10
 
-    save_vars { GDBFLAGS } {
-	append GDBFLAGS " -ex \"maint set target-non-stop $target_non_stop\""
-	append GDBFLAGS " -ex \"set non-stop $non_stop\""
-	append GDBFLAGS " -ex \"set displaced $displaced\""
-	append GDBFLAGS " -ex \"set schedule-multiple on\""
-	clean_restart $binfile
+    save_vars { ::GDBFLAGS } {
+	append ::GDBFLAGS " -ex \"maint set target-non-stop $target_non_stop\""
+	append ::GDBFLAGS " -ex \"set non-stop $non_stop\""
+	append ::GDBFLAGS " -ex \"set displaced $displaced\""
+	append ::GDBFLAGS " -ex \"set schedule-multiple on\""
+	clean_restart $::binfile
     }
 
-    set test_spawn_id [spawn_wait_for_attach $binfile]
+    set test_spawn_id [spawn_wait_for_attach $::binfile]
     set testpid [spawn_id_get_pid $test_spawn_id]
 
     set any "\[^\r\n\]*"
@@ -83,7 +77,7 @@ proc test {condition_eval target_non_stop non_stop displaced} {
     gdb_test "add-inferior" "Added inferior 2.*"
     gdb_test "inferior 2" "Switching to .*"
 
-    gdb_load $binfile
+    gdb_load $::binfile
     if ![runto setup_done] then {
 	fail "can't run to setup_done"
 	kill_wait_spawned_process $test_spawn_id
@@ -95,7 +89,7 @@ proc test {condition_eval target_non_stop non_stop displaced} {
     # Get the PID of the test process.
     set pid_inf2 ""
     gdb_test_multiple "p mypid" "get pid of inferior 2" {
-	-re " = ($decimal)\r\n$gdb_prompt $" {
+	-re " = ($::decimal)\r\n$::gdb_prompt $" {
 	    set pid_inf2 $expect_out(1,string)
 	    pass $gdb_test_name
 	}
@@ -124,13 +118,13 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 			# Prevent -readnow timeout.
 			exp_continue
 		    }
-		    -re "is a zombie - the process has already terminated.*$gdb_prompt " {
+		    -re "is a zombie - the process has already terminated.*$::gdb_prompt " {
 			fail $gdb_test_name
 		    }
-		    -re "Unable to attach: .*$gdb_prompt " {
+		    -re "Unable to attach: .*$::gdb_prompt " {
 			fail $gdb_test_name
 		    }
-		    -re "\r\n$gdb_prompt " {
+		    -re "\r\n$::gdb_prompt " {
 			if { $saw_attaching } {
 			    set attached 1
 			    pass $test
@@ -173,7 +167,7 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 	    }
 
 	    # Set threads stepping over a breakpoint continuously.
-	    gdb_test "break $srcfile:$bp_lineno if 0" "Breakpoint.*" \
+	    gdb_test "break $::srcfile:$::bp_lineno if 0" "Breakpoint.*" \
 		"break LOC if 0"
 
 	    if {$attempt < $attempts} {
@@ -192,7 +186,7 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 
 	    set cont_cmd_re [string_to_regexp $cont_cmd]
 	    gdb_test_multiple $cont_cmd "" {
-		-re "^$cont_cmd_re\r\nContinuing\.\r\n$gdb_prompt " {
+		-re "^$cont_cmd_re\r\nContinuing\.\r\n$::gdb_prompt " {
 		    pass $gdb_test_name
 		}
 	    }
@@ -208,14 +202,14 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 		    incr running_count
 		    exp_continue
 		}
-		-re "Cannot execute this command while the target is running.*$gdb_prompt $" {
+		-re "Cannot execute this command while the target is running.*$::gdb_prompt $" {
 		    # Testing against a remote server that doesn't do
 		    # non-stop mode.  Explicitly interrupt.  This
 		    # doesn't test the same code paths in GDB, but
 		    # it's still something.
 		    set interrupted 1
 		    gdb_test_multiple "interrupt" "" {
-			-re "$gdb_prompt " {
+			-re "$::gdb_prompt " {
 			    gdb_test_multiple "" $gdb_test_name {
 				-re "received signal SIGINT, Interrupt" {
 				    pass $gdb_test_name
@@ -224,7 +218,7 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 			}
 		    }
 		}
-		-re "$gdb_prompt $" {
+		-re "$::gdb_prompt $" {
 		    gdb_assert {$running_count == ($n_threads + 1) * 2} $gdb_test_name
 		}
 	    }
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 2/8] gdb/testsuite: refactor gdb.threads/detach-step-over.exp
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
  2022-11-17 19:42 ` [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-17 19:42 ` [PATCH 3/8] gdb: fix assert when quitting GDB while a thread is stepping Simon Marchi
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Simon Marchi

From: Andrew Burgess <aburgess@redhat.com>

Factor out some bits of gdb.threads/detach-step-over.exp to procs in
preparation to adding some new variations of the test.  Rename the
existing "test" proc and make it use proc_with_prefix.

Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Change-Id: Ib4412545c81c8556029e0f7bfa9dd48d7a9f3189
---
 .../gdb.threads/detach-step-over.exp          | 238 ++++++++++--------
 1 file changed, 138 insertions(+), 100 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/detach-step-over.exp b/gdb/testsuite/gdb.threads/detach-step-over.exp
index 917be2ef378..ad9b08f549e 100644
--- a/gdb/testsuite/gdb.threads/detach-step-over.exp
+++ b/gdb/testsuite/gdb.threads/detach-step-over.exp
@@ -56,11 +56,11 @@ standard_testfile
 
 set bp_lineno [gdb_get_line_number "Set breakpoint here"]
 
-# The test proper.  See description above.
-proc test {condition_eval target_non_stop non_stop displaced} {
-    # Number of threads started by the program.
-    set n_threads 10
+# Number of threads started by the program.
+set n_threads 10
 
+# Start GDB, configuring various settings according to the arguments.
+proc start_gdb_for_test {condition_eval target_non_stop non_stop displaced} {
     save_vars { ::GDBFLAGS } {
 	append ::GDBFLAGS " -ex \"maint set target-non-stop $target_non_stop\""
 	append ::GDBFLAGS " -ex \"set non-stop $non_stop\""
@@ -69,10 +69,137 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 	clean_restart $::binfile
     }
 
+    gdb_test_no_output "set breakpoint condition-evaluation $condition_eval"
+}
+
+# Use the 'attach' command to attach to process with pid TESTPID.  Return true
+# if we believe GDB has attached and we are back at the GDB prompt, otherwise,
+# return false.
+proc attach_to {testpid} {
+    with_timeout_factor 2 {
+	set attached 0
+	set saw_attaching 0
+	gdb_test_multiple "attach $testpid" "attach" {
+	    -re "Attaching to program.*process $testpid\r\n" {
+		set saw_attaching 1
+		exp_continue
+	    }
+	    -re "new threads in iteration" {
+		# Seen when "set debug libthread_db" is on.
+		exp_continue
+	    }
+	    -re "Reading symbols from|Expanding full symbols from" {
+		# Prevent -readnow timeout.
+		exp_continue
+	    }
+	    -re "is a zombie - the process has already terminated.*$::gdb_prompt " {
+		fail $gdb_test_name
+	    }
+	    -re "Unable to attach: .*$::gdb_prompt " {
+		fail $gdb_test_name
+	    }
+	    -re "\r\n$::gdb_prompt " {
+		if { $saw_attaching } {
+		    set attached 1
+		    pass $gdb_test_name
+		} else {
+		    fail $gdb_test_name
+		}
+	    }
+	}
+    }
+
+    return $attached
+}
+
+# After attaching to a multi-threaded inferior in non-stop mode, we expect to
+# see a stop message from each thread.  This proc waits for all of these stop
+# messages.  TID_RE is a regexp used to match the thread-id of the stopped
+# thread.
+#
+# Return true if we saw a stop from each of the expected threads (based on the
+# global N_THREADS value), otherwise, return false.
+proc check_stops_after_non_stop_attach {tid_re} {
+    set any "\[^\r\n\]*"
+
+    # In non-stop, we will see one stop per thread after the prompt.
+    set stops 0
+    set test "seen all stops"
+    for {set thread 1} { $thread <= $::n_threads } { incr thread } {
+	if {[gdb_test_multiple "" $test {
+	    -re "Thread ${tid_re} ${any} stopped" {
+		incr stops
+	    }
+	}] != 0} {
+	    break
+	}
+    }
+
+    # If we haven't seen all stops, then the
+    # gdb_test_multiple in the loop above will have
+    # already issued a FAIL.
+    if {$stops != $::n_threads} {
+	return false
+    }
+    pass $test
+    return true
+}
+
+# Prepare for a single test iteration.  TESTPID is the pid of the process GDB
+# will be attached too.  NON_STOP indicates if GDB is configured in non-stop
+# mode or not.  ATTEMPT is the current attempt number, and ATTEMPTS is the
+# maximum number of attempts we plan to run.  TID_RE is a string used to match
+# against a thread-id in GDB's stop messages.
+#
+# Return true if everything is prepared correctly, otherwise return false.
+proc prepare_test_iter {testpid non_stop attempt attempts tid_re} {
+    if {![attach_to $testpid]} {
+	return false
+    }
+
+    if {$non_stop} {
+	if {![check_stops_after_non_stop_attach $tid_re]} {
+	    return false
+	}
+    }
+
+    gdb_test "break ${::srcfile}:${::bp_lineno} if 0" "Breakpoint.*" \
+	"break LOC if 0"
+
+    if {$attempt < $attempts} {
+	# Kick the time out timer for another round.
+	gdb_test "print again = 1" " = 1" "reset timer in the inferior"
+	# Show the time we had left in the logs, in case
+	# something goes wrong.
+	gdb_test "print seconds_left" " = .*"
+    }
+
+    if {$non_stop} {
+	set cont_cmd "continue -a &"
+    } else {
+	set cont_cmd "continue &"
+    }
+
+    set cont_cmd_re [string_to_regexp $cont_cmd]
+    gdb_test_multiple $cont_cmd "" {
+	-re "^$cont_cmd_re\r\nContinuing\.\r\n$::gdb_prompt " {
+	    pass $gdb_test_name
+	}
+    }
+
+    # Wait a bit, to give time for the threads to hit the
+    # breakpoint.
+    sleep 1
+
+    return true
+}
+
+# The test proper.  See the description at the top of the file.
+proc_with_prefix test_detach_command {condition_eval target_non_stop non_stop displaced} {
     set test_spawn_id [spawn_wait_for_attach $::binfile]
     set testpid [spawn_id_get_pid $test_spawn_id]
 
-    set any "\[^\r\n\]*"
+    start_gdb_for_test $condition_eval $target_non_stop $non_stop $displaced
 
     gdb_test "add-inferior" "Added inferior 2.*"
     gdb_test "inferior 2" "Switching to .*"
@@ -84,8 +211,6 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 	return
     }
 
-    gdb_test_no_output "set breakpoint condition-evaluation $condition_eval"
-
     # Get the PID of the test process.
     set pid_inf2 ""
     gdb_test_multiple "p mypid" "get pid of inferior 2" {
@@ -100,101 +225,12 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 	with_test_prefix "iter $attempt" {
 	    gdb_test "inferior 1" "Switching to .*"
 
-	    with_timeout_factor 2 {
-		set attached 0
-		set saw_attaching 0
-		set eperm 0
-		set test "attach"
-		gdb_test_multiple "attach $testpid" $test {
-		    -re "Attaching to program.*process $testpid\r\n" {
-			set saw_attaching 1
-			exp_continue
-		    }
-		    -re "new threads in iteration" {
-			# Seen when "set debug libthread_db" is on.
-			exp_continue
-		    }
-		    -re "Reading symbols from|Expanding full symbols from" {
-			# Prevent -readnow timeout.
-			exp_continue
-		    }
-		    -re "is a zombie - the process has already terminated.*$::gdb_prompt " {
-			fail $gdb_test_name
-		    }
-		    -re "Unable to attach: .*$::gdb_prompt " {
-			fail $gdb_test_name
-		    }
-		    -re "\r\n$::gdb_prompt " {
-			if { $saw_attaching } {
-			    set attached 1
-			    pass $test
-			} else {
-			    fail $test
-			}
-		    }
-		}
-	    }
-
-	    if {!$attached} {
+	    if {![prepare_test_iter $testpid $non_stop \
+		      $attempt $attempts "$::decimal\.$::decimal"]} {
 		kill_wait_spawned_process $test_spawn_id
 		return
 	    }
 
-	    if {$non_stop} {
-		# In non-stop, we will see one stop per thread after
-		# the prompt.
-		set stops 0
-		set tid_re "$::decimal\.$::decimal"
-		set test "seen all stops"
-		for {set thread 1} { $thread <= $n_threads } { incr thread } {
-		    if {[gdb_test_multiple "" $test {
-			-re "Thread ${tid_re} ${any} stopped" {
-			    incr stops
-			}
-		    }] != 0} {
-			break
-		    }
-		}
-
-		# If we haven't seen all stops, then the
-		# gdb_test_multiple in the loop above will have
-		# already issued a FAIL.
-		if {$stops != $n_threads} {
-		    kill_wait_spawned_process $test_spawn_id
-		    return
-		}
-		pass $test
-	    }
-
-	    # Set threads stepping over a breakpoint continuously.
-	    gdb_test "break $::srcfile:$::bp_lineno if 0" "Breakpoint.*" \
-		"break LOC if 0"
-
-	    if {$attempt < $attempts} {
-		# Kick the time out timer for another round.
-		gdb_test "print again = 1" " = 1" "reset timer in the inferior"
-		# Show the time we had left in the logs, in case
-		# something goes wrong.
-		gdb_test "print seconds_left" " = .*"
-	    }
-
-	    if {$non_stop} {
-		set cont_cmd "continue -a &"
-	    } else {
-		set cont_cmd "continue &"
-	    }
-
-	    set cont_cmd_re [string_to_regexp $cont_cmd]
-	    gdb_test_multiple $cont_cmd "" {
-		-re "^$cont_cmd_re\r\nContinuing\.\r\n$::gdb_prompt " {
-		    pass $gdb_test_name
-		}
-	    }
-
-	    # Wait a bit, to give time for the threads to hit the
-	    # breakpoint.
-	    sleep 1
-
 	    set running_count 0
 	    set interrupted 0
 	    gdb_test_multiple "info threads" "all threads running" {
@@ -219,7 +255,8 @@ proc test {condition_eval target_non_stop non_stop displaced} {
 		    }
 		}
 		-re "$::gdb_prompt $" {
-		    gdb_assert {$running_count == ($n_threads + 1) * 2} $gdb_test_name
+		    gdb_assert {$running_count == ($::n_threads + 1) * 2} \
+			$gdb_test_name
 		}
 	    }
 
@@ -292,7 +329,8 @@ foreach_with_prefix breakpoint-condition-evaluation {"host" "target"} {
 	    }
 
 	    foreach_with_prefix displaced {"off" "auto"} {
-		test ${breakpoint-condition-evaluation} ${target-non-stop} ${non-stop} ${displaced}
+		test_detach_command ${breakpoint-condition-evaluation} \
+		    ${target-non-stop} ${non-stop} ${displaced}
 	    }
 	}
     }
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 3/8] gdb: fix assert when quitting GDB while a thread is stepping
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
  2022-11-17 19:42 ` [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp Simon Marchi
  2022-11-17 19:42 ` [PATCH 2/8] gdb/testsuite: refactor gdb.threads/detach-step-over.exp Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-17 19:42 ` [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid Simon Marchi
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, Simon Marchi

From: Andrew Burgess <aburgess@redhat.com>

This commit addresses one of the issues identified in PR gdb/28275.

Bug gdb/28275 identifies a number of situations in which this assert:

  Assertion `!proc_target->commit_resumed_state' failed.

could be triggered.  There's actually a number of similar places where
this assert is found in GDB, the two of interest in gdb/28275 are in
target_wait and target_stop.

In one of the comments:

  https://sourceware.org/bugzilla/show_bug.cgi?id=28275#c1

steps to trigger the assertion within target_stop were identified when
using a modified version of the gdb.threads/detach-step-over.exp test
script.

In the gdb.threads/detach-step-over.exp test, we attach to a
multi-threaded inferior, and continue the inferior in asynchronous
(background) mode.  Each thread is continuously hitting a conditional
breakpoint where the condition is always false.  While the inferior is
running we detach.  The goal is that we detach while GDB is performing a
step-over for the conditional breakpoint in at least one thread.

While detaching, if a step-over is in progress, then GDB has to
complete the step over before we can detach.  This involves calling
target_stop and target_wait (see prepare_for_detach).

As far as gdb/28275 is concerned, the interesting part here, is the
the process_stratum_target::commit_resumed_state variable must be
false when target_stop and target_wait are called.

This is currently ensured because, in detach_command (infrun.c), we
create an instance of scoped_disable_commit_resumed, this ensures that
when target_detach is called, ::commit_resumed_state will be false.

The modification to the test that I propose here, and which exposed
the bug, is that, instead of using "detach" to detach from the
inferior, we instead use "quit".  Quitting GDB after attaching to an
inferior will cause GDB to first detach, and then exit.

When we quit GDB we end up calling target_detach via a different code
path, the stack looks like:

  #0 target_detach
  #1 kill_or_detach
  #2 quit_force
  #3 quit_command

Along this path there is no scoped_disable_commit_resumed created.
::commit_resumed_state can be true when we reach prepare_for_detach,
which calls target_wait and target_stop, so the assertion will trigger.

In this commit, I propose fixing this by adding the creation of a
scoped_disable_commit_resumed into target_detach.  This will ensure
that ::commit_resumed_state is false when calling prepare_for_detach
from within target_detach.

I did consider placing the scoped_disable_commit_resumed in
prepare_for_detach, however, placing it in target_detach ensures that
the target's commit_resumed_state flag is left to false if the detached
inferior was the last one for that target.  It's the same rationale as
for patch "gdb: disable commit resumed in target_kill" that comes later
in this series, but for detach instead of kill.

detach_command still includes a scoped_disable_commit_resumed too, but I
think it is still relevant to cover the resumption at the end of the
function.

Co-Authored-By: Simon Marchi <simon.marchi@efficios.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
Change-Id: Ie128f7aba6ef0e018859275eca372e6ea738e96f
---
 gdb/target.c                                  |  5 ++
 .../gdb.threads/detach-step-over.exp          | 52 +++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/gdb/target.c b/gdb/target.c
index 74925e139dc..4a22885b82f 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -2535,6 +2535,9 @@ target_preopen (int from_tty)
 void
 target_detach (inferior *inf, int from_tty)
 {
+  /* Thread's don't need to be resumed until the end of this function.  */
+  scoped_disable_commit_resumed disable_commit_resumed ("detaching");
+
   /* After we have detached, we will clear the register cache for this inferior
      by calling registers_changed_ptid.  We must save the pid_ptid before
      detaching, as the target detach method will clear inf->pid.  */
@@ -2565,6 +2568,8 @@ target_detach (inferior *inf, int from_tty)
      inferior_ptid matches save_pid_ptid, but in our case, it does not
      call it, as inferior_ptid has been reset.  */
   reinit_frame_cache ();
+
+  disable_commit_resumed.reset_and_commit ();
 }

 void
diff --git a/gdb/testsuite/gdb.threads/detach-step-over.exp b/gdb/testsuite/gdb.threads/detach-step-over.exp
index ad9b08f549e..d2cb52423d9 100644
--- a/gdb/testsuite/gdb.threads/detach-step-over.exp
+++ b/gdb/testsuite/gdb.threads/detach-step-over.exp
@@ -284,6 +284,56 @@ proc_with_prefix test_detach_command {condition_eval target_non_stop non_stop di
     kill_wait_spawned_process $test_spawn_id
 }

+# Similar to the proc above, but this time, instead of detaching using
+# the 'detach' command, we quit GDB, this will also trigger a detach, but
+# through a slightly different path, which can expose different bugs.
+proc_with_prefix test_detach_quit {condition_eval target_non_stop \
+	non_stop displaced} {
+    # If debugging with target remote, check whether the all-stop variant
+    # of the RSP is being used.  If so, we can't run the background tests.
+    if {!$non_stop
+	&& [target_info exists gdb_protocol]
+	&& ([target_info gdb_protocol] == "remote"
+	    || [target_info gdb_protocol] == "extended-remote")} {
+	start_gdb_for_test $condition_eval $target_non_stop \
+	    $non_stop $displaced
+
+	gdb_test_multiple "maint show target-non-stop" "" {
+	    -wrap -re "(is|currently) on.*" {
+	    }
+	    -wrap -re "(is|currently) off.*" {
+		return
+	    }
+	}
+    }
+
+    set test_spawn_id [spawn_wait_for_attach $::binfile]
+    set testpid [spawn_id_get_pid $test_spawn_id]
+
+    set attempts 3
+    for {set attempt 1} { $attempt <= $attempts } { incr attempt } {
+	with_test_prefix "iter $attempt" {
+
+	    start_gdb_for_test $condition_eval $target_non_stop \
+		$non_stop $displaced
+
+	    if {![prepare_test_iter $testpid $non_stop \
+		      $attempt $attempts "$::decimal"]} {
+		kill_wait_spawned_process $test_spawn_id
+		return
+	    }
+
+	    gdb_test_multiple "with confirm off -- quit" "" {
+		eof {
+		    pass $gdb_test_name
+		}
+	    }
+	}
+    }
+
+    kill_wait_spawned_process $test_spawn_id
+}
+
 # The test program exits after a while, in case GDB crashes.  Make it
 # wait at least as long as we may wait before declaring a time out
 # failure.
@@ -331,6 +381,8 @@ foreach_with_prefix breakpoint-condition-evaluation {"host" "target"} {
 	    foreach_with_prefix displaced {"off" "auto"} {
 		test_detach_command ${breakpoint-condition-evaluation} \
 		    ${target-non-stop} ${non-stop} ${displaced}
+		test_detach_quit ${breakpoint-condition-evaluation} \
+		    ${target-non-stop} ${non-stop} ${displaced}
 	    }
 	}
     }
-- 
2.37.3

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
                   ` (2 preceding siblings ...)
  2022-11-17 19:42 ` [PATCH 3/8] gdb: fix assert when quitting GDB while a thread is stepping Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18 11:28   ` Andrew Burgess
  2022-11-17 19:42 ` [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter Simon Marchi
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

Just a small optimization, it's not necessary to recompute lwp at each
iteration.

While at it, change the variable type to long, as ptid_t::lwp returns a
long.

Change-Id: I181670ce1f90b59cb09ea4899367750be2ad9105
---
 gdbserver/linux-low.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
index 0cbfeb99086..a896b37528b 100644
--- a/gdbserver/linux-low.cc
+++ b/gdbserver/linux-low.cc
@@ -1731,9 +1731,9 @@ linux_process_target::status_pending_p_callback (thread_info *thread,
 struct lwp_info *
 find_lwp_pid (ptid_t ptid)
 {
-  thread_info *thread = find_thread ([&] (thread_info *thr_arg)
+  long lwp = ptid.lwp () != 0 ? ptid.lwp () : ptid.pid ();
+  thread_info *thread = find_thread ([lwp] (thread_info *thr_arg)
     {
-      int lwp = ptid.lwp () != 0 ? ptid.lwp () : ptid.pid ();
       return thr_arg->id.lwp () == lwp;
     });
 
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
                   ` (3 preceding siblings ...)
  2022-11-17 19:42 ` [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18 11:32   ` Andrew Burgess
  2022-11-17 19:42 ` [PATCH 6/8] gdbserver: use current_process in ps_getpid Simon Marchi
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

ps_get_thread_area receives as a parameter the lwpid it must work on.
It then calls is_64bit_tdesc, which uses the current_thread as the
thread to work on.  However, it is not said that both are the same.

This became a problem when working in a following patch that pmakes
find_one_thread switch to a process but to no thread (current_thread ==
nullptr).  When libthread_db needed to get the thread area,
is_64bit_tdesc would try to get the regcache of a nullptr thread.

Fix that by making is_64bit_tdesc accept the thread to work on as a
parameter.  Find the right thread from the context, when possible (when
we know the lwpid to work on).  Otherwise, pass "current_thread", to
retain the existing behavior.

Change-Id: I44394d6be92392fa28de71982fd04517ce8a3007
---
 gdbserver/linux-x86-low.cc | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
index d2b55f6f0d2..c98a7a461fe 100644
--- a/gdbserver/linux-x86-low.cc
+++ b/gdbserver/linux-x86-low.cc
@@ -275,9 +275,9 @@ static /*const*/ int i386_regmap[] =
    per the tdesc.  */
 
 static int
-is_64bit_tdesc (void)
+is_64bit_tdesc (thread_info *thread)
 {
-  struct regcache *regcache = get_thread_regcache (current_thread, 0);
+  struct regcache *regcache = get_thread_regcache (thread, 0);
 
   return register_size (regcache->tdesc, 0) == 8;
 }
@@ -292,7 +292,9 @@ ps_get_thread_area (struct ps_prochandle *ph,
 		    lwpid_t lwpid, int idx, void **base)
 {
 #ifdef __x86_64__
-  int use_64bit = is_64bit_tdesc ();
+  lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
+  gdb_assert (lwp != nullptr);
+  int use_64bit = is_64bit_tdesc (get_lwp_thread (lwp));
 
   if (use_64bit)
     {
@@ -335,7 +337,9 @@ int
 x86_target::low_get_thread_area (int lwpid, CORE_ADDR *addr)
 {
 #ifdef __x86_64__
-  int use_64bit = is_64bit_tdesc ();
+  lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
+  gdb_assert (lwp != nullptr);
+  int use_64bit = is_64bit_tdesc (get_lwp_thread (lwp));
 
   if (use_64bit)
     {
@@ -351,7 +355,6 @@ x86_target::low_get_thread_area (int lwpid, CORE_ADDR *addr)
 #endif
 
   {
-    struct lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
     struct thread_info *thr = get_lwp_thread (lwp);
     struct regcache *regcache = get_thread_regcache (thr, 1);
     unsigned int desc[4];
@@ -379,7 +382,7 @@ bool
 x86_target::low_cannot_store_register (int regno)
 {
 #ifdef __x86_64__
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return false;
 #endif
 
@@ -390,7 +393,7 @@ bool
 x86_target::low_cannot_fetch_register (int regno)
 {
 #ifdef __x86_64__
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return false;
 #endif
 
@@ -815,7 +818,7 @@ x86_target::low_siginfo_fixup (siginfo_t *ptrace, gdb_byte *inf, int direction)
   int is_elf64 = linux_pid_exe_is_elf_64_file (tid, &machine);
 
   /* Is the inferior 32-bit?  If so, then fixup the siginfo object.  */
-  if (!is_64bit_tdesc ())
+  if (!is_64bit_tdesc (current_thread))
       return amd64_linux_siginfo_fixup_common (ptrace, inf, direction,
 					       FIXUP_32);
   /* No fixup for native x32 GDB.  */
@@ -1078,7 +1081,7 @@ const regs_info *
 x86_target::get_regs_info ()
 {
 #ifdef __x86_64__
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return &amd64_linux_regs_info;
   else
 #endif
@@ -1553,7 +1556,7 @@ x86_target::install_fast_tracepoint_jump_pad (CORE_ADDR tpoint,
 					      char *err)
 {
 #ifdef __x86_64__
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return amd64_install_fast_tracepoint_jump_pad (tpoint, tpaddr,
 						   collector, lockaddr,
 						   orig_size, jump_entry,
@@ -1587,7 +1590,7 @@ x86_target::get_min_fast_tracepoint_insn_len ()
 #ifdef __x86_64__
   /*  On x86-64, 5-byte jump instructions with a 4-byte offset are always
       used for fast tracepoints.  */
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return 5;
 #endif
 
@@ -2931,7 +2934,7 @@ emit_ops *
 x86_target::emit_ops ()
 {
 #ifdef __x86_64__
-  if (is_64bit_tdesc ())
+  if (is_64bit_tdesc (current_thread))
     return &amd64_emit_ops;
   else
 #endif
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 6/8] gdbserver: use current_process in ps_getpid
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
                   ` (4 preceding siblings ...)
  2022-11-17 19:42 ` [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18 11:33   ` Andrew Burgess
  2022-11-17 19:42 ` [PATCH 7/8] gdbserver: switch to right process in find_one_thread Simon Marchi
  2022-11-17 19:42 ` [PATCH 8/8] gdb: disable commit resumed in target_kill Simon Marchi
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

The following patch ("gdbserver: switch to right process in
find_one_thread") makes it so find_one_thread calls into libthread_db
with a current process but no current thread.  This tripped on ps_getpid
using current_thread in order to get the process' pid.  Get the pid from
`current_process ()` instead, which removes the need to have a current
thread.  Eventually, it would be good to get it from the
gdb_ps_prochandle_t structure, to avoid the need for a current process
as well.

Change-Id: I9d2fae266419199a2fbc2fde0a5104c6e0dbd2d5
---
 gdbserver/proc-service.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdbserver/proc-service.cc b/gdbserver/proc-service.cc
index a49e2b25a47..f86c0e99923 100644
--- a/gdbserver/proc-service.cc
+++ b/gdbserver/proc-service.cc
@@ -158,5 +158,5 @@ ps_lsetfpregs (gdb_ps_prochandle_t ph, lwpid_t lwpid, const prfpregset_t *fpregs
 pid_t
 ps_getpid (gdb_ps_prochandle_t ph)
 {
-  return pid_of (current_thread);
+  return current_process ()->pid;
 }
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 7/8] gdbserver: switch to right process in find_one_thread
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
                   ` (5 preceding siblings ...)
  2022-11-17 19:42 ` [PATCH 6/8] gdbserver: use current_process in ps_getpid Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18 13:19   ` Andrew Burgess
  2022-11-17 19:42 ` [PATCH 8/8] gdb: disable commit resumed in target_kill Simon Marchi
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

When I do this:

    $ ./gdb -nx --data-directory=data-directory -q \
        /bin/sleep \
	-ex "maint set target-non-stop on" \
	-ex "tar ext :1234" \
	-ex "set remote exec-file /bin/sleep" \
	-ex "run 1231 &" \
	-ex add-inferior \
	-ex "inferior 2"
    Reading symbols from /bin/sleep...
    (No debugging symbols found in /bin/sleep)
    Remote debugging using :1234
    Starting program: /bin/sleep 1231
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
    Reading /lib64/ld-linux-x86-64.so.2 from remote target...
    Reading /usr/lib/debug/.build-id/a6/7a1408f18db3576757eea210d07ba3fc560dff.debug from remote target...
    [New inferior 2]
    Added inferior 2 on connection 1 (extended-remote :1234)
    [Switching to inferior 2 [<null>] (<noexec>)]
    (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target...
    attach 3659848
    Attaching to process 3659848
    /home/smarchi/src/binutils-gdb/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.

The internal error of GDB is actually caused by GDBserver crashing, and
the error recovery of GDB is not on point.  This patch aims to fix just
the GDBserver crash, not the GDB problem.

GDBserver crashes with a segfault here:

    (gdb) bt
    #0  0x00005555557fb3f4 in find_one_thread (ptid=...) at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:177
    #1  0x00005555557fd5cf in thread_db_thread_handle (ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400, handle_len=0x7fffffffc3f0)
        at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:461
    #2  0x000055555578a0b6 in linux_process_target::thread_handle (this=0x5555558a64c0 <the_x86_target>, ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400,
        handle_len=0x7fffffffc3f0) at /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:6905
    #3  0x00005555556dfcc6 in handle_qxfer_threads_worker (thread=0x60b000000510, buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1645
    #4  0x00005555556e00e6 in operator() (__closure=0x7fffffffc5e0, thread=0x60b000000510) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1696
    #5  0x00005555556f54be in for_each_thread<handle_qxfer_threads_proper(buffer*)::<lambda(thread_info*)> >(struct {...}) (func=...) at /home/smarchi/src/binutils-gdb/gdbserver/gdbthread.h:159
    #6  0x00005555556e0242 in handle_qxfer_threads_proper (buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1694
    #7  0x00005555556e04ba in handle_qxfer_threads (annex=0x629000000213 "", readbuf=0x621000019100 '\276' <repeats 200 times>..., writebuf=0x0, offset=0, len=4097)
        at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1732
    #8  0x00005555556e1989 in handle_qxfer (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2045
    #9  0x00005555556e720a in handle_query (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2685
    #10 0x00005555556f1a01 in process_serial_event () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4176
    #11 0x00005555556f4457 in handle_serial_event (err=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4514
    #12 0x0000555555820f56 in handle_file_event (file_ptr=0x607000000250, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
    #13 0x0000555555821895 in gdb_wait_for_event (block=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
    #14 0x000055555581f533 in gdb_do_one_event (mstimeout=-1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264
    #15 0x00005555556ec9fb in start_event_loop () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3512
    #16 0x00005555556f0769 in captured_main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3992
    #17 0x00005555556f0e3f in main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4078

The reason is a wrong current process when find_one_thread is called.
The current process is the 2nd one, which was just attached.  It does
not yet have thread_db data (proc->priv->thread_db is nullptr).  As we
iterate on all threads of all process to fulfull the qxfer:threads:read
request, we get to a thread of process 1 for which we haven't read
thread_db information yet (lwp_info::thread_known is false), so we get
into find_one_thread.  find_one_thread uses
`current_process ()->priv->thread_db`, assuming the current process
matches the ptid passed as a parameter, which is wrong.  A segfault
happens when trying to dereference that thread_db pointer.

Fix this by making find_one_thread not assume what the current process /
current thread is.  If it needs to call into libthread_db, which we know
will try to read memory from the current process, then temporarily set
the current process.

In the case where the thread is already know and we return early, we
don't need to switch process.

I hit this case when running the test included with the following patch,
"gdb: disable commit resumed in target_kill", so the fix is exercised by
that test.

Change-Id: I09b00883e8b73b7e5f89d0f47cb4e9c0f3d6caaa
---
 gdbserver/thread-db.cc | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/gdbserver/thread-db.cc b/gdbserver/thread-db.cc
index 6e0e2228a5f..bf98ca9557a 100644
--- a/gdbserver/thread-db.cc
+++ b/gdbserver/thread-db.cc
@@ -155,30 +155,35 @@ thread_db_state_str (td_thr_state_e state)
 }
 #endif
 
-/* Get thread info about PTID, accessing memory via the current
-   thread.  */
+/* Get thread info about PTID.  */
 
 static int
 find_one_thread (ptid_t ptid)
 {
-  td_thrhandle_t th;
-  td_thrinfo_t ti;
-  td_err_e err;
-  struct lwp_info *lwp;
-  struct thread_db *thread_db = current_process ()->priv->thread_db;
-  int lwpid = ptid.lwp ();
-
   thread_info *thread = find_thread_ptid (ptid);
-  lwp = get_thread_lwp (thread);
+  lwp_info *lwp = get_thread_lwp (thread);
   if (lwp->thread_known)
     return 1;
 
-  /* Get information about this thread.  */
-  err = thread_db->td_ta_map_lwp2thr_p (thread_db->thread_agent, lwpid, &th);
+  /* Get information about this thread.  libthread_db will need to read some
+     memory, which will be done on the current process, so make PTID's process
+     the current one.  */
+  process_info *proc = find_process_pid (ptid.pid ());
+  gdb_assert (proc != nullptr);
+
+  scoped_restore_current_thread restore_thread;
+  switch_to_process (proc);
+
+  thread_db *thread_db = proc->priv->thread_db;
+  td_thrhandle_t th;
+  int lwpid = ptid.lwp ();
+  td_err_e err = thread_db->td_ta_map_lwp2thr_p (thread_db->thread_agent, lwpid,
+						 &th);
   if (err != TD_OK)
     error ("Cannot get thread handle for LWP %d: %s",
 	   lwpid, thread_db_err_str (err));
 
+  td_thrinfo_t ti;
   err = thread_db->td_thr_get_info_p (&th, &ti);
   if (err != TD_OK)
     error ("Cannot get thread info for LWP %d: %s",
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 8/8] gdb: disable commit resumed in target_kill
  2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
                   ` (6 preceding siblings ...)
  2022-11-17 19:42 ` [PATCH 7/8] gdbserver: switch to right process in find_one_thread Simon Marchi
@ 2022-11-17 19:42 ` Simon Marchi
  2022-11-18 13:33   ` Andrew Burgess
  7 siblings, 1 reply; 21+ messages in thread
From: Simon Marchi @ 2022-11-17 19:42 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi

PR 28275 shows that doing a sequence of:

 - Run inferior in background (run &)
 - kill that inferior
 - Run again

We get into this assertion:

    /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.

    #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
    #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
    #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
    #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
    #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
    #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
        at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
    #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
    #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526

When running the kill command, commit_resumed_state for the
process_stratum_target (linux-nat, here) is true.  After the kill, when
there are no more threads, commit_resumed_state is still true, as
nothing touches this flag during the kill operation.  During the
subsequent run command, run_command_1 does:

    scoped_disable_commit_resumed disable_commit_resumed ("running");

We would think that this would clear the commit_resumed_state flag of
our native target, but that's not the case, because
scoped_disable_commit_resumed iterates on non-exited inferiors in order
to find active process targets.  And after the kill, the inferior is
exited, and the native target was unpushed from it anyway.  So
scoped_disable_commit_resumed doesn't touch the commit_resumed_state
flag of the native target, it stays true.  When reaching target_wait, in
startup_inferior (to consume the initial expect stop events while the
inferior is starting up and working its way through the shell),
commit_resumed_state is true, breaking the contract saying that
commit_resumed_state is always false when calling the targets' wait
method.

(note: to be correct, I think that startup_inferior should toggle
commit_resumed between the target_wait and target_resume calls, but I'll
ignore that for now)

I can see multiple ways to fix this.  In the end, we need
commit_resumed_state to be cleared by the time we get to that
target_wait.  It could be done at the end of the kill command, or at the
beginning of the run command.

To keep things in a coherent state, I'd like to make it so that after
the kill command, when the target is left with no threads, its
commit_resumed_state flag is left to false.  This way, we can keep
working with the assumption that a target with no threads (and therefore
no running threads) has commit_resumed_state == false.

Do this by adding a scoped_disable_commit_resumed in target_kill.  It
clears the target's commit_resumed_state on entry, and leaves it false
if the target does not have any resumed thread on exit.  That means,
even if the target has another inferior with stopped threads,
commit_resumed_state will be left to false, which makes sense.

Add a test that tries to cover various combinations of actions done
while an inferior is running (and therefore while commit_resumed_state
is true on the process target).

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
---
 gdb/target.c                                  |   8 +-
 .../gdb.base/run-control-while-bg-execution.c |  33 +++++
 .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
 3 files changed, 157 insertions(+), 2 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
 create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp

diff --git a/gdb/target.c b/gdb/target.c
index 4a22885b82f..f5c6212310a 100644
--- a/gdb/target.c
+++ b/gdb/target.c
@@ -907,8 +907,12 @@ add_deprecated_target_alias (const target_info &tinfo, const char *alias)
 void
 target_kill (void)
 {
-  current_inferior ()->top_target ()->kill ();
-}
+
+  /* Ensure that, if commit resumed for the target is currently true (threads
+     are running), if we kill the last running inferior, commit resumed ends up
+     false.  */
+   scoped_disable_commit_resumed disable ("killing"); current_inferior
+   ()->top_target ()->kill (); }
 
 void
 target_load (const char *arg, int from_tty)
diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.c b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
new file mode 100644
index 00000000000..8092fadc8b9
--- /dev/null
+++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
@@ -0,0 +1,33 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2020-2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+					       //
+#include <unistd.h>
+
+static pid_t mypid = -1;
+
+static void
+after_getpid (void)
+{
+}
+
+int
+main (void)
+{
+  mypid = getpid ();
+  after_getpid ();
+  sleep (30);
+}
diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
new file mode 100644
index 00000000000..e637e2bfc98
--- /dev/null
+++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
@@ -0,0 +1,118 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test aims at testing various operations after getting rid of an inferior
+# that was running in background, or while we have an inferior running in
+# background.  The original intent was to expose cases where the commit-resumed
+# state of the process stratum target was not reset properly after killing an
+# inferior running in background, which would be a problem when trying to run
+# again.  The test was expanded to test various combinations of
+# run-control-related actions done with an inferior running in background.
+
+if {[use_gdb_stub]} {
+    unsupported "test requires running"
+    return
+}
+
+standard_testfile
+
+if {[build_executable "failed to prepare" $testfile $srcfile]} {
+    return
+}
+
+# Run one variation of the test:
+#
+# 1. Start an inferior in the background with "run &"
+# 2. Do action 1
+# 3. Do action 2
+#
+# Action 1 indicates what to do with the inferior running in background:
+#
+#  - kill: kill it
+#  - detach: detach it
+#  - add: add a new inferior and switch to it, leave the inferior running in
+#    background alone
+#  - none: do nothing, leave the inferior running in background alone
+#
+# Action 2 indicates what to do after that:
+#
+#  - start: use the start command
+#  - run: use the run command
+#  - attach: start a process outside of GDB and attach it
+proc do_test { action1 action2 } {
+    save_vars { ::GDBFLAGS } {
+	append ::GDBFLAGS " -ex \"maintenance set target-non-stop on\""
+	clean_restart $::binfile
+    }
+
+    # Ensure we are at least after the getpid call, shall we need it.
+    if { ![runto "after_getpid"] } {
+	return
+    }
+
+    # Some commands below ask for confirmation.  Turn that off for simplicity.
+    gdb_test "set confirm off"
+    gdb_test -no-prompt-anchor "continue &"
+
+    if { $action1 == "kill" } {
+	gdb_test "kill" "Inferior 1 .* killed.*"
+    } elseif { $action1 == "detach" } {
+	set child_pid [get_integer_valueof "mypid" -1]
+	if { $child_pid == -1 } {
+	    fail "failed to extract child pid"
+	    return
+	}
+
+	gdb_test "detach" "Inferior 1 .* detached.*" "detach from first instance"
+
+	# Kill the detached process, otherwise that makes "monitor exit" hang
+	# until the process disappears.
+	#remote_exec target "kill $child_pid"
+    } elseif { $action1 == "add" } {
+	gdb_test "add-inferior -exec $::binfile" \
+	    "Added inferior 2 on connection 1.*" "add-inferior"
+	gdb_test "inferior 2" "Switching to inferior 2 .*"
+    } elseif { $action1 == "none" } {
+
+    } else {
+	error "invalid action 1"
+    }
+
+    if { $action2 == "start" } {
+	gdb_test "start" "Temporary breakpoint $::decimal, main .*"
+    } elseif { $action2 == "run" } {
+	gdb_test "break main" "Breakpoint $::decimal at $::hex.*"
+	gdb_test "run" "Breakpoint $::decimal, main .*"
+    } elseif { $action2 == "attach" } {
+	set test_spawn_id [spawn_wait_for_attach $::binfile]
+	set test_pid [spawn_id_get_pid $test_spawn_id]
+
+	if { [gdb_attach $test_pid] } {
+	    gdb_test "detach" "Inferior $::decimal .* detached.*" \
+		"detach from second instance"
+	}
+
+	# Detach and kill this inferior so we don't leave it around.
+	kill_wait_spawned_process $test_spawn_id
+    } else {
+	error "invalid action 2"
+    }
+}
+
+foreach_with_prefix action1 { kill detach add none } {
+    foreach_with_prefix action2 { start run attach } {
+	do_test $action1 $action2
+    }
+}
-- 
2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp
  2022-11-17 19:42 ` [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp Simon Marchi
@ 2022-11-18  8:30   ` Aktemur, Tankut Baris
  2022-11-18 15:07     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Aktemur, Tankut Baris @ 2022-11-18  8:30 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

Hi Simon,

On Thursday, November 17, 2022 8:43 PM, Simon Marchi wrote:
> Before doing further changes to this file, change to use the :: notation
> instead of declaring global variables with the `global` keyword.

Could you please give more clarification about why this is needed?
I fear that this could lead to divergence in the testsuite code-style,
where some files use the global keyword and others use the `::` notation,
similar to the inconsistency of the use of the 'then' keyword [1].

-Baris

[1] https://sourceware.org/pipermail/gdb-patches/2022-November/193801.html


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid
  2022-11-17 19:42 ` [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid Simon Marchi
@ 2022-11-18 11:28   ` Andrew Burgess
  2022-11-18 16:09     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Burgess @ 2022-11-18 11:28 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches, gdb-patches; +Cc: Simon Marchi

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> Just a small optimization, it's not necessary to recompute lwp at each
> iteration.
>
> While at it, change the variable type to long, as ptid_t::lwp returns a
> long.
>
> Change-Id: I181670ce1f90b59cb09ea4899367750be2ad9105

LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

> ---
>  gdbserver/linux-low.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index 0cbfeb99086..a896b37528b 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -1731,9 +1731,9 @@ linux_process_target::status_pending_p_callback (thread_info *thread,
>  struct lwp_info *
>  find_lwp_pid (ptid_t ptid)
>  {
> -  thread_info *thread = find_thread ([&] (thread_info *thr_arg)
> +  long lwp = ptid.lwp () != 0 ? ptid.lwp () : ptid.pid ();
> +  thread_info *thread = find_thread ([lwp] (thread_info *thr_arg)
>      {
> -      int lwp = ptid.lwp () != 0 ? ptid.lwp () : ptid.pid ();
>        return thr_arg->id.lwp () == lwp;
>      });
>  
> -- 
> 2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter
  2022-11-17 19:42 ` [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter Simon Marchi
@ 2022-11-18 11:32   ` Andrew Burgess
  2022-11-18 16:12     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Burgess @ 2022-11-18 11:32 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches, gdb-patches; +Cc: Simon Marchi

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> ps_get_thread_area receives as a parameter the lwpid it must work on.
> It then calls is_64bit_tdesc, which uses the current_thread as the
> thread to work on.  However, it is not said that both are the same.
>
> This became a problem when working in a following patch that pmakes

s/pmakes/makes/

> find_one_thread switch to a process but to no thread (current_thread ==
> nullptr).  When libthread_db needed to get the thread area,
> is_64bit_tdesc would try to get the regcache of a nullptr thread.
>
> Fix that by making is_64bit_tdesc accept the thread to work on as a
> parameter.  Find the right thread from the context, when possible (when
> we know the lwpid to work on).  Otherwise, pass "current_thread", to
> retain the existing behavior.
>
> Change-Id: I44394d6be92392fa28de71982fd04517ce8a3007
> ---
>  gdbserver/linux-x86-low.cc | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
> index d2b55f6f0d2..c98a7a461fe 100644
> --- a/gdbserver/linux-x86-low.cc
> +++ b/gdbserver/linux-x86-low.cc
> @@ -275,9 +275,9 @@ static /*const*/ int i386_regmap[] =
>     per the tdesc.  */
>  
>  static int
> -is_64bit_tdesc (void)
> +is_64bit_tdesc (thread_info *thread)

I think the comment for this function needs updating, the reference to
"current inferior" probably needs to be replaced with THREAD.

With those changes, LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew


>  {
> -  struct regcache *regcache = get_thread_regcache (current_thread, 0);
> +  struct regcache *regcache = get_thread_regcache (thread, 0);
>  
>    return register_size (regcache->tdesc, 0) == 8;
>  }
> @@ -292,7 +292,9 @@ ps_get_thread_area (struct ps_prochandle *ph,
>  		    lwpid_t lwpid, int idx, void **base)
>  {
>  #ifdef __x86_64__
> -  int use_64bit = is_64bit_tdesc ();
> +  lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
> +  gdb_assert (lwp != nullptr);
> +  int use_64bit = is_64bit_tdesc (get_lwp_thread (lwp));
>  
>    if (use_64bit)
>      {
> @@ -335,7 +337,9 @@ int
>  x86_target::low_get_thread_area (int lwpid, CORE_ADDR *addr)
>  {
>  #ifdef __x86_64__
> -  int use_64bit = is_64bit_tdesc ();
> +  lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
> +  gdb_assert (lwp != nullptr);
> +  int use_64bit = is_64bit_tdesc (get_lwp_thread (lwp));
>  
>    if (use_64bit)
>      {
> @@ -351,7 +355,6 @@ x86_target::low_get_thread_area (int lwpid, CORE_ADDR *addr)
>  #endif
>  
>    {
> -    struct lwp_info *lwp = find_lwp_pid (ptid_t (lwpid));
>      struct thread_info *thr = get_lwp_thread (lwp);
>      struct regcache *regcache = get_thread_regcache (thr, 1);
>      unsigned int desc[4];
> @@ -379,7 +382,7 @@ bool
>  x86_target::low_cannot_store_register (int regno)
>  {
>  #ifdef __x86_64__
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return false;
>  #endif
>  
> @@ -390,7 +393,7 @@ bool
>  x86_target::low_cannot_fetch_register (int regno)
>  {
>  #ifdef __x86_64__
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return false;
>  #endif
>  
> @@ -815,7 +818,7 @@ x86_target::low_siginfo_fixup (siginfo_t *ptrace, gdb_byte *inf, int direction)
>    int is_elf64 = linux_pid_exe_is_elf_64_file (tid, &machine);
>  
>    /* Is the inferior 32-bit?  If so, then fixup the siginfo object.  */
> -  if (!is_64bit_tdesc ())
> +  if (!is_64bit_tdesc (current_thread))
>        return amd64_linux_siginfo_fixup_common (ptrace, inf, direction,
>  					       FIXUP_32);
>    /* No fixup for native x32 GDB.  */
> @@ -1078,7 +1081,7 @@ const regs_info *
>  x86_target::get_regs_info ()
>  {
>  #ifdef __x86_64__
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return &amd64_linux_regs_info;
>    else
>  #endif
> @@ -1553,7 +1556,7 @@ x86_target::install_fast_tracepoint_jump_pad (CORE_ADDR tpoint,
>  					      char *err)
>  {
>  #ifdef __x86_64__
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return amd64_install_fast_tracepoint_jump_pad (tpoint, tpaddr,
>  						   collector, lockaddr,
>  						   orig_size, jump_entry,
> @@ -1587,7 +1590,7 @@ x86_target::get_min_fast_tracepoint_insn_len ()
>  #ifdef __x86_64__
>    /*  On x86-64, 5-byte jump instructions with a 4-byte offset are always
>        used for fast tracepoints.  */
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return 5;
>  #endif
>  
> @@ -2931,7 +2934,7 @@ emit_ops *
>  x86_target::emit_ops ()
>  {
>  #ifdef __x86_64__
> -  if (is_64bit_tdesc ())
> +  if (is_64bit_tdesc (current_thread))
>      return &amd64_emit_ops;
>    else
>  #endif
> -- 
> 2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/8] gdbserver: use current_process in ps_getpid
  2022-11-17 19:42 ` [PATCH 6/8] gdbserver: use current_process in ps_getpid Simon Marchi
@ 2022-11-18 11:33   ` Andrew Burgess
  2022-11-18 16:21     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Burgess @ 2022-11-18 11:33 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches, gdb-patches; +Cc: Simon Marchi

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> The following patch ("gdbserver: switch to right process in
> find_one_thread") makes it so find_one_thread calls into libthread_db
> with a current process but no current thread.  This tripped on ps_getpid
> using current_thread in order to get the process' pid.  Get the pid from
> `current_process ()` instead, which removes the need to have a current
> thread.  Eventually, it would be good to get it from the
> gdb_ps_prochandle_t structure, to avoid the need for a current process
> as well.

LGTM.

Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks,
Andrew

>
> Change-Id: I9d2fae266419199a2fbc2fde0a5104c6e0dbd2d5
> ---
>  gdbserver/proc-service.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gdbserver/proc-service.cc b/gdbserver/proc-service.cc
> index a49e2b25a47..f86c0e99923 100644
> --- a/gdbserver/proc-service.cc
> +++ b/gdbserver/proc-service.cc
> @@ -158,5 +158,5 @@ ps_lsetfpregs (gdb_ps_prochandle_t ph, lwpid_t lwpid, const prfpregset_t *fpregs
>  pid_t
>  ps_getpid (gdb_ps_prochandle_t ph)
>  {
> -  return pid_of (current_thread);
> +  return current_process ()->pid;
>  }
> -- 
> 2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 7/8] gdbserver: switch to right process in find_one_thread
  2022-11-17 19:42 ` [PATCH 7/8] gdbserver: switch to right process in find_one_thread Simon Marchi
@ 2022-11-18 13:19   ` Andrew Burgess
  2022-11-18 17:34     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Burgess @ 2022-11-18 13:19 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches, gdb-patches; +Cc: Simon Marchi

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> When I do this:
>
>     $ ./gdb -nx --data-directory=data-directory -q \
>         /bin/sleep \
> 	-ex "maint set target-non-stop on" \
> 	-ex "tar ext :1234" \
> 	-ex "set remote exec-file /bin/sleep" \
> 	-ex "run 1231 &" \
> 	-ex add-inferior \
> 	-ex "inferior 2"
>     Reading symbols from /bin/sleep...
>     (No debugging symbols found in /bin/sleep)
>     Remote debugging using :1234
>     Starting program: /bin/sleep 1231
>     Reading /lib64/ld-linux-x86-64.so.2 from remote target...
>     warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
>     Reading /lib64/ld-linux-x86-64.so.2 from remote target...
>     Reading /usr/lib/debug/.build-id/a6/7a1408f18db3576757eea210d07ba3fc560dff.debug from remote target...
>     [New inferior 2]
>     Added inferior 2 on connection 1 (extended-remote :1234)
>     [Switching to inferior 2 [<null>] (<noexec>)]
>     (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target...
>     attach 3659848
>     Attaching to process 3659848
>     /home/smarchi/src/binutils-gdb/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>
> The internal error of GDB is actually caused by GDBserver crashing, and
> the error recovery of GDB is not on point.  This patch aims to fix just
> the GDBserver crash, not the GDB problem.
>
> GDBserver crashes with a segfault here:
>
>     (gdb) bt
>     #0  0x00005555557fb3f4 in find_one_thread (ptid=...) at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:177
>     #1  0x00005555557fd5cf in thread_db_thread_handle (ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400, handle_len=0x7fffffffc3f0)
>         at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:461
>     #2  0x000055555578a0b6 in linux_process_target::thread_handle (this=0x5555558a64c0 <the_x86_target>, ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400,
>         handle_len=0x7fffffffc3f0) at /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:6905
>     #3  0x00005555556dfcc6 in handle_qxfer_threads_worker (thread=0x60b000000510, buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1645
>     #4  0x00005555556e00e6 in operator() (__closure=0x7fffffffc5e0, thread=0x60b000000510) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1696
>     #5  0x00005555556f54be in for_each_thread<handle_qxfer_threads_proper(buffer*)::<lambda(thread_info*)> >(struct {...}) (func=...) at /home/smarchi/src/binutils-gdb/gdbserver/gdbthread.h:159
>     #6  0x00005555556e0242 in handle_qxfer_threads_proper (buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1694
>     #7  0x00005555556e04ba in handle_qxfer_threads (annex=0x629000000213 "", readbuf=0x621000019100 '\276' <repeats 200 times>..., writebuf=0x0, offset=0, len=4097)
>         at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1732
>     #8  0x00005555556e1989 in handle_qxfer (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2045
>     #9  0x00005555556e720a in handle_query (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2685
>     #10 0x00005555556f1a01 in process_serial_event () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4176
>     #11 0x00005555556f4457 in handle_serial_event (err=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4514
>     #12 0x0000555555820f56 in handle_file_event (file_ptr=0x607000000250, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
>     #13 0x0000555555821895 in gdb_wait_for_event (block=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
>     #14 0x000055555581f533 in gdb_do_one_event (mstimeout=-1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264
>     #15 0x00005555556ec9fb in start_event_loop () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3512
>     #16 0x00005555556f0769 in captured_main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3992
>     #17 0x00005555556f0e3f in main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4078
>
> The reason is a wrong current process when find_one_thread is called.
> The current process is the 2nd one, which was just attached.  It does
> not yet have thread_db data (proc->priv->thread_db is nullptr).  As we
> iterate on all threads of all process to fulfull the qxfer:threads:read
> request, we get to a thread of process 1 for which we haven't read
> thread_db information yet (lwp_info::thread_known is false), so we get
> into find_one_thread.  find_one_thread uses
> `current_process ()->priv->thread_db`, assuming the current process
> matches the ptid passed as a parameter, which is wrong.  A segfault
> happens when trying to dereference that thread_db pointer.
>
> Fix this by making find_one_thread not assume what the current process /
> current thread is.  If it needs to call into libthread_db, which we know
> will try to read memory from the current process, then temporarily set
> the current process.
>
> In the case where the thread is already know and we return early, we
> don't need to switch process.
>
> I hit this case when running the test included with the following patch,
> "gdb: disable commit resumed in target_kill", so the fix is exercised by
> that test.

I'm not able to reproduce the failure you describe.  I've applied this
series except for this patch, and run the test from patch #8 with
native-extended-gdbserver board, and it passes just fine.

Actually, I do see an issue with the test, but that issue doesn't seem
to be related to this problem, and is present with and without this
patch - I'll reply to patch #8 describing that issue.

Is this bug intermittent?  Or is it likely to depend on certain parts of
the environment?  I got the impression from the description that it if
we did the steps in the right order then we'd get a nullptr dereference,
which didn't feel like an intermittent issue, but maybe I don't
understand correctly.

That said, the change itself looks reasonable - but it would be nice to
know why I can't reproduce the failure.

Thanks,
Andrew


>
> Change-Id: I09b00883e8b73b7e5f89d0f47cb4e9c0f3d6caaa
> ---
>  gdbserver/thread-db.cc | 29 +++++++++++++++++------------
>  1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/gdbserver/thread-db.cc b/gdbserver/thread-db.cc
> index 6e0e2228a5f..bf98ca9557a 100644
> --- a/gdbserver/thread-db.cc
> +++ b/gdbserver/thread-db.cc
> @@ -155,30 +155,35 @@ thread_db_state_str (td_thr_state_e state)
>  }
>  #endif
>  
> -/* Get thread info about PTID, accessing memory via the current
> -   thread.  */
> +/* Get thread info about PTID.  */
>  
>  static int
>  find_one_thread (ptid_t ptid)
>  {
> -  td_thrhandle_t th;
> -  td_thrinfo_t ti;
> -  td_err_e err;
> -  struct lwp_info *lwp;
> -  struct thread_db *thread_db = current_process ()->priv->thread_db;
> -  int lwpid = ptid.lwp ();
> -
>    thread_info *thread = find_thread_ptid (ptid);
> -  lwp = get_thread_lwp (thread);
> +  lwp_info *lwp = get_thread_lwp (thread);
>    if (lwp->thread_known)
>      return 1;
>  
> -  /* Get information about this thread.  */
> -  err = thread_db->td_ta_map_lwp2thr_p (thread_db->thread_agent, lwpid, &th);
> +  /* Get information about this thread.  libthread_db will need to read some
> +     memory, which will be done on the current process, so make PTID's process
> +     the current one.  */
> +  process_info *proc = find_process_pid (ptid.pid ());
> +  gdb_assert (proc != nullptr);
> +
> +  scoped_restore_current_thread restore_thread;
> +  switch_to_process (proc);
> +
> +  thread_db *thread_db = proc->priv->thread_db;
> +  td_thrhandle_t th;
> +  int lwpid = ptid.lwp ();
> +  td_err_e err = thread_db->td_ta_map_lwp2thr_p (thread_db->thread_agent, lwpid,
> +						 &th);
>    if (err != TD_OK)
>      error ("Cannot get thread handle for LWP %d: %s",
>  	   lwpid, thread_db_err_str (err));
>  
> +  td_thrinfo_t ti;
>    err = thread_db->td_thr_get_info_p (&th, &ti);
>    if (err != TD_OK)
>      error ("Cannot get thread info for LWP %d: %s",
> -- 
> 2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/8] gdb: disable commit resumed in target_kill
  2022-11-17 19:42 ` [PATCH 8/8] gdb: disable commit resumed in target_kill Simon Marchi
@ 2022-11-18 13:33   ` Andrew Burgess
  2022-11-19  1:16     ` Simon Marchi
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Burgess @ 2022-11-18 13:33 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches, gdb-patches; +Cc: Simon Marchi

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> PR 28275 shows that doing a sequence of:
>
>  - Run inferior in background (run &)
>  - kill that inferior
>  - Run again
>
> We get into this assertion:
>
>     /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.
>
>     #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
>     #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
>     #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
>     #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
>     #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>         at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
>     #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>         at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
>     #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
>     #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526
>
> When running the kill command, commit_resumed_state for the
> process_stratum_target (linux-nat, here) is true.  After the kill, when
> there are no more threads, commit_resumed_state is still true, as
> nothing touches this flag during the kill operation.  During the
> subsequent run command, run_command_1 does:
>
>     scoped_disable_commit_resumed disable_commit_resumed ("running");
>
> We would think that this would clear the commit_resumed_state flag of
> our native target, but that's not the case, because
> scoped_disable_commit_resumed iterates on non-exited inferiors in order
> to find active process targets.  And after the kill, the inferior is
> exited, and the native target was unpushed from it anyway.  So
> scoped_disable_commit_resumed doesn't touch the commit_resumed_state
> flag of the native target, it stays true.  When reaching target_wait, in
> startup_inferior (to consume the initial expect stop events while the
> inferior is starting up and working its way through the shell),
> commit_resumed_state is true, breaking the contract saying that
> commit_resumed_state is always false when calling the targets' wait
> method.
>
> (note: to be correct, I think that startup_inferior should toggle
> commit_resumed between the target_wait and target_resume calls, but I'll
> ignore that for now)
>
> I can see multiple ways to fix this.  In the end, we need
> commit_resumed_state to be cleared by the time we get to that
> target_wait.  It could be done at the end of the kill command, or at the
> beginning of the run command.
>
> To keep things in a coherent state, I'd like to make it so that after
> the kill command, when the target is left with no threads, its
> commit_resumed_state flag is left to false.  This way, we can keep
> working with the assumption that a target with no threads (and therefore
> no running threads) has commit_resumed_state == false.
>
> Do this by adding a scoped_disable_commit_resumed in target_kill.  It
> clears the target's commit_resumed_state on entry, and leaves it false
> if the target does not have any resumed thread on exit.  That means,
> even if the target has another inferior with stopped threads,
> commit_resumed_state will be left to false, which makes sense.
>
> Add a test that tries to cover various combinations of actions done
> while an inferior is running (and therefore while commit_resumed_state
> is true on the process target).
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
> Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
> ---
>  gdb/target.c                                  |   8 +-
>  .../gdb.base/run-control-while-bg-execution.c |  33 +++++
>  .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
>  3 files changed, 157 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
>
> diff --git a/gdb/target.c b/gdb/target.c
> index 4a22885b82f..f5c6212310a 100644
> --- a/gdb/target.c
> +++ b/gdb/target.c
> @@ -907,8 +907,12 @@ add_deprecated_target_alias (const target_info &tinfo, const char *alias)
>  void
>  target_kill (void)
>  {
> -  current_inferior ()->top_target ()->kill ();
> -}
> +
> +  /* Ensure that, if commit resumed for the target is currently true (threads
> +     are running), if we kill the last running inferior, commit resumed ends up
> +     false.  */
> +   scoped_disable_commit_resumed disable ("killing"); current_inferior
> +   ()->top_target ()->kill (); }

I think something went wrong with the formatting here!

I first read this comment before reading the commit message, and,
initially, the comment didn't make any sense to me.

My thinking was, surely the scoped_disable_commit_resumed would set the
commit resumed state to false, but only for the duration of this scope.
When we leave the scope, the state will be set back to true...

...the subtlety of course, is that when we create the
scoped_disable_commit_resumed, the current inferior is non-exited, and
so has its state set to false, but, when we exit the scope, the current
inferior will be exited, so its commit resume state will be left false.

Personally, I think this is a non-obvious detail.  I think it would be
helpful if the comment explained this aspect a little more.

I'm also seeing something weird with the test when run with the
native-extended-gdbserver board, this warning:

  WARNING: Timed out waiting for EOF in server after monitor exit

I've not done a full investigation, but what seems to happen is GDB
sends the 'monitor exit', and gdbserver does indeed try to exit.  It
then looks like pthreads(?) is waiting on some threads?  Or some child
processes?  At the point gdbserver is not exiting, the stack looks like:

  #0  0x00007f7009660374 in wait () from /lib64/libpthread.so.0
  #1  0x000056146a26a858 in reap_children ()
  #2  0x000056146a26b21c in new_job ()
  #3  0x000056146a27740c in update_file ()
  #4  0x000056146a277d64 in update_goal_chain ()
  #5  0x000056146a25b893 in main ()

Do you also see this timeout?

Additionally, I have not (yet) been able to reproduce this when running
the test from the command line - in that case, gdbserver always exits
immediately after getting the 'monitor exit'.

Thanks,
Andrew

>  
>  void
>  target_load (const char *arg, int from_tty)
> diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.c b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
> new file mode 100644
> index 00000000000..8092fadc8b9
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.c
> @@ -0,0 +1,33 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2020-2022 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +					       //
> +#include <unistd.h>
> +
> +static pid_t mypid = -1;
> +
> +static void
> +after_getpid (void)
> +{
> +}
> +
> +int
> +main (void)
> +{
> +  mypid = getpid ();
> +  after_getpid ();
> +  sleep (30);
> +}
> diff --git a/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
> new file mode 100644
> index 00000000000..e637e2bfc98
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
> @@ -0,0 +1,118 @@
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This test aims at testing various operations after getting rid of an inferior
> +# that was running in background, or while we have an inferior running in
> +# background.  The original intent was to expose cases where the commit-resumed
> +# state of the process stratum target was not reset properly after killing an
> +# inferior running in background, which would be a problem when trying to run
> +# again.  The test was expanded to test various combinations of
> +# run-control-related actions done with an inferior running in background.
> +
> +if {[use_gdb_stub]} {
> +    unsupported "test requires running"
> +    return
> +}
> +
> +standard_testfile
> +
> +if {[build_executable "failed to prepare" $testfile $srcfile]} {
> +    return
> +}
> +
> +# Run one variation of the test:
> +#
> +# 1. Start an inferior in the background with "run &"
> +# 2. Do action 1
> +# 3. Do action 2
> +#
> +# Action 1 indicates what to do with the inferior running in background:
> +#
> +#  - kill: kill it
> +#  - detach: detach it
> +#  - add: add a new inferior and switch to it, leave the inferior running in
> +#    background alone
> +#  - none: do nothing, leave the inferior running in background alone
> +#
> +# Action 2 indicates what to do after that:
> +#
> +#  - start: use the start command
> +#  - run: use the run command
> +#  - attach: start a process outside of GDB and attach it
> +proc do_test { action1 action2 } {
> +    save_vars { ::GDBFLAGS } {
> +	append ::GDBFLAGS " -ex \"maintenance set target-non-stop on\""
> +	clean_restart $::binfile
> +    }
> +
> +    # Ensure we are at least after the getpid call, shall we need it.
> +    if { ![runto "after_getpid"] } {
> +	return
> +    }
> +
> +    # Some commands below ask for confirmation.  Turn that off for simplicity.
> +    gdb_test "set confirm off"
> +    gdb_test -no-prompt-anchor "continue &"
> +
> +    if { $action1 == "kill" } {
> +	gdb_test "kill" "Inferior 1 .* killed.*"
> +    } elseif { $action1 == "detach" } {
> +	set child_pid [get_integer_valueof "mypid" -1]
> +	if { $child_pid == -1 } {
> +	    fail "failed to extract child pid"
> +	    return
> +	}
> +
> +	gdb_test "detach" "Inferior 1 .* detached.*" "detach from first instance"
> +
> +	# Kill the detached process, otherwise that makes "monitor exit" hang
> +	# until the process disappears.
> +	#remote_exec target "kill $child_pid"
> +    } elseif { $action1 == "add" } {
> +	gdb_test "add-inferior -exec $::binfile" \
> +	    "Added inferior 2 on connection 1.*" "add-inferior"
> +	gdb_test "inferior 2" "Switching to inferior 2 .*"
> +    } elseif { $action1 == "none" } {
> +
> +    } else {
> +	error "invalid action 1"
> +    }
> +
> +    if { $action2 == "start" } {
> +	gdb_test "start" "Temporary breakpoint $::decimal, main .*"
> +    } elseif { $action2 == "run" } {
> +	gdb_test "break main" "Breakpoint $::decimal at $::hex.*"
> +	gdb_test "run" "Breakpoint $::decimal, main .*"
> +    } elseif { $action2 == "attach" } {
> +	set test_spawn_id [spawn_wait_for_attach $::binfile]
> +	set test_pid [spawn_id_get_pid $test_spawn_id]
> +
> +	if { [gdb_attach $test_pid] } {
> +	    gdb_test "detach" "Inferior $::decimal .* detached.*" \
> +		"detach from second instance"
> +	}
> +
> +	# Detach and kill this inferior so we don't leave it around.
> +	kill_wait_spawned_process $test_spawn_id
> +    } else {
> +	error "invalid action 2"
> +    }
> +}
> +
> +foreach_with_prefix action1 { kill detach add none } {
> +    foreach_with_prefix action2 { start run attach } {
> +	do_test $action1 $action2
> +    }
> +}
> -- 
> 2.37.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp
  2022-11-18  8:30   ` Aktemur, Tankut Baris
@ 2022-11-18 15:07     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-18 15:07 UTC (permalink / raw)
  To: Aktemur, Tankut Baris, gdb-patches

On 11/18/22 03:30, Aktemur, Tankut Baris wrote:
> Hi Simon,
> 
> On Thursday, November 17, 2022 8:43 PM, Simon Marchi wrote:
>> Before doing further changes to this file, change to use the :: notation
>> instead of declaring global variables with the `global` keyword.
> 
> Could you please give more clarification about why this is needed?
> I fear that this could lead to divergence in the testsuite code-style,
> where some files use the global keyword and others use the `::` notation,
> similar to the inconsistency of the use of the 'then' keyword [1].

Hi Baris,

This is not mandatory at all, just a cleanup.  I've been suggesting to
people for a while to use the :: notation instead of the global keyword,
because I believe it makes things clearer, it avoids leaving stale
global declarations, and it's just less lines of code.  Since I use that
notation whenever I make changes, I thought I would do a preparatory
patch to change the test to use it, so that we don't end up with two
styles in the same file, and I don't inter-mix orthogonal changes in one
patch.

That being said, it's true that it creates mixed styles in the testsuite
(it is already the case, both styles are used).  Unfortunately, it's not
an as simple fix as the "then" thing, I don't think it can be a mass
change done by regexes.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid
  2022-11-18 11:28   ` Andrew Burgess
@ 2022-11-18 16:09     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-18 16:09 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi via Gdb-patches; +Cc: Simon Marchi



On 11/18/22 06:28, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> Just a small optimization, it's not necessary to recompute lwp at each
>> iteration.
>>
>> While at it, change the variable type to long, as ptid_t::lwp returns a
>> long.
>>
>> Change-Id: I181670ce1f90b59cb09ea4899367750be2ad9105
> 
> LGTM.
> 
> Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks, I pushed this patch.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter
  2022-11-18 11:32   ` Andrew Burgess
@ 2022-11-18 16:12     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-18 16:12 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi via Gdb-patches; +Cc: Simon Marchi



On 11/18/22 06:32, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> ps_get_thread_area receives as a parameter the lwpid it must work on.
>> It then calls is_64bit_tdesc, which uses the current_thread as the
>> thread to work on.  However, it is not said that both are the same.
>>
>> This became a problem when working in a following patch that pmakes
> 
> s/pmakes/makes/

Fixed.

>> find_one_thread switch to a process but to no thread (current_thread ==
>> nullptr).  When libthread_db needed to get the thread area,
>> is_64bit_tdesc would try to get the regcache of a nullptr thread.
>>
>> Fix that by making is_64bit_tdesc accept the thread to work on as a
>> parameter.  Find the right thread from the context, when possible (when
>> we know the lwpid to work on).  Otherwise, pass "current_thread", to
>> retain the existing behavior.
>>
>> Change-Id: I44394d6be92392fa28de71982fd04517ce8a3007
>> ---
>>  gdbserver/linux-x86-low.cc | 27 +++++++++++++++------------
>>  1 file changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/gdbserver/linux-x86-low.cc b/gdbserver/linux-x86-low.cc
>> index d2b55f6f0d2..c98a7a461fe 100644
>> --- a/gdbserver/linux-x86-low.cc
>> +++ b/gdbserver/linux-x86-low.cc
>> @@ -275,9 +275,9 @@ static /*const*/ int i386_regmap[] =
>>     per the tdesc.  */
>>  
>>  static int
>> -is_64bit_tdesc (void)
>> +is_64bit_tdesc (thread_info *thread)
> 
> I think the comment for this function needs updating, the reference to
> "current inferior" probably needs to be replaced with THREAD.

Indeed, fixed.

> 
> With those changes, LGTM.
> 
> Reviewed-By: Andrew Burgess <aburgess@redhat.com>

Thanks, pushed.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/8] gdbserver: use current_process in ps_getpid
  2022-11-18 11:33   ` Andrew Burgess
@ 2022-11-18 16:21     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-18 16:21 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi via Gdb-patches; +Cc: Simon Marchi



On 11/18/22 06:33, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> The following patch ("gdbserver: switch to right process in
>> find_one_thread") makes it so find_one_thread calls into libthread_db
>> with a current process but no current thread.  This tripped on ps_getpid
>> using current_thread in order to get the process' pid.  Get the pid from
>> `current_process ()` instead, which removes the need to have a current
>> thread.  Eventually, it would be good to get it from the
>> gdb_ps_prochandle_t structure, to avoid the need for a current process
>> as well.
> 
> LGTM.
> 
> Reviewed-By: Andrew Burgess <aburgess@redhat.com>
> 
> Thanks,
> Andrew

Thanks, pushed.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 7/8] gdbserver: switch to right process in find_one_thread
  2022-11-18 13:19   ` Andrew Burgess
@ 2022-11-18 17:34     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-18 17:34 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi via Gdb-patches; +Cc: Simon Marchi



On 11/18/22 08:19, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> When I do this:
>>
>>     $ ./gdb -nx --data-directory=data-directory -q \
>>         /bin/sleep \
>> 	-ex "maint set target-non-stop on" \
>> 	-ex "tar ext :1234" \
>> 	-ex "set remote exec-file /bin/sleep" \
>> 	-ex "run 1231 &" \
>> 	-ex add-inferior \
>> 	-ex "inferior 2"
>>     Reading symbols from /bin/sleep...
>>     (No debugging symbols found in /bin/sleep)
>>     Remote debugging using :1234
>>     Starting program: /bin/sleep 1231
>>     Reading /lib64/ld-linux-x86-64.so.2 from remote target...
>>     warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
>>     Reading /lib64/ld-linux-x86-64.so.2 from remote target...
>>     Reading /usr/lib/debug/.build-id/a6/7a1408f18db3576757eea210d07ba3fc560dff.debug from remote target...
>>     [New inferior 2]
>>     Added inferior 2 on connection 1 (extended-remote :1234)
>>     [Switching to inferior 2 [<null>] (<noexec>)]
>>     (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target...
>>     attach 3659848
>>     Attaching to process 3659848
>>     /home/smarchi/src/binutils-gdb/gdb/thread.c:85: internal-error: inferior_thread: Assertion `current_thread_ != nullptr' failed.
>>
>> The internal error of GDB is actually caused by GDBserver crashing, and
>> the error recovery of GDB is not on point.  This patch aims to fix just
>> the GDBserver crash, not the GDB problem.
>>
>> GDBserver crashes with a segfault here:
>>
>>     (gdb) bt
>>     #0  0x00005555557fb3f4 in find_one_thread (ptid=...) at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:177
>>     #1  0x00005555557fd5cf in thread_db_thread_handle (ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400, handle_len=0x7fffffffc3f0)
>>         at /home/smarchi/src/binutils-gdb/gdbserver/thread-db.cc:461
>>     #2  0x000055555578a0b6 in linux_process_target::thread_handle (this=0x5555558a64c0 <the_x86_target>, ptid=<error reading variable: Cannot access memory at address 0xffffffffffffffa0>, handle=0x7fffffffc400,
>>         handle_len=0x7fffffffc3f0) at /home/smarchi/src/binutils-gdb/gdbserver/linux-low.cc:6905
>>     #3  0x00005555556dfcc6 in handle_qxfer_threads_worker (thread=0x60b000000510, buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1645
>>     #4  0x00005555556e00e6 in operator() (__closure=0x7fffffffc5e0, thread=0x60b000000510) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1696
>>     #5  0x00005555556f54be in for_each_thread<handle_qxfer_threads_proper(buffer*)::<lambda(thread_info*)> >(struct {...}) (func=...) at /home/smarchi/src/binutils-gdb/gdbserver/gdbthread.h:159
>>     #6  0x00005555556e0242 in handle_qxfer_threads_proper (buffer=0x7fffffffc8a0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1694
>>     #7  0x00005555556e04ba in handle_qxfer_threads (annex=0x629000000213 "", readbuf=0x621000019100 '\276' <repeats 200 times>..., writebuf=0x0, offset=0, len=4097)
>>         at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:1732
>>     #8  0x00005555556e1989 in handle_qxfer (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2045
>>     #9  0x00005555556e720a in handle_query (own_buf=0x629000000200 "qXfer:threads", packet_len=26, new_packet_len_p=0x7fffffffd630) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:2685
>>     #10 0x00005555556f1a01 in process_serial_event () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4176
>>     #11 0x00005555556f4457 in handle_serial_event (err=0, client_data=0x0) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4514
>>     #12 0x0000555555820f56 in handle_file_event (file_ptr=0x607000000250, ready_mask=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:573
>>     #13 0x0000555555821895 in gdb_wait_for_event (block=1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:694
>>     #14 0x000055555581f533 in gdb_do_one_event (mstimeout=-1) at /home/smarchi/src/binutils-gdb/gdbsupport/event-loop.cc:264
>>     #15 0x00005555556ec9fb in start_event_loop () at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3512
>>     #16 0x00005555556f0769 in captured_main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:3992
>>     #17 0x00005555556f0e3f in main (argc=4, argv=0x7fffffffe0d8) at /home/smarchi/src/binutils-gdb/gdbserver/server.cc:4078
>>
>> The reason is a wrong current process when find_one_thread is called.
>> The current process is the 2nd one, which was just attached.  It does
>> not yet have thread_db data (proc->priv->thread_db is nullptr).  As we
>> iterate on all threads of all process to fulfull the qxfer:threads:read
>> request, we get to a thread of process 1 for which we haven't read
>> thread_db information yet (lwp_info::thread_known is false), so we get
>> into find_one_thread.  find_one_thread uses
>> `current_process ()->priv->thread_db`, assuming the current process
>> matches the ptid passed as a parameter, which is wrong.  A segfault
>> happens when trying to dereference that thread_db pointer.
>>
>> Fix this by making find_one_thread not assume what the current process /
>> current thread is.  If it needs to call into libthread_db, which we know
>> will try to read memory from the current process, then temporarily set
>> the current process.
>>
>> In the case where the thread is already know and we return early, we
>> don't need to switch process.
>>
>> I hit this case when running the test included with the following patch,
>> "gdb: disable commit resumed in target_kill", so the fix is exercised by
>> that test.
> 
> I'm not able to reproduce the failure you describe.  I've applied this
> series except for this patch, and run the test from patch #8 with
> native-extended-gdbserver board, and it passes just fine.
> 
> Actually, I do see an issue with the test, but that issue doesn't seem
> to be related to this problem, and is present with and without this
> patch - I'll reply to patch #8 describing that issue.
> 
> Is this bug intermittent?  Or is it likely to depend on certain parts of
> the environment?  I got the impression from the description that it if
> we did the steps in the right order then we'd get a nullptr dereference,
> which didn't feel like an intermittent issue, but maybe I don't
> understand correctly.

The crash reproduces all the time for me with the instructions provided
in the commit message, I think it's deterministic.  However, I got
confused by my own instructions :P.  There's an "attach" command buried
in there that is easy to miss.  It's on that attach that things fail.
If I put that attach on the command line, I don't get the crash (more on
this later).

I looked into it a bit more, it takes a relatively precise context for
this to reproduce:

 - The first process must be far enough to have loaded its libc or
   libpthread (whatever triggers the loading of libthread_db), such that
   its proc->priv->thread_db is not nullptr
 - However, its lwp must still be in the `!lwp->thread_known` state,
   meaning GDBserver hasn't asked libthread_db to compute the thread
   handle yet.  That means, GDB must not have refreshed the thread list
   yet, since that would cause the thread handles to be computed.  That
   means, no stopping on a breakpoint, since that causes a thread list
   update.  That's why the first inferior needs to be started with "run
   &".  It hits some internal breakpoints when shared libraries are
   loaded, GDBserver asks for the symbols necessary to load
   libthread_db, but GDB never asks for a full thread list.
 - The attach then causes GDB to ask for a thread list update here:

     #18 0x000055ac22438062 in remote_target::update_thread_list (this=0x61700003d800) at /home/simark/src/binutils-gdb/gdb/remote.c:3946
     #19 0x000055ac224445df in extended_remote_target::attach (this=0x61700003d800, args=0x602000077950 "936019", from_tty=1) at /home/simark/src/binutils-gdb/gdb/remote.c:6166
     #20 0x000055ac21e1ffe2 in attach_command (args=0x602000077950 "936019", from_tty=1) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2644

   At this point, the current process is the second one (the one we
   attach to).  Since we are early in the attach process, this process'
   proc->priv->thread_db is still nullptr.  In thread_db_thread_handle,
   while handling a thread from the first process, we determined
   (correctly) that this thread's process uses thread_db but this
   thread's handle is not know yet.  But then in find_one_thread, we use
   the current process, the second process, and try to dereference its
   nullptr proc->priv->thread_db.

If I put the attach on the command line, GDB doesn't go back to the
event loop between the "run &" and the "attach", so it won't handle
events for the first process, not to the qSymbol dance, so the first
process' thread_db is still nullptr when doing the attach.  And we don't
see the bug because thread_db_thread_handle will return early and not
call find_one_thread.

Ah, I know now why you don't see the crash when running the test without
this patch applied.  It's because I later modified the test to grab the
first process' pid.  To do so, it runs to some spot (after a getpid
call), reads a variable and then does "continue &".  The breakpoint stop
causes a thread list update, and breaks the second condition listed
above.

I will make a dedicated test for this specific bug then and include it
in a v2 for this patch.

> 
> That said, the change itself looks reasonable - but it would be nice to
> know why I can't reproduce the failure.

You're right to ask, it made me look at it more and understand the
conditions more clearly.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 8/8] gdb: disable commit resumed in target_kill
  2022-11-18 13:33   ` Andrew Burgess
@ 2022-11-19  1:16     ` Simon Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Simon Marchi @ 2022-11-19  1:16 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi via Gdb-patches; +Cc: Simon Marchi



On 11/18/22 08:33, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> PR 28275 shows that doing a sequence of:
>>
>>  - Run inferior in background (run &)
>>  - kill that inferior
>>  - Run again
>>
>> We get into this assertion:
>>
>>     /home/smarchi/src/binutils-gdb/gdb/target.c:2590: internal-error: target_wait: Assertion `!proc_target->commit_resumed_state' failed.
>>
>>     #0  internal_error_loc (file=0x5606b344e740 "/home/smarchi/src/binutils-gdb/gdb/target.c", line=2590, fmt=0x5606b344d6a0 "%s: Assertion `%s' failed.") at /home/smarchi/src/binutils-gdb/gdbsupport/errors.cc:54
>>     #1  0x00005606b6296475 in target_wait (ptid=..., status=0x7fffb9390630, options=...) at /home/smarchi/src/binutils-gdb/gdb/target.c:2590
>>     #2  0x00005606b5767a98 in startup_inferior (proc_target=0x5606bfccb2a0 <the_amd64_linux_nat_target>, pid=3884857, ntraps=1, last_waitstatus=0x0, last_ptid=0x0) at /home/smarchi/src/binutils-gdb/gdb/nat/fork-inferior.c:482
>>     #3  0x00005606b4e6c9c5 in gdb_startup_inferior (pid=3884857, num_traps=1) at /home/smarchi/src/binutils-gdb/gdb/fork-child.c:132
>>     #4  0x00005606b50f14a5 in inf_ptrace_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>>         at /home/smarchi/src/binutils-gdb/gdb/inf-ptrace.c:105
>>     #5  0x00005606b53b6d23 in linux_nat_target::create_inferior (this=0x5606bfccb2a0 <the_amd64_linux_nat_target>, exec_file=0x604000039f50 "/home/smarchi/build/binutils-gdb/gdb/test", allargs="", env=0x61500000a580, from_tty=1)
>>         at /home/smarchi/src/binutils-gdb/gdb/linux-nat.c:978
>>     #6  0x00005606b512b79b in run_command_1 (args=0x0, from_tty=1, run_how=RUN_NORMAL) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:468
>>     #7  0x00005606b512c236 in run_command (args=0x0, from_tty=1) at /home/smarchi/src/binutils-gdb/gdb/infcmd.c:526
>>
>> When running the kill command, commit_resumed_state for the
>> process_stratum_target (linux-nat, here) is true.  After the kill, when
>> there are no more threads, commit_resumed_state is still true, as
>> nothing touches this flag during the kill operation.  During the
>> subsequent run command, run_command_1 does:
>>
>>     scoped_disable_commit_resumed disable_commit_resumed ("running");
>>
>> We would think that this would clear the commit_resumed_state flag of
>> our native target, but that's not the case, because
>> scoped_disable_commit_resumed iterates on non-exited inferiors in order
>> to find active process targets.  And after the kill, the inferior is
>> exited, and the native target was unpushed from it anyway.  So
>> scoped_disable_commit_resumed doesn't touch the commit_resumed_state
>> flag of the native target, it stays true.  When reaching target_wait, in
>> startup_inferior (to consume the initial expect stop events while the
>> inferior is starting up and working its way through the shell),
>> commit_resumed_state is true, breaking the contract saying that
>> commit_resumed_state is always false when calling the targets' wait
>> method.
>>
>> (note: to be correct, I think that startup_inferior should toggle
>> commit_resumed between the target_wait and target_resume calls, but I'll
>> ignore that for now)
>>
>> I can see multiple ways to fix this.  In the end, we need
>> commit_resumed_state to be cleared by the time we get to that
>> target_wait.  It could be done at the end of the kill command, or at the
>> beginning of the run command.
>>
>> To keep things in a coherent state, I'd like to make it so that after
>> the kill command, when the target is left with no threads, its
>> commit_resumed_state flag is left to false.  This way, we can keep
>> working with the assumption that a target with no threads (and therefore
>> no running threads) has commit_resumed_state == false.
>>
>> Do this by adding a scoped_disable_commit_resumed in target_kill.  It
>> clears the target's commit_resumed_state on entry, and leaves it false
>> if the target does not have any resumed thread on exit.  That means,
>> even if the target has another inferior with stopped threads,
>> commit_resumed_state will be left to false, which makes sense.
>>
>> Add a test that tries to cover various combinations of actions done
>> while an inferior is running (and therefore while commit_resumed_state
>> is true on the process target).
>>
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28275
>> Change-Id: I8e6fe6dc1f475055921520e58cab68024039a1e9
>> ---
>>  gdb/target.c                                  |   8 +-
>>  .../gdb.base/run-control-while-bg-execution.c |  33 +++++
>>  .../run-control-while-bg-execution.exp        | 118 ++++++++++++++++++
>>  3 files changed, 157 insertions(+), 2 deletions(-)
>>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.c
>>  create mode 100644 gdb/testsuite/gdb.base/run-control-while-bg-execution.exp
>>
>> diff --git a/gdb/target.c b/gdb/target.c
>> index 4a22885b82f..f5c6212310a 100644
>> --- a/gdb/target.c
>> +++ b/gdb/target.c
>> @@ -907,8 +907,12 @@ add_deprecated_target_alias (const target_info &tinfo, const char *alias)
>>  void
>>  target_kill (void)
>>  {
>> -  current_inferior ()->top_target ()->kill ();
>> -}
>> +
>> +  /* Ensure that, if commit resumed for the target is currently true (threads
>> +     are running), if we kill the last running inferior, commit resumed ends up
>> +     false.  */
>> +   scoped_disable_commit_resumed disable ("killing"); current_inferior
>> +   ()->top_target ()->kill (); }
> 
> I think something went wrong with the formatting here!

Woops, fixed.

> 
> I first read this comment before reading the commit message, and,
> initially, the comment didn't make any sense to me.
> 
> My thinking was, surely the scoped_disable_commit_resumed would set the
> commit resumed state to false, but only for the duration of this scope.
> When we leave the scope, the state will be set back to true...
> 
> ...the subtlety of course, is that when we create the
> scoped_disable_commit_resumed, the current inferior is non-exited, and
> so has its state set to false, but, when we exit the scope, the current
> inferior will be exited, so its commit resume state will be left false.
> 
> Personally, I think this is a non-obvious detail.  I think it would be
> helpful if the comment explained this aspect a little more.

I agree, trying to fit it all in a single sentence didn't help.  What
about:

  /* If the commit_resume_state of the to-be-killed-inferior's process stratum
     is true, and this inferior is the last live inferior with resumed threads
     of that target, then we want to leave commit_resume_state to false, as the
     target won't have any resumed threads anymore.  We achieve this with
     this scoped_disable_commit_resumed.  On construction, it will set the flag
     to false.  On destruction, it will only set it to true if there are resumed
     threads left.  */

> 
> I'm also seeing something weird with the test when run with the
> native-extended-gdbserver board, this warning:
> 
>   WARNING: Timed out waiting for EOF in server after monitor exit
> 
> I've not done a full investigation, but what seems to happen is GDB
> sends the 'monitor exit', and gdbserver does indeed try to exit.  It
> then looks like pthreads(?) is waiting on some threads?  Or some child
> processes?  At the point gdbserver is not exiting, the stack looks like:
> 
>   #0  0x00007f7009660374 in wait () from /lib64/libpthread.so.0
>   #1  0x000056146a26a858 in reap_children ()
>   #2  0x000056146a26b21c in new_job ()
>   #3  0x000056146a27740c in update_file ()
>   #4  0x000056146a277d64 in update_goal_chain ()
>   #5  0x000056146a25b893 in main ()

I believe this is a stack trace of the make process, probably the one
invoking dejagnu / expect.

> 
> Do you also see this timeout?
> 
> Additionally, I have not (yet) been able to reproduce this when running
> the test from the command line - in that case, gdbserver always exits
> immediately after getting the 'monitor exit'.

Ah, funny that you dug into that too :).  I noticed it and had a
discussion about that with Pedro.  Here's what we found.  Doing "monitor
exit" does indeed make gdbserver exit, after which (during the delay) it
is zombie (it shows up as <defunct> in ps).

We expect this "eof" expect clause to trigger:

  https://gitlab.com/gnutools/binutils-gdb/-/blob/5e219e0f46055281cfbc9351a3d27a05841be34d/gdb/testsuite/lib/gdbserver-support.exp#L476

Our theory is that the "eof" clause triggers when the pty controlling
GDBserver returns EOF, that is when nobody is connected anymore on the
slave side.  However, there is a child process that is detached during
the test that is still alive, that is still connected to the pty.  So
eof never triggers, we hang until expect gives up.

Actually, I have this workaround in the patch:

+	# Kill the detached process, otherwise that makes "monitor exit" hang
+	# until the process disappears.
+	#remote_exec target "kill $child_pid"

I commented it out to investigate the issue, and forgot to uncomment it.
I'll uncomment it.

I think it would work fine to do the "wait" right after sending the
"monitor exit", just like this:

    if { $is_mi } {
	set monitor_exit "-interpreter-exec console \"monitor exit\""
    } else {
	set monitor_exit "monitor exit"
    }
    gdb_test -nopass "$monitor_exit" "" "monitor exit"
    wait -i $server_spawn_id

We reap gdbserver even if the detach process is still alive.  Worst
case, the pty and detached process stay there until the detached process
exits by itself (which generally happens with our testsuite programs, we
try not to write infinite loops).  The problem is that there is no
timeout mechanism, so it could hang forever.  Imagine the "monitor exit"
fails for some reason, or GDBserver is stuck in a loop while trying to
exit.  That's the point of the current code.

Note that this isn't unique to my test.  I tried putting a fail in the
eof clause and ran the testsuite, I had about 20 such failures when
running the native-extended-gdbserver board.

Maybe the solution is just to make sure the tests don't leave detached
process hanging around.  We can put a fail in there and fix the
offending tests.  Any new test that leaves child processes running will
be easy to spot.

Simon

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2022-11-19  1:16 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-17 19:42 [PATCH 0/8] Fix some commit_resumed_state assertion failures (PR 28275) Simon Marchi
2022-11-17 19:42 ` [PATCH 1/8] gdb/testsuite: remove global declarations in gdb.threads/detach-step-over.exp Simon Marchi
2022-11-18  8:30   ` Aktemur, Tankut Baris
2022-11-18 15:07     ` Simon Marchi
2022-11-17 19:42 ` [PATCH 2/8] gdb/testsuite: refactor gdb.threads/detach-step-over.exp Simon Marchi
2022-11-17 19:42 ` [PATCH 3/8] gdb: fix assert when quitting GDB while a thread is stepping Simon Marchi
2022-11-17 19:42 ` [PATCH 4/8] gdbserver/linux: take condition out of callback in find_lwp_pid Simon Marchi
2022-11-18 11:28   ` Andrew Burgess
2022-11-18 16:09     ` Simon Marchi
2022-11-17 19:42 ` [PATCH 5/8] gdbserver/linux-x86: make is_64bit_tdesc accept thread as a parameter Simon Marchi
2022-11-18 11:32   ` Andrew Burgess
2022-11-18 16:12     ` Simon Marchi
2022-11-17 19:42 ` [PATCH 6/8] gdbserver: use current_process in ps_getpid Simon Marchi
2022-11-18 11:33   ` Andrew Burgess
2022-11-18 16:21     ` Simon Marchi
2022-11-17 19:42 ` [PATCH 7/8] gdbserver: switch to right process in find_one_thread Simon Marchi
2022-11-18 13:19   ` Andrew Burgess
2022-11-18 17:34     ` Simon Marchi
2022-11-17 19:42 ` [PATCH 8/8] gdb: disable commit resumed in target_kill Simon Marchi
2022-11-18 13:33   ` Andrew Burgess
2022-11-19  1:16     ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).