public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Don't stop all threads prematurely after first step of "step N"
@ 2022-07-18 18:54 Pedro Alves
  2022-07-19 19:33 ` Simon Marchi
  0 siblings, 1 reply; 2+ messages in thread
From: Pedro Alves @ 2022-07-18 18:54 UTC (permalink / raw)
  To: gdb-patches

In all-stop mode, when the target is itself in non-stop mode (like
GNU/Linux), if you use the "step N" (or "stepi/next/nexti N") to step
a thread a number of times:

 (gdb) help step
 step, s
 Step program until it reaches a different source line.
 Usage: step [N]
 Argument N means step N times (or till program stops for another reason).

... GDB prematurely stops all threads after the first step, and
doesn't re-resume them for the subsequent N-1 steps.  It's as if for
the 2nd and subsequent steps, the command was running with
scheduler-locking enabled.

This can be observed with the testcase added by this commit, which
looks like this:

 static pthread_barrier_t barrier;

 static void *
 thread_func (void *arg)
 {
   pthread_barrier_wait (&barrier);
   return NULL;
 }

 int
 main ()
 {
   pthread_t thread;
   int ret;

   pthread_barrier_init (&barrier, NULL, 2);

   /* We run to this line below, and then issue "next 3".  That should
      step over the 3 lines below and land on the return statement.  If
      GDB prematurely stops the thread_func thread after the first of
      the 3 nexts (and never resumes it again), then the join won't
      ever return.  */
   pthread_create (&thread, NULL, thread_func, NULL); /* set break here */
   pthread_barrier_wait (&barrier);
   pthread_join (thread, NULL);

   return 0;
 }

The test hangs and times out without the GDB fix:

 (gdb) next 3
 [New Thread 0x7ffff7d89700 (LWP 525772)]
 FAIL: gdb.threads/step-N-all-progress.exp: non-stop=off: target-non-stop=on: next 3 (timeout)

The problem is a core gdb bug.

When you do "step/stepi/next/nexti N", GDB internally creates a
thread_fsm object and associates it with the stepping thread.  For the
stepping commands, the FSM's class is step_command_fsm.  That object
is what keeps track of how many steps are left to make.  When one step
finishes, handle_inferior_event calls stop_waiting and returns, and
then fetch_inferior_event calls the "should_stop" method of the event
thread's FSM.  The implementation of that method decrements the
steps-left counter.  If the counter is 0, it returns true and we
proceed to presenting the stop to the user.  If it isn't 0 yet, then
the method returns false, indicating to fetch_inferior_event to "keep
going".

Focusing now on when the first step finishes -- we're in "all-stop"
mode, with the target in non-stop mode.  When a step finishes,
handle_inferior_event calls stop_waiting, which itself calls
stop_all_threads to stop everything.  I.e., after the first step
completes, all threads are stopped, before handle_inferior_event
returns.  And after that, now in fetch_inferior_event, we consult the
thread's thread_fsm::should_stop, which as we've seen, for the first
step returns false -- i.e., we need to keep_going for another step.
However, since the target is in non-stop mode, keep_going resumes
_only_ the current thread.  All the other threads remain stopped,
inadvertently.

If the target is in non-stop mode, we don't actually need to stop all
threads right after each first step finishes, and then re-resume them
again.  We can instead defer stopping all threads until all the steps
are completed.

So fix this by delaying the stopping of all threads until after we
called the FSM's "should_stop" method.  I.e., move it from
stop_waiting, to handle_inferior_events's callers,
fetch_inferior_event and wait_for_inferior.

New test included.  Tested on x86-64 GNU/Linux native and gdbserver.

Change-Id: Iaad50dcfea4464c84bdbac853a89df92ade6ae01
---
 gdb/infrun.c                                  | 19 ++++--
 .../gdb.threads/step-N-all-progress.c         | 49 +++++++++++++++
 .../gdb.threads/step-N-all-progress.exp       | 59 +++++++++++++++++++
 3 files changed, 122 insertions(+), 5 deletions(-)
 create mode 100644 gdb/testsuite/gdb.threads/step-N-all-progress.c
 create mode 100644 gdb/testsuite/gdb.threads/step-N-all-progress.exp

diff --git a/gdb/infrun.c b/gdb/infrun.c
index a6694230d29..a59cbe64537 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3971,6 +3971,16 @@ prepare_for_detach (void)
     }
 }
 
+/* If all-stop, but there exists a non-stop target, stop all threads
+   now that we're presenting the stop to the user.  */
+
+static void
+stop_all_threads_if_all_stop_mode ()
+{
+  if (!non_stop && exists_non_stop_target ())
+    stop_all_threads ("presenting stop to user in all-stop");
+}
+
 /* Wait for control to return from inferior to debugger.
 
    If inferior gets a signal, we may decide to start it up again
@@ -4017,6 +4027,8 @@ wait_for_inferior (inferior *inf)
 	break;
     }
 
+  stop_all_threads_if_all_stop_mode ();
+
   /* No error, don't finish the state yet.  */
   finish_state.release ();
 }
@@ -4240,6 +4252,8 @@ fetch_inferior_event ()
 	    bool should_notify_stop = true;
 	    int proceeded = 0;
 
+	    stop_all_threads_if_all_stop_mode ();
+
 	    clean_up_just_stopped_threads_fsms (ecs);
 
 	    if (thr != nullptr && thr->thread_fsm () != nullptr)
@@ -8138,11 +8152,6 @@ stop_waiting (struct execution_control_state *ecs)
 
   /* Let callers know we don't want to wait for the inferior anymore.  */
   ecs->wait_some_more = 0;
-
-  /* If all-stop, but there exists a non-stop target, stop all
-     threads now that we're presenting the stop to the user.  */
-  if (!non_stop && exists_non_stop_target ())
-    stop_all_threads ("presenting stop to user in all-stop");
 }
 
 /* Like keep_going, but passes the signal to the inferior, even if the
diff --git a/gdb/testsuite/gdb.threads/step-N-all-progress.c b/gdb/testsuite/gdb.threads/step-N-all-progress.c
new file mode 100644
index 00000000000..26de9bcd3ed
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-N-all-progress.c
@@ -0,0 +1,49 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+#include <assert.h>
+#include <stdlib.h>
+
+static pthread_barrier_t barrier;
+
+static void *
+thread_func (void *arg)
+{
+  pthread_barrier_wait (&barrier);
+  return NULL;
+}
+
+int
+main ()
+{
+  pthread_t thread;
+  int ret;
+
+  pthread_barrier_init (&barrier, NULL, 2);
+
+  /* We run to this line below, and then issue "next 3".  That should
+     step over the 3 lines below and land on the return statement.  If
+     GDB prematurely stops the thread_func thread after the first of
+     the 3 nexts (and never resumes it again), then the join won't
+     ever return.  */
+  pthread_create (&thread, NULL, thread_func, NULL); /* set break here */
+  pthread_barrier_wait (&barrier);
+  pthread_join (thread, NULL);
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.threads/step-N-all-progress.exp b/gdb/testsuite/gdb.threads/step-N-all-progress.exp
new file mode 100644
index 00000000000..b1d1ba75b67
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/step-N-all-progress.exp
@@ -0,0 +1,59 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that with "next N", all threads are resumed at each of the N
+# steps.  GDB used to have a bug where in target-non-stop targets, and
+# in all-stop mode, after the first "next" (or "step/stepi/nexti"),
+# GDB would prematurely stop all threads.  For the subsequent N-1
+# steps, only the stepped thread would continue running, while all the
+# other threads would remain stopped.
+
+standard_testfile
+
+if { [build_executable "failed to prepare" $testfile \
+	  $srcfile {debug pthreads}] == -1 } {
+    return
+}
+
+proc test {non-stop target-non-stop} {
+    save_vars ::GDBFLAGS {
+	append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
+	append ::GDBFLAGS " -ex \"set non-stop ${non-stop}\""
+	clean_restart $::binfile
+    }
+
+    if { ![runto_main] } {
+	return
+    }
+
+    set lineno [gdb_get_line_number "set break here"]
+
+    gdb_breakpoint $lineno
+
+    gdb_continue_to_breakpoint "break here"
+
+    gdb_test "next 3" "return 0;"
+}
+
+foreach_with_prefix non-stop {off on} {
+    foreach_with_prefix target-non-stop {off on} {
+	if {${non-stop} == "on" && ${target-non-stop} == "off"} {
+	    # Invalid combination.
+	    continue
+	}
+
+	test ${non-stop} ${target-non-stop}
+    }
+}

base-commit: 23948f56021f46bb2bdee7afad074aafe8329230
-- 
2.36.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] Don't stop all threads prematurely after first step of "step N"
  2022-07-18 18:54 [PATCH] Don't stop all threads prematurely after first step of "step N" Pedro Alves
@ 2022-07-19 19:33 ` Simon Marchi
  0 siblings, 0 replies; 2+ messages in thread
From: Simon Marchi @ 2022-07-19 19:33 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches



On 2022-07-18 14:54, Pedro Alves wrote:
> In all-stop mode, when the target is itself in non-stop mode (like
> GNU/Linux), if you use the "step N" (or "stepi/next/nexti N") to step
> a thread a number of times:
> 
>  (gdb) help step
>  step, s
>  Step program until it reaches a different source line.
>  Usage: step [N]
>  Argument N means step N times (or till program stops for another reason).
> 
> ... GDB prematurely stops all threads after the first step, and
> doesn't re-resume them for the subsequent N-1 steps.  It's as if for
> the 2nd and subsequent steps, the command was running with
> scheduler-locking enabled.
> 
> This can be observed with the testcase added by this commit, which
> looks like this:
> 
>  static pthread_barrier_t barrier;
> 
>  static void *
>  thread_func (void *arg)
>  {
>    pthread_barrier_wait (&barrier);
>    return NULL;
>  }
> 
>  int
>  main ()
>  {
>    pthread_t thread;
>    int ret;
> 
>    pthread_barrier_init (&barrier, NULL, 2);
> 
>    /* We run to this line below, and then issue "next 3".  That should
>       step over the 3 lines below and land on the return statement.  If
>       GDB prematurely stops the thread_func thread after the first of
>       the 3 nexts (and never resumes it again), then the join won't
>       ever return.  */
>    pthread_create (&thread, NULL, thread_func, NULL); /* set break here */
>    pthread_barrier_wait (&barrier);
>    pthread_join (thread, NULL);
> 
>    return 0;
>  }
> 
> The test hangs and times out without the GDB fix:
> 
>  (gdb) next 3
>  [New Thread 0x7ffff7d89700 (LWP 525772)]
>  FAIL: gdb.threads/step-N-all-progress.exp: non-stop=off: target-non-stop=on: next 3 (timeout)
> 
> The problem is a core gdb bug.
> 
> When you do "step/stepi/next/nexti N", GDB internally creates a
> thread_fsm object and associates it with the stepping thread.  For the
> stepping commands, the FSM's class is step_command_fsm.  That object
> is what keeps track of how many steps are left to make.  When one step
> finishes, handle_inferior_event calls stop_waiting and returns, and
> then fetch_inferior_event calls the "should_stop" method of the event
> thread's FSM.  The implementation of that method decrements the
> steps-left counter.  If the counter is 0, it returns true and we
> proceed to presenting the stop to the user.  If it isn't 0 yet, then
> the method returns false, indicating to fetch_inferior_event to "keep
> going".
> 
> Focusing now on when the first step finishes -- we're in "all-stop"
> mode, with the target in non-stop mode.  When a step finishes,
> handle_inferior_event calls stop_waiting, which itself calls
> stop_all_threads to stop everything.  I.e., after the first step
> completes, all threads are stopped, before handle_inferior_event
> returns.  And after that, now in fetch_inferior_event, we consult the
> thread's thread_fsm::should_stop, which as we've seen, for the first
> step returns false -- i.e., we need to keep_going for another step.
> However, since the target is in non-stop mode, keep_going resumes
> _only_ the current thread.  All the other threads remain stopped,
> inadvertently.
> 
> If the target is in non-stop mode, we don't actually need to stop all
> threads right after each first step finishes, and then re-resume them
> again.  We can instead defer stopping all threads until all the steps
> are completed.
> 
> So fix this by delaying the stopping of all threads until after we
> called the FSM's "should_stop" method.  I.e., move it from
> stop_waiting, to handle_inferior_events's callers,
> fetch_inferior_event and wait_for_inferior.
> 
> New test included.  Tested on x86-64 GNU/Linux native and gdbserver.
> 
> Change-Id: Iaad50dcfea4464c84bdbac853a89df92ade6ae01

LGTM.

Simon

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-07-19 19:33 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-18 18:54 [PATCH] Don't stop all threads prematurely after first step of "step N" Pedro Alves
2022-07-19 19:33 ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).