[PATCH 0/8] Some vfork related fixes

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 0/8] Some vfork related fixes
@ 2023-06-22 13:17 Andrew Burgess
  2023-06-22 13:17 ` [PATCH 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
                   ` (8 more replies)
  0 siblings, 9 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

While reviewing this patch:

  https://inbox.sourceware.org/gdb-patches/20230612074927.424961-2-christina.schimpe@intel.com/

I spotted a block of code which I wasn't certain the meaning of.  So I
started looking into the history of the code, and started testing it a
little, and ran into 3 vfork related bugs, which this series fixes.

Patches #1, #6, and #7 are all about updating an existing test to test
vfork a little more,

Patches #2, #3, and #5 are the three fixes,

Patch #4 is the original refactoring patch referenced above,

And patch #8 adds some extra debug output.

---

Andrew Burgess (7):
  gdb: catch more errors in gdb.base/foll-vfork.exp
  gdb: don't restart vfork parent while waiting for child to finish
  gdb: fix an issue with vfork in non-stop mode
  gdb: don't resume vfork parent while child is still running
  gdb/testsuite: expand gdb.base/foll-vfork.exp
  gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp
  gdb: additional debug output in infrun.c and linux-nat.c

Mihails Strasuns (1):
  gdb, infrun: refactor part of `proceed` into separate function

 gdb/infrun.c                          | 180 +++++---
 gdb/linux-nat.c                       |  32 +-
 gdb/testsuite/gdb.base/foll-vfork.exp | 588 ++++++++++++--------------
 3 files changed, 403 insertions(+), 397 deletions(-)


base-commit: 07a88d7f2121630956997bc5edd495af40d4494f
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

For *reasons* I was looking at gdb.base/foll-vfork.exp.  This test
script has a proc 'setup_gdb' that could (potentially) fail.  The
setup_gdb proc is called from many places and I, initially, didn't
think that we were checking if setup_gdb had failed or not.

My confusion was because I didn't understand what this tcl construct
did:

  return -code return

this will actually act as a return in the context of a proc's caller,
effectively returning two levels of the call stack.  Neat (I guess).

So it turns out my worries were misplaced, everywhere setup_gdb is
called, if setup_gdb fails then we will (magically) return.

However, I did spot one place where we managed to confuse ourselves
with our cleverness.

In check_vfork_catchpoints, this proc is called to check that GDB is
able to catch vforks or not.  This proc is called early in the test
script, and the intention is that, should this proc fail, we'll mark
the whole test script as unsupported and then move onto the next test
script.

However, check_vfork_catchpoints also uses setup_gdb, and so, if that
call to setup_gdb fails we'll end up returning immediately from
check_vfork_catchpoints, and then continue with the test of _this_
test script, which is not correct.

To fix this I see two choices, one is remove the use of 'return -code
return' from setup_gdb, however, this would require every use of
setup_gdb to then be placed inside:

  if { ![setup_gdb] } {
    return
  }

Or, I can wrap the one call to setup_gdb in check_vfork_catchpoints
and check its return code.

I chose the second option as this is the smaller code change.

There should be no change in what is tested after this commit.
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index 313afa06324..bdceff0f5de 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -70,9 +70,15 @@ proc setup_gdb {} {

 proc check_vfork_catchpoints {} {
   global gdb_prompt
-  global has_vfork_catchpoints

-  setup_gdb
+  # Because setup_gdb uses 'return -code return' which would return to
+  # our caller we need to wrap this call, spot when setup_gdb failed
+  # (with return code 2), and then issue our own 'return -code return'.
+  set code [catch {setup_gdb} string]
+  if { $code == 2 } {
+    unsupported "vfork catchpoints"
+    return -code return
+  }

   # Verify that the system supports "catch vfork".
   gdb_test "catch vfork" "Catchpoint \[0-9\]* \\(vfork\\)" "insert first vfork catchpoint"
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 2/8] gdb: don't restart vfork parent while waiting for child to finish
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
  2023-06-22 13:17 ` [PATCH 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

While working on a later patch, which changes gdb.base/foll-vfork.exp,
I noticed that sometimes I would hit this assert:

  x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.

I eventually tracked it down to a combination of schedule-multiple
mode being on, target-non-stop being off, follow-fork-mode being set
to child, and some bad timing.  The failing case is pretty simple, a
single threaded application performs a vfork, the child process then
execs some other application while the parent process (once the vfork
child has completed its exec) just exits.  As best I understand
things, here's what happens when things go wrong:

  1. The parent process performs a vfork, GDB sees the VFORKED event
  and creates an inferior and thread for the vfork child,

  2. GDB resumes the vfork child process.  As schedule-multiple is on
  and target-non-stop is off, this is translated into a request to
  start all processes (see user_visible_resume_ptid),

  3. In the linux-nat layer we spot that one of the threads we are
  about to start is a vfork parent, and so don't start that
  thread (see resume_lwp), the vfork child thread is resumed,

  4. GDB waits for the next event, eventually entering
  linux_nat_target::wait, which in turn calls linux_nat_wait_1,

  5. In linux_nat_wait_1 we eventually call
  resume_stopped_resumed_lwps, this should restart threads that have
  stopped but don't actually have anything interesting to report.

  6. Unfortunately, resume_stopped_resumed_lwps doesn't check for
  vfork parents like resume_lwp does, so at this point the vfork
  parent is resumed.  This feels like the start of the bug, and this
  is where I'm proposing to fix things, but, resuming the vfork parent
  isn't the worst thing in the world because....

  7. As the vfork child is still alive the kernel holds the vfork
  parent stopped,

  8. Eventually the child performs its exec and GDB is sent and EXECD
  event.  However, because the parent is resumed, as soon as the child
  performs its exec the vfork parent also sends a VFORK_DONE event to
  GDB,

  9. Depending on timing both of these events might seem to arrive in
  GDB at the same time.  Normally GDB expects to see the EXECD or
  EXITED/SIGNALED event from the vfork child before getting the
  VFORK_DONE in the parent.  We know this because it is as a result of
  the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
  handle_vfork_child_exec_or_exit for details).  Further the comment
  in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
  when we remain attached to the child (not the parent) we should not
  expect to see a VFORK_DONE,

  10. If both events arrive at the same time then GDB will randomly
  choose one event to handle first, in some cases this will be the
  VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
  expects that (a) the vfork child has finished, however, in this case
  this is not completely true, the child has finished, but GDB has not
  processed the event associated with the completion yet, and (b) upon
  seeing a VFORK_DONE GDB assumes we are remaining attached to the
  parent, and so resumes the parent process,

  11. GDB now handles the EXECD event.  In our case we are detaching
  from the parent, so GDB calls target_detach (see
  handle_vfork_child_exec_or_exit),

  12. While this has been going on the vfork parent is executing, and
  might even exit,

  13. In linux_nat_target::detach the first thing we do is stop all
  threads in the process we're detaching from, the result of the stop
  request will be cached on the lwp_info object,

  14. In our case the vfork parent has exited though, so when GDB
  waits for the thread, instead of a stop due to signal, we instead
  get a thread exited status,

  15. Later in the detach process we try to resume the threads just
  prior to making the ptrace call to actually detach (see
  detach_one_lwp), as part of the process to resume a thread we try to
  touch some registers within the thread, and before doing this GDB
  asserts that the thread is stopped,

  16. An exited thread is not classified as stopped, and so the assert
  triggers!

So there's two bugs I see here.  The first, and most critical one here
is in step #6.  I think that resume_stopped_resumed_lwps should not
resume a vfork parent, just like resume_lwp doesn't resume a vfork
parent.

With this change in place the vfork parent will remain stopped in step
instead GDB will only see the EXECD/EXITED/SIGNALLED event.  The
problems in #9 and #10 are therefore skipped and we arrive at #11,
handling the EXECD event.  As the parent is still stopped #12 doesn't
apply, and in #13 when we try to stop the process we will see that it
is already stopped, there's no risk of the vfork parent exiting before
we get to this point.  And finally, in #15 we are safe to poke the
process registers because it will not have exited by this point.

However, I did mention two bugs.

The second bug I've not yet managed to actually trigger, but I'm
convinced it must exist: if we forget vforks for a moment, in step #13
above, when linux_nat_target::detach is called, we first try to stop
all threads in the process GDB is detaching from.  If we imagine a
multi-threaded inferior with many threads, and GDB running in non-stop
mode, then, if the user tries to detach there is a chance that thread
could exit just as linux_nat_target::detach is entered, in which case
we should be able to trigger the same assert.

But, like I said, I've not (yet) managed to trigger this second bug,
and even if I could, the fix would not belong in this commit, so I'm
pointing this out just for completeness.

There's no test included in this commit.  In a couple of commits time
I will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.  Unfortunately there are at least two other bugs (separate
from the ones discussed above) that need fixing first, these will be
fixed in the next commits before the gdb.base/foll-vfork.exp test is
expanded.

If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive.  I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
---
 gdb/linux-nat.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 383ef58fa23..7e121b7ab41 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -3346,7 +3346,14 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 static int
 resume_stopped_resumed_lwps (struct lwp_info *lp, const ptid_t wait_ptid)
 {
-  if (!lp->stopped)
+  struct inferior *inf = find_inferior_ptid (linux_target, lp->ptid);
+
+  if (inf->vfork_child != nullptr)
+    {
+      linux_nat_debug_printf ("NOT resuming LWP %s (vfork parent)",
+			      lp->ptid.to_string ().c_str ());
+    }
+  else if (!lp->stopped)
     {
       linux_nat_debug_printf ("NOT resuming LWP %s, not stopped",
 			      lp->ptid.to_string ().c_str ());
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 3/8] gdb: fix an issue with vfork in non-stop mode
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
  2023-06-22 13:17 ` [PATCH 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
  2023-06-22 13:17 ` [PATCH 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit fixes a bug introduced by this commit:

  commit d8bbae6ea080249c05ca90b1f8640fde48a18301
  Date:   Fri Jan 14 15:40:59 2022 -0500

      gdb: fix handling of vfork by multi-threaded program (follow-fork-mode=parent, detach-on-fork=on)

The problem can be seen in this GDB session:

  $ gdb -q
  (gdb) set non-stop on
  (gdb) file ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork
  Reading symbols from ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork...
  (gdb) tcatch vfork
  Catchpoint 1 (vfork)
  (gdb) run
  Starting program: /tmp/gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork

  Temporary catchpoint 1 (vforked process 1375914), 0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  (gdb) bt
  #0  0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  #1  0x00000000004011af in main (argc=1, argv=0x7fffffffad88) at .../gdb/testsuite/gdb.base/foll-vfork.c:32
  (gdb) finish
  Run till exit from #0  0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  [Detaching after vfork from child process 1375914]
  No unwaited-for children left.
  (gdb)

Notice the "No unwaited-for children left." error.  This is incorrect,
given where we are stopped there's no reason why we shouldn't be able
to use "finish" to return to the main frame.

When the inferior is stopped as a result of the 'tcatch vfork', the
inferior is in the process of performing the vfork, that is, GDB has
seen the VFORKED event, but has not yet attached to the new child
process, nor has the child process been resumed.

However, GDB has seen the VFORKED, and, as we are going to follow the
parent process, the inferior for the vfork parent will have its
thread_waiting_for_vfork_done member variable set, this will point to
the one and only thread that makes up the vfork parent process.

When the "finish" command is used GDB eventually ends up in the
proceed function (in infrun.c), in here we pass through all the
function until we eventually encounter this 'else if' condition:

   else if (!cur_thr->resumed ()
	     && !thread_is_in_step_over_chain (cur_thr)
	     /* In non-stop, forbid resuming a thread if some other thread of
		that inferior is waiting for a vfork-done event (this means
		breakpoints are out for this inferior).  */
	     && !(non_stop
		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
      {

The first two of these conditions will both be true, the thread is not
already resumed, and is not in the step-over chain, however, the third
condition, this one:

	     && !(non_stop
		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))

is false, and this prevents the thread we are trying to finish from
being resumed.  This condition is false because (a) non_stop is true,
and (b) cur_thr->inf->thread_waiting_for_vfork_done is not
nullptr (see above for why).

Now, if we check the comment embedded within the condition it says:

     /* In non-stop, forbid resuming a thread if some other thread of
        that inferior is waiting for a vfork-done event (this means
        breakpoints are out for this inferior).  */

And this makes sense, if we have a vfork parent with two thread, and
one thread has performed a vfork, then we shouldn't try to resume the
second thread.

However, if we are trying to resume the thread that actually performed
a vfork, then this is fine.  If we never resume the vfork parent then
we'll never get a VFORK_DONE event, and so the vfork will never
complete.

Thus, the condition should actually be:

     && !(non_stop
	  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
	  && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))

This extra check will allow the vfork parent thread to resume, but
prevent any other thread in the vfork parent process from resuming.
This is the same condition that already exists in the all-stop on a
non-stop-target block earlier in the proceed function.

My actual fix is slightly different to the above, first, I've chosen
to use a nested 'if' check instead of extending the original 'else if'
check, this makes it easier to write a longer comment explaining
what's going on, and second, instead of checking 'non_stop' I've
switched to checking 'target_is_non_stop_p'.  In this context this is
effectively the same thing, a previous 'else if' block in proceed
already handles '!non_stop && target_is_non_stop_p ()', so by the time
we get here, if 'target_is_non_stop_p ()' then we must be running in
non_stop mode.

Both of these tweaks will make the next patch easier, which is a
refactor to merge two parts of the proceed function, so this nested
'if' block is not going to exist for long.

For testing, there is no test included with this commit.  The test was
exposed when using a modified version of the gdb.base/foll-vfork.exp
test script, however, there are other bugs that are exposed when using
the modified test script.  These bugs will be addressed in subsequent
commits, and then I'll add the updated gdb.base/foll-vfork.exp.

If you wish to reproduce this failure then grab the updates to
gdb.base/foll-vfork.exp from the later commit and run this test, the
failure is always reproducible.
---
 gdb/infrun.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 58da1cef29e..5b0257076f0 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3503,19 +3503,29 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 	  }
       }
     else if (!cur_thr->resumed ()
-	     && !thread_is_in_step_over_chain (cur_thr)
-	     /* In non-stop, forbid resuming a thread if some other thread of
-		that inferior is waiting for a vfork-done event (this means
-		breakpoints are out for this inferior).  */
-	     && !(non_stop
-		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
+	     && !thread_is_in_step_over_chain (cur_thr))
       {
-	/* The thread wasn't started, and isn't queued, run it now.  */
-	execution_control_state ecs (cur_thr);
-	switch_to_thread (cur_thr);
-	keep_going_pass_signal (&ecs);
-	if (!ecs.wait_some_more)
-	  error (_("Command aborted."));
+	/* In non-stop mode, if a there exists a thread waiting for a vfork
+	   then only allow that thread to resume (breakpoints are removed
+	   from an inferior when handling a vfork).
+
+	   We check target_is_non_stop_p here rather than just checking the
+	   non-stop flag, though these are equivalent (all-stop on a
+	   non-stop target was handled in a previous block, so at this
+	   point, if target_is_non_stop_p then GDB must be running on
+	   non-stop mode).  By using target_is_non_stop_p it will be easier
+	   to merge this block with the previous in a later commit.  */
+	if (!(target_is_non_stop_p ()
+	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
+	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
+	  {
+	    /* The thread wasn't started, and isn't queued, run it now.  */
+	    execution_control_state ecs (cur_thr);
+	    switch_to_thread (cur_thr);
+	    keep_going_pass_signal (&ecs);
+	    if (!ecs.wait_some_more)
+	      error (_("Command aborted."));
+	  }
       }

     disable_commit_resumed.reset_and_commit ();
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (2 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-28  8:47   ` Aktemur, Tankut Baris
  2023-06-22 13:17 ` [PATCH 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Mihails Strasuns, Christina Schimpe

From: Mihails Strasuns <mihails.strasuns@intel.com>

Split the thread resuming code from proceed into new function
proceed_resume_thread_checked.

Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
---
 gdb/infrun.c | 148 +++++++++++++++++++++++++++------------------------
 1 file changed, 79 insertions(+), 69 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 5b0257076f0..b10b7c59652 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3268,6 +3268,82 @@ check_multi_target_resumption (process_stratum_target *resume_target)
     }
 }
 
+/* Helper function for `proceed`.  Check if thread TP is suitable for
+   resuming, and, if it is, switch to the thread and call
+   `keep_going_pass_signal`.  If TP is not suitable for resuming then this
+   function will just return without switching threads.  */
+
+static void
+proceed_resume_thread_checked (thread_info *tp)
+{
+  if (!tp->inf->has_execution ())
+    {
+      infrun_debug_printf ("[%s] target has no execution",
+			   tp->ptid.to_string ().c_str ());
+      return;
+    }
+
+  if (tp->resumed ())
+    {
+      infrun_debug_printf ("[%s] resumed",
+			   tp->ptid.to_string ().c_str ());
+      gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
+      return;
+    }
+
+  if (thread_is_in_step_over_chain (tp))
+    {
+      infrun_debug_printf ("[%s] needs step-over",
+			   tp->ptid.to_string ().c_str ());
+      return;
+    }
+
+  /* When handling a vfork GDB removes all breakpoints from the program
+     space in which the vfork is being handled, as such we must take care
+     not to resume any thread other than the vfork parent -- resuming the
+     vfork parent allows GDB to receive and handle the 'vfork done'
+     event.  */
+  if (tp->inf->thread_waiting_for_vfork_done != nullptr)
+    {
+      if (target_is_non_stop_p ())
+	{
+	  /* For non-stop targets, regardless of whether GDB is using
+	     all-stop or non-stop mode, threads are controlled
+	     individually.
+
+	     When a thread is handling a vfork, breakpoints are removed
+	     from the inferior (well, program space in fact), so it is
+	     critical that we don't try to resume any thread other than the
+	     vfork parent.  */
+	  if (tp != tp->inf->thread_waiting_for_vfork_done)
+	    {
+	      infrun_debug_printf ("[%s] thread %s of this inferior is "
+				   "waiting for vfork-done",
+				   tp->ptid.to_string ().c_str (),
+				   tp->inf->thread_waiting_for_vfork_done
+				     ->ptid.to_string ().c_str ());
+	      return;
+	    }
+	}
+      else
+	{
+	  /* For all-stop targets, when we attempt to resume the inferior,
+	     we will in fact only resume the vfork parent thread, this is
+	     handled in internal_resume_ptid, as such, there is nothing
+	     more that needs doing here.  */
+	}
+    }
+
+  infrun_debug_printf ("resuming %s",
+		       tp->ptid.to_string ().c_str ());
+
+  execution_control_state ecs (tp);
+  switch_to_thread (tp);
+  keep_going_pass_signal (&ecs);
+  if (!ecs.wait_some_more)
+    error (_("Command aborted."));
+}
+
 /* Basic routine for continuing the program in various fashions.
 
    ADDR is the address to resume at, or -1 for resume where stopped.
@@ -3456,77 +3532,11 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 						       resume_ptid))
 	  {
 	    switch_to_thread_no_regs (tp);
-
-	    if (!tp->inf->has_execution ())
-	      {
-		infrun_debug_printf ("[%s] target has no execution",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    if (tp->resumed ())
-	      {
-		infrun_debug_printf ("[%s] resumed",
-				     tp->ptid.to_string ().c_str ());
-		gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
-		continue;
-	      }
-
-	    if (thread_is_in_step_over_chain (tp))
-	      {
-		infrun_debug_printf ("[%s] needs step-over",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    /* If a thread of that inferior is waiting for a vfork-done
-	       (for a detached vfork child to exec or exit), breakpoints are
-	       removed.  We must not resume any thread of that inferior, other
-	       than the one waiting for the vfork-done.  */
-	    if (tp->inf->thread_waiting_for_vfork_done != nullptr
-		&& tp != tp->inf->thread_waiting_for_vfork_done)
-	      {
-		infrun_debug_printf ("[%s] another thread of this inferior is "
-				     "waiting for vfork-done",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    infrun_debug_printf ("resuming %s",
-				 tp->ptid.to_string ().c_str ());
-
-	    execution_control_state ecs (tp);
-	    switch_to_thread (tp);
-	    keep_going_pass_signal (&ecs);
-	    if (!ecs.wait_some_more)
-	      error (_("Command aborted."));
-	  }
-      }
-    else if (!cur_thr->resumed ()
-	     && !thread_is_in_step_over_chain (cur_thr))
-      {
-	/* In non-stop mode, if a there exists a thread waiting for a vfork
-	   then only allow that thread to resume (breakpoints are removed
-	   from an inferior when handling a vfork).
-
-	   We check target_is_non_stop_p here rather than just checking the
-	   non-stop flag, though these are equivalent (all-stop on a
-	   non-stop target was handled in a previous block, so at this
-	   point, if target_is_non_stop_p then GDB must be running on
-	   non-stop mode).  By using target_is_non_stop_p it will be easier
-	   to merge this block with the previous in a later commit.  */
-	if (!(target_is_non_stop_p ()
-	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
-	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
-	  {
-	    /* The thread wasn't started, and isn't queued, run it now.  */
-	    execution_control_state ecs (cur_thr);
-	    switch_to_thread (cur_thr);
-	    keep_going_pass_signal (&ecs);
-	    if (!ecs.wait_some_more)
-	      error (_("Command aborted."));
+	    proceed_resume_thread_checked (tp);
 	  }
       }
+    else
+      proceed_resume_thread_checked (cur_thr);
 
     disable_commit_resumed.reset_and_commit ();
   }
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 5/8] gdb: don't resume vfork parent while child is still running
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (3 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Like the last few commit, this fixes yet another vfork related issue.
Like the commit titled:

  gdb: don't restart vfork parent while waiting for child to finish

which addressed a case in linux-nat where we would try to resume a
vfork parent, this commit addresses a very similar case, but this time
occurring in infrun.c.  Just like with that previous commit, this bug
results in the assert:

  x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.

In this case the issue occurs when target-non-stop is on, but non-stop
is off, and again, schedule-multiple is on.  As with the previous
commit, GDB is in follow-fork-mode child.  If you have not done so, it
is worth reading the earlier commit as many of the problems leading to
the failure are the same, they just appear in a different part of GDB.

Here are the steps leading to the assertion failure:

  1. The user performs a 'next' over a vfork, GDB stop in the vfork
  child,

  2. As we are planning to follow the child GDB sets the vfork_parent
  and vfork_child member variables in the two inferiors, the
  thread_waiting_for_vfork_done member is left as nullptr, that member
  is only used when GDB is planning to follow the parent inferior,

  3. The user does 'continue', our expectation is that the vfork child
  should resume, and once that process has exited or execd, GDB should
  detach from the vfork parent.  As a result of the 'continue' GDB
  eventually enters the proceed function,

  4. In proceed we selected a ptid_t to resume, because
  schedule-multiple is on we select minus_one_ptid (see
  user_visible_resume_ptid),

  5. As GDB is running in all-stop on top of non-stop mode, in the
  proceed function we iterate over all threads that match the resume
  ptid, which turns out to be all threads, and call
  proceed_resume_thread_checked.  One of the threads we iterate over
  is the vfork parent thread,

  6. As the thread passed to proceed_resume_thread_checked doesn't
  match any of the early return conditions, GDB will set the thread
  resumed,

  7. As we are resuming one thread at a time, this thread is seen by
  the lower layers (e.g. linux-nat) as the "event thread", which means
  we don't apply any of the checks, e.g. is this thread a
  vfork parent, instead we assume that GDB core knows what it's doing,
  and linux-nat will resume the thread, we have now incorrectly set
  running the vfork parent thread when this thread should be waiting
  for the vfork child to complete,

  8. Back in the proceed function GDB continues to iterate over all
  threads, and now (correctly) resumes the vfork child thread,

  8. As the vfork child is still alive the kernel holds the vfork
  parent stopped,

  9. Eventually the child performs its exec and GDB is sent and EXECD
  event.  However, because the parent is resumed, as soon as the child
  performs its exec the vfork parent also sends a VFORK_DONE event to
  GDB,

  10. Depending on timing both of these events might seem to arrive in
  GDB at the same time.  Normally GDB expects to see the EXECD or
  EXITED/SIGNALED event from the vfork child before getting the
  VFORK_DONE in the parent.  We know this because it is as a result of
  the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
  handle_vfork_child_exec_or_exit for details).  Further the comment
  in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
  when we remain attached to the child (not the parent) we should not
  expect to see a VFORK_DONE,

  11. If both events arrive at the same time then GDB will randomly
  choose one event to handle first, in some cases this will be the
  VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
  expects that (a) the vfork child has finished, however, in this case
  this is not completely true, the child has finished, but GDB has not
  processed the event associated with the completion yet, and (b) upon
  seeing a VFORK_DONE GDB assumes we are remaining attached to the
  parent, and so resumes the parent process,

  12. GDB now handles the EXECD event.  In our case we are detaching
  from the parent, so GDB calls target_detach (see
  handle_vfork_child_exec_or_exit),

  13. While this has been going on the vfork parent is executing, and
  might even exit,

  14. In linux_nat_target::detach the first thing we do is stop all
  threads in the process we're detaching from, the result of the stop
  request will be cached on the lwp_info object,

  15. In our case the vfork parent has exited though, so when GDB
  waits for the thread, instead of a stop due to signal, we instead
  get a thread exited status,

  16. Later in the detach process we try to resume the threads just
  prior to making the ptrace call to actually detach (see
  detach_one_lwp), as part of the process to resume a thread we try to
  touch some registers within the thread, and before doing this GDB
  asserts that the thread is stopped,

  17. An exited thread is not classified as stopped, and so the assert
  triggers!

Just like with the earlier commit, the fix is to spot the vfork parent
status of the thread, and not resume such threads.  Where the earlier
commit fixed this in linux-nat, in this case I think the fix should
live in infrun.c, in proceed_resume_thread_checked.  This function
already has a similar check to not resume the vfork parent in the case
where we are planning to follow the vfork parent, I propose adding a
similar case that checks for the vfork parent when we plan to follow
the vfork child.

This new check will mean that at step #6 above GDB doesn't try to
resume the vfork parent thread, which prevents the VFORK_DONE from
ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
the vfork child GDB will detach from the parent.

There's no test included in this commit.  In a subsequent commit I
will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.

If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive.  I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
---
 gdb/infrun.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index b10b7c59652..3eb3b290001 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3299,10 +3299,12 @@ proceed_resume_thread_checked (thread_info *tp)
     }

   /* When handling a vfork GDB removes all breakpoints from the program
-     space in which the vfork is being handled, as such we must take care
-     not to resume any thread other than the vfork parent -- resuming the
-     vfork parent allows GDB to receive and handle the 'vfork done'
-     event.  */
+     space in which the vfork is being handled.  If we are following the
+     parent then GDB will set the thread_waiting_for_vfork_done member of
+     the parent inferior.  In this case we should take care to only resume
+     the vfork parent thread, the kernel will hold this thread suspended
+     until the vfork child has exited or execd, at which point the parent
+     will be resumed and a VFORK_DONE event sent to GDB.  */
   if (tp->inf->thread_waiting_for_vfork_done != nullptr)
     {
       if (target_is_non_stop_p ())
@@ -3334,6 +3336,20 @@ proceed_resume_thread_checked (thread_info *tp)
 	}
     }

+  /* When handling a vfork GDB removes all breakpoints from the program
+     space in which the vfork is being handled.  If we are following the
+     child then GDB will set vfork_child member of the vfork parent
+     inferior.  Once the child has either exited or execd then GDB will
+     detach from the parent process.  Until that point GDB should not
+     resume any thread in the parent process.  */
+  if (tp->inf->vfork_child != nullptr)
+    {
+      infrun_debug_printf ("[%s] thread is part of a vfork parent, child is %d",
+			   tp->ptid.to_string ().c_str (),
+			   tp->inf->vfork_child->pid);
+      return;
+    }
+
   infrun_debug_printf ("resuming %s",
 		       tp->ptid.to_string ().c_str ());

-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (4 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit provides tests for all of the bugs fixed in the previous
four commits, this is achieved by expanding gdb.base/foll-vfork.exp to
run with different configurations:

  * target-non-stop on/off
  * non-stop on/off
  * schedule-multiple on/off

We don't test with schedule-multiple on if we are using a remote
target, this is due to bug gdb/30574.  I've added a link to that bug
in this commit, but all this commit does is expose that bug, there's
no fixes here.

Some of the bugs fixed in the previous commits are very timing
dependent, as such, they don't always show up.  I've had more success
when running this test on a very loaded machine -- I usually run ~8
copies of the test in parallel, then the bugs would normally show up
pretty quickly.

Other than running the test in more configurations, I've not made any
changes to what is actually being tested, other than in one place
where, when testing with non-stop mode, GDB stops in a different
inferior, as such I had to add a new 'inferior 2' call, this can be
found in vfork_relations_in_info_inferiors.

I have cleaned things up a little, for example, making use of
proc_with_prefix to remove some with_test_prefix calls.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30574
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 628 ++++++++++++--------------
 1 file changed, 300 insertions(+), 328 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index bdceff0f5de..be0715b05c0 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -25,56 +25,52 @@ if {![istarget "*-linux*"]} {
     continue
 }
 
-standard_testfile
+standard_testfile .c -exit.c vforked-prog.c
 
-set compile_options debug
+set binfile $testfile
+set binfile2 ${testfile}-exit
+set binfile3 vforked-prog
 
-if {[build_executable $testfile.exp $testfile $srcfile $compile_options] == -1} {
+if {[build_executable "compile $binfile" $binfile $srcfile] == -1} {
     untested "failed to compile main testcase"
     return -1
 }
 
-set testfile2 "vforked-prog"
-set srcfile2 ${testfile2}.c
+if {[build_executable "compile $binfile2" $binfile2 $srcfile2] == -1} {
+    untested "failed to compile main testcase"
+    return -1
+}
 
-if {[build_executable $testfile.exp $testfile2 $srcfile2 $compile_options] == -1} {
-    untested "failed to compile secondary testcase"
+if {[build_executable "compile $binfile3" $binfile3 $srcfile3] == -1} {
+    untested "failed to compile main testcase"
     return -1
 }
 
+# If required, download the program that we exec after vfork to the
+# remote target.
 if { [is_remote target] } {
-    gdb_remote_download target [standard_output_file $testfile2]
+    gdb_remote_download target [standard_output_file $binfile3]
 }
 
-# A few of these tests require a little more time than the standard
-# timeout allows.
-set oldtimeout $timeout
-set timeout [expr "$timeout + 10"]
-
 # Start with a fresh GDB, with verbosity enabled, and run to main.  On
 # error, behave as "return", so we don't try to continue testing with
 # a borked session.
-proc setup_gdb {} {
-    global testfile srcfile
-
-    clean_restart $testfile
+proc setup_gdb { binfile srcfile } {
+    clean_restart $binfile
 
     if ![runto_main] {
 	return -code return
     }
 
-    set tbreak_line [gdb_get_line_number " VFORK " $srcfile]
-    gdb_test "tbreak ${tbreak_line}"
-    gdb_continue_to_breakpoint ".*"
+    gdb_breakpoint [gdb_get_line_number " VFORK " $srcfile] temporary
+    gdb_continue_to_breakpoint "at VFORK"
 }
 
 proc check_vfork_catchpoints {} {
-  global gdb_prompt
-
   # Because setup_gdb uses 'return -code return' which would return to
   # our caller we need to wrap this call, spot when setup_gdb failed
   # (with return code 2), and then issue our own 'return -code return'.
-  set code [catch {setup_gdb} string]
+  set code [catch {setup_gdb $::testfile $::srcfile} string]
   if { $code == 2 } {
     unsupported "vfork catchpoints"
     return -code return
@@ -84,10 +80,10 @@ proc check_vfork_catchpoints {} {
   gdb_test "catch vfork" "Catchpoint \[0-9\]* \\(vfork\\)" "insert first vfork catchpoint"
   set has_vfork_catchpoints 0
   gdb_test_multiple "continue" "continue to first vfork catchpoint" {
-    -re ".*Your system does not support this type\r\nof catchpoint.*$gdb_prompt $" {
+    -re -wrap ".*Your system does not support this type\r\nof catchpoint.*" {
       unsupported "continue to first vfork catchpoint"
     }
-    -re ".*Catchpoint.*$gdb_prompt $" {
+    -re -wrap ".*Catchpoint.*" {
       set has_vfork_catchpoints 1
       pass "continue to first vfork catchpoint"
     }
@@ -99,319 +95,278 @@ proc check_vfork_catchpoints {} {
   }
 }
 
-proc vfork_parent_follow_through_step {} {
-  with_test_prefix "vfork parent follow, through step" {
-   global gdb_prompt
+proc_with_prefix vfork_parent_follow_through_step { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-   setup_gdb
+    gdb_test_no_output "set follow-fork parent"
 
-   gdb_test_no_output "set follow-fork parent"
+    gdb_test_multiple "next" "" {
+	-re -wrap "\\\[Detaching after vfork from.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
 
-   set test "step"
-   gdb_test_multiple "next" $test {
-       -re "\\\[Detaching after vfork from.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass $test
-       }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_parent_follow_to_bp {} {
-  with_test_prefix "vfork parent follow, to bp" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork parent"
-
-   set bp_location [gdb_get_line_number "I'm the proud parent of child"]
-   gdb_test "break ${srcfile}:${bp_location}" ".*" "break, vfork to bp"
-
-   set test "continue to bp"
-   gdb_test_multiple "continue" $test {
-       -re ".*\\\[Detaching after vfork from child process.*Breakpoint.*${bp_location}.*$gdb_prompt " {
-	   pass $test
-       }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_child_follow_to_exit {} {
-  with_test_prefix "vfork child follow, to exit" {
-   global gdb_prompt
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   set test "continue to child exit"
-   gdb_test_multiple "continue" $test {
-      -re "Couldn't get registers.*$gdb_prompt " {
-	  # PR gdb/14766
-	  fail "$test"
-      }
-       -re "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent .* after child exit.*$gdb_prompt " {
-	  pass $test
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
-proc vfork_and_exec_child_follow_to_main_bp {} {
-  with_test_prefix "vfork and exec child follow, to main bp" {
-   global gdb_prompt
-   global srcfile2
+proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-   setup_gdb
+    gdb_test_no_output "set follow-fork parent"
 
-   gdb_test_no_output "set follow-fork child"
+    set bp_location \
+	[gdb_get_line_number "I'm the proud parent of child" $srcfile]
+    gdb_test "break ${srcfile}:${bp_location}" ".*" "break, vfork to bp"
 
-   set linenum [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
+    gdb_test_multiple "continue" "continue to bp" {
+	-re -wrap ".*\\\[Detaching after vfork from child process.*Breakpoint.*${bp_location}.*" {
+	    pass $gdb_test_name
+	}
+    }
 
-   set test "continue to bp"
-   gdb_test_multiple "continue" $test {
-      -re "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent.*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*$gdb_prompt " {
-	  pass $test
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_and_exec_child_follow_through_step {} {
-  with_test_prefix "vfork and exec child follow, through step" {
-   global gdb_prompt
-   global srcfile2
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   set test "step over vfork"
-
-   # The ideal support is to be able to debug the child even
-   # before it execs.  Thus, "next" lands on the next line after
-   # the vfork.
-   gdb_test_multiple "next" $test {
-       -re "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass "$test"
-       }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test_multiple "continue" "continue to child exit" {
+	-re -wrap "Couldn't get registers.*" {
+	    # PR gdb/14766
+	    fail $gdb_test_name
+	}
+	-re -wrap "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent .* after child exit.*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    set linenum [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+
+    gdb_test_multiple "continue" "continue to bp" {
+	-re -wrap "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent.*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    # The ideal support is to be able to debug the child even
+    # before it execs.  Thus, "next" lands on the next line after
+    # the vfork.
+    gdb_test_multiple "next" "next over vfork" {
+	-re -wrap "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
 proc continue_to_vfork {} {
-   global gdb_prompt
-
-   # A vfork catchpoint may stop in either "vfork" or "_vfork".
-   set test "continue to vfork"
-   gdb_test_multiple "continue" $test {
-      -re "vfork \\(\\) at .*$gdb_prompt $" {
-	  pass $test
-      }
-      -re "0x\[0-9a-fA-F\]*.*(vfork|__kernel_v?syscall).*$gdb_prompt " {
-	  pass $test
-      }
-   }
+    # A vfork catchpoint may stop in either "vfork" or "_vfork".
+    gdb_test_multiple "continue" "continue to vfork" {
+	-re -wrap "vfork \\(\\) at .*" {
+	    pass $gdb_test_name
+	}
+	-re -wrap "0x\[0-9a-fA-F\]*.*(vfork|__kernel_v?syscall).*" {
+	    pass $gdb_test_name
+	}
+    }
 }
 
-proc tcatch_vfork_then_parent_follow {} {
-  with_test_prefix "vfork parent follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork parent"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set linenum [gdb_get_line_number "pid = vfork ();"]
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*0x\[0-9a-fA-F\]* in main .* at .*${srcfile}:${linenum}.*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc tcatch_vfork_then_child_follow_exec {} {
-  with_test_prefix "vfork child follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-   global srcfile2
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set linenum1 [gdb_get_line_number "pid = vfork ();"]
-   set linenum2 [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
-
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*${srcfile}:${linenum1}.*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-      -re "Run till exit from.*vfork.*${srcfile2}:${linenum2}.*$gdb_prompt " {
-	  pass "$test (followed exec)"
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc tcatch_vfork_then_child_follow_exit {} {
-  with_test_prefix "vfork child follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*exited normally.*$gdb_prompt " {
-	  setup_kfail "gdb/14762" *-*-*
-	  fail $test
-      }
-      -re "Run till exit from.*vfork.*pid = vfork \\(\\).*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+proc_with_prefix tcatch_vfork_then_parent_follow { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-proc vfork_relations_in_info_inferiors { variant } {
-  with_test_prefix "vfork relations in info inferiors" {
-   global gdb_prompt
+    gdb_test_no_output "set follow-fork parent"
 
-   setup_gdb
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
 
-   gdb_test_no_output "set follow-fork child"
+    continue_to_vfork
 
-   set test "step over vfork"
-   gdb_test_multiple "next" $test {
-       -re "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass "$test"
-       }
-   }
+    set linenum [gdb_get_line_number "pid = vfork ();" $srcfile]
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*0x\[0-9a-fA-F\]* in main .* at .*${srcfile}:${linenum}.*" {
+	    pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+    }
 
-   gdb_test "info inferiors" \
-       ".*is vfork parent of inferior 2.*is vfork child of inferior 1" \
-       "info inferiors shows vfork parent/child relation"
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
-   if { $variant == "exec" } {
-       global srcfile2
+proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-       set linenum [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
-       set test "continue to bp"
-       gdb_test_multiple "continue" $test {
-	   -re ".*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*$gdb_prompt " {
-	       pass $test
-	   }
-       }
-   } else {
-       set test "continue to child exit"
-       gdb_test_multiple "continue" $test {
-	   -re "exited normally.*$gdb_prompt " {
-	       pass $test
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
+
+    continue_to_vfork
+
+    set linenum1 [gdb_get_line_number "pid = vfork ();" $srcfile]
+    set linenum2 [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*${srcfile}:${linenum1}.*" {
+	  pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+	-re -wrap "Run till exit from.*vfork.*${::srcfile3}:${linenum2}.*" {
+	    pass "$gdb_test_name (followed exec)"
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
+   setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
+
+    continue_to_vfork
+
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*exited normally.*" {
+	    setup_kfail "gdb/14762" *-*-*
+	    fail $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*vfork.*pid = vfork \\(\\).*" {
+	  pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_relations_in_info_inferiors { variant binfile srcfile non_stop } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test_multiple "next" "next over vfork" {
+	-re -wrap "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    if { $non_stop } {
+	gdb_test "inferior 2" ".*"
+    }
+
+    gdb_test "info inferiors" \
+	".*is vfork parent of inferior 2.*is vfork child of inferior 1" \
+	"info inferiors shows vfork parent/child relation"
+
+    if { $variant == "exec" } {
+	set linenum [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+	gdb_test_multiple "continue" "continue to bp" {
+	    -re -wrap ".*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*" {
+		pass $gdb_test_name
+	    }
+	}
+    } else {
+	gdb_test_multiple "continue" "continue to child exit" {
+	   -re -wrap "exited normally.*" {
+	       pass $gdb_test_name
 	   }
-       }
-   }
-
-   set test "vfork relation no longer appears in info inferiors"
-   gdb_test_multiple "info inferiors" $test {
-       -re -wrap "is vfork child of inferior 1.*" {
-	   fail $test
-       }
-       -re -wrap "is vfork parent of inferior 2.*" {
-	   fail $test
-       }
-       -re -wrap "" {
-	   pass $test
-       }
-   }
-}}
-
-proc do_vfork_and_follow_parent_tests {} {
-   global gdb_prompt
+	}
+    }
 
+    gdb_test_multiple "info inferiors" "vfork relation no longer appears in info inferiors" {
+	-re -wrap "is vfork child of inferior 1.*" {
+	   fail $gdb_test_name
+	}
+	-re -wrap "is vfork parent of inferior 2.*" {
+	    fail $gdb_test_name
+	}
+	-re -wrap "" {
+	    pass $gdb_test_name
+	}
+    }
+}
+
+proc do_vfork_and_follow_parent_tests { binfile srcfile } {
    # Try following the parent process by stepping through a call to
    # vfork.  Do this without catchpoints.
-   vfork_parent_follow_through_step
+   vfork_parent_follow_through_step $binfile $srcfile
 
    # Try following the parent process by setting a breakpoint on the
    # other side of a vfork, and running to that point.  Do this
    # without catchpoints.
-   vfork_parent_follow_to_bp
+   vfork_parent_follow_to_bp $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the parent.
    #
-   tcatch_vfork_then_parent_follow
+   tcatch_vfork_then_parent_follow $binfile $srcfile
 }
 
-proc do_vfork_and_follow_child_tests_exec {} {
+proc do_vfork_and_follow_child_tests_exec { binfile srcfile non_stop } {
    # Try following the child process by just continuing through the
    # vfork, and letting the parent's breakpoint on "main" be auto-
    # magically reset in the child.
    #
-   vfork_and_exec_child_follow_to_main_bp
+   vfork_and_exec_child_follow_to_main_bp $binfile $srcfile
 
    # Try following the child process by stepping through a call to
    # vfork.  The child also executes an exec.  Since the child cannot
@@ -420,11 +375,11 @@ proc do_vfork_and_follow_child_tests_exec {} {
    # recomputed in the exec'd child, the step through a vfork should
    # land us in the "main" for the exec'd child, too.
    #
-   vfork_and_exec_child_follow_through_step
+   vfork_and_exec_child_follow_through_step $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the child.
    #
-   tcatch_vfork_then_child_follow_exec
+   tcatch_vfork_then_child_follow_exec $binfile $srcfile
 
    # Test the ability to follow both child and parent of a vfork.  Do
    # this without catchpoints.
@@ -442,25 +397,25 @@ proc do_vfork_and_follow_child_tests_exec {} {
    # and confirm the relation is no longer displayed in "info
    # inferiors".
    #
-   vfork_relations_in_info_inferiors "exec"
+   vfork_relations_in_info_inferiors "exec" $binfile $srcfile $non_stop
 }
 
-proc do_vfork_and_follow_child_tests_exit {} {
+proc do_vfork_and_follow_child_tests_exit { binfile srcfile non_stop } {
    # Try following the child process by just continuing through the
    # vfork, and letting the child exit.
    #
-   vfork_child_follow_to_exit
+   vfork_child_follow_to_exit $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the child.
    #
-   tcatch_vfork_then_child_follow_exit
+   tcatch_vfork_then_child_follow_exit $binfile $srcfile
 
    # Step over a vfork in the child, do "info inferiors" and check the
    # parent/child relation is displayed.  Run the child to completion,
    # and confirm the relation is no longer displayed in "info
    # inferiors".
    #
-   vfork_relations_in_info_inferiors "exit"
+   vfork_relations_in_info_inferiors "exit" $binfile $srcfile $non_stop
 }
 
 with_test_prefix "check vfork support" {
@@ -470,41 +425,58 @@ with_test_prefix "check vfork support" {
 }
 
 # Follow parent and follow child vfork tests with a child that execs.
-with_test_prefix "exec" {
+proc_with_prefix exec_tests { binfile srcfile non_stop } {
     # These are tests of gdb's ability to follow the parent of a Unix
     # vfork system call.  The child will subsequently call a variant
     # of the Unix exec system call.
-    do_vfork_and_follow_parent_tests
+    do_vfork_and_follow_parent_tests $binfile $srcfile
 
     # These are tests of gdb's ability to follow the child of a Unix
     # vfork system call.  The child will subsequently call a variant
     # of a Unix exec system call.
     #
-    do_vfork_and_follow_child_tests_exec
-}
-
-# Switch to test the case of the child exiting.  We can't use
-# standard_testfile here because we don't want to overwrite the binary
-# of the previous tests.
-set testfile "foll-vfork-exit"
-set srcfile ${testfile}.c
-set binfile [standard_output_file ${testfile}]
-
-if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
-    untested "failed to build $testfile"
-    return
+    do_vfork_and_follow_child_tests_exec $binfile $srcfile $non_stop
 }
 
 # Follow parent and follow child vfork tests with a child that exits.
-with_test_prefix "exit" {
+proc_with_prefix exit_tests { binfile srcfile non_stop } {
     # These are tests of gdb's ability to follow the parent of a Unix
     # vfork system call.  The child will subsequently exit.
-    do_vfork_and_follow_parent_tests
+    do_vfork_and_follow_parent_tests $binfile $srcfile
 
     # These are tests of gdb's ability to follow the child of a Unix
     # vfork system call.  The child will subsequently exit.
     #
-    do_vfork_and_follow_child_tests_exit
+    do_vfork_and_follow_child_tests_exit $binfile $srcfile $non_stop
+}
+
+# Using the remote protocol with schedule-multiple turned triggers bug
+# gdb/30574, so avoid this for now.
+if {[target_info exists gdb_protocol]
+    && ([target_info gdb_protocol] == "remote"
+	|| [target_info gdb_protocol] == "extended-remote")} {
+    set sm_modes { off }
+} else {
+    set sm_modes { off on }
 }
 
-set timeout $oldtimeout
+# A few of these tests require a little more time than the standard timeout
+# allows.
+with_timeout_factor 2 {
+    foreach_with_prefix target-non-stop {auto on off} {
+	foreach_with_prefix non-stop {off on} {
+	    foreach_with_prefix schedule-multiple $sm_modes {
+		save_vars { ::GDBFLAGS } {
+		    # These flags will be picked up by the call to
+		    # clean_restart inside setup_gdb.
+		    append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
+		    append ::GDBFLAGS " -ex \"set non-stop ${non-stop}\""
+		    append ::GDBFLAGS " -ex \"set schedule-multiple ${schedule-multiple}\""
+
+		    exec_tests $binfile $srcfile ${non-stop}
+		    exit_tests $binfile2 $srcfile2 ${non-stop}
+		}
+	    }
+	}
+    }
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (5 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-06-22 13:17 ` [PATCH 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

While working on gdb.base/foll-vfork.exp I noticed that there are
several random 'sleep' calls throughout the test.

The comment suggests these are to allow for output from a vforked
child to arrive, but in each case the test is about to close and
restart GDB, so I don't see how random output from a child process
could impact testing.

I removed the sleep calls and couldn't reproduce any failures from
this test, I left the test running for a couple of hours, and tried
loading my machine, and the test seems fine with these removed.

I've left this as a separate commit so that if, in the future, someone
can show that these are required, it will be easy to revert this one
patch and bring them back.

There should be no change in what is tested after this commit.
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 48 ---------------------------
 1 file changed, 48 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index be0715b05c0..43fc25cc2c0 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -105,12 +105,6 @@ proc_with_prefix vfork_parent_follow_through_step { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
@@ -127,12 +121,6 @@ proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
@@ -149,12 +137,6 @@ proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
@@ -169,12 +151,6 @@ proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
@@ -190,12 +166,6 @@ proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc continue_to_vfork {} {
@@ -229,12 +199,6 @@ proc_with_prefix tcatch_vfork_then_parent_follow { binfile srcfile } {
 	    exp_continue
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
@@ -261,12 +225,6 @@ proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
 	    pass "$gdb_test_name (followed exec)"
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
@@ -291,12 +249,6 @@ proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
 	    exp_continue
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_relations_in_info_inferiors { variant binfile srcfile non_stop } {
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH 8/8] gdb: additional debug output in infrun.c and linux-nat.c
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (6 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-06-22 13:17 ` Andrew Burgess
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-06-22 13:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

While working on some of the recent patches relating to vfork handling
I found this debug output very helpful, I'd like to propose adding
this into GDB.

With debug turned off there should be no user visible changes after
this commit.
---
 gdb/infrun.c    | 26 ++++++++++++++++++++++----
 gdb/linux-nat.c | 23 +++++++++++++++++++----
 2 files changed, 41 insertions(+), 8 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 3eb3b290001..52de4920661 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -435,6 +435,11 @@ show_follow_fork_mode_string (struct ui_file *file, int from_tty,
 static bool
 follow_fork_inferior (bool follow_child, bool detach_fork)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
+  infrun_debug_printf ("follow_child = %d, detach_fork = %d",
+		       follow_child, detach_fork);
+
   target_waitkind fork_kind = inferior_thread ()->pending_follow.kind ();
   gdb_assert (fork_kind == TARGET_WAITKIND_FORKED
 	      || fork_kind == TARGET_WAITKIND_VFORKED);
@@ -543,6 +548,13 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	  parent_inf->thread_waiting_for_vfork_done
 	    = detach_fork ? inferior_thread () : nullptr;
 	  parent_inf->pspace->breakpoints_not_allowed = detach_fork;
+
+	  infrun_debug_printf
+	    ("parent_inf->thread_waiting_for_vfork_done == %s",
+	     (parent_inf->thread_waiting_for_vfork_done == nullptr
+	      ? "nullptr"
+	      : (parent_inf->thread_waiting_for_vfork_done
+		 ->ptid.to_string ().c_str ())));
 	}
     }
   else
@@ -727,6 +739,8 @@ set_last_target_status_stopped (thread_info *tp)
 static bool
 follow_fork ()
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   bool follow_child = (follow_fork_mode_string == follow_fork_mode_child);
   bool should_resume = true;
 
@@ -996,6 +1010,8 @@ proceed_after_vfork_done (thread_info *thread)
 static void
 handle_vfork_child_exec_or_exit (int exec)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   struct inferior *inf = current_inferior ();
 
   if (inf->vfork_parent)
@@ -1127,6 +1143,8 @@ handle_vfork_child_exec_or_exit (int exec)
 static void
 handle_vfork_done (thread_info *event_thread)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   /* We only care about this event if inferior::thread_waiting_for_vfork_done is
      set, that is if we are waiting for a vfork child not under our control
      (because we detached it) to exec or exit.
@@ -1142,8 +1160,6 @@ handle_vfork_done (thread_info *event_thread)
       return;
     }
 
-  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
-
   /* We stopped all threads (other than the vforking thread) of the inferior in
      follow_fork and kept them stopped until now.  It should therefore not be
      possible for another thread to have reported a vfork during that window.
@@ -3456,8 +3472,10 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   if (!cur_thr->control.in_infcall)
     set_running (resume_target, resume_ptid, true);
 
-  infrun_debug_printf ("addr=%s, signal=%s", paddress (gdbarch, addr),
-		       gdb_signal_to_symbol_string (siggnal));
+  infrun_debug_printf ("addr=%s, signal=%s, resume_ptid=%s",
+		       paddress (gdbarch, addr),
+		       gdb_signal_to_symbol_string (siggnal),
+		       resume_ptid.to_string ().c_str ());
 
   annotate_starting ();
 
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 7e121b7ab41..87152621dd6 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1293,6 +1293,11 @@ get_detach_signal (struct lwp_info *lp)
 static void
 detach_one_lwp (struct lwp_info *lp, int *signo_p)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
+  linux_nat_debug_printf ("lwp %s (stopped = %d)",
+			  lp->ptid.to_string ().c_str (), lp->stopped);
+
   int lwpid = lp->ptid.lwp ();
   int signo;
 
@@ -1363,6 +1368,10 @@ detach_one_lwp (struct lwp_info *lp, int *signo_p)
   else
     signo = *signo_p;
 
+  linux_nat_debug_printf ("preparing to resume lwp %s (stopped = %d)",
+			  lp->ptid.to_string ().c_str (),
+			  lp->stopped);
+
   /* Preparing to resume may try to write registers, and fail if the
      lwp is zombie.  If that happens, ignore the error.  We'll handle
      it below, when detach fails with ESRCH.  */
@@ -1395,6 +1404,8 @@ detach_callback (struct lwp_info *lp)
 void
 linux_nat_target::detach (inferior *inf, int from_tty)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   struct lwp_info *main_lwp;
   int pid = inf->pid;
 
@@ -3122,13 +3133,13 @@ static ptid_t
 linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 		  target_wait_flags target_options)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   sigset_t prev_mask;
   enum resume_kind last_resume_kind;
   struct lwp_info *lp;
   int status;
 
-  linux_nat_debug_printf ("enter");
-
   /* The first time we get here after starting a new inferior, we may
      not have added it to the LWP list yet - this is the earliest
      moment at which we know its PID.  */
@@ -3228,7 +3239,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 
       if (target_options & TARGET_WNOHANG)
 	{
-	  linux_nat_debug_printf ("exit (ignore)");
+	  linux_nat_debug_printf ("no interesting events found");
 
 	  ourstatus->set_ignore ();
 	  restore_child_signals_mask (&prev_mask);
@@ -3314,7 +3325,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  linux_nat_debug_printf ("exit");
+  linux_nat_debug_printf ("event found");
 
   restore_child_signals_mask (&prev_mask);
 
@@ -3410,6 +3421,8 @@ ptid_t
 linux_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 			target_wait_flags target_options)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   ptid_t event_ptid;
 
   linux_nat_debug_printf ("[%s], [%s]", ptid.to_string ().c_str (),
@@ -3589,6 +3602,8 @@ linux_nat_target::kill ()
 void
 linux_nat_target::mourn_inferior ()
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   int pid = inferior_ptid.pid ();
 
   purge_lwp_list (pid);
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function
  2023-06-22 13:17 ` [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
@ 2023-06-28  8:47   ` Aktemur, Tankut Baris
  2023-07-04 15:24     ` Andrew Burgess
  0 siblings, 1 reply; 32+ messages in thread
From: Aktemur, Tankut Baris @ 2023-06-28  8:47 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: Mihails Strasuns, Schimpe, Christina


On Thursday, June 22, 2023 3:17 PM, Andrew Burgess wrote:
> From: Mihails Strasuns <mihails.strasuns@intel.com>
> 
> Split the thread resuming code from proceed into new function
> proceed_resume_thread_checked.
> 
> Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
> ---
>  gdb/infrun.c | 148 +++++++++++++++++++++++++++------------------------
>  1 file changed, 79 insertions(+), 69 deletions(-)
> 
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 5b0257076f0..b10b7c59652 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -3268,6 +3268,82 @@ check_multi_target_resumption (process_stratum_target *resume_target)
>      }
>  }
> 
> +/* Helper function for `proceed`.  Check if thread TP is suitable for
> +   resuming, and, if it is, switch to the thread and call
> +   `keep_going_pass_signal`.  If TP is not suitable for resuming then this
> +   function will just return without switching threads.  */
> +
> +static void
> +proceed_resume_thread_checked (thread_info *tp)
> +{
> +  if (!tp->inf->has_execution ())
> +    {
> +      infrun_debug_printf ("[%s] target has no execution",
> +			   tp->ptid.to_string ().c_str ());
> +      return;
> +    }
> +
> +  if (tp->resumed ())
> +    {
> +      infrun_debug_printf ("[%s] resumed",
> +			   tp->ptid.to_string ().c_str ());
> +      gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
> +      return;
> +    }
> +
> +  if (thread_is_in_step_over_chain (tp))
> +    {
> +      infrun_debug_printf ("[%s] needs step-over",
> +			   tp->ptid.to_string ().c_str ());
> +      return;
> +    }
> +
> +  /* When handling a vfork GDB removes all breakpoints from the program
> +     space in which the vfork is being handled, as such we must take care
> +     not to resume any thread other than the vfork parent -- resuming the
> +     vfork parent allows GDB to receive and handle the 'vfork done'
> +     event.  */
> +  if (tp->inf->thread_waiting_for_vfork_done != nullptr)
> +    {
> +      if (target_is_non_stop_p ())
> +	{
> +	  /* For non-stop targets, regardless of whether GDB is using
> +	     all-stop or non-stop mode, threads are controlled
> +	     individually.
> +
> +	     When a thread is handling a vfork, breakpoints are removed
> +	     from the inferior (well, program space in fact), so it is
> +	     critical that we don't try to resume any thread other than the
> +	     vfork parent.  */
> +	  if (tp != tp->inf->thread_waiting_for_vfork_done)
> +	    {
> +	      infrun_debug_printf ("[%s] thread %s of this inferior is "
> +				   "waiting for vfork-done",
> +				   tp->ptid.to_string ().c_str (),
> +				   tp->inf->thread_waiting_for_vfork_done
> +				     ->ptid.to_string ().c_str ());
> +	      return;
> +	    }
> +	}
> +      else
> +	{
> +	  /* For all-stop targets, when we attempt to resume the inferior,
> +	     we will in fact only resume the vfork parent thread, this is
> +	     handled in internal_resume_ptid, as such, there is nothing
> +	     more that needs doing here.  */

I wonder if here we should do

    gdb_assert (tp == tp->inf->thread_waiting_for_vfork_done);

-Baris

> +	}
> +    }
> +
> +  infrun_debug_printf ("resuming %s",
> +		       tp->ptid.to_string ().c_str ());
> +
> +  execution_control_state ecs (tp);
> +  switch_to_thread (tp);
> +  keep_going_pass_signal (&ecs);
> +  if (!ecs.wait_some_more)
> +    error (_("Command aborted."));
> +}
> +
>  /* Basic routine for continuing the program in various fashions.
> 
>     ADDR is the address to resume at, or -1 for resume where stopped.
> @@ -3456,77 +3532,11 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>  						       resume_ptid))
>  	  {
>  	    switch_to_thread_no_regs (tp);
> -
> -	    if (!tp->inf->has_execution ())
> -	      {
> -		infrun_debug_printf ("[%s] target has no execution",
> -				     tp->ptid.to_string ().c_str ());
> -		continue;
> -	      }
> -
> -	    if (tp->resumed ())
> -	      {
> -		infrun_debug_printf ("[%s] resumed",
> -				     tp->ptid.to_string ().c_str ());
> -		gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
> -		continue;
> -	      }
> -
> -	    if (thread_is_in_step_over_chain (tp))
> -	      {
> -		infrun_debug_printf ("[%s] needs step-over",
> -				     tp->ptid.to_string ().c_str ());
> -		continue;
> -	      }
> -
> -	    /* If a thread of that inferior is waiting for a vfork-done
> -	       (for a detached vfork child to exec or exit), breakpoints are
> -	       removed.  We must not resume any thread of that inferior, other
> -	       than the one waiting for the vfork-done.  */
> -	    if (tp->inf->thread_waiting_for_vfork_done != nullptr
> -		&& tp != tp->inf->thread_waiting_for_vfork_done)
> -	      {
> -		infrun_debug_printf ("[%s] another thread of this inferior is "
> -				     "waiting for vfork-done",
> -				     tp->ptid.to_string ().c_str ());
> -		continue;
> -	      }
> -
> -	    infrun_debug_printf ("resuming %s",
> -				 tp->ptid.to_string ().c_str ());
> -
> -	    execution_control_state ecs (tp);
> -	    switch_to_thread (tp);
> -	    keep_going_pass_signal (&ecs);
> -	    if (!ecs.wait_some_more)
> -	      error (_("Command aborted."));
> -	  }
> -      }
> -    else if (!cur_thr->resumed ()
> -	     && !thread_is_in_step_over_chain (cur_thr))
> -      {
> -	/* In non-stop mode, if a there exists a thread waiting for a vfork
> -	   then only allow that thread to resume (breakpoints are removed
> -	   from an inferior when handling a vfork).
> -
> -	   We check target_is_non_stop_p here rather than just checking the
> -	   non-stop flag, though these are equivalent (all-stop on a
> -	   non-stop target was handled in a previous block, so at this
> -	   point, if target_is_non_stop_p then GDB must be running on
> -	   non-stop mode).  By using target_is_non_stop_p it will be easier
> -	   to merge this block with the previous in a later commit.  */
> -	if (!(target_is_non_stop_p ()
> -	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
> -	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
> -	  {
> -	    /* The thread wasn't started, and isn't queued, run it now.  */
> -	    execution_control_state ecs (cur_thr);
> -	    switch_to_thread (cur_thr);
> -	    keep_going_pass_signal (&ecs);
> -	    if (!ecs.wait_some_more)
> -	      error (_("Command aborted."));
> +	    proceed_resume_thread_checked (tp);
>  	  }
>        }
> +    else
> +      proceed_resume_thread_checked (cur_thr);
> 
>      disable_commit_resumed.reset_and_commit ();
>    }
> --
> 2.25.4

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 0/8] Some vfork related fixes
  2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
                   ` (7 preceding siblings ...)
  2023-06-22 13:17 ` [PATCH 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
@ 2023-07-04 15:22 ` Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
                     ` (8 more replies)
  8 siblings, 9 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

Changes since v1:

  - Added an extra assert in patch #4, as suggested by Baris,

  - Updated some comments in patch #4 to better describe what's going
    on,

  - Added an extra debug printf in patch #8 which I found useful when
    checking what's actually going on (in order to change patch #4).

---

Andrew Burgess (7):
  gdb: catch more errors in gdb.base/foll-vfork.exp
  gdb: don't restart vfork parent while waiting for child to finish
  gdb: fix an issue with vfork in non-stop mode
  gdb: don't resume vfork parent while child is still running
  gdb/testsuite: expand gdb.base/foll-vfork.exp
  gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp
  gdb: additional debug output in infrun.c and linux-nat.c

Mihails Strasuns (1):
  gdb, infrun: refactor part of `proceed` into separate function

 gdb/infrun.c                          | 189 ++++++---
 gdb/linux-nat.c                       |  32 +-
 gdb/testsuite/gdb.base/foll-vfork.exp | 588 ++++++++++++--------------
 3 files changed, 412 insertions(+), 397 deletions(-)


base-commit: bb2bd584f31a25ba1cfe5bdac4d07d8cffe87c3d
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

For *reasons* I was looking at gdb.base/foll-vfork.exp.  This test
script has a proc 'setup_gdb' that could (potentially) fail.  The
setup_gdb proc is called from many places and I, initially, didn't
think that we were checking if setup_gdb had failed or not.

My confusion was because I didn't understand what this tcl construct
did:

  return -code return

this will actually act as a return in the context of a proc's caller,
effectively returning two levels of the call stack.  Neat (I guess).

So it turns out my worries were misplaced, everywhere setup_gdb is
called, if setup_gdb fails then we will (magically) return.

However, I did spot one place where we managed to confuse ourselves
with our cleverness.

In check_vfork_catchpoints, this proc is called to check that GDB is
able to catch vforks or not.  This proc is called early in the test
script, and the intention is that, should this proc fail, we'll mark
the whole test script as unsupported and then move onto the next test
script.

However, check_vfork_catchpoints also uses setup_gdb, and so, if that
call to setup_gdb fails we'll end up returning immediately from
check_vfork_catchpoints, and then continue with the test of _this_
test script, which is not correct.

To fix this I see two choices, one is remove the use of 'return -code
return' from setup_gdb, however, this would require every use of
setup_gdb to then be placed inside:

  if { ![setup_gdb] } {
    return
  }

Or, I can wrap the one call to setup_gdb in check_vfork_catchpoints
and check its return code.

I chose the second option as this is the smaller code change.

There should be no change in what is tested after this commit.
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index 313afa06324..bdceff0f5de 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -70,9 +70,15 @@ proc setup_gdb {} {

 proc check_vfork_catchpoints {} {
   global gdb_prompt
-  global has_vfork_catchpoints

-  setup_gdb
+  # Because setup_gdb uses 'return -code return' which would return to
+  # our caller we need to wrap this call, spot when setup_gdb failed
+  # (with return code 2), and then issue our own 'return -code return'.
+  set code [catch {setup_gdb} string]
+  if { $code == 2 } {
+    unsupported "vfork catchpoints"
+    return -code return
+  }

   # Verify that the system supports "catch vfork".
   gdb_test "catch vfork" "Catchpoint \[0-9\]* \\(vfork\\)" "insert first vfork catchpoint"
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-05 10:08     ` Aktemur, Tankut Baris
  2023-07-04 15:22   ` [PATCHv2 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
                     ` (6 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

While working on a later patch, which changes gdb.base/foll-vfork.exp,
I noticed that sometimes I would hit this assert:

  x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.

I eventually tracked it down to a combination of schedule-multiple
mode being on, target-non-stop being off, follow-fork-mode being set
to child, and some bad timing.  The failing case is pretty simple, a
single threaded application performs a vfork, the child process then
execs some other application while the parent process (once the vfork
child has completed its exec) just exits.  As best I understand
things, here's what happens when things go wrong:

  1. The parent process performs a vfork, GDB sees the VFORKED event
  and creates an inferior and thread for the vfork child,

  2. GDB resumes the vfork child process.  As schedule-multiple is on
  and target-non-stop is off, this is translated into a request to
  start all processes (see user_visible_resume_ptid),

  3. In the linux-nat layer we spot that one of the threads we are
  about to start is a vfork parent, and so don't start that
  thread (see resume_lwp), the vfork child thread is resumed,

  4. GDB waits for the next event, eventually entering
  linux_nat_target::wait, which in turn calls linux_nat_wait_1,

  5. In linux_nat_wait_1 we eventually call
  resume_stopped_resumed_lwps, this should restart threads that have
  stopped but don't actually have anything interesting to report.

  6. Unfortunately, resume_stopped_resumed_lwps doesn't check for
  vfork parents like resume_lwp does, so at this point the vfork
  parent is resumed.  This feels like the start of the bug, and this
  is where I'm proposing to fix things, but, resuming the vfork parent
  isn't the worst thing in the world because....

  7. As the vfork child is still alive the kernel holds the vfork
  parent stopped,

  8. Eventually the child performs its exec and GDB is sent and EXECD
  event.  However, because the parent is resumed, as soon as the child
  performs its exec the vfork parent also sends a VFORK_DONE event to
  GDB,

  9. Depending on timing both of these events might seem to arrive in
  GDB at the same time.  Normally GDB expects to see the EXECD or
  EXITED/SIGNALED event from the vfork child before getting the
  VFORK_DONE in the parent.  We know this because it is as a result of
  the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
  handle_vfork_child_exec_or_exit for details).  Further the comment
  in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
  when we remain attached to the child (not the parent) we should not
  expect to see a VFORK_DONE,

  10. If both events arrive at the same time then GDB will randomly
  choose one event to handle first, in some cases this will be the
  VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
  expects that (a) the vfork child has finished, however, in this case
  this is not completely true, the child has finished, but GDB has not
  processed the event associated with the completion yet, and (b) upon
  seeing a VFORK_DONE GDB assumes we are remaining attached to the
  parent, and so resumes the parent process,

  11. GDB now handles the EXECD event.  In our case we are detaching
  from the parent, so GDB calls target_detach (see
  handle_vfork_child_exec_or_exit),

  12. While this has been going on the vfork parent is executing, and
  might even exit,

  13. In linux_nat_target::detach the first thing we do is stop all
  threads in the process we're detaching from, the result of the stop
  request will be cached on the lwp_info object,

  14. In our case the vfork parent has exited though, so when GDB
  waits for the thread, instead of a stop due to signal, we instead
  get a thread exited status,

  15. Later in the detach process we try to resume the threads just
  prior to making the ptrace call to actually detach (see
  detach_one_lwp), as part of the process to resume a thread we try to
  touch some registers within the thread, and before doing this GDB
  asserts that the thread is stopped,

  16. An exited thread is not classified as stopped, and so the assert
  triggers!

So there's two bugs I see here.  The first, and most critical one here
is in step #6.  I think that resume_stopped_resumed_lwps should not
resume a vfork parent, just like resume_lwp doesn't resume a vfork
parent.

With this change in place the vfork parent will remain stopped in step
instead GDB will only see the EXECD/EXITED/SIGNALLED event.  The
problems in #9 and #10 are therefore skipped and we arrive at #11,
handling the EXECD event.  As the parent is still stopped #12 doesn't
apply, and in #13 when we try to stop the process we will see that it
is already stopped, there's no risk of the vfork parent exiting before
we get to this point.  And finally, in #15 we are safe to poke the
process registers because it will not have exited by this point.

However, I did mention two bugs.

The second bug I've not yet managed to actually trigger, but I'm
convinced it must exist: if we forget vforks for a moment, in step #13
above, when linux_nat_target::detach is called, we first try to stop
all threads in the process GDB is detaching from.  If we imagine a
multi-threaded inferior with many threads, and GDB running in non-stop
mode, then, if the user tries to detach there is a chance that thread
could exit just as linux_nat_target::detach is entered, in which case
we should be able to trigger the same assert.

But, like I said, I've not (yet) managed to trigger this second bug,
and even if I could, the fix would not belong in this commit, so I'm
pointing this out just for completeness.

There's no test included in this commit.  In a couple of commits time
I will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.  Unfortunately there are at least two other bugs (separate
from the ones discussed above) that need fixing first, these will be
fixed in the next commits before the gdb.base/foll-vfork.exp test is
expanded.

If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive.  I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
---
 gdb/linux-nat.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 383ef58fa23..7e121b7ab41 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -3346,7 +3346,14 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 static int
 resume_stopped_resumed_lwps (struct lwp_info *lp, const ptid_t wait_ptid)
 {
-  if (!lp->stopped)
+  struct inferior *inf = find_inferior_ptid (linux_target, lp->ptid);
+
+  if (inf->vfork_child != nullptr)
+    {
+      linux_nat_debug_printf ("NOT resuming LWP %s (vfork parent)",
+			      lp->ptid.to_string ().c_str ());
+    }
+  else if (!lp->stopped)
     {
       linux_nat_debug_printf ("NOT resuming LWP %s, not stopped",
 			      lp->ptid.to_string ().c_str ());
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 3/8] gdb: fix an issue with vfork in non-stop mode
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

This commit fixes a bug introduced by this commit:

  commit d8bbae6ea080249c05ca90b1f8640fde48a18301
  Date:   Fri Jan 14 15:40:59 2022 -0500

      gdb: fix handling of vfork by multi-threaded program (follow-fork-mode=parent, detach-on-fork=on)

The problem can be seen in this GDB session:

  $ gdb -q
  (gdb) set non-stop on
  (gdb) file ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork
  Reading symbols from ./gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork...
  (gdb) tcatch vfork
  Catchpoint 1 (vfork)
  (gdb) run
  Starting program: /tmp/gdb/testsuite/outputs/gdb.base/foll-vfork/foll-vfork

  Temporary catchpoint 1 (vforked process 1375914), 0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  (gdb) bt
  #0  0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  #1  0x00000000004011af in main (argc=1, argv=0x7fffffffad88) at .../gdb/testsuite/gdb.base/foll-vfork.c:32
  (gdb) finish
  Run till exit from #0  0x00007ffff7d5043c in vfork () from /lib64/libc.so.6
  [Detaching after vfork from child process 1375914]
  No unwaited-for children left.
  (gdb)

Notice the "No unwaited-for children left." error.  This is incorrect,
given where we are stopped there's no reason why we shouldn't be able
to use "finish" to return to the main frame.

When the inferior is stopped as a result of the 'tcatch vfork', the
inferior is in the process of performing the vfork, that is, GDB has
seen the VFORKED event, but has not yet attached to the new child
process, nor has the child process been resumed.

However, GDB has seen the VFORKED, and, as we are going to follow the
parent process, the inferior for the vfork parent will have its
thread_waiting_for_vfork_done member variable set, this will point to
the one and only thread that makes up the vfork parent process.

When the "finish" command is used GDB eventually ends up in the
proceed function (in infrun.c), in here we pass through all the
function until we eventually encounter this 'else if' condition:

   else if (!cur_thr->resumed ()
	     && !thread_is_in_step_over_chain (cur_thr)
	     /* In non-stop, forbid resuming a thread if some other thread of
		that inferior is waiting for a vfork-done event (this means
		breakpoints are out for this inferior).  */
	     && !(non_stop
		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
      {

The first two of these conditions will both be true, the thread is not
already resumed, and is not in the step-over chain, however, the third
condition, this one:

	     && !(non_stop
		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))

is false, and this prevents the thread we are trying to finish from
being resumed.  This condition is false because (a) non_stop is true,
and (b) cur_thr->inf->thread_waiting_for_vfork_done is not
nullptr (see above for why).

Now, if we check the comment embedded within the condition it says:

     /* In non-stop, forbid resuming a thread if some other thread of
        that inferior is waiting for a vfork-done event (this means
        breakpoints are out for this inferior).  */

And this makes sense, if we have a vfork parent with two thread, and
one thread has performed a vfork, then we shouldn't try to resume the
second thread.

However, if we are trying to resume the thread that actually performed
a vfork, then this is fine.  If we never resume the vfork parent then
we'll never get a VFORK_DONE event, and so the vfork will never
complete.

Thus, the condition should actually be:

     && !(non_stop
	  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
	  && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))

This extra check will allow the vfork parent thread to resume, but
prevent any other thread in the vfork parent process from resuming.
This is the same condition that already exists in the all-stop on a
non-stop-target block earlier in the proceed function.

My actual fix is slightly different to the above, first, I've chosen
to use a nested 'if' check instead of extending the original 'else if'
check, this makes it easier to write a longer comment explaining
what's going on, and second, instead of checking 'non_stop' I've
switched to checking 'target_is_non_stop_p'.  In this context this is
effectively the same thing, a previous 'else if' block in proceed
already handles '!non_stop && target_is_non_stop_p ()', so by the time
we get here, if 'target_is_non_stop_p ()' then we must be running in
non_stop mode.

Both of these tweaks will make the next patch easier, which is a
refactor to merge two parts of the proceed function, so this nested
'if' block is not going to exist for long.

For testing, there is no test included with this commit.  The test was
exposed when using a modified version of the gdb.base/foll-vfork.exp
test script, however, there are other bugs that are exposed when using
the modified test script.  These bugs will be addressed in subsequent
commits, and then I'll add the updated gdb.base/foll-vfork.exp.

If you wish to reproduce this failure then grab the updates to
gdb.base/foll-vfork.exp from the later commit and run this test, the
failure is always reproducible.
---
 gdb/infrun.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 58da1cef29e..5b0257076f0 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3503,19 +3503,29 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 	  }
       }
     else if (!cur_thr->resumed ()
-	     && !thread_is_in_step_over_chain (cur_thr)
-	     /* In non-stop, forbid resuming a thread if some other thread of
-		that inferior is waiting for a vfork-done event (this means
-		breakpoints are out for this inferior).  */
-	     && !(non_stop
-		  && cur_thr->inf->thread_waiting_for_vfork_done != nullptr))
+	     && !thread_is_in_step_over_chain (cur_thr))
       {
-	/* The thread wasn't started, and isn't queued, run it now.  */
-	execution_control_state ecs (cur_thr);
-	switch_to_thread (cur_thr);
-	keep_going_pass_signal (&ecs);
-	if (!ecs.wait_some_more)
-	  error (_("Command aborted."));
+	/* In non-stop mode, if a there exists a thread waiting for a vfork
+	   then only allow that thread to resume (breakpoints are removed
+	   from an inferior when handling a vfork).
+
+	   We check target_is_non_stop_p here rather than just checking the
+	   non-stop flag, though these are equivalent (all-stop on a
+	   non-stop target was handled in a previous block, so at this
+	   point, if target_is_non_stop_p then GDB must be running on
+	   non-stop mode).  By using target_is_non_stop_p it will be easier
+	   to merge this block with the previous in a later commit.  */
+	if (!(target_is_non_stop_p ()
+	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
+	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
+	  {
+	    /* The thread wasn't started, and isn't queued, run it now.  */
+	    execution_control_state ecs (cur_thr);
+	    switch_to_thread (cur_thr);
+	    keep_going_pass_signal (&ecs);
+	    if (!ecs.wait_some_more)
+	      error (_("Command aborted."));
+	  }
       }

     disable_commit_resumed.reset_and_commit ();
-- 
2.25.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 4/8] gdb, infrun: refactor part of `proceed` into separate function
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (2 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Mihails Strasuns, tankut.baris.aktemur, Christina Schimpe

From: Mihails Strasuns <mihails.strasuns@intel.com>

Split the thread resuming code from proceed into new function
proceed_resume_thread_checked.

Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
---
 gdb/infrun.c | 155 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 86 insertions(+), 69 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 5b0257076f0..010fcd7952f 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3268,6 +3268,89 @@ check_multi_target_resumption (process_stratum_target *resume_target)
     }
 }
 
+/* Helper function for `proceed`.  Check if thread TP is suitable for
+   resuming, and, if it is, switch to the thread and call
+   `keep_going_pass_signal`.  If TP is not suitable for resuming then this
+   function will just return without switching threads.  */
+
+static void
+proceed_resume_thread_checked (thread_info *tp)
+{
+  if (!tp->inf->has_execution ())
+    {
+      infrun_debug_printf ("[%s] target has no execution",
+			   tp->ptid.to_string ().c_str ());
+      return;
+    }
+
+  if (tp->resumed ())
+    {
+      infrun_debug_printf ("[%s] resumed",
+			   tp->ptid.to_string ().c_str ());
+      gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
+      return;
+    }
+
+  if (thread_is_in_step_over_chain (tp))
+    {
+      infrun_debug_printf ("[%s] needs step-over",
+			   tp->ptid.to_string ().c_str ());
+      return;
+    }
+
+  /* When handling a vfork GDB removes all breakpoints from the program
+     space in which the vfork is being handled, as such we must take care
+     not to resume any thread other than the vfork parent -- resuming the
+     vfork parent allows GDB to receive and handle the 'vfork done'
+     event.  */
+  if (tp->inf->thread_waiting_for_vfork_done != nullptr)
+    {
+      if (target_is_non_stop_p ())
+	{
+	  /* For non-stop targets, regardless of whether GDB is using
+	     all-stop or non-stop mode, threads are controlled
+	     individually.
+
+	     When a thread is handling a vfork, breakpoints are removed
+	     from the inferior (well, program space in fact), so it is
+	     critical that we don't try to resume any thread other than the
+	     vfork parent.  */
+	  if (tp != tp->inf->thread_waiting_for_vfork_done)
+	    {
+	      infrun_debug_printf ("[%s] thread %s of this inferior is "
+				   "waiting for vfork-done",
+				   tp->ptid.to_string ().c_str (),
+				   tp->inf->thread_waiting_for_vfork_done
+				     ->ptid.to_string ().c_str ());
+	      return;
+	    }
+	}
+      else
+	{
+	  /* For all-stop targets, when we attempt to resume the inferior,
+	     we will only resume the vfork parent thread, this is handled
+	     in internal_resume_ptid.
+
+	     Additionally, we will always be called with the vfork parent
+	     thread as the current thread (TP) thanks to follow_fork, as
+	     such the following assertion should hold.
+
+	     Beyond this there is nothing more that needs to be done
+	     here.  */
+	  gdb_assert (tp == tp->inf->thread_waiting_for_vfork_done);
+	}
+    }
+
+  infrun_debug_printf ("resuming %s",
+		       tp->ptid.to_string ().c_str ());
+
+  execution_control_state ecs (tp);
+  switch_to_thread (tp);
+  keep_going_pass_signal (&ecs);
+  if (!ecs.wait_some_more)
+    error (_("Command aborted."));
+}
+
 /* Basic routine for continuing the program in various fashions.
 
    ADDR is the address to resume at, or -1 for resume where stopped.
@@ -3456,77 +3539,11 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 						       resume_ptid))
 	  {
 	    switch_to_thread_no_regs (tp);
-
-	    if (!tp->inf->has_execution ())
-	      {
-		infrun_debug_printf ("[%s] target has no execution",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    if (tp->resumed ())
-	      {
-		infrun_debug_printf ("[%s] resumed",
-				     tp->ptid.to_string ().c_str ());
-		gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
-		continue;
-	      }
-
-	    if (thread_is_in_step_over_chain (tp))
-	      {
-		infrun_debug_printf ("[%s] needs step-over",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    /* If a thread of that inferior is waiting for a vfork-done
-	       (for a detached vfork child to exec or exit), breakpoints are
-	       removed.  We must not resume any thread of that inferior, other
-	       than the one waiting for the vfork-done.  */
-	    if (tp->inf->thread_waiting_for_vfork_done != nullptr
-		&& tp != tp->inf->thread_waiting_for_vfork_done)
-	      {
-		infrun_debug_printf ("[%s] another thread of this inferior is "
-				     "waiting for vfork-done",
-				     tp->ptid.to_string ().c_str ());
-		continue;
-	      }
-
-	    infrun_debug_printf ("resuming %s",
-				 tp->ptid.to_string ().c_str ());
-
-	    execution_control_state ecs (tp);
-	    switch_to_thread (tp);
-	    keep_going_pass_signal (&ecs);
-	    if (!ecs.wait_some_more)
-	      error (_("Command aborted."));
-	  }
-      }
-    else if (!cur_thr->resumed ()
-	     && !thread_is_in_step_over_chain (cur_thr))
-      {
-	/* In non-stop mode, if a there exists a thread waiting for a vfork
-	   then only allow that thread to resume (breakpoints are removed
-	   from an inferior when handling a vfork).
-
-	   We check target_is_non_stop_p here rather than just checking the
-	   non-stop flag, though these are equivalent (all-stop on a
-	   non-stop target was handled in a previous block, so at this
-	   point, if target_is_non_stop_p then GDB must be running on
-	   non-stop mode).  By using target_is_non_stop_p it will be easier
-	   to merge this block with the previous in a later commit.  */
-	if (!(target_is_non_stop_p ()
-	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
-	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
-	  {
-	    /* The thread wasn't started, and isn't queued, run it now.  */
-	    execution_control_state ecs (cur_thr);
-	    switch_to_thread (cur_thr);
-	    keep_going_pass_signal (&ecs);
-	    if (!ecs.wait_some_more)
-	      error (_("Command aborted."));
+	    proceed_resume_thread_checked (tp);
 	  }
       }
+    else
+      proceed_resume_thread_checked (cur_thr);
 
     disable_commit_resumed.reset_and_commit ();
   }
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (3 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-18 20:42     ` Simon Marchi
  2023-07-04 15:22   ` [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

Like the last few commit, this fixes yet another vfork related issue.
Like the commit titled:

  gdb: don't restart vfork parent while waiting for child to finish

which addressed a case in linux-nat where we would try to resume a
vfork parent, this commit addresses a very similar case, but this time
occurring in infrun.c.  Just like with that previous commit, this bug
results in the assert:

  x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.

In this case the issue occurs when target-non-stop is on, but non-stop
is off, and again, schedule-multiple is on.  As with the previous
commit, GDB is in follow-fork-mode child.  If you have not done so, it
is worth reading the earlier commit as many of the problems leading to
the failure are the same, they just appear in a different part of GDB.

Here are the steps leading to the assertion failure:

  1. The user performs a 'next' over a vfork, GDB stop in the vfork
  child,

  2. As we are planning to follow the child GDB sets the vfork_parent
  and vfork_child member variables in the two inferiors, the
  thread_waiting_for_vfork_done member is left as nullptr, that member
  is only used when GDB is planning to follow the parent inferior,

  3. The user does 'continue', our expectation is that the vfork child
  should resume, and once that process has exited or execd, GDB should
  detach from the vfork parent.  As a result of the 'continue' GDB
  eventually enters the proceed function,

  4. In proceed we selected a ptid_t to resume, because
  schedule-multiple is on we select minus_one_ptid (see
  user_visible_resume_ptid),

  5. As GDB is running in all-stop on top of non-stop mode, in the
  proceed function we iterate over all threads that match the resume
  ptid, which turns out to be all threads, and call
  proceed_resume_thread_checked.  One of the threads we iterate over
  is the vfork parent thread,

  6. As the thread passed to proceed_resume_thread_checked doesn't
  match any of the early return conditions, GDB will set the thread
  resumed,

  7. As we are resuming one thread at a time, this thread is seen by
  the lower layers (e.g. linux-nat) as the "event thread", which means
  we don't apply any of the checks, e.g. is this thread a
  vfork parent, instead we assume that GDB core knows what it's doing,
  and linux-nat will resume the thread, we have now incorrectly set
  running the vfork parent thread when this thread should be waiting
  for the vfork child to complete,

  8. Back in the proceed function GDB continues to iterate over all
  threads, and now (correctly) resumes the vfork child thread,

  8. As the vfork child is still alive the kernel holds the vfork
  parent stopped,

  9. Eventually the child performs its exec and GDB is sent and EXECD
  event.  However, because the parent is resumed, as soon as the child
  performs its exec the vfork parent also sends a VFORK_DONE event to
  GDB,

  10. Depending on timing both of these events might seem to arrive in
  GDB at the same time.  Normally GDB expects to see the EXECD or
  EXITED/SIGNALED event from the vfork child before getting the
  VFORK_DONE in the parent.  We know this because it is as a result of
  the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
  handle_vfork_child_exec_or_exit for details).  Further the comment
  in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
  when we remain attached to the child (not the parent) we should not
  expect to see a VFORK_DONE,

  11. If both events arrive at the same time then GDB will randomly
  choose one event to handle first, in some cases this will be the
  VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
  expects that (a) the vfork child has finished, however, in this case
  this is not completely true, the child has finished, but GDB has not
  processed the event associated with the completion yet, and (b) upon
  seeing a VFORK_DONE GDB assumes we are remaining attached to the
  parent, and so resumes the parent process,

  12. GDB now handles the EXECD event.  In our case we are detaching
  from the parent, so GDB calls target_detach (see
  handle_vfork_child_exec_or_exit),

  13. While this has been going on the vfork parent is executing, and
  might even exit,

  14. In linux_nat_target::detach the first thing we do is stop all
  threads in the process we're detaching from, the result of the stop
  request will be cached on the lwp_info object,

  15. In our case the vfork parent has exited though, so when GDB
  waits for the thread, instead of a stop due to signal, we instead
  get a thread exited status,

  16. Later in the detach process we try to resume the threads just
  prior to making the ptrace call to actually detach (see
  detach_one_lwp), as part of the process to resume a thread we try to
  touch some registers within the thread, and before doing this GDB
  asserts that the thread is stopped,

  17. An exited thread is not classified as stopped, and so the assert
  triggers!

Just like with the earlier commit, the fix is to spot the vfork parent
status of the thread, and not resume such threads.  Where the earlier
commit fixed this in linux-nat, in this case I think the fix should
live in infrun.c, in proceed_resume_thread_checked.  This function
already has a similar check to not resume the vfork parent in the case
where we are planning to follow the vfork parent, I propose adding a
similar case that checks for the vfork parent when we plan to follow
the vfork child.

This new check will mean that at step #6 above GDB doesn't try to
resume the vfork parent thread, which prevents the VFORK_DONE from
ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
the vfork child GDB will detach from the parent.

There's no test included in this commit.  In a subsequent commit I
will expand gdb.base/foll-vfork.exp which is when this bug would be
exposed.

If you do want to reproduce this failure then you will for certainly
need to run the gdb.base/foll-vfork.exp test in a loop as the failures
are all very timing sensitive.  I've found that running multiple
copies in parallel makes the failure more likely to appear, I usually
run ~6 copies in parallel and expect to see a failure after within
10mins.
---
 gdb/infrun.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 010fcd7952f..2d2f7d67a0f 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -3299,10 +3299,12 @@ proceed_resume_thread_checked (thread_info *tp)
     }
 
   /* When handling a vfork GDB removes all breakpoints from the program
-     space in which the vfork is being handled, as such we must take care
-     not to resume any thread other than the vfork parent -- resuming the
-     vfork parent allows GDB to receive and handle the 'vfork done'
-     event.  */
+     space in which the vfork is being handled.  If we are following the
+     parent then GDB will set the thread_waiting_for_vfork_done member of
+     the parent inferior.  In this case we should take care to only resume
+     the vfork parent thread, the kernel will hold this thread suspended
+     until the vfork child has exited or execd, at which point the parent
+     will be resumed and a VFORK_DONE event sent to GDB.  */
   if (tp->inf->thread_waiting_for_vfork_done != nullptr)
     {
       if (target_is_non_stop_p ())
@@ -3341,6 +3343,20 @@ proceed_resume_thread_checked (thread_info *tp)
 	}
     }
 
+  /* When handling a vfork GDB removes all breakpoints from the program
+     space in which the vfork is being handled.  If we are following the
+     child then GDB will set vfork_child member of the vfork parent
+     inferior.  Once the child has either exited or execd then GDB will
+     detach from the parent process.  Until that point GDB should not
+     resume any thread in the parent process.  */
+  if (tp->inf->vfork_child != nullptr)
+    {
+      infrun_debug_printf ("[%s] thread is part of a vfork parent, child is %d",
+			   tp->ptid.to_string ().c_str (),
+			   tp->inf->vfork_child->pid);
+      return;
+    }
+
   infrun_debug_printf ("resuming %s",
 		       tp->ptid.to_string ().c_str ());
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (4 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-05 11:22     ` Aktemur, Tankut Baris
  2023-07-04 15:22   ` [PATCHv2 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
                     ` (2 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

This commit provides tests for all of the bugs fixed in the previous
four commits, this is achieved by expanding gdb.base/foll-vfork.exp to
run with different configurations:

  * target-non-stop on/off
  * non-stop on/off
  * schedule-multiple on/off

We don't test with schedule-multiple on if we are using a remote
target, this is due to bug gdb/30574.  I've added a link to that bug
in this commit, but all this commit does is expose that bug, there's
no fixes here.

Some of the bugs fixed in the previous commits are very timing
dependent, as such, they don't always show up.  I've had more success
when running this test on a very loaded machine -- I usually run ~8
copies of the test in parallel, then the bugs would normally show up
pretty quickly.

Other than running the test in more configurations, I've not made any
changes to what is actually being tested, other than in one place
where, when testing with non-stop mode, GDB stops in a different
inferior, as such I had to add a new 'inferior 2' call, this can be
found in vfork_relations_in_info_inferiors.

I have cleaned things up a little, for example, making use of
proc_with_prefix to remove some with_test_prefix calls.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30574
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 628 ++++++++++++--------------
 1 file changed, 300 insertions(+), 328 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index bdceff0f5de..be0715b05c0 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -25,56 +25,52 @@ if {![istarget "*-linux*"]} {
     continue
 }
 
-standard_testfile
+standard_testfile .c -exit.c vforked-prog.c
 
-set compile_options debug
+set binfile $testfile
+set binfile2 ${testfile}-exit
+set binfile3 vforked-prog
 
-if {[build_executable $testfile.exp $testfile $srcfile $compile_options] == -1} {
+if {[build_executable "compile $binfile" $binfile $srcfile] == -1} {
     untested "failed to compile main testcase"
     return -1
 }
 
-set testfile2 "vforked-prog"
-set srcfile2 ${testfile2}.c
+if {[build_executable "compile $binfile2" $binfile2 $srcfile2] == -1} {
+    untested "failed to compile main testcase"
+    return -1
+}
 
-if {[build_executable $testfile.exp $testfile2 $srcfile2 $compile_options] == -1} {
-    untested "failed to compile secondary testcase"
+if {[build_executable "compile $binfile3" $binfile3 $srcfile3] == -1} {
+    untested "failed to compile main testcase"
     return -1
 }
 
+# If required, download the program that we exec after vfork to the
+# remote target.
 if { [is_remote target] } {
-    gdb_remote_download target [standard_output_file $testfile2]
+    gdb_remote_download target [standard_output_file $binfile3]
 }
 
-# A few of these tests require a little more time than the standard
-# timeout allows.
-set oldtimeout $timeout
-set timeout [expr "$timeout + 10"]
-
 # Start with a fresh GDB, with verbosity enabled, and run to main.  On
 # error, behave as "return", so we don't try to continue testing with
 # a borked session.
-proc setup_gdb {} {
-    global testfile srcfile
-
-    clean_restart $testfile
+proc setup_gdb { binfile srcfile } {
+    clean_restart $binfile
 
     if ![runto_main] {
 	return -code return
     }
 
-    set tbreak_line [gdb_get_line_number " VFORK " $srcfile]
-    gdb_test "tbreak ${tbreak_line}"
-    gdb_continue_to_breakpoint ".*"
+    gdb_breakpoint [gdb_get_line_number " VFORK " $srcfile] temporary
+    gdb_continue_to_breakpoint "at VFORK"
 }
 
 proc check_vfork_catchpoints {} {
-  global gdb_prompt
-
   # Because setup_gdb uses 'return -code return' which would return to
   # our caller we need to wrap this call, spot when setup_gdb failed
   # (with return code 2), and then issue our own 'return -code return'.
-  set code [catch {setup_gdb} string]
+  set code [catch {setup_gdb $::testfile $::srcfile} string]
   if { $code == 2 } {
     unsupported "vfork catchpoints"
     return -code return
@@ -84,10 +80,10 @@ proc check_vfork_catchpoints {} {
   gdb_test "catch vfork" "Catchpoint \[0-9\]* \\(vfork\\)" "insert first vfork catchpoint"
   set has_vfork_catchpoints 0
   gdb_test_multiple "continue" "continue to first vfork catchpoint" {
-    -re ".*Your system does not support this type\r\nof catchpoint.*$gdb_prompt $" {
+    -re -wrap ".*Your system does not support this type\r\nof catchpoint.*" {
       unsupported "continue to first vfork catchpoint"
     }
-    -re ".*Catchpoint.*$gdb_prompt $" {
+    -re -wrap ".*Catchpoint.*" {
       set has_vfork_catchpoints 1
       pass "continue to first vfork catchpoint"
     }
@@ -99,319 +95,278 @@ proc check_vfork_catchpoints {} {
   }
 }
 
-proc vfork_parent_follow_through_step {} {
-  with_test_prefix "vfork parent follow, through step" {
-   global gdb_prompt
+proc_with_prefix vfork_parent_follow_through_step { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-   setup_gdb
+    gdb_test_no_output "set follow-fork parent"
 
-   gdb_test_no_output "set follow-fork parent"
+    gdb_test_multiple "next" "" {
+	-re -wrap "\\\[Detaching after vfork from.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
 
-   set test "step"
-   gdb_test_multiple "next" $test {
-       -re "\\\[Detaching after vfork from.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass $test
-       }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_parent_follow_to_bp {} {
-  with_test_prefix "vfork parent follow, to bp" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork parent"
-
-   set bp_location [gdb_get_line_number "I'm the proud parent of child"]
-   gdb_test "break ${srcfile}:${bp_location}" ".*" "break, vfork to bp"
-
-   set test "continue to bp"
-   gdb_test_multiple "continue" $test {
-       -re ".*\\\[Detaching after vfork from child process.*Breakpoint.*${bp_location}.*$gdb_prompt " {
-	   pass $test
-       }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_child_follow_to_exit {} {
-  with_test_prefix "vfork child follow, to exit" {
-   global gdb_prompt
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   set test "continue to child exit"
-   gdb_test_multiple "continue" $test {
-      -re "Couldn't get registers.*$gdb_prompt " {
-	  # PR gdb/14766
-	  fail "$test"
-      }
-       -re "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent .* after child exit.*$gdb_prompt " {
-	  pass $test
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
-proc vfork_and_exec_child_follow_to_main_bp {} {
-  with_test_prefix "vfork and exec child follow, to main bp" {
-   global gdb_prompt
-   global srcfile2
+proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-   setup_gdb
+    gdb_test_no_output "set follow-fork parent"
 
-   gdb_test_no_output "set follow-fork child"
+    set bp_location \
+	[gdb_get_line_number "I'm the proud parent of child" $srcfile]
+    gdb_test "break ${srcfile}:${bp_location}" ".*" "break, vfork to bp"
 
-   set linenum [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
+    gdb_test_multiple "continue" "continue to bp" {
+	-re -wrap ".*\\\[Detaching after vfork from child process.*Breakpoint.*${bp_location}.*" {
+	    pass $gdb_test_name
+	}
+    }
 
-   set test "continue to bp"
-   gdb_test_multiple "continue" $test {
-      -re "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent.*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*$gdb_prompt " {
-	  pass $test
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any gdb_expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc vfork_and_exec_child_follow_through_step {} {
-  with_test_prefix "vfork and exec child follow, through step" {
-   global gdb_prompt
-   global srcfile2
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   set test "step over vfork"
-
-   # The ideal support is to be able to debug the child even
-   # before it execs.  Thus, "next" lands on the next line after
-   # the vfork.
-   gdb_test_multiple "next" $test {
-       -re "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass "$test"
-       }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test_multiple "continue" "continue to child exit" {
+	-re -wrap "Couldn't get registers.*" {
+	    # PR gdb/14766
+	    fail $gdb_test_name
+	}
+	-re -wrap "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent .* after child exit.*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    set linenum [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+
+    gdb_test_multiple "continue" "continue to bp" {
+	-re -wrap "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent.*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any gdb_expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    # The ideal support is to be able to debug the child even
+    # before it execs.  Thus, "next" lands on the next line after
+    # the vfork.
+    gdb_test_multiple "next" "next over vfork" {
+	-re -wrap "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
 proc continue_to_vfork {} {
-   global gdb_prompt
-
-   # A vfork catchpoint may stop in either "vfork" or "_vfork".
-   set test "continue to vfork"
-   gdb_test_multiple "continue" $test {
-      -re "vfork \\(\\) at .*$gdb_prompt $" {
-	  pass $test
-      }
-      -re "0x\[0-9a-fA-F\]*.*(vfork|__kernel_v?syscall).*$gdb_prompt " {
-	  pass $test
-      }
-   }
+    # A vfork catchpoint may stop in either "vfork" or "_vfork".
+    gdb_test_multiple "continue" "continue to vfork" {
+	-re -wrap "vfork \\(\\) at .*" {
+	    pass $gdb_test_name
+	}
+	-re -wrap "0x\[0-9a-fA-F\]*.*(vfork|__kernel_v?syscall).*" {
+	    pass $gdb_test_name
+	}
+    }
 }
 
-proc tcatch_vfork_then_parent_follow {} {
-  with_test_prefix "vfork parent follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork parent"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set linenum [gdb_get_line_number "pid = vfork ();"]
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*0x\[0-9a-fA-F\]* in main .* at .*${srcfile}:${linenum}.*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-   }
-   # The child has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc tcatch_vfork_then_child_follow_exec {} {
-  with_test_prefix "vfork child follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-   global srcfile2
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set linenum1 [gdb_get_line_number "pid = vfork ();"]
-   set linenum2 [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
-
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*${srcfile}:${linenum1}.*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-      -re "Run till exit from.*vfork.*${srcfile2}:${linenum2}.*$gdb_prompt " {
-	  pass "$test (followed exec)"
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
-
-proc tcatch_vfork_then_child_follow_exit {} {
-  with_test_prefix "vfork child follow, finish after tcatch vfork" {
-   global gdb_prompt
-   global srcfile
-
-   setup_gdb
-
-   gdb_test_no_output "set follow-fork child"
-
-   gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
-
-   continue_to_vfork
-
-   set test "finish"
-   gdb_test_multiple "finish" $test {
-      -re "Run till exit from.*vfork.*exited normally.*$gdb_prompt " {
-	  setup_kfail "gdb/14762" *-*-*
-	  fail $test
-      }
-      -re "Run till exit from.*vfork.*pid = vfork \\(\\).*$gdb_prompt " {
-	  pass $test
-      }
-      -re "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*$gdb_prompt " {
-	  send_gdb "finish\n"
-	  exp_continue
-      }
-   }
-   # The parent has been detached; allow time for any output it might
-   # generate to arrive, so that output doesn't get confused with
-   # any expected debugger output from a subsequent testpoint.
-   #
-   exec sleep 1
-}}
+proc_with_prefix tcatch_vfork_then_parent_follow { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-proc vfork_relations_in_info_inferiors { variant } {
-  with_test_prefix "vfork relations in info inferiors" {
-   global gdb_prompt
+    gdb_test_no_output "set follow-fork parent"
 
-   setup_gdb
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
 
-   gdb_test_no_output "set follow-fork child"
+    continue_to_vfork
 
-   set test "step over vfork"
-   gdb_test_multiple "next" $test {
-       -re "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*$gdb_prompt " {
-	   pass "$test"
-       }
-   }
+    set linenum [gdb_get_line_number "pid = vfork ();" $srcfile]
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*0x\[0-9a-fA-F\]* in main .* at .*${srcfile}:${linenum}.*" {
+	    pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+    }
 
-   gdb_test "info inferiors" \
-       ".*is vfork parent of inferior 2.*is vfork child of inferior 1" \
-       "info inferiors shows vfork parent/child relation"
+    # The child has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
 
-   if { $variant == "exec" } {
-       global srcfile2
+proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
+    setup_gdb $binfile $srcfile
 
-       set linenum [gdb_get_line_number "Hello from vforked-prog" ${srcfile2}]
-       set test "continue to bp"
-       gdb_test_multiple "continue" $test {
-	   -re ".*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*$gdb_prompt " {
-	       pass $test
-	   }
-       }
-   } else {
-       set test "continue to child exit"
-       gdb_test_multiple "continue" $test {
-	   -re "exited normally.*$gdb_prompt " {
-	       pass $test
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
+
+    continue_to_vfork
+
+    set linenum1 [gdb_get_line_number "pid = vfork ();" $srcfile]
+    set linenum2 [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*${srcfile}:${linenum1}.*" {
+	  pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+	-re -wrap "Run till exit from.*vfork.*${::srcfile3}:${linenum2}.*" {
+	    pass "$gdb_test_name (followed exec)"
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
+   setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test "tcatch vfork" "Catchpoint .*(vfork).*"
+
+    continue_to_vfork
+
+    gdb_test_multiple "finish" "" {
+	-re -wrap "Run till exit from.*vfork.*exited normally.*" {
+	    setup_kfail "gdb/14762" *-*-*
+	    fail $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*vfork.*pid = vfork \\(\\).*" {
+	  pass $gdb_test_name
+	}
+	-re -wrap "Run till exit from.*__kernel_v?syscall.*0x\[0-9a-fA-F\]* in vfork .*" {
+	    send_gdb "finish\n"
+	    exp_continue
+	}
+    }
+
+    # The parent has been detached; allow time for any output it might
+    # generate to arrive, so that output doesn't get confused with
+    # any expected debugger output from a subsequent testpoint.
+    #
+    exec sleep 1
+}
+
+proc_with_prefix vfork_relations_in_info_inferiors { variant binfile srcfile non_stop } {
+    setup_gdb $binfile $srcfile
+
+    gdb_test_no_output "set follow-fork child"
+
+    gdb_test_multiple "next" "next over vfork" {
+	-re -wrap "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*" {
+	    pass $gdb_test_name
+	}
+    }
+
+    if { $non_stop } {
+	gdb_test "inferior 2" ".*"
+    }
+
+    gdb_test "info inferiors" \
+	".*is vfork parent of inferior 2.*is vfork child of inferior 1" \
+	"info inferiors shows vfork parent/child relation"
+
+    if { $variant == "exec" } {
+	set linenum [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
+	gdb_test_multiple "continue" "continue to bp" {
+	    -re -wrap ".*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*" {
+		pass $gdb_test_name
+	    }
+	}
+    } else {
+	gdb_test_multiple "continue" "continue to child exit" {
+	   -re -wrap "exited normally.*" {
+	       pass $gdb_test_name
 	   }
-       }
-   }
-
-   set test "vfork relation no longer appears in info inferiors"
-   gdb_test_multiple "info inferiors" $test {
-       -re -wrap "is vfork child of inferior 1.*" {
-	   fail $test
-       }
-       -re -wrap "is vfork parent of inferior 2.*" {
-	   fail $test
-       }
-       -re -wrap "" {
-	   pass $test
-       }
-   }
-}}
-
-proc do_vfork_and_follow_parent_tests {} {
-   global gdb_prompt
+	}
+    }
 
+    gdb_test_multiple "info inferiors" "vfork relation no longer appears in info inferiors" {
+	-re -wrap "is vfork child of inferior 1.*" {
+	   fail $gdb_test_name
+	}
+	-re -wrap "is vfork parent of inferior 2.*" {
+	    fail $gdb_test_name
+	}
+	-re -wrap "" {
+	    pass $gdb_test_name
+	}
+    }
+}
+
+proc do_vfork_and_follow_parent_tests { binfile srcfile } {
    # Try following the parent process by stepping through a call to
    # vfork.  Do this without catchpoints.
-   vfork_parent_follow_through_step
+   vfork_parent_follow_through_step $binfile $srcfile
 
    # Try following the parent process by setting a breakpoint on the
    # other side of a vfork, and running to that point.  Do this
    # without catchpoints.
-   vfork_parent_follow_to_bp
+   vfork_parent_follow_to_bp $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the parent.
    #
-   tcatch_vfork_then_parent_follow
+   tcatch_vfork_then_parent_follow $binfile $srcfile
 }
 
-proc do_vfork_and_follow_child_tests_exec {} {
+proc do_vfork_and_follow_child_tests_exec { binfile srcfile non_stop } {
    # Try following the child process by just continuing through the
    # vfork, and letting the parent's breakpoint on "main" be auto-
    # magically reset in the child.
    #
-   vfork_and_exec_child_follow_to_main_bp
+   vfork_and_exec_child_follow_to_main_bp $binfile $srcfile
 
    # Try following the child process by stepping through a call to
    # vfork.  The child also executes an exec.  Since the child cannot
@@ -420,11 +375,11 @@ proc do_vfork_and_follow_child_tests_exec {} {
    # recomputed in the exec'd child, the step through a vfork should
    # land us in the "main" for the exec'd child, too.
    #
-   vfork_and_exec_child_follow_through_step
+   vfork_and_exec_child_follow_through_step $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the child.
    #
-   tcatch_vfork_then_child_follow_exec
+   tcatch_vfork_then_child_follow_exec $binfile $srcfile
 
    # Test the ability to follow both child and parent of a vfork.  Do
    # this without catchpoints.
@@ -442,25 +397,25 @@ proc do_vfork_and_follow_child_tests_exec {} {
    # and confirm the relation is no longer displayed in "info
    # inferiors".
    #
-   vfork_relations_in_info_inferiors "exec"
+   vfork_relations_in_info_inferiors "exec" $binfile $srcfile $non_stop
 }
 
-proc do_vfork_and_follow_child_tests_exit {} {
+proc do_vfork_and_follow_child_tests_exit { binfile srcfile non_stop } {
    # Try following the child process by just continuing through the
    # vfork, and letting the child exit.
    #
-   vfork_child_follow_to_exit
+   vfork_child_follow_to_exit $binfile $srcfile
 
    # Try catching a vfork, and stepping out to the child.
    #
-   tcatch_vfork_then_child_follow_exit
+   tcatch_vfork_then_child_follow_exit $binfile $srcfile
 
    # Step over a vfork in the child, do "info inferiors" and check the
    # parent/child relation is displayed.  Run the child to completion,
    # and confirm the relation is no longer displayed in "info
    # inferiors".
    #
-   vfork_relations_in_info_inferiors "exit"
+   vfork_relations_in_info_inferiors "exit" $binfile $srcfile $non_stop
 }
 
 with_test_prefix "check vfork support" {
@@ -470,41 +425,58 @@ with_test_prefix "check vfork support" {
 }
 
 # Follow parent and follow child vfork tests with a child that execs.
-with_test_prefix "exec" {
+proc_with_prefix exec_tests { binfile srcfile non_stop } {
     # These are tests of gdb's ability to follow the parent of a Unix
     # vfork system call.  The child will subsequently call a variant
     # of the Unix exec system call.
-    do_vfork_and_follow_parent_tests
+    do_vfork_and_follow_parent_tests $binfile $srcfile
 
     # These are tests of gdb's ability to follow the child of a Unix
     # vfork system call.  The child will subsequently call a variant
     # of a Unix exec system call.
     #
-    do_vfork_and_follow_child_tests_exec
-}
-
-# Switch to test the case of the child exiting.  We can't use
-# standard_testfile here because we don't want to overwrite the binary
-# of the previous tests.
-set testfile "foll-vfork-exit"
-set srcfile ${testfile}.c
-set binfile [standard_output_file ${testfile}]
-
-if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
-    untested "failed to build $testfile"
-    return
+    do_vfork_and_follow_child_tests_exec $binfile $srcfile $non_stop
 }
 
 # Follow parent and follow child vfork tests with a child that exits.
-with_test_prefix "exit" {
+proc_with_prefix exit_tests { binfile srcfile non_stop } {
     # These are tests of gdb's ability to follow the parent of a Unix
     # vfork system call.  The child will subsequently exit.
-    do_vfork_and_follow_parent_tests
+    do_vfork_and_follow_parent_tests $binfile $srcfile
 
     # These are tests of gdb's ability to follow the child of a Unix
     # vfork system call.  The child will subsequently exit.
     #
-    do_vfork_and_follow_child_tests_exit
+    do_vfork_and_follow_child_tests_exit $binfile $srcfile $non_stop
+}
+
+# Using the remote protocol with schedule-multiple turned triggers bug
+# gdb/30574, so avoid this for now.
+if {[target_info exists gdb_protocol]
+    && ([target_info gdb_protocol] == "remote"
+	|| [target_info gdb_protocol] == "extended-remote")} {
+    set sm_modes { off }
+} else {
+    set sm_modes { off on }
 }
 
-set timeout $oldtimeout
+# A few of these tests require a little more time than the standard timeout
+# allows.
+with_timeout_factor 2 {
+    foreach_with_prefix target-non-stop {auto on off} {
+	foreach_with_prefix non-stop {off on} {
+	    foreach_with_prefix schedule-multiple $sm_modes {
+		save_vars { ::GDBFLAGS } {
+		    # These flags will be picked up by the call to
+		    # clean_restart inside setup_gdb.
+		    append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
+		    append ::GDBFLAGS " -ex \"set non-stop ${non-stop}\""
+		    append ::GDBFLAGS " -ex \"set schedule-multiple ${schedule-multiple}\""
+
+		    exec_tests $binfile $srcfile ${non-stop}
+		    exit_tests $binfile2 $srcfile2 ${non-stop}
+		}
+	    }
+	}
+    }
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (5 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-04 15:22   ` [PATCHv2 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
  2023-07-05 11:30   ` [PATCHv2 0/8] Some vfork related fixes Aktemur, Tankut Baris
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

While working on gdb.base/foll-vfork.exp I noticed that there are
several random 'sleep' calls throughout the test.

The comment suggests these are to allow for output from a vforked
child to arrive, but in each case the test is about to close and
restart GDB, so I don't see how random output from a child process
could impact testing.

I removed the sleep calls and couldn't reproduce any failures from
this test, I left the test running for a couple of hours, and tried
loading my machine, and the test seems fine with these removed.

I've left this as a separate commit so that if, in the future, someone
can show that these are required, it will be easy to revert this one
patch and bring them back.

There should be no change in what is tested after this commit.
---
 gdb/testsuite/gdb.base/foll-vfork.exp | 48 ---------------------------
 1 file changed, 48 deletions(-)

diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-vfork.exp
index be0715b05c0..43fc25cc2c0 100644
--- a/gdb/testsuite/gdb.base/foll-vfork.exp
+++ b/gdb/testsuite/gdb.base/foll-vfork.exp
@@ -105,12 +105,6 @@ proc_with_prefix vfork_parent_follow_through_step { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
@@ -127,12 +121,6 @@ proc_with_prefix vfork_parent_follow_to_bp { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
@@ -149,12 +137,6 @@ proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
@@ -169,12 +151,6 @@ proc_with_prefix vfork_and_exec_child_follow_to_main_bp { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any gdb_expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
@@ -190,12 +166,6 @@ proc_with_prefix vfork_and_exec_child_follow_through_step { binfile srcfile } {
 	    pass $gdb_test_name
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc continue_to_vfork {} {
@@ -229,12 +199,6 @@ proc_with_prefix tcatch_vfork_then_parent_follow { binfile srcfile } {
 	    exp_continue
 	}
     }
-
-    # The child has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
@@ -261,12 +225,6 @@ proc_with_prefix tcatch_vfork_then_child_follow_exec { binfile srcfile } {
 	    pass "$gdb_test_name (followed exec)"
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
@@ -291,12 +249,6 @@ proc tcatch_vfork_then_child_follow_exit { binfile srcfile } {
 	    exp_continue
 	}
     }
-
-    # The parent has been detached; allow time for any output it might
-    # generate to arrive, so that output doesn't get confused with
-    # any expected debugger output from a subsequent testpoint.
-    #
-    exec sleep 1
 }
 
 proc_with_prefix vfork_relations_in_info_inferiors { variant binfile srcfile non_stop } {
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCHv2 8/8] gdb: additional debug output in infrun.c and linux-nat.c
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (6 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-07-04 15:22   ` Andrew Burgess
  2023-07-05 11:30   ` [PATCHv2 0/8] Some vfork related fixes Aktemur, Tankut Baris
  8 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:22 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess, tankut.baris.aktemur

While working on some of the recent patches relating to vfork handling
I found this debug output very helpful, I'd like to propose adding
this into GDB.

With debug turned off there should be no user visible changes after
this commit.
---
 gdb/infrun.c    | 28 ++++++++++++++++++++++++----
 gdb/linux-nat.c | 23 +++++++++++++++++++----
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 2d2f7d67a0f..7efa0617526 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -435,6 +435,11 @@ show_follow_fork_mode_string (struct ui_file *file, int from_tty,
 static bool
 follow_fork_inferior (bool follow_child, bool detach_fork)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
+  infrun_debug_printf ("follow_child = %d, detach_fork = %d",
+		       follow_child, detach_fork);
+
   target_waitkind fork_kind = inferior_thread ()->pending_follow.kind ();
   gdb_assert (fork_kind == TARGET_WAITKIND_FORKED
 	      || fork_kind == TARGET_WAITKIND_VFORKED);
@@ -543,6 +548,13 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	  parent_inf->thread_waiting_for_vfork_done
 	    = detach_fork ? inferior_thread () : nullptr;
 	  parent_inf->pspace->breakpoints_not_allowed = detach_fork;
+
+	  infrun_debug_printf
+	    ("parent_inf->thread_waiting_for_vfork_done == %s",
+	     (parent_inf->thread_waiting_for_vfork_done == nullptr
+	      ? "nullptr"
+	      : (parent_inf->thread_waiting_for_vfork_done
+		 ->ptid.to_string ().c_str ())));
 	}
     }
   else
@@ -727,6 +739,8 @@ set_last_target_status_stopped (thread_info *tp)
 static bool
 follow_fork ()
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   bool follow_child = (follow_fork_mode_string == follow_fork_mode_child);
   bool should_resume = true;
 
@@ -996,6 +1010,8 @@ proceed_after_vfork_done (thread_info *thread)
 static void
 handle_vfork_child_exec_or_exit (int exec)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   struct inferior *inf = current_inferior ();
 
   if (inf->vfork_parent)
@@ -1127,6 +1143,8 @@ handle_vfork_child_exec_or_exit (int exec)
 static void
 handle_vfork_done (thread_info *event_thread)
 {
+  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
+
   /* We only care about this event if inferior::thread_waiting_for_vfork_done is
      set, that is if we are waiting for a vfork child not under our control
      (because we detached it) to exec or exit.
@@ -1142,8 +1160,6 @@ handle_vfork_done (thread_info *event_thread)
       return;
     }
 
-  INFRUN_SCOPED_DEBUG_ENTER_EXIT;
-
   /* We stopped all threads (other than the vforking thread) of the inferior in
      follow_fork and kept them stopped until now.  It should therefore not be
      possible for another thread to have reported a vfork during that window.
@@ -3407,6 +3423,8 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
 
   thread_info *cur_thr = inferior_thread ();
 
+  infrun_debug_printf ("cur_thr = %s", cur_thr->ptid.to_string ().c_str ());
+
   /* Fill in with reasonable starting values.  */
   init_thread_stepping_state (cur_thr);
 
@@ -3463,8 +3481,10 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   if (!cur_thr->control.in_infcall)
     set_running (resume_target, resume_ptid, true);
 
-  infrun_debug_printf ("addr=%s, signal=%s", paddress (gdbarch, addr),
-		       gdb_signal_to_symbol_string (siggnal));
+  infrun_debug_printf ("addr=%s, signal=%s, resume_ptid=%s",
+		       paddress (gdbarch, addr),
+		       gdb_signal_to_symbol_string (siggnal),
+		       resume_ptid.to_string ().c_str ());
 
   annotate_starting ();
 
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 7e121b7ab41..87152621dd6 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1293,6 +1293,11 @@ get_detach_signal (struct lwp_info *lp)
 static void
 detach_one_lwp (struct lwp_info *lp, int *signo_p)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
+  linux_nat_debug_printf ("lwp %s (stopped = %d)",
+			  lp->ptid.to_string ().c_str (), lp->stopped);
+
   int lwpid = lp->ptid.lwp ();
   int signo;
 
@@ -1363,6 +1368,10 @@ detach_one_lwp (struct lwp_info *lp, int *signo_p)
   else
     signo = *signo_p;
 
+  linux_nat_debug_printf ("preparing to resume lwp %s (stopped = %d)",
+			  lp->ptid.to_string ().c_str (),
+			  lp->stopped);
+
   /* Preparing to resume may try to write registers, and fail if the
      lwp is zombie.  If that happens, ignore the error.  We'll handle
      it below, when detach fails with ESRCH.  */
@@ -1395,6 +1404,8 @@ detach_callback (struct lwp_info *lp)
 void
 linux_nat_target::detach (inferior *inf, int from_tty)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   struct lwp_info *main_lwp;
   int pid = inf->pid;
 
@@ -3122,13 +3133,13 @@ static ptid_t
 linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 		  target_wait_flags target_options)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   sigset_t prev_mask;
   enum resume_kind last_resume_kind;
   struct lwp_info *lp;
   int status;
 
-  linux_nat_debug_printf ("enter");
-
   /* The first time we get here after starting a new inferior, we may
      not have added it to the LWP list yet - this is the earliest
      moment at which we know its PID.  */
@@ -3228,7 +3239,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
 
       if (target_options & TARGET_WNOHANG)
 	{
-	  linux_nat_debug_printf ("exit (ignore)");
+	  linux_nat_debug_printf ("no interesting events found");
 
 	  ourstatus->set_ignore ();
 	  restore_child_signals_mask (&prev_mask);
@@ -3314,7 +3325,7 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  linux_nat_debug_printf ("exit");
+  linux_nat_debug_printf ("event found");
 
   restore_child_signals_mask (&prev_mask);
 
@@ -3410,6 +3421,8 @@ ptid_t
 linux_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 			target_wait_flags target_options)
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   ptid_t event_ptid;
 
   linux_nat_debug_printf ("[%s], [%s]", ptid.to_string ().c_str (),
@@ -3589,6 +3602,8 @@ linux_nat_target::kill ()
 void
 linux_nat_target::mourn_inferior ()
 {
+  LINUX_NAT_SCOPED_DEBUG_ENTER_EXIT;
+
   int pid = inferior_ptid.pid ();
 
   purge_lwp_list (pid);
-- 
2.25.4


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function
  2023-06-28  8:47   ` Aktemur, Tankut Baris
@ 2023-07-04 15:24     ` Andrew Burgess
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-04 15:24 UTC (permalink / raw)
  To: Aktemur, Tankut Baris, gdb-patches; +Cc: Mihails Strasuns, Schimpe, Christina

"Aktemur, Tankut Baris" <tankut.baris.aktemur@intel.com> writes:

> On Thursday, June 22, 2023 3:17 PM, Andrew Burgess wrote:
>> From: Mihails Strasuns <mihails.strasuns@intel.com>
>> 
>> Split the thread resuming code from proceed into new function
>> proceed_resume_thread_checked.
>> 
>> Co-Authored-By: Christina Schimpe <christina.schimpe@intel.com>
>> ---
>>  gdb/infrun.c | 148 +++++++++++++++++++++++++++------------------------
>>  1 file changed, 79 insertions(+), 69 deletions(-)
>> 
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 5b0257076f0..b10b7c59652 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -3268,6 +3268,82 @@ check_multi_target_resumption (process_stratum_target *resume_target)
>>      }
>>  }
>> 
>> +/* Helper function for `proceed`.  Check if thread TP is suitable for
>> +   resuming, and, if it is, switch to the thread and call
>> +   `keep_going_pass_signal`.  If TP is not suitable for resuming then this
>> +   function will just return without switching threads.  */
>> +
>> +static void
>> +proceed_resume_thread_checked (thread_info *tp)
>> +{
>> +  if (!tp->inf->has_execution ())
>> +    {
>> +      infrun_debug_printf ("[%s] target has no execution",
>> +			   tp->ptid.to_string ().c_str ());
>> +      return;
>> +    }
>> +
>> +  if (tp->resumed ())
>> +    {
>> +      infrun_debug_printf ("[%s] resumed",
>> +			   tp->ptid.to_string ().c_str ());
>> +      gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
>> +      return;
>> +    }
>> +
>> +  if (thread_is_in_step_over_chain (tp))
>> +    {
>> +      infrun_debug_printf ("[%s] needs step-over",
>> +			   tp->ptid.to_string ().c_str ());
>> +      return;
>> +    }
>> +
>> +  /* When handling a vfork GDB removes all breakpoints from the program
>> +     space in which the vfork is being handled, as such we must take care
>> +     not to resume any thread other than the vfork parent -- resuming the
>> +     vfork parent allows GDB to receive and handle the 'vfork done'
>> +     event.  */
>> +  if (tp->inf->thread_waiting_for_vfork_done != nullptr)
>> +    {
>> +      if (target_is_non_stop_p ())
>> +	{
>> +	  /* For non-stop targets, regardless of whether GDB is using
>> +	     all-stop or non-stop mode, threads are controlled
>> +	     individually.
>> +
>> +	     When a thread is handling a vfork, breakpoints are removed
>> +	     from the inferior (well, program space in fact), so it is
>> +	     critical that we don't try to resume any thread other than the
>> +	     vfork parent.  */
>> +	  if (tp != tp->inf->thread_waiting_for_vfork_done)
>> +	    {
>> +	      infrun_debug_printf ("[%s] thread %s of this inferior is "
>> +				   "waiting for vfork-done",
>> +				   tp->ptid.to_string ().c_str (),
>> +				   tp->inf->thread_waiting_for_vfork_done
>> +				     ->ptid.to_string ().c_str ());
>> +	      return;
>> +	    }
>> +	}
>> +      else
>> +	{
>> +	  /* For all-stop targets, when we attempt to resume the inferior,
>> +	     we will in fact only resume the vfork parent thread, this is
>> +	     handled in internal_resume_ptid, as such, there is nothing
>> +	     more that needs doing here.  */
>
> I wonder if here we should do
>
>     gdb_assert (tp == tp->inf->thread_waiting_for_vfork_done);
>

That's a good idea.  I've also updated the comment within this else
block as the current text doesn't really cover why this assert holds.

I've posted a V2 with this change.

Thanks,
Andrew

> -Baris
>
>> +	}
>> +    }
>> +
>> +  infrun_debug_printf ("resuming %s",
>> +		       tp->ptid.to_string ().c_str ());
>> +
>> +  execution_control_state ecs (tp);
>> +  switch_to_thread (tp);
>> +  keep_going_pass_signal (&ecs);
>> +  if (!ecs.wait_some_more)
>> +    error (_("Command aborted."));
>> +}
>> +
>>  /* Basic routine for continuing the program in various fashions.
>> 
>>     ADDR is the address to resume at, or -1 for resume where stopped.
>> @@ -3456,77 +3532,11 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
>>  						       resume_ptid))
>>  	  {
>>  	    switch_to_thread_no_regs (tp);
>> -
>> -	    if (!tp->inf->has_execution ())
>> -	      {
>> -		infrun_debug_printf ("[%s] target has no execution",
>> -				     tp->ptid.to_string ().c_str ());
>> -		continue;
>> -	      }
>> -
>> -	    if (tp->resumed ())
>> -	      {
>> -		infrun_debug_printf ("[%s] resumed",
>> -				     tp->ptid.to_string ().c_str ());
>> -		gdb_assert (tp->executing () || tp->has_pending_waitstatus ());
>> -		continue;
>> -	      }
>> -
>> -	    if (thread_is_in_step_over_chain (tp))
>> -	      {
>> -		infrun_debug_printf ("[%s] needs step-over",
>> -				     tp->ptid.to_string ().c_str ());
>> -		continue;
>> -	      }
>> -
>> -	    /* If a thread of that inferior is waiting for a vfork-done
>> -	       (for a detached vfork child to exec or exit), breakpoints are
>> -	       removed.  We must not resume any thread of that inferior, other
>> -	       than the one waiting for the vfork-done.  */
>> -	    if (tp->inf->thread_waiting_for_vfork_done != nullptr
>> -		&& tp != tp->inf->thread_waiting_for_vfork_done)
>> -	      {
>> -		infrun_debug_printf ("[%s] another thread of this inferior is "
>> -				     "waiting for vfork-done",
>> -				     tp->ptid.to_string ().c_str ());
>> -		continue;
>> -	      }
>> -
>> -	    infrun_debug_printf ("resuming %s",
>> -				 tp->ptid.to_string ().c_str ());
>> -
>> -	    execution_control_state ecs (tp);
>> -	    switch_to_thread (tp);
>> -	    keep_going_pass_signal (&ecs);
>> -	    if (!ecs.wait_some_more)
>> -	      error (_("Command aborted."));
>> -	  }
>> -      }
>> -    else if (!cur_thr->resumed ()
>> -	     && !thread_is_in_step_over_chain (cur_thr))
>> -      {
>> -	/* In non-stop mode, if a there exists a thread waiting for a vfork
>> -	   then only allow that thread to resume (breakpoints are removed
>> -	   from an inferior when handling a vfork).
>> -
>> -	   We check target_is_non_stop_p here rather than just checking the
>> -	   non-stop flag, though these are equivalent (all-stop on a
>> -	   non-stop target was handled in a previous block, so at this
>> -	   point, if target_is_non_stop_p then GDB must be running on
>> -	   non-stop mode).  By using target_is_non_stop_p it will be easier
>> -	   to merge this block with the previous in a later commit.  */
>> -	if (!(target_is_non_stop_p ()
>> -	      && cur_thr->inf->thread_waiting_for_vfork_done != nullptr
>> -	      && cur_thr->inf->thread_waiting_for_vfork_done != cur_thr))
>> -	  {
>> -	    /* The thread wasn't started, and isn't queued, run it now.  */
>> -	    execution_control_state ecs (cur_thr);
>> -	    switch_to_thread (cur_thr);
>> -	    keep_going_pass_signal (&ecs);
>> -	    if (!ecs.wait_some_more)
>> -	      error (_("Command aborted."));
>> +	    proceed_resume_thread_checked (tp);
>>  	  }
>>        }
>> +    else
>> +      proceed_resume_thread_checked (cur_thr);
>> 
>>      disable_commit_resumed.reset_and_commit ();
>>    }
>> --
>> 2.25.4
>
> Intel Deutschland GmbH
> Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
> Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
> Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
> Chairperson of the Supervisory Board: Nicole Lau
> Registered Office: Munich
> Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish
  2023-07-04 15:22   ` [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
@ 2023-07-05 10:08     ` Aktemur, Tankut Baris
  0 siblings, 0 replies; 32+ messages in thread
From: Aktemur, Tankut Baris @ 2023-07-05 10:08 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On Tuesday, July 4, 2023 5:23 PM, Andrew Burgess wrote:
> While working on a later patch, which changes gdb.base/foll-vfork.exp,
> I noticed that sometimes I would hit this assert:
> 
>   x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
> 
> I eventually tracked it down to a combination of schedule-multiple
> mode being on, target-non-stop being off, follow-fork-mode being set
> to child, and some bad timing.  The failing case is pretty simple, a
> single threaded application performs a vfork, the child process then
> execs some other application while the parent process (once the vfork
> child has completed its exec) just exits.  As best I understand
> things, here's what happens when things go wrong:
> 
>   1. The parent process performs a vfork, GDB sees the VFORKED event
>   and creates an inferior and thread for the vfork child,
> 
>   2. GDB resumes the vfork child process.  As schedule-multiple is on
>   and target-non-stop is off, this is translated into a request to
>   start all processes (see user_visible_resume_ptid),
> 
>   3. In the linux-nat layer we spot that one of the threads we are
>   about to start is a vfork parent, and so don't start that
>   thread (see resume_lwp), the vfork child thread is resumed,
> 
>   4. GDB waits for the next event, eventually entering
>   linux_nat_target::wait, which in turn calls linux_nat_wait_1,
> 
>   5. In linux_nat_wait_1 we eventually call
>   resume_stopped_resumed_lwps, this should restart threads that have
>   stopped but don't actually have anything interesting to report.
> 
>   6. Unfortunately, resume_stopped_resumed_lwps doesn't check for
>   vfork parents like resume_lwp does, so at this point the vfork
>   parent is resumed.  This feels like the start of the bug, and this
>   is where I'm proposing to fix things, but, resuming the vfork parent
>   isn't the worst thing in the world because....
> 
>   7. As the vfork child is still alive the kernel holds the vfork
>   parent stopped,
> 
>   8. Eventually the child performs its exec and GDB is sent and EXECD
>   event.  However, because the parent is resumed, as soon as the child
>   performs its exec the vfork parent also sends a VFORK_DONE event to
>   GDB,
> 
>   9. Depending on timing both of these events might seem to arrive in
>   GDB at the same time.  Normally GDB expects to see the EXECD or
>   EXITED/SIGNALED event from the vfork child before getting the
>   VFORK_DONE in the parent.  We know this because it is as a result of
>   the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
>   handle_vfork_child_exec_or_exit for details).  Further the comment
>   in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
>   when we remain attached to the child (not the parent) we should not
>   expect to see a VFORK_DONE,
> 
>   10. If both events arrive at the same time then GDB will randomly
>   choose one event to handle first, in some cases this will be the
>   VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
>   expects that (a) the vfork child has finished, however, in this case
>   this is not completely true, the child has finished, but GDB has not
>   processed the event associated with the completion yet, and (b) upon
>   seeing a VFORK_DONE GDB assumes we are remaining attached to the
>   parent, and so resumes the parent process,
> 
>   11. GDB now handles the EXECD event.  In our case we are detaching
>   from the parent, so GDB calls target_detach (see
>   handle_vfork_child_exec_or_exit),
> 
>   12. While this has been going on the vfork parent is executing, and
>   might even exit,
> 
>   13. In linux_nat_target::detach the first thing we do is stop all
>   threads in the process we're detaching from, the result of the stop
>   request will be cached on the lwp_info object,
> 
>   14. In our case the vfork parent has exited though, so when GDB
>   waits for the thread, instead of a stop due to signal, we instead
>   get a thread exited status,
> 
>   15. Later in the detach process we try to resume the threads just
>   prior to making the ptrace call to actually detach (see
>   detach_one_lwp), as part of the process to resume a thread we try to
>   touch some registers within the thread, and before doing this GDB
>   asserts that the thread is stopped,
> 
>   16. An exited thread is not classified as stopped, and so the assert
>   triggers!
> 
> So there's two bugs I see here.  The first, and most critical one here
> is in step #6.  I think that resume_stopped_resumed_lwps should not
> resume a vfork parent, just like resume_lwp doesn't resume a vfork
> parent.
> 
> With this change in place the vfork parent will remain stopped in step
> instead GDB will only see the EXECD/EXITED/SIGNALLED event.  The
> problems in #9 and #10 are therefore skipped and we arrive at #11,
> handling the EXECD event.  As the parent is still stopped #12 doesn't
> apply, and in #13 when we try to stop the process we will see that it
> is already stopped, there's no risk of the vfork parent exiting before
> we get to this point.  And finally, in #15 we are safe to poke the
> process registers because it will not have exited by this point.
> 
> However, I did mention two bugs.
> 
> The second bug I've not yet managed to actually trigger, but I'm
> convinced it must exist: if we forget vforks for a moment, in step #13
> above, when linux_nat_target::detach is called, we first try to stop
> all threads in the process GDB is detaching from.  If we imagine a
> multi-threaded inferior with many threads, and GDB running in non-stop
> mode, then, if the user tries to detach there is a chance that thread
> could exit just as linux_nat_target::detach is entered, in which case
> we should be able to trigger the same assert.
> 
> But, like I said, I've not (yet) managed to trigger this second bug,
> and even if I could, the fix would not belong in this commit, so I'm
> pointing this out just for completeness.
> 
> There's no test included in this commit.  In a couple of commits time
> I will expand gdb.base/foll-vfork.exp which is when this bug would be
> exposed.  Unfortunately there are at least two other bugs (separate
> from the ones discussed above) that need fixing first, these will be
> fixed in the next commits before the gdb.base/foll-vfork.exp test is
> expanded.
> 
> If you do want to reproduce this failure then you will for certainly
> need to run the gdb.base/foll-vfork.exp test in a loop as the failures
> are all very timing sensitive.  I've found that running multiple
> copies in parallel makes the failure more likely to appear, I usually
> run ~6 copies in parallel and expect to see a failure after within
> 10mins.
> ---
>  gdb/linux-nat.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 383ef58fa23..7e121b7ab41 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -3346,7 +3346,14 @@ linux_nat_wait_1 (ptid_t ptid, struct target_waitstatus
> *ourstatus,
>  static int
>  resume_stopped_resumed_lwps (struct lwp_info *lp, const ptid_t wait_ptid)
>  {
> -  if (!lp->stopped)
> +  struct inferior *inf = find_inferior_ptid (linux_target, lp->ptid);

Nit: The 'struct' keyword can be omitted.

-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp
  2023-07-04 15:22   ` [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
@ 2023-07-05 11:22     ` Aktemur, Tankut Baris
  0 siblings, 0 replies; 32+ messages in thread
From: Aktemur, Tankut Baris @ 2023-07-05 11:22 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On Tuesday, July 4, 2023 5:23 PM, Andrew Burgess wrote:
> This commit provides tests for all of the bugs fixed in the previous
> four commits, this is achieved by expanding gdb.base/foll-vfork.exp to
> run with different configurations:
> 
>   * target-non-stop on/off
>   * non-stop on/off
>   * schedule-multiple on/off
> 
> We don't test with schedule-multiple on if we are using a remote
> target, this is due to bug gdb/30574.  I've added a link to that bug
> in this commit, but all this commit does is expose that bug, there's
> no fixes here.
> 
> Some of the bugs fixed in the previous commits are very timing
> dependent, as such, they don't always show up.  I've had more success
> when running this test on a very loaded machine -- I usually run ~8
> copies of the test in parallel, then the bugs would normally show up
> pretty quickly.
> 
> Other than running the test in more configurations, I've not made any
> changes to what is actually being tested, other than in one place
> where, when testing with non-stop mode, GDB stops in a different
> inferior, as such I had to add a new 'inferior 2' call, this can be
> found in vfork_relations_in_info_inferiors.
> 
> I have cleaned things up a little, for example, making use of
> proc_with_prefix to remove some with_test_prefix calls.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30574
> ---
>  gdb/testsuite/gdb.base/foll-vfork.exp | 628 ++++++++++++--------------
>  1 file changed, 300 insertions(+), 328 deletions(-)
> 
> diff --git a/gdb/testsuite/gdb.base/foll-vfork.exp b/gdb/testsuite/gdb.base/foll-
> vfork.exp
> index bdceff0f5de..be0715b05c0 100644
> --- a/gdb/testsuite/gdb.base/foll-vfork.exp
> +++ b/gdb/testsuite/gdb.base/foll-vfork.exp
> @@ -25,56 +25,52 @@ if {![istarget "*-linux*"]} {
>      continue
>  }
> 
> -standard_testfile
> +standard_testfile .c -exit.c vforked-prog.c
> 
> -set compile_options debug
> +set binfile $testfile
> +set binfile2 ${testfile}-exit
> +set binfile3 vforked-prog
> 
> -if {[build_executable $testfile.exp $testfile $srcfile $compile_options] == -1} {
> +if {[build_executable "compile $binfile" $binfile $srcfile] == -1} {
>      untested "failed to compile main testcase"
>      return -1
>  }
> 
> -set testfile2 "vforked-prog"
> -set srcfile2 ${testfile2}.c
> +if {[build_executable "compile $binfile2" $binfile2 $srcfile2] == -1} {
> +    untested "failed to compile main testcase"

A slightly different message can be used here, to distinguish it from the
case above.  Also in the untested usage below.

> +    return -1
> +}
> 
> -if {[build_executable $testfile.exp $testfile2 $srcfile2 $compile_options] == -1} {
> -    untested "failed to compile secondary testcase"
> +if {[build_executable "compile $binfile3" $binfile3 $srcfile3] == -1} {
> +    untested "failed to compile main testcase"
>      return -1
>  }
> 
...
> +proc_with_prefix vfork_child_follow_to_exit { binfile srcfile } {
> +    setup_gdb $binfile $srcfile
> +
> +    gdb_test_no_output "set follow-fork child"
> +
> +    gdb_test_multiple "continue" "continue to child exit" {
> +	-re -wrap "Couldn't get registers.*" {
> +	    # PR gdb/14766
> +	    fail $gdb_test_name

Since this is known, we can use a setup_kfail, I think.

> +	}
> +	-re -wrap "\\\[Attaching after.* vfork to.*\\\[Detaching vfork parent .* after
> child exit.*" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    # The parent has been detached; allow time for any output it might
> +    # generate to arrive, so that output doesn't get confused with
> +    # any gdb_expected debugger output from a subsequent testpoint.
> +    #
> +    exec sleep 1
> +}
> +
...
> +proc_with_prefix vfork_relations_in_info_inferiors { variant binfile srcfile
> non_stop } {
> +    setup_gdb $binfile $srcfile
> +
> +    gdb_test_no_output "set follow-fork child"
> +
> +    gdb_test_multiple "next" "next over vfork" {
> +	-re -wrap "\\\[Attaching after .* vfork to child.*if \\(pid == 0\\).*" {
> +	    pass $gdb_test_name
> +	}
> +    }
> +
> +    if { $non_stop } {
> +	gdb_test "inferior 2" ".*"
> +    }
> +
> +    gdb_test "info inferiors" \
> +	".*is vfork parent of inferior 2.*is vfork child of inferior 1" \
> +	"info inferiors shows vfork parent/child relation"
> +
> +    if { $variant == "exec" } {
> +	set linenum [gdb_get_line_number "Hello from vforked-prog" ${::srcfile3}]
> +	gdb_test_multiple "continue" "continue to bp" {
> +	    -re -wrap ".*xecuting new program.*Breakpoint.*vforked-prog.c:${linenum}.*"
> {
> +		pass $gdb_test_name
> +	    }
> +	}
> +    } else {
> +	gdb_test_multiple "continue" "continue to child exit" {
> +	   -re -wrap "exited normally.*" {
> +	       pass $gdb_test_name
>  	   }

How about using gdb_continue_to_end instead of this gdb_test_multiple?

Thanks
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCHv2 0/8] Some vfork related fixes
  2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
                     ` (7 preceding siblings ...)
  2023-07-04 15:22   ` [PATCHv2 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
@ 2023-07-05 11:30   ` Aktemur, Tankut Baris
  2023-07-17  8:53     ` Andrew Burgess
  8 siblings, 1 reply; 32+ messages in thread
From: Aktemur, Tankut Baris @ 2023-07-05 11:30 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On Tuesday, July 4, 2023 5:23 PM, Andrew Burgess wrote:
> Changes since v1:
> 
>   - Added an extra assert in patch #4, as suggested by Baris,
> 
>   - Updated some comments in patch #4 to better describe what's going
>     on,
> 
>   - Added an extra debug printf in patch #8 which I found useful when
>     checking what's actually going on (in order to change patch #4).
> 
> ---

I looked at the patches in this series.  Thank you for the detailed
commit messages.  They were not only useful, but also educational.
I spotted only a few minor things and have already sent them.

Thanks
-Baris


Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [PATCHv2 0/8] Some vfork related fixes
  2023-07-05 11:30   ` [PATCHv2 0/8] Some vfork related fixes Aktemur, Tankut Baris
@ 2023-07-17  8:53     ` Andrew Burgess
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-17  8:53 UTC (permalink / raw)
  To: Aktemur, Tankut Baris, gdb-patches

"Aktemur, Tankut Baris" <tankut.baris.aktemur@intel.com> writes:

> On Tuesday, July 4, 2023 5:23 PM, Andrew Burgess wrote:
>> Changes since v1:
>> 
>>   - Added an extra assert in patch #4, as suggested by Baris,
>> 
>>   - Updated some comments in patch #4 to better describe what's going
>>     on,
>> 
>>   - Added an extra debug printf in patch #8 which I found useful when
>>     checking what's actually going on (in order to change patch #4).
>> 
>> ---
>
> I looked at the patches in this series.  Thank you for the detailed
> commit messages.  They were not only useful, but also educational.
> I spotted only a few minor things and have already sent them.

I've gone ahead and pushed this series.  If there is any additional
feedback I am, of course, happy to address it.  I have some additional
vfork related fixes that I'm hoping to post to the mailing list soon, so
having this series merged will help with that.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-07-04 15:22   ` [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
@ 2023-07-18 20:42     ` Simon Marchi
  2023-07-21  9:47       ` Andrew Burgess
  2023-07-23  8:53       ` Andrew Burgess
  0 siblings, 2 replies; 32+ messages in thread
From: Simon Marchi @ 2023-07-18 20:42 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: tankut.baris.aktemur



On 2023-07-04 11:22, Andrew Burgess via Gdb-patches wrote:
> Like the last few commit, this fixes yet another vfork related issue.
> Like the commit titled:
> 
>   gdb: don't restart vfork parent while waiting for child to finish
> 
> which addressed a case in linux-nat where we would try to resume a
> vfork parent, this commit addresses a very similar case, but this time
> occurring in infrun.c.  Just like with that previous commit, this bug
> results in the assert:
> 
>   x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
> 
> In this case the issue occurs when target-non-stop is on, but non-stop
> is off, and again, schedule-multiple is on.  As with the previous
> commit, GDB is in follow-fork-mode child.  If you have not done so, it
> is worth reading the earlier commit as many of the problems leading to
> the failure are the same, they just appear in a different part of GDB.
> 
> Here are the steps leading to the assertion failure:
> 
>   1. The user performs a 'next' over a vfork, GDB stop in the vfork
>   child,
> 
>   2. As we are planning to follow the child GDB sets the vfork_parent
>   and vfork_child member variables in the two inferiors, the
>   thread_waiting_for_vfork_done member is left as nullptr, that member
>   is only used when GDB is planning to follow the parent inferior,
> 
>   3. The user does 'continue', our expectation is that the vfork child
>   should resume, and once that process has exited or execd, GDB should
>   detach from the vfork parent.  As a result of the 'continue' GDB
>   eventually enters the proceed function,
> 
>   4. In proceed we selected a ptid_t to resume, because
>   schedule-multiple is on we select minus_one_ptid (see
>   user_visible_resume_ptid),
> 
>   5. As GDB is running in all-stop on top of non-stop mode, in the
>   proceed function we iterate over all threads that match the resume
>   ptid, which turns out to be all threads, and call
>   proceed_resume_thread_checked.  One of the threads we iterate over
>   is the vfork parent thread,
> 
>   6. As the thread passed to proceed_resume_thread_checked doesn't
>   match any of the early return conditions, GDB will set the thread
>   resumed,
> 
>   7. As we are resuming one thread at a time, this thread is seen by
>   the lower layers (e.g. linux-nat) as the "event thread", which means
>   we don't apply any of the checks, e.g. is this thread a
>   vfork parent, instead we assume that GDB core knows what it's doing,
>   and linux-nat will resume the thread, we have now incorrectly set
>   running the vfork parent thread when this thread should be waiting
>   for the vfork child to complete,
> 
>   8. Back in the proceed function GDB continues to iterate over all
>   threads, and now (correctly) resumes the vfork child thread,
> 
>   8. As the vfork child is still alive the kernel holds the vfork
>   parent stopped,
> 
>   9. Eventually the child performs its exec and GDB is sent and EXECD
>   event.  However, because the parent is resumed, as soon as the child
>   performs its exec the vfork parent also sends a VFORK_DONE event to
>   GDB,
> 
>   10. Depending on timing both of these events might seem to arrive in
>   GDB at the same time.  Normally GDB expects to see the EXECD or
>   EXITED/SIGNALED event from the vfork child before getting the
>   VFORK_DONE in the parent.  We know this because it is as a result of
>   the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
>   handle_vfork_child_exec_or_exit for details).  Further the comment
>   in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
>   when we remain attached to the child (not the parent) we should not
>   expect to see a VFORK_DONE,
> 
>   11. If both events arrive at the same time then GDB will randomly
>   choose one event to handle first, in some cases this will be the
>   VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
>   expects that (a) the vfork child has finished, however, in this case
>   this is not completely true, the child has finished, but GDB has not
>   processed the event associated with the completion yet, and (b) upon
>   seeing a VFORK_DONE GDB assumes we are remaining attached to the
>   parent, and so resumes the parent process,
> 
>   12. GDB now handles the EXECD event.  In our case we are detaching
>   from the parent, so GDB calls target_detach (see
>   handle_vfork_child_exec_or_exit),
> 
>   13. While this has been going on the vfork parent is executing, and
>   might even exit,
> 
>   14. In linux_nat_target::detach the first thing we do is stop all
>   threads in the process we're detaching from, the result of the stop
>   request will be cached on the lwp_info object,
> 
>   15. In our case the vfork parent has exited though, so when GDB
>   waits for the thread, instead of a stop due to signal, we instead
>   get a thread exited status,
> 
>   16. Later in the detach process we try to resume the threads just
>   prior to making the ptrace call to actually detach (see
>   detach_one_lwp), as part of the process to resume a thread we try to
>   touch some registers within the thread, and before doing this GDB
>   asserts that the thread is stopped,
> 
>   17. An exited thread is not classified as stopped, and so the assert
>   triggers!
> 
> Just like with the earlier commit, the fix is to spot the vfork parent
> status of the thread, and not resume such threads.  Where the earlier
> commit fixed this in linux-nat, in this case I think the fix should
> live in infrun.c, in proceed_resume_thread_checked.  This function
> already has a similar check to not resume the vfork parent in the case
> where we are planning to follow the vfork parent, I propose adding a
> similar case that checks for the vfork parent when we plan to follow
> the vfork child.
> 
> This new check will mean that at step #6 above GDB doesn't try to
> resume the vfork parent thread, which prevents the VFORK_DONE from
> ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
> the vfork child GDB will detach from the parent.
> 
> There's no test included in this commit.  In a subsequent commit I
> will expand gdb.base/foll-vfork.exp which is when this bug would be
> exposed.
> 
> If you do want to reproduce this failure then you will for certainly
> need to run the gdb.base/foll-vfork.exp test in a loop as the failures
> are all very timing sensitive.  I've found that running multiple
> copies in parallel makes the failure more likely to appear, I usually
> run ~6 copies in parallel and expect to see a failure after within
> 10mins.

Hi Andrew,

Since this commit, I see this on native-gdbserver and
native-extended-gdbserver:

FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)

I haven't had the time to read this vfork series, but I look forward to,
since I also did some vfork fixes not too long ago.

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-07-18 20:42     ` Simon Marchi
@ 2023-07-21  9:47       ` Andrew Burgess
  2023-07-23  8:53       ` Andrew Burgess
  1 sibling, 0 replies; 32+ messages in thread
From: Andrew Burgess @ 2023-07-21  9:47 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

Simon Marchi <simark@simark.ca> writes:

> On 2023-07-04 11:22, Andrew Burgess via Gdb-patches wrote:
>> Like the last few commit, this fixes yet another vfork related issue.
>> Like the commit titled:
>> 
>>   gdb: don't restart vfork parent while waiting for child to finish
>> 
>> which addressed a case in linux-nat where we would try to resume a
>> vfork parent, this commit addresses a very similar case, but this time
>> occurring in infrun.c.  Just like with that previous commit, this bug
>> results in the assert:
>> 
>>   x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
>> 
>> In this case the issue occurs when target-non-stop is on, but non-stop
>> is off, and again, schedule-multiple is on.  As with the previous
>> commit, GDB is in follow-fork-mode child.  If you have not done so, it
>> is worth reading the earlier commit as many of the problems leading to
>> the failure are the same, they just appear in a different part of GDB.
>> 
>> Here are the steps leading to the assertion failure:
>> 
>>   1. The user performs a 'next' over a vfork, GDB stop in the vfork
>>   child,
>> 
>>   2. As we are planning to follow the child GDB sets the vfork_parent
>>   and vfork_child member variables in the two inferiors, the
>>   thread_waiting_for_vfork_done member is left as nullptr, that member
>>   is only used when GDB is planning to follow the parent inferior,
>> 
>>   3. The user does 'continue', our expectation is that the vfork child
>>   should resume, and once that process has exited or execd, GDB should
>>   detach from the vfork parent.  As a result of the 'continue' GDB
>>   eventually enters the proceed function,
>> 
>>   4. In proceed we selected a ptid_t to resume, because
>>   schedule-multiple is on we select minus_one_ptid (see
>>   user_visible_resume_ptid),
>> 
>>   5. As GDB is running in all-stop on top of non-stop mode, in the
>>   proceed function we iterate over all threads that match the resume
>>   ptid, which turns out to be all threads, and call
>>   proceed_resume_thread_checked.  One of the threads we iterate over
>>   is the vfork parent thread,
>> 
>>   6. As the thread passed to proceed_resume_thread_checked doesn't
>>   match any of the early return conditions, GDB will set the thread
>>   resumed,
>> 
>>   7. As we are resuming one thread at a time, this thread is seen by
>>   the lower layers (e.g. linux-nat) as the "event thread", which means
>>   we don't apply any of the checks, e.g. is this thread a
>>   vfork parent, instead we assume that GDB core knows what it's doing,
>>   and linux-nat will resume the thread, we have now incorrectly set
>>   running the vfork parent thread when this thread should be waiting
>>   for the vfork child to complete,
>> 
>>   8. Back in the proceed function GDB continues to iterate over all
>>   threads, and now (correctly) resumes the vfork child thread,
>> 
>>   8. As the vfork child is still alive the kernel holds the vfork
>>   parent stopped,
>> 
>>   9. Eventually the child performs its exec and GDB is sent and EXECD
>>   event.  However, because the parent is resumed, as soon as the child
>>   performs its exec the vfork parent also sends a VFORK_DONE event to
>>   GDB,
>> 
>>   10. Depending on timing both of these events might seem to arrive in
>>   GDB at the same time.  Normally GDB expects to see the EXECD or
>>   EXITED/SIGNALED event from the vfork child before getting the
>>   VFORK_DONE in the parent.  We know this because it is as a result of
>>   the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
>>   handle_vfork_child_exec_or_exit for details).  Further the comment
>>   in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
>>   when we remain attached to the child (not the parent) we should not
>>   expect to see a VFORK_DONE,
>> 
>>   11. If both events arrive at the same time then GDB will randomly
>>   choose one event to handle first, in some cases this will be the
>>   VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
>>   expects that (a) the vfork child has finished, however, in this case
>>   this is not completely true, the child has finished, but GDB has not
>>   processed the event associated with the completion yet, and (b) upon
>>   seeing a VFORK_DONE GDB assumes we are remaining attached to the
>>   parent, and so resumes the parent process,
>> 
>>   12. GDB now handles the EXECD event.  In our case we are detaching
>>   from the parent, so GDB calls target_detach (see
>>   handle_vfork_child_exec_or_exit),
>> 
>>   13. While this has been going on the vfork parent is executing, and
>>   might even exit,
>> 
>>   14. In linux_nat_target::detach the first thing we do is stop all
>>   threads in the process we're detaching from, the result of the stop
>>   request will be cached on the lwp_info object,
>> 
>>   15. In our case the vfork parent has exited though, so when GDB
>>   waits for the thread, instead of a stop due to signal, we instead
>>   get a thread exited status,
>> 
>>   16. Later in the detach process we try to resume the threads just
>>   prior to making the ptrace call to actually detach (see
>>   detach_one_lwp), as part of the process to resume a thread we try to
>>   touch some registers within the thread, and before doing this GDB
>>   asserts that the thread is stopped,
>> 
>>   17. An exited thread is not classified as stopped, and so the assert
>>   triggers!
>> 
>> Just like with the earlier commit, the fix is to spot the vfork parent
>> status of the thread, and not resume such threads.  Where the earlier
>> commit fixed this in linux-nat, in this case I think the fix should
>> live in infrun.c, in proceed_resume_thread_checked.  This function
>> already has a similar check to not resume the vfork parent in the case
>> where we are planning to follow the vfork parent, I propose adding a
>> similar case that checks for the vfork parent when we plan to follow
>> the vfork child.
>> 
>> This new check will mean that at step #6 above GDB doesn't try to
>> resume the vfork parent thread, which prevents the VFORK_DONE from
>> ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
>> the vfork child GDB will detach from the parent.
>> 
>> There's no test included in this commit.  In a subsequent commit I
>> will expand gdb.base/foll-vfork.exp which is when this bug would be
>> exposed.
>> 
>> If you do want to reproduce this failure then you will for certainly
>> need to run the gdb.base/foll-vfork.exp test in a loop as the failures
>> are all very timing sensitive.  I've found that running multiple
>> copies in parallel makes the failure more likely to appear, I usually
>> run ~6 copies in parallel and expect to see a failure after within
>> 10mins.
>
> Hi Andrew,
>
> Since this commit, I see this on native-gdbserver and
> native-extended-gdbserver:
>
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
>
> I haven't had the time to read this vfork series, but I look forward to,
> since I also did some vfork fixes not too long ago.

Thanks, I'll take a look.

Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-07-18 20:42     ` Simon Marchi
  2023-07-21  9:47       ` Andrew Burgess
@ 2023-07-23  8:53       ` Andrew Burgess
  2023-08-16 14:02         ` Andrew Burgess
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-07-23  8:53 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

Simon Marchi <simark@simark.ca> writes:

> On 2023-07-04 11:22, Andrew Burgess via Gdb-patches wrote:
>> Like the last few commit, this fixes yet another vfork related issue.
>> Like the commit titled:
>> 
>>   gdb: don't restart vfork parent while waiting for child to finish
>> 
>> which addressed a case in linux-nat where we would try to resume a
>> vfork parent, this commit addresses a very similar case, but this time
>> occurring in infrun.c.  Just like with that previous commit, this bug
>> results in the assert:
>> 
>>   x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
>> 
>> In this case the issue occurs when target-non-stop is on, but non-stop
>> is off, and again, schedule-multiple is on.  As with the previous
>> commit, GDB is in follow-fork-mode child.  If you have not done so, it
>> is worth reading the earlier commit as many of the problems leading to
>> the failure are the same, they just appear in a different part of GDB.
>> 
>> Here are the steps leading to the assertion failure:
>> 
>>   1. The user performs a 'next' over a vfork, GDB stop in the vfork
>>   child,
>> 
>>   2. As we are planning to follow the child GDB sets the vfork_parent
>>   and vfork_child member variables in the two inferiors, the
>>   thread_waiting_for_vfork_done member is left as nullptr, that member
>>   is only used when GDB is planning to follow the parent inferior,
>> 
>>   3. The user does 'continue', our expectation is that the vfork child
>>   should resume, and once that process has exited or execd, GDB should
>>   detach from the vfork parent.  As a result of the 'continue' GDB
>>   eventually enters the proceed function,
>> 
>>   4. In proceed we selected a ptid_t to resume, because
>>   schedule-multiple is on we select minus_one_ptid (see
>>   user_visible_resume_ptid),
>> 
>>   5. As GDB is running in all-stop on top of non-stop mode, in the
>>   proceed function we iterate over all threads that match the resume
>>   ptid, which turns out to be all threads, and call
>>   proceed_resume_thread_checked.  One of the threads we iterate over
>>   is the vfork parent thread,
>> 
>>   6. As the thread passed to proceed_resume_thread_checked doesn't
>>   match any of the early return conditions, GDB will set the thread
>>   resumed,
>> 
>>   7. As we are resuming one thread at a time, this thread is seen by
>>   the lower layers (e.g. linux-nat) as the "event thread", which means
>>   we don't apply any of the checks, e.g. is this thread a
>>   vfork parent, instead we assume that GDB core knows what it's doing,
>>   and linux-nat will resume the thread, we have now incorrectly set
>>   running the vfork parent thread when this thread should be waiting
>>   for the vfork child to complete,
>> 
>>   8. Back in the proceed function GDB continues to iterate over all
>>   threads, and now (correctly) resumes the vfork child thread,
>> 
>>   8. As the vfork child is still alive the kernel holds the vfork
>>   parent stopped,
>> 
>>   9. Eventually the child performs its exec and GDB is sent and EXECD
>>   event.  However, because the parent is resumed, as soon as the child
>>   performs its exec the vfork parent also sends a VFORK_DONE event to
>>   GDB,
>> 
>>   10. Depending on timing both of these events might seem to arrive in
>>   GDB at the same time.  Normally GDB expects to see the EXECD or
>>   EXITED/SIGNALED event from the vfork child before getting the
>>   VFORK_DONE in the parent.  We know this because it is as a result of
>>   the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
>>   handle_vfork_child_exec_or_exit for details).  Further the comment
>>   in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
>>   when we remain attached to the child (not the parent) we should not
>>   expect to see a VFORK_DONE,
>> 
>>   11. If both events arrive at the same time then GDB will randomly
>>   choose one event to handle first, in some cases this will be the
>>   VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
>>   expects that (a) the vfork child has finished, however, in this case
>>   this is not completely true, the child has finished, but GDB has not
>>   processed the event associated with the completion yet, and (b) upon
>>   seeing a VFORK_DONE GDB assumes we are remaining attached to the
>>   parent, and so resumes the parent process,
>> 
>>   12. GDB now handles the EXECD event.  In our case we are detaching
>>   from the parent, so GDB calls target_detach (see
>>   handle_vfork_child_exec_or_exit),
>> 
>>   13. While this has been going on the vfork parent is executing, and
>>   might even exit,
>> 
>>   14. In linux_nat_target::detach the first thing we do is stop all
>>   threads in the process we're detaching from, the result of the stop
>>   request will be cached on the lwp_info object,
>> 
>>   15. In our case the vfork parent has exited though, so when GDB
>>   waits for the thread, instead of a stop due to signal, we instead
>>   get a thread exited status,
>> 
>>   16. Later in the detach process we try to resume the threads just
>>   prior to making the ptrace call to actually detach (see
>>   detach_one_lwp), as part of the process to resume a thread we try to
>>   touch some registers within the thread, and before doing this GDB
>>   asserts that the thread is stopped,
>> 
>>   17. An exited thread is not classified as stopped, and so the assert
>>   triggers!
>> 
>> Just like with the earlier commit, the fix is to spot the vfork parent
>> status of the thread, and not resume such threads.  Where the earlier
>> commit fixed this in linux-nat, in this case I think the fix should
>> live in infrun.c, in proceed_resume_thread_checked.  This function
>> already has a similar check to not resume the vfork parent in the case
>> where we are planning to follow the vfork parent, I propose adding a
>> similar case that checks for the vfork parent when we plan to follow
>> the vfork child.
>> 
>> This new check will mean that at step #6 above GDB doesn't try to
>> resume the vfork parent thread, which prevents the VFORK_DONE from
>> ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
>> the vfork child GDB will detach from the parent.
>> 
>> There's no test included in this commit.  In a subsequent commit I
>> will expand gdb.base/foll-vfork.exp which is when this bug would be
>> exposed.
>> 
>> If you do want to reproduce this failure then you will for certainly
>> need to run the gdb.base/foll-vfork.exp test in a loop as the failures
>> are all very timing sensitive.  I've found that running multiple
>> copies in parallel makes the failure more likely to appear, I usually
>> run ~6 copies in parallel and expect to see a failure after within
>> 10mins.
>
> Hi Andrew,
>
> Since this commit, I see this on native-gdbserver and
> native-extended-gdbserver:
>
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
>
> I haven't had the time to read this vfork series, but I look forward to,
> since I also did some vfork fixes not too long ago.

If I remember correctly your fixes focused on the follow-parent side of
vfork, while the fixes I looked at focused on the follow-child side.

I have some more vfork fixes that I'm working on, which I'm hoping to
get posted soon, but I have a couple of other tasks I need to get done
first.

Anyway, below is my proposed fix for the above regressions.  The GDB
part of the fix is trivial, then there's a bunch of changes to the above
test script so that we check more cases.  Let me know what you think.

Thanks,
Andrew

---

commit a6609bd4fcf8d6d5718a7fb093dbaa34286938b6
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Sat Jul 22 15:32:29 2023 +0100

    gdb: fix vfork regressions when target-non-stop is off
    
    It was pointed out on the mailing list[1] that after this commit:
    
      commit b1e0126ec56e099d753c20e91a9f8623aabd6b46
      Date:   Wed Jun 21 14:18:54 2023 +0100
    
          gdb: don't resume vfork parent while child is still running
    
    the test gdb.base/vfork-follow-parent.exp now has some failures when
    run with the native-gdbserver or native-extended-gdbserver boards:
    
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
    
    The reason that these failures don't show up when run on the standard
    unix board is that the test is only run in the default operating mode,
    so for Linux this will be all-stop on top of non-stop.
    
    If we adjust the test script so that it runs in the default mode and
    with target-non-stop turned off, then we see the same failures on the
    unix board.  This commit includes this change.
    
    The way that the test is written means that it is not (currently)
    possible to turn on non-stop mode and have the test still work, so
    this commit does not do that.
    
    I have also updated the test script so that the vfork child performs
    an exec as well as the current exit.  Exec and exit are the two ways
    in which a vfork child can release the vfork parent, so testing both
    of these cases is useful I think.
    
    In this test the inferior performs a vfork and the vfork-child
    immediately exits.  The vfork-parent will wait for the vfork-child and
    then blocks waiting for gdb.  Once gdb has released the vfork-parent,
    the vfork-parent also exits.
    
    In the test that fails, GDB sets 'detach-on-fork off' and then runs to
    the vfork.  At this point the test tries to just "continue", but this
    fails as the vfork-parent is still selected, and the parent can't
    continue until the vfork-child completes.  As the vfork-child is
    stopped by GDB the parent will never stop once resumed, so GDB refuses
    to resume it.
    
    The test script then sets 'schedule-multiple on' and once again
    continues.  This time GDB, in theory, resumes both the parent and the
    child, the parent will be held blocked by the kernel, but the child
    will run until it exits, and which point GDB stops again, this time
    with inferior 2, the newly exited vfork-child, selected.
    
    What happens after this in the test script is irrelevant as far as
    this failure is concerned.
    
    To understand why the test started failing we should consider the
    behaviour of four different cases:
    
      1. All-stop-on-non-stop before commit b1e0126ec56e,
    
      2. All-stop-on-non-stop after commit b1e0126ec56e,
    
      3. All-stop-on-all-stop before commit b1e0126ec56e, and
    
      4. All-stop-on-all-stop after commit b1e0126ec56e.
    
    Only case #4 is failing after commit b1e0126ec56e, but I think the
    other cases are interesting because, (a) they inform how we might fix
    the regression, and (b) it turns out the behaviour of #2 changed too
    with the commit, but the change was harmless.
    
    For #1 All-stop-on-non-stop before commit b1e0126ec56e, what happens
    is:
    
      1. GDB calls proceed with the vfork-parent selected, as schedule
         multiple is on user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid (see proceed function),
    
      2. As this is all-stop-on-non-stop, every thread is resumed
        individually, so GDB tries to resume both the vfork-parent and the
        vfork-child, both of which succeed,
    
      3. The vfork-parent is held stopped by the kernel,
    
      4. The vfork-child completes (exits) at which point the GDB sees the
         EXITED event for the vfork-child and the VFORK_DONE event for the
         vfork-parent,
    
      5. At this point we might take two paths depending on which event
         GDB handles first, if GDB handles the VFORK_DONE first then:
    
         (a) As GDB is controlling both parent and child the VFORK_DONE is
             ignored (see handle_vfork_done), the vfork-parent will be
             resumed,
    
         (b) GDB processes the EXITED event, selects the (now defunct)
             vfork-child, and stops, returning control to the user.
    
         Alternatively, if GDB selects the EXITED event first then:
    
         (c) GDB processes the EXITED event, selects the (now defunct)
             vfork-child, and stops, returning control to the user.
    
         (d) At some future time the user resumes the vfork-parent, at
             which point the VFORK_DONE is reported to GDB, however, GDB
             is ignoring the VFORK_DONE (see handle_vfork_done), so the
             parent is resumed.
    
    For case #2, all-stop-on-non-stop after commit b1e0126ec56e, the
    important difference is in step (2) above, now, instead of resuming
    both the vfork-parent and the vfork-child, only the vfork-child is
    resumed.  As such, when we get to step (5), only a single event, the
    EXITED event is reported.
    
    GDB handles the EXITED just as in (5)(c), then, later, when the user
    resumes the vfork-parent, the VFORKED_DONE is immediately delivered
    from the kernel, but this is ignored just as in (5)(d), and so,
    though the pattern of when the vfork-parent is resumed changes, the
    overall pattern of which events are reported and when, doesn't
    actually change.  In fact, by not resuming the vfork-parent, the order
    of events (in this test) is now deterministic, which (maybe?) is a
    good thing.
    
    If we now consider case #3, all-stop-on-all-stop before commit
    b1e0126ec56e, then what happens is:
    
      1. GDB calls proceed with the vfork-parent selected, as schedule
         multiple is on user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid (see proceed function),
    
      2. As this is all-stop-on-all-stop, the resume is passed down to the
         linux-nat target, the vfork-parent is the event thread, while the
         vfork-child is a sibling of the event thread,
    
      3. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
         for all threads, this causes the vfork-child to be resumed.  Then
         in linux_nat_target::resume, the event thread, the vfork-parent,
         is also resumed.
    
      4. The vfork-parent is held stopped by the kernel,
    
      5. The vfork-child completes (exits) at which point the GDB sees the
         EXITED event for the vfork-child and the VFORK_DONE event for the
         vfork-parent,
    
      6. We are now in a situation identical to step (5) as for
         all-stop-on-non-stop above, GDB selects one of the events to
         handle, and whichever we select the user sees the correct
         behaviour.
    
    And so, finally, we can consider #4, all-stop-on-all-stop after commit
    b1e0126ec56e, this is the case that started failing.
    
    We start out just like above, in proceed, the resume_ptid is
    -1 (resume everything), due to schedule multiple being on.  And just
    like above, due to the target being all-stop, we call
    proceed_resume_thread_checked just once, for the current thread,
    which, remember, is the vfork-parent thread.
    
    The change in commit b1e0126ec56e was to avoid resuming a vfork-parent
    thread, read the commit message for the justification for this change.
    
    However, this means that GDB now rejects resuming the vfork-parent in
    this case, which means that nothing gets resumed!  Obviously, if
    nothing resumes, then nothing will ever stop, and so GDB appears to
    hang.
    
    I considered a couple of solutions which, in the end, I didn't go
    with, these were:
    
      1. Move the vfork-parent check out of proceed_resume_thread_checked,
         and place it in proceed, but only on the all-stop-on-non-stop
         path, this should still address the issue seen in b1e0126ec56e,
         but would avoid the issue seen here.  I rejected this just
         because it didn't feel great to split the checks that exist in
         proceed_resume_thread_checked like this,
    
      2. Extend the condition in proceed_resume_thread_checked by adding a
         target_is_non_stop_p check.  This would have the same effect as
         idea 1, but leaves all the checks in the same place, which I
         think would be better, but this still just didn't feel right to
         me, and so,
    
    What I noticed was that for the all-stop-on-non-stop, after commit
    b1e0126ec56e, we only resumed the vfork-child, and this seems fine.
    The vfork-parent isn't going to run anyway (the kernel will hold it
    back), so if feels like we there's no harm in just waiting for the
    child to complete, and then resuming the parent.
    
    So then I started looking at follow_fork, which is called from the top
    of proceed.  This function already has the task of switching between
    the parent and child based on which the user wishes to follow.  So, I
    wondered, could we use this to switch to the vfork-child in the case
    that we are attached to both?
    
    Turns out this is pretty simple to do.
    
    Having done that, now the process is for all-stop-on-all-stop after
    commit b1e0126ec56e, and with this new fix is:
    
      1. GDB calls proceed with the vfork-parent selected, but,
    
      2. In follow_fork, and follow_fork_inferior, GDB switches the
         selected thread to be that of the vfork-child,
    
      3. Back in proceed user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid still, but now,
    
      4. When GDB calls proceed_resume_thread_checked, the vfork-child is
         the current selected thread, this is not a vfork-parent, and so
         GDB allows the proceed to continue to the linux-nat target,
    
      5. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
         for all threads, this does not resume the vfork-parent (because
         it is a vfork-parent), and then the vfork-child is resumed as
         this is the event thread,
    
    At this point we are back in the same situation as for
    all-stop-on-non-stop after commit b1e0126ec56e, that is, the
    vfork-child is resumed, while the vfork-parent is held stopped by
    GDB.
    
    Eventually the vfork-child will exit or exec, at which point the
    vfork-parent will be resumed.
    
    [1] https://inbox.sourceware.org/gdb-patches/3e1e1db0-13d9-dd32-b4bb-051149ae6e76@simark.ca/

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 7efa0617526..7f4e6e50d6b 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -713,7 +713,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	 (do not restore the parent as the current inferior).  */
       gdb::optional<scoped_restore_current_thread> maybe_restore;
 
-      if (!follow_child)
+      if (!follow_child && !sched_multi)
 	maybe_restore.emplace ();
 
       switch_to_thread (*child_inf->threads ().begin ());
@@ -3400,8 +3400,10 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   struct gdbarch *gdbarch;
   CORE_ADDR pc;
 
-  /* If we're stopped at a fork/vfork, follow the branch set by the
-     "set follow-fork-mode" command; otherwise, we'll just proceed
+  /* If we're stopped at a fork/vfork, switch to either the parent or child
+     thread as defined by the "set follow-fork-mode" command, or, if both
+     the parent and child are controlled by GDB, and schedule-multiple is
+     on, follow the child.  If none of the above apply then we just proceed
      resuming the current thread.  */
   if (!follow_fork ())
     {
diff --git a/gdb/testsuite/gdb.base/vfork-follow-parent.c b/gdb/testsuite/gdb.base/vfork-follow-parent.c
index df45b9c2dbe..15ff84a0bad 100644
--- a/gdb/testsuite/gdb.base/vfork-follow-parent.c
+++ b/gdb/testsuite/gdb.base/vfork-follow-parent.c
@@ -17,6 +17,10 @@
 
 #include <unistd.h>
 
+#include <string.h>
+#include <limits.h>
+#include <stdio.h>
+
 static volatile int unblock_parent = 0;
 
 static void
@@ -25,7 +29,7 @@ break_parent (void)
 }
 
 int
-main (void)
+main (int argc, char **argv)
 {
   alarm (30);
 
@@ -40,7 +44,28 @@ main (void)
       break_parent ();
     }
   else
-    _exit (0);
+    {
+#if defined TEST_EXEC
+      char prog[PATH_MAX];
+      int len;
+
+      strcpy (prog, argv[0]);
+      len = strlen (prog);
+      for (; len > 0; --len)
+	{
+	  if (prog[len - 1] == '/')
+	    break;
+	}
+      strcpy (&prog[len], "vforked-prog");
+      execlp (prog, prog, (char *) 0);
+      perror ("exec failed");
+      _exit (1);
+#elif defined TEST_EXIT
+      _exit (0);
+#else
+#error Define TEST_EXEC or TEST_EXIT
+#endif
+    }
 
   return 0;
 }
diff --git a/gdb/testsuite/gdb.base/vfork-follow-parent.exp b/gdb/testsuite/gdb.base/vfork-follow-parent.exp
index 89c38001dac..ee1aef128bc 100644
--- a/gdb/testsuite/gdb.base/vfork-follow-parent.exp
+++ b/gdb/testsuite/gdb.base/vfork-follow-parent.exp
@@ -19,20 +19,40 @@
 # schedule-multiple on" or "set detach-on-fork on".  Test these two resolution
 # methods.
 
-standard_testfile
+standard_testfile .c vforked-prog.c
 
-if { [build_executable "failed to prepare" \
-	${testfile} ${srcfile}] } {
+set binfile ${testfile}-exit
+set binfile2 ${testfile}-exec
+set binfile3 vforked-prog
+
+set opts [list debug additional_flags=-DTEST_EXIT]
+if { [build_executable "compile ${binfile}" ${binfile} ${srcfile} ${opts}] } {
+    untested "failed to compile first test binary"
     return
 }
 
+set opts [list debug additional_flags=-DTEST_EXEC]
+if { [build_executable "compile ${binfile2}" ${binfile2} ${srcfile} ${opts}] } {
+    untested "failed to compile second test binary"
+    return
+}
+
+if { [build_executable "compile $binfile3" $binfile3 $srcfile2] } {
+    untested "failed to compile third test binary"
+    return -1
+}
+
 # Test running to the "Can not resume the parent..." message.  Then, resolve
 # the situation using the method in RESOLUTION_METHOD, either "detach-on-fork"
 # or "schedule-multiple" (the two alternatives the message suggests to the
 # user).
 
-proc do_test { resolution_method } {
-    clean_restart $::binfile
+proc do_test { exec_file resolution_method target_non_stop non_stop } {
+    save_vars { ::GDBFLAGS } {
+	append ::GDBFLAGS " -ex \"maint set target-non-stop ${target_non_stop}\""
+	append ::GDBFLAGS " -ex \"set non-stop ${non_stop}\""
+	clean_restart $exec_file
+    }
 
     gdb_test_no_output "set detach-on-fork off"
 
@@ -40,6 +60,10 @@ proc do_test { resolution_method } {
 	return
     }
 
+    # Delete the breakpoint on main so we don't bit the breakpoint in
+    # the case that the vfork child performs an exec.
+    delete_breakpoints
+
     gdb_test "break break_parent"
 
     gdb_test "continue" \
@@ -75,6 +99,16 @@ proc do_test { resolution_method } {
 	"continue to break_parent"
 }
 
-foreach_with_prefix resolution_method {detach-on-fork schedule-multiple} {
-    do_test $resolution_method
+foreach_with_prefix exec_file [list $binfile $binfile2] {
+    foreach_with_prefix target-non-stop {on off} {
+	# This test was written assuming non-stop mode is off.
+	foreach_with_prefix non-stop {off} {
+	    if {!${target-non-stop} && ${non-stop}} {
+		continue
+	    }
+	    foreach_with_prefix resolution_method {detach-on-fork schedule-multiple} {
+		do_test $exec_file $resolution_method ${target-non-stop} ${non-stop}
+	    }
+	}
+    }
 }


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-07-23  8:53       ` Andrew Burgess
@ 2023-08-16 14:02         ` Andrew Burgess
  2023-08-17  6:36           ` Tom de Vries
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Burgess @ 2023-08-16 14:02 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

Andrew Burgess <aburgess@redhat.com> writes:

> Simon Marchi <simark@simark.ca> writes:
>
>> On 2023-07-04 11:22, Andrew Burgess via Gdb-patches wrote:
>>> Like the last few commit, this fixes yet another vfork related issue.
>>> Like the commit titled:
>>> 
>>>   gdb: don't restart vfork parent while waiting for child to finish
>>> 
>>> which addressed a case in linux-nat where we would try to resume a
>>> vfork parent, this commit addresses a very similar case, but this time
>>> occurring in infrun.c.  Just like with that previous commit, this bug
>>> results in the assert:
>>> 
>>>   x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
>>> 
>>> In this case the issue occurs when target-non-stop is on, but non-stop
>>> is off, and again, schedule-multiple is on.  As with the previous
>>> commit, GDB is in follow-fork-mode child.  If you have not done so, it
>>> is worth reading the earlier commit as many of the problems leading to
>>> the failure are the same, they just appear in a different part of GDB.
>>> 
>>> Here are the steps leading to the assertion failure:
>>> 
>>>   1. The user performs a 'next' over a vfork, GDB stop in the vfork
>>>   child,
>>> 
>>>   2. As we are planning to follow the child GDB sets the vfork_parent
>>>   and vfork_child member variables in the two inferiors, the
>>>   thread_waiting_for_vfork_done member is left as nullptr, that member
>>>   is only used when GDB is planning to follow the parent inferior,
>>> 
>>>   3. The user does 'continue', our expectation is that the vfork child
>>>   should resume, and once that process has exited or execd, GDB should
>>>   detach from the vfork parent.  As a result of the 'continue' GDB
>>>   eventually enters the proceed function,
>>> 
>>>   4. In proceed we selected a ptid_t to resume, because
>>>   schedule-multiple is on we select minus_one_ptid (see
>>>   user_visible_resume_ptid),
>>> 
>>>   5. As GDB is running in all-stop on top of non-stop mode, in the
>>>   proceed function we iterate over all threads that match the resume
>>>   ptid, which turns out to be all threads, and call
>>>   proceed_resume_thread_checked.  One of the threads we iterate over
>>>   is the vfork parent thread,
>>> 
>>>   6. As the thread passed to proceed_resume_thread_checked doesn't
>>>   match any of the early return conditions, GDB will set the thread
>>>   resumed,
>>> 
>>>   7. As we are resuming one thread at a time, this thread is seen by
>>>   the lower layers (e.g. linux-nat) as the "event thread", which means
>>>   we don't apply any of the checks, e.g. is this thread a
>>>   vfork parent, instead we assume that GDB core knows what it's doing,
>>>   and linux-nat will resume the thread, we have now incorrectly set
>>>   running the vfork parent thread when this thread should be waiting
>>>   for the vfork child to complete,
>>> 
>>>   8. Back in the proceed function GDB continues to iterate over all
>>>   threads, and now (correctly) resumes the vfork child thread,
>>> 
>>>   8. As the vfork child is still alive the kernel holds the vfork
>>>   parent stopped,
>>> 
>>>   9. Eventually the child performs its exec and GDB is sent and EXECD
>>>   event.  However, because the parent is resumed, as soon as the child
>>>   performs its exec the vfork parent also sends a VFORK_DONE event to
>>>   GDB,
>>> 
>>>   10. Depending on timing both of these events might seem to arrive in
>>>   GDB at the same time.  Normally GDB expects to see the EXECD or
>>>   EXITED/SIGNALED event from the vfork child before getting the
>>>   VFORK_DONE in the parent.  We know this because it is as a result of
>>>   the EXECD/EXITED/SIGNALED that GDB detaches from the parent (see
>>>   handle_vfork_child_exec_or_exit for details).  Further the comment
>>>   in target/waitstatus.h on TARGET_WAITKIND_VFORK_DONE indicates that
>>>   when we remain attached to the child (not the parent) we should not
>>>   expect to see a VFORK_DONE,
>>> 
>>>   11. If both events arrive at the same time then GDB will randomly
>>>   choose one event to handle first, in some cases this will be the
>>>   VFORK_DONE.  As described above, upon seeing a VFORK_DONE GDB
>>>   expects that (a) the vfork child has finished, however, in this case
>>>   this is not completely true, the child has finished, but GDB has not
>>>   processed the event associated with the completion yet, and (b) upon
>>>   seeing a VFORK_DONE GDB assumes we are remaining attached to the
>>>   parent, and so resumes the parent process,
>>> 
>>>   12. GDB now handles the EXECD event.  In our case we are detaching
>>>   from the parent, so GDB calls target_detach (see
>>>   handle_vfork_child_exec_or_exit),
>>> 
>>>   13. While this has been going on the vfork parent is executing, and
>>>   might even exit,
>>> 
>>>   14. In linux_nat_target::detach the first thing we do is stop all
>>>   threads in the process we're detaching from, the result of the stop
>>>   request will be cached on the lwp_info object,
>>> 
>>>   15. In our case the vfork parent has exited though, so when GDB
>>>   waits for the thread, instead of a stop due to signal, we instead
>>>   get a thread exited status,
>>> 
>>>   16. Later in the detach process we try to resume the threads just
>>>   prior to making the ptrace call to actually detach (see
>>>   detach_one_lwp), as part of the process to resume a thread we try to
>>>   touch some registers within the thread, and before doing this GDB
>>>   asserts that the thread is stopped,
>>> 
>>>   17. An exited thread is not classified as stopped, and so the assert
>>>   triggers!
>>> 
>>> Just like with the earlier commit, the fix is to spot the vfork parent
>>> status of the thread, and not resume such threads.  Where the earlier
>>> commit fixed this in linux-nat, in this case I think the fix should
>>> live in infrun.c, in proceed_resume_thread_checked.  This function
>>> already has a similar check to not resume the vfork parent in the case
>>> where we are planning to follow the vfork parent, I propose adding a
>>> similar case that checks for the vfork parent when we plan to follow
>>> the vfork child.
>>> 
>>> This new check will mean that at step #6 above GDB doesn't try to
>>> resume the vfork parent thread, which prevents the VFORK_DONE from
>>> ever arriving.  Once GDB sees the EXECD/EXITED/SIGNALLED event from
>>> the vfork child GDB will detach from the parent.
>>> 
>>> There's no test included in this commit.  In a subsequent commit I
>>> will expand gdb.base/foll-vfork.exp which is when this bug would be
>>> exposed.
>>> 
>>> If you do want to reproduce this failure then you will for certainly
>>> need to run the gdb.base/foll-vfork.exp test in a loop as the failures
>>> are all very timing sensitive.  I've found that running multiple
>>> copies in parallel makes the failure more likely to appear, I usually
>>> run ~6 copies in parallel and expect to see a failure after within
>>> 10mins.
>>
>> Hi Andrew,
>>
>> Since this commit, I see this on native-gdbserver and
>> native-extended-gdbserver:
>>
>> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
>> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
>> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
>> FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
>>
>> I haven't had the time to read this vfork series, but I look forward to,
>> since I also did some vfork fixes not too long ago.
>
> If I remember correctly your fixes focused on the follow-parent side of
> vfork, while the fixes I looked at focused on the follow-child side.
>
> I have some more vfork fixes that I'm working on, which I'm hoping to
> get posted soon, but I have a couple of other tasks I need to get done
> first.
>
> Anyway, below is my proposed fix for the above regressions.  The GDB
> part of the fix is trivial, then there's a bunch of changes to the above
> test script so that we check more cases.  Let me know what you think.

Given this was fixing a regression, I've gone ahead and pushed this
fix.  If there's any follow-up feedback, I'm happy to address it.

I actually tweaked the test slightly so that it would pass on boards
with an actual remote target (e.g. local-remote-host-native).  The
version I pushed is below.

Thanks,
Andrew

---

commit 05e1cac2496f842f70744dc5210fb3072ef32f3a
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Sat Jul 22 15:32:29 2023 +0100

    gdb: fix vfork regressions when target-non-stop is off
    
    It was pointed out on the mailing list[1] that after this commit:
    
      commit b1e0126ec56e099d753c20e91a9f8623aabd6b46
      Date:   Wed Jun 21 14:18:54 2023 +0100
    
          gdb: don't resume vfork parent while child is still running
    
    the test gdb.base/vfork-follow-parent.exp now has some failures when
    run with the native-gdbserver or native-extended-gdbserver boards:
    
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: print unblock_parent = 1 (timeout)
      FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (timeout)
    
    The reason that these failures don't show up when run on the standard
    unix board is that the test is only run in the default operating mode,
    so for Linux this will be all-stop on top of non-stop.
    
    If we adjust the test script so that it runs in the default mode and
    with target-non-stop turned off, then we see the same failures on the
    unix board.  This commit includes this change.
    
    The way that the test is written means that it is not (currently)
    possible to turn on non-stop mode and have the test still work, so
    this commit does not do that.
    
    I have also updated the test script so that the vfork child performs
    an exec as well as the current exit.  Exec and exit are the two ways
    in which a vfork child can release the vfork parent, so testing both
    of these cases is useful I think.
    
    In this test the inferior performs a vfork and the vfork-child
    immediately exits.  The vfork-parent will wait for the vfork-child and
    then blocks waiting for gdb.  Once gdb has released the vfork-parent,
    the vfork-parent also exits.
    
    In the test that fails, GDB sets 'detach-on-fork off' and then runs to
    the vfork.  At this point the test tries to just "continue", but this
    fails as the vfork-parent is still selected, and the parent can't
    continue until the vfork-child completes.  As the vfork-child is
    stopped by GDB the parent will never stop once resumed, so GDB refuses
    to resume it.
    
    The test script then sets 'schedule-multiple on' and once again
    continues.  This time GDB, in theory, resumes both the parent and the
    child, the parent will be held blocked by the kernel, but the child
    will run until it exits, and which point GDB stops again, this time
    with inferior 2, the newly exited vfork-child, selected.
    
    What happens after this in the test script is irrelevant as far as
    this failure is concerned.
    
    To understand why the test started failing we should consider the
    behaviour of four different cases:
    
      1. All-stop-on-non-stop before commit b1e0126ec56e,
    
      2. All-stop-on-non-stop after commit b1e0126ec56e,
    
      3. All-stop-on-all-stop before commit b1e0126ec56e, and
    
      4. All-stop-on-all-stop after commit b1e0126ec56e.
    
    Only case #4 is failing after commit b1e0126ec56e, but I think the
    other cases are interesting because, (a) they inform how we might fix
    the regression, and (b) it turns out the behaviour of #2 changed too
    with the commit, but the change was harmless.
    
    For #1 All-stop-on-non-stop before commit b1e0126ec56e, what happens
    is:
    
      1. GDB calls proceed with the vfork-parent selected, as schedule
         multiple is on user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid (see proceed function),
    
      2. As this is all-stop-on-non-stop, every thread is resumed
        individually, so GDB tries to resume both the vfork-parent and the
        vfork-child, both of which succeed,
    
      3. The vfork-parent is held stopped by the kernel,
    
      4. The vfork-child completes (exits) at which point the GDB sees the
         EXITED event for the vfork-child and the VFORK_DONE event for the
         vfork-parent,
    
      5. At this point we might take two paths depending on which event
         GDB handles first, if GDB handles the VFORK_DONE first then:
    
         (a) As GDB is controlling both parent and child the VFORK_DONE is
             ignored (see handle_vfork_done), the vfork-parent will be
             resumed,
    
         (b) GDB processes the EXITED event, selects the (now defunct)
             vfork-child, and stops, returning control to the user.
    
         Alternatively, if GDB selects the EXITED event first then:
    
         (c) GDB processes the EXITED event, selects the (now defunct)
             vfork-child, and stops, returning control to the user.
    
         (d) At some future time the user resumes the vfork-parent, at
             which point the VFORK_DONE is reported to GDB, however, GDB
             is ignoring the VFORK_DONE (see handle_vfork_done), so the
             parent is resumed.
    
    For case #2, all-stop-on-non-stop after commit b1e0126ec56e, the
    important difference is in step (2) above, now, instead of resuming
    both the vfork-parent and the vfork-child, only the vfork-child is
    resumed.  As such, when we get to step (5), only a single event, the
    EXITED event is reported.
    
    GDB handles the EXITED just as in (5)(c), then, later, when the user
    resumes the vfork-parent, the VFORKED_DONE is immediately delivered
    from the kernel, but this is ignored just as in (5)(d), and so,
    though the pattern of when the vfork-parent is resumed changes, the
    overall pattern of which events are reported and when, doesn't
    actually change.  In fact, by not resuming the vfork-parent, the order
    of events (in this test) is now deterministic, which (maybe?) is a
    good thing.
    
    If we now consider case #3, all-stop-on-all-stop before commit
    b1e0126ec56e, then what happens is:
    
      1. GDB calls proceed with the vfork-parent selected, as schedule
         multiple is on user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid (see proceed function),
    
      2. As this is all-stop-on-all-stop, the resume is passed down to the
         linux-nat target, the vfork-parent is the event thread, while the
         vfork-child is a sibling of the event thread,
    
      3. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
         for all threads, this causes the vfork-child to be resumed.  Then
         in linux_nat_target::resume, the event thread, the vfork-parent,
         is also resumed.
    
      4. The vfork-parent is held stopped by the kernel,
    
      5. The vfork-child completes (exits) at which point the GDB sees the
         EXITED event for the vfork-child and the VFORK_DONE event for the
         vfork-parent,
    
      6. We are now in a situation identical to step (5) as for
         all-stop-on-non-stop above, GDB selects one of the events to
         handle, and whichever we select the user sees the correct
         behaviour.
    
    And so, finally, we can consider #4, all-stop-on-all-stop after commit
    b1e0126ec56e, this is the case that started failing.
    
    We start out just like above, in proceed, the resume_ptid is
    -1 (resume everything), due to schedule multiple being on.  And just
    like above, due to the target being all-stop, we call
    proceed_resume_thread_checked just once, for the current thread,
    which, remember, is the vfork-parent thread.
    
    The change in commit b1e0126ec56e was to avoid resuming a vfork-parent
    thread, read the commit message for the justification for this change.
    
    However, this means that GDB now rejects resuming the vfork-parent in
    this case, which means that nothing gets resumed!  Obviously, if
    nothing resumes, then nothing will ever stop, and so GDB appears to
    hang.
    
    I considered a couple of solutions which, in the end, I didn't go
    with, these were:
    
      1. Move the vfork-parent check out of proceed_resume_thread_checked,
         and place it in proceed, but only on the all-stop-on-non-stop
         path, this should still address the issue seen in b1e0126ec56e,
         but would avoid the issue seen here.  I rejected this just
         because it didn't feel great to split the checks that exist in
         proceed_resume_thread_checked like this,
    
      2. Extend the condition in proceed_resume_thread_checked by adding a
         target_is_non_stop_p check.  This would have the same effect as
         idea 1, but leaves all the checks in the same place, which I
         think would be better, but this still just didn't feel right to
         me, and so,
    
    What I noticed was that for the all-stop-on-non-stop, after commit
    b1e0126ec56e, we only resumed the vfork-child, and this seems fine.
    The vfork-parent isn't going to run anyway (the kernel will hold it
    back), so if feels like we there's no harm in just waiting for the
    child to complete, and then resuming the parent.
    
    So then I started looking at follow_fork, which is called from the top
    of proceed.  This function already has the task of switching between
    the parent and child based on which the user wishes to follow.  So, I
    wondered, could we use this to switch to the vfork-child in the case
    that we are attached to both?
    
    Turns out this is pretty simple to do.
    
    Having done that, now the process is for all-stop-on-all-stop after
    commit b1e0126ec56e, and with this new fix is:
    
      1. GDB calls proceed with the vfork-parent selected, but,
    
      2. In follow_fork, and follow_fork_inferior, GDB switches the
         selected thread to be that of the vfork-child,
    
      3. Back in proceed user_visible_resume_ptid returns -1 (everything)
         as the resume_ptid still, but now,
    
      4. When GDB calls proceed_resume_thread_checked, the vfork-child is
         the current selected thread, this is not a vfork-parent, and so
         GDB allows the proceed to continue to the linux-nat target,
    
      5. In linux_nat_target::resume, GDB calls linux_nat_resume_callback
         for all threads, this does not resume the vfork-parent (because
         it is a vfork-parent), and then the vfork-child is resumed as
         this is the event thread,
    
    At this point we are back in the same situation as for
    all-stop-on-non-stop after commit b1e0126ec56e, that is, the
    vfork-child is resumed, while the vfork-parent is held stopped by
    GDB.
    
    Eventually the vfork-child will exit or exec, at which point the
    vfork-parent will be resumed.
    
    [1] https://inbox.sourceware.org/gdb-patches/3e1e1db0-13d9-dd32-b4bb-051149ae6e76@simark.ca/

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 8286026e6c6..72852e63906 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -713,7 +713,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	 (do not restore the parent as the current inferior).  */
       gdb::optional<scoped_restore_current_thread> maybe_restore;
 
-      if (!follow_child)
+      if (!follow_child && !sched_multi)
 	maybe_restore.emplace ();
 
       switch_to_thread (*child_inf->threads ().begin ());
@@ -3400,8 +3400,10 @@ proceed (CORE_ADDR addr, enum gdb_signal siggnal)
   struct gdbarch *gdbarch;
   CORE_ADDR pc;
 
-  /* If we're stopped at a fork/vfork, follow the branch set by the
-     "set follow-fork-mode" command; otherwise, we'll just proceed
+  /* If we're stopped at a fork/vfork, switch to either the parent or child
+     thread as defined by the "set follow-fork-mode" command, or, if both
+     the parent and child are controlled by GDB, and schedule-multiple is
+     on, follow the child.  If none of the above apply then we just proceed
      resuming the current thread.  */
   if (!follow_fork ())
     {
diff --git a/gdb/testsuite/gdb.base/vfork-follow-parent.c b/gdb/testsuite/gdb.base/vfork-follow-parent.c
index df45b9c2dbe..15ff84a0bad 100644
--- a/gdb/testsuite/gdb.base/vfork-follow-parent.c
+++ b/gdb/testsuite/gdb.base/vfork-follow-parent.c
@@ -17,6 +17,10 @@
 
 #include <unistd.h>
 
+#include <string.h>
+#include <limits.h>
+#include <stdio.h>
+
 static volatile int unblock_parent = 0;
 
 static void
@@ -25,7 +29,7 @@ break_parent (void)
 }
 
 int
-main (void)
+main (int argc, char **argv)
 {
   alarm (30);
 
@@ -40,7 +44,28 @@ main (void)
       break_parent ();
     }
   else
-    _exit (0);
+    {
+#if defined TEST_EXEC
+      char prog[PATH_MAX];
+      int len;
+
+      strcpy (prog, argv[0]);
+      len = strlen (prog);
+      for (; len > 0; --len)
+	{
+	  if (prog[len - 1] == '/')
+	    break;
+	}
+      strcpy (&prog[len], "vforked-prog");
+      execlp (prog, prog, (char *) 0);
+      perror ("exec failed");
+      _exit (1);
+#elif defined TEST_EXIT
+      _exit (0);
+#else
+#error Define TEST_EXEC or TEST_EXIT
+#endif
+    }
 
   return 0;
 }
diff --git a/gdb/testsuite/gdb.base/vfork-follow-parent.exp b/gdb/testsuite/gdb.base/vfork-follow-parent.exp
index 89c38001dac..70b54e729a5 100644
--- a/gdb/testsuite/gdb.base/vfork-follow-parent.exp
+++ b/gdb/testsuite/gdb.base/vfork-follow-parent.exp
@@ -19,10 +19,28 @@
 # schedule-multiple on" or "set detach-on-fork on".  Test these two resolution
 # methods.
 
-standard_testfile
+standard_testfile .c vforked-prog.c
 
-if { [build_executable "failed to prepare" \
-	${testfile} ${srcfile}] } {
+set binfile ${testfile}-exit
+set binfile2 ${testfile}-exec
+set binfile3 vforked-prog
+
+if { [build_executable "compile $binfile3" $binfile3 $srcfile2] } {
+    untested "failed to compile third test binary"
+    return -1
+}
+
+set remote_exec_prog [gdb_remote_download target $binfile3]
+
+set opts [list debug additional_flags=-DTEST_EXIT]
+if { [build_executable "compile ${binfile}" ${binfile} ${srcfile} ${opts}] } {
+    untested "failed to compile first test binary"
+    return
+}
+
+set opts [list debug additional_flags=-DTEST_EXEC]
+if { [build_executable "compile ${binfile2}" ${binfile2} ${srcfile} ${opts}] } {
+    untested "failed to compile second test binary"
     return
 }
 
@@ -31,8 +49,12 @@ if { [build_executable "failed to prepare" \
 # or "schedule-multiple" (the two alternatives the message suggests to the
 # user).
 
-proc do_test { resolution_method } {
-    clean_restart $::binfile
+proc do_test { exec_file resolution_method target_non_stop non_stop } {
+    save_vars { ::GDBFLAGS } {
+	append ::GDBFLAGS " -ex \"maint set target-non-stop ${target_non_stop}\""
+	append ::GDBFLAGS " -ex \"set non-stop ${non_stop}\""
+	clean_restart $exec_file
+    }
 
     gdb_test_no_output "set detach-on-fork off"
 
@@ -40,6 +62,10 @@ proc do_test { resolution_method } {
 	return
     }
 
+    # Delete the breakpoint on main so we don't bit the breakpoint in
+    # the case that the vfork child performs an exec.
+    delete_breakpoints
+
     gdb_test "break break_parent"
 
     gdb_test "continue" \
@@ -75,6 +101,16 @@ proc do_test { resolution_method } {
 	"continue to break_parent"
 }
 
-foreach_with_prefix resolution_method {detach-on-fork schedule-multiple} {
-    do_test $resolution_method
+foreach_with_prefix exec_file [list $binfile $binfile2] {
+    foreach_with_prefix target-non-stop {on off} {
+	# This test was written assuming non-stop mode is off.
+	foreach_with_prefix non-stop {off} {
+	    if {!${target-non-stop} && ${non-stop}} {
+		continue
+	    }
+	    foreach_with_prefix resolution_method {detach-on-fork schedule-multiple} {
+		do_test $exec_file $resolution_method ${target-non-stop} ${non-stop}
+	    }
+	}
+    }
 }


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-08-16 14:02         ` Andrew Burgess
@ 2023-08-17  6:36           ` Tom de Vries
  2023-08-17  7:01             ` Tom de Vries
  0 siblings, 1 reply; 32+ messages in thread
From: Tom de Vries @ 2023-08-17  6:36 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

On 8/16/23 16:02, Andrew Burgess via Gdb-patches wrote:
> +set remote_exec_prog [gdb_remote_download target $binfile3]

With testing on target board unix I ran into:
...
ERROR: tcl error sourcing 
/data/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.exp.
ERROR: error copying "vforked-prog": no such file or directory
     while executing
"file copy -force $fromfile $tofile"
     (procedure "gdb_remote_download" line 29)
     invoked from within
"gdb_remote_download target $binfile3"
...

I managed to reproduce it, also with the mentioned target boards 
native-gdbserver and native-extended-gdbserver.

Did you mean something like:
...
if {[is_remote host]} {
     remote_upload target $binfile3
}
...
?

With that instead, I still see these FAILs with gdbserver boards:
...
FAIL: gdb.base/vfork-follow-parent.exp: 
exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
resolution_method=schedule-multiple: continue to end of inferior 2
FAIL: gdb.base/vfork-follow-parent.exp: 
exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
resolution_method=schedule-multiple: continue to break_parent
...
but not 100% reproducible.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-08-17  6:36           ` Tom de Vries
@ 2023-08-17  7:01             ` Tom de Vries
  2023-08-17  8:06               ` Tom de Vries
  0 siblings, 1 reply; 32+ messages in thread
From: Tom de Vries @ 2023-08-17  7:01 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

On 8/17/23 08:36, Tom de Vries wrote:
> On 8/16/23 16:02, Andrew Burgess via Gdb-patches wrote:
>> +set remote_exec_prog [gdb_remote_download target $binfile3]
> 
> With testing on target board unix I ran into:
> ...
> ERROR: tcl error sourcing 
> /data/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.exp.
> ERROR: error copying "vforked-prog": no such file or directory
>      while executing
> "file copy -force $fromfile $tofile"
>      (procedure "gdb_remote_download" line 29)
>      invoked from within
> "gdb_remote_download target $binfile3"
> ...
> 
> I managed to reproduce it, also with the mentioned target boards 
> native-gdbserver and native-extended-gdbserver.
> 
> Did you mean something like:
> ...
> if {[is_remote host]} {
>      remote_upload target $binfile3
> }
> ...
> ?
> 

Oops, of course that should be:
...
  if {[is_remote target]} {
       remote_upload target $binfile3
}
...

Thanks,
- Tom

> With that instead, I still see these FAILs with gdbserver boards:
> ...
> FAIL: gdb.base/vfork-follow-parent.exp: 
> exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
> resolution_method=schedule-multiple: continue to end of inferior 2
> FAIL: gdb.base/vfork-follow-parent.exp: 
> exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
> resolution_method=schedule-multiple: continue to break_parent
> ...
> but not 100% reproducible.
> 
> Thanks,
> - Tom


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-08-17  7:01             ` Tom de Vries
@ 2023-08-17  8:06               ` Tom de Vries
  2023-08-17  8:22                 ` Tom de Vries
  0 siblings, 1 reply; 32+ messages in thread
From: Tom de Vries @ 2023-08-17  8:06 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

On 8/17/23 09:01, Tom de Vries via Gdb-patches wrote:
> On 8/17/23 08:36, Tom de Vries wrote:
>> On 8/16/23 16:02, Andrew Burgess via Gdb-patches wrote:
>>> +set remote_exec_prog [gdb_remote_download target $binfile3]
>>
>> With testing on target board unix I ran into:
>> ...
>> ERROR: tcl error sourcing 
>> /data/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.exp.
>> ERROR: error copying "vforked-prog": no such file or directory
>>      while executing
>> "file copy -force $fromfile $tofile"
>>      (procedure "gdb_remote_download" line 29)
>>      invoked from within
>> "gdb_remote_download target $binfile3"
>> ...
>>
>> I managed to reproduce it, also with the mentioned target boards 
>> native-gdbserver and native-extended-gdbserver.
>>
>> Did you mean something like:
>> ...
>> if {[is_remote host]} {
>>      remote_upload target $binfile3
>> }
>> ...
>> ?
>>
> 
> Oops, of course that should be:
> ...
>   if {[is_remote target]} {
>        remote_upload target $binfile3
> }
> ...
> 

I've got a patch that works, I'll commit shortly.

FWIW, I got my uploads and download confused, I forgot dejagnu inverts 
the meaning of the concepts.

Thanks,
- Tom

> Thanks,
> - Tom
> 
>> With that instead, I still see these FAILs with gdbserver boards:
>> ...
>> FAIL: gdb.base/vfork-follow-parent.exp: 
>> exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
>> resolution_method=schedule-multiple: continue to end of inferior 2
>> FAIL: gdb.base/vfork-follow-parent.exp: 
>> exec_file=vfork-follow-parent-exec: target-non-stop=off: non-stop=off: 
>> resolution_method=schedule-multiple: continue to break_parent
>> ...
>> but not 100% reproducible.
>>
>> Thanks,
>> - Tom
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running
  2023-08-17  8:06               ` Tom de Vries
@ 2023-08-17  8:22                 ` Tom de Vries
  0 siblings, 0 replies; 32+ messages in thread
From: Tom de Vries @ 2023-08-17  8:22 UTC (permalink / raw)
  To: Andrew Burgess, Simon Marchi, gdb-patches; +Cc: tankut.baris.aktemur

On 8/17/23 10:06, Tom de Vries via Gdb-patches wrote:
> On 8/17/23 09:01, Tom de Vries via Gdb-patches wrote:
>> On 8/17/23 08:36, Tom de Vries wrote:
>>> On 8/16/23 16:02, Andrew Burgess via Gdb-patches wrote:
>>>> +set remote_exec_prog [gdb_remote_download target $binfile3]
>>>
>>> With testing on target board unix I ran into:
>>> ...
>>> ERROR: tcl error sourcing 
>>> /data/vries/gdb/src/gdb/testsuite/gdb.base/vfork-follow-parent.exp.
>>> ERROR: error copying "vforked-prog": no such file or directory
>>>      while executing
>>> "file copy -force $fromfile $tofile"
>>>      (procedure "gdb_remote_download" line 29)
>>>      invoked from within
>>> "gdb_remote_download target $binfile3"
>>> ...
>>>
>>> I managed to reproduce it, also with the mentioned target boards 
>>> native-gdbserver and native-extended-gdbserver.
>>>
>>> Did you mean something like:
>>> ...
>>> if {[is_remote host]} {
>>>      remote_upload target $binfile3
>>> }
>>> ...
>>> ?
>>>
>>
>> Oops, of course that should be:
>> ...
>>   if {[is_remote target]} {
>>        remote_upload target $binfile3
>> }
>> ...
>>
> 
> I've got a patch that works, I'll commit shortly.
> 

https://sourceware.org/pipermail/gdb-patches/2023-August/201694.html

Thanks,
- Tom

> FWIW, I got my uploads and download confused, I forgot dejagnu inverts 
> the meaning of the concepts.
> 
> Thanks,
> - Tom
> 
>> Thanks,
>> - Tom
>>
>>> With that instead, I still see these FAILs with gdbserver boards:
>>> ...
>>> FAIL: gdb.base/vfork-follow-parent.exp: 
>>> exec_file=vfork-follow-parent-exec: target-non-stop=off: 
>>> non-stop=off: resolution_method=schedule-multiple: continue to end of 
>>> inferior 2
>>> FAIL: gdb.base/vfork-follow-parent.exp: 
>>> exec_file=vfork-follow-parent-exec: target-non-stop=off: 
>>> non-stop=off: resolution_method=schedule-multiple: continue to 
>>> break_parent
>>> ...
>>> but not 100% reproducible.
>>>
>>> Thanks,
>>> - Tom
>>
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-08-17  8:23 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-22 13:17 [PATCH 0/8] Some vfork related fixes Andrew Burgess
2023-06-22 13:17 ` [PATCH 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
2023-06-22 13:17 ` [PATCH 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
2023-06-22 13:17 ` [PATCH 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
2023-06-22 13:17 ` [PATCH 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
2023-06-28  8:47   ` Aktemur, Tankut Baris
2023-07-04 15:24     ` Andrew Burgess
2023-06-22 13:17 ` [PATCH 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
2023-06-22 13:17 ` [PATCH 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
2023-06-22 13:17 ` [PATCH 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
2023-06-22 13:17 ` [PATCH 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
2023-07-04 15:22 ` [PATCHv2 0/8] Some vfork related fixes Andrew Burgess
2023-07-04 15:22   ` [PATCHv2 1/8] gdb: catch more errors in gdb.base/foll-vfork.exp Andrew Burgess
2023-07-04 15:22   ` [PATCHv2 2/8] gdb: don't restart vfork parent while waiting for child to finish Andrew Burgess
2023-07-05 10:08     ` Aktemur, Tankut Baris
2023-07-04 15:22   ` [PATCHv2 3/8] gdb: fix an issue with vfork in non-stop mode Andrew Burgess
2023-07-04 15:22   ` [PATCHv2 4/8] gdb, infrun: refactor part of `proceed` into separate function Andrew Burgess
2023-07-04 15:22   ` [PATCHv2 5/8] gdb: don't resume vfork parent while child is still running Andrew Burgess
2023-07-18 20:42     ` Simon Marchi
2023-07-21  9:47       ` Andrew Burgess
2023-07-23  8:53       ` Andrew Burgess
2023-08-16 14:02         ` Andrew Burgess
2023-08-17  6:36           ` Tom de Vries
2023-08-17  7:01             ` Tom de Vries
2023-08-17  8:06               ` Tom de Vries
2023-08-17  8:22                 ` Tom de Vries
2023-07-04 15:22   ` [PATCHv2 6/8] gdb/testsuite: expand gdb.base/foll-vfork.exp Andrew Burgess
2023-07-05 11:22     ` Aktemur, Tankut Baris
2023-07-04 15:22   ` [PATCHv2 7/8] gdb/testsuite: remove use of sleep from gdb.base/foll-vfork.exp Andrew Burgess
2023-07-04 15:22   ` [PATCHv2 8/8] gdb: additional debug output in infrun.c and linux-nat.c Andrew Burgess
2023-07-05 11:30   ` [PATCHv2 0/8] Some vfork related fixes Aktemur, Tankut Baris
2023-07-17  8:53     ` Andrew Burgess

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).