public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simon.marchi@polymtl.ca>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdbserver: hide fork child threads from GDB (ping)
Date: Fri, 12 Nov 2021 15:54:22 -0500	[thread overview]
Message-ID: <35710f09-2033-9ebf-d545-1f210166c235@polymtl.ca> (raw)
In-Reply-To: <20211029203332.69894-1-simon.marchi@polymtl.ca>

Ping.

On 2021-10-29 4:33 p.m., Simon Marchi wrote:
> This patch aims at fixing a bug where an inferior is unexpectedly
> created when a fork happens at the same time as another event, and that
> other event is reported to GDB first (and the fork event stays pending
> in GDBserver).  This happens for example when we step a thread and
> another thread forks at the same time.  The bug looks like (if I
> reproduce the included test by hand):
> 
>     (gdb) show detach-on-fork
>     Whether gdb will detach the child of a fork is on.
>     (gdb) show follow-fork-mode
>     Debugger response to a program call of fork or vfork is "parent".
>     (gdb) si
>     [New inferior 2]
>     Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
>     Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
>     Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
>     [New Thread 965190.965190]
>     [Switching to Thread 965190.965190]
>     Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
> 
> The sequence of events leading to the problem is:
> 
>  - We are using the all-stop user-visible mode as well as the
>    synchronous / all-stop variant of the remote protocol
>  - We have two threads, thread A that we single-step and thread B that
>    calls fork at the same time
>  - GDBserver's linux_process_target::wait pulls the "single step
>    complete SIGTRAP" and the "fork" events from the kernel.  It
>    arbitrarily choses one event to report, it happens to be the
>    single-step SIGTRAP.  The fork stays pending in the thread_info.
>  - GDBserver send that SIGTRAP as a stop reply to GDB
>  - While in stop_all_threads, GDB calls update_thread_list, which ends
>    up querying the remote thread list using qXfer:threads:read.
>  - In the reply, GDBserver includes the fork child created as a result
>    of thread B's fork.
>  - GDB-side, the remote target sees the new PID, calls
>    remote_notice_new_inferior, which ends up unexpectedly creating a new
>    inferior, and things go downhill from there.
> 
> The problem here is that as long as GDB did not process the fork event,
> it should pretend the fork child does not exist.  Ultimately, this event
> will be reported, we'll go through follow_fork, and that process will be
> detached.
> 
> The remote target (GDB-side), has some code to remove from the reported
> thread list the threads that are the result of forks not processed by
> GDB yet.  But that only works for fork events that have made their way
> to the remote target (GDB-side), but haven't been consumed by the core
> yet, so are still lingering as pending stop replies in the remote target
> (see remove_new_fork_children in remote.c).  But in our case, the fork
> event hasn't made its way to the GDB-side remote target.  We need to
> implement the same kind of logic GDBserver-side: if there exists a
> thread / inferior that is the result of a fork event GDBserver hasn't
> reported yet, it should exclude that thread / inferior from the reported
> thread list.
> 
> This was actually discussed a while ago, but not implemented AFAIK:
> 
>     https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
>     https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
> 
> Implementation details-wise, the fix for this is all in GDBserver.  The
> Linux layer of GDBserver already tracks unreported fork parent / child
> relationships using the lwp_info::fork_relative, in order to avoid
> wildcard actions resuming fork childs unknown to GDB.  Move this
> information to the common thread_info structure, so that it can be used
> in handle_qxfer_threads_worker to filter out unreported fork child
> threads.  Plus, that makes it usable for other OSes that use fork if
> need be.
> 
> I split the field in two (fork_parent / fork_child), I think it's
> clearer this way, easier to follow which thread is the parent and which
> is the child, and helps ensure things are consistent.  That simplifies
> things a bit in linux_set_resume_request.  Currently, when
> lwp->fork_relative is set, we have to deduce whether this is a parent or
> child using the pending event.  With separate fork_parent and fork_child
> fields, it becomes more obvious.  If the thread has a fork parent, then
> it means it's a fork child, and vice-versa.
> 
> Testing-wise, the test replicates pretty-much the sequence of events
> shown above.  The setup of the test makes it such that the main thread
> is about to fork.  We stepi the other thread, so that the step completes
> very quickly, in a single event.  Meanwhile, the main thread is resumed,
> so very likely has time to call fork.  This means that the bug may not
> reproduce every time (if the main thread does not have time to call
> fork), but it will reproduce more often than not.  The test fails
> without the fix applied on the native-gdbserver and
> native-extended-gdbserver boards.
> 
> At some point I suspected that which thread called fork and which thread
> did the step influenced the order in which the events were reported, and
> therefore the reproducibility of the bug.  So I made the test try  both
> combinations: main thread forks while other thread steps, and vice
> versa.  I'm not sure this is still necessary, but I left it there
> anyway.  It doesn't hurt to test a few more combinations.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
> Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
> ---
>  .../gdb.threads/pending-fork-event.c          | 82 ++++++++++++++++++
>  .../gdb.threads/pending-fork-event.exp        | 86 +++++++++++++++++++
>  gdbserver/gdbthread.h                         |  9 ++
>  gdbserver/linux-low.cc                        | 36 ++++----
>  gdbserver/linux-low.h                         |  6 --
>  gdbserver/server.cc                           |  6 ++
>  6 files changed, 202 insertions(+), 23 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.threads/pending-fork-event.c
>  create mode 100644 gdb/testsuite/gdb.threads/pending-fork-event.exp
> 
> diff --git a/gdb/testsuite/gdb.threads/pending-fork-event.c b/gdb/testsuite/gdb.threads/pending-fork-event.c
> new file mode 100644
> index 000000000000..a39ca75a49ac
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/pending-fork-event.c
> @@ -0,0 +1,82 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2021 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <pthread.h>
> +#include <unistd.h>
> +#include <assert.h>
> +
> +static volatile int release_forking_thread = 0;
> +static int x;
> +static pthread_barrier_t barrier;
> +
> +static void
> +break_here (void)
> +{
> +  x++;
> +}
> +
> +static void
> +do_fork (void)
> +{
> +  pthread_barrier_wait (&barrier);
> +
> +  while (!release_forking_thread);
> +
> +  if (FORK_FUNCTION () == 0)
> +    _exit (0);
> +
> +}
> +
> +static void *
> +thread_func (void *p)
> +{
> +#if defined(MAIN_THREAD_FORKS)
> +  break_here ();
> +#elif defined(OTHER_THREAD_FORKS)
> +  do_fork ();
> +#else
> +# error "MAIN_THREAD_FORKS or OTHER_THREAD_FORKS must be defined"
> +#endif
> +}
> +
> +
> +int
> +main (void)
> +{
> +  pthread_t thread;
> +  int ret;
> +
> +  pthread_barrier_init (&barrier, NULL, 2);
> +
> +  alarm (30);
> +  ret = pthread_create (&thread, NULL, thread_func, NULL);
> +  assert (ret == 0);
> +
> +  pthread_barrier_wait (&barrier);
> +
> +#if defined(MAIN_THREAD_FORKS)
> +  do_fork ();
> +#elif defined(OTHER_THREAD_FORKS)
> +  break_here ();
> +#else
> +# error "MAIN_THREAD_FORKS or OTHER_THREAD_FORKS must be defined"
> +#endif
> +
> +  pthread_join (thread, NULL);
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.threads/pending-fork-event.exp b/gdb/testsuite/gdb.threads/pending-fork-event.exp
> new file mode 100644
> index 000000000000..6574b9abf5b7
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/pending-fork-event.exp
> @@ -0,0 +1,86 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Test that we handle well, in all-stop, a fork happening in a thread B while
> +# doing a step-like command in a thread A.
> +#
> +# The scenario we want to test for is:
> +#
> +#   - the step of thread A completes, we handle the event and stop all threads
> +#     as a result
> +#   - while stopping all threads, thread B reports a fork event which stays
> +#     pending
> +#
> +# A buggy GDB / GDBserver combo would notice the thread of the child process
> +# of the (still unprocessed) fork event, and erroneously create a new inferior
> +# for it.  Once fixed, the child process' thread is hidden by whoever holds the
> +# pending fork event.
> +
> +standard_testfile
> +
> +proc do_test { target-non-stop who_forks fork_function } {
> +
> +    set opts [list debug "additional_flags=-DFORK_FUNCTION=$fork_function"]
> +
> +    # WHO_FORKS says which of the main or other thread calls (v)fork.  The
> +    # thread that does not call (v)fork is the one who tries to step.
> +    if { $who_forks == "main" } {
> +	lappend opts "additional_flags=-DMAIN_THREAD_FORKS"
> +	set this_binfile ${::binfile}-main-${fork_function}
> +    } elseif { $who_forks == "other" } {
> +	lappend opts "additional_flags=-DOTHER_THREAD_FORKS"
> +	set this_binfile ${::binfile}-other-${fork_function}
> +    } else {
> +	error "invalid who_forks value: $who_forks"
> +    }
> +
> +    if { [gdb_compile_pthreads "$::srcdir/$::subdir/$::srcfile" $this_binfile executable $opts] != "" } {
> +	return
> +    }
> +
> +    save_vars { ::GDBFLAGS } {
> +	append ::GDBFLAGS " -ex \"maintenance set target-non-stop ${target-non-stop}\""
> +	clean_restart $this_binfile
> +    }
> +
> +    if {![runto_main]} {
> +	fail "could not run to main"
> +	return
> +    }
> +
> +    # Run until breakpoint in the second thread.
> +    gdb_test "break break_here" "Breakpoint $::decimal.*"
> +    gdb_continue_to_breakpoint "thread started"
> +
> +    # Delete the breakpoint so the thread doesn't do a step-over.
> +    delete_breakpoints
> +
> +    # Let the forking thread make progress during the step.
> +    gdb_test "p release_forking_thread = 1" " = 1"
> +
> +    # stepi the non-forking thread.
> +    gdb_test "stepi"
> +
> +    # Make sure there's still a single inferior.
> +    gdb_test "info inferior" {\* 1 [^\r\n]+}
> +}
> +
> +foreach_with_prefix target-non-stop { auto on off } {
> +    foreach_with_prefix who_forks { main other } {
> +	foreach_with_prefix fork_function { fork vfork } {
> +	    do_test ${target-non-stop} $who_forks $fork_function
> +	}
> +    }
> +}
> diff --git a/gdbserver/gdbthread.h b/gdbserver/gdbthread.h
> index 7c293b1f89d2..9f9f69eb1ddd 100644
> --- a/gdbserver/gdbthread.h
> +++ b/gdbserver/gdbthread.h
> @@ -80,6 +80,15 @@ struct thread_info
>  
>    /* Branch trace target information for this thread.  */
>    struct btrace_target_info *btrace = nullptr;
> +
> +  /* A pointer to this thread's fork child or fork parent.  Valid only while
> +     the parent fork or vfork event is not reported to GDB.
> +
> +     Used to avoid wildcard vCont actions resuming a (v)fork child before GDB is
> +     notified about the parent's (v)fork event.  Also used to avoid including the
> +     (v)fork child in thread list packet responses (e.g. qfThreadInfo).  */
> +  thread_info *fork_child = nullptr;
> +  thread_info *fork_parent = nullptr;
>  };
>  
>  extern std::list<thread_info *> all_threads;
> diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc
> index 34ede238d19f..ff001475a9f0 100644
> --- a/gdbserver/linux-low.cc
> +++ b/gdbserver/linux-low.cc
> @@ -564,10 +564,10 @@ linux_process_target::handle_extended_wait (lwp_info **orig_event_lwp,
>  	  event_lwp->status_pending_p = 1;
>  	  event_lwp->status_pending = wstat;
>  
> -	  /* Link the threads until the parent event is passed on to
> -	     higher layers.  */
> -	  event_lwp->fork_relative = child_lwp;
> -	  child_lwp->fork_relative = event_lwp;
> +	  /* Link the threads until the parent's event is passed on to
> +	     GDB.  */
> +	  event_lwp->thread->fork_child = child_lwp->thread;
> +	  child_lwp->thread->fork_parent = event_lwp->thread;
>  
>  	  /* If the parent thread is doing step-over with single-step
>  	     breakpoints, the list of single-step breakpoints are cloned
> @@ -3594,8 +3594,10 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus,
>        if (event_child->waitstatus.kind () == TARGET_WAITKIND_FORKED
>  	  || event_child->waitstatus.kind () == TARGET_WAITKIND_VFORKED)
>  	{
> -	  event_child->fork_relative->fork_relative = NULL;
> -	  event_child->fork_relative = NULL;
> +	  gdb_assert (event_child->thread->fork_child != nullptr);
> +	  gdb_assert (event_child->thread->fork_child->fork_parent != nullptr);
> +	  event_child->thread->fork_child->fork_parent = nullptr;
> +	  event_child->thread->fork_child = nullptr;
>  	}
>  
>        *ourstatus = event_child->waitstatus;
> @@ -4375,19 +4377,19 @@ linux_set_resume_request (thread_info *thread, thread_resume *resume, size_t n)
>  
>  	  /* Don't let wildcard resumes resume fork children that GDB
>  	     does not yet know are new fork children.  */
> -	  if (lwp->fork_relative != NULL)
> +	  if (lwp->thread->fork_parent != nullptr)
>  	    {
> -	      struct lwp_info *rel = lwp->fork_relative;
> +	      lwp_info *parent = get_thread_lwp (lwp->thread->fork_parent);
> +	      target_waitkind kind = parent->waitstatus.kind ();
>  
> -	      if (rel->status_pending_p
> -		  && (rel->waitstatus.kind () == TARGET_WAITKIND_FORKED
> -		      || rel->waitstatus.kind () == TARGET_WAITKIND_VFORKED))
> -		{
> -		  if (debug_threads)
> -		    debug_printf ("not resuming LWP %ld: has queued stop reply\n",
> -				  lwpid_of (thread));
> -		  continue;
> -		}
> +	      gdb_assert (parent->status_pending_p);
> +	      gdb_assert (kind == TARGET_WAITKIND_FORKED
> +			  || kind == TARGET_WAITKIND_VFORKED);
> +
> +	      if (debug_threads)
> +		debug_printf ("not resuming LWP %ld: is a fork child not know to GDB\n",
> +			      lwpid_of (thread));
> +	      continue;
>  	    }
>  
>  	  /* If the thread has a pending event that has already been
> diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h
> index 05067ffc6e6f..46aff0a29524 100644
> --- a/gdbserver/linux-low.h
> +++ b/gdbserver/linux-low.h
> @@ -756,12 +756,6 @@ struct lwp_info
>       information or exit status until it can be reported to GDB.  */
>    struct target_waitstatus waitstatus;
>  
> -  /* A pointer to the fork child/parent relative.  Valid only while
> -     the parent fork event is not reported to higher layers.  Used to
> -     avoid wildcard vCont actions resuming a fork child before GDB is
> -     notified about the parent's fork event.  */
> -  struct lwp_info *fork_relative = nullptr;
> -
>    /* When stopped is set, this is where the lwp last stopped, with
>       decr_pc_after_break already accounted for.  If the LWP is
>       running, this is the address at which the lwp was resumed.  */
> diff --git a/gdbserver/server.cc b/gdbserver/server.cc
> index c58f7e01d8da..9890d6cc9798 100644
> --- a/gdbserver/server.cc
> +++ b/gdbserver/server.cc
> @@ -1656,6 +1656,12 @@ handle_qxfer_threads_worker (thread_info *thread, struct buffer *buffer)
>    gdb_byte *handle;
>    bool handle_status = target_thread_handle (ptid, &handle, &handle_len);
>  
> +  /* If this is a fork or vfork child (has a fork parent), GDB does not yet
> +     know about this process, and must not know about it until it gets the
> +     corresponding (v)fork event.  Exclude this thread from the list.  */
> +  if (thread->fork_parent != nullptr)
> +    return;
> +
>    write_ptid (ptid_s, ptid);
>  
>    buffer_xml_printf (buffer, "<thread id=\"%s\"", ptid_s);
> -- 
> 2.33.1
> 


  parent reply	other threads:[~2021-11-12 20:54 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-29 20:33 [PATCH 1/2] gdbserver: hide fork child threads from GDB Simon Marchi
2021-10-29 20:33 ` [PATCH 2/2] gdb, gdbserver: detach fork child when detaching from fork parent Simon Marchi
2021-11-12 20:54 ` Simon Marchi [this message]
2021-11-24 20:04 ` [PATCH 0/3] Fix handling of pending fork events Simon Marchi
2021-11-24 20:04   ` [PATCH 1/3] gdbserver: hide fork child threads from GDB Simon Marchi
2021-11-26 22:51     ` Pedro Alves
2021-11-29 12:46       ` Simon Marchi
2021-11-29 15:37         ` Pedro Alves
2021-11-24 20:04   ` [PATCH 2/3] gdb/linux-nat: factor ptrace-detach code to new detach_one_pid function Simon Marchi
2021-11-26 22:51     ` Pedro Alves
2021-11-24 20:04   ` [PATCH 3/3] gdb, gdbserver: detach fork child when detaching from fork parent Simon Marchi
2021-11-26 22:54     ` Pedro Alves
2021-11-29 18:27       ` Simon Marchi
2021-11-24 20:56   ` [PATCH 0/3] Fix handling of pending fork events Simon Marchi
2021-11-26 22:50   ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35710f09-2033-9ebf-d545-1f210166c235@polymtl.ca \
    --to=simon.marchi@polymtl.ca \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).