public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Simon Marchi <simark@simark.ca>
To: John Baldwin <jhb@FreeBSD.org>, gdb-patches@sourceware.org
Subject: Re: [PATCH 6/9] fbsd-nat: Fix resuming and waiting with multiple processes.
Date: Mon, 20 Mar 2023 15:55:56 -0400	[thread overview]
Message-ID: <8fb811a5-f363-d3aa-5b63-4fcc434b3e17@simark.ca> (raw)
In-Reply-To: <20230228181845.99936-7-jhb@FreeBSD.org>

On 2/28/23 13:18, John Baldwin wrote:
> I did not fully understand the requirements of multiple process
> support when I enabled it previously and several parts were broken.
> In particular, the resume method was only resuming a single process,
> and wait was not stopping other processes when reporting an event.
> 
> To support multiple running inferiors, add a new per-inferior
> structure which trackes the number of existing and running LWPs for
> each process.  The structure also stores a ptid_t describing the
> set of LWPs currently resumed for each process.

Ah, that sounds good, related to my comments on previous patches about
tracking the resume state for each LWP.  I don't know if it needs to be
per-inferior though, versus a global table like Linux does (but perhaps
it helps, I'll see when reading the code).

> 
> For the resume method, iterate over all non-exited inferiors resuming
> each process matching the passed in ptid rather than only resuming the
> current inferior's process for a wildcard ptid.  If a resumed process
> has a pending event, don't actually resume the process, but other
> matching processes without a pending event are still resumed in case
> the later call to the wait method requests an event from one of the
> processes without a pending event.
> 
> For the wait method, stop other running processes before returning an
> event to the core.  When stopping a process, first check to see if an
> event is already pending.  If it is, queue the event to be reported
> later.  If not, send a SIGSTOP to the process and wait for it to stop.
> If the event reported by the wait is not for the SIGSTOP, queue the
> event and remember to ignore a future SIGSTOP event for the process.
> 
> Note that, unlike the Linux native target, entire processes are
> stopped rather than individual LWPs.  In FreeBSD one can only wait on
> processes (via pid), not for an event from a specific thread.
> 
> Other changes in this commit handle bookkeeping for the per-inferior
> data such as purging the data in the mourn_inferior method and
> migrating the data to the new inferior in the follow_exec method.  The
> per-inferior data is created in the attach, create_inferior, and
> follow_fork methods.
> ---
>  gdb/fbsd-nat.c | 403 +++++++++++++++++++++++++++++++++++++------------
>  gdb/fbsd-nat.h |   8 +
>  2 files changed, 317 insertions(+), 94 deletions(-)
> 
> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
> index 3f7278c6ea0..14b31ddd86e 100644
> --- a/gdb/fbsd-nat.c
> +++ b/gdb/fbsd-nat.c
> @@ -54,11 +54,26 @@
>  #define	PT_SETREGSET	43	/* Set a target register set */
>  #endif
>  
> -/* Filter for ptid's allowed to report events from wait.  Normally set
> -   in resume, but also reset to minus_one_ptid in create_inferior and
> -   attach.  */
> +/* Information stored about each inferior.  */
>  
> -static ptid_t resume_ptid;
> +struct fbsd_inferior_info
> +{
> +  /* Filter for resumed LWPs which can report events from wait.  */
> +  ptid_t resumed_lwps = null_ptid;
> +
> +  /* Number of LWPs this process contains.  */
> +  unsigned int num_lwps = 0;
> +
> +  /* Number of LWPs currently running.  */
> +  unsigned int running_lwps = 0;
> +
> +  /* Have a pending SIGSTOP event that needs to be discarded.  */
> +  bool pending_sigstop = false;
> +};

Ok, it's not exactly what I expected, but I will keep on reading.

Long term, I don't think the resumed_lwps field will be enough to
describe the resume state of individual threads.  Actually, to
complement the example I gave on an earlier patch, I guess you could do
that today?

(gdb) set scheduler-locking on
(gdb) thread 1
(gdb) continue & # only supposed to resume thread 1
(gdb) thread 2
(gdb) continue & # only supposed to resume thread 2

Here, resume_lwps would end up as (pid, thread_2_lwp, 0), right?


> +
> +/* Per-inferior data key.  */
> +
> +static const registry<inferior>::key<fbsd_inferior_info> fbsd_inferior_data;
>  
>  /* If an event is triggered asynchronously (fake vfork_done events) or
>     occurs when the core is not expecting it, a pending event is
> @@ -95,21 +110,27 @@ have_pending_event (ptid_t filter)
>    return false;
>  }
>  
> -/* Helper method called by the target wait method.  Returns true if
> -   there is a pending event matching resume_ptid.  If there is a
> -   matching event, PTID and *STATUS contain the event details, and the
> -   event is removed from the pending list.  */
> +/* Returns true if there is a pending event for a resumed process
> +   matching FILTER.  If there is a matching event, PTID and *STATUS
> +   contain the event details, and the event is removed from the
> +   pending list.  */
>  
>  static bool
> -take_pending_event (ptid_t &ptid, target_waitstatus *status)
> +take_pending_event (fbsd_nat_target *target, ptid_t filter, ptid_t &ptid,
> +		    target_waitstatus *status)
>  {
>    for (auto it = pending_events.begin (); it != pending_events.end (); it++)
> -    if (it->ptid.matches (resume_ptid))
> +    if (it->ptid.matches (filter))
>        {
> -	ptid = it->ptid;
> -	*status = it->status;
> -	pending_events.erase (it);
> -	return true;
> +	inferior *inf = find_inferior_ptid (target, it->ptid);
> +	fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> +	if (it->ptid.matches (info->resumed_lwps))
> +	  {
> +	    ptid = it->ptid;
> +	    *status = it->status;
> +	    pending_events.erase (it);
> +	    return true;
> +	  }

If that code was kept as-is, I think take_pending_event should be a
method of fbsd_nat_target, rather than passing it manually.

>        }
>    return false;
>  }
> @@ -799,6 +820,8 @@ show_fbsd_nat_debug (struct ui_file *file, int from_tty,
>  #define fbsd_nat_debug_printf(fmt, ...) \
>    debug_prefixed_printf_cond (debug_fbsd_nat, "fbsd-nat", fmt, ##__VA_ARGS__)
>  
> +#define fbsd_nat_debug_start_end(fmt, ...) \
> +  scoped_debug_start_end (debug_fbsd_nat, "fbsd-nat", fmt, ##__VA_ARGS__)
>  
>  /*
>    FreeBSD's first thread support was via a "reentrant" version of libc
> @@ -956,6 +979,9 @@ fbsd_add_threads (fbsd_nat_target *target, pid_t pid)
>    if (nlwps == -1)
>      perror_with_name (("ptrace (PT_GETLWPLIST)"));
>  
> +  inferior *inf = find_inferior_ptid (target, ptid_t (pid));
> +  fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> +  gdb_assert (info != nullptr);
>    for (i = 0; i < nlwps; i++)
>      {
>        ptid_t ptid = ptid_t (pid, lwps[i]);
> @@ -974,8 +1000,14 @@ fbsd_add_threads (fbsd_nat_target *target, pid_t pid)
>  #endif
>  	  fbsd_lwp_debug_printf ("adding thread for LWP %u", lwps[i]);
>  	  add_thread (target, ptid);
> +#ifdef PT_LWP_EVENTS
> +	  info->num_lwps++;
> +#endif
>  	}
>      }
> +#ifndef PT_LWP_EVENTS
> +  info->num_lwps = nlwps;
> +#endif
>  }
>  
>  /* Implement the "update_thread_list" target_ops method.  */
> @@ -1138,92 +1170,117 @@ fbsd_add_vfork_done (ptid_t pid)
>  #endif
>  #endif
>  
> -/* Implement the "resume" target_ops method.  */
> +/* Resume a single process.  */
>  
>  void
> -fbsd_nat_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
> +fbsd_nat_target::resume_one_process (ptid_t ptid, int step,
> +				     enum gdb_signal signo)
>  {
>    fbsd_nat_debug_printf ("[%s], step %d, signo %d (%s)",
>  			 target_pid_to_str (ptid).c_str (), step, signo,
>  			 gdb_signal_to_name (signo));
>  
> +  inferior *inf = find_inferior_ptid (this, ptid);
> +  fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> +  info->resumed_lwps = ptid;
> +  gdb_assert (info->running_lwps == 0);
> +
>    /* Don't PT_CONTINUE a thread or process which has a pending event.  */
> -  resume_ptid = ptid;
>    if (have_pending_event (ptid))
>      {
>        fbsd_nat_debug_printf ("found pending event");
>        return;
>      }
>  
> -  if (ptid.lwp_p ())
> +  for (thread_info *tp : inf->non_exited_threads ())
>      {
> -      /* If ptid is a specific LWP, suspend all other LWPs in the process.  */


> -      inferior *inf = find_inferior_ptid (this, ptid);
> +      int request;
>  
> -      for (thread_info *tp : inf->non_exited_threads ())
> -	{
> -	  int request;
> -
> -	  if (tp->ptid.lwp () == ptid.lwp ())
> -	    request = PT_RESUME;
> -	  else
> -	    request = PT_SUSPEND;
> -
> -	  if (ptrace (request, tp->ptid.lwp (), NULL, 0) == -1)
> -	    perror_with_name (request == PT_RESUME ?
> -			      ("ptrace (PT_RESUME)") :
> -			      ("ptrace (PT_SUSPEND)"));
> -	  if (request == PT_RESUME)
> -	    low_prepare_to_resume (tp);
> -	}
> -    }
> -  else
> -    {
> -      /* If ptid is a wildcard, resume all matching threads (they won't run
> -	 until the process is continued however).  */
> -      for (thread_info *tp : all_non_exited_threads (this, ptid))
> +      /* If ptid is a specific LWP, suspend all other LWPs in the
> +	 process, otherwise resume all LWPs in the process..  */

As mentioned in a previous message, I don't think this is how the resume
method is supposed to work.

> +      if (!ptid.lwp_p() || tp->ptid.lwp () == ptid.lwp ())
>  	{
>  	  if (ptrace (PT_RESUME, tp->ptid.lwp (), NULL, 0) == -1)
>  	    perror_with_name (("ptrace (PT_RESUME)"));
>  	  low_prepare_to_resume (tp);
> +	  info->running_lwps++;
>  	}
> +      else
> +	{
> +	  if (ptrace (PT_SUSPEND, tp->ptid.lwp (), NULL, 0) == -1)
> +	    perror_with_name (("ptrace (PT_SUSPEND)"));
> +	}
> +    }
> +
> +  if (ptid.pid () != inferior_ptid.pid ())
> +    {
> +      step = 0;
> +      signo = GDB_SIGNAL_0;
> +      gdb_assert (!ptid.lwp_p ());

I don't get why you ignore the step request here.  Perhaps it is due to
a misundertanding of the resume interface (which was really confusing
before Pedro's clarification)?

The comment on target_resume and the commit message on Pedro's change
explain it well, but essentially:

 - The ptid passed as a parameter (SCOPE_PTID) is the set of threads to
   resume.  If you keep a list of LWPs, then you can just go over that
   list and resume everything that ptid_t::matches SCOPE_PTID, and that
   doesn't have a pending event.
 - STEP indicates whether to resume INFERIOR_PTID in resume mode.  If
   STEP is 0, it means that no thread is resumed in step mode they are
   all resumed normally.
 - SIGNAL indicates what signal to resume INFERIOR_PTID with.  If
   it's GDB_SIGNAL_0, it means resume without a signal.

So, the last two bullets only modify how the thread identified by
INFERIOR_PTID is resumed.  I think that in practice, it's guaranteed in
practice today that INFERIOR_PTID is "within" SCOPE_PTID.  But you can
also write the code without that assumption, it should be much harder.

For threads that are not INFERIOR_PTID, I think the target should
resume them with the signal in thread_info::m_suspend::stop_signal.  But
that can be a problem for another day.

I'll wait for clarifications from you before continuing to read, because
I am a bit lost with the approach taken here.

Simon

  reply	other threads:[~2023-03-20 19:55 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-28 18:18 [PATCH 0/9] Fixes for multiprocess for FreeBSD's native target John Baldwin
2023-02-28 18:18 ` [PATCH 1/9] fbsd-nat: Add missing spaces John Baldwin
2023-03-20 17:43   ` Simon Marchi
2023-03-20 18:08     ` John Baldwin
2023-02-28 18:18 ` [PATCH 2/9] fbsd-nat: Avoid a direct write to target_waitstatus::kind John Baldwin
2023-03-20 17:45   ` Simon Marchi
2023-02-28 18:18 ` [PATCH 3/9] fbsd-nat: Use correct constant for target_waitstatus::sig John Baldwin
2023-03-20 17:47   ` Simon Marchi
2023-02-28 18:18 ` [PATCH 4/9] fbsd-nat: Add a list of pending events John Baldwin
2023-03-20 18:35   ` Simon Marchi
2023-03-27 20:24     ` John Baldwin
2023-03-27 20:57       ` Simon Marchi
2023-03-27 21:00       ` John Baldwin
2023-02-28 18:18 ` [PATCH 5/9] fbsd-nat: Defer any ineligible events reported by wait John Baldwin
2023-03-20 19:09   ` Simon Marchi
2023-03-27 20:49     ` John Baldwin
2023-03-27 21:04       ` Simon Marchi
2023-02-28 18:18 ` [PATCH 6/9] fbsd-nat: Fix resuming and waiting with multiple processes John Baldwin
2023-03-20 19:55   ` Simon Marchi [this message]
2023-03-27 21:13     ` John Baldwin
2023-02-28 18:18 ` [PATCH 7/9] fbsd-nat: Fix several issues with detaching John Baldwin
2023-02-28 18:18 ` [PATCH 8/9] fbsd-nat: Fix thread_alive against a running thread John Baldwin
2023-02-28 18:18 ` [PATCH 9/9] fbsd-nat: Stop a process if it is running before killing it John Baldwin
2023-03-14 18:21 ` [PING] [PATCH 0/9] Fixes for multiprocess for FreeBSD's native target John Baldwin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8fb811a5-f363-d3aa-5b63-4fcc434b3e17@simark.ca \
    --to=simark@simark.ca \
    --cc=gdb-patches@sourceware.org \
    --cc=jhb@FreeBSD.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).