From: Simon Marchi <simark@simark.ca>
To: John Baldwin <jhb@FreeBSD.org>, gdb-patches@sourceware.org
Subject: Re: [PATCH 6/9] fbsd-nat: Fix resuming and waiting with multiple processes.
Date: Mon, 20 Mar 2023 15:55:56 -0400 [thread overview]
Message-ID: <8fb811a5-f363-d3aa-5b63-4fcc434b3e17@simark.ca> (raw)
In-Reply-To: <20230228181845.99936-7-jhb@FreeBSD.org>
On 2/28/23 13:18, John Baldwin wrote:
> I did not fully understand the requirements of multiple process
> support when I enabled it previously and several parts were broken.
> In particular, the resume method was only resuming a single process,
> and wait was not stopping other processes when reporting an event.
>
> To support multiple running inferiors, add a new per-inferior
> structure which trackes the number of existing and running LWPs for
> each process. The structure also stores a ptid_t describing the
> set of LWPs currently resumed for each process.
Ah, that sounds good, related to my comments on previous patches about
tracking the resume state for each LWP. I don't know if it needs to be
per-inferior though, versus a global table like Linux does (but perhaps
it helps, I'll see when reading the code).
>
> For the resume method, iterate over all non-exited inferiors resuming
> each process matching the passed in ptid rather than only resuming the
> current inferior's process for a wildcard ptid. If a resumed process
> has a pending event, don't actually resume the process, but other
> matching processes without a pending event are still resumed in case
> the later call to the wait method requests an event from one of the
> processes without a pending event.
>
> For the wait method, stop other running processes before returning an
> event to the core. When stopping a process, first check to see if an
> event is already pending. If it is, queue the event to be reported
> later. If not, send a SIGSTOP to the process and wait for it to stop.
> If the event reported by the wait is not for the SIGSTOP, queue the
> event and remember to ignore a future SIGSTOP event for the process.
>
> Note that, unlike the Linux native target, entire processes are
> stopped rather than individual LWPs. In FreeBSD one can only wait on
> processes (via pid), not for an event from a specific thread.
>
> Other changes in this commit handle bookkeeping for the per-inferior
> data such as purging the data in the mourn_inferior method and
> migrating the data to the new inferior in the follow_exec method. The
> per-inferior data is created in the attach, create_inferior, and
> follow_fork methods.
> ---
> gdb/fbsd-nat.c | 403 +++++++++++++++++++++++++++++++++++++------------
> gdb/fbsd-nat.h | 8 +
> 2 files changed, 317 insertions(+), 94 deletions(-)
>
> diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c
> index 3f7278c6ea0..14b31ddd86e 100644
> --- a/gdb/fbsd-nat.c
> +++ b/gdb/fbsd-nat.c
> @@ -54,11 +54,26 @@
> #define PT_SETREGSET 43 /* Set a target register set */
> #endif
>
> -/* Filter for ptid's allowed to report events from wait. Normally set
> - in resume, but also reset to minus_one_ptid in create_inferior and
> - attach. */
> +/* Information stored about each inferior. */
>
> -static ptid_t resume_ptid;
> +struct fbsd_inferior_info
> +{
> + /* Filter for resumed LWPs which can report events from wait. */
> + ptid_t resumed_lwps = null_ptid;
> +
> + /* Number of LWPs this process contains. */
> + unsigned int num_lwps = 0;
> +
> + /* Number of LWPs currently running. */
> + unsigned int running_lwps = 0;
> +
> + /* Have a pending SIGSTOP event that needs to be discarded. */
> + bool pending_sigstop = false;
> +};
Ok, it's not exactly what I expected, but I will keep on reading.
Long term, I don't think the resumed_lwps field will be enough to
describe the resume state of individual threads. Actually, to
complement the example I gave on an earlier patch, I guess you could do
that today?
(gdb) set scheduler-locking on
(gdb) thread 1
(gdb) continue & # only supposed to resume thread 1
(gdb) thread 2
(gdb) continue & # only supposed to resume thread 2
Here, resume_lwps would end up as (pid, thread_2_lwp, 0), right?
> +
> +/* Per-inferior data key. */
> +
> +static const registry<inferior>::key<fbsd_inferior_info> fbsd_inferior_data;
>
> /* If an event is triggered asynchronously (fake vfork_done events) or
> occurs when the core is not expecting it, a pending event is
> @@ -95,21 +110,27 @@ have_pending_event (ptid_t filter)
> return false;
> }
>
> -/* Helper method called by the target wait method. Returns true if
> - there is a pending event matching resume_ptid. If there is a
> - matching event, PTID and *STATUS contain the event details, and the
> - event is removed from the pending list. */
> +/* Returns true if there is a pending event for a resumed process
> + matching FILTER. If there is a matching event, PTID and *STATUS
> + contain the event details, and the event is removed from the
> + pending list. */
>
> static bool
> -take_pending_event (ptid_t &ptid, target_waitstatus *status)
> +take_pending_event (fbsd_nat_target *target, ptid_t filter, ptid_t &ptid,
> + target_waitstatus *status)
> {
> for (auto it = pending_events.begin (); it != pending_events.end (); it++)
> - if (it->ptid.matches (resume_ptid))
> + if (it->ptid.matches (filter))
> {
> - ptid = it->ptid;
> - *status = it->status;
> - pending_events.erase (it);
> - return true;
> + inferior *inf = find_inferior_ptid (target, it->ptid);
> + fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> + if (it->ptid.matches (info->resumed_lwps))
> + {
> + ptid = it->ptid;
> + *status = it->status;
> + pending_events.erase (it);
> + return true;
> + }
If that code was kept as-is, I think take_pending_event should be a
method of fbsd_nat_target, rather than passing it manually.
> }
> return false;
> }
> @@ -799,6 +820,8 @@ show_fbsd_nat_debug (struct ui_file *file, int from_tty,
> #define fbsd_nat_debug_printf(fmt, ...) \
> debug_prefixed_printf_cond (debug_fbsd_nat, "fbsd-nat", fmt, ##__VA_ARGS__)
>
> +#define fbsd_nat_debug_start_end(fmt, ...) \
> + scoped_debug_start_end (debug_fbsd_nat, "fbsd-nat", fmt, ##__VA_ARGS__)
>
> /*
> FreeBSD's first thread support was via a "reentrant" version of libc
> @@ -956,6 +979,9 @@ fbsd_add_threads (fbsd_nat_target *target, pid_t pid)
> if (nlwps == -1)
> perror_with_name (("ptrace (PT_GETLWPLIST)"));
>
> + inferior *inf = find_inferior_ptid (target, ptid_t (pid));
> + fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> + gdb_assert (info != nullptr);
> for (i = 0; i < nlwps; i++)
> {
> ptid_t ptid = ptid_t (pid, lwps[i]);
> @@ -974,8 +1000,14 @@ fbsd_add_threads (fbsd_nat_target *target, pid_t pid)
> #endif
> fbsd_lwp_debug_printf ("adding thread for LWP %u", lwps[i]);
> add_thread (target, ptid);
> +#ifdef PT_LWP_EVENTS
> + info->num_lwps++;
> +#endif
> }
> }
> +#ifndef PT_LWP_EVENTS
> + info->num_lwps = nlwps;
> +#endif
> }
>
> /* Implement the "update_thread_list" target_ops method. */
> @@ -1138,92 +1170,117 @@ fbsd_add_vfork_done (ptid_t pid)
> #endif
> #endif
>
> -/* Implement the "resume" target_ops method. */
> +/* Resume a single process. */
>
> void
> -fbsd_nat_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
> +fbsd_nat_target::resume_one_process (ptid_t ptid, int step,
> + enum gdb_signal signo)
> {
> fbsd_nat_debug_printf ("[%s], step %d, signo %d (%s)",
> target_pid_to_str (ptid).c_str (), step, signo,
> gdb_signal_to_name (signo));
>
> + inferior *inf = find_inferior_ptid (this, ptid);
> + fbsd_inferior_info *info = fbsd_inferior_data.get (inf);
> + info->resumed_lwps = ptid;
> + gdb_assert (info->running_lwps == 0);
> +
> /* Don't PT_CONTINUE a thread or process which has a pending event. */
> - resume_ptid = ptid;
> if (have_pending_event (ptid))
> {
> fbsd_nat_debug_printf ("found pending event");
> return;
> }
>
> - if (ptid.lwp_p ())
> + for (thread_info *tp : inf->non_exited_threads ())
> {
> - /* If ptid is a specific LWP, suspend all other LWPs in the process. */
> - inferior *inf = find_inferior_ptid (this, ptid);
> + int request;
>
> - for (thread_info *tp : inf->non_exited_threads ())
> - {
> - int request;
> -
> - if (tp->ptid.lwp () == ptid.lwp ())
> - request = PT_RESUME;
> - else
> - request = PT_SUSPEND;
> -
> - if (ptrace (request, tp->ptid.lwp (), NULL, 0) == -1)
> - perror_with_name (request == PT_RESUME ?
> - ("ptrace (PT_RESUME)") :
> - ("ptrace (PT_SUSPEND)"));
> - if (request == PT_RESUME)
> - low_prepare_to_resume (tp);
> - }
> - }
> - else
> - {
> - /* If ptid is a wildcard, resume all matching threads (they won't run
> - until the process is continued however). */
> - for (thread_info *tp : all_non_exited_threads (this, ptid))
> + /* If ptid is a specific LWP, suspend all other LWPs in the
> + process, otherwise resume all LWPs in the process.. */
As mentioned in a previous message, I don't think this is how the resume
method is supposed to work.
> + if (!ptid.lwp_p() || tp->ptid.lwp () == ptid.lwp ())
> {
> if (ptrace (PT_RESUME, tp->ptid.lwp (), NULL, 0) == -1)
> perror_with_name (("ptrace (PT_RESUME)"));
> low_prepare_to_resume (tp);
> + info->running_lwps++;
> }
> + else
> + {
> + if (ptrace (PT_SUSPEND, tp->ptid.lwp (), NULL, 0) == -1)
> + perror_with_name (("ptrace (PT_SUSPEND)"));
> + }
> + }
> +
> + if (ptid.pid () != inferior_ptid.pid ())
> + {
> + step = 0;
> + signo = GDB_SIGNAL_0;
> + gdb_assert (!ptid.lwp_p ());
I don't get why you ignore the step request here. Perhaps it is due to
a misundertanding of the resume interface (which was really confusing
before Pedro's clarification)?
The comment on target_resume and the commit message on Pedro's change
explain it well, but essentially:
- The ptid passed as a parameter (SCOPE_PTID) is the set of threads to
resume. If you keep a list of LWPs, then you can just go over that
list and resume everything that ptid_t::matches SCOPE_PTID, and that
doesn't have a pending event.
- STEP indicates whether to resume INFERIOR_PTID in resume mode. If
STEP is 0, it means that no thread is resumed in step mode they are
all resumed normally.
- SIGNAL indicates what signal to resume INFERIOR_PTID with. If
it's GDB_SIGNAL_0, it means resume without a signal.
So, the last two bullets only modify how the thread identified by
INFERIOR_PTID is resumed. I think that in practice, it's guaranteed in
practice today that INFERIOR_PTID is "within" SCOPE_PTID. But you can
also write the code without that assumption, it should be much harder.
For threads that are not INFERIOR_PTID, I think the target should
resume them with the signal in thread_info::m_suspend::stop_signal. But
that can be a problem for another day.
I'll wait for clarifications from you before continuing to read, because
I am a bit lost with the approach taken here.
Simon
next prev parent reply other threads:[~2023-03-20 19:55 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-28 18:18 [PATCH 0/9] Fixes for multiprocess for FreeBSD's native target John Baldwin
2023-02-28 18:18 ` [PATCH 1/9] fbsd-nat: Add missing spaces John Baldwin
2023-03-20 17:43 ` Simon Marchi
2023-03-20 18:08 ` John Baldwin
2023-02-28 18:18 ` [PATCH 2/9] fbsd-nat: Avoid a direct write to target_waitstatus::kind John Baldwin
2023-03-20 17:45 ` Simon Marchi
2023-02-28 18:18 ` [PATCH 3/9] fbsd-nat: Use correct constant for target_waitstatus::sig John Baldwin
2023-03-20 17:47 ` Simon Marchi
2023-02-28 18:18 ` [PATCH 4/9] fbsd-nat: Add a list of pending events John Baldwin
2023-03-20 18:35 ` Simon Marchi
2023-03-27 20:24 ` John Baldwin
2023-03-27 20:57 ` Simon Marchi
2023-03-27 21:00 ` John Baldwin
2023-02-28 18:18 ` [PATCH 5/9] fbsd-nat: Defer any ineligible events reported by wait John Baldwin
2023-03-20 19:09 ` Simon Marchi
2023-03-27 20:49 ` John Baldwin
2023-03-27 21:04 ` Simon Marchi
2023-02-28 18:18 ` [PATCH 6/9] fbsd-nat: Fix resuming and waiting with multiple processes John Baldwin
2023-03-20 19:55 ` Simon Marchi [this message]
2023-03-27 21:13 ` John Baldwin
2023-02-28 18:18 ` [PATCH 7/9] fbsd-nat: Fix several issues with detaching John Baldwin
2023-02-28 18:18 ` [PATCH 8/9] fbsd-nat: Fix thread_alive against a running thread John Baldwin
2023-02-28 18:18 ` [PATCH 9/9] fbsd-nat: Stop a process if it is running before killing it John Baldwin
2023-03-14 18:21 ` [PING] [PATCH 0/9] Fixes for multiprocess for FreeBSD's native target John Baldwin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8fb811a5-f363-d3aa-5b63-4fcc434b3e17@simark.ca \
--to=simark@simark.ca \
--cc=gdb-patches@sourceware.org \
--cc=jhb@FreeBSD.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).