From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca (simark.ca [158.69.221.121]) by sourceware.org (Postfix) with ESMTPS id 204523858C74 for ; Mon, 20 Mar 2023 19:09:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 204523858C74 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca Received: from [172.16.0.146] (192-222-143-198.qc.cable.ebox.net [192.222.143.198]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id CED281E110; Mon, 20 Mar 2023 15:09:34 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail; t=1679339374; bh=Dads/m76BbnwDnu0l8Gq1KyFetLJkH1LxQczd3LAGZI=; h=Date:Subject:To:References:From:In-Reply-To:From; b=TnixJHY1x4tbtomapfOf0Kp8QPBYQw6O+1za68IOPlG5JxvKop3otuRIbUFg9Q9Po TgAFAO+HHIs37/YqBnJ6I6Oy9wH/GuygpCNjPV2PGMgdIMkVEcHcNPpUgFg1qxlwuy D/Vwjqdg34ihHw8In7JUX+U31VnghsaHgTay8fiY= Message-ID: Date: Mon, 20 Mar 2023 15:09:34 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Subject: Re: [PATCH 5/9] fbsd-nat: Defer any ineligible events reported by wait. Content-Language: fr To: John Baldwin , gdb-patches@sourceware.org References: <20230228181845.99936-1-jhb@FreeBSD.org> <20230228181845.99936-6-jhb@FreeBSD.org> From: Simon Marchi In-Reply-To: <20230228181845.99936-6-jhb@FreeBSD.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2/28/23 13:18, John Baldwin wrote: > If wait_1 finds an event for a thread or process that does not match > the set of threads and processes previously resumed, defer the event. > If the event is for a specific thread, suspend the thread and continue > the associated process before waiting for another event. I do not understand this, because it doesn't match my mental model of how things should work (which is derived from working with linux-nat). In that model, a thread with a pending event should never be ptrace-resumed. So the last sentence doesn't make sense, it implies that a thread has a pending and is ptrace-resumed (since we suspend it). > One specific example of such an event is if a thread is created while > another thread in the same process hits a breakpoint. If the second > thread's event is reported first, the target resume method does not > yet "know" about the new thread and will not suspend it via > PT_SUSPEND. I was surprised to read that the resume method suspends some threads. I don't think it should, but I do see the current code. The current code appears to interpret the resume request as: "ensure the state of thread resumption looks exactly like this", so it suspends threads as needed, to match the requested resumption state. I don't think it's supposed to work like that. The resume method should only resume some threads, and if the core of GDB wants some threads stopped, it will call the stop method. Anyhow, just to be sure I understand the problem you describe: when fbsd-nat reports the breakpoint hit from "another thread", is the "thread created" event still in the kernel, not consumed by GDB? > When wait is called, it will probably return the event > from the first thread before the result of the step from second > thread. This is the case reported in PR 21497. > --- > gdb/fbsd-nat.c | 34 +++++++++++++++++++++++++++++++++- > 1 file changed, 33 insertions(+), 1 deletion(-) > > diff --git a/gdb/fbsd-nat.c b/gdb/fbsd-nat.c > index ca278b871ef..3f7278c6ea0 100644 > --- a/gdb/fbsd-nat.c > +++ b/gdb/fbsd-nat.c > @@ -1514,7 +1514,39 @@ fbsd_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus, > if (is_async_p ()) > async_file_flush (); > > - wptid = wait_1 (ptid, ourstatus, target_options); > + while (1) > + { > + wptid = wait_1 (ptid, ourstatus, target_options); > + > + /* If no event was found, just return. */ > + if (ourstatus->kind () == TARGET_WAITKIND_IGNORE > + || ourstatus->kind () == TARGET_WAITKIND_NO_RESUMED) > + break; > + > + /* If an event is reported for a thread or process while > + stepping some other thread, suspend the thread reporting the > + event and defer the event until it can be reported to the > + core. */ > + if (!wptid.matches (resume_ptid)) > + { > + add_pending_event (wptid, *ourstatus); > + fbsd_nat_debug_printf ("deferring event [%s], [%s]", > + target_pid_to_str (wptid).c_str (), > + ourstatus->to_string ().c_str ()); > + if (wptid.pid () == resume_ptid.pid ()) > + { > + fbsd_nat_debug_printf ("suspending thread [%s]", > + target_pid_to_str (wptid).c_str ()); > + if (ptrace (PT_SUSPEND, wptid.lwp (), NULL, 0) == -1) > + perror_with_name (("ptrace (PT_SUSPEND)")); This is the part I don't understand. After wait_1 returned an event for a specific thread, I would expect this thread to be ptrace-stopped. So, I don't see the need to suspend it. You'd just leave it in the "resumed from the until. > + if (ptrace (PT_CONTINUE, wptid.pid (), (caddr_t) 1, 0) == -1) > + perror_with_name (("ptrace (PT_CONTINUE)")); I don't get why you continue `wptid.pid ()`. What assures you that this particular thread was stopped previously? Perhaps I don't understand because I assume somethings about pids/lwps and ptrace that are only true on Linux. Simon