From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 2E0B23858C62 for ; Thu, 8 Jun 2023 15:49:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2E0B23858C62 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1686239378; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=2E0fFzL741Mq9Vw/3vc3PZjZ+YjGSMSNJ2yut+r5u6I=; b=fMQREXZyYo14rq0v3ESwWUpVvEjg0J14CXQqUYsuajkKlOiVwm6fuF54e7CLb3mI8oW1rX 6dJjbJrxjU5bBiyOx0YNbWn6/W/s9v+sff875LlykiIBCp6BAmTBtZmQuCXiBHvAJ9QlLf hmR/jl2JYZ+qq/wrkO7zcVs3Q3Hmgu0= Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-219-9DNhTWeBMNaQ7RWcKyjOKA-1; Thu, 08 Jun 2023 11:49:37 -0400 X-MC-Unique: 9DNhTWeBMNaQ7RWcKyjOKA-1 Received: by mail-wr1-f70.google.com with SMTP id ffacd0b85a97d-30793c16c78so1067822f8f.3 for ; Thu, 08 Jun 2023 08:49:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686239376; x=1688831376; h=mime-version:message-id:date:references:in-reply-to:subject:to:from :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=2E0fFzL741Mq9Vw/3vc3PZjZ+YjGSMSNJ2yut+r5u6I=; b=eQXLVElhDh4LP6QAIdeES3z2uIVEJkaIRqCAZOlIqZPmVBNwey/cpVLSUSPnbpQcQY uitmQ2NmpviVTgoRZ1KybtTwIjrbbIfrC2X9wZh5qVx2OJYO2aW4KraxqaAle13+odEL IjhrcsIGFdBvMAtdTiib+Nce3twBMZx7iYn5A6Lmw+Zqw74S50BlX2qaEBp2nsrZMnbE cs/AvMC6dD3EUreVeGxdGK1hRzjOLQgbb67olMTkYeehZDtmJzC8ivWdSY/SGJzI6ziR XYAUWSsushTOZUyFwsnUswefmDx6ShlGGByS4+evIBNoOPyxm9MgLAiA3I1lcDq5dbQx JyCA== X-Gm-Message-State: AC+VfDyeIG/yXdlRN6qpu2Lh4nYXKUwzDbMOnoDgNb+OFdJbzUwU/1FQ Wpury3zdkGOKDp4QjQzwejTckThW555Cb/R1U+TTSOIJzL58lIrRyZWfA/MbrUpkHueCSqsmp6q J37gyb/RvwHzi6YSRKPH9JbsGFgxWrA== X-Received: by 2002:adf:cd82:0:b0:30a:ea8a:7a6d with SMTP id q2-20020adfcd82000000b0030aea8a7a6dmr8115607wrj.16.1686239376249; Thu, 08 Jun 2023 08:49:36 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4LdyPP1AZbl9YCuwm6H88vx+2wd5PEHJGA3idaoXjyQNvz484E8Iz37LU4jqRK2aDpaXyhww== X-Received: by 2002:adf:cd82:0:b0:30a:ea8a:7a6d with SMTP id q2-20020adfcd82000000b0030aea8a7a6dmr8115593wrj.16.1686239375915; Thu, 08 Jun 2023 08:49:35 -0700 (PDT) Received: from localhost (11.72.115.87.dyn.plus.net. [87.115.72.11]) by smtp.gmail.com with ESMTPSA id y17-20020adfee11000000b0030af05fce4dsm1957589wrn.77.2023.06.08.08.49.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jun 2023 08:49:35 -0700 (PDT) From: Andrew Burgess To: Pedro Alves , gdb-patches@sourceware.org Subject: Re: [PATCH 21/31] stop_all_threads: (re-)enable async before waiting for stops In-Reply-To: <20221212203101.1034916-22-pedro@palves.net> References: <20221212203101.1034916-1-pedro@palves.net> <20221212203101.1034916-22-pedro@palves.net> Date: Thu, 08 Jun 2023 16:49:34 +0100 Message-ID: <87a5xaymwh.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Pedro Alves writes: > Running the > gdb.threads/step-over-thread-exit-while-stop-all-threads.exp testcase > added later in the series against gdbserver, after the > TARGET_WAITKIND_NO_RESUMED fix from the following patch, would run > into an infinite loop in stop_all_threads, leading to a timeout: > > FAIL: gdb.threads/step-over-thread-exit-while-stop-all-threads.exp: displaced-stepping=off: target-non-stop=on: iter 0: continue (timeout) > > The is really a latent bug, and it is about the fact that > stop_all_threads stops listening to events from a target as soon as it > sees a TARGET_WAITKIND_NO_RESUMED, ignoring that > TARGET_WAITKIND_NO_RESUMED may be delayed. handle_no_resumed knows > how to handle delayed no-resumed events, but stop_all_threads was > never taught to. > > In more detail, here's what happens with that testcase: > > #1 - Multiple threads report breakpoint hits to gdb. > > #2 - gdb picks one events, and it's for thread 1. All other stops are > left pending. thread 1 needs to move past a breakpoint, so gdb > stops all threads to start an inline step over for thread 1. > While stopping threads, some of the threads that were still > running report events that are also left pending. > > #2 - gdb steps thread 1 > > #3 - Thread 1 exits while stepping (it steps over an exit syscall), > gdbserver reports thread exit for thread 1 > > #4 - Thread 1 was the last resumed thread, so gdbserver also reports > no-resumed: > > [remote] Notification received: Stop:w0;p3445d0.3445d3 > [remote] Sending packet: $vStopped#55 > [remote] Packet received: N > [remote] Sending packet: $vStopped#55 > [remote] Packet received: OK > > #5 - gdb processes the thread exit for thread 1, finishes the step > over and restarts threads. > > #6 - gdb picks the next event to process out of one of the resumed > threads with pending events: > > [infrun] random_resumed_with_pending_wait_status: Found 32 events, selecting #11 > > #7 - This is again a breakpoint hit and the breakpoint needs to be > stepped over too, so gdb starts a step-over dance again. > > #8 - We reach stop_all_threads, which finds that some threads need to > be stopped. > > #9 - wait_one finally consumes the no-resumed event queue by #4. > Seeing this, wait_one disable target async, to stop listening for > events out of the remote target. > > #10 - We still haven't seen all the stops expected, so > stop_all_threads tries another iteration. > > #11 - Because the remote target is no longer async, and there are no > other targets, wait_one return no-resumed immediately without > polling the remote target. > > #12 - We still haven't seen all the stops expected, so > stop_all_threads tries another iteration. goto #11, looping > forever. > > Fix this by explicitly enabling/re-enabling target async on targets > that can async, before waiting for stops. > > Change-Id: Ie3ffb0df89635585a6631aa842689cecc989e33f > --- > gdb/infrun.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 81 insertions(+) > > diff --git a/gdb/infrun.c b/gdb/infrun.c > index 2866962d2dc..31321d758da 100644 > --- a/gdb/infrun.c > +++ b/gdb/infrun.c > @@ -5011,6 +5011,8 @@ wait_one () > if (nfds == 0) > { > /* No waitable targets left. All must be stopped. */ > + infrun_debug_printf ("no waitable targets left"); > + > target_waitstatus ws; > ws.set_no_resumed (); > return {nullptr, minus_one_ptid, std::move (ws)}; > @@ -5269,6 +5271,83 @@ handle_one (const wait_one_event &event) > return false; > } > > +/* Helper for stop_all_threads. wait_one waits for events until it > + sees a TARGET_WAITKIND_NO_RESUMED event. When it sees one, it > + disables target_async for the target to stop waiting for events > + from it. TARGET_WAITKIND_NO_RESUMED can be delayed though, > + consider, debugging against gdbserver: > + > + #1 - Threads 1-5 are running, and thread 1 hits a breakpoint. > + > + #2 - gdb processes the breakpoint hit for thread 1, stops all > + threads, and steps thread 1 over the breakpoint. while > + stopping threads, some other threads reported interesting > + events, which were left pending in the thread's objects > + (infrun's queue). > + > + #2 - Thread 1 exits (it stepped an exit syscall), and gdbserver > + reports the thread exit for thread 1. The event ends up in > + remote's stop reply queue. > + > + #3 - That was the last resumed thread, so gdbserver reports > + no-resumed, and that event also ends up in remote's stop > + reply queue, queued after the thread exit from #2. > + > + #4 - gdb processes the thread exit event, which finishes the > + step-over, and so gdb restarts all threads (threads with > + pending events are left marked resumed, but aren't set > + executing). The no-resumed event is still left pending in > + the remote stop reply queue. > + > + #5 - Since there are now resumed threads with pending breakpoint > + hits, gdb picks one at random to process next. > + > + #5 - gdb picks the breakpoint hit for thread 2 this time, and that > + breakpoint also needs to be stepped over, so gdb stops all > + threads again. > + > + #6 - stop_all_threads counts number of expected stops and calls > + wait_one once for each. > + > + #7 - The first wait_one call collects the no-resumed event from #3 > + above. > + > + #9 - Seeing the no-resumed event, wait_one disables target async > + for the remote target, to stop waiting for events from it. > + wait_one from here on always return no-resumed directly > + without reaching the target. > + > + #10 - stop_all_threads still hasn't seen all the stops it expects, > + so it does another pass. > + > + #11 - Since the remote target is not async (disabled in #9), > + wait_one doesn't wait on it, so it won't see the expected > + stops, and instead returns no-resumed directly. > + > + #12 - stop_all_threads still haven't seen all the stops, so it > + does another pass. goto #b, looping forever. s/#b/#11/ Otherwise, LGTM. Reviewed-By: Andrew Burgess Thanks, Andrew > + > + To handle this, we explicitly (re-)enable target async on all > + targets that can async every time stop_all_threads goes wait for > + the expected stops. */ > + > +static void > +reenable_target_async () > +{ > + for (inferior *inf : all_inferiors ()) > + { > + process_stratum_target *target = inf->process_target (); > + if (target != nullptr > + && target->threads_executing > + && target->can_async_p () > + && !target->is_async_p ()) > + { > + switch_to_inferior_no_thread (inf); > + target_async (1); > + } > + } > +} > + > /* See infrun.h. */ > > void > @@ -5395,6 +5474,8 @@ stop_all_threads (const char *reason, inferior *inf) > if (pass > 0) > pass = -1; > > + reenable_target_async (); > + > for (int i = 0; i < waits_needed; i++) > { > wait_one_event event = wait_one (); > -- > 2.36.0