From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 41F5A3858D35 for ; Fri, 9 Jun 2023 13:11:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 41F5A3858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1686316270; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Xdb7cY12b8BNURylxSMBFTVJVShqQzOlJyJvLCfR32Y=; b=O+VVqi+6VordCvsRBTIGJzkCq/jYFn5KJGoZ8uJgMKGutUs/SwmApYoD+KFYGW084FqWrp nYJjfFIH8NesFwMrMI54AmpQtshr+JemhA3TZf0vEw5FQraLI/k57G0coThRoNxRQdsKVd 9as8s7UzA2UpOf2B0KlxpARrjnlECxA= Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-479-TQufxizPP-SoH56x7-OqFw-1; Fri, 09 Jun 2023 09:11:09 -0400 X-MC-Unique: TQufxizPP-SoH56x7-OqFw-1 Received: by mail-wr1-f69.google.com with SMTP id ffacd0b85a97d-30e4d85e1ffso2370154f8f.0 for ; Fri, 09 Jun 2023 06:11:09 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686316268; x=1688908268; h=mime-version:message-id:date:references:in-reply-to:subject:to:from :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Xdb7cY12b8BNURylxSMBFTVJVShqQzOlJyJvLCfR32Y=; b=azTR28KsUnccPFeQtqzz48M+Z5JgJJXS9sjP6qtf4fsYkxJS2KNz8Wp7+n56e78z6e 0LehG7AClXgRSZB9dcWiT3bS3gkAHn/MRzWtWX38W9FtjSg3hvgw+va8vbi9csH3rIkS EVuTf479TOio9Z4x+kkC7TiGVXDBOUOQ5WsOeRgDFFFs1L3QBcWVgQfVw7gRMkiDm4Kn 3KMG6bpYDBiV8VoFum0x4ClSX/kxInBJ7Drzf8szNorvlMWkyqj5n+Ob+DwCUhkTOxQo elX/8dAXWkRUFtZ3lBi7789wvErgmdiQY2V0Pwv5KKxC2ulxrH1PMnrzXp9ALhi+LIMb egog== X-Gm-Message-State: AC+VfDyOnYSFG93Ogku2nEBJapInsXqTmX6IO/1Y7dbjSu4aUhy0sHlH f9G7l1FPspmILbVMr845kByk3xC09q0G5W+pa8CVBkLNcpsBdzuUzI4OedgdZrg+z34eODyTWyJ HXAlpE65UtRW7EYprr/T0nCpqLiTJMg== X-Received: by 2002:adf:fe52:0:b0:30e:3ec5:1fb1 with SMTP id m18-20020adffe52000000b0030e3ec51fb1mr1047418wrs.33.1686316267953; Fri, 09 Jun 2023 06:11:07 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5HzAsIdiP1uiiYjvw6IxKEMrP8T5Bvsw/ON4lKVSNTtt+Ftu30VDz+io/OnSdF6pwri/umdQ== X-Received: by 2002:adf:fe52:0:b0:30e:3ec5:1fb1 with SMTP id m18-20020adffe52000000b0030e3ec51fb1mr1047403wrs.33.1686316267537; Fri, 09 Jun 2023 06:11:07 -0700 (PDT) Received: from localhost (11.72.115.87.dyn.plus.net. [87.115.72.11]) by smtp.gmail.com with ESMTPSA id q18-20020a05600000d200b0030e5da3e295sm4417098wrx.65.2023.06.09.06.11.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 09 Jun 2023 06:11:06 -0700 (PDT) From: Andrew Burgess To: Pedro Alves , gdb-patches@sourceware.org Subject: Re: [PATCH 24/31] Report thread exit event for leader if reporting thread exit events In-Reply-To: <20221212203101.1034916-25-pedro@palves.net> References: <20221212203101.1034916-1-pedro@palves.net> <20221212203101.1034916-25-pedro@palves.net> Date: Fri, 09 Jun 2023 14:11:01 +0100 Message-ID: <87r0qkye56.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Pedro Alves writes: > If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the > thread disappears from the thread list, GDB expects to shortly see a > thread exit event for it. See e.g., here, in > remote_target::update_thread_list(): > > /* Do not remove the thread if we've requested to be > notified of its exit. For example, the thread may be > displaced stepping, infrun will need to handle the > exit event, and displaced stepping info is recorded > in the thread object. If we deleted the thread now, > we'd lose that info. */ > if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0) > continue; > > There's one scenario that is deleting a thread from the > remote/gdbserver thread list without ever reporting a corresponding > thread exit event though -- check_zombie_leaders. This can lead to > GDB getting confused. For example, with a following patch that > enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see > this regression: > > $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp" > ... > Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ... > FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout) > ... some more cascading FAILs ... > > gdb.log shows: > > (gdb) continue > Continuing. > FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout) > > A passing run would have resulted in: > > (gdb) continue > Continuing. > No unwaited-for children left. > (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits > > note how the leader thread is not listed in the remote-reported XML > thread list below: > > (gdb) set debug remote 1 > (gdb) set debug infrun 1 > (gdb) info threads > Id Target Id Frame > * 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65 > 3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39 > (gdb) c > Continuing. > [infrun] clear_proceed_status_thread: 1163850.1163850.0 > ... > [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f > [remote] Sending packet: $QPassSignals:#f3 > [remote] Packet received: OK > [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3 > [remote] Packet received: OK > ... > [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3] > ... > [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0 > [remote] Sending packet: $vCont;c:p11c24a.11c24a#98 > [infrun] prepare_to_wait: prepare_to_wait > [infrun] reset: reason=handling event > [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote > [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote > [infrun] fetch_inferior_event: exit > [infrun] fetch_inferior_event: enter > [infrun] scoped_disable_commit_resumed: reason=handling event > [infrun] random_pending_event_thread: None found. > [remote] wait: enter > [remote] Packet received: N > [remote] wait: exit > [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) = > [infrun] print_target_wait_results: -1.0.0 [process -1], > [infrun] print_target_wait_results: status->kind = NO_RESUMED > [infrun] handle_inferior_event: status->kind = NO_RESUMED > [remote] Sending packet: $Hgp0.0#ad > [remote] Packet received: OK > [remote] Sending packet: $qXfer:threads:read::0,1000#92 > [remote] Packet received: l\n\n\n > [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed) > ... > > ... however, infrun decided there was a resumed thread still, so > ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see > that the "found resumed" thread that GDB finds, is the leader thread. > Even though that thread is not on the remote-reported thread list, it > is still on the GDB thread list, due to the special case in remote.c > mentioned above. > > This commit addresses the issue by fixing GDBserver to report a thread > exit event for the zombie leader too, i.e., making GDBserver respect > the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit" > invariant. To do that, it takes a bit more code than one would > imagine off hand, due to the fact that we currently always report LWP > exit pending events as TARGET_WAITKIND_EXITED, and then decide whether > to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting > the event to GDBserver core. For the zombie leader scenario > described, we need to record early on that we want to report a > THREAD_EXITED event, and then make sure that decision isn't lost along > the way to reporting the event to GDBserver core. LGTM. Reviewed-By: Andrew Burgess Thanks, Andrew > > Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b > --- > gdbserver/linux-low.cc | 75 ++++++++++++++++++++++++++++++++++++------ > gdbserver/linux-low.h | 5 +-- > 2 files changed, 68 insertions(+), 12 deletions(-) > > diff --git a/gdbserver/linux-low.cc b/gdbserver/linux-low.cc > index bf6dc1d995a..c197846810c 100644 > --- a/gdbserver/linux-low.cc > +++ b/gdbserver/linux-low.cc > @@ -279,7 +279,8 @@ int using_threads = 1; > static int stabilizing_threads; > > static void unsuspend_all_lwps (struct lwp_info *except); > -static void mark_lwp_dead (struct lwp_info *lwp, int wstat); > +static void mark_lwp_dead (struct lwp_info *lwp, int wstat, > + bool thread_event); > static int lwp_is_marked_dead (struct lwp_info *lwp); > static int kill_lwp (unsigned long lwpid, int signo); > static void enqueue_pending_signal (struct lwp_info *lwp, int signal, siginfo_t *info); > @@ -1800,10 +1801,12 @@ iterate_over_lwps (ptid_t filter, > return get_thread_lwp (thread); > } > > -void > +bool > linux_process_target::check_zombie_leaders () > { > - for_each_process ([this] (process_info *proc) > + bool new_pending_event = false; > + > + for_each_process ([&] (process_info *proc) > { > pid_t leader_pid = pid_of (proc); > lwp_info *leader_lp = find_lwp_pid (ptid_t (leader_pid)); > @@ -1872,9 +1875,19 @@ linux_process_target::check_zombie_leaders () > "(it exited, or another thread execd), " > "deleting it.", > leader_pid); > - delete_lwp (leader_lp); > + > + thread_info *leader_thread = get_lwp_thread (leader_lp); > + if (report_exit_events_for (leader_thread)) > + { > + mark_lwp_dead (leader_lp, W_EXITCODE (0, 0), true); > + new_pending_event = true; > + } > + else > + delete_lwp (leader_lp); > } > }); > + > + return new_pending_event; > } > > /* Callback for `find_thread'. Returns the first LWP that is not > @@ -2333,7 +2346,7 @@ linux_process_target::filter_event (int lwpid, int wstat) > /* Since events are serialized to GDB core, and we can't > report this one right now. Leave the status pending for > the next time we're able to report it. */ > - mark_lwp_dead (child, wstat); > + mark_lwp_dead (child, wstat, false); > return; > } > else > @@ -2646,7 +2659,8 @@ linux_process_target::wait_for_event_filtered (ptid_t wait_ptid, > > /* Check for zombie thread group leaders. Those can't be reaped > until all other threads in the thread group are. */ > - check_zombie_leaders (); > + if (check_zombie_leaders ()) > + goto retry; > > auto not_stopped = [&] (thread_info *thread) > { > @@ -2893,6 +2907,17 @@ linux_process_target::filter_exit_event (lwp_info *event_child, > struct thread_info *thread = get_lwp_thread (event_child); > ptid_t ptid = ptid_of (thread); > > + if (ourstatus->kind () == TARGET_WAITKIND_THREAD_EXITED) > + { > + /* We're reporting a thread exit for the leader. The exit was > + detected by check_zombie_leaders. */ > + gdb_assert (is_leader (thread)); > + gdb_assert (report_exit_events_for (thread)); > + > + delete_lwp (event_child); > + return ptid; > + } > + > /* Note we must filter TARGET_WAITKIND_SIGNALLED as well, otherwise > if a non-leader thread exits with a signal, we'd report it to the > core which would interpret it as the whole-process exiting. > @@ -3012,7 +3037,20 @@ linux_process_target::wait_1 (ptid_t ptid, target_waitstatus *ourstatus, > { > if (WIFEXITED (w)) > { > - ourstatus->set_exited (WEXITSTATUS (w)); > + /* If we already have the exit recorded in waitstatus, use > + it. This will happen when we detect a zombie leader, > + when we had GDB_THREAD_OPTION_EXIT enabled for it. We > + want to report its exit as TARGET_WAITKIND_THREAD_EXITED, > + as the whole process hasn't exited yet. */ > + const target_waitstatus &ws = event_child->waitstatus; > + if (ws.kind () != TARGET_WAITKIND_IGNORE) > + { > + gdb_assert (ws.kind () == TARGET_WAITKIND_EXITED > + || ws.kind () == TARGET_WAITKIND_THREAD_EXITED); > + *ourstatus = ws; > + } > + else > + ourstatus->set_exited (WEXITSTATUS (w)); > > threads_debug_printf > ("ret = %s, exited with retcode %d", > @@ -3720,8 +3758,15 @@ suspend_and_send_sigstop (thread_info *thread, lwp_info *except) > send_sigstop (thread, except); > } > > +/* Mark LWP dead, with WSTAT as exit status pending to report later. > + If THREAD_EVENT is true, interpret WSTAT as a thread exit event > + instead of a process exit event. This is meaningful for the leader > + thread, as we normally report a process-wide exit event when we see > + the leader exit, and a thread exit event when we see any other > + thread exit. */ > + > static void > -mark_lwp_dead (struct lwp_info *lwp, int wstat) > +mark_lwp_dead (struct lwp_info *lwp, int wstat, bool thread_event) > { > /* Store the exit status for later. */ > lwp->status_pending_p = 1; > @@ -3730,9 +3775,19 @@ mark_lwp_dead (struct lwp_info *lwp, int wstat) > /* Store in waitstatus as well, as there's nothing else to process > for this event. */ > if (WIFEXITED (wstat)) > - lwp->waitstatus.set_exited (WEXITSTATUS (wstat)); > + { > + if (thread_event) > + lwp->waitstatus.set_thread_exited (WEXITSTATUS (wstat)); > + else > + lwp->waitstatus.set_exited (WEXITSTATUS (wstat)); > + } > else if (WIFSIGNALED (wstat)) > - lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat))); > + { > + gdb_assert (!thread_event); > + lwp->waitstatus.set_signalled (gdb_signal_from_host (WTERMSIG (wstat))); > + } > + else > + gdb_assert_not_reached ("unknown status kind"); > > /* Prevent trying to stop it. */ > lwp->stopped = 1; > diff --git a/gdbserver/linux-low.h b/gdbserver/linux-low.h > index 33a14e15173..ffbc3c6f095 100644 > --- a/gdbserver/linux-low.h > +++ b/gdbserver/linux-low.h > @@ -574,8 +574,9 @@ class linux_process_target : public process_stratum_target > > /* Detect zombie thread group leaders, and "exit" them. We can't > reap their exits until all other threads in the group have > - exited. */ > - void check_zombie_leaders (); > + exited. Returns true if we left any new event pending, false > + otherwise. */ > + bool check_zombie_leaders (); > > /* Convenience function that is called when we're about to return an > event to the core. If the event is an exit or signalled event, > -- > 2.36.0