From: Pedro Alves
To: gdb-patches@sourceware.org
Subject: [PATCH 09/11] Ensure EXIT is last event, gdb/linux
Date: Thu, 3 Mar 2022 14:40:18 +0000
Message-Id:
 <20220303144020.3601082-10-pedro@palves.net>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20220303144020.3601082-1-pedro@palves.net>
References: <20220303144020.3601082-1-pedro@palves.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Lancelot SIX

When all threads of a multi-threaded process terminate, we can end up
with multiple LWPs having a pending exit event (let's say that the
events arrive faster than GDB processes them, which can be simulated
by introducing an artificial delay in GDB).

When we have multiple pending events to report to the core,
linux_nat_wait_1 uses select_event_lwp to randomly select one of the
pending events to report.

If we have multiple pending exit events, and the randomization picks
the leader's exit to report, filter_exit_event sees that this is the
leader exiting, thus it is the exit of the whole process and thus
reports an EXITED event to the core, while leaving the other threads'
exit statuses pending.

This is problematic for infrun.c:stop_all_threads, which asks the
target to report all thread exit events to infrun.  For example, in
stop_all_threads, core GDB counts 2 threads that need to be stopped.
It then asks the target to stop those 2 threads (with
target_stop(ptid)), and waits for 2 events to come back from the
target.
Unfortunately, when waiting for events, the linux-nat target, due to
the random event selection mentioned above, reports the whole-process
EXIT event even though the other thread has exited and its exit hasn't
been reported yet.  As a consequence, stop_all_threads receives one
event, but waits indefinitely for the second one to come.
Effectively, GDB hangs forever.

To fix this, this commit makes sure that a leader's exit event is not
considered for selection by select_event_lwp as long as there is at
least one other thread with a pending event remaining in the process.
Considering that the leader thread's exit event can only be generated
by the kernel once it has reaped all the non-leader threads, we are
guaranteed that all other threads do have an exit event which is ready
to be processed.  Once all other exit events are processed,
select_event_lwp will consider the leader's exit status.

Tested on Linux-x86_64 with no regression observed.

Co-Authored-By: Pedro Alves
Change-Id: Id17ad5de76518925a968c0902860646820679dfa
---
 gdb/linux-nat.c | 46 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 5 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index d97a770bf83..5d8887b11f6 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -2482,15 +2482,50 @@ status_callback (struct lwp_info *lp)
   return 1;
 }
 
-/* Count the LWP's that have had events.  */
+/* Return true if another thread of the same process as LP has a
+   pending status ready to be processed.  LP is assumed to be the
+   leader of its process.  */
+
+static bool
+non_leader_lwp_in_process_with_pending_event (lwp_info *lp)
+{
+  gdb_assert (is_leader (lp));
+
+  for (lwp_info *other : all_lwps ())
+    {
+      if (other->ptid.pid () == lp->ptid.pid ()
+          && !is_leader (other)
+          && other->resumed
+          && lwp_status_pending_p (other))
+        return true;
+    }
+
+  return false;
+}
+
+/* Indicate whether LP has a pending event which should be considered
+   for immediate processing.  Does not consider a leader thread's exit
+   event before the non-leader threads have reported their exits.  */
+
+static bool
+has_reportable_pending_event (struct lwp_info *lp)
+{
+  return (lp->resumed && lwp_status_pending_p (lp)
+          && !(!WIFSTOPPED (lp->status)
+               && is_leader (lp)
+               && non_leader_lwp_in_process_with_pending_event (lp)));
+}
+
+/* Count the LWP's that have had reportable events.  */
 
 static int
 count_events_callback (struct lwp_info *lp, int *count)
 {
   gdb_assert (count != NULL);
 
-  /* Select only resumed LWPs that have an event pending.  */
-  if (lp->resumed && lwp_status_pending_p (lp))
+  /* Select only resumed LWPs that have a reportable event
+     pending.  */
+  if (has_reportable_pending_event (lp))
     (*count)++;
 
   return 0;
@@ -2526,8 +2561,9 @@ select_event_lwp_callback (struct lwp_info *lp, int *selector)
 {
   gdb_assert (selector != NULL);
 
-  /* Select only resumed LWPs that have an event pending.  */
-  if (lp->resumed && lwp_status_pending_p (lp))
+  /* Select only resumed LWPs that have a reportable event
+     pending.  */
+  if (has_reportable_pending_event (lp))
     if ((*selector)-- == 0)
       return 1;
 
-- 
2.26.2
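
For readers who want to experiment with the selection rule outside of
GDB, here is a rough standalone sketch of the filter.  It is not GDB
code: toy_lwp and every toy_* function are hypothetical stand-ins, the
leader is assumed to be the LWP whose id equals the pid, and
pending_is_exit approximates the !WIFSTOPPED (lp->status) test from
the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GDB's lwp_info, reduced to the fields the
// filter looks at.  None of these names exist in GDB.
struct toy_lwp
{
  int pid;               // process (thread group) id
  int lwpid;             // thread id; assumed equal to pid for the leader
  bool resumed;          // mirrors lp->resumed
  bool status_pending;   // mirrors lwp_status_pending_p (lp)
  bool pending_is_exit;  // approximates !WIFSTOPPED (lp->status)
};

bool
toy_is_leader (const toy_lwp &lp)
{
  return lp.lwpid == lp.pid;
}

// True if some resumed non-leader LWP of LEADER's process still has a
// pending status.
bool
toy_non_leader_with_pending_event (const std::vector<toy_lwp> &lwps,
                                   const toy_lwp &leader)
{
  assert (toy_is_leader (leader));

  for (const toy_lwp &other : lwps)
    if (other.pid == leader.pid
        && !toy_is_leader (other)
        && other.resumed
        && other.status_pending)
      return true;

  return false;
}

// A leader's exit is held back while any non-leader of the same
// process still has an event to report.
bool
toy_has_reportable_pending_event (const std::vector<toy_lwp> &lwps,
                                  const toy_lwp &lp)
{
  return (lp.resumed && lp.status_pending
          && !(lp.pending_is_exit
               && toy_is_leader (lp)
               && toy_non_leader_with_pending_event (lwps, lp)));
}

// Counterpart of counting the candidates before the random pick.
int
toy_count_reportable (const std::vector<toy_lwp> &lwps)
{
  int count = 0;

  for (const toy_lwp &lp : lwps)
    if (toy_has_reportable_pending_event (lwps, lp))
      ++count;

  return count;
}
```

With a leader and one non-leader that both have a pending exit, only
the non-leader is a candidate; once the non-leader's status has been
consumed, the leader's exit becomes the sole candidate, which is the
ordering the commit message describes.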