From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 6B12B3832378 for ; Mon, 28 Nov 2022 15:20:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6B12B3832378 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669648859; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=uLrNyxNEkBond+NF4dheygcNnk+poWfbpFBBHzt+gz0=; b=Z0Jyn/biCQZC8scF5OyYb4UdEjGlPAFvnNtMAt2MGzZSNaJGqIMcsrsEU7cFdgw0d0WrTX hi0zD9JmKWqFuz9q/C9dEgRgCAnu5cVOL+CW0478WoELm8VkLYS1LkWHQJQZK0mB9Ng95H KzFI/qABSqOBdI/i92ba1xFDZkOdug4= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-605-CGVp7lZqMRKCQx09lp57iQ-1; Mon, 28 Nov 2022 10:20:57 -0500 X-MC-Unique: CGVp7lZqMRKCQx09lp57iQ-1 Received: by mail-wm1-f70.google.com with SMTP id m14-20020a7bcb8e000000b003cfcff0057eso3808939wmi.9 for ; Mon, 28 Nov 2022 07:20:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:message-id:date:references:in-reply-to:subject:cc:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=uLrNyxNEkBond+NF4dheygcNnk+poWfbpFBBHzt+gz0=; b=6oL+xGF9RXuWqOuAPbZdMlkrXlQkI/jthBCXGw7rkhbRoOFT6i9VV2//miiP1YmGbK PMHWwr1daiq+3SXaHnM7fdWwYa8yuseWi2bgfY4rsvTx6/Wz7/dmaPNASt9Ho9LixVAm 4o3VUPbiPf+TGmLXRcv+QevbzPwsCsWFhcGoj/egnwx9xBdmHrYxUxb3e6rUheb2QTHs tfGjFpZzHeN5KrotMKjwnBP7IZu10dxbZxHYY4MnsAdcRHC2X1EUYx8ELwzrY5sHCYaq CN3VlGQL7ej++eJa+i9hang7aRh4aKlucavht1OO/0OXUXVU5quo1f4DZ2LBTXsHfRIK ltLg== X-Gm-Message-State: ANoB5plzwgUpmuBug9+P7i4flppFDCNi2DQoqwIvcnxGFShohcxAAFi9 B9CbzA6O2o///BRgVNZYeJLV6VD90DQGoujok/iKK7vWN4eYk7IV7urvgwRMfBlNOKJnMDnUkPL weyIEsmrMp8WVQ5Ti2ArHU7Bfw1h/Kd27dmbC3WQ6e777yRNiSm9VxFinbCFjqwT3n+0Rr5Gm1w == X-Received: by 2002:a1c:f401:0:b0:3c7:84d:72d with SMTP id z1-20020a1cf401000000b003c7084d072dmr24388699wma.181.1669648855853; Mon, 28 Nov 2022 07:20:55 -0800 (PST) X-Google-Smtp-Source: AA0mqf58k3XnKJTJ7ysiNplTam/EYklX1qPHL1BccFpdmda/YXMzNfFDZBygF6oL7s3PIz21YiG0jw== X-Received: by 2002:a1c:f401:0:b0:3c7:84d:72d with SMTP id z1-20020a1cf401000000b003c7084d072dmr24388665wma.181.1669648855349; Mon, 28 Nov 2022 07:20:55 -0800 (PST) Received: from localhost ([31.111.84.238]) by smtp.gmail.com with ESMTPSA id r4-20020a5d6944000000b0022cc0a2cbecsm11108152wrw.15.2022.11.28.07.20.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 07:20:55 -0800 (PST) From: Andrew Burgess To: Lancelot SIX via Gdb-patches , gdb-patches@sourceware.org Cc: lsix@lancelotsix.com, Lancelot SIX Subject: Re: [PATCH v3] gdb/gcore: interrupt all threads before generating the corefile In-Reply-To: <20221116123649.2430701-1-lancelot.six@amd.com> References: <20221116123649.2430701-1-lancelot.six@amd.com> Date: Mon, 28 Nov 2022 15:20:54 +0000 Message-ID: <874juj2jhl.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Lancelot SIX via Gdb-patches writes: > Hi, > > This is a V3 following > https://sourceware.org/pipermail/gdb-patches/2022-October/192746.html. > > Noticeable changes since V2: > - The test have been updated following Pedro's comment to be compatible > with --target_board=native-extended-gdbserver. > - Running the entire testsuite against gdbserver revealed that the > original implementation was incompatible with all-stop targets. The > V3 improves the implementation to support both all-stop and non-stop > targets. > > This patch have been regression tested against gdb and gdbserver > (native-extended-gdbserver). > > All feedbacks are welcome. > > Best, > Lancelot > > --- > > In non-stop mode, if the user tries to generate a core dump (using the > gcore command) while some threads are running, a non-helpful error > message is shown. > > Lets consider the following session as an example (debugging the test > program included in this patch): > > (gdb) set non-stop on > (gdb) b 37 > (gdb) r > Thread 1 "gcore-nonstop" hit Breakpoint 1, main () at gcore-nonstop.c:39 > (gdb) info thread > Id Target Id Frame > * 1 Thread 0x7ffff7d7a740 (LWP 431838) "gcore-nonstop" main () at gcore-nonstop.c:39 > 2 Thread 0x7ffff7d79640 (LWP 431841) "gcore-nonstop" (running) > (gdb) gcore > Couldn't get registers: No such process. > > The reported error ("No such process") does not help the user understand > what happens. This is due to the fact that we cannot access the > registers of a running thread. Even if we ignore this error, generating > a core dump while any thread might update memory would most likely > result in a core file with an inconsistent view of the process' memory. > > To solve this, this patch proposes to change the gcore command so it > first stops all running threads (from the current inferior) before > generating the corefile, and then resumes them in their previous state. > > To achieve this, this patch exposes the restart_threads function in infrun.h > (used to be local to infrun.c). We also allow the first parameter > (event_thread) to be nullptr as it is possible that the gcore command is > called while all threads are running, in which case we want all threads > to be restarted at the end of the procedure. > > When testing this patch against gdbserver, it appears that using > stop_all_threads / restart_threads was not compatible with all-stop > targets. For those targets, we need to call target_stop_and_wait / > target_resume. The gcore_command has code to handle both > configurations. I could not use a RAII-like object to have a cleaner > way to restore the state at the end as the restore procedure could > throw. Instead, the procedure keeps track of which method was used to > interrupt threads so the appropriate method can be used to restore their > state. > > Tested on x86_64 on navite GDB and the native-extended-gdbserver board. > --- > gdb/NEWS | 5 ++ > gdb/doc/gdb.texinfo | 5 ++ > gdb/gcore.c | 37 ++++++++++++ > gdb/infrun.c | 16 ++--- > gdb/infrun.h | 9 +++ > gdb/testsuite/gdb.base/gcore-nonstop.c | 44 ++++++++++++++ > gdb/testsuite/gdb.base/gcore-nonstop.exp | 77 ++++++++++++++++++++++++ > 7 files changed, 182 insertions(+), 11 deletions(-) > create mode 100644 gdb/testsuite/gdb.base/gcore-nonstop.c > create mode 100644 gdb/testsuite/gdb.base/gcore-nonstop.exp > > diff --git a/gdb/NEWS b/gdb/NEWS > index 3f31515297c..adc0963aecb 100644 > --- a/gdb/NEWS > +++ b/gdb/NEWS > @@ -128,6 +128,11 @@ set style tui-current-position [on|off] > > * Changed commands > > +gcore I think you should list both gcore and generate-core-file here. > + GDB now ensures that all threads of the current inferior are stopped > + before generating a core dump. At the end of the command, threads are > + restored to their previous state. > + > document user-defined > It is now possible to document user-defined aliases. > When a user-defined alias is documented, the help and apropos commands > diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo > index f5f664fd168..ec2eba4b410 100644 > --- a/gdb/doc/gdb.texinfo > +++ b/gdb/doc/gdb.texinfo > @@ -13570,6 +13570,11 @@ Produce a core dump of the inferior process. The optional argument > specified, the file name defaults to @file{core.@var{pid}}, where > @var{pid} is the inferior process ID. > > +@value{GDBN} ensures that all threads of the current inferior are stopped > +while generating the core dump. If any of the inferior's thread is found > +running when executing this command, @value{GDBN} stops it and resumes it > +when the command is done. I think the second sentence needs rewording. I would propose: If any of the inferior's threads are running when executing this command, @value{GDBN} stops the threads and resumes them when the command is done. > + > Note that this command is implemented only for some systems (as of > this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390). > > diff --git a/gdb/gcore.c b/gdb/gcore.c > index ede78534bd8..dda9e7af50b 100644 > --- a/gdb/gcore.c > +++ b/gdb/gcore.c > @@ -34,6 +34,7 @@ > #include "regset.h" > #include "gdb_bfd.h" > #include "readline/tilde.h" > +#include "infrun.h" > #include > #include "gdbsupport/gdb_unlinker.h" > #include "gdbsupport/byte-vector.h" > @@ -131,6 +132,35 @@ gcore_command (const char *args, int from_tty) > if (!target_has_execution ()) > noprocess (); > > + scoped_restore_current_thread restore_current_thread; > + scoped_disable_commit_resumed disable_commit_resume ("generating coredump"); > + struct inferior *inf = current_inferior (); > + scoped_finish_thread_state finish_state (inf->process_target (), > + ptid_t (inferior_ptid.pid ())); Through all of the following code you work with INF. I guess you maybe did this to reduce the line lengths? But it also has the nice effect that the following code is mostly independent of the global current_inferior() state. However, I found it a little confusing that when building the ptid_t you still make use of the global inferior_ptid. I think I'm correct to say that all of the 'inferior_ptid.pid ()' calls could be replaced with 'inf->pid', which I think is clearer. Certainly, I tried making this change, and the test still passed. Is there any reason why this wouldn't be a valid change? > + bool all_stop_was_running = false; > + if (exists_non_stop_target ()) > + stop_all_threads ("generating coredump", inf); > + else > + { > + for (thread_info *t > + : all_non_exited_threads (inf->process_target (), I don't think there's any reason to wrap this line, unwrapped it's still only 74 characters. Though I think you could just use 'inf->non_exited_thread ()' instead. > + ptid_t (inferior_ptid.pid ()))) > + { > + /* All threads are executing or none is, no need to go through the > + entire list. */ > + all_stop_was_running = t->executing (); > + break; > + } Actually, I think you could replace this whole loop with: const thread_info *thr = any_thread_of_inferior (inf); all_stop_was_running = thr->executing (); > + > + if (all_stop_was_running) > + { > + if (!may_stop) > + error (_("Cannot stop the target to generate the core dump.")); > + > + target_stop_and_wait (ptid_t (inferior_ptid.pid ())); > + } > + } > + > if (args && *args) > corefilename.reset (tilde_expand (args)); > else > @@ -161,6 +191,13 @@ gcore_command (const char *args, int from_tty) > } > > gdb_printf ("Saved corefile %s\n", corefilename.get ()); > + > + if (exists_non_stop_target ()) > + restart_threads (nullptr, inf); > + else if (all_stop_was_running) > + target_resume (ptid_t (inferior_ptid.pid ()), 0, GDB_SIGNAL_0); > + > + disable_commit_resume.reset_and_commit (); > } > > static enum bfd_architecture > diff --git a/gdb/infrun.c b/gdb/infrun.c > index 6da46b75ac7..c7c8ad5456f 100644 > --- a/gdb/infrun.c > +++ b/gdb/infrun.c > @@ -96,9 +96,6 @@ static void resume (gdb_signal sig); > > static void wait_for_inferior (inferior *inf); > > -static void restart_threads (struct thread_info *event_thread, > - inferior *inf = nullptr); > - > static bool start_step_over (void); > > static bool step_over_info_valid_p (void); > @@ -5886,18 +5883,15 @@ handle_inferior_event (struct execution_control_state *ecs) > } > } > > -/* Restart threads back to what they were trying to do back when we > - paused them (because of an in-line step-over or vfork, for example). > - The EVENT_THREAD thread is ignored (not restarted). > - > - If INF is non-nullptr, only resume threads from INF. */ > +/* See infrun.h. */ > > -static void > +void > restart_threads (struct thread_info *event_thread, inferior *inf) > { > INFRUN_SCOPED_DEBUG_START_END ("event_thread=%s, inf=%d", > - event_thread->ptid.to_string ().c_str (), > - inf != nullptr ? inf->num : -1); > + (event_thread != nullptr > + ? event_thread->ptid.to_string ().c_str () > + : "None"), inf != nullptr ? inf->num : -1); > > gdb_assert (!step_over_info_valid_p ()); > > diff --git a/gdb/infrun.h b/gdb/infrun.h > index c711b9b21cc..4cd98ec06c5 100644 > --- a/gdb/infrun.h > +++ b/gdb/infrun.h > @@ -175,6 +175,15 @@ extern void nullify_last_target_wait_ptid (); > all threads of all inferiors. */ > extern void stop_all_threads (const char *reason, inferior *inf = nullptr); > > +/* Restart threads back to what they were trying to do back when we > + paused them (because of an in-line step-over or vfork, for example). > + The EVENT_THREAD thread, if non-nullptr, is ignored (not restarted). > + > + If INF is non-nullptr, only resume threads from INF. */ > + > +extern void restart_threads (struct thread_info *event_thread, > + inferior *inf = nullptr); > + > extern void prepare_for_detach (void); > > extern void fetch_inferior_event (); > diff --git a/gdb/testsuite/gdb.base/gcore-nonstop.c b/gdb/testsuite/gdb.base/gcore-nonstop.c > new file mode 100644 > index 00000000000..191a1a26849 > --- /dev/null > +++ b/gdb/testsuite/gdb.base/gcore-nonstop.c > @@ -0,0 +1,44 @@ > +/* This testcase is part of GDB, the GNU debugger. > + > + Copyright 2022 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see . */ > + > +#include > + > +static pthread_barrier_t barrier; > + > +static void * > +worker_func (void *) For me, gcc 9.3.1, this test doesn't compile due to the missing argument name here. Changing to worker_func (void *ignored) works just fine. Could we do that? If you really wanted you could annotate with '__attribute__ ((unused))' though this doesn't seem to be widespread in our testsuite, so personally, I'd not bother, but it's up to you. > +{ > + pthread_barrier_wait (&barrier); > + return NULL; > +} > + > +int > +main (void) > +{ > + pthread_t worker_thread; > + pthread_barrier_init (&barrier, NULL, 2); > + > + pthread_create (&worker_thread, NULL, worker_func, NULL); > + > + /* Break here. */ > + > + pthread_barrier_wait (&barrier); > + pthread_join (worker_thread, NULL); > + pthread_barrier_destroy (&barrier); > + > + return 0; > +} > diff --git a/gdb/testsuite/gdb.base/gcore-nonstop.exp b/gdb/testsuite/gdb.base/gcore-nonstop.exp > new file mode 100644 > index 00000000000..56fc909b00b > --- /dev/null > +++ b/gdb/testsuite/gdb.base/gcore-nonstop.exp > @@ -0,0 +1,77 @@ > +# Copyright 2022 Free Software Foundation, Inc. > + > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > + > +# This testcase checks that when in non-stop mode with some threads running > +# the gcore command can interrupt all threads, generate a core dump and > +# restart threads as required. > + > +standard_testfile > + > +if { [prepare_for_testing "failed to prepare" \ > + ${testfile} ${srcfile} {threads debug}] } { I think: s/threads/pthreads/ otherwise, for me, this test doesn't link with -lpthread, and fails to link. Thanks, Andrew > + return > +} > + > +save_vars { GDBFLAGS } { > + append GDBFLAGS " -ex \"set non-stop on\"" > + clean_restart ${binfile} > +} > + > +set lineno [gdb_get_line_number "Break here"] > +if { ![runto $lineno] } { > + return > +} > + > +# We should be stopped in thread 1 while thread 2 is running. > +gdb_test_sequence "info threads" "info threads" { > + {Id\s+Target Id\s+Frame} > + {\*\s+1[^\n]*\n} > + {\s+2\s+[^\n]*\(running\)[^\n]*\n} > +} > + > +set th1_pc "" > +gdb_test_multiple "p/x \$pc" "fetch thread 1 PC" { > + -wrap -re "$::decimal = ($::hex)" { > + set th1_pc $expect_out(1,string) > + pass $gdb_test_name > + } > +} > + > +set corefile [standard_output_file "corefile"] > +if {![gdb_gcore_cmd $corefile "generate corefile"]} { > + # gdb_gcore_cmd issues a "UNSUPPORTED". > + return > +} > + > +# After the core file is generated, thread 2 should be back running > +# and thread 1 should still be selected. > +gdb_test_sequence "info threads" "correct thread selection after gcore" { > + {Id\s+Target Id\s+Frame} > + {\*\s+1[^\n]*\n} > + {\s+2\s+[^\n]*\(running\)[^\n]*\n} > +} > + > +# Thread 1 is at the same stop it was before calling the gcore command. > +gdb_test "p/x \$pc" "\\\$$::decimal = $th1_pc" "thread 1 unchanged" > + > +clean_restart $binfile > +gdb_test "core-file $corefile" "Core was generated by.*" "load corefile" > + > +# The core file has the 2 threads. > +gdb_test_sequence "info threads" "threads in corefile" { > + {Id\s+Target Id\s+Frame} > + {\s+1\s+Thread[^\n]*\n} > + {\s+2\s+Thread[^\n]*\n} > +} > > -- > 2.34.1