public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Lancelot SIX <Lancelot.Six@amd.com>
To: Andrew Burgess <aburgess@redhat.com>,
	Lancelot SIX via Gdb-patches <gdb-patches@sourceware.org>
Cc: lsix@lancelotsix.com
Subject: Re: [PATCH v3] gdb/gcore: interrupt all threads before generating the corefile
Date: Fri, 2 Dec 2022 20:05:41 +0000	[thread overview]
Message-ID: <badbe732-aec6-6f6f-045c-fea294355663@amd.com> (raw)
In-Reply-To: <874juj2jhl.fsf@redhat.com>

Hi Andrew,

Thanks for the review. I have been through your various comments and all 
seem valid points.

I'll try to send a revised patch when possible integrating all of your 
comments.

Best,
Lancelot.

On 28/11/2022 15:20, Andrew Burgess wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> Lancelot SIX via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>> Hi,
>>
>> This is a V3 following
>> https://sourceware.org/pipermail/gdb-patches/2022-October/192746.html.
>>
>> Noticeable changes since V2:
>> - The test have been updated following Pedro's comment to be compatible
>>    with --target_board=native-extended-gdbserver.
>> - Running the entire testsuite against gdbserver revealed that the
>>    original implementation was incompatible with all-stop targets.  The
>>    V3 improves the implementation to support both all-stop and non-stop
>>    targets.
>>
>> This patch have been regression tested against gdb and gdbserver
>> (native-extended-gdbserver).
>>
>> All feedbacks are welcome.
>>
>> Best,
>> Lancelot
>>
>> ---
>>
>> In non-stop mode, if the user tries to generate a core dump (using the
>> gcore command) while some threads are running, a non-helpful error
>> message is shown.
>>
>> Lets consider the following session as an example (debugging the test
>> program included in this patch):
>>
>>      (gdb) set non-stop on
>>      (gdb) b 37
>>      (gdb) r
>>      Thread 1 "gcore-nonstop" hit Breakpoint 1, main () at gcore-nonstop.c:39
>>      (gdb) info thread
>>         Id   Target Id                                          Frame
>>       * 1    Thread 0x7ffff7d7a740 (LWP 431838) "gcore-nonstop" main () at gcore-nonstop.c:39
>>         2    Thread 0x7ffff7d79640 (LWP 431841) "gcore-nonstop" (running)
>>      (gdb) gcore
>>      Couldn't get registers: No such process.
>>
>> The reported error ("No such process") does not help the user understand
>> what happens.  This is due to the fact that we cannot access the
>> registers of a running thread.  Even if we ignore this error, generating
>> a core dump while any thread might update memory would most likely
>> result in a core file with an inconsistent view of the process' memory.
>>
>> To solve this, this patch proposes to change the gcore command so it
>> first stops all running threads (from the current inferior) before
>> generating the corefile, and then resumes them in their previous state.
>>
>> To achieve this, this patch exposes the restart_threads function in infrun.h
>> (used to be local to infrun.c).  We also allow the first parameter
>> (event_thread) to be nullptr as it is possible that the gcore command is
>> called while all threads are running, in which case we want all threads
>> to be restarted at the end of the procedure.
>>
>> When testing this patch against gdbserver, it appears that using
>> stop_all_threads / restart_threads was not compatible with all-stop
>> targets.  For those targets, we need to call target_stop_and_wait /
>> target_resume.  The gcore_command has code to handle both
>> configurations.  I could not use a RAII-like object to have a cleaner
>> way to restore the state at the end as the restore procedure could
>> throw.  Instead, the procedure keeps track of which method was used to
>> interrupt threads so the appropriate method can be used to restore their
>> state.
>>
>> Tested on x86_64 on navite GDB and the native-extended-gdbserver board.
>> ---
>>   gdb/NEWS                                 |  5 ++
>>   gdb/doc/gdb.texinfo                      |  5 ++
>>   gdb/gcore.c                              | 37 ++++++++++++
>>   gdb/infrun.c                             | 16 ++---
>>   gdb/infrun.h                             |  9 +++
>>   gdb/testsuite/gdb.base/gcore-nonstop.c   | 44 ++++++++++++++
>>   gdb/testsuite/gdb.base/gcore-nonstop.exp | 77 ++++++++++++++++++++++++
>>   7 files changed, 182 insertions(+), 11 deletions(-)
>>   create mode 100644 gdb/testsuite/gdb.base/gcore-nonstop.c
>>   create mode 100644 gdb/testsuite/gdb.base/gcore-nonstop.exp
>>
>> diff --git a/gdb/NEWS b/gdb/NEWS
>> index 3f31515297c..adc0963aecb 100644
>> --- a/gdb/NEWS
>> +++ b/gdb/NEWS
>> @@ -128,6 +128,11 @@ set style tui-current-position [on|off]
>>
>>   * Changed commands
>>
>> +gcore
> 
> I think you should list both gcore and generate-core-file here.
> 
>> +  GDB now ensures that all threads of the current inferior are stopped
>> +  before generating a core dump.  At the end of the command, threads are
>> +  restored to their previous state.
>> +
>>   document user-defined
>>     It is now possible to document user-defined aliases.
>>     When a user-defined alias is documented, the help and apropos commands
>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>> index f5f664fd168..ec2eba4b410 100644
>> --- a/gdb/doc/gdb.texinfo
>> +++ b/gdb/doc/gdb.texinfo
>> @@ -13570,6 +13570,11 @@ Produce a core dump of the inferior process.  The optional argument
>>   specified, the file name defaults to @file{core.@var{pid}}, where
>>   @var{pid} is the inferior process ID.
>>
>> +@value{GDBN} ensures that all threads of the current inferior are stopped
>> +while generating the core dump.  If any of the inferior's thread is found
>> +running when executing this command, @value{GDBN} stops it and resumes it
>> +when the command is done.
> 
> I think the second sentence needs rewording.  I would propose:
> 
>    If any of the inferior's threads are running when executing this
>    command, @value{GDBN} stops the threads and resumes them when the
>    command is done.
> 
>> +
>>   Note that this command is implemented only for some systems (as of
>>   this writing, @sc{gnu}/Linux, FreeBSD, Solaris, and S390).
>>
>> diff --git a/gdb/gcore.c b/gdb/gcore.c
>> index ede78534bd8..dda9e7af50b 100644
>> --- a/gdb/gcore.c
>> +++ b/gdb/gcore.c
>> @@ -34,6 +34,7 @@
>>   #include "regset.h"
>>   #include "gdb_bfd.h"
>>   #include "readline/tilde.h"
>> +#include "infrun.h"
>>   #include <algorithm>
>>   #include "gdbsupport/gdb_unlinker.h"
>>   #include "gdbsupport/byte-vector.h"
>> @@ -131,6 +132,35 @@ gcore_command (const char *args, int from_tty)
>>     if (!target_has_execution ())
>>       noprocess ();
>>
>> +  scoped_restore_current_thread restore_current_thread;
>> +  scoped_disable_commit_resumed disable_commit_resume ("generating coredump");
>> +  struct inferior *inf = current_inferior ();
>> +  scoped_finish_thread_state finish_state (inf->process_target (),
>> +                                        ptid_t (inferior_ptid.pid ()));
> 
> Through all of the following code you work with INF.  I guess you maybe
> did this to reduce the line lengths?  But it also has the nice effect
> that the following code is mostly independent of the global
> current_inferior() state.
> 
> However, I found it a little confusing that when building the ptid_t you
> still make use of the global inferior_ptid.
> 
> I think I'm correct to say that all of the 'inferior_ptid.pid ()' calls
> could be replaced with 'inf->pid', which I think is clearer.
> 
> Certainly, I tried making this change, and the test still passed.  Is
> there any reason why this wouldn't be a valid change?
> 
>> +  bool all_stop_was_running = false;
>> +  if (exists_non_stop_target ())
>> +    stop_all_threads ("generating coredump", inf);
>> +  else
>> +    {
>> +      for (thread_info *t
>> +        : all_non_exited_threads (inf->process_target (),
> 
> I don't think there's any reason to wrap this line, unwrapped it's still
> only 74 characters.
> 
> Though I think you could just use 'inf->non_exited_thread ()' instead.
> 
>> +                                  ptid_t (inferior_ptid.pid ())))
>> +     {
>> +       /* All threads are executing or none is, no need to go through the
>> +          entire list.  */
>> +       all_stop_was_running = t->executing ();
>> +       break;
>> +     }
> 
> Actually, I think you could replace this whole loop with:
> 
>    const thread_info *thr = any_thread_of_inferior (inf);
>    all_stop_was_running = thr->executing ();
> 
>> +
>> +      if (all_stop_was_running)
>> +     {
>> +       if (!may_stop)
>> +         error (_("Cannot stop the target to generate the core dump."));
>> +
>> +       target_stop_and_wait (ptid_t (inferior_ptid.pid ()));
>> +     }
>> +    }
>> +
>>     if (args && *args)
>>       corefilename.reset (tilde_expand (args));
>>     else
>> @@ -161,6 +191,13 @@ gcore_command (const char *args, int from_tty)
>>       }
>>
>>     gdb_printf ("Saved corefile %s\n", corefilename.get ());
>> +
>> +  if (exists_non_stop_target ())
>> +    restart_threads (nullptr, inf);
>> +  else if (all_stop_was_running)
>> +    target_resume (ptid_t (inferior_ptid.pid ()), 0, GDB_SIGNAL_0);
>> +
>> +  disable_commit_resume.reset_and_commit ();
>>   }
>>
>>   static enum bfd_architecture
>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>> index 6da46b75ac7..c7c8ad5456f 100644
>> --- a/gdb/infrun.c
>> +++ b/gdb/infrun.c
>> @@ -96,9 +96,6 @@ static void resume (gdb_signal sig);
>>
>>   static void wait_for_inferior (inferior *inf);
>>
>> -static void restart_threads (struct thread_info *event_thread,
>> -                          inferior *inf = nullptr);
>> -
>>   static bool start_step_over (void);
>>
>>   static bool step_over_info_valid_p (void);
>> @@ -5886,18 +5883,15 @@ handle_inferior_event (struct execution_control_state *ecs)
>>       }
>>   }
>>
>> -/* Restart threads back to what they were trying to do back when we
>> -   paused them (because of an in-line step-over or vfork, for example).
>> -   The EVENT_THREAD thread is ignored (not restarted).
>> -
>> -   If INF is non-nullptr, only resume threads from INF.  */
>> +/* See infrun.h.  */
>>
>> -static void
>> +void
>>   restart_threads (struct thread_info *event_thread, inferior *inf)
>>   {
>>     INFRUN_SCOPED_DEBUG_START_END ("event_thread=%s, inf=%d",
>> -                              event_thread->ptid.to_string ().c_str (),
>> -                              inf != nullptr ? inf->num : -1);
>> +                              (event_thread != nullptr
>> +                               ? event_thread->ptid.to_string ().c_str ()
>> +                               : "None"), inf != nullptr ? inf->num : -1);
>>
>>     gdb_assert (!step_over_info_valid_p ());
>>
>> diff --git a/gdb/infrun.h b/gdb/infrun.h
>> index c711b9b21cc..4cd98ec06c5 100644
>> --- a/gdb/infrun.h
>> +++ b/gdb/infrun.h
>> @@ -175,6 +175,15 @@ extern void nullify_last_target_wait_ptid ();
>>      all threads of all inferiors.  */
>>   extern void stop_all_threads (const char *reason, inferior *inf = nullptr);
>>
>> +/* Restart threads back to what they were trying to do back when we
>> +   paused them (because of an in-line step-over or vfork, for example).
>> +   The EVENT_THREAD thread, if non-nullptr, is ignored (not restarted).
>> +
>> +   If INF is non-nullptr, only resume threads from INF.  */
>> +
>> +extern void restart_threads (struct thread_info *event_thread,
>> +                          inferior *inf = nullptr);
>> +
>>   extern void prepare_for_detach (void);
>>
>>   extern void fetch_inferior_event ();
>> diff --git a/gdb/testsuite/gdb.base/gcore-nonstop.c b/gdb/testsuite/gdb.base/gcore-nonstop.c
>> new file mode 100644
>> index 00000000000..191a1a26849
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.base/gcore-nonstop.c
>> @@ -0,0 +1,44 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> +   Copyright 2022 Free Software Foundation, Inc.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; either version 3 of the License, or
>> +   (at your option) any later version.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>> +
>> +#include <pthread.h>
>> +
>> +static pthread_barrier_t barrier;
>> +
>> +static void *
>> +worker_func (void *)
> 
> For me, gcc 9.3.1, this test doesn't compile due to the missing argument
> name here.  Changing to
> 
>    worker_func (void *ignored)
> 
> works just fine.  Could we do that?  If you really wanted you could
> annotate with '__attribute__ ((unused))' though this doesn't seem to be
> widespread in our testsuite, so personally, I'd not bother, but it's up
> to you.
> 
>> +{
>> +  pthread_barrier_wait (&barrier);
>> +  return NULL;
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> +  pthread_t worker_thread;
>> +  pthread_barrier_init (&barrier, NULL, 2);
>> +
>> +  pthread_create (&worker_thread, NULL, worker_func, NULL);
>> +
>> +  /* Break here.  */
>> +
>> +  pthread_barrier_wait (&barrier);
>> +  pthread_join (worker_thread, NULL);
>> +  pthread_barrier_destroy (&barrier);
>> +
>> +  return 0;
>> +}
>> diff --git a/gdb/testsuite/gdb.base/gcore-nonstop.exp b/gdb/testsuite/gdb.base/gcore-nonstop.exp
>> new file mode 100644
>> index 00000000000..56fc909b00b
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.base/gcore-nonstop.exp
>> @@ -0,0 +1,77 @@
>> +# Copyright 2022 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +# This testcase checks that when in non-stop mode with some threads running
>> +# the gcore command can interrupt all threads, generate a core dump and
>> +# restart threads as required.
>> +
>> +standard_testfile
>> +
>> +if { [prepare_for_testing "failed to prepare" \
>> +     ${testfile} ${srcfile} {threads debug}] } {
> 
> I think: s/threads/pthreads/ otherwise, for me, this test doesn't link
> with -lpthread, and fails to link.
> 
> Thanks,
> Andrew
> 
>> +    return
>> +}
>> +
>> +save_vars { GDBFLAGS } {
>> +    append GDBFLAGS " -ex \"set non-stop on\""
>> +    clean_restart ${binfile}
>> +}
>> +
>> +set lineno [gdb_get_line_number "Break here"]
>> +if { ![runto $lineno] } {
>> +    return
>> +}
>> +
>> +# We should be stopped in thread 1 while thread 2 is running.
>> +gdb_test_sequence "info threads" "info threads" {
>> +    {Id\s+Target Id\s+Frame}
>> +    {\*\s+1[^\n]*\n}
>> +    {\s+2\s+[^\n]*\(running\)[^\n]*\n}
>> +}
>> +
>> +set th1_pc ""
>> +gdb_test_multiple "p/x \$pc" "fetch thread 1 PC" {
>> +  -wrap -re "$::decimal = ($::hex)" {
>> +    set th1_pc $expect_out(1,string)
>> +    pass $gdb_test_name
>> +  }
>> +}
>> +
>> +set corefile [standard_output_file "corefile"]
>> +if {![gdb_gcore_cmd $corefile "generate corefile"]} {
>> +  # gdb_gcore_cmd issues a "UNSUPPORTED".
>> +  return
>> +}
>> +
>> +# After the core file is generated, thread 2 should be back running
>> +# and thread 1 should still be selected.
>> +gdb_test_sequence "info threads" "correct thread selection after gcore" {
>> +    {Id\s+Target Id\s+Frame}
>> +    {\*\s+1[^\n]*\n}
>> +    {\s+2\s+[^\n]*\(running\)[^\n]*\n}
>> +}
>> +
>> +# Thread 1 is at the same stop it was before calling the gcore command.
>> +gdb_test "p/x \$pc" "\\\$$::decimal = $th1_pc" "thread 1 unchanged"
>> +
>> +clean_restart $binfile
>> +gdb_test "core-file $corefile" "Core was generated by.*" "load corefile"
>> +
>> +# The core file has the 2 threads.
>> +gdb_test_sequence "info threads" "threads in corefile" {
>> +    {Id\s+Target Id\s+Frame}
>> +    {\s+1\s+Thread[^\n]*\n}
>> +    {\s+2\s+Thread[^\n]*\n}
>> +}
>>
>> --
>> 2.34.1
> 

      reply	other threads:[~2022-12-02 20:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-16 12:36 Lancelot SIX
2022-11-28 10:46 ` [PING] " Lancelot SIX
2022-11-28 15:20 ` Andrew Burgess
2022-12-02 20:05   ` Lancelot SIX [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=badbe732-aec6-6f6f-045c-fea294355663@amd.com \
    --to=lancelot.six@amd.com \
    --cc=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=lsix@lancelotsix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).