public inbox for gdb-patches@sourceware.org
* [PATCH 0/5] Better executable auto-loading when opening a core file
@ 2024-10-26 11:11 Andrew Burgess
  2024-10-26 11:11 ` [PATCH 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

There are actually a couple of core file related improvements in this
series.

Patches #1 and #2 improve what information GDB can extract about the
execution context (executable name, inferior arguments, and
environment) when opening a core file.

Then patch #4 improves GDB's ability to auto-load the executable that
matches a core file (on GNU/Linux).

Patch #3 is a testsuite refactor to allow for patch #4.

And patch #5 replicates patch #4, but for FreeBSD.

Thanks,
Andrew

---

Andrew Burgess (5):
  gdb: add gdbarch method to get execution context from core file
  gdb: parse and set the inferior environment from core files
  gdb/testsuite: make some of the core file / build-id tests harder
  gdb: improve GDB's ability to auto-load the exec for a core file
  gdb/freebsd: port core file context parsing to FreeBSD

 gdb/arch-utils.c                              |  26 ++
 gdb/arch-utils.h                              |  89 +++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 | 172 +++++++++-
 gdb/fbsd-tdep.c                               | 134 ++++++++
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 308 ++++++++++++++++++
 gdb/testsuite/gdb.base/coredump-filter.exp    |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp   | 252 ++++++--------
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 165 ++++++++++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 252 ++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp           |   9 +
 17 files changed, 1379 insertions(+), 163 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp


base-commit: 2bba46058789196c1c384896933cbc9692ef4933
-- 
2.25.4



* [PATCH 1/5] gdb: add gdbarch method to get execution context from core file
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
@ 2024-10-26 11:11 ` Andrew Burgess
  2024-10-26 11:11 ` [PATCH 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Add a new gdbarch method which can read the execution context from a
core file.  An execution context, for this commit, means the filename
of the executable used to generate the core file and the arguments
passed to the executable.

In later commits this will be extended further to include the
environment in which the executable was run, but this commit is
already pretty big, so I've split that part out into a later commit.

Initially this new gdbarch method is only implemented for Linux
targets, but a later commit will add FreeBSD support too.

Currently when GDB opens a core file, GDB reports the command and
arguments used to generate the core file.  For example:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./gen-core abc def'.

However, this information comes from the psinfo structure in the core
file, and this struct only allows 80 characters for the command and
arguments combined.  If the command and arguments exceed this then
they are truncated.

Additionally, neither the executable nor the arguments are quoted in
the psinfo structure, so if, for example, the executable was named
'aaa bbb' (i.e. contains white space) and was run with the arguments
'ccc' and 'ddd', then when this core file was opened by GDB we'd see:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./aaa bbb ccc ddd'.

It is impossible to know if 'bbb' is part of the executable filename,
or another argument.
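
As a rough illustration (this is not code from the patch), the loss of
information can be modelled by joining the arguments with spaces and
truncating to the 80 characters mentioned above.  The helper name and
the constant are hypothetical, and the sketch glosses over whether a
terminator uses up one of those 80 bytes:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed size of the psinfo argument field, taken from the 80
// character limit described in the text.
constexpr std::size_t psargs_size = 80;

// Hypothetical helper: join ARGV with single spaces, then truncate the
// result the way a fixed size psinfo style field would.
std::string
psinfo_style_command (const std::vector<std::string> &argv)
{
  std::string result;
  for (const std::string &arg : argv)
    {
      if (!result.empty ())
        result += ' ';
      result += arg;
    }
  if (result.size () > psargs_size)
    result.resize (psargs_size);
  return result;
}
```

Passing {"./aaa bbb", "ccc", "ddd"} or {"./aaa", "bbb", "ccc", "ddd"}
to this helper gives the same string, which is exactly the ambiguity
described above.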

However, the kernel places the executable command onto the user stack;
this is pointed to by the AT_EXECFN entry in the auxv vector.
Additionally, the inferior arguments are all available on the user
stack.  The new gdbarch method added in this commit extracts this
information from the user stack and allows GDB to access it.
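
To make the AT_EXECFN lookup concrete, here is a minimal sketch of
scanning a raw auxv buffer for that entry.  It is not the code in this
patch (which uses target_auxv_search); the function name is
hypothetical, and it assumes 64-bit entries stored in host byte order:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::uint64_t at_null = 0;	// Terminating auxv entry.
constexpr std::uint64_t at_execfn = 31;	// Linux value of AT_EXECFN.

// Hypothetical helper: walk the (type, value) pairs in AUXV and store
// the value of the AT_EXECFN entry in *ADDR.  Returns false if the
// entry is not found before the null entry or the end of the buffer.
bool
find_execfn_address (const std::vector<unsigned char> &auxv,
                     std::uint64_t *addr)
{
  constexpr std::size_t word = sizeof (std::uint64_t);
  for (std::size_t off = 0; off + 2 * word <= auxv.size (); off += 2 * word)
    {
      std::uint64_t type, value;
      std::memcpy (&type, auxv.data () + off, word);
      std::memcpy (&value, auxv.data () + off + word, word);
      if (type == at_null)
        break;
      if (type == at_execfn)
        {
          *addr = value;
          return true;
        }
    }
  return false;
}
```

The value stored in *ADDR is an address in the inferior, so the string
itself still has to be read from the core file's memory image.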

The information on the stack is writable by the user, so a user
application can start up, edit the arguments, override the AT_EXECFN
string, and then dump core.  In this case GDB will report incorrect
information.  However, the psinfo structure is also filled (by the
kernel) by copying information from the user stack, so if the user
edits the on-stack arguments, the values reported in psinfo change
too.  The new approach is therefore no worse than what we currently
have.

The benefit of this approach is that GDB gets to report the full
executable name and all the arguments without the 80 character limit,
and GDB is aware which parts are the executable name, and which parts
are arguments, so we can, for example, style the executable name.

Another benefit is that, now we know all the arguments, we can poke
these into the inferior object.  This means that after loading a core
file a user can 'show args' to see the arguments used.  A user could
even transition from core file debugging to live inferior debugging
using, e.g. 'run', and GDB would restart the inferior with the correct
arguments.

Now the downside: finding the AT_EXECFN string is easy, the auxv entry
points directly to it.  However, finding the arguments is a little
trickier.  There's currently no easy way to get a direct pointer to
the arguments.  Instead, I've got a heuristic which I believe should
find the arguments in most cases.  The algorithm is laid out in
linux-tdep.c, I'll not repeat it here, but it's basically a search of
the user stack, starting from AT_EXECFN.
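
The last step of that search can be sketched on a simplified model.
This is only an illustration under stated assumptions, not the
linux-tdep.c code: the stack is modelled as a vector of 64-bit words
(index 0 is the lowest address), the index of the first auxv word is
assumed to be known already, and the struct and function names are
hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: index of argv[0] and the number of argv
// entries found, both in terms of words in the modelled stack.
struct arg_range
{
  std::size_t first;
  std::size_t count;
};

// Assumed layout, from low to high addresses:
//   argc, argv pointers, 0, envp pointers, 0, auxv pairs.
// AUXV_START is the index of the first auxv word.
bool
find_argv (const std::vector<std::uint64_t> &stack, std::size_t auxv_start,
           arg_range *range)
{
  // The word just below the auxv table should terminate envp.
  if (auxv_start < 2 || auxv_start > stack.size ()
      || stack[auxv_start - 1] != 0)
    return false;

  // Scan down over the envp pointers to the null that terminates argv.
  std::size_t i = auxv_start - 1;
  do
    {
      if (i == 0)
        return false;
      --i;
    }
  while (stack[i] != 0);

  // Scan down over the argv pointers, counting them, until we reach a
  // word holding the count so far; that word is taken to be argc.
  std::uint64_t count = 0;
  for (;;)
    {
      if (i == 0)
        return false;
      --i;
      if (stack[i] == 0)
        return false;		// Zero arguments is not supported.
      if (stack[i] == count)
        break;
      ++count;
    }

  range->first = i + 1;
  range->count = static_cast<std::size_t> (count);
  return true;
}
```

As with the real heuristic, this relies on argument pointers being
large values that will not be mistaken for the running count.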

If the new heuristic fails then GDB just falls back to the old
approach, asking bfd to read the psinfo structure for us, which gives
the old 80 character limited answer.

For testing, I've run this series on x86-64, s390, and ppc64le (all
GNU/Linux), and the new test passes in each case.
---
 gdb/arch-utils.h                              |  57 ++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 |  38 ++-
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 286 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 102 +++++++
 10 files changed, 572 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 40c62f30a65..8d9f1625bdd 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -74,6 +74,58 @@ struct bp_manipulation_endian
   bp_manipulation_endian<sizeof (BREAK_INSN_LITTLE),		  \
   BREAK_INSN_LITTLE, BREAK_INSN_BIG>
 
+/* Structure returned from gdbarch core_parse_exec_context method.  Wraps
+   the execfn string and a vector containing the inferior arguments.  If a
+   gdbarch is unable to parse this information then an empty structure is
+   returned.  Check the execfn as an indication: if this is nullptr then no
+   other fields should be considered valid.  */
+
+struct core_file_exec_context
+{
+  /* Constructor, just move everything into place.  The EXEC_NAME should
+     never be nullptr.  Only call this constructor if all the arguments
+     have been collected successfully, i.e. if the EXEC_NAME could be
+     found but not ARGV then use the no-argument constructor to create an
+     empty context object.  */
+  core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+    : m_exec_name (std::move (exec_name)),
+      m_arguments (std::move (argv))
+  {
+    gdb_assert (m_exec_name != nullptr);
+  }
+
+  /* Create a default context object.  In its default state a context
+     object holds no useful information, and will return false from its
+     valid() method.  */
+  core_file_exec_context () = default;
+
+  /* Return true if this object contains valid context information.  */
+  bool valid () const
+  { return m_exec_name != nullptr; }
+
+  /* Return the execfn string (executable name) as extracted from the core
+     file.  Will always return non-nullptr if valid() returns true.  */
+  const char *execfn () const
+  { return m_exec_name.get (); }
+
+  /* Return the vector of inferior arguments as extracted from the core
+     file.  This does not include argv[0] (the executable name) for that
+     see the execfn() function.  */
+  const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
+  { return m_arguments; }
+
+private:
+
+  /* The executable filename as reported in the core file.  Can be nullptr
+     if no executable name is found.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_name;
+
+  /* List of arguments.  Doesn't include argv[0] which is the executable
+     name, for this look at m_exec_name field.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+};
+
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
 extern bool default_displaced_step_hw_singlestep (struct gdbarch *);
 
@@ -305,6 +357,11 @@ extern void default_read_core_file_mappings
    read_core_file_mappings_pre_loop_ftype pre_loop_cb,
    read_core_file_mappings_loop_ftype loop_cb);
 
+/* Default implementation of gdbarch_core_parse_exec_context.  Returns
+   an empty core_file_exec_context.  */
+extern core_file_exec_context default_core_parse_exec_context
+  (struct gdbarch *gdbarch, bfd *cbfd);
+
 /* Default implementation of gdbarch
    use_target_description_from_corefile_notes.  */
 extern bool default_use_target_description_from_corefile_notes
diff --git a/gdb/corefile.c b/gdb/corefile.c
index f6ec3cd5ca1..c3089e4516e 100644
--- a/gdb/corefile.c
+++ b/gdb/corefile.c
@@ -35,6 +35,7 @@
 #include "cli/cli-utils.h"
 #include "gdbarch.h"
 #include "interps.h"
+#include "arch-utils.h"
 
 void
 reopen_exec_file (void)
@@ -76,6 +77,15 @@ validate_files (void)
     }
 }
 
+/* See arch-utils.h.  */
+
+core_file_exec_context
+default_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  return {};
+}
+\f
+
 std::string
 memory_error_message (enum target_xfer_status err,
 		      struct gdbarch *gdbarch, CORE_ADDR memaddr)
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5820ffed332..5cc11d71b7b 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -854,7 +854,6 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
 void
 core_target_open (const char *arg, int from_tty)
 {
-  const char *p;
   int siggy;
   int scratch_chan;
   int flags;
@@ -990,9 +989,40 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  p = bfd_core_file_failing_command (current_program_space->core_bfd ());
-  if (p)
-    gdb_printf (_("Core was generated by `%s'.\n"), p);
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+  if (ctx.valid ())
+    {
+      std::string args;
+      for (const auto &a : ctx.args ())
+	{
+	  args += ' ';
+	  args += a.get ();
+	}
+
+      gdb_printf (_("Core was generated by `%ps%s'.\n"),
+		  styled_string (file_name_style.style (),
+				 ctx.execfn ()),
+		  args.c_str ());
+
+      /* Copy the arguments into the inferior.  */
+      std::vector<char *> argv;
+      for (const auto &a : ctx.args ())
+	argv.push_back (a.get ());
+      gdb::array_view<char * const> view (argv.data (), argv.size ());
+      current_inferior ()->set_args (view);
+    }
+  else
+    {
+      gdb::unique_xmalloc_ptr<char> failing_command = make_unique_xstrdup
+	(bfd_core_file_failing_command (current_program_space->core_bfd ()));
+      if (failing_command != nullptr)
+	gdb_printf (_("Core was generated by `%s'.\n"),
+		    failing_command.get ());
+    }
 
   /* Clearing any previous state of convenience variables.  */
   clear_exit_convenience_vars ();
diff --git a/gdb/gdbarch-gen.c b/gdb/gdbarch-gen.c
index 0d00cd7c993..6f41ce9d233 100644
--- a/gdb/gdbarch-gen.c
+++ b/gdb/gdbarch-gen.c
@@ -258,6 +258,7 @@ struct gdbarch
   gdbarch_get_pc_address_flags_ftype *get_pc_address_flags = default_get_pc_address_flags;
   gdbarch_read_core_file_mappings_ftype *read_core_file_mappings = default_read_core_file_mappings;
   gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes = default_use_target_description_from_corefile_notes;
+  gdbarch_core_parse_exec_context_ftype *core_parse_exec_context = default_core_parse_exec_context;
 };
 
 /* Create a new ``struct gdbarch'' based on information provided by
@@ -527,6 +528,7 @@ verify_gdbarch (struct gdbarch *gdbarch)
   /* Skip verify of get_pc_address_flags, invalid_p == 0.  */
   /* Skip verify of read_core_file_mappings, invalid_p == 0.  */
   /* Skip verify of use_target_description_from_corefile_notes, invalid_p == 0.  */
+  /* Skip verify of core_parse_exec_context, invalid_p == 0.  */
   if (!log.empty ())
     internal_error (_("verify_gdbarch: the following are invalid ...%s"),
 		    log.c_str ());
@@ -1386,6 +1388,9 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file)
   gdb_printf (file,
 	      "gdbarch_dump: use_target_description_from_corefile_notes = <%s>\n",
 	      host_address_to_string (gdbarch->use_target_description_from_corefile_notes));
+  gdb_printf (file,
+	      "gdbarch_dump: core_parse_exec_context = <%s>\n",
+	      host_address_to_string (gdbarch->core_parse_exec_context));
   if (gdbarch->dump_tdep != NULL)
     gdbarch->dump_tdep (gdbarch, file);
 }
@@ -5463,3 +5468,20 @@ set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch,
 {
   gdbarch->use_target_description_from_corefile_notes = use_target_description_from_corefile_notes;
 }
+
+core_file_exec_context
+gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != NULL);
+  gdb_assert (gdbarch->core_parse_exec_context != NULL);
+  if (gdbarch_debug >= 2)
+    gdb_printf (gdb_stdlog, "gdbarch_core_parse_exec_context called\n");
+  return gdbarch->core_parse_exec_context (gdbarch, cbfd);
+}
+
+void
+set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch,
+				     gdbarch_core_parse_exec_context_ftype core_parse_exec_context)
+{
+  gdbarch->core_parse_exec_context = core_parse_exec_context;
+}
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index b982fd7cd09..29c5ad705f9 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1751,3 +1751,18 @@ extern void set_gdbarch_read_core_file_mappings (struct gdbarch *gdbarch, gdbarc
 typedef bool (gdbarch_use_target_description_from_corefile_notes_ftype) (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern bool gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern void set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes);
+
+/* Examine the core file bfd object CBFD and try to extract the name of
+   the current executable and the argument list, which are returned in a
+   core_file_exec_context object.
+
+   If for any reason the details can't be extracted from CBFD then an
+   empty context is returned.
+
+   It is required that the current inferior be the one associated with
+   CBFD, strings are read from the current inferior using target methods
+   which all assume current_inferior() is the one to read from. */
+
+typedef core_file_exec_context (gdbarch_core_parse_exec_context_ftype) (struct gdbarch *gdbarch, bfd *cbfd);
+extern core_file_exec_context gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd);
+extern void set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, gdbarch_core_parse_exec_context_ftype *core_parse_exec_context);
diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h
index 60a0f60df39..8359ae762de 100644
--- a/gdb/gdbarch.h
+++ b/gdb/gdbarch.h
@@ -59,6 +59,7 @@ struct ui_out;
 struct inferior;
 struct x86_xsave_layout;
 struct solib_ops;
+struct core_file_exec_context;
 
 #include "regcache.h"
 
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 4006380076d..7a218605d89 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -2778,3 +2778,23 @@ The corefile's bfd is passed through COREFILE_BFD.
     predefault="default_use_target_description_from_corefile_notes",
     invalid=False,
 )
+
+Method(
+    comment="""
+Examine the core file bfd object CBFD and try to extract the name of
+the current executable and the argument list, which are returned in a
+core_file_exec_context object.
+
+If for any reason the details can't be extracted from CBFD then an
+empty context is returned.
+
+It is required that the current inferior be the one associated with
+CBFD, strings are read from the current inferior using target methods
+which all assume current_inferior() is the one to read from.
+""",
+    type="core_file_exec_context",
+    name="core_parse_exec_context",
+    params=[("bfd *", "cbfd")],
+    predefault="default_core_parse_exec_context",
+    invalid=False,
+)
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 65ec221ef48..d1937970be7 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -1835,6 +1835,290 @@ linux_corefile_thread (struct thread_info *info,
     }
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  /* This function (currently) assumes the stack grows down.  If this is
+     not the case then this function isn't going to help.  */
+  if (!gdbarch_stack_grows_down (gdbarch))
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file.  The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section * section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Parse the .auxv section looking for the AT_EXECFN attribute.  The
+     value of this attribute is a pointer to a string, the string is the
+     executable command.  Additionally, this string is placed at the top of
+     the program stack, and so will be in the same PT_LOAD segment as the
+     argv and envp arrays.  We can use this to try and locate these arrays.
+     If we can't find the AT_EXECFN attribute then we're not going to be
+     able to do anything else here.  */
+  CORE_ADDR execfn_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_EXECFN, &execfn_string_addr) != 1)
+    return {};
+
+  /* Read in the program headers from CBFD.  If we can't do this for any
+     reason then just give up.  */
+  long phdrs_size = bfd_get_elf_phdr_upper_bound (cbfd);
+  if (phdrs_size == -1)
+    return {};
+  gdb::unique_xmalloc_ptr<Elf_Internal_Phdr>
+    phdrs ((Elf_Internal_Phdr *) xmalloc (phdrs_size));
+  int num_phdrs = bfd_get_elf_phdrs (cbfd, phdrs.get ());
+  if (num_phdrs == -1)
+    return {};
+
+  /* Now scan through the headers looking for the one which contains the
+     address held in EXECFN_STRING_ADDR, this is the address of the
+     executable command pointed to by the AT_EXECFN auxv entry.  */
+  Elf_Internal_Phdr *hdr = nullptr;
+  for (int i = 0; i < num_phdrs; i++)
+    {
+      /* The program header that contains the address EXECFN_STRING_ADDR
+	 should be one where all content is contained within CBFD, hence
+	 the check that the file size matches the memory size.  */
+      if (phdrs.get ()[i].p_type == PT_LOAD
+	  && phdrs.get ()[i].p_vaddr <= execfn_string_addr
+	  && (phdrs.get ()[i].p_vaddr
+	      + phdrs.get ()[i].p_memsz) > execfn_string_addr
+	  && phdrs.get ()[i].p_memsz == phdrs.get ()[i].p_filesz)
+	{
+	  hdr = &phdrs.get ()[i];
+	  break;
+	}
+    }
+
+  /* If we failed to find a suitable program header then give up.  */
+  if (hdr == nullptr)
+    return {};
+
+  /* As we assume the stack grows down (see early check in this function)
+     we know that the information we are looking for sits somewhere between
+     EXECFN_STRING_ADDR and the segment's virtual address.  These define
+     the HIGH and LOW addresses between which we are going to search.  */
+  CORE_ADDR low = hdr->p_vaddr;
+  CORE_ADDR high = execfn_string_addr;
+
+  /* This PTR is going to be the address we are currently accessing.  */
+  CORE_ADDR ptr = align_down (high, ptr_bytes);
+
+  /* Set up DEREF, a helper function which loads a value from an address.
+     The returned value is always placed into a uint64_t, even if we only
+     load 4-bytes, this allows the code below to be pretty generic.  All
+     the values we're dealing with are unsigned, so this should be OK.   */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+  gdb::function_view<uint64_t (CORE_ADDR)> deref
+    = [=] (CORE_ADDR p) -> uint64_t
+    {
+      ULONGEST value = read_memory_unsigned_integer (p, ptr_bytes, byte_order);
+      return (uint64_t) value;
+    };
+
+  /* Now search down through memory looking for a PTR_BYTES sized object
+     which contains the value EXECFN_STRING_ADDR.  The hope is that this
+     will be the AT_EXECFN entry in the auxv table.  There is no guarantee
+     that we'll find the auxv table this way, but we will do our best to
+     validate that what we find is the auxv table, see below.  */
+  while (ptr > low)
+    {
+      if (deref (ptr) == execfn_string_addr
+	  && (ptr - ptr_bytes) > low
+	  && deref (ptr - ptr_bytes) == AT_EXECFN)
+	break;
+
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* Assuming that we are looking at a value field in the auxv table, move
+     forward PTR_BYTES bytes so we are now looking at the next key field in
+     the auxv table, then scan forward until we find the null entry which
+     will be the last entry in the auxv table.  */
+  ptr += ptr_bytes;
+  while ((ptr + (2 * ptr_bytes)) < high
+	 && (deref (ptr) != 0 || deref (ptr + ptr_bytes) != 0))
+    ptr += (2 * ptr_bytes);
+
+  /* PTR now points to the null entry in the auxv table, or we think it
+     does.  Now we want to find the start of the auxv table.  There's no
+     in-memory pattern we can search for at the start of the table, but
+     we can find the start based on the size of the .auxv section within
+     the core file CBFD object.  In the actual core file the auxv is held
+     in a note, but the bfd library makes this into a section for us.
+
+     The addition of (2 * PTR_BYTES) here is because PTR is pointing at the
+     null entry, but the null entry is also included in CONTENTS.  */
+  ptr = ptr + (2 * ptr_bytes) - contents.size ();
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR should now be pointing to the start of the auxv table mapped into
+     the inferior memory.  As we got here using a heuristic then let's
+     compare an auxv table sized block of inferior memory, if this matches
+     then it's not a guarantee that we are in the right place, but it does
+     make it more likely.  */
+  gdb::byte_vector target_contents (size);
+  if (target_read_memory (ptr, target_contents.data (), size) != 0)
+    memory_error (TARGET_XFER_E_IO, ptr);
+  if (memcmp (contents.data (), target_contents.data (), size) != 0)
+    return {};
+
+  /* We have reasonable confidence that PTR points to the start of the auxv
+     table.  Below this should be the null terminated list of pointers to
+     environment strings, and below that the null terminated list of
+     pointers to arguments strings.  After that we should find the
+     argument count.  First, check for the null at the end of the
+     environment list.  */
+  if (deref (ptr - ptr_bytes) != 0)
+    return {};
+
+  ptr -= (2 * ptr_bytes);
+  while (ptr > low && deref (ptr) != 0)
+    ptr -= ptr_bytes;
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing to the null entry at the end of the argument
+     string pointer list.  We now want to scan backward to find the entire
+     argument list.  There's no handy null marker that we can look for
+     here, instead, as we scan backward we look for the argument count
+     (argc) value which appears immediately before the argument list.
+
+     Technically, we could have zero arguments, so the argument count would
+     be zero, however, we don't support this case.  If we find a null entry
+     in the argument list before we find the argument count then we just
+     bail out.
+
+     Start by moving to the last argument string pointer, we expect this
+     to be non-null.  */
+  ptr -= ptr_bytes;
+  uint64_t argc = 0;
+  while (ptr > low)
+    {
+      uint64_t val = deref (ptr);
+      if (val == 0)
+	return {};
+
+      if (val == argc)
+	break;
+
+      argc++;
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing at the argument count value.  Move it forward
+     so we're pointing at the first actual argument string pointer.  */
+  ptr += ptr_bytes;
+
+  /* We can now parse all of the argument strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+
+  /* Skip the first argument.  This is the executable command, but we'll
+     load that separately later.  */
+  ptr += ptr_bytes;
+
+  uint64_t v;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      arguments.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  /* Skip the null-pointer at the end of the argument list.  We will now
+     be pointing at the first environment string.  */
+  ptr += ptr_bytes;
+
+  /* Parse the environment strings.  Nothing is done with this yet, but
+     will be in a later commit.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      environment.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  gdb::unique_xmalloc_ptr<char> execfn
+    = target_read_string (execfn_string_addr, INT_MAX);
+  if (execfn == nullptr)
+    return {};
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (arguments));
+}
+
+/* Parse and return execution context details from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return linux_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Fill the PRPSINFO structure with information about the process being
    debugged.  Returns 1 in case of success, 0 for failures.  Please note that
    even if the structure cannot be entirely filled (e.g., GDB was unable to
@@ -2785,6 +3069,8 @@ linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch,
   set_gdbarch_infcall_mmap (gdbarch, linux_infcall_mmap);
   set_gdbarch_infcall_munmap (gdbarch, linux_infcall_munmap);
   set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       linux_corefile_parse_exec_context);
 }
 
 void _initialize_linux_tdep ();
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.c b/gdb/testsuite/gdb.base/corefile-exec-context.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
new file mode 100644
index 00000000000..b18a8104779
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -0,0 +1,102 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB can handle reading the full executable name and argument
+# list from a core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Linux core files can encode up to 80 characters for the command and
+# arguments in the psinfo.  If BINFILE is less than 80 characters in
+# length then let's try to make it longer.
+set binfile_len [string length $binfile]
+if { $binfile_len <= 80 } {
+    set extra_len [expr 80 - $binfile_len + 1]
+    set extra_str [string repeat "x" $extra_len]
+    set new_binfile $binfile$extra_str
+    remote_exec build "mv $binfile $new_binfile"
+    set binfile $new_binfile
+}
+
+# Generate a core file, this time the inferior has no additional
+# arguments.
+set corefile [core_find $binfile {}]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_1 "$binfile.1.core"
+remote_exec build "mv $corefile $corefile_1"
+
+# Load the core file and confirm that the full executable name is
+# seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_1" "load core file no args" {
+    -re "^Core was generated by `[string_to_regexp $binfile]'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Generate a core file, this time pass some arguments to the inferior.
+set args "aaaaa bbbbb ccccc ddddd eeeee"
+set corefile [core_find $binfile {} $args]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_2 "$binfile.2.core"
+remote_exec build "mv $corefile $corefile_2"
+
+# Load the core file and confirm that the full executable name and
+# argument list are seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_2" "load core file with args" {
+    -re "^Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Also, the argument list should be available through 'show args'.
+gdb_test "show args" \
+    "Argument list to give program being debugged when it is started is \"$args\"\\."
-- 
2.25.4



* [PATCH 2/5] gdb: parse and set the inferior environment from core files
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-26 11:11 ` [PATCH 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
@ 2024-10-26 11:11 ` Andrew Burgess
  2024-10-26 11:11 ` [PATCH 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Extend the core file context parsing mechanism added in the previous
commit to also store the environment parsed from the core file.

This environment can then be injected into the inferior object.

The benefit of this is that when examining a core file in GDB, the
'show environment' command will now show the environment extracted
from a core file.

Consider this example:

  $ env -i GDB_TEST_VAR=FOO ./gen-core
  Segmentation fault (core dumped)
  $ gdb -c ./core.1669829
  ...
  [New LWP 1669829]
  Core was generated by `./gen-core'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x0000000000401111 in ?? ()
  (gdb) show environment
  GDB_TEST_VAR=FOO
  (gdb)

There's a new test for this functionality.
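
As a rough standalone illustration of the splitting rule used by
core_file_exec_context::environment (split each string at its first
'=', skip strings with no '='), here is a minimal sketch.  The names
env_map and parse_environment are hypothetical, and a std::map stands
in for gdb_environ; this is not the code in the patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for gdb_environ: a plain name -> value map.
using env_map = std::map<std::string, std::string>;

// Split each "NAME=VALUE" string at its first '=' and record the pair.
// Entries that contain no '=' at all are skipped.
static env_map
parse_environment (const std::vector<std::string> &entries)
{
  env_map result;
  for (const std::string &entry : entries)
    {
      std::string::size_type eq = entry.find ('=');
      if (eq == std::string::npos)
        continue;
      result[entry.substr (0, eq)] = entry.substr (eq + 1);
    }
  return result;
}
```

Only the first '=' is significant, so a value that itself contains
'=' characters is kept whole.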
---
 gdb/arch-utils.c                              | 26 ++++++++
 gdb/arch-utils.h                              | 13 +++-
 gdb/corelow.c                                 |  3 +
 gdb/linux-tdep.c                              |  6 +-
 .../gdb.base/corefile-exec-context.exp        | 63 +++++++++++++++++++
 5 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
index 6ffa4109765..567dc87d9dd 100644
--- a/gdb/arch-utils.c
+++ b/gdb/arch-utils.c
@@ -1499,6 +1499,32 @@ gdbarch_initialized_p (gdbarch *arch)
   return arch->initialized_p;
 }
 
+/* See arch-utils.h.  */
+
+gdb_environ
+core_file_exec_context::environment () const
+{
+  gdb_environ e;
+
+  for (const auto &entry : m_environment)
+    {
+      char *eq = strchr (entry.get (), '=');
+
+      /* If there's no '=' character, then skip this entry.  */
+      if (eq == nullptr)
+	continue;
+
+      const char *value = eq + 1;
+      const char *var = entry.get ();
+
+      *eq = '\0';
+      e.set (var, value);
+      *eq = '=';
+    }
+
+  return e;
+}
+
 void _initialize_gdbarch_utils ();
 void
 _initialize_gdbarch_utils ()
diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 8d9f1625bdd..1c33bfb4704 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -21,6 +21,7 @@
 #define ARCH_UTILS_H
 
 #include "gdbarch.h"
+#include "gdbsupport/environ.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -88,9 +89,11 @@ struct core_file_exec_context
      found but not ARGV then use the no-argument constructor to create an
      empty context object.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
-			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
-      m_arguments (std::move (argv))
+      m_arguments (std::move (argv)),
+      m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
   }
@@ -115,6 +118,9 @@ struct core_file_exec_context
   const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
   { return m_arguments; }
 
+  /* Return the environment variables from this context.  */
+  gdb_environ environment () const;
+
 private:
 
   /* The executable filename as reported in the core file.  Can be nullptr
@@ -124,6 +130,9 @@ struct core_file_exec_context
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+
+  /* List of environment strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_environment;
 };
 
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5cc11d71b7b..a0129f84b1c 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -1014,6 +1014,9 @@ core_target_open (const char *arg, int from_tty)
 	argv.push_back (a.get ());
       gdb::array_view<char * const> view (argv.data (), argv.size ());
       current_inferior ()->set_args (view);
+
+      /* And now copy the environment.  */
+      current_inferior ()->environment = ctx.environment ();
     }
   else
     {
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index d1937970be7..d981824f081 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2067,8 +2067,7 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
      be pointing at the first environment string.  */
   ptr += ptr_bytes;
 
-  /* Parse the environment strings.  Nothing is done with this yet, but
-     will be in a later commit.  */
+  /* Parse the environment strings.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> environment;
   while ((v = deref (ptr)) != 0)
     {
@@ -2085,7 +2084,8 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
     return {};
 
   return core_file_exec_context (std::move (execfn),
-				 std::move (arguments));
+				 std::move (arguments),
+				 std::move (environment));
 }
 
 /* Parse and return execution context details from core file CBFD.  */
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index b18a8104779..ac97754fe71 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -100,3 +100,66 @@ gdb_test_multiple "core-file $corefile_2" "load core file with args" {
 # Also, the argument list should be available through 'show args'.
 gdb_test "show args" \
     "Argument list to give program being debugged when it is started is \"$args\"\\."
+
+# Find the name of an environment variable that is not set.
+set env_var_base "GDB_TEST_ENV_VAR_"
+set env_var_name ""
+
+for { set i 0 } { $i < 10 } { incr i } {
+    set tmp_name ${env_var_base}${i}
+    if { ! [info exists ::env($tmp_name)] } {
+	set env_var_name $tmp_name
+	break
+    }
+}
+
+if { $env_var_name eq "" } {
+    unsupported "couldn't find suitable environment variable name"
+    return -1
+}
+
+# Generate a core file with this environment variable set.
+set env_var_value "TEST VALUE"
+save_vars { ::env($env_var_name) } {
+    setenv $env_var_name $env_var_value
+
+    set corefile [core_find $binfile {} $args]
+    if {$corefile == ""} {
+	untested "unable to create corefile"
+	return 0
+    }
+}
+set corefile_3 "$binfile.3.core"
+remote_exec build "mv $corefile $corefile_3"
+
+# Restart, load the core file, and check the environment variable
+# shows up.
+clean_restart $binfile
+
+# Check for environment variable VAR_NAME in the environment, its
+# value should be VAR_VALUE.
+proc check_for_env_var { var_name var_value } {
+    set saw_var false
+    gdb_test_multiple "show environment" "" {
+	-re "^$var_name=$var_value\r\n" {
+	    set saw_var true
+	    exp_continue
+	}
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+	-re "^$::gdb_prompt $" {
+	}
+    }
+    return $saw_var
+}
+
+gdb_assert { ![check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is not set before core file load"
+
+gdb_test "core-file $corefile_3" \
+    "Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n.*" \
+    "load core file for environment test"
+
+gdb_assert { [check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is set after core file load"
-- 
2.25.4



* [PATCH 3/5] gdb/testsuite: make some of the core file / build-id tests harder
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-26 11:11 ` [PATCH 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
  2024-10-26 11:11 ` [PATCH 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
@ 2024-10-26 11:11 ` Andrew Burgess
  2024-10-26 11:11 ` [PATCH 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

We have a few tests that load core files, which depend on GDB not
auto-loading the executable that matches the core file.  One of these
tests (corefile-buildid.exp) exercises GDB's ability to load the
executable via the build-id links in the debug directory, while the
other two tests are just written assuming that GDB hasn't auto-loaded
the executable.

In the next commit, GDB is going to get better at finding the
executable for a core file, and as a consequence these tests could
start to fail if the testsuite is being run using a compiler that adds
build-ids by default, and is on a target (currently only Linux) with
the improved executable auto-loading.

To avoid these test failures, this commit updates some of the tests.

coredump-filter.exp and corefile.exp are updated to unload the
executable should it be auto-loaded.  This means that the following
output from GDB will match the expected patterns.  If the executable
wasn't auto-loaded then the new step to unload is harmless.

The corefile-buildid.exp test needed some more significant changes.
For this test it is important that the executable be moved aside so
that GDB can't locate it, but we do still need the executable around
somewhere, so that the debug directory can link to it.  The point of
the test is that the executable _should_ be auto-loaded, but using the
debug directory, not using GDB's context parsing logic.
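
For readers unfamiliar with the debug directory layout this test
relies on, the sketch below shows how a build-id maps to the
'.build-id/xx/xxxxx' path: the first two hex digits name a
subdirectory and the rest name the file.  The function name
build_id_to_relative_path is hypothetical, and the build-id is assumed
to already be a lowercase hex string.

```cpp
#include <string>

// Return the path, relative to a debug directory, under which a file
// with build-id BUILD_ID_HEX would be looked up.  SUFFIX is "" for the
// executable itself, or ".debug" for separate debug information.
// Returns "" if the build-id is too short to split.
static std::string
build_id_to_relative_path (const std::string &build_id_hex,
                           const std::string &suffix)
{
  if (build_id_hex.size () < 3)
    return "";
  return ".build-id/" + build_id_hex.substr (0, 2) + "/"
         + build_id_hex.substr (2) + suffix;
}
```

In the test, the entry at this path is either a copy of the
executable or a symlink back to it.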

While looking at this test I noticed two additional problems, first we
were creating the core file more times than we needed.  We only need
to create one core file for each test binary (total two), while we
previously created one core file for each style of debug info
directory (total four).  The extra core files should be identical, and
were just overwriting each other, harmless, but still pointless work.

The other problem is that after running an earlier test we modified
the test binary in order to run a later test.  This means it's not
possible to manually re-run the first test as the binary for that test
is destroyed.

As part of the rewrite in this commit I've addressed these issues.

This commit does change many of the test names, but there should be no
real changes in what is being tested after this commit.  However, when
the next commit is added, and GDB gets better at auto-loading the
executable for a core file, these tests should still be testing what
is expected.
---
 gdb/testsuite/gdb.base/coredump-filter.exp  |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp | 252 +++++++++-----------
 gdb/testsuite/gdb.base/corefile.exp         |   9 +
 3 files changed, 130 insertions(+), 148 deletions(-)

diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
index 0c1fc7c2dd6..18c3505172b 100644
--- a/gdb/testsuite/gdb.base/coredump-filter.exp
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -105,14 +105,23 @@ proc test_disasm { core address should_fail } {
 	    return
 	}
 
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	if { $should_fail == 1 } {
 	    remote_exec host "mv -f $hide_binfile $binfile"
-	    gdb_test "x/i \$pc" "=> $hex:\tCannot access memory at address $hex" \
-		"disassemble function with corefile and without a binary"
+	    set re "Cannot access memory at address $hex"
 	} else {
-	    gdb_test "x/i \$pc" "=> $hex:\t\[^C\].*" \
-		"disassemble function with corefile and without a binary"
+	    set re "\[^C\].*"
 	}
+
+	gdb_test "x/i \$pc" "=> $hex:\t${re}" \
+	    "disassemble function with corefile and without a binary"
     }
 
     with_test_prefix "with binary" {
diff --git a/gdb/testsuite/gdb.base/corefile-buildid.exp b/gdb/testsuite/gdb.base/corefile-buildid.exp
index fc54cf201d9..377ae802239 100644
--- a/gdb/testsuite/gdb.base/corefile-buildid.exp
+++ b/gdb/testsuite/gdb.base/corefile-buildid.exp
@@ -19,71 +19,72 @@
 
 # Build-id-related tests for core files.
 
-standard_testfile
+standard_testfile .c -shlib-shr.c -shlib.c
 
-# Build a non-shared executable.
+# Create a corefile from PROGNAME.  Return the name of the generated
+# corefile, or the empty string if anything goes wrong.
+#
+# The generated corefile must contain a buildid for PROGNAME.  If it
+# doesn't then an empty string will be returned.
+proc create_core_file { progname } {
+    # Generate a corefile.
+    set corefile [core_find $progname]
+    if {$corefile == ""} {
+	untested "could not generate core file"
+	return ""
+    }
+    verbose -log "corefile is $corefile"
+
+    # Check the corefile has a build-id for the executable.
+    if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
+	set line [lindex [split $output "\n"] 0]
+	set binfile_re (?:[string_to_regexp $progname]|\\\[(?:exe|pie)\\\])
+	if { ![regexp "^${::hex}\\+${::hex} \[a-f0-9\]+@${::hex}.*$binfile_re$" $line] } {
+	    unsupported "no build-id for executable in corefile"
+	    return ""
+	}
+    } else {
+	unsupported "eu-unstrip tool failed"
+	return ""
+    }
 
-proc build_corefile_buildid_exec {} {
-    global testfile srcfile binfile execdir
+    return $corefile
+}
 
-    if {[build_executable $testfile.exp $testfile $srcfile debug] == -1} {
-	untested "failed to compile"
-	return false
-    }
 
-    # Move executable to non-default path.
-    set builddir [standard_output_file $execdir]
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile [file join $builddir [file tail $binfile]]"
+# Build a non-shared executable.
 
-    return true
+proc build_corefile_buildid_exec { progname } {
+    return [expr {[build_executable "build non-shared exec" $progname $::srcfile] != -1}]
 }
 
 # Build a shared executable.
 
-proc build_corefile_buildid_shared {} {
-    global srcdir subdir testfile binfile srcfile sharedir
-
-    set builddir [standard_output_file $sharedir]
-
+proc build_corefile_buildid_shared { progname } {
     # Compile DSO.
-    set srcdso [file join $srcdir $subdir $testfile-shlib-shr.c]
-    set objdso [standard_output_file $testfile-shlib-shr.so]
-    if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
-	untested "failed to compile dso"
+    set objdso [standard_output_file $::testfile-shlib-shr.so]
+    if {[build_executable "build dso" $objdso $::srcfile2 {debug shlib}] == -1} {
 	return false
     }
 
+
     # Compile shared library.
-    set srclib [file join $srcdir $subdir $testfile-shlib.c]
-    set libname lib$testfile.so
+    set srclib $::srcfile3
+    set libname lib$::testfile.so
     set objlib [standard_output_file $libname]
-    set dlopen_lib [shlib_target_file \
-			[file join $builddir [file tail $objdso]]]
-    set opts [list debug shlib_load \
+    set dlopen_lib [shlib_target_file $objdso]
+    set opts [list debug shlib_load shlib \
 		  additional_flags=-DSHLIB_NAME=\"$dlopen_lib\"]
-    if {[gdb_compile_shlib $srclib $objlib $opts] != ""} {
-	untested "failed to compile shared library"
+    if {[build_executable "build solib" $objlib $::srcfile3 $opts] == -1} {
 	return false
     }
 
     # Compile main program.
-    set srcexec [file join $srcdir $subdir $srcfile]
-    set binfile [standard_output_file $testfile-shared]
     set opts [list debug shlib=$objlib additional_flags=-DTEST_SHARED]
-    if {[gdb_compile $srcexec $binfile executable $opts] != ""} {
-	untested "failed to compile shared executable"
+    if {[build_executable "build shared exec" $progname $::srcfile $opts] == -1} {
 	return false
     }
 
-    # Move objects to non-default path.
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile $builddir"
-    remote_exec build "mv $objdso  $builddir"
-    remote_exec build "mv $objlib $builddir"
-
     return true
 }
 
@@ -154,37 +155,43 @@ proc check_exec_file {file} {
 # SHARED is a boolean indicating whether we are testing the shared
 # library core dump test case.
 
-proc locate_exec_from_core_build_id {corefile buildid suffix \
+proc locate_exec_from_core_build_id {corefile buildid \
+					 dirname progname \
 					 sepdebug symlink shared} {
-    global testfile binfile srcfile
-
     clean_restart
 
     # Set up the build-id directory and symlink the binary there.
+    set d "debugdir"
+    if {$shared} {
+	set d "${d}_shared"
+    } else {
+	set d "${d}_not-shared"
+    }
     if {$symlink} {
-	set d "symlinkdir"
+	set d "${d}_symlink"
     } else {
-	set d "debugdir"
+	set d "${d}_copy"
     }
-    set debugdir [standard_output_file $d-$suffix]
-    remote_exec build "rm -rf $debugdir"
+    if {$sepdebug} {
+	set d "${d}_stripped"
+    } else {
+	set d "${d}_not-stripped"
+    }
+
+    set debugdir [standard_output_file $d]
     remote_exec build \
 	"mkdir -p [file join $debugdir [file dirname $buildid]]"
 
     set files_list {}
-    lappend files_list $binfile $buildid
+    lappend files_list [file join $dirname [file tail $progname]] \
+	$buildid
     if {$sepdebug} {
-	lappend files_list "$binfile.debug" "$buildid.debug"
-    }
-    if {$shared} {
-	global sharedir
-	set builddir [standard_output_file $sharedir]
-    } else {
-	global execdir
-	set builddir [standard_output_file $execdir]
+	lappend files_list [file join $dirname [file tail $progname]].debug \
+	    "$buildid.debug"
     }
+
     foreach {target name} $files_list {
-	set t [file join $builddir [file tail $target]]
+	set t [file join $dirname [file tail $target]]
 	if {$symlink} {
 	    remote_exec build "ln -s $t [file join $debugdir $name]"
 	} else {
@@ -198,109 +205,66 @@ proc locate_exec_from_core_build_id {corefile buildid suffix \
     gdb_test "core-file $corefile" "Program terminated with .*" \
 	"load core file"
     if {$symlink} {
-	set expected_file [file join $builddir [file tail $binfile]]
+	set expected_file [file join $dirname [file tail $progname]]
     } else {
 	set expected_file $buildid
     }
     check_exec_file [file join $debugdir $expected_file]
 }
 
-# Run a build-id tests on a core file.
-# Supported options: "-shared" and "-sepdebug" for running tests
-# of shared and/or stripped/.debug executables.
-
-proc do_corefile_buildid_tests {args} {
-    global binfile testfile srcfile execdir sharedir hex
-
-    # Parse options.
-    parse_args [list {sepdebug} {shared}]
+foreach_with_prefix mode { exec shared } {
+    # Build the executable.
+    set progname ${binfile}-$mode
+    set build_proc build_corefile_buildid_${mode}
+    if { ![$build_proc $progname] } {
+	return -1
+    }
 
-    # PROGRAM to run to generate core file.  This could be different
-    # than the program that was originally built, e.g., for a stripped
-    # executable.
-    if {$shared} {
-	set builddir [standard_output_file $sharedir]
-    } else {
-	set builddir [standard_output_file $execdir]
+    # Generate a corefile.
+    set corefile [create_core_file $progname]
+    if { $corefile eq "" } {
+	return -1
     }
-    set program_to_run [file join $builddir [file tail $binfile]]
 
-    # A list of suffixes to use to describe the test and the .build-id
-    # directory for the test.  The suffix will be used, joined with spaces,
-    # to prefix all tests for the given run.  It will be used, joined with
-    # dashes, to create a unique build-id directory.
-    set suffix {}
-    if {$shared} {
-	lappend suffix "shared"
-    } else {
-	lappend suffix "exec"
+    # Get the build-id filename without ".debug" on the end.  This
+    # will have the format: '.build-id/xx/xxxxx'
+    set buildid [build_id_debug_filename_get $progname ""]
+    if {$buildid == ""} {
+	untested "binary has no build-id"
+	return
     }
+    verbose -log "build-id is $buildid"
 
-    if {$sepdebug} {
-	# Strip debuginfo into its own file.
-	if {[gdb_gnu_strip_debug [standard_output_file $program_to_run] \
-		 no-debuglink] != 0} {
-	    untested "could not strip executable  for [join $suffix \ ]"
-	    return
-	}
+    # Create a directory for the non-stripped test.
+    set combined_dirname [standard_output_file ${mode}_non-stripped]
+    remote_exec build "mkdir -p $combined_dirname"
+    remote_exec build "cp $progname $combined_dirname"
 
-	lappend suffix "sepdebug"
+    # Create a directory for the stripped test.
+    if {[gdb_gnu_strip_debug [standard_output_file $progname] no-debuglink] != 0} {
+	untested "could not strip executable for $mode"
+	return
     }
-
-    with_test_prefix "[join $suffix \ ]" {
-	# Find the core file.
-	set corefile [core_find $program_to_run]
-	if {$corefile == ""} {
-	    untested "could not generate core file"
-	    return
-	}
-	verbose -log "corefile is $corefile"
-
-	if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
-	    set line [lindex [split $output "\n"] 0]
-	    set binfile_re (?:[string_to_regexp $program_to_run]|\\\[(?:exe|pie)\\\])
-	    if { ![regexp "^${hex}\\+${hex} \[a-f0-9\]+@${hex}.*$binfile_re$" $line] } {
-		unsupported "build id for exec"
-		return
-	    }
+    set sepdebug_dirname [standard_output_file ${mode}_stripped]
+    remote_exec build "mkdir -p $sepdebug_dirname"
+    remote_exec build "mv $progname $sepdebug_dirname"
+    remote_exec build "mv ${progname}.debug $sepdebug_dirname"
+
+    # Now do the actual testing part.  Fill out a debug directory with
+    # build-id related files (copies or symlinks) and then load the
+    # corefile.  Check GDB finds the executable and debug information
+    # via the build-id related debug directory contents.
+    foreach_with_prefix sepdebug { false true } {
+	if { $sepdebug } {
+	    set dirname $sepdebug_dirname
 	} else {
-	    unsupported "eu-unstrip execution"
-	    return
-	}
-
-	# Get the build-id filename without ".debug" on the end.  This
-	# will have the format: '.build-id/xx/xxxxx'
-	set buildid [build_id_debug_filename_get $program_to_run ""]
-	if {$buildid == ""} {
-	    untested "binary has no build-id"
-	    return
+	    set dirname $combined_dirname
 	}
-	verbose -log "build-id is $buildid"
-
-	locate_exec_from_core_build_id $corefile $buildid \
-	    [join $suffix -] $sepdebug false $shared
 
-	with_test_prefix "symlink" {
+	foreach_with_prefix symlink { false true } {
 	    locate_exec_from_core_build_id $corefile $buildid \
-		[join $suffix -] $sepdebug true $shared
+		$dirname $progname \
+		$sepdebug $symlink [expr {$mode eq "shared"}]
 	}
     }
 }
-
-# Directories where executables will be moved before testing.
-set execdir "build-exec"
-set sharedir "build-shared"
-
-#
-# Do tests
-#
-
-build_corefile_buildid_exec
-do_corefile_buildid_tests
-do_corefile_buildid_tests -sepdebug
-
-if {[allow_shlib_tests]} {
-    build_corefile_buildid_shared
-    do_corefile_buildid_tests -shared
-    do_corefile_buildid_tests -shared -sepdebug
-}
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index dc3c8b1dfc8..2111aa66d7d 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -348,6 +348,15 @@ proc corefile_test_attach {} {
 	gdb_start
 
 	gdb_test "core-file $corefile" "Core was generated by .*" "attach: load core again"
+
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	gdb_test "info files" "\r\nLocal core dump file:\r\n.*" "attach: sanity check we see the core file"
 
 	gdb_test "attach $pid" "Attaching to process $pid\r\n.*" "attach: with core"
-- 
2.25.4



* [PATCH 4/5] gdb: improve GDB's ability to auto-load the exec for a core file
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                   ` (2 preceding siblings ...)
  2024-10-26 11:11 ` [PATCH 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
@ 2024-10-26 11:11 ` Andrew Burgess
  2024-10-26 11:11 ` [PATCH 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

GDB already has a limited mechanism for auto-loading the executable
corresponding to a core file; this can be found in the function
locate_exec_from_corefile_build_id in corelow.c.

However, this approach uses the build-id of the core file to look in
either the debug directory (for a symlink back to the executable) or
by asking debuginfod.  This is great, and works fine if the core file
is from a "system" binary, but often, when I'm debugging a core file, it's
part of my development cycle, so there's no build-id symlink in the
debug directory, and debuginfod doesn't know about the binary either,
so GDB can't auto-load the executable...

... but the executable is right there!

This commit builds on the earlier commits in this series to make GDB
smarter.

On GNU/Linux, when we parse the execution context from the core
file (see linux-tdep.c), we already grab the command pointed to by
AT_EXECFN.  If this is an absolute path then GDB can use this to
locate the executable, a build-id check ensures we've found the
correct file.  With this small change GDB suddenly becomes a lot
better at auto-loading the executable for a core file.

But we can do better!  Often the AT_EXECFN is not an absolute path.

If it is a relative path then we check for this path relative to the
core file.  This helps if a user does something like:

  $ ./build/bin/some_prog
  Aborted (core dumped)
  $ gdb -c corefile

In this case the core file in the current directory will have an
AT_EXECFN value of './build/bin/some_prog', so if we look for that
path relative to the location of the core file this might result in a
hit, again, a build-id check ensures we found the right file.

But we can do better still!  What if the user moves the core file?  Or
the user is using some tool to manage core files (e.g. the systemd
core file management tool), and the user downloads the core file to a
location from which the relative path no longer works?

Well in this case we can make use of the core file's mapped file
information (the NT_FILE note).  The executable will be included in
the mapped file list, and the path within the mapped file list will be
an absolute path.  We can search for mapped file information based on
an address within the mapped file, and the auxv vector happens to
include an AT_ENTRY value, which is the entry address in the main
executable.  If we look up the mapped file containing this address
we'll have the absolute path to the main executable, a build-id check
ensures this really is the file we're looking for.
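
The address-based lookup described above can be pictured with the
following sketch.  The mapped_region struct and find_mapped_file
function are hypothetical simplifications of the NT_FILE data, not
GDB's own data structures; the idea is just to find the mapping whose
address range contains the AT_ENTRY value.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of one entry in the mapped file list.
struct mapped_region
{
  std::uint64_t start;   // First address of the mapping.
  std::uint64_t end;     // One past the last address of the mapping.
  std::string filename;  // Absolute path recorded in the core file.
};

// Return the filename of the mapping that contains ADDR, or an empty
// optional if no mapping covers that address.
static std::optional<std::string>
find_mapped_file (const std::vector<mapped_region> &regions,
                  std::uint64_t addr)
{
  for (const mapped_region &r : regions)
    if (addr >= r.start && addr < r.end)
      return r.filename;
  return std::nullopt;
}
```

Passing the AT_ENTRY value as ADDR would yield the path of the main
executable, which is then still subject to the build-id check.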

It might be tempting to jump straight to the third approach, however,
there is one small downside to the third approach: if the executable
is a symlink then the AT_EXECFN string will be the name of the
symlink, that is, the thing the user asked to run.  The mapped file
entry will be the name of the actual file, i.e. the symlink target.
When we auto-load the executable based on the third approach, the file
loaded might have a different name to that which the user expects,
though the build-id check (almost) guarantees that we've loaded the
correct binary.

But there's one more thing we can check for!

If the user has placed the core file and the executable into a
directory together, for example, as might happen with a bug report,
then neither the absolute path check, nor the relative path check
will find the executable.  So GDB will also look for a file with the
right name in the same directory as the core file.  Again, a build-id
check is performed to ensure we find the correct file.

Of course, it's still possible that GDB is unable to find the
executable using any of these approaches.  In this case, nothing
changes, GDB will check in the debug info directory for a build-id
based link back to the executable, and if that fails, GDB will ask
debuginfod for the executable.  If this all fails, then, as usual, the
user is able to load the correct executable with the 'file' command,
but hopefully, this should be needed far less from now on.
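
For reference, the order in which the candidate file names are tried
can be sketched as below.  This is a simplified stand-alone model
(C++17), not the code from corelow.c; the helper names dir_of, base_of
and candidate_exec_paths are invented for the example, and only '/'
is treated as a directory separator.  Each candidate is accepted only
if its build-id matches, and the existing build-id / debuginfod lookup
runs only if none of them does:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Directory part of PATH ("." when PATH has no directory component).
static std::string
dir_of (const std::string &path)
{
  std::string::size_type pos = path.rfind ('/');
  if (pos == std::string::npos)
    return ".";
  return pos == 0 ? std::string ("/") : path.substr (0, pos);
}

// Final component of PATH.
static std::string
base_of (const std::string &path)
{
  std::string::size_type pos = path.rfind ('/');
  return pos == std::string::npos ? path : path.substr (pos + 1);
}

// Build the list of file names to try, in order.  EXEC_NAME is the
// command recorded in the core file, EXEC_FILENAME is the absolute name
// recovered from the mapped file list (if any).
static std::vector<std::string>
candidate_exec_paths (const std::string &core_path,
                      const std::string &exec_name,
                      const std::optional<std::string> &exec_filename)
{
  assert (!exec_name.empty ());
  const std::string core_dir = dir_of (core_path);
  std::vector<std::string> out;

  // 1. The recorded name as-is if absolute, else relative to the
  //    directory holding the core file.
  if (exec_name.front () == '/')
    out.push_back (exec_name);
  else
    out.push_back (core_dir + '/' + exec_name);

  // 2. A file with the same base name next to the core file.
  out.push_back (core_dir + '/' + base_of (exec_name));

  // 3. The absolute name from the mapped file list, when known.
  if (exec_filename.has_value ())
    out.push_back (*exec_filename);

  return out;
}
```

For a core file at /tmp/work/core.1 generated by running
../reldir/prog, this model yields /tmp/work/../reldir/prog, then
/tmp/work/prog, and finally the mapped file name if one was found.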
---
 gdb/arch-utils.h                              |  25 +-
 gdb/corelow.c                                 | 141 ++++++++--
 gdb/linux-tdep.c                              |  22 ++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 242 ++++++++++++++++++
 5 files changed, 438 insertions(+), 17 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 1c33bfb4704..fb4a3ef9c5b 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -22,6 +22,7 @@
 
 #include "gdbarch.h"
 #include "gdbsupport/environ.h"
+#include "filenames.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -87,15 +88,23 @@ struct core_file_exec_context
      never be nullptr.  Only call this constructor if all the arguments
      have been collected successfully, i.e. if the EXEC_NAME could be
      found but not ARGV then use the no-argument constructor to create an
-     empty context object.  */
+     empty context object.
+
+     The EXEC_FILENAME must be the absolute filename of the executable
+     that generated this core file, or nullptr if the absolute filename
+     is not known.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  gdb::unique_xmalloc_ptr<char> exec_filename,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
+      m_exec_filename (std::move (exec_filename)),
       m_arguments (std::move (argv)),
       m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
+    gdb_assert (m_exec_filename == nullptr
+		|| IS_ABSOLUTE_PATH (m_exec_filename.get ()));
   }
 
   /* Create a default context object.  In its default state a context
@@ -112,6 +121,13 @@ struct core_file_exec_context
   const char *execfn () const
   { return m_exec_name.get (); }
 
+  /* Return the absolute path to the executable if known.  This might
+     return nullptr even when execfn() returns a non-nullptr value.
+     Additionally, the file referenced here might have a different name
+     than the file returned by execfn if execfn is a symbolic link.  */
+  const char *exec_filename () const
+  { return m_exec_filename.get (); }
+
   /* Return the vector of inferior arguments as extracted from the core
      file.  This does not include argv[0] (the executable name) for that
      see the execfn() function.  */
@@ -127,6 +143,13 @@ struct core_file_exec_context
      if no executable name is found.  */
   gdb::unique_xmalloc_ptr<char> m_exec_name;
 
+  /* Full filename to the executable that was actually executed.  The name
+     within EXEC_FILENAME might not match what the user typed, e.g. if the
+     user typed ./symlinked_name which is a symlink to /tmp/real_name then
+     this is going to contain '/tmp/real_name' while EXEC_NAME above will
+     contain './symlinked_name'.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_filename;
+
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
diff --git a/gdb/corelow.c b/gdb/corelow.c
index a0129f84b1c..272b86b6f33 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -828,18 +828,117 @@ rename_vmcore_idle_reg_sections (bfd *abfd, inferior *inf)
 	     replacement_lwpid_str.c_str ());
 }
 
+/* Use CTX to try and find (and open) the executable file for the core file
+   CBFD.  BUILD_ID is the build-id for CBFD which was already extracted by
+   our caller.
+
+   Will return the opened executable or nullptr if the executable couldn't
+   be found.  */
+
+static gdb_bfd_ref_ptr
+locate_exec_from_corefile_exec_context (bfd *cbfd,
+					const bfd_build_id *build_id,
+					const core_file_exec_context &ctx)
+{
+  /* CTX must be valid, and a valid context has an execfn() string.  */
+  gdb_assert (ctx.valid ());
+  gdb_assert (ctx.execfn () != nullptr);
+
+  /* EXEC_NAME will be the command used to start the inferior.  This might
+     not be an absolute path (but could be).  */
+  const char *exec_name = ctx.execfn ();
+
+  /* Function to open FILENAME and check if its build-id matches BUILD_ID
+     from this enclosing scope.  Returns the open BFD for filename if the
+     FILENAME has a matching build-id, otherwise, returns nullptr.  */
+  const auto open_and_check_build_id
+    = [&build_id] (const char *filename) -> gdb_bfd_ref_ptr
+  {
+    /* Try to open a file.  If this succeeds then we still need to perform
+       a build-id check.  */
+    gdb_bfd_ref_ptr execbfd = gdb_bfd_open (filename, gnutarget);
+
+    /* We managed to open a file, but if its build-id doesn't match
+       BUILD_ID then we just cannot trust it's the right file.  */
+    if (execbfd != nullptr)
+      {
+	const bfd_build_id *other_build_id = build_id_bfd_get (execbfd.get ());
+
+	if (other_build_id == nullptr
+	    || !build_id_equal (other_build_id, build_id))
+	  execbfd = nullptr;
+      }
+
+    return execbfd;
+  };
+
+  gdb_bfd_ref_ptr execbfd;
+
+  /* If EXEC_NAME is absolute then try to open it now.  Otherwise, see if
+     EXEC_NAME is a relative path from the location of the core file.  This
+     is just a guess, the executable might not be here, but we still rely
+     on a build-id match in order to accept any executable we find; we
+     don't accept something just because it happens to be in the right
+     location.  */
+  if (IS_ABSOLUTE_PATH (exec_name))
+    execbfd = open_and_check_build_id (exec_name);
+  else
+    {
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + exec_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If we haven't found the executable yet, then try checking to see if
+     the executable is in the same directory as the core file.  Again,
+     there's no reason why this should be the case, but it's worth a try,
+     and the build-id check should ensure we don't use an invalid file if
+     we happen to find one.  */
+  if (execbfd == nullptr)
+    {
+      const char *base_name = lbasename (exec_name);
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + base_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If the above didn't provide EXECBFD then try the exec_filename from
+     the context.  This will be an absolute filename which the gdbarch code
+     figured out from the core file.  In some cases the gdbarch code might
+     not be able to figure out a suitable absolute filename though.  */
+  if (execbfd == nullptr && ctx.exec_filename () != nullptr)
+    {
+      gdb_assert (IS_ABSOLUTE_PATH (ctx.exec_filename ()));
+
+      /* Try to open a file.  If this succeeds then we still need to
+	 perform a build-id check.  */
+      execbfd = open_and_check_build_id (ctx.exec_filename ());
+    }
+
+  return execbfd;
+}
+
 /* Locate (and load) an executable file (and symbols) given the core file
    BFD ABFD.  */
 
 static void
-locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
+locate_exec_from_corefile_build_id (bfd *abfd,
+				    const core_file_exec_context &ctx,
+				    int from_tty)
 {
   const bfd_build_id *build_id = build_id_bfd_get (abfd);
   if (build_id == nullptr)
     return;
 
-  gdb_bfd_ref_ptr execbfd
-    = find_objfile_by_build_id (build_id, abfd->filename);
+  gdb_bfd_ref_ptr execbfd;
+
+  if (ctx.valid ())
+    execbfd = locate_exec_from_corefile_exec_context (abfd, build_id, ctx);
+
+  if (execbfd == nullptr)
+    execbfd = find_objfile_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
@@ -908,13 +1007,6 @@ core_target_open (const char *arg, int from_tty)
 
   validate_files ();
 
-  /* If we have no exec file, try to set the architecture from the
-     core file.  We don't do this unconditionally since an exec file
-     typically contains more information that helps us determine the
-     architecture than a core file.  */
-  if (!current_program_space->exec_bfd ())
-    set_gdbarch_from_file (current_program_space->core_bfd ());
-
   current_inferior ()->push_target (std::move (target_holder));
 
   switch_to_no_thread ();
@@ -969,9 +1061,31 @@ core_target_open (const char *arg, int from_tty)
       switch_to_thread (thread);
     }
 
+  /* In order to parse the exec context from the core file the current
+     inferior needs to have a suitable gdbarch set.  If an exec file is
+     loaded then the gdbarch will have been set based on the exec file, but
+     if not, ensure we have a suitable gdbarch in place now.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    current_inferior ()->set_arch (target->core_gdbarch ());
+
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+
+  /* If we don't have an executable loaded then see if we can locate one
+     based on the core file.  */
   if (current_program_space->exec_bfd () == nullptr)
     locate_exec_from_corefile_build_id (current_program_space->core_bfd (),
-					from_tty);
+					ctx, from_tty);
+
+  /* If we have no exec file, try to set the architecture from the
+     core file.  We don't do this unconditionally since an exec file
+     typically contains more information that helps us determine the
+     architecture than a core file.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    set_gdbarch_from_file (current_program_space->core_bfd ());
 
   post_create_inferior (from_tty);
 
@@ -989,11 +1103,6 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  /* See if the gdbarch can find the executable name and argument list from
-     the core file.  */
-  core_file_exec_context ctx
-    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
-				       current_program_space->core_bfd ());
   if (ctx.valid ())
     {
       std::string args;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index d981824f081..2632d143569 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2083,7 +2083,29 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
   if (execfn == nullptr)
     return {};
 
+  /* When the core-file was loaded GDB processed the file backed mappings
+     (from the NT_FILE note).  One of these should have been for the
+     executable.  The AT_EXECFN string might not be an absolute path, but
+     the path in NT_FILE will be absolute, though if AT_EXECFN is a
+     symlink, then the NT_FILE entry will point to the actual file, not the
+     symlink.
+
+     Use the AT_ENTRY address to look for the NT_FILE entry which contains
+     that address, this should be the executable.  */
+  gdb::unique_xmalloc_ptr<char> exec_filename;
+  CORE_ADDR exec_entry_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_ENTRY, &exec_entry_addr) == 1)
+    {
+      std::optional<core_target_mapped_file_info> info
+	= core_target_find_mapped_file (nullptr, exec_entry_addr);
+      if (info.has_value () && !info->filename ().empty ()
+	  && IS_ABSOLUTE_PATH (info->filename ().c_str ()))
+	exec_filename = make_unique_xstrdup (info->filename ().c_str ());
+    }
+
   return core_file_exec_context (std::move (execfn),
+				 std::move (exec_filename),
 				 std::move (arguments),
 				 std::move (environment));
 }
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.c b/gdb/testsuite/gdb.base/corefile-find-exec.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
new file mode 100644
index 00000000000..40324c1f01c
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -0,0 +1,242 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB's ability to auto-load the executable based on the file
+# names extracted from the core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile {debug build-id}] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Load the COREFILE and confirm that GDB auto-loads the executable.
+# The symbols should be read from SYMBOL_FILE and the core file should
+# be reported as generated by GEN_FROM_FILE.
+proc test_load { corefile symbol_file gen_from_file } {
+    clean_restart
+    set saw_generated_line false
+    set saw_reading_symbols false
+
+    gdb_test_multiple "core-file $corefile" "load core file" {
+
+	-re "^Reading symbols from [string_to_regexp $symbol_file]\\.\\.\\.\r\n" {
+	    set saw_reading_symbols true
+	    exp_continue
+	}
+
+	-re "^Core was generated by `[string_to_regexp $gen_from_file]'\\.\r\n" {
+	    set saw_generated_line true
+	    exp_continue
+	}
+
+	-re "^$::gdb_prompt $" {
+	    gdb_assert { $saw_generated_line && $saw_reading_symbols} \
+		$gdb_test_name
+	}
+
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+    }
+}
+
+with_test_prefix "absolute path" {
+    # Generate a core file, this uses an absolute path to the
+    # executable.
+    with_test_prefix "to file" {
+	set corefile [core_find $binfile]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_1 "$binfile.1.core"
+	remote_exec build "mv $corefile $corefile_1"
+
+	test_load $corefile_1 $binfile $binfile
+    }
+
+    # And create a symlink, and repeat the test using an absolute path
+    # to the symlink.
+    with_test_prefix "to symlink" {
+	set symlink_name "symlink_1"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_2 "$binfile.2.core"
+	remote_exec build "mv $corefile $corefile_2"
+
+	test_load $corefile_2 $symlink $symlink
+    }
+
+    # Like the previous test, except this time, delete the symlink
+    # after generating the core file.  GDB should be smart enough to
+    # figure out that we can use the underlying TESTFILE binary.
+    with_test_prefix "to deleted symlink" {
+	set symlink_name "symlink_2"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_3 "$binfile.3.core"
+	remote_exec build "mv $corefile $corefile_3"
+
+	remote_exec build "rm -f $symlink"
+
+	test_load $corefile_3 $binfile $symlink
+    }
+
+    # Generate the core file with an absolute path to the executable,
+    # but move the core file and executable into a single directory
+    # together so GDB can't use the absolute path to find the
+    # executable.
+    #
+    # GDB should still find the executable though, by looking in the
+    # same directory as the core file.
+    with_test_prefix "in side directory" {
+	set binfile_2 [standard_output_file ${testfile}_2]
+	remote_exec build "cp $binfile $binfile_2"
+
+	set corefile [core_find $binfile_2]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_4 "$binfile.4.core"
+	remote_exec build "mv $corefile $corefile_4"
+
+	set side_dir [standard_output_file side_dir]
+	remote_exec build "mkdir -p $side_dir"
+	remote_exec build "mv $binfile_2 $side_dir"
+	remote_exec build "mv $corefile_4 $side_dir"
+
+	set relocated_corefile_4 [file join $side_dir [file tail $corefile_4]]
+	set relocated_binfile_2 [file join $side_dir [file tail $binfile_2]]
+	test_load $relocated_corefile_4 $relocated_binfile_2 $binfile_2
+    }
+}
+
+with_test_prefix "relative path" {
+    # Generate a core file using a relative path.  We need to work
+    # around the core_find proc a little here.  The core_find proc
+    # creates a sub-directory using standard_output_file and runs the
+    # test binary from inside that directory.
+    #
+    # Usually core_find is passed an absolute path, so there's no
+    # problem, but we want to pass a relative path.
+    #
+    # So setup a directory structure like this:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #
+    # Place a copy of BINFILE in 'reldir/' and switch to workdir, use
+    # core_find which will create a sibling directory of workdir, and
+    # run the relative path from there.  We then move the generated
+    # core file back into 'workdir/', this leaves a tree like:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #      <core file here>
+    #
+    # Now we can ask GDB to open the core file, if all goes well GDB
+    # should make use of the relative path encoded in the core file to
+    # locate the executable in 'reldir/'.
+    #
+    # We also setup a symlink in 'reldir' that points to the
+    # executable and repeat the test, but this time executing the
+    # symlink.
+    set reldir_name "reldir"
+    set reldir [standard_output_file $reldir_name]
+    remote_exec build "mkdir -p $reldir"
+
+    set alt_testfile "alt_${testfile}"
+    set binfile_3 "$reldir/${alt_testfile}"
+    remote_exec build "cp $binfile $binfile_3"
+
+    set symlink_2 "symlink_2"
+    with_cwd $reldir {
+	remote_exec build "ln -s ${alt_testfile} ${symlink_2}"
+    }
+
+    set work_dir [standard_output_file "workdir"]
+    remote_exec build "mkdir -p $work_dir"
+
+    set rel_path_to_file "../${reldir_name}/${alt_testfile}"
+    set rel_path_to_symlink_2 "../${reldir_name}/${symlink_2}"
+
+    with_cwd $work_dir {
+	with_test_prefix "to file" {
+	    set corefile [core_find $rel_path_to_file]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_5 "${work_dir}/${testfile}.5.core"
+	    remote_exec build "mv $corefile $corefile_5"
+
+	    test_load $corefile_5 \
+		[file join $work_dir $rel_path_to_file] \
+		$rel_path_to_file
+	}
+
+	with_test_prefix "to symlink" {
+	    set corefile [core_find $rel_path_to_symlink_2]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_6 "${work_dir}/${testfile}.6.core"
+	    remote_exec build "mv $corefile $corefile_6"
+
+	    test_load $corefile_6 \
+		[file join $work_dir $rel_path_to_symlink_2] \
+		$rel_path_to_symlink_2
+	}
+
+	# Move the core file.  Now the relative path doesn't work so
+	# we instead rely on GDB to use information about the mapped
+	# files to help locate the executable.
+	with_test_prefix "with moved corefile" {
+	    set corefile_7 [standard_output_file "${testfile}.7.core"]
+	    remote_exec build "cp $corefile_6 $corefile_7"
+	    test_load $corefile_7 $binfile_3 $rel_path_to_symlink_2
+	}
+    }
+}
-- 
2.25.4



* [PATCH 5/5] gdb/freebsd: port core file context parsing to FreeBSD
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                   ` (3 preceding siblings ...)
  2024-10-26 11:11 ` [PATCH 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
@ 2024-10-26 11:11 ` Andrew Burgess
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-26 11:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit implements the gdbarch_core_parse_exec_context method for
FreeBSD.

This is much simpler than for Linux.  On FreeBSD, at least the
version (13.x) that I have installed, there are additional entries in
the auxv vector that point directly to the argument and environment
vectors, this makes it trivial to find this information.
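
As a sketch of why this is trivial (a simplified model, not the code
added to fbsd-tdep.c; the read_ptr and read_str callbacks are
hypothetical stand-ins for reads from the core file's memory), once
the address of the vector is known we just walk pointers until we hit
the NULL terminator:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using addr_t = std::uint64_t;

// Walk a NULL-terminated vector of string pointers starting at VEC_ADDR.
// READ_PTR returns the pointer stored at an address, READ_STR returns the
// string stored at an address; both are supplied by the caller.
static std::vector<std::string>
read_string_vector (addr_t vec_addr, unsigned ptr_bytes,
                    const std::function<addr_t (addr_t)> &read_ptr,
                    const std::function<std::string (addr_t)> &read_str)
{
  assert (ptr_bytes == 4 || ptr_bytes == 8);

  std::vector<std::string> result;
  for (addr_t p; (p = read_ptr (vec_addr)) != 0; vec_addr += ptr_bytes)
    result.push_back (read_str (p));
  return result;
}
```

For the argument vector the first string is the command the user ran
and the remaining strings are the inferior arguments; the environment
vector is read the same way.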

If these extra auxv entries are not available on earlier FreeBSD, then
that's fine.  The fallback behaviour will be for GDB to act as it
always has up to this point, you'll just not get the extra
functionality.

Other differences compared to Linux are that FreeBSD has
AT_FREEBSD_EXECPATH instead of AT_EXECFN, the AT_FREEBSD_EXECPATH is
the full path to the executable.  On Linux AT_EXECFN is the command
the user typed, so this can be a relative path.

This difference is handy as on FreeBSD we don't parse the mapped files
from the core file (are they even available?).  So having the EXECPATH
means we can use that as the absolute path to the executable.

However, if the user ran a symlink then AT_FREEBSD_EXECPATH will be
the absolute path to the symlink, not to the underlying file.  This is
probably a good thing, but it does mean there is one case we test on
Linux that fails on FreeBSD.

Consider this scenario: create a symlink to an executable, run the
symlink to generate a core file, then delete the symlink and load the
core file.  On Linux GDB will still find (and open) the original
executable.  This is because we use the mapped file information to
find the absolute path to the executable, and the mapped file
information only stores the real file names, not symlink names.

This is a total edge case, I only added the deleted symlink test
originally because I could see that this would work on Linux.  Though
it is neat that Linux finds this, I don't feel too bad that this fails
on FreeBSD.

Other than this, everything seems to work on x86-64 FreeBSD (13.4)
which is all I have setup right now.  I don't see why other
architectures wouldn't work too, but I haven't tested them.
---
 gdb/fbsd-tdep.c                               | 134 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.exp        |   2 +-
 gdb/testsuite/gdb.base/corefile-find-exec.exp |  12 +-
 3 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/gdb/fbsd-tdep.c b/gdb/fbsd-tdep.c
index e97ff52d5bf..804a72c4205 100644
--- a/gdb/fbsd-tdep.c
+++ b/gdb/fbsd-tdep.c
@@ -33,6 +33,7 @@
 #include "elf-bfd.h"
 #include "fbsd-tdep.h"
 #include "gcore-elf.h"
+#include "arch-utils.h"
 
 /* This enum is derived from FreeBSD's <sys/signal.h>.  */
 
@@ -2361,6 +2362,137 @@ fbsd_vdso_range (struct gdbarch *gdbarch, struct mem_range *range)
   return range->length != 0;
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from CBFD.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file.  The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section *section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Read AT_FREEBSD_ARGV, the address of the argument string vector.  */
+  CORE_ADDR argv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ARGV, &argv_addr) != 1)
+    return {};
+
+  /* Read AT_FREEBSD_ENVV, the address of the environment string vector.  */
+  CORE_ADDR envv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ENVV, &envv_addr) != 1)
+    return {};
+
+  /* Read the AT_FREEBSD_EXECPATH string.  It's OK if we can't get this
+     information.  */
+  gdb::unique_xmalloc_ptr<char> execpath;
+  CORE_ADDR execpath_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_EXECPATH,
+			  &execpath_string_addr) == 1)
+    execpath = target_read_string (execpath_string_addr, INT_MAX);
+
+  /* The byte order.  */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+
+  /* On FreeBSD the command the user ran is found in argv[0].  When we
+     read the first argument we place it into EXECFN.  */
+  gdb::unique_xmalloc_ptr<char> execfn;
+
+  /* Read strings from AT_FREEBSD_ARGV until we find a NULL marker.  The
+     first argument is placed into EXECFN as the command name.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+  CORE_ADDR str_addr;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (argv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      if (execfn == nullptr)
+	execfn = std::move (str);
+      else
+	arguments.emplace_back (std::move (str));
+
+      argv_addr += ptr_bytes;
+    }
+
+  /* Read strings from AT_FREEBSD_ENVV until we find a NULL marker.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (envv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      environment.emplace_back (std::move (str));
+      envv_addr += ptr_bytes;
+    }
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (execpath),
+				 std::move (arguments),
+				 std::move (environment));
+}
+
+/* Parse the execution context from CBFD, tolerating memory errors.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return fbsd_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Return the address range of the vDSO for the current inferior.  */
 
 static int
@@ -2404,4 +2536,6 @@ fbsd_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* `catch syscall' */
   set_xml_syscall_file_name (gdbarch, "syscalls/freebsd.xml");
   set_gdbarch_get_syscall_number (gdbarch, fbsd_get_syscall_number);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       fbsd_corefile_parse_exec_context);
 }
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index ac97754fe71..73e13e60d75 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
index 40324c1f01c..07e660d85e8 100644
--- a/gdb/testsuite/gdb.base/corefile-find-exec.exp
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
@@ -115,6 +115,16 @@ with_test_prefix "absolute path" {
 
 	remote_exec build "rm -f $symlink"
 
+	# FreeBSD is unable to figure out the actual underlying mapped
+	# file, so when the symlink is deleted, FreeBSD is stuck.
+	#
+	# There is some argument that this shouldn't even be a
+	# failure, the user ran the symlink, and if the symlink is
+	# gone, should we really expect GDB to find the underlying
+	# file?  That we can on Linux is really just a quirk of how
+	# the mapped file list works.
+	setup_xfail "*-*-freebsd*"
+
 	test_load $corefile_3 $binfile $symlink
     }
 
-- 
2.25.4



* [PATCHv2 0/5] Better executable auto-loading when opening a core file
  2024-10-26 11:11 [PATCH 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                   ` (4 preceding siblings ...)
  2024-10-26 11:11 ` [PATCH 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
@ 2024-10-28 18:53 ` Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
                     ` (5 more replies)
  5 siblings, 6 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In v2:

  - Fixed an incorrect use of gdb::function_view in patch #1 which was
    causing undefined behaviour, and crashes when GDB was built with
    optimisation.

  - Rebased and retested.

---

There's actually a couple of core file related improvements in this
series.

Patches #1 and #2 improve what information GDB can extract about the
execution context (executable name, inferior arguments, and
environment) when opening a core file.

Then patch #4 improves GDB's ability to auto-load the executable that
matches a core file (on GNU/Linux).

Patch #3 is a testsuite refactor to allow for patch #4.

And patch #5 replicates patch #4, but for FreeBSD.

Thanks,
Andrew

---

Andrew Burgess (5):
  gdb: add gdbarch method to get execution context from core file
  gdb: parse and set the inferior environment from core files
  gdb/testsuite: make some of the core file / build-id tests harder
  gdb: improve GDB's ability to auto-load the exec for a core file
  gdb/freebsd: port core file context parsing to FreeBSD

 gdb/arch-utils.c                              |  26 ++
 gdb/arch-utils.h                              |  89 +++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 | 172 +++++++++-
 gdb/fbsd-tdep.c                               | 134 ++++++++
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 307 ++++++++++++++++++
 gdb/testsuite/gdb.base/coredump-filter.exp    |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp   | 252 ++++++--------
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 165 ++++++++++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 252 ++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp           |   9 +
 17 files changed, 1378 insertions(+), 163 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp


base-commit: a723c56efb07c4f8b3f6a3ed4b878a2f8f5572cc
-- 
2.25.4



* [PATCHv2 1/5] gdb: add gdbarch method to get execution context from core file
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
@ 2024-10-28 18:53   ` Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Add a new gdbarch method which can read the execution context from a
core file.  An execution context, for this commit, means the filename
of the executable used to generate the core file and the arguments
passed to the executable.

In later commits this will be extended further to include the
environment in which the executable was run, but this commit is
already pretty big, so I've split that part out into a later commit.

Initially this new gdbarch method is only implemented for Linux
targets, but a later commit will add FreeBSD support too.

Currently when GDB opens a core file, GDB reports the command and
arguments used to generate the core file.  For example:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./gen-core abc def'.

However, this information comes from the psinfo structure in the core
file, and this struct only allows 80 characters for the command and
arguments combined.  If the command and arguments exceed this then
they are truncated.

Additionally, neither the executable nor the arguments are quoted in
the psinfo structure, so if, for example, the executable was named
'aaa bbb' (i.e. contains white space) and was run with the arguments
'ccc' and 'ddd', then when this core file was opened by GDB we'd see:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./aaa bbb ccc ddd'.

It is impossible to know if 'bbb' is part of the executable filename,
or another argument.
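
To illustrate both problems, here is a minimal sketch of how an unquoted,
fixed-size field loses information.  The helper flatten_command and its
LIMIT parameter are hypothetical, they are not part of GDB or of this
patch, and the default of 80 simply mirrors the limit described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: flatten a command line the way an unquoted,
// fixed-size psinfo-style field would hold it.  LIMIT models the size
// of that field.
std::string
flatten_command (const std::string &exec_name,
                 const std::vector<std::string> &args,
                 std::size_t limit = 80)
{
  assert (limit > 0);

  // Join everything with single spaces, with no quoting at all.
  std::string result = exec_name;
  for (const std::string &a : args)
    {
      result += ' ';
      result += a;
    }

  // Anything beyond the field size is silently dropped.
  if (result.size () > limit)
    result.resize (limit);
  return result;
}
```

With this model, an executable named './aaa bbb' run with 'ccc ddd' and
an executable named './aaa' run with 'bbb ccc ddd' flatten to the same
string, and any command line longer than LIMIT is cut short.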

However, the kernel places the executable command onto the user stack,
this is pointed to by the AT_EXECFN entry in the auxv vector.
Additionally, the inferior arguments are all available on the user
stack.  The new gdbarch method added in this commit extracts this
information from the user stack and allows GDB to access it.
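
As a rough sketch of the first step, finding AT_EXECFN amounts to walking
the (type, value) pairs of the auxv until the type matches.  The code
below is an illustration only: it assumes the auxv has already been
decoded into 64-bit host integers, and find_auxv_entry is a hypothetical
name.  GDB itself uses target_auxv_search, which also handles pointer
size and byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Auxv type codes as used on Linux.  The AUXV_ prefix is only there to
// avoid clashing with the macros from <elf.h>.
constexpr std::uint64_t AUXV_AT_NULL = 0;
constexpr std::uint64_t AUXV_AT_EXECFN = 31;

// Hypothetical helper: AUXV holds alternating type and value words.
// Return the value of the first entry whose type is TYPE, or nothing
// if the terminating AT_NULL entry (or the end of the data) is reached
// first.
std::optional<std::uint64_t>
find_auxv_entry (const std::vector<std::uint64_t> &auxv, std::uint64_t type)
{
  for (std::size_t i = 0; i + 1 < auxv.size (); i += 2)
    {
      if (auxv[i] == type)
        return auxv[i + 1];
      if (auxv[i] == AUXV_AT_NULL)
        break;
    }
  return std::nullopt;
}
```

The value returned for AUXV_AT_EXECFN is an address in the inferior, so
the string itself still has to be read from the core file's memory.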

The information on the stack is writable by the user, so a user
application can start up, edit the arguments, override the AT_EXECFN
string, and then dump core.  In this case GDB will report incorrect
information, however, it is worth noting that the psinfo structure is
also filled (by the kernel) by just copying information from the user
stack, so, if the user edits the on stack arguments, the values
reported in psinfo will change, so the new approach is no worse than
what we currently have.

The benefit of this approach is that GDB gets to report the full
executable name and all the arguments without the 80 character limit,
and GDB is aware which parts are the executable name, and which parts
are arguments, so we can, for example, style the executable name.

Another benefit is that, now we know all the arguments, we can poke
these into the inferior object.  This means that after loading a core
file a user can 'show args' to see the arguments used.  A user could
even transition from core file debugging to live inferior debugging
using, e.g. 'run', and GDB would restart the inferior with the correct
arguments.

Now the downside: finding the AT_EXECFN string is easy, the auxv entry
points directly to it.  However, finding the arguments is a little
trickier.  There's currently no easy way to get a direct pointer to
the arguments.  Instead, I've got a heuristic which I believe should
find the arguments in most cases.  The algorithm is laid out in
linux-tdep.c, I'll not repeat it here, but it's basically a search of
the user stack, starting from AT_EXECFN.
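
To give a flavour of the heuristic without repeating all of it, the
sketch below shows just the final step: locating argc by scanning down
from the null that terminates the argv pointer list, counting entries
until a word equals the count so far.  This is a simplified model, not
the code from linux-tdep.c; the stack is represented as a vector of
already-decoded words with increasing index meaning increasing address,
and find_argc_index is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: STACK[ARGV_NULL_INDEX] is the null terminating
// the argv pointer list.  Walk towards lower addresses, counting the
// pointers passed, until a word is found whose value equals that count;
// this word is taken to be argc.  Return its index, or nothing if a
// null word or the bottom of the data is reached first.
std::optional<std::size_t>
find_argc_index (const std::vector<std::uint64_t> &stack,
                 std::size_t argv_null_index)
{
  if (argv_null_index == 0 || argv_null_index >= stack.size ())
    return std::nullopt;

  std::uint64_t count = 0;
  std::size_t i = argv_null_index;
  while (i > 0)
    {
      --i;
      std::uint64_t val = stack[i];

      // A null here means the layout is not what we expect.  This also
      // rejects the zero argument case.
      if (val == 0)
        return std::nullopt;

      if (val == count)
        return i;

      ++count;
    }
  return std::nullopt;
}
```

Like the real heuristic, this can be fooled if a pointer in the list
happens to have a small integer value, which is one reason the real code
validates the auxv table contents before getting this far.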

If the new heuristic fails then GDB just falls back to the old
approach, asking bfd to read the psinfo structure for us, which gives
the old 80 character limited answer.
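
The selection between the two sources can be pictured as below.  This is
a standalone sketch, not the corelow.c change itself: parsed_context and
generated_by_message are hypothetical names, and PSINFO_COMMAND stands in
for whatever string bfd reports (possibly a null pointer).

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsed execution context.
struct parsed_context
{
  std::string execfn;
  std::vector<std::string> args;
};

// Build the "Core was generated by" line.  Prefer the parsed context;
// otherwise use the psinfo string; if neither is available return an
// empty string, meaning nothing should be printed.
std::string
generated_by_message (const std::optional<parsed_context> &ctx,
                      const char *psinfo_command)
{
  std::string command;
  if (ctx.has_value ())
    {
      command = ctx->execfn;
      for (const std::string &a : ctx->args)
        {
          command += ' ';
          command += a;
        }
    }
  else if (psinfo_command != nullptr)
    command = psinfo_command;
  else
    return {};

  return "Core was generated by `" + command + "'.";
}
```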

For testing, I've run this series on (all GNU/Linux) x86-64, s390,
ppc64le, and the new test passes in each case.
---
 gdb/arch-utils.h                              |  57 ++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 |  38 ++-
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 285 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 102 +++++++
 10 files changed, 571 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 40c62f30a65..8d9f1625bdd 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -74,6 +74,58 @@ struct bp_manipulation_endian
   bp_manipulation_endian<sizeof (BREAK_INSN_LITTLE),		  \
   BREAK_INSN_LITTLE, BREAK_INSN_BIG>
 
+/* Structure returned from gdbarch core_parse_exec_context method.  Wraps
+   the execfn string and a vector containing the inferior arguments.  If a
+   gdbarch is unable to parse this information then an empty structure is
+   returned, check the execfn as an indication, if this is nullptr then no
+   other fields should be considered valid.  */
+
+struct core_file_exec_context
+{
+  /* Constructor, just move everything into place.  The EXEC_NAME should
+     never be nullptr.  Only call this constructor if all the arguments
+     have been collected successfully, i.e. if the EXEC_NAME could be
+     found but not ARGV then use the no-argument constructor to create an
+     empty context object.  */
+  core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+    : m_exec_name (std::move (exec_name)),
+      m_arguments (std::move (argv))
+  {
+    gdb_assert (m_exec_name != nullptr);
+  }
+
+  /* Create a default context object.  In its default state a context
+     object holds no useful information, and will return false from its
+     valid() method.  */
+  core_file_exec_context () = default;
+
+  /* Return true if this object contains valid context information.  */
+  bool valid () const
+  { return m_exec_name != nullptr; }
+
+  /* Return the execfn string (executable name) as extracted from the core
+     file.  Will always return non-nullptr if valid() returns true.  */
+  const char *execfn () const
+  { return m_exec_name.get (); }
+
+  /* Return the vector of inferior arguments as extracted from the core
+     file.  This does not include argv[0] (the executable name) for that
+     see the execfn() function.  */
+  const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
+  { return m_arguments; }
+
+private:
+
+  /* The executable filename as reported in the core file.  Can be nullptr
+     if no executable name is found.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_name;
+
+  /* List of arguments.  Doesn't include argv[0] which is the executable
+     name, for this look at m_exec_name field.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+};
+
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
 extern bool default_displaced_step_hw_singlestep (struct gdbarch *);
 
@@ -305,6 +357,11 @@ extern void default_read_core_file_mappings
    read_core_file_mappings_pre_loop_ftype pre_loop_cb,
    read_core_file_mappings_loop_ftype loop_cb);
 
+/* Default implementation of gdbarch_core_parse_exec_context.  Returns
+   an empty core_file_exec_context.  */
+extern core_file_exec_context default_core_parse_exec_context
+  (struct gdbarch *gdbarch, bfd *cbfd);
+
 /* Default implementation of gdbarch
    use_target_description_from_corefile_notes.  */
 extern bool default_use_target_description_from_corefile_notes
diff --git a/gdb/corefile.c b/gdb/corefile.c
index f6ec3cd5ca1..c3089e4516e 100644
--- a/gdb/corefile.c
+++ b/gdb/corefile.c
@@ -35,6 +35,7 @@
 #include "cli/cli-utils.h"
 #include "gdbarch.h"
 #include "interps.h"
+#include "arch-utils.h"
 
 void
 reopen_exec_file (void)
@@ -76,6 +77,15 @@ validate_files (void)
     }
 }
 
+/* See arch-utils.h.  */
+
+core_file_exec_context
+default_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  return {};
+}
+\f
+
 std::string
 memory_error_message (enum target_xfer_status err,
 		      struct gdbarch *gdbarch, CORE_ADDR memaddr)
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5820ffed332..5cc11d71b7b 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -854,7 +854,6 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
 void
 core_target_open (const char *arg, int from_tty)
 {
-  const char *p;
   int siggy;
   int scratch_chan;
   int flags;
@@ -990,9 +989,40 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  p = bfd_core_file_failing_command (current_program_space->core_bfd ());
-  if (p)
-    gdb_printf (_("Core was generated by `%s'.\n"), p);
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+  if (ctx.valid ())
+    {
+      std::string args;
+      for (const auto &a : ctx.args ())
+	{
+	  args += ' ';
+	  args += a.get ();
+	}
+
+      gdb_printf (_("Core was generated by `%ps%s'.\n"),
+		  styled_string (file_name_style.style (),
+				 ctx.execfn ()),
+		  args.c_str ());
+
+      /* Copy the arguments into the inferior.  */
+      std::vector<char *> argv;
+      for (const auto &a : ctx.args ())
+	argv.push_back (a.get ());
+      gdb::array_view<char * const> view (argv.data (), argv.size ());
+      current_inferior ()->set_args (view);
+    }
+  else
+    {
+      gdb::unique_xmalloc_ptr<char> failing_command = make_unique_xstrdup
+	(bfd_core_file_failing_command (current_program_space->core_bfd ()));
+      if (failing_command != nullptr)
+	gdb_printf (_("Core was generated by `%s'.\n"),
+		    failing_command.get ());
+    }
 
   /* Clearing any previous state of convenience variables.  */
   clear_exit_convenience_vars ();
diff --git a/gdb/gdbarch-gen.c b/gdb/gdbarch-gen.c
index 0d00cd7c993..6f41ce9d233 100644
--- a/gdb/gdbarch-gen.c
+++ b/gdb/gdbarch-gen.c
@@ -258,6 +258,7 @@ struct gdbarch
   gdbarch_get_pc_address_flags_ftype *get_pc_address_flags = default_get_pc_address_flags;
   gdbarch_read_core_file_mappings_ftype *read_core_file_mappings = default_read_core_file_mappings;
   gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes = default_use_target_description_from_corefile_notes;
+  gdbarch_core_parse_exec_context_ftype *core_parse_exec_context = default_core_parse_exec_context;
 };
 
 /* Create a new ``struct gdbarch'' based on information provided by
@@ -527,6 +528,7 @@ verify_gdbarch (struct gdbarch *gdbarch)
   /* Skip verify of get_pc_address_flags, invalid_p == 0.  */
   /* Skip verify of read_core_file_mappings, invalid_p == 0.  */
   /* Skip verify of use_target_description_from_corefile_notes, invalid_p == 0.  */
+  /* Skip verify of core_parse_exec_context, invalid_p == 0.  */
   if (!log.empty ())
     internal_error (_("verify_gdbarch: the following are invalid ...%s"),
 		    log.c_str ());
@@ -1386,6 +1388,9 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file)
   gdb_printf (file,
 	      "gdbarch_dump: use_target_description_from_corefile_notes = <%s>\n",
 	      host_address_to_string (gdbarch->use_target_description_from_corefile_notes));
+  gdb_printf (file,
+	      "gdbarch_dump: core_parse_exec_context = <%s>\n",
+	      host_address_to_string (gdbarch->core_parse_exec_context));
   if (gdbarch->dump_tdep != NULL)
     gdbarch->dump_tdep (gdbarch, file);
 }
@@ -5463,3 +5468,20 @@ set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch,
 {
   gdbarch->use_target_description_from_corefile_notes = use_target_description_from_corefile_notes;
 }
+
+core_file_exec_context
+gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != NULL);
+  gdb_assert (gdbarch->core_parse_exec_context != NULL);
+  if (gdbarch_debug >= 2)
+    gdb_printf (gdb_stdlog, "gdbarch_core_parse_exec_context called\n");
+  return gdbarch->core_parse_exec_context (gdbarch, cbfd);
+}
+
+void
+set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch,
+				     gdbarch_core_parse_exec_context_ftype core_parse_exec_context)
+{
+  gdbarch->core_parse_exec_context = core_parse_exec_context;
+}
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index b982fd7cd09..29c5ad705f9 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1751,3 +1751,18 @@ extern void set_gdbarch_read_core_file_mappings (struct gdbarch *gdbarch, gdbarc
 typedef bool (gdbarch_use_target_description_from_corefile_notes_ftype) (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern bool gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern void set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes);
+
+/* Examine the core file bfd object CBFD and try to extract the name of
+   the current executable and the argument list, which are returned in a
+   core_file_exec_context object.
+
+   If for any reason the details can't be extracted from CBFD then an
+   empty context is returned.
+
+   It is required that the current inferior be the one associated with
+   CBFD, strings are read from the current inferior using target methods
+   which all assume current_inferior() is the one to read from. */
+
+typedef core_file_exec_context (gdbarch_core_parse_exec_context_ftype) (struct gdbarch *gdbarch, bfd *cbfd);
+extern core_file_exec_context gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd);
+extern void set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, gdbarch_core_parse_exec_context_ftype *core_parse_exec_context);
diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h
index 60a0f60df39..8359ae762de 100644
--- a/gdb/gdbarch.h
+++ b/gdb/gdbarch.h
@@ -59,6 +59,7 @@ struct ui_out;
 struct inferior;
 struct x86_xsave_layout;
 struct solib_ops;
+struct core_file_exec_context;
 
 #include "regcache.h"
 
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 4006380076d..7a218605d89 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -2778,3 +2778,23 @@ The corefile's bfd is passed through COREFILE_BFD.
     predefault="default_use_target_description_from_corefile_notes",
     invalid=False,
 )
+
+Method(
+    comment="""
+Examine the core file bfd object CBFD and try to extract the name of
+the current executable and the argument list, which are returned in a
+core_file_exec_context object.
+
+If for any reason the details can't be extracted from CBFD then an
+empty context is returned.
+
+It is required that the current inferior be the one associated with
+CBFD, strings are read from the current inferior using target methods
+which all assume current_inferior() is the one to read from.
+""",
+    type="core_file_exec_context",
+    name="core_parse_exec_context",
+    params=[("bfd *", "cbfd")],
+    predefault="default_core_parse_exec_context",
+    invalid=False,
+)
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 65ec221ef48..e6ba3513c8e 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -1835,6 +1835,289 @@ linux_corefile_thread (struct thread_info *info,
     }
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  /* This function (currently) assumes the stack grows down.  If this is
+     not the case then this function isn't going to help.  */
+  if (!gdbarch_stack_grows_down (gdbarch))
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file. The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section * section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Parse the .auxv section looking for the AT_EXECFN attribute.  The
+     value of this attribute is a pointer to a string, the string is the
+     executable command.  Additionally, this string is placed at the top of
+     the program stack, and so will be in the same PT_LOAD segment as the
+     argv and envp arrays.  We can use this to try and locate these arrays.
+     If we can't find the AT_EXECFN attribute then we're not going to be
+     able to do anything else here.  */
+  CORE_ADDR execfn_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_EXECFN, &execfn_string_addr) != 1)
+    return {};
+
+  /* Read in the program headers from CBFD.  If we can't do this for any
+     reason then just give up.  */
+  long phdrs_size = bfd_get_elf_phdr_upper_bound (cbfd);
+  if (phdrs_size == -1)
+    return {};
+  gdb::unique_xmalloc_ptr<Elf_Internal_Phdr>
+    phdrs ((Elf_Internal_Phdr *) xmalloc (phdrs_size));
+  int num_phdrs = bfd_get_elf_phdrs (cbfd, phdrs.get ());
+  if (num_phdrs == -1)
+    return {};
+
+  /* Now scan through the headers looking for the one which contains the
+     address held in EXECFN_STRING_ADDR, this is the address of the
+     executable command pointed to by the AT_EXECFN auxv entry.  */
+  Elf_Internal_Phdr *hdr = nullptr;
+  for (int i = 0; i < num_phdrs; i++)
+    {
+      /* The program header that contains the address EXECFN_STRING_ADDR
+	 should be one where all content is contained within CBFD, hence
+	 the check that the file size matches the memory size.  */
+      if (phdrs.get ()[i].p_type == PT_LOAD
+	  && phdrs.get ()[i].p_vaddr <= execfn_string_addr
+	  && (phdrs.get ()[i].p_vaddr
+	      + phdrs.get ()[i].p_memsz) > execfn_string_addr
+	  && phdrs.get ()[i].p_memsz == phdrs.get ()[i].p_filesz)
+	{
+	  hdr = &phdrs.get ()[i];
+	  break;
+	}
+    }
+
+  /* If we failed to find a suitable program header then give up.  */
+  if (hdr == nullptr)
+    return {};
+
+  /* As we assume the stack grows down (see early check in this function)
+     we know that the information we are looking for sits somewhere between
+     EXECFN_STRING_ADDR and the segment's virtual address.  These define
+     the HIGH and LOW addresses between which we are going to search.  */
+  CORE_ADDR low = hdr->p_vaddr;
+  CORE_ADDR high = execfn_string_addr;
+
+  /* This PTR is going to be the address we are currently accessing.  */
+  CORE_ADDR ptr = align_down (high, ptr_bytes);
+
+  /* Set up DEREF, a helper function which loads a value from an address.
+     The returned value is always placed into a uint64_t, even if we only
+     load 4-bytes, this allows the code below to be pretty generic.  All
+     the values we're dealing with are unsigned, so this should be OK.   */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+  const auto deref = [=] (CORE_ADDR p) -> uint64_t
+    {
+      ULONGEST value = read_memory_unsigned_integer (p, ptr_bytes, byte_order);
+      return (uint64_t) value;
+    };
+
+  /* Now search down through memory looking for a PTR_BYTES sized object
+     which contains the value EXECFN_STRING_ADDR.  The hope is that this
+     will be the AT_EXECFN entry in the auxv table.  There is no guarantee
+     that we'll find the auxv table this way, but we will do our best to
+     validate that what we find is the auxv table, see below.  */
+  while (ptr > low)
+    {
+      if (deref (ptr) == execfn_string_addr
+	  && (ptr - ptr_bytes) > low
+	  && deref (ptr - ptr_bytes) == AT_EXECFN)
+	break;
+
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* Assuming that we are looking at a value field in the auxv table, move
+     forward PTR_BYTES bytes so we are now looking at the next key field in
+     the auxv table, then scan forward until we find the null entry which
+     will be the last entry in the auxv table.  */
+  ptr += ptr_bytes;
+  while ((ptr + (2 * ptr_bytes)) < high
+	 && (deref (ptr) != 0 || deref (ptr + ptr_bytes) != 0))
+    ptr += (2 * ptr_bytes);
+
+  /* PTR now points to the null entry in the auxv table, or we think it
+     does.  Now we want to find the start of the auxv table.  There's no
+     in-memory pattern we can search for at the start of the table, but
+     we can find the start based on the size of the .auxv section within
+     the core file CBFD object.  In the actual core file the auxv is held
+     in a note, but the bfd library makes this into a section for us.
+
+     The addition of (2 * PTR_BYTES) here is because PTR is pointing at the
+     null entry, but the null entry is also included in CONTENTS.  */
+  ptr = ptr + (2 * ptr_bytes) - contents.size ();
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR should now be pointing to the start of the auxv table mapped into
+     the inferior memory.  As we got here using a heuristic then let's
+     compare an auxv table sized block of inferior memory, if this matches
+     then it's not a guarantee that we are in the right place, but it does
+     make it more likely.  */
+  gdb::byte_vector target_contents (size);
+  if (target_read_memory (ptr, target_contents.data (), size) != 0)
+    memory_error (TARGET_XFER_E_IO, ptr);
+  if (memcmp (contents.data (), target_contents.data (), size) != 0)
+    return {};
+
+  /* We have reasonable confidence that PTR points to the start of the auxv
+     table.  Below this should be the null terminated list of pointers to
+     environment strings, and below that the null terminated list of
+     pointers to arguments strings.  After that we should find the
+     argument count.  First, check for the null at the end of the
+     environment list.  */
+  if (deref (ptr - ptr_bytes) != 0)
+    return {};
+
+  ptr -= (2 * ptr_bytes);
+  while (ptr > low && deref (ptr) != 0)
+    ptr -= ptr_bytes;
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing to the null entry at the end of the argument
+     string pointer list.  We now want to scan backward to find the entire
+     argument list.  There's no handy null marker that we can look for
+     here, instead, as we scan backward we look for the argument count
+     (argc) value which appears immediately before the argument list.
+
+     Technically, we could have zero arguments, so the argument count would
+     be zero, however, we don't support this case.  If we find a null entry
+     in the argument list before we find the argument count then we just
+     bail out.
+
+     Start by moving to the last argument string pointer, we expect this
+     to be non-null.  */
+  ptr -= ptr_bytes;
+  uint64_t argc = 0;
+  while (ptr > low)
+    {
+      uint64_t val = deref (ptr);
+      if (val == 0)
+	return {};
+
+      if (val == argc)
+	break;
+
+      argc++;
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing at the argument count value.  Move it forward
+     so we're pointing at the first actual argument string pointer.  */
+  ptr += ptr_bytes;
+
+  /* We can now parse all of the argument strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+
+  /* Skip the first argument.  This is the executable command, but we'll
+     load that separately later.  */
+  ptr += ptr_bytes;
+
+  uint64_t v;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      arguments.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  /* Skip the null-pointer at the end of the argument list.  We will now
+     be pointing at the first environment string.  */
+  ptr += ptr_bytes;
+
+  /* Parse the environment strings.  Nothing is done with this yet, but
+     will be in a later commit.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      environment.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  gdb::unique_xmalloc_ptr<char> execfn
+    = target_read_string (execfn_string_addr, INT_MAX);
+  if (execfn == nullptr)
+    return {};
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (arguments));
+}
+
+/* Parse and return execution context details from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return linux_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Fill the PRPSINFO structure with information about the process being
    debugged.  Returns 1 in case of success, 0 for failures.  Please note that
    even if the structure cannot be entirely filled (e.g., GDB was unable to
@@ -2785,6 +3068,8 @@ linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch,
   set_gdbarch_infcall_mmap (gdbarch, linux_infcall_mmap);
   set_gdbarch_infcall_munmap (gdbarch, linux_infcall_munmap);
   set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       linux_corefile_parse_exec_context);
 }
 
 void _initialize_linux_tdep ();
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.c b/gdb/testsuite/gdb.base/corefile-exec-context.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
new file mode 100644
index 00000000000..b18a8104779
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -0,0 +1,102 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB can handle reading the full executable name and argument
+# list from a core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Linux core files can encode up to 80 characters for the command and
+# arguments in the psinfo.  If BINFILE is less than 80 characters in
+# length then let's try to make it longer.
+set binfile_len [string length $binfile]
+if { $binfile_len <= 80 } {
+    set extra_len [expr 80 - $binfile_len + 1]
+    set extra_str [string repeat "x" $extra_len]
+    set new_binfile $binfile$extra_str
+    remote_exec build "mv $binfile $new_binfile"
+    set binfile $new_binfile
+}
+
+# Generate a core file, this time the inferior has no additional
+# arguments.
+set corefile [core_find $binfile {}]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_1 "$binfile.1.core"
+remote_exec build "mv $corefile $corefile_1"
+
+# Load the core file and confirm that the full executable name is
+# seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_1" "load core file no args" {
+    -re "^Core was generated by `[string_to_regexp $binfile]'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Generate a core file, this time pass some arguments to the inferior.
+set args "aaaaa bbbbb ccccc ddddd eeeee"
+set corefile [core_find $binfile {} $args]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_2 "$binfile.2.core"
+remote_exec build "mv $corefile $corefile_2"
+
+# Load the core file and confirm that the full executable name and
+# argument list are seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_2" "load core file with args" {
+    -re "^Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Also, the argument list should be available through 'show args'.
+gdb_test "show args" \
+    "Argument list to give program being debugged when it is started is \"$args\"\\."
-- 
2.25.4



* [PATCHv2 2/5] gdb: parse and set the inferior environment from core files
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
@ 2024-10-28 18:53   ` Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Extend the core file context parsing mechanism added in the previous
commit to also store the environment parsed from the core file.

This environment can then be injected into the inferior object.

The benefit of this is that when examining a core file in GDB, the
'show environment' command will now show the environment extracted
from a core file.

Consider this example:

  $ env -i GDB_TEST_VAR=FOO ./gen-core
  Segmentation fault (core dumped)
  $ gdb -c ./core.1669829
  ...
  [New LWP 1669829]
  Core was generated by `./gen-core'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x0000000000401111 in ?? ()
  (gdb) show environment
  GDB_TEST_VAR=FOO
  (gdb)
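
As a rough standalone illustration of the splitting step (this is not
the code in the patch, which builds a gdb_environ; the function name
parse_environment and the use of std::map are assumptions made for the
sketch), each raw "NAME=value" string is cut at the first '=' and
entries without an '=' are skipped:

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not part of the patch: turn raw "NAME=value"
// strings into a name -> value map.  Entries with no '=' are skipped,
// and only the first '=' separates the name from the value.
std::map<std::string, std::string>
parse_environment (const std::vector<const char *> &raw)
{
  std::map<std::string, std::string> env;

  for (const char *entry : raw)
    {
      assert (entry != nullptr);

      const char *eq = std::strchr (entry, '=');
      if (eq == nullptr)
        continue;

      // [entry, eq) is the variable name, everything after '=' is the value.
      env[std::string (entry, eq)] = std::string (eq + 1);
    }

  return env;
}
```

Splitting on the first '=' only means a value that itself contains
'=' is kept whole.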

There's a new test for this functionality.
---
 gdb/arch-utils.c                              | 26 ++++++++
 gdb/arch-utils.h                              | 13 +++-
 gdb/corelow.c                                 |  3 +
 gdb/linux-tdep.c                              |  6 +-
 .../gdb.base/corefile-exec-context.exp        | 63 +++++++++++++++++++
 5 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
index 6ffa4109765..567dc87d9dd 100644
--- a/gdb/arch-utils.c
+++ b/gdb/arch-utils.c
@@ -1499,6 +1499,32 @@ gdbarch_initialized_p (gdbarch *arch)
   return arch->initialized_p;
 }
 
+/* See arch-utils.h.  */
+
+gdb_environ
+core_file_exec_context::environment () const
+{
+  gdb_environ e;
+
+  for (const auto &entry : m_environment)
+    {
+      char *eq = strchr (entry.get (), '=');
+
+      /* If there's no '=' character, then skip this entry.  */
+      if (eq == nullptr)
+	continue;
+
+      const char *value = eq + 1;
+      const char *var = entry.get ();
+
+      *eq = '\0';
+      e.set (var, value);
+      *eq = '=';
+    }
+
+  return e;
+}
+
 void _initialize_gdbarch_utils ();
 void
 _initialize_gdbarch_utils ()
diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 8d9f1625bdd..1c33bfb4704 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -21,6 +21,7 @@
 #define ARCH_UTILS_H
 
 #include "gdbarch.h"
+#include "gdbsupport/environ.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -88,9 +89,11 @@ struct core_file_exec_context
      found but not ARGV then use the no-argument constructor to create an
      empty context object.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
-			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
-      m_arguments (std::move (argv))
+      m_arguments (std::move (argv)),
+      m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
   }
@@ -115,6 +118,9 @@ struct core_file_exec_context
   const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
   { return m_arguments; }
 
+  /* Return the environment variables from this context.  */
+  gdb_environ environment () const;
+
 private:
 
   /* The executable filename as reported in the core file.  Can be nullptr
@@ -124,6 +130,9 @@ struct core_file_exec_context
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+
+  /* List of environment strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_environment;
 };
 
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5cc11d71b7b..a0129f84b1c 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -1014,6 +1014,9 @@ core_target_open (const char *arg, int from_tty)
 	argv.push_back (a.get ());
       gdb::array_view<char * const> view (argv.data (), argv.size ());
       current_inferior ()->set_args (view);
+
+      /* And now copy the environment.  */
+      current_inferior ()->environment = ctx.environment ();
     }
   else
     {
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index e6ba3513c8e..755e450f8a2 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2066,8 +2066,7 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
      be pointing at the first environment string.  */
   ptr += ptr_bytes;
 
-  /* Parse the environment strings.  Nothing is done with this yet, but
-     will be in a later commit.  */
+  /* Parse the environment strings.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> environment;
   while ((v = deref (ptr)) != 0)
     {
@@ -2084,7 +2083,8 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
     return {};
 
   return core_file_exec_context (std::move (execfn),
-				 std::move (arguments));
+				 std::move (arguments),
+				 std::move (environment));
 }
 
 /* Parse and return execution context details from core file CBFD.  */
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index b18a8104779..ac97754fe71 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -100,3 +100,66 @@ gdb_test_multiple "core-file $corefile_2" "load core file with args" {
 # Also, the argument list should be available through 'show args'.
 gdb_test "show args" \
     "Argument list to give program being debugged when it is started is \"$args\"\\."
+
+# Find the name of an environment variable that is not set.
+set env_var_base "GDB_TEST_ENV_VAR_"
+set env_var_name ""
+
+for { set i 0 } { $i < 10 } { incr i } {
+    set tmp_name ${env_var_base}${i}
+    if { ! [info exists ::env($tmp_name)] } {
+	set env_var_name $tmp_name
+	break
+    }
+}
+
+if { $env_var_name eq "" } {
+    unsupported "couldn't find suitable environment variable name"
+    return -1
+}
+
+# Generate a core file with this environment variable set.
+set env_var_value "TEST VALUE"
+save_vars { ::env($env_var_name) } {
+    setenv $env_var_name $env_var_value
+
+    set corefile [core_find $binfile {} $args]
+    if {$corefile == ""} {
+	untested "unable to create corefile"
+	return 0
+    }
+}
+set corefile_3 "$binfile.3.core"
+remote_exec build "mv $corefile $corefile_3"
+
+# Restart, load the core file, and check the environment variable
+# shows up.
+clean_restart $binfile
+
+# Check for environment variable VAR_NAME in the environment, its
+# value should be VAR_VALUE.
+proc check_for_env_var { var_name var_value } {
+    set saw_var false
+    gdb_test_multiple "show environment" "" {
+	-re "^$var_name=$var_value\r\n" {
+	    set saw_var true
+	    exp_continue
+	}
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+	-re "^$::gdb_prompt $" {
+	}
+    }
+    return $saw_var
+}
+
+gdb_assert { ![check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is not set before core file load"
+
+gdb_test "core-file $corefile_3" \
+    "Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n.*" \
+    "load core file for environment test"
+
+gdb_assert { [check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is set after core file load"
-- 
2.25.4



* [PATCHv2 3/5] gdb/testsuite: make some of the core file / build-id tests harder
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
@ 2024-10-28 18:53   ` Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

We have a few tests that load core files, which depend on GDB not
auto-loading the executable that matches the core file.  One of these
tests (corefile-buildid.exp) exercises GDB's ability to load the
executable via the build-id links in the debug directory, while the
other two tests are just written assuming that GDB hasn't auto-loaded
the executable.

In the next commit, GDB is going to get better at finding the
executable for a core file, and as a consequence these tests could
start to fail if the testsuite is being run using a compiler that adds
build-ids by default, and is on a target (currently only Linux) with
the improved executable auto-loading.

To avoid these test failures, this commit updates some of the tests.

coredump-filter.exp and corefile.exp are updated to unload the
executable should it be auto-loaded.  This means that the following
output from GDB will match the expected patterns.  If the executable
wasn't auto-loaded then the new step to unload is harmless.

The corefile-buildid.exp test needed some more significant changes.
For this test it is important that the executable be moved aside so
that GDB can't locate it, but we do still need the executable around
somewhere, so that the debug directory can link to it.  The point of
the test is that the executable _should_ be auto-loaded, but using the
debug directory, not using GDB's context parsing logic.
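
For reference, the link inside the debug directory is named after the
build-id, in the '.build-id/xx/xxxxx' format the test uses.  A minimal
sketch of that naming, assuming the build-id is already available as a
lowercase hex string (the function name build_id_to_path is made up for
this example and is not a GDB or testsuite function):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: map a hex build-id string to its path relative
// to the debug directory.  The first two hex digits name a
// sub-directory and the remaining digits name the file.
std::string
build_id_to_path (const std::string &hex)
{
  // At least one digit must be left over for the file name.
  assert (hex.size () > 2);

  return ".build-id/" + hex.substr (0, 2) + "/" + hex.substr (2);
}
```

The separate debug information, when present, uses the same name with
".debug" appended.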

While looking at this test I noticed two additional problems.  First,
we were creating the core file more times than we needed.  We only need
to create one core file for each test binary (total two), while we
previously created one core file for each style of debug info
directory (total four).  The extra core files should be identical, and
were just overwriting each other; this was harmless, but still
pointless work.

The other problem is that after running an earlier test we modified
the test binary in order to run a later test.  This means it's not
possible to manually re-run the first test as the binary for that test
is destroyed.

As part of the rewrite in this commit I've addressed these issues.

This commit does change many of the test names, but there should be no
real changes in what is being tested after this commit.  However, when
the next commit is added, and GDB gets better at auto-loading the
executable for a core file, these tests should still be testing what
is expected.
---
 gdb/testsuite/gdb.base/coredump-filter.exp  |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp | 252 +++++++++-----------
 gdb/testsuite/gdb.base/corefile.exp         |   9 +
 3 files changed, 130 insertions(+), 148 deletions(-)

diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
index 0c1fc7c2dd6..18c3505172b 100644
--- a/gdb/testsuite/gdb.base/coredump-filter.exp
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -105,14 +105,23 @@ proc test_disasm { core address should_fail } {
 	    return
 	}
 
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	if { $should_fail == 1 } {
 	    remote_exec host "mv -f $hide_binfile $binfile"
-	    gdb_test "x/i \$pc" "=> $hex:\tCannot access memory at address $hex" \
-		"disassemble function with corefile and without a binary"
+	    set re "Cannot access memory at address $hex"
 	} else {
-	    gdb_test "x/i \$pc" "=> $hex:\t\[^C\].*" \
-		"disassemble function with corefile and without a binary"
+	    set re "\[^C\].*"
 	}
+
+	gdb_test "x/i \$pc" "=> $hex:\t${re}" \
+	    "disassemble function with corefile and without a binary"
     }
 
     with_test_prefix "with binary" {
diff --git a/gdb/testsuite/gdb.base/corefile-buildid.exp b/gdb/testsuite/gdb.base/corefile-buildid.exp
index fc54cf201d9..377ae802239 100644
--- a/gdb/testsuite/gdb.base/corefile-buildid.exp
+++ b/gdb/testsuite/gdb.base/corefile-buildid.exp
@@ -19,71 +19,72 @@
 
 # Build-id-related tests for core files.
 
-standard_testfile
+standard_testfile .c -shlib-shr.c -shlib.c
 
-# Build a non-shared executable.
+# Create a corefile from PROGNAME.  Return the name of the generated
+# corefile, or the empty string if anything goes wrong.
+#
+# The generated corefile must contain a buildid for PROGNAME.  If it
+# doesn't then an empty string will be returned.
+proc create_core_file { progname } {
+    # Generate a corefile.
+    set corefile [core_find $progname]
+    if {$corefile == ""} {
+	untested "could not generate core file"
+	return ""
+    }
+    verbose -log "corefile is $corefile"
+
+    # Check the corefile has a build-id for the executable.
+    if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
+	set line [lindex [split $output "\n"] 0]
+	set binfile_re (?:[string_to_regexp $progname]|\\\[(?:exe|pie)\\\])
+	if { ![regexp "^${::hex}\\+${::hex} \[a-f0-9\]+@${::hex}.*$binfile_re$" $line] } {
+	    unsupported "no build-id for executable in corefile"
+	    return ""
+	}
+    } else {
+	unsupported "eu-unstrip tool failed"
+	return ""
+    }
 
-proc build_corefile_buildid_exec {} {
-    global testfile srcfile binfile execdir
+    return $corefile
+}
 
-    if {[build_executable $testfile.exp $testfile $srcfile debug] == -1} {
-	untested "failed to compile"
-	return false
-    }
 
-    # Move executable to non-default path.
-    set builddir [standard_output_file $execdir]
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile [file join $builddir [file tail $binfile]]"
+# Build a non-shared executable.
 
-    return true
+proc build_corefile_buildid_exec { progname } {
+    return [expr {[build_executable "build non-shared exec" $progname $::srcfile] != -1}]
 }
 
 # Build a shared executable.
 
-proc build_corefile_buildid_shared {} {
-    global srcdir subdir testfile binfile srcfile sharedir
-
-    set builddir [standard_output_file $sharedir]
-
+proc build_corefile_buildid_shared { progname } {
     # Compile DSO.
-    set srcdso [file join $srcdir $subdir $testfile-shlib-shr.c]
-    set objdso [standard_output_file $testfile-shlib-shr.so]
-    if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
-	untested "failed to compile dso"
+    set objdso [standard_output_file $::testfile-shlib-shr.so]
+    if {[build_executable "build dso" $objdso $::srcfile2 {debug shlib}] == -1} {
 	return false
     }
 
+
     # Compile shared library.
-    set srclib [file join $srcdir $subdir $testfile-shlib.c]
-    set libname lib$testfile.so
+    set srclib $::srcfile3
+    set libname lib$::testfile.so
     set objlib [standard_output_file $libname]
-    set dlopen_lib [shlib_target_file \
-			[file join $builddir [file tail $objdso]]]
-    set opts [list debug shlib_load \
+    set dlopen_lib [shlib_target_file $objdso]
+    set opts [list debug shlib_load shlib \
 		  additional_flags=-DSHLIB_NAME=\"$dlopen_lib\"]
-    if {[gdb_compile_shlib $srclib $objlib $opts] != ""} {
-	untested "failed to compile shared library"
+    if {[build_executable "build solib" $objlib $::srcfile3 $opts] == -1} {
 	return false
     }
 
     # Compile main program.
-    set srcexec [file join $srcdir $subdir $srcfile]
-    set binfile [standard_output_file $testfile-shared]
     set opts [list debug shlib=$objlib additional_flags=-DTEST_SHARED]
-    if {[gdb_compile $srcexec $binfile executable $opts] != ""} {
-	untested "failed to compile shared executable"
+    if {[build_executable "build shared exec" $progname $::srcfile $opts] == -1} {
 	return false
     }
 
-    # Move objects to non-default path.
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile $builddir"
-    remote_exec build "mv $objdso  $builddir"
-    remote_exec build "mv $objlib $builddir"
-
     return true
 }
 
@@ -154,37 +155,43 @@ proc check_exec_file {file} {
 # SHARED is a boolean indicating whether we are testing the shared
 # library core dump test case.
 
-proc locate_exec_from_core_build_id {corefile buildid suffix \
+proc locate_exec_from_core_build_id {corefile buildid \
+					 dirname progname \
 					 sepdebug symlink shared} {
-    global testfile binfile srcfile
-
     clean_restart
 
     # Set up the build-id directory and symlink the binary there.
+    set d "debugdir"
+    if {$shared} {
+	set d "${d}_shared"
+    } else {
+	set d "${d}_not-shared"
+    }
     if {$symlink} {
-	set d "symlinkdir"
+	set d "${d}_symlink"
     } else {
-	set d "debugdir"
+	set d "${d}_copy"
     }
-    set debugdir [standard_output_file $d-$suffix]
-    remote_exec build "rm -rf $debugdir"
+    if {$sepdebug} {
+	set d "${d}_stripped"
+    } else {
+	set d "${d}_not-stripped"
+    }
+
+    set debugdir [standard_output_file $d]
     remote_exec build \
 	"mkdir -p [file join $debugdir [file dirname $buildid]]"
 
     set files_list {}
-    lappend files_list $binfile $buildid
+    lappend files_list [file join $dirname [file tail $progname]] \
+	$buildid
     if {$sepdebug} {
-	lappend files_list "$binfile.debug" "$buildid.debug"
-    }
-    if {$shared} {
-	global sharedir
-	set builddir [standard_output_file $sharedir]
-    } else {
-	global execdir
-	set builddir [standard_output_file $execdir]
+	lappend files_list [file join $dirname [file tail $progname]].debug \
+	    "$buildid.debug"
     }
+
     foreach {target name} $files_list {
-	set t [file join $builddir [file tail $target]]
+	set t [file join $dirname [file tail $target]]
 	if {$symlink} {
 	    remote_exec build "ln -s $t [file join $debugdir $name]"
 	} else {
@@ -198,109 +205,66 @@ proc locate_exec_from_core_build_id {corefile buildid suffix \
     gdb_test "core-file $corefile" "Program terminated with .*" \
 	"load core file"
     if {$symlink} {
-	set expected_file [file join $builddir [file tail $binfile]]
+	set expected_file [file join $dirname [file tail $progname]]
     } else {
 	set expected_file $buildid
     }
     check_exec_file [file join $debugdir $expected_file]
 }
 
-# Run a build-id tests on a core file.
-# Supported options: "-shared" and "-sepdebug" for running tests
-# of shared and/or stripped/.debug executables.
-
-proc do_corefile_buildid_tests {args} {
-    global binfile testfile srcfile execdir sharedir hex
-
-    # Parse options.
-    parse_args [list {sepdebug} {shared}]
+foreach_with_prefix mode { exec shared } {
+    # Build the executable.
+    set progname ${binfile}-$mode
+    set build_proc build_corefile_buildid_${mode}
+    if { ![$build_proc $progname] } {
+	return -1
+    }
 
-    # PROGRAM to run to generate core file.  This could be different
-    # than the program that was originally built, e.g., for a stripped
-    # executable.
-    if {$shared} {
-	set builddir [standard_output_file $sharedir]
-    } else {
-	set builddir [standard_output_file $execdir]
+    # Generate a corefile.
+    set corefile [create_core_file $progname]
+    if { $corefile eq "" } {
+	return -1
     }
-    set program_to_run [file join $builddir [file tail $binfile]]
 
-    # A list of suffixes to use to describe the test and the .build-id
-    # directory for the test.  The suffix will be used, joined with spaces,
-    # to prefix all tests for the given run.  It will be used, joined with
-    # dashes, to create a unique build-id directory.
-    set suffix {}
-    if {$shared} {
-	lappend suffix "shared"
-    } else {
-	lappend suffix "exec"
+    # Get the build-id filename without ".debug" on the end.  This
+    # will have the format: '.build-id/xx/xxxxx'
+    set buildid [build_id_debug_filename_get $progname ""]
+    if {$buildid == ""} {
+	untested "binary has no build-id"
+	return
     }
+    verbose -log "build-id is $buildid"
 
-    if {$sepdebug} {
-	# Strip debuginfo into its own file.
-	if {[gdb_gnu_strip_debug [standard_output_file $program_to_run] \
-		 no-debuglink] != 0} {
-	    untested "could not strip executable  for [join $suffix \ ]"
-	    return
-	}
+    # Create a directory for the non-stripped test.
+    set combined_dirname [standard_output_file ${mode}_non-stripped]
+    remote_exec build "mkdir -p $combined_dirname"
+    remote_exec build "cp $progname $combined_dirname"
 
-	lappend suffix "sepdebug"
+    # Create a directory for the stripped test.
+    if {[gdb_gnu_strip_debug [standard_output_file $progname] no-debuglink] != 0} {
+	untested "could not strip executable"
+	return
     }
-
-    with_test_prefix "[join $suffix \ ]" {
-	# Find the core file.
-	set corefile [core_find $program_to_run]
-	if {$corefile == ""} {
-	    untested "could not generate core file"
-	    return
-	}
-	verbose -log "corefile is $corefile"
-
-	if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
-	    set line [lindex [split $output "\n"] 0]
-	    set binfile_re (?:[string_to_regexp $program_to_run]|\\\[(?:exe|pie)\\\])
-	    if { ![regexp "^${hex}\\+${hex} \[a-f0-9\]+@${hex}.*$binfile_re$" $line] } {
-		unsupported "build id for exec"
-		return
-	    }
+    set sepdebug_dirname [standard_output_file ${mode}_stripped]
+    remote_exec build "mkdir -p $sepdebug_dirname"
+    remote_exec build "mv $progname $sepdebug_dirname"
+    remote_exec build "mv ${progname}.debug $sepdebug_dirname"
+
+    # Now do the actual testing part.  Fill out a debug directory with
+    # build-id related files (copies or symlinks) and then load the
+    # corefile.  Check GDB finds the executable and debug information
+    # via the build-id related debug directory contents.
+    foreach_with_prefix sepdebug { false true } {
+	if { $sepdebug } {
+	    set dirname $sepdebug_dirname
 	} else {
-	    unsupported "eu-unstrip execution"
-	    return
-	}
-
-	# Get the build-id filename without ".debug" on the end.  This
-	# will have the format: '.build-id/xx/xxxxx'
-	set buildid [build_id_debug_filename_get $program_to_run ""]
-	if {$buildid == ""} {
-	    untested "binary has no build-id"
-	    return
+	    set dirname $combined_dirname
 	}
-	verbose -log "build-id is $buildid"
-
-	locate_exec_from_core_build_id $corefile $buildid \
-	    [join $suffix -] $sepdebug false $shared
 
-	with_test_prefix "symlink" {
+	foreach_with_prefix symlink { false true } {
 	    locate_exec_from_core_build_id $corefile $buildid \
-		[join $suffix -] $sepdebug true $shared
+		$dirname $progname \
+		$sepdebug $symlink [expr {$mode eq "shared"}]
 	}
     }
 }
-
-# Directories where executables will be moved before testing.
-set execdir "build-exec"
-set sharedir "build-shared"
-
-#
-# Do tests
-#
-
-build_corefile_buildid_exec
-do_corefile_buildid_tests
-do_corefile_buildid_tests -sepdebug
-
-if {[allow_shlib_tests]} {
-    build_corefile_buildid_shared
-    do_corefile_buildid_tests -shared
-    do_corefile_buildid_tests -shared -sepdebug
-}
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index dc3c8b1dfc8..2111aa66d7d 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -348,6 +348,15 @@ proc corefile_test_attach {} {
 	gdb_start
 
 	gdb_test "core-file $corefile" "Core was generated by .*" "attach: load core again"
+
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	gdb_test "info files" "\r\nLocal core dump file:\r\n.*" "attach: sanity check we see the core file"
 
 	gdb_test "attach $pid" "Attaching to process $pid\r\n.*" "attach: with core"
-- 
2.25.4



* [PATCHv2 4/5] gdb: improve GDB's ability to auto-load the exec for a core file
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                     ` (2 preceding siblings ...)
  2024-10-28 18:53   ` [PATCHv2 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
@ 2024-10-28 18:53   ` Andrew Burgess
  2024-10-28 18:53   ` [PATCHv2 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

GDB already has a limited mechanism for auto-loading the executable
corresponding to a core file; this can be found in the function
locate_exec_from_corefile_build_id in corelow.c.

However, this approach uses the build-id of the core file either to
look in the debug directory (for a symlink back to the executable) or
to ask debuginfod.  This is great, and works fine if the core file
is from a "system" binary, but often, when I'm debugging a core file,
it's part of my development cycle, so there's no build-id symlink in
the debug directory, and debuginfod doesn't know about the binary
either, so GDB can't auto-load the executable...

... but the executable is right there!

This commit builds on the earlier commits in this series to make GDB
smarter.

On GNU/Linux, when we parse the execution context from the core
file (see linux-tdep.c), we already grab the command pointed to by
AT_EXECFN.  If this is an absolute path then GDB can use this to
locate the executable, a build-id check ensures we've found the
correct file.  With this small change GDB suddenly becomes a lot
better at auto-loading the executable for a core file.

But we can do better!  Often the AT_EXECFN is not an absolute path.

If it is a relative path then we check for this path relative to the
core file.  This helps if a user does something like:

  $ ./build/bin/some_prog
  Aborted (core dumped)
  $ gdb -c corefile

In this case the core file in the current directory will have an
AT_EXECFN value of './build/bin/some_prog', so if we look for that
path relative to the location of the core file this might result in a
hit, again, a build-id check ensures we found the right file.

But we can do better still!  What if the user moves the core file?  Or
the user is using some tool to manage core files (e.g. the systemd
core file management tool), and the user downloads the core file to a
location from which the relative path no longer works?

Well in this case we can make use of the core file's mapped file
information (the NT_FILE note).  The executable will be included in
the mapped file list, and the path within the mapped file list will be
an absolute path.  We can search for mapped file information based on
an address within the mapped file, and the auxv vector happens to
include an AT_ENTRY value, which is the entry address in the main
executable.  If we look up the mapped file containing this address
we'll have the absolute path to the main executable, a build-id check
ensures this really is the file we're looking for.
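
The lookup itself amounts to a range search over the mapped file list.
A minimal sketch, assuming the NT_FILE entries have already been
decoded into a simple vector (the mapped_file struct and the
find_mapped_file function are invented for this illustration and are
not GDB's data structures):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of one mapped file entry.
struct mapped_file
{
  std::uint64_t start;   // First address of the mapping.
  std::uint64_t end;     // One past the last address of the mapping.
  std::string filename;  // Absolute path recorded in the core file.
};

// Return the filename of the mapping containing ADDR (for example the
// AT_ENTRY value), or an empty optional if no mapping contains it.
std::optional<std::string>
find_mapped_file (const std::vector<mapped_file> &maps, std::uint64_t addr)
{
  for (const mapped_file &m : maps)
    {
      assert (m.start <= m.end);

      if (addr >= m.start && addr < m.end)
        return m.filename;
    }

  return std::nullopt;
}
```

Whatever filename comes back would still need the build-id check
before being used.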

It might be tempting to jump straight to the third approach, however,
there is one small downside to the third approach: if the executable
is a symlink then the AT_EXECFN string will be the name of the
symlink, that is, the thing the user asked to run.  The mapped file
entry will be the name of the actual file, i.e. the symlink target.
When we auto-load the executable based on the third approach, the file
loaded might have a different name to that which the user expects,
though the build-id check (almost) guarantees that we've loaded the
correct binary.

But there's one more thing we can check for!

If the user has placed the core file and the executable into a
directory together, for example, as might happen with a bug report,
then neither the absolute path check, nor the relative path check
will find the executable.  So GDB will also look for a file with the
right name in the same directory as the core file.  Again, a build-id
check is performed to ensure we find the correct file.

Of course, it's still possible that GDB is unable to find the
executable using any of these approaches.  In this case, nothing
changes, GDB will check in the debug info directory for a build-id
based link back to the executable, and if that fails, GDB will ask
debuginfod for the executable.  If this all fails, then, as usual, the
user is able to load the correct executable with the 'file' command,
but hopefully, this should be needed far less from now on.
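
To summarise the filename-based checks, here is a rough sketch of how
the candidate paths could be built.  It lists them in the order they
are introduced above, which is not necessarily the order the patch
tries them in; the function name exec_candidates and the use of
std::filesystem are assumptions made for the illustration, and the
build-id check on each candidate is left to the caller:

```cpp
#include <cassert>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: list the places to look for the executable.
// EXECFN is the AT_EXECFN string, CORE_FILE is the path to the core
// file, and MAPPED_EXEC is the absolute path from the mapped file
// list, or empty if it is not known.  No file system access is done.
std::vector<fs::path>
exec_candidates (const fs::path &execfn, const fs::path &core_file,
                 const fs::path &mapped_exec)
{
  assert (!execfn.empty ());

  std::vector<fs::path> result;
  const fs::path core_dir = core_file.parent_path ();

  // An absolute AT_EXECFN is used as is, a relative one is taken
  // relative to the directory holding the core file.
  if (execfn.is_absolute ())
    result.push_back (execfn);
  else
    result.push_back ((core_dir / execfn).lexically_normal ());

  // The path found through the mapped file list.
  if (!mapped_exec.empty ())
    result.push_back (mapped_exec);

  // A file with the right name sitting next to the core file.
  result.push_back (core_dir / execfn.filename ());

  return result;
}
```

For an AT_EXECFN of './build/bin/some_prog' and a core file at
'/tmp/cores/corefile', the first candidate would be
'/tmp/cores/build/bin/some_prog' and the last '/tmp/cores/some_prog'.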
---
 gdb/arch-utils.h                              |  25 +-
 gdb/corelow.c                                 | 141 ++++++++--
 gdb/linux-tdep.c                              |  22 ++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 242 ++++++++++++++++++
 5 files changed, 438 insertions(+), 17 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 1c33bfb4704..fb4a3ef9c5b 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -22,6 +22,7 @@
 
 #include "gdbarch.h"
 #include "gdbsupport/environ.h"
+#include "filenames.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -87,15 +88,23 @@ struct core_file_exec_context
      never be nullptr.  Only call this constructor if all the arguments
      have been collected successfully, i.e. if the EXEC_NAME could be
      found but not ARGV then use the no-argument constructor to create an
-     empty context object.  */
+     empty context object.
+
+     The EXEC_FILENAME must be the absolute filename of the executable
+     that generated this core file, or nullptr if the absolute filename
+     is not known.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  gdb::unique_xmalloc_ptr<char> exec_filename,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
+      m_exec_filename (std::move (exec_filename)),
       m_arguments (std::move (argv)),
       m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
+    gdb_assert (m_exec_filename == nullptr
+		|| IS_ABSOLUTE_PATH (m_exec_filename.get ()));
   }
 
   /* Create a default context object.  In its default state a context
@@ -112,6 +121,13 @@ struct core_file_exec_context
   const char *execfn () const
   { return m_exec_name.get (); }
 
+  /* Return the absolute path to the executable if known.  This might
+     return nullptr even when execfn() returns a non-nullptr value.
+     Additionally, the file referenced here might have a different name
+     than the file returned by execfn if execfn is a symbolic link.  */
+  const char *exec_filename () const
+  { return m_exec_filename.get (); }
+
   /* Return the vector of inferior arguments as extracted from the core
      file.  This does not include argv[0] (the executable name) for that
      see the execfn() function.  */
@@ -127,6 +143,13 @@ struct core_file_exec_context
      if no executable name is found.  */
   gdb::unique_xmalloc_ptr<char> m_exec_name;
 
+  /* Full filename to the executable that was actually executed.  The name
+     within EXEC_FILENAME might not match what the user typed, e.g. if the
+     user typed ./symlinked_name which is a symlink to /tmp/real_name then
+     this is going to contain '/tmp/real_name' while EXEC_NAME above will
+     contain './symlinked_name'.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_filename;
+
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
diff --git a/gdb/corelow.c b/gdb/corelow.c
index a0129f84b1c..272b86b6f33 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -828,18 +828,117 @@ rename_vmcore_idle_reg_sections (bfd *abfd, inferior *inf)
 	     replacement_lwpid_str.c_str ());
 }
 
+/* Use CTX to try and find (and open) the executable file for the core file
+   CBFD.  BUILD_ID is the build-id for CBFD which was already extracted by
+   our caller.
+
+   Will return the opened executable or nullptr if the executable couldn't
+   be found.  */
+
+static gdb_bfd_ref_ptr
+locate_exec_from_corefile_exec_context (bfd *cbfd,
+					const bfd_build_id *build_id,
+					const core_file_exec_context &ctx)
+{
+  /* CTX must be valid, and a valid context has an execfn() string.  */
+  gdb_assert (ctx.valid ());
+  gdb_assert (ctx.execfn () != nullptr);
+
+  /* EXEC_NAME will be the command used to start the inferior.  This might
+     not be an absolute path (but could be).  */
+  const char *exec_name = ctx.execfn ();
+
+  /* Function to open FILENAME and check if its build-id matches BUILD_ID
+     from this enclosing scope.  Returns the open BFD for filename if the
+     FILENAME has a matching build-id, otherwise, returns nullptr.  */
+  const auto open_and_check_build_id
+    = [&build_id] (const char *filename) -> gdb_bfd_ref_ptr
+  {
+    /* Try to open a file.  If this succeeds then we still need to perform
+       a build-id check.  */
+    gdb_bfd_ref_ptr execbfd = gdb_bfd_open (filename, gnutarget);
+
+    /* We managed to open a file, but if its build-id doesn't match
+       BUILD_ID then we just cannot trust it's the right file.  */
+    if (execbfd != nullptr)
+      {
+	const bfd_build_id *other_build_id = build_id_bfd_get (execbfd.get ());
+
+	if (other_build_id == nullptr
+	    || !build_id_equal (other_build_id, build_id))
+	  execbfd = nullptr;
+      }
+
+    return execbfd;
+  };
+
+  gdb_bfd_ref_ptr execbfd;
+
+  /* If EXEC_NAME is absolute then try to open it now.  Otherwise, see if
+     EXEC_NAME is a relative path from the location of the core file.  This
+     is just a guess, the executable might not be here, but we still rely
+     on a build-id match in order to accept any executable we find; we
+     don't accept something just because it happens to be in the right
+     location.  */
+  if (IS_ABSOLUTE_PATH (exec_name))
+    execbfd = open_and_check_build_id (exec_name);
+  else
+    {
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + exec_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If we haven't found the executable yet, then try checking to see if
+     the executable is in the same directory as the core file.  Again,
+     there's no reason why this should be the case, but it's worth a try,
+     and the build-id check should ensure we don't use an invalid file if
+     we happen to find one.  */
+  if (execbfd == nullptr)
+    {
+      const char *base_name = lbasename (exec_name);
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + base_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If the above didn't provide EXECBFD then try the exec_filename from
+     the context.  This will be an absolute filename which the gdbarch code
+     figured out from the core file.  In some cases the gdbarch code might
+     not be able to figure out a suitable absolute filename though.  */
+  if (execbfd == nullptr && ctx.exec_filename () != nullptr)
+    {
+      gdb_assert (IS_ABSOLUTE_PATH (ctx.exec_filename ()));
+
+      /* Try to open a file.  If this succeeds then we still need to
+	 perform a build-id check.  */
+      execbfd = open_and_check_build_id (ctx.exec_filename ());
+    }
+
+  return execbfd;
+}
+
 /* Locate (and load) an executable file (and symbols) given the core file
    BFD ABFD.  */
 
 static void
-locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
+locate_exec_from_corefile_build_id (bfd *abfd,
+				    const core_file_exec_context &ctx,
+				    int from_tty)
 {
   const bfd_build_id *build_id = build_id_bfd_get (abfd);
   if (build_id == nullptr)
     return;
 
-  gdb_bfd_ref_ptr execbfd
-    = find_objfile_by_build_id (build_id, abfd->filename);
+  gdb_bfd_ref_ptr execbfd;
+
+  if (ctx.valid ())
+    execbfd = locate_exec_from_corefile_exec_context (abfd, build_id, ctx);
+
+  if (execbfd == nullptr)
+    execbfd = find_objfile_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
@@ -908,13 +1007,6 @@ core_target_open (const char *arg, int from_tty)
 
   validate_files ();
 
-  /* If we have no exec file, try to set the architecture from the
-     core file.  We don't do this unconditionally since an exec file
-     typically contains more information that helps us determine the
-     architecture than a core file.  */
-  if (!current_program_space->exec_bfd ())
-    set_gdbarch_from_file (current_program_space->core_bfd ());
-
   current_inferior ()->push_target (std::move (target_holder));
 
   switch_to_no_thread ();
@@ -969,9 +1061,31 @@ core_target_open (const char *arg, int from_tty)
       switch_to_thread (thread);
     }
 
+  /* In order to parse the exec context from the core file the current
+     inferior needs to have a suitable gdbarch set.  If an exec file is
+     loaded then the gdbarch will have been set based on the exec file, but
+     if not, ensure we have a suitable gdbarch in place now.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    current_inferior ()->set_arch (target->core_gdbarch ());
+
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+
+  /* If we don't have an executable loaded then see if we can locate one
+     based on the core file.  */
   if (current_program_space->exec_bfd () == nullptr)
     locate_exec_from_corefile_build_id (current_program_space->core_bfd (),
-					from_tty);
+					ctx, from_tty);
+
+  /* If we have no exec file, try to set the architecture from the
+     core file.  We don't do this unconditionally since an exec file
+     typically contains more information that helps us determine the
+     architecture than a core file.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    set_gdbarch_from_file (current_program_space->core_bfd ());
 
   post_create_inferior (from_tty);
 
@@ -989,11 +1103,6 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  /* See if the gdbarch can find the executable name and argument list from
-     the core file.  */
-  core_file_exec_context ctx
-    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
-				       current_program_space->core_bfd ());
   if (ctx.valid ())
     {
       std::string args;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 755e450f8a2..202a778b3f1 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2082,7 +2082,29 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
   if (execfn == nullptr)
     return {};
 
+  /* When the core-file was loaded GDB processed the file backed mappings
+     (from the NT_FILE note).  One of these should have been for the
+     executable.  The AT_EXECFN string might not be an absolute path, but
+     the path in NT_FILE will be absolute, though if AT_EXECFN is a
+     symlink, then the NT_FILE entry will point to the actual file, not the
+     symlink.
+
+     Use the AT_ENTRY address to look for the NT_FILE entry which contains
+     that address, this should be the executable.  */
+  gdb::unique_xmalloc_ptr<char> exec_filename;
+  CORE_ADDR exec_entry_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_ENTRY, &exec_entry_addr) == 1)
+    {
+      std::optional<core_target_mapped_file_info> info
+	= core_target_find_mapped_file (nullptr, exec_entry_addr);
+      if (info.has_value () && !info->filename ().empty ()
+	  && IS_ABSOLUTE_PATH (info->filename ().c_str ()))
+	exec_filename = make_unique_xstrdup (info->filename ().c_str ());
+    }
+
   return core_file_exec_context (std::move (execfn),
+				 std::move (exec_filename),
 				 std::move (arguments),
 				 std::move (environment));
 }
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.c b/gdb/testsuite/gdb.base/corefile-find-exec.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
new file mode 100644
index 00000000000..40324c1f01c
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -0,0 +1,242 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB's ability to auto-load the executable based on the file
+# names extracted from the core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile {debug build-id}] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Load the COREFILE and confirm that GDB auto-loads the executable.
+# The symbols should be read from SYMBOL_FILE and the core file should
+# be reported as generated by GEN_FROM_FILE.
+proc test_load { corefile symbol_file gen_from_file } {
+    clean_restart
+    set saw_generated_line false
+    set saw_reading_symbols false
+
+    gdb_test_multiple "core-file $corefile" "load core file" {
+
+	-re "^Reading symbols from [string_to_regexp $symbol_file]\\.\\.\\.\r\n" {
+	    set saw_reading_symbols true
+	    exp_continue
+	}
+
+	-re "^Core was generated by `[string_to_regexp $gen_from_file]'\\.\r\n" {
+	    set saw_generated_line true
+	    exp_continue
+	}
+
+	-re "^$::gdb_prompt $" {
+	    gdb_assert { $saw_generated_line && $saw_reading_symbols} \
+		$gdb_test_name
+	}
+
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+    }
+}
+
+with_test_prefix "absolute path" {
+    # Generate a core file, this uses an absolute path to the
+    # executable.
+    with_test_prefix "to file" {
+	set corefile [core_find $binfile]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_1 "$binfile.1.core"
+	remote_exec build "mv $corefile $corefile_1"
+
+	test_load $corefile_1 $binfile $binfile
+    }
+
+    # And create a symlink, and repeat the test using an absolute path
+    # to the symlink.
+    with_test_prefix "to symlink" {
+	set symlink_name "symlink_1"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_2 "$binfile.2.core"
+	remote_exec build "mv $corefile $corefile_2"
+
+	test_load $corefile_2 $symlink $symlink
+    }
+
+    # Like the previous test, except this time, delete the symlink
+    # after generating the core file.  GDB should be smart enough to
+    # figure out that we can use the underlying TESTFILE binary.
+    with_test_prefix "to deleted symlink" {
+	set symlink_name "symlink_2"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_3 "$binfile.3.core"
+	remote_exec build "mv $corefile $corefile_3"
+
+	remote_exec build "rm -f $symlink"
+
+	test_load $corefile_3 $binfile $symlink
+    }
+
+    # Generate the core file with an absolute path to the executable,
+    # but move the core file and executable into a single directory
+    # together so GDB can't use the absolute path to find the
+    # executable.
+    #
+    # GDB should still find the executable though, by looking in the
+    # same directory as the core file.
+    with_test_prefix "in side directory" {
+	set binfile_2 [standard_output_file ${testfile}_2]
+	remote_exec build "cp $binfile $binfile_2"
+
+	set corefile [core_find $binfile_2]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_4 "$binfile.4.core"
+	remote_exec build "mv $corefile $corefile_4"
+
+	set side_dir [standard_output_file side_dir]
+	remote_exec build "mkdir -p $side_dir"
+	remote_exec build "mv $binfile_2 $side_dir"
+	remote_exec build "mv $corefile_4 $side_dir"
+
+	set relocated_corefile_4 [file join $side_dir [file tail $corefile_4]]
+	set relocated_binfile_2 [file join $side_dir [file tail $binfile_2]]
+	test_load $relocated_corefile_4 $relocated_binfile_2 $binfile_2
+    }
+}
+
+with_test_prefix "relative path" {
+    # Generate a core file using a relative path.  We need to work
+    # around the core_find proc a little here.  The core_find proc
+    # creates a sub-directory using standard_output_file and runs the
+    # test binary from inside that directory.
+    #
+    # Usually core_find is passed an absolute path, so there's no
+    # problem, but we want to pass a relative path.
+    #
+    # So setup a directory structure like this:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #
+    # Place a copy of BINFILE in 'reldir/' and switch to workdir, use
+    # core_find which will create a sibling directory of workdir, and
+    # run the relative path from there.  We then move the generated
+    # core file back into 'workdir/', this leaves a tree like:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #      <core file here>
+    #
+    # Now we can ask GDB to open the core file, if all goes well GDB
+    # should make use of the relative path encoded in the core file to
+    # locate the executable in 'reldir/'.
+    #
+    # We also setup a symlink in 'reldir' that points to the
+    # executable and repeat the test, but this time executing the
+    # symlink.
+    set reldir_name "reldir"
+    set reldir [standard_output_file $reldir_name]
+    remote_exec build "mkdir -p $reldir"
+
+    set alt_testfile "alt_${testfile}"
+    set binfile_3 "$reldir/${alt_testfile}"
+    remote_exec build "cp $binfile $binfile_3"
+
+    set symlink_2 "symlink_2"
+    with_cwd $reldir {
+	remote_exec build "ln -s ${alt_testfile} ${symlink_2}"
+    }
+
+    set work_dir [standard_output_file "workdir"]
+    remote_exec build "mkdir -p $work_dir"
+
+    set rel_path_to_file "../${reldir_name}/${alt_testfile}"
+    set rel_path_to_symlink_2 "../${reldir_name}/${symlink_2}"
+
+    with_cwd $work_dir {
+	with_test_prefix "to file" {
+	    set corefile [core_find $rel_path_to_file]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_5 "${work_dir}/${testfile}.5.core"
+	    remote_exec build "mv $corefile $corefile_5"
+
+	    test_load $corefile_5 \
+		[file join $work_dir $rel_path_to_file] \
+		$rel_path_to_file
+	}
+
+	with_test_prefix "to symlink" {
+	    set corefile [core_find $rel_path_to_symlink_2]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_6 "${work_dir}/${testfile}.6.core"
+	    remote_exec build "mv $corefile $corefile_6"
+
+	    test_load $corefile_6 \
+		[file join $work_dir $rel_path_to_symlink_2] \
+		$rel_path_to_symlink_2
+	}
+
+	# Move the core file.  Now the relative path doesn't work so
+	# we instead rely on GDB to use information about the mapped
+	# files to help locate the executable.
+	with_test_prefix "with moved corefile" {
+	    set corefile_7 [standard_output_file "${testfile}.7.core"]
+	    remote_exec build "cp $corefile_6 $corefile_7"
+	    test_load $corefile_7 $binfile_3 $rel_path_to_symlink_2
+	}
+    }
+}
-- 
2.25.4


* [PATCHv2 5/5] gdb/freebsd: port core file context parsing to FreeBSD
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                     ` (3 preceding siblings ...)
  2024-10-28 18:53   ` [PATCHv2 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
@ 2024-10-28 18:53   ` Andrew Burgess
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  5 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-28 18:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit implements the gdbarch_core_parse_exec_context method for
FreeBSD.

This is much simpler than for Linux.  On FreeBSD, at least the
version (13.x) that I have installed, there are additional entries in
the auxv vector that point directly to the argument and environment
vectors, which makes it trivial to find this information.
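
To illustrate why a direct pointer to the vector makes this easy, here
is a small self-contained model of walking a NULL terminated vector of
string pointers.  It is only a sketch: memory_image, read_pointer,
read_string and read_string_vector are invented for this example,
target memory is modelled as a flat little-endian byte array, and
pointers are assumed to be at most 8 bytes wide.  The real code reads
through GDB's target memory routines instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of target memory: a flat little-endian byte image
// whose first byte lives at address 0.
using memory_image = std::vector<std::uint8_t>;

// Read a PTR_BYTES wide little-endian pointer at ADDR, or return
// std::nullopt if the read would run past the end of MEM.
std::optional<std::uint64_t>
read_pointer (const memory_image &mem, std::uint64_t addr, unsigned ptr_bytes)
{
  if (addr > mem.size () || mem.size () - addr < ptr_bytes)
    return std::nullopt;
  std::uint64_t value = 0;
  for (unsigned i = 0; i < ptr_bytes; ++i)
    value |= static_cast<std::uint64_t> (mem[addr + i]) << (8 * i);
  return value;
}

// Read a NUL terminated string starting at ADDR.
std::optional<std::string>
read_string (const memory_image &mem, std::uint64_t addr)
{
  std::string out;
  for (; addr < mem.size (); ++addr)
    {
      if (mem[addr] == 0)
        return out;
      out.push_back (static_cast<char> (mem[addr]));
    }
  return std::nullopt;  // Ran off the end before finding the NUL.
}

// Collect the strings referenced by the NULL terminated pointer vector
// at VECTOR_ADDR.  Returns std::nullopt if any read fails.
std::optional<std::vector<std::string>>
read_string_vector (const memory_image &mem, std::uint64_t vector_addr,
                    unsigned ptr_bytes)
{
  std::vector<std::string> result;
  for (;; vector_addr += ptr_bytes)
    {
      std::optional<std::uint64_t> ptr
        = read_pointer (mem, vector_addr, ptr_bytes);
      if (!ptr.has_value ())
        return std::nullopt;
      if (*ptr == 0)
        return result;
      std::optional<std::string> str = read_string (mem, *ptr);
      if (!str.has_value ())
        return std::nullopt;
      result.push_back (std::move (*str));
    }
}
```

Applied to the argument vector, the first string returned plays the
role of the command name and the remainder are the inferior arguments;
the environment vector is walked in exactly the same way.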

If these extra auxv entries are not available on earlier FreeBSD, then
that's fine.  The fallback behaviour will be for GDB to act as it
always has up to this point, you'll just not get the extra
functionality.

Another difference compared to Linux is that FreeBSD has
AT_FREEBSD_EXECPATH instead of AT_EXECFN; AT_FREEBSD_EXECPATH is the
full path to the executable.  On Linux AT_EXECFN is the command the
user typed, so this can be a relative path.

This difference is handy as on FreeBSD we don't parse the mapped files
from the core file (are they even available?).  So having the EXECPATH
means we can use that as the absolute path to the executable.

However, if the user ran a symlink then AT_FREEBSD_EXECPATH will be
the absolute path to the symlink, not to the underlying file.  This is
probably a good thing, but it does mean there is one case we test on
Linux that fails on FreeBSD.

On Linux, suppose we create a symlink to an executable, run the
symlink to generate a core file, then delete the symlink and load the
core file.  GDB will still find (and open) the original executable.
This is because we use the mapped file information to find the
absolute path to the executable, and the mapped file information only
stores the real file names, not symlink names.

This is a total edge case, I only added the deleted symlink test
originally because I could see that this would work on Linux.  Though
it is neat that Linux finds this, I don't feel too bad that this fails
on FreeBSD.

Other than this, everything seems to work on x86-64 FreeBSD (13.4)
which is all I have setup right now.  I don't see why other
architectures wouldn't work too, but I haven't tested them.
---
 gdb/fbsd-tdep.c                               | 134 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.exp        |   2 +-
 gdb/testsuite/gdb.base/corefile-find-exec.exp |  12 +-
 3 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/gdb/fbsd-tdep.c b/gdb/fbsd-tdep.c
index e97ff52d5bf..804a72c4205 100644
--- a/gdb/fbsd-tdep.c
+++ b/gdb/fbsd-tdep.c
@@ -33,6 +33,7 @@
 #include "elf-bfd.h"
 #include "fbsd-tdep.h"
 #include "gcore-elf.h"
+#include "arch-utils.h"
 
 /* This enum is derived from FreeBSD's <sys/signal.h>.  */
 
@@ -2361,6 +2362,137 @@ fbsd_vdso_range (struct gdbarch *gdbarch, struct mem_range *range)
   return range->length != 0;
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from CBFD.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file. The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section * section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Read AT_FREEBSD_ARGV, the address of the argument string vector.  */
+  CORE_ADDR argv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ARGV, &argv_addr) != 1)
+    return {};
+
+  /* Read AT_FREEBSD_ENVV, the address of the environment string vector.  */
+  CORE_ADDR envv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ENVV, &envv_addr) != 1)
+    return {};
+
+  /* Read the AT_FREEBSD_EXECPATH string.  It's OK if we can't get this
+     information.  */
+  gdb::unique_xmalloc_ptr<char> execpath;
+  CORE_ADDR execpath_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_EXECPATH,
+			  &execpath_string_addr) == 1)
+    execpath = target_read_string (execpath_string_addr, INT_MAX);
+
+  /* The byte order.  */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+
+  /* On FreeBSD the command the user ran is found in argv[0].  When we
+     read the first argument we place it into EXECFN.  */
+  gdb::unique_xmalloc_ptr<char> execfn;
+
+  /* Read strings from AT_FREEBSD_ARGV until we find a NULL marker.  The
+     first argument is placed into EXECFN as the command name.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+  CORE_ADDR str_addr;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (argv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      if (execfn == nullptr)
+	execfn = std::move (str);
+      else
+	arguments.emplace_back (std::move (str));
+
+      argv_addr += ptr_bytes;
+    }
+
+  /* Read strings from AT_FREEBSD_ENVV until we find a NULL marker.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (envv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      environment.emplace_back (std::move (str));
+      envv_addr += ptr_bytes;
+    }
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (execpath),
+				 std::move (arguments),
+				 std::move (environment));
+}
+
+/* See elf-corelow.h.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return fbsd_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Return the address range of the vDSO for the current inferior.  */
 
 static int
@@ -2404,4 +2536,6 @@ fbsd_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* `catch syscall' */
   set_xml_syscall_file_name (gdbarch, "syscalls/freebsd.xml");
   set_gdbarch_get_syscall_number (gdbarch, fbsd_get_syscall_number);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       fbsd_corefile_parse_exec_context);
 }
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index ac97754fe71..73e13e60d75 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
index 40324c1f01c..07e660d85e8 100644
--- a/gdb/testsuite/gdb.base/corefile-find-exec.exp
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
@@ -115,6 +115,16 @@ with_test_prefix "absolute path" {
 
 	remote_exec build "rm -f $symlink"
 
+	# FreeBSD is unable to figure out the actual underlying mapped
+	# file, so when the symlink is deleted, FreeBSD is stuck.
+	#
+	# There is some argument that this shouldn't even be a
+	# failure, the user ran the symlink, and if the symlink is
+	# gone, should we really expect GDB to find the underlying
+	# file?  That we can on Linux is really just a quirk of how
+	# the mapped file list works.
+	setup_xfail "*-*-freebsd*"
+
 	test_load $corefile_3 $binfile $symlink
     }
 
-- 
2.25.4


* [PATCHv3 0/5] Better executable auto-loading when opening a core file
  2024-10-28 18:53 ` [PATCHv2 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                     ` (4 preceding siblings ...)
  2024-10-28 18:53   ` [PATCHv2 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
@ 2024-10-29 14:08   ` Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
                       ` (4 more replies)
  5 siblings, 5 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In v3:

  - Linaro CI highlighted some failures on ARM.  Turns out ARM's stack
    setup is different than the other architectures I've tested on.
    I've added some additional logic which should mean ARM is now
    handled OK.

In v2:

  - Fixed an incorrect use of gdb::function_view in patch #1 which was
    causing undefined behaviour, and crashes when GDB was built with
    optimisation.

  - Rebased and retested.

---

There's actually a couple of core file related improvements in this
series.

Patches #1 and #2 improve what information GDB can extract about the
execution context (executable name, inferior arguments, and
environment) when opening a core file.

Then patch #4 improves GDB's ability to auto-load the executable that
matches a core file (on GNU/Linux).

Patch #3 is a testsuite refactor to allow for patch #4.

And patch #5 replicates patch #4, but for FreeBSD.

Thanks,
Andrew

---

Andrew Burgess (5):
  gdb: add gdbarch method to get execution context from core file
  gdb: parse and set the inferior environment from core files
  gdb/testsuite: make some of the core file / build-id tests harder
  gdb: improve GDB's ability to auto-load the exec for a core file
  gdb/freebsd: port core file context parsing to FreeBSD

 gdb/arch-utils.c                              |  26 ++
 gdb/arch-utils.h                              |  89 +++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 | 172 +++++++++-
 gdb/fbsd-tdep.c                               | 134 ++++++++
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 315 ++++++++++++++++++
 gdb/testsuite/gdb.base/coredump-filter.exp    |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp   | 252 ++++++--------
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 165 +++++++++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 252 ++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp           |   9 +
 17 files changed, 1386 insertions(+), 163 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp


base-commit: a723c56efb07c4f8b3f6a3ed4b878a2f8f5572cc
-- 
2.25.4



* [PATCHv3 1/5] gdb: add gdbarch method to get execution context from core file
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
@ 2024-10-29 14:08     ` Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Add a new gdbarch method which can read the execution context from a
core file.  An execution context, for this commit, means the filename
of the executable used to generate the core file and the arguments
passed to the executable.

In later commits this will be extended further to include the
environment in which the executable was run, but this commit is
already pretty big, so I've split that part out into a later commit.

Initially this new gdbarch method is only implemented for Linux
targets, but a later commit will add FreeBSD support too.

Currently when GDB opens a core file, GDB reports the command and
arguments used to generate the core file.  For example:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./gen-core abc def'.

However, this information comes from the psinfo structure in the core
file, and this struct only allows 80 characters for the command and
arguments combined.  If the command and arguments exceed this then
they are truncated.

Additionally, neither the executable nor the arguments are quoted in
the psinfo structure, so if, for example, the executable was named
'aaa bbb' (i.e. contains white space) and was run with the arguments
'ccc' and 'ddd', then when this core file was opened by GDB we'd see:

  (gdb) core-file ./core.521524
  [New LWP 521524]
  Core was generated by `./aaa bbb ccc ddd'.

It is impossible to know if 'bbb' is part of the executable filename,
or another argument.
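
To make the two psinfo limitations concrete, here is a small standalone
sketch.  It is not code from this series or from the kernel;
flatten_like_psinfo is a made-up name, and the default limit of 80
simply mirrors the figure quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: flatten ARGV the way a psinfo-style record
// stores it, i.e. joined by single spaces with no quoting, then cut
// to at most LIMIT characters.
std::string
flatten_like_psinfo (const std::vector<std::string> &argv,
                     std::size_t limit = 80)
{
  assert (!argv.empty ());

  std::string flat;
  bool first = true;
  for (const std::string &arg : argv)
    {
      if (!first)
        flat += ' ';
      flat += arg;
      first = false;
    }

  // Anything beyond LIMIT characters is silently lost.
  if (flat.size () > limit)
    flat.resize (limit);
  return flat;
}
```

Calling this with { "./aaa bbb", "ccc", "ddd" } and with
{ "./aaa", "bbb", "ccc", "ddd" } gives the same string, which is exactly
the ambiguity described above.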

However, the kernel places the executable command onto the user stack;
this is pointed to by the AT_EXECFN entry in the auxv vector.
Additionally, the inferior arguments are all available on the user
stack.  The new gdbarch method added in this commit extracts this
information from the user stack and allows GDB to access it.
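
For readers unfamiliar with the auxv layout, the lookup amounts to a
scan over (type, value) pairs that stops at the AT_NULL terminator.  The
following is only an illustrative sketch over an already decoded vector;
find_auxv_value and auxv_entry are hypothetical names, and GDB itself
does this through target_auxv_search.  The value 31 for AT_EXECFN is the
Linux definition.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Linux auxv types used here; AT_NULL terminates the vector.
constexpr std::uint64_t at_null = 0;
constexpr std::uint64_t at_execfn = 31;

// One decoded auxv entry: (type, value).
using auxv_entry = std::pair<std::uint64_t, std::uint64_t>;

// Hypothetical helper: search AUXV for an entry of type TYPE.  On
// success store the entry's value in *VALUE and return true.  Entries
// after the AT_NULL terminator are ignored.
bool
find_auxv_value (const std::vector<auxv_entry> &auxv, std::uint64_t type,
                 std::uint64_t *value)
{
  assert (value != nullptr);

  for (const auxv_entry &entry : auxv)
    {
      if (entry.first == at_null)
        break;
      if (entry.first == type)
        {
          *value = entry.second;
          return true;
        }
    }
  return false;
}
```

For AT_EXECFN the value found this way is the address of the executable
name string on the user stack.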

The information on the stack is writable by the user, so a user
application can start up, edit the arguments, override the AT_EXECFN
string, and then dump core.  In this case GDB will report incorrect
information.  However, it is worth noting that the psinfo structure is
also filled (by the kernel) by just copying information from the user
stack, so if the user edits the on-stack arguments, the values
reported in psinfo will change too.  The new approach is therefore no
worse than what we currently have.

The benefit of this approach is that GDB gets to report the full
executable name and all the arguments without the 80 character limit,
and GDB is aware which parts are the executable name, and which parts
are arguments, so we can, for example, style the executable name.

Another benefit is that, now we know all the arguments, we can poke
these into the inferior object.  This means that after loading a core
file a user can 'show args' to see the arguments used.  A user could
even transition from core file debugging to live inferior debugging
using, e.g. 'run', and GDB would restart the inferior with the correct
arguments.

Now the downside: finding the AT_EXECFN string is easy, the auxv entry
points directly to it.  However, finding the arguments is a little
trickier.  There's currently no easy way to get a direct pointer to
the arguments.  Instead, I've got a heuristic which I believe should
find the arguments in most cases.  The algorithm is laid out in
linux-tdep.c, I'll not repeat it here, but it's basically a search of
the user stack, starting from AT_EXECFN.
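
As a rough picture of the final part of that search, the sketch below
walks a simulated stack, held as a vector of words with the lowest
address first.  It assumes the start of the in-memory auxv table has
already been located, and it ignores the ARM case where argc has been
replaced.  find_argc_index is a hypothetical name; the real code in
linux-tdep.c reads target memory and does more validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: STACK models the initial process stack, lowest
// address first, laid out as argc, argv pointers, null, envp pointers,
// null, auxv.  AUXV_START is the index of the first auxv word.  On
// success store the index of the argc word in *ARGC_INDEX.
bool
find_argc_index (const std::vector<std::uint64_t> &stack,
                 std::size_t auxv_start, std::size_t *argc_index)
{
  assert (argc_index != nullptr);

  // The word just below the auxv table must be the envp terminator.
  if (auxv_start == 0 || auxv_start > stack.size ()
      || stack[auxv_start - 1] != 0)
    return false;

  // Walk down over the envp pointers to the null that ends argv.
  std::size_t i = auxv_start - 1;
  do
    {
      if (i == 0)
        return false;
      --i;
    }
  while (stack[i] != 0);

  // Walk down over the argv pointers, counting them, until a word
  // holds the number of pointers seen so far: that word is argc.
  std::uint64_t count = 0;
  while (i > 0)
    {
      --i;
      if (stack[i] == 0)
        return false;
      if (stack[i] == count)
        {
          *argc_index = i;
          return true;
        }
      ++count;
    }
  return false;
}
```

Like the real heuristic, this gives up on an empty argument list, since
a zero argc cannot be told apart from a null terminator.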

If the new heuristic fails then GDB just falls back to the old
approach, asking bfd to read the psinfo structure for us, which gives
the old 80 character limited answer.
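
The selection between the two sources can be pictured as below.  This is
a simplified stand-in, not the corelow.c change itself: exec_context and
generated_by_message are hypothetical names, and plain std::string is
used in place of GDB's own string and styling helpers.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parsed context.  A default constructed
// object is invalid, meaning the heuristic failed.
struct exec_context
{
  exec_context () = default;

  exec_context (std::string execfn, std::vector<std::string> args)
    : m_execfn (std::move (execfn)), m_args (std::move (args))
  { assert (!m_execfn.empty ()); }

  bool valid () const { return !m_execfn.empty (); }

  std::string m_execfn;
  std::vector<std::string> m_args;
};

// Build the "Core was generated by" line.  Prefer CTX when it is valid,
// otherwise fall back to PSINFO_COMMAND, which may be null.  Return an
// empty string if neither source is available.
std::string
generated_by_message (const exec_context &ctx, const char *psinfo_command)
{
  std::string command;
  if (ctx.valid ())
    {
      command = ctx.m_execfn;
      for (const std::string &arg : ctx.m_args)
        command += " " + arg;
    }
  else if (psinfo_command != nullptr)
    command = psinfo_command;
  else
    return "";

  return "Core was generated by `" + command + "'.";
}
```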

For testing, I've run this series on (all GNU/Linux) x86-64, s390,
ppc64le, and the new test passes in each case.  I've done some very
basic testing on ARM, which does things a little differently from the
other architectures mentioned; see the ARM-specific notes in
linux_corefile_parse_exec_context_1 for details.
---
 gdb/arch-utils.h                              |  57 ++++
 gdb/corefile.c                                |  10 +
 gdb/corelow.c                                 |  38 ++-
 gdb/gdbarch-gen.c                             |  22 ++
 gdb/gdbarch-gen.h                             |  15 +
 gdb/gdbarch.h                                 |   1 +
 gdb/gdbarch_components.py                     |  20 ++
 gdb/linux-tdep.c                              | 293 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.c          |  25 ++
 .../gdb.base/corefile-exec-context.exp        | 102 ++++++
 10 files changed, 579 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-exec-context.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 40c62f30a65..8d9f1625bdd 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -74,6 +74,58 @@ struct bp_manipulation_endian
   bp_manipulation_endian<sizeof (BREAK_INSN_LITTLE),		  \
   BREAK_INSN_LITTLE, BREAK_INSN_BIG>
 
+/* Structure returned from gdbarch core_parse_exec_context method.  Wraps
+   the execfn string and a vector containing the inferior arguments.  If a
+   gdbarch is unable to parse this information then an empty structure is
+   returned, check the execfn as an indication, if this is nullptr then no
+   other fields should be considered valid.  */
+
+struct core_file_exec_context
+{
+  /* Constructor, just move everything into place.  The EXEC_NAME should
+     never be nullptr.  Only call this constructor if all the arguments
+     have been collected successfully, i.e. if the EXEC_NAME could be
+     found but not ARGV then use the no-argument constructor to create an
+     empty context object.  */
+  core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+    : m_exec_name (std::move (exec_name)),
+      m_arguments (std::move (argv))
+  {
+    gdb_assert (m_exec_name != nullptr);
+  }
+
+  /* Create a default context object.  In its default state a context
+     object holds no useful information, and will return false from its
+     valid() method.  */
+  core_file_exec_context () = default;
+
+  /* Return true if this object contains valid context information.  */
+  bool valid () const
+  { return m_exec_name != nullptr; }
+
+  /* Return the execfn string (executable name) as extracted from the core
+     file.  Will always return non-nullptr if valid() returns true.  */
+  const char *execfn () const
+  { return m_exec_name.get (); }
+
+  /* Return the vector of inferior arguments as extracted from the core
+     file.  This does not include argv[0] (the executable name) for that
+     see the execfn() function.  */
+  const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
+  { return m_arguments; }
+
+private:
+
+  /* The executable filename as reported in the core file.  Can be nullptr
+     if no executable name is found.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_name;
+
+  /* List of arguments.  Doesn't include argv[0] which is the executable
+     name, for this look at m_exec_name field.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+};
+
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
 extern bool default_displaced_step_hw_singlestep (struct gdbarch *);
 
@@ -305,6 +357,11 @@ extern void default_read_core_file_mappings
    read_core_file_mappings_pre_loop_ftype pre_loop_cb,
    read_core_file_mappings_loop_ftype loop_cb);
 
+/* Default implementation of gdbarch_core_parse_exec_context.  Returns
+   an empty core_file_exec_context.  */
+extern core_file_exec_context default_core_parse_exec_context
+  (struct gdbarch *gdbarch, bfd *cbfd);
+
 /* Default implementation of gdbarch
    use_target_description_from_corefile_notes.  */
 extern bool default_use_target_description_from_corefile_notes
diff --git a/gdb/corefile.c b/gdb/corefile.c
index f6ec3cd5ca1..c3089e4516e 100644
--- a/gdb/corefile.c
+++ b/gdb/corefile.c
@@ -35,6 +35,7 @@
 #include "cli/cli-utils.h"
 #include "gdbarch.h"
 #include "interps.h"
+#include "arch-utils.h"
 
 void
 reopen_exec_file (void)
@@ -76,6 +77,15 @@ validate_files (void)
     }
 }
 
+/* See arch-utils.h.  */
+
+core_file_exec_context
+default_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  return {};
+}
+\f
+
 std::string
 memory_error_message (enum target_xfer_status err,
 		      struct gdbarch *gdbarch, CORE_ADDR memaddr)
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5820ffed332..5cc11d71b7b 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -854,7 +854,6 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
 void
 core_target_open (const char *arg, int from_tty)
 {
-  const char *p;
   int siggy;
   int scratch_chan;
   int flags;
@@ -990,9 +989,40 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  p = bfd_core_file_failing_command (current_program_space->core_bfd ());
-  if (p)
-    gdb_printf (_("Core was generated by `%s'.\n"), p);
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+  if (ctx.valid ())
+    {
+      std::string args;
+      for (const auto &a : ctx.args ())
+	{
+	  args += ' ';
+	  args += a.get ();
+	}
+
+      gdb_printf (_("Core was generated by `%ps%s'.\n"),
+		  styled_string (file_name_style.style (),
+				 ctx.execfn ()),
+		  args.c_str ());
+
+      /* Copy the arguments into the inferior.  */
+      std::vector<char *> argv;
+      for (const auto &a : ctx.args ())
+	argv.push_back (a.get ());
+      gdb::array_view<char * const> view (argv.data (), argv.size ());
+      current_inferior ()->set_args (view);
+    }
+  else
+    {
+      gdb::unique_xmalloc_ptr<char> failing_command = make_unique_xstrdup
+	(bfd_core_file_failing_command (current_program_space->core_bfd ()));
+      if (failing_command != nullptr)
+	gdb_printf (_("Core was generated by `%s'.\n"),
+		    failing_command.get ());
+    }
 
   /* Clearing any previous state of convenience variables.  */
   clear_exit_convenience_vars ();
diff --git a/gdb/gdbarch-gen.c b/gdb/gdbarch-gen.c
index 0d00cd7c993..6f41ce9d233 100644
--- a/gdb/gdbarch-gen.c
+++ b/gdb/gdbarch-gen.c
@@ -258,6 +258,7 @@ struct gdbarch
   gdbarch_get_pc_address_flags_ftype *get_pc_address_flags = default_get_pc_address_flags;
   gdbarch_read_core_file_mappings_ftype *read_core_file_mappings = default_read_core_file_mappings;
   gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes = default_use_target_description_from_corefile_notes;
+  gdbarch_core_parse_exec_context_ftype *core_parse_exec_context = default_core_parse_exec_context;
 };
 
 /* Create a new ``struct gdbarch'' based on information provided by
@@ -527,6 +528,7 @@ verify_gdbarch (struct gdbarch *gdbarch)
   /* Skip verify of get_pc_address_flags, invalid_p == 0.  */
   /* Skip verify of read_core_file_mappings, invalid_p == 0.  */
   /* Skip verify of use_target_description_from_corefile_notes, invalid_p == 0.  */
+  /* Skip verify of core_parse_exec_context, invalid_p == 0.  */
   if (!log.empty ())
     internal_error (_("verify_gdbarch: the following are invalid ...%s"),
 		    log.c_str ());
@@ -1386,6 +1388,9 @@ gdbarch_dump (struct gdbarch *gdbarch, struct ui_file *file)
   gdb_printf (file,
 	      "gdbarch_dump: use_target_description_from_corefile_notes = <%s>\n",
 	      host_address_to_string (gdbarch->use_target_description_from_corefile_notes));
+  gdb_printf (file,
+	      "gdbarch_dump: core_parse_exec_context = <%s>\n",
+	      host_address_to_string (gdbarch->core_parse_exec_context));
   if (gdbarch->dump_tdep != NULL)
     gdbarch->dump_tdep (gdbarch, file);
 }
@@ -5463,3 +5468,20 @@ set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch,
 {
   gdbarch->use_target_description_from_corefile_notes = use_target_description_from_corefile_notes;
 }
+
+core_file_exec_context
+gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != NULL);
+  gdb_assert (gdbarch->core_parse_exec_context != NULL);
+  if (gdbarch_debug >= 2)
+    gdb_printf (gdb_stdlog, "gdbarch_core_parse_exec_context called\n");
+  return gdbarch->core_parse_exec_context (gdbarch, cbfd);
+}
+
+void
+set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch,
+				     gdbarch_core_parse_exec_context_ftype core_parse_exec_context)
+{
+  gdbarch->core_parse_exec_context = core_parse_exec_context;
+}
diff --git a/gdb/gdbarch-gen.h b/gdb/gdbarch-gen.h
index b982fd7cd09..29c5ad705f9 100644
--- a/gdb/gdbarch-gen.h
+++ b/gdb/gdbarch-gen.h
@@ -1751,3 +1751,18 @@ extern void set_gdbarch_read_core_file_mappings (struct gdbarch *gdbarch, gdbarc
 typedef bool (gdbarch_use_target_description_from_corefile_notes_ftype) (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern bool gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, struct bfd *corefile_bfd);
 extern void set_gdbarch_use_target_description_from_corefile_notes (struct gdbarch *gdbarch, gdbarch_use_target_description_from_corefile_notes_ftype *use_target_description_from_corefile_notes);
+
+/* Examine the core file bfd object CBFD and try to extract the name of
+   the current executable and the argument list, which are returned in a
+   core_file_exec_context object.
+
+   If for any reason the details can't be extracted from CBFD then an
+   empty context is returned.
+
+   It is required that the current inferior be the one associated with
+   CBFD, strings are read from the current inferior using target methods
+   which all assume current_inferior() is the one to read from. */
+
+typedef core_file_exec_context (gdbarch_core_parse_exec_context_ftype) (struct gdbarch *gdbarch, bfd *cbfd);
+extern core_file_exec_context gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd);
+extern void set_gdbarch_core_parse_exec_context (struct gdbarch *gdbarch, gdbarch_core_parse_exec_context_ftype *core_parse_exec_context);
diff --git a/gdb/gdbarch.h b/gdb/gdbarch.h
index 60a0f60df39..8359ae762de 100644
--- a/gdb/gdbarch.h
+++ b/gdb/gdbarch.h
@@ -59,6 +59,7 @@ struct ui_out;
 struct inferior;
 struct x86_xsave_layout;
 struct solib_ops;
+struct core_file_exec_context;
 
 #include "regcache.h"
 
diff --git a/gdb/gdbarch_components.py b/gdb/gdbarch_components.py
index 4006380076d..7a218605d89 100644
--- a/gdb/gdbarch_components.py
+++ b/gdb/gdbarch_components.py
@@ -2778,3 +2778,23 @@ The corefile's bfd is passed through COREFILE_BFD.
     predefault="default_use_target_description_from_corefile_notes",
     invalid=False,
 )
+
+Method(
+    comment="""
+Examine the core file bfd object CBFD and try to extract the name of
+the current executable and the argument list, which are returned in a
+core_file_exec_context object.
+
+If for any reason the details can't be extracted from CBFD then an
+empty context is returned.
+
+It is required that the current inferior be the one associated with
+CBFD, strings are read from the current inferior using target methods
+which all assume current_inferior() is the one to read from.
+""",
+    type="core_file_exec_context",
+    name="core_parse_exec_context",
+    params=[("bfd *", "cbfd")],
+    predefault="default_core_parse_exec_context",
+    invalid=False,
+)
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 65ec221ef48..0c81bd72de8 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -1835,6 +1835,297 @@ linux_corefile_thread (struct thread_info *info,
     }
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  /* This function (currently) assumes the stack grows down.  If this is
+     not the case then this function isn't going to help.  */
+  if (!gdbarch_stack_grows_down (gdbarch))
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file. The BFD library creates this
+     for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section * section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Parse the .auxv section looking for the AT_EXECFN attribute.  The
+     value of this attribute is a pointer to a string, the string is the
+     executable command.  Additionally, this string is placed at the top of
+     the program stack, and so will be in the same PT_LOAD segment as the
+     argv and envp arrays.  We can use this to try and locate these arrays.
+     If we can't find the AT_EXECFN attribute then we're not going to be
+     able to do anything else here.  */
+  CORE_ADDR execfn_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_EXECFN, &execfn_string_addr) != 1)
+    return {};
+
+  /* Read in the program headers from CBFD.  If we can't do this for any
+     reason then just give up.  */
+  long phdrs_size = bfd_get_elf_phdr_upper_bound (cbfd);
+  if (phdrs_size == -1)
+    return {};
+  gdb::unique_xmalloc_ptr<Elf_Internal_Phdr>
+    phdrs ((Elf_Internal_Phdr *) xmalloc (phdrs_size));
+  int num_phdrs = bfd_get_elf_phdrs (cbfd, phdrs.get ());
+  if (num_phdrs == -1)
+    return {};
+
+  /* Now scan through the headers looking for the one which contains the
+     address held in EXECFN_STRING_ADDR, this is the address of the
+     executable command pointed to by the AT_EXECFN auxv entry.  */
+  Elf_Internal_Phdr *hdr = nullptr;
+  for (int i = 0; i < num_phdrs; i++)
+    {
+      /* The program header that contains the address EXECFN_STRING_ADDR
+	 should be one where all content is contained within CBFD, hence
+	 the check that the file size matches the memory size.  */
+      if (phdrs.get ()[i].p_type == PT_LOAD
+	  && phdrs.get ()[i].p_vaddr <= execfn_string_addr
+	  && (phdrs.get ()[i].p_vaddr
+	      + phdrs.get ()[i].p_memsz) > execfn_string_addr
+	  && phdrs.get ()[i].p_memsz == phdrs.get ()[i].p_filesz)
+	{
+	  hdr = &phdrs.get ()[i];
+	  break;
+	}
+    }
+
+  /* If we failed to find a suitable program header then give up.  */
+  if (hdr == nullptr)
+    return {};
+
+  /* As we assume the stack grows down (see early check in this function)
+     we know that the information we are looking for sits somewhere between
+     EXECFN_STRING_ADDR and the segment's virtual address.  These define
+     the HIGH and LOW addresses between which we are going to search.  */
+  CORE_ADDR low = hdr->p_vaddr;
+  CORE_ADDR high = execfn_string_addr;
+
+  /* This PTR is going to be the address we are currently accessing.  */
+  CORE_ADDR ptr = align_down (high, ptr_bytes);
+
+  /* Set up DEREF, a helper function which loads a value from an address.
+     The returned value is always placed into a uint64_t, even if we only
+     load 4-bytes, this allows the code below to be pretty generic.  All
+     the values we're dealing with are unsigned, so this should be OK.  */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+  const auto deref = [=] (CORE_ADDR p) -> uint64_t
+    {
+      ULONGEST value = read_memory_unsigned_integer (p, ptr_bytes, byte_order);
+      return (uint64_t) value;
+    };
+
+  /* Now search down through memory looking for a PTR_BYTES sized object
+     which contains the value EXECFN_STRING_ADDR.  The hope is that this
+     will be the AT_EXECFN entry in the auxv table.  There is no guarantee
+     that we'll find the auxv table this way, but we will do our best to
+     validate that what we find is the auxv table, see below.  */
+  while (ptr > low)
+    {
+      if (deref (ptr) == execfn_string_addr
+	  && (ptr - ptr_bytes) > low
+	  && deref (ptr - ptr_bytes) == AT_EXECFN)
+	break;
+
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* Assuming that we are looking at a value field in the auxv table, move
+     forward PTR_BYTES bytes so we are now looking at the next key field in
+     the auxv table, then scan forward until we find the null entry which
+     will be the last entry in the auxv table.  */
+  ptr += ptr_bytes;
+  while ((ptr + (2 * ptr_bytes)) < high
+	 && (deref (ptr) != 0 || deref (ptr + ptr_bytes) != 0))
+    ptr += (2 * ptr_bytes);
+
+  /* PTR now points to the null entry in the auxv table, or we think it
+     does.  Now we want to find the start of the auxv table.  There's no
+     in-memory pattern we can search for at the start of the table, but
+     we can find the start based on the size of the .auxv section within
+     the core file CBFD object.  In the actual core file the auxv is held
+     in a note, but the bfd library makes this into a section for us.
+
+     The addition of (2 * PTR_BYTES) here is because PTR is pointing at the
+     null entry, but the null entry is also included in CONTENTS.  */
+  ptr = ptr + (2 * ptr_bytes) - contents.size ();
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR should now be pointing to the start of the auxv table mapped into
+     the inferior memory.  As we got here using a heuristic then let's
+     compare an auxv table sized block of inferior memory, if this matches
+     then it's not a guarantee that we are in the right place, but it does
+     make it more likely.  */
+  gdb::byte_vector target_contents (size);
+  if (target_read_memory (ptr, target_contents.data (), size) != 0)
+    memory_error (TARGET_XFER_E_IO, ptr);
+  if (memcmp (contents.data (), target_contents.data (), size) != 0)
+    return {};
+
+  /* We have reasonable confidence that PTR points to the start of the auxv
+     table.  Below this should be the null terminated list of pointers to
+     environment strings, and below that the null terminated list of
+     pointers to argument strings.  After that we should find the
+     argument count.  First, check for the null at the end of the
+     environment list.  */
+  if (deref (ptr - ptr_bytes) != 0)
+    return {};
+
+  ptr -= (2 * ptr_bytes);
+  while (ptr > low && deref (ptr) != 0)
+    ptr -= ptr_bytes;
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing to the null entry at the end of the argument
+     string pointer list.  We now want to scan backward to find the entire
+     argument list.  There's no handy null marker that we can look for
+     here, instead, as we scan backward we look for the argument count
+     (argc) value which appears immediately before the argument list.
+
+     Technically, we could have zero arguments, so the argument count would
+     be zero, however, we don't support this case.  If we find a null entry
+     in the argument list before we find the argument count then we just
+     bail out.
+
+     Start by moving to the last argument string pointer, we expect this
+     to be non-null.  */
+  ptr -= ptr_bytes;
+  uint64_t argc = 0;
+  while (ptr > low)
+    {
+      uint64_t val = deref (ptr);
+      if (val == 0)
+	return {};
+
+      if (val == argc)
+	break;
+
+      /* For GNU/Linux on ARM, glibc removes argc from the stack and
+	 replaces it with the "stack-limit".  This actually means a pointer
+	 to the first argument string.  This is unfortunate, but we can
+	 still detect this case.  */
+      if (val == (ptr + ptr_bytes))
+	break;
+
+      argc++;
+      ptr -= ptr_bytes;
+    }
+
+  /* If we reached the lower bound then we failed -- bail out.  */
+  if (ptr <= low)
+    return {};
+
+  /* PTR is now pointing at the argument count value (or where the argument
+     count should be, see notes on ARM above).  Move it forward so we're
+     pointing at the first actual argument string pointer.  */
+  ptr += ptr_bytes;
+
+  /* We can now parse all of the argument strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+
+  /* Skip the first argument.  This is the executable command, but we'll
+     load that separately later.  */
+  ptr += ptr_bytes;
+
+  uint64_t v;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      arguments.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  /* Skip the null-pointer at the end of the argument list.  We will now
+     be pointing at the first environment string.  */
+  ptr += ptr_bytes;
+
+  /* Parse the environment strings.  Nothing is done with this yet, but
+     will be in a later commit.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((v = deref (ptr)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str = target_read_string (v, INT_MAX);
+      if (str == nullptr)
+	return {};
+      environment.emplace_back (std::move (str));
+      ptr += ptr_bytes;
+    }
+
+  gdb::unique_xmalloc_ptr<char> execfn
+    = target_read_string (execfn_string_addr, INT_MAX);
+  if (execfn == nullptr)
+    return {};
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (arguments));
+}
+
+/* Parse and return execution context details from core file CBFD.  */
+
+static core_file_exec_context
+linux_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return linux_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Fill the PRPSINFO structure with information about the process being
    debugged.  Returns 1 in case of success, 0 for failures.  Please note that
    even if the structure cannot be entirely filled (e.g., GDB was unable to
@@ -2785,6 +3076,8 @@ linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch,
   set_gdbarch_infcall_mmap (gdbarch, linux_infcall_mmap);
   set_gdbarch_infcall_munmap (gdbarch, linux_infcall_munmap);
   set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       linux_corefile_parse_exec_context);
 }
 
 void _initialize_linux_tdep ();
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.c b/gdb/testsuite/gdb.base/corefile-exec-context.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
new file mode 100644
index 00000000000..b18a8104779
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -0,0 +1,102 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB can handle reading the full executable name and argument
+# list from a core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Linux core files can encode up to 80 characters for the command and
+# arguments in the psinfo.  If BINFILE is less than 80 characters in
+# length then let's try to make it longer.
+set binfile_len [string length $binfile]
+if { $binfile_len <= 80 } {
+    set extra_len [expr 80 - $binfile_len + 1]
+    set extra_str [string repeat "x" $extra_len]
+    set new_binfile $binfile$extra_str
+    remote_exec build "mv $binfile $new_binfile"
+    set binfile $new_binfile
+}
+
+# Generate a core file, this time the inferior has no additional
+# arguments.
+set corefile [core_find $binfile {}]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_1 "$binfile.1.core"
+remote_exec build "mv $corefile $corefile_1"
+
+# Load the core file and confirm that the full executable name is
+# seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_1" "load core file no args" {
+    -re "^Core was generated by `[string_to_regexp $binfile]'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Generate a core file, this time pass some arguments to the inferior.
+set args "aaaaa bbbbb ccccc ddddd eeeee"
+set corefile [core_find $binfile {} $args]
+if {$corefile == ""} {
+    untested "unable to create corefile"
+    return 0
+}
+set corefile_2 "$binfile.2.core"
+remote_exec build "mv $corefile $corefile_2"
+
+# Load the core file and confirm that the full executable name and
+# argument list are seen.
+clean_restart $binfile
+set saw_generated_line false
+gdb_test_multiple "core-file $corefile_2" "load core file with args" {
+    -re "^Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n" {
+	set saw_generated_line true
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+	gdb_assert { $saw_generated_line } $gdb_test_name
+    }
+
+    -re "^\[^\r\n\]*\r\n" {
+	exp_continue
+    }
+}
+
+# Also, the argument list should be available through 'show args'.
+gdb_test "show args" \
+    "Argument list to give program being debugged when it is started is \"$args\"\\."
-- 
2.25.4



* [PATCHv3 2/5] gdb: parse and set the inferior environment from core files
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
@ 2024-10-29 14:08     ` Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Extend the core file context parsing mechanism added in the previous
commit to also store the environment parsed from the core file.

This environment can then be injected into the inferior object.

The benefit of this is that when examining a core file in GDB, the
'show environment' command will now show the environment extracted
from a core file.

Consider this example:

  $ env -i GDB_TEST_VAR=FOO ./gen-core
  Segmentation fault (core dumped)
  $ gdb -c ./core.1669829
  ...
  [New LWP 1669829]
  Core was generated by `./gen-core'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x0000000000401111 in ?? ()
  (gdb) show environment
  GDB_TEST_VAR=FOO
  (gdb)

There's a new test for this functionality.
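
As a rough illustration of the splitting that the environment strings
go through, here is a minimal standalone sketch.  It is not the code
from this patch: parse_environment is a made-up name, and a std::map
stands in for gdb_environ.  Each entry is split at the first '=', and
entries without an '=' are skipped.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper, not GDB's API: split "NAME=value" strings at the
// first '=' and collect the pairs into a map.  Entries with no '=' are
// skipped.
std::map<std::string, std::string>
parse_environment (const std::vector<std::string> &entries)
{
  std::map<std::string, std::string> env;

  for (const std::string &entry : entries)
    {
      std::string::size_type eq = entry.find ('=');

      // No '=' character, so this is not a usable NAME=value pair.
      if (eq == std::string::npos)
        continue;

      // Everything after the first '=' is the value, even if the value
      // itself contains further '=' characters.
      env[entry.substr (0, eq)] = entry.substr (eq + 1);
    }

  return env;
}
```

With the example above, the single entry "GDB_TEST_VAR=FOO" would yield
the name GDB_TEST_VAR with the value FOO.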
---
 gdb/arch-utils.c                              | 26 ++++++++
 gdb/arch-utils.h                              | 13 +++-
 gdb/corelow.c                                 |  3 +
 gdb/linux-tdep.c                              |  6 +-
 .../gdb.base/corefile-exec-context.exp        | 63 +++++++++++++++++++
 5 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/gdb/arch-utils.c b/gdb/arch-utils.c
index 6ffa4109765..567dc87d9dd 100644
--- a/gdb/arch-utils.c
+++ b/gdb/arch-utils.c
@@ -1499,6 +1499,32 @@ gdbarch_initialized_p (gdbarch *arch)
   return arch->initialized_p;
 }
 
+/* See arch-utils.h.  */
+
+gdb_environ
+core_file_exec_context::environment () const
+{
+  gdb_environ e;
+
+  for (const auto &entry : m_environment)
+    {
+      char *eq = strchr (entry.get (), '=');
+
+      /* If there's no '=' character, then skip this entry.  */
+      if (eq == nullptr)
+	continue;
+
+      const char *value = eq + 1;
+      const char *var = entry.get ();
+
+      *eq = '\0';
+      e.set (var, value);
+      *eq = '=';
+    }
+
+  return e;
+}
+
 void _initialize_gdbarch_utils ();
 void
 _initialize_gdbarch_utils ()
diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 8d9f1625bdd..1c33bfb4704 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -21,6 +21,7 @@
 #define ARCH_UTILS_H
 
 #include "gdbarch.h"
+#include "gdbsupport/environ.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -88,9 +89,11 @@ struct core_file_exec_context
      found but not ARGV then use the no-argument constructor to create an
      empty context object.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
-			  std::vector<gdb::unique_xmalloc_ptr<char>> argv)
+			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
+			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
-      m_arguments (std::move (argv))
+      m_arguments (std::move (argv)),
+      m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
   }
@@ -115,6 +118,9 @@ struct core_file_exec_context
   const std::vector<gdb::unique_xmalloc_ptr<char>> &args () const
   { return m_arguments; }
 
+  /* Return the environment variables from this context.  */
+  gdb_environ environment () const;
+
 private:
 
   /* The executable filename as reported in the core file.  Can be nullptr
@@ -124,6 +130,9 @@ struct core_file_exec_context
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
+
+  /* List of environment strings.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_environment;
 };
 
 /* Default implementation of gdbarch_displaced_hw_singlestep.  */
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 5cc11d71b7b..a0129f84b1c 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -1014,6 +1014,9 @@ core_target_open (const char *arg, int from_tty)
 	argv.push_back (a.get ());
       gdb::array_view<char * const> view (argv.data (), argv.size ());
       current_inferior ()->set_args (view);
+
+      /* And now copy the environment.  */
+      current_inferior ()->environment = ctx.environment ();
     }
   else
     {
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index 0c81bd72de8..e89bda9af13 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2074,8 +2074,7 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
      be pointing at the first environment string.  */
   ptr += ptr_bytes;
 
-  /* Parse the environment strings.  Nothing is done with this yet, but
-     will be in a later commit.  */
+  /* Parse the environment strings.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> environment;
   while ((v = deref (ptr)) != 0)
     {
@@ -2092,7 +2091,8 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
     return {};
 
   return core_file_exec_context (std::move (execfn),
-				 std::move (arguments));
+				 std::move (arguments),
+				 std::move (environment));
 }
 
 /* Parse and return execution context details from core file CBFD.  */
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index b18a8104779..ac97754fe71 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -100,3 +100,66 @@ gdb_test_multiple "core-file $corefile_2" "load core file with args" {
 # Also, the argument list should be available through 'show args'.
 gdb_test "show args" \
     "Argument list to give program being debugged when it is started is \"$args\"\\."
+
+# Find the name of an environment variable that is not set.
+set env_var_base "GDB_TEST_ENV_VAR_"
+set env_var_name ""
+
+for { set i 0 } { $i < 10 } { incr i } {
+    set tmp_name ${env_var_base}${i}
+    if { ! [info exists ::env($tmp_name)] } {
+	set env_var_name $tmp_name
+	break
+    }
+}
+
+if { $env_var_name eq "" } {
+    unsupported "couldn't find suitable environment variable name"
+    return -1
+}
+
+# Generate a core file with this environment variable set.
+set env_var_value "TEST VALUE"
+save_vars { ::env($env_var_name) } {
+    setenv $env_var_name $env_var_value
+
+    set corefile [core_find $binfile {} $args]
+    if {$corefile == ""} {
+	untested "unable to create corefile"
+	return 0
+    }
+}
+set corefile_3 "$binfile.3.core"
+remote_exec build "mv $corefile $corefile_3"
+
+# Restart, load the core file, and check the environment variable
+# shows up.
+clean_restart $binfile
+
+# Check for environment variable VAR_NAME in the environment, its
+# value should be VAR_VALUE.
+proc check_for_env_var { var_name var_value } {
+    set saw_var false
+    gdb_test_multiple "show environment" "" {
+	-re "^$var_name=$var_value\r\n" {
+	    set saw_var true
+	    exp_continue
+	}
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+	-re "^$::gdb_prompt $" {
+	}
+    }
+    return $saw_var
+}
+
+gdb_assert { ![check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is not set before core file load"
+
+gdb_test "core-file $corefile_3" \
+    "Core was generated by `[string_to_regexp $binfile] $args'\\.\r\n.*" \
+    "load core file for environment test"
+
+gdb_assert { [check_for_env_var $env_var_name $env_var_value] } \
+    "environment variable is set after core file load"
-- 
2.25.4



* [PATCHv3 3/5] gdb/testsuite: make some of the core file / build-id tests harder
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 1/5] gdb: add gdbarch method to get execution context from " Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 2/5] gdb: parse and set the inferior environment from core files Andrew Burgess
@ 2024-10-29 14:08     ` Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
  4 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

We have a few tests that load core files, which depend on GDB not
auto-loading the executable that matches the core file.  One of these
tests (corefile-buildid.exp) exercises GDB's ability to load the
executable via the build-id links in the debug directory, while the
other two tests are just written assuming that GDB hasn't auto-loaded
the executable.

In the next commit, GDB is going to get better at finding the
executable for a core file, and as a consequence these tests could
start to fail if the testsuite is being run using a compiler that adds
build-ids by default, and is on a target (currently only Linux) with
the improved executable auto-loading.

To avoid these test failures, this commit updates some of the tests.

coredump-filter.exp and corefile.exp are updated to unload the
executable should it be auto-loaded.  This means that the following
output from GDB will match the expected patterns.  If the executable
wasn't auto-loaded then the new step to unload is harmless.

The corefile-buildid.exp test needed some more significant changes.
For this test it is important that the executable be moved aside so
that GDB can't locate it, but we do still need the executable around
somewhere, so that the debug directory can link to it.  The point of
the test is that the executable _should_ be auto-loaded, but using the
debug directory, not using GDB's context parsing logic.

While looking at this test I noticed two additional problems.  First,
we were creating the core file more times than we needed.  We only
need to create one core file for each test binary (two in total),
while we previously created one core file for each style of debug info
directory (four in total).  The extra core files should be identical,
and were just overwriting each other; this was harmless, but still
pointless work.

The other problem is that, after running an earlier test, we modified
the test binary in order to run a later test.  This means it's not
possible to manually re-run the first test, as the binary for that
test has been destroyed.

As part of the rewrite in this commit I've addressed these issues.

This commit does change many of the test names, but there should be no
real changes in what is being tested after this commit.  However, when
the next commit is added, and GDB gets better at auto-loading the
executable for a core file, these tests should still be testing what
is expected.
---
 gdb/testsuite/gdb.base/coredump-filter.exp  |  17 +-
 gdb/testsuite/gdb.base/corefile-buildid.exp | 252 +++++++++-----------
 gdb/testsuite/gdb.base/corefile.exp         |   9 +
 3 files changed, 130 insertions(+), 148 deletions(-)

diff --git a/gdb/testsuite/gdb.base/coredump-filter.exp b/gdb/testsuite/gdb.base/coredump-filter.exp
index 0c1fc7c2dd6..18c3505172b 100644
--- a/gdb/testsuite/gdb.base/coredump-filter.exp
+++ b/gdb/testsuite/gdb.base/coredump-filter.exp
@@ -105,14 +105,23 @@ proc test_disasm { core address should_fail } {
 	    return
 	}
 
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	if { $should_fail == 1 } {
 	    remote_exec host "mv -f $hide_binfile $binfile"
-	    gdb_test "x/i \$pc" "=> $hex:\tCannot access memory at address $hex" \
-		"disassemble function with corefile and without a binary"
+	    set re "Cannot access memory at address $hex"
 	} else {
-	    gdb_test "x/i \$pc" "=> $hex:\t\[^C\].*" \
-		"disassemble function with corefile and without a binary"
+	    set re "\[^C\].*"
 	}
+
+	gdb_test "x/i \$pc" "=> $hex:\t${re}" \
+	    "disassemble function with corefile and without a binary"
     }
 
     with_test_prefix "with binary" {
diff --git a/gdb/testsuite/gdb.base/corefile-buildid.exp b/gdb/testsuite/gdb.base/corefile-buildid.exp
index fc54cf201d9..377ae802239 100644
--- a/gdb/testsuite/gdb.base/corefile-buildid.exp
+++ b/gdb/testsuite/gdb.base/corefile-buildid.exp
@@ -19,71 +19,72 @@
 
 # Build-id-related tests for core files.
 
-standard_testfile
+standard_testfile .c -shlib-shr.c -shlib.c
 
-# Build a non-shared executable.
+# Create a corefile from PROGNAME.  Return the name of the generated
+# corefile, or the empty string if anything goes wrong.
+#
+# The generated corefile must contain a buildid for PROGNAME.  If it
+# doesn't then an empty string will be returned.
+proc create_core_file { progname } {
+    # Generate a corefile.
+    set corefile [core_find $progname]
+    if {$corefile == ""} {
+	untested "could not generate core file"
+	return ""
+    }
+    verbose -log "corefile is $corefile"
+
+    # Check the corefile has a build-id for the executable.
+    if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
+	set line [lindex [split $output "\n"] 0]
+	set binfile_re (?:[string_to_regexp $progname]|\\\[(?:exe|pie)\\\])
+	if { ![regexp "^${::hex}\\+${::hex} \[a-f0-9\]+@${::hex}.*$binfile_re$" $line] } {
+	    unsupported "no build-id for executable in corefile"
+	    return ""
+	}
+    } else {
+	unsupported "eu-unstrip tool failed"
+	return ""
+    }
 
-proc build_corefile_buildid_exec {} {
-    global testfile srcfile binfile execdir
+    return $corefile
+}
 
-    if {[build_executable $testfile.exp $testfile $srcfile debug] == -1} {
-	untested "failed to compile"
-	return false
-    }
 
-    # Move executable to non-default path.
-    set builddir [standard_output_file $execdir]
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile [file join $builddir [file tail $binfile]]"
+# Build a non-shared executable.
 
-    return true
+proc build_corefile_buildid_exec { progname } {
+    return [expr {[build_executable "build non-shared exec" $progname $::srcfile] != -1}]
 }
 
 # Build a shared executable.
 
-proc build_corefile_buildid_shared {} {
-    global srcdir subdir testfile binfile srcfile sharedir
-
-    set builddir [standard_output_file $sharedir]
-
+proc build_corefile_buildid_shared { progname } {
     # Compile DSO.
-    set srcdso [file join $srcdir $subdir $testfile-shlib-shr.c]
-    set objdso [standard_output_file $testfile-shlib-shr.so]
-    if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
-	untested "failed to compile dso"
+    set objdso [standard_output_file $::testfile-shlib-shr.so]
+    if {[build_executable "build dso" $objdso $::srcfile2 {debug shlib}] == -1} {
 	return false
     }
 
+
     # Compile shared library.
-    set srclib [file join $srcdir $subdir $testfile-shlib.c]
-    set libname lib$testfile.so
+    set srclib $::srcfile3
+    set libname lib$::testfile.so
     set objlib [standard_output_file $libname]
-    set dlopen_lib [shlib_target_file \
-			[file join $builddir [file tail $objdso]]]
-    set opts [list debug shlib_load \
+    set dlopen_lib [shlib_target_file $objdso]
+    set opts [list debug shlib_load shlib \
 		  additional_flags=-DSHLIB_NAME=\"$dlopen_lib\"]
-    if {[gdb_compile_shlib $srclib $objlib $opts] != ""} {
-	untested "failed to compile shared library"
+    if {[build_executable "build solib" $objlib $::srcfile3 $opts] == -1} {
 	return false
     }
 
     # Compile main program.
-    set srcexec [file join $srcdir $subdir $srcfile]
-    set binfile [standard_output_file $testfile-shared]
     set opts [list debug shlib=$objlib additional_flags=-DTEST_SHARED]
-    if {[gdb_compile $srcexec $binfile executable $opts] != ""} {
-	untested "failed to compile shared executable"
+    if {[build_executable "build shared exec" $progname $::srcfile $opts] == -1} {
 	return false
     }
 
-    # Move objects to non-default path.
-    remote_exec build "rm -rf $builddir"
-    remote_exec build "mkdir $builddir"
-    remote_exec build "mv $binfile $builddir"
-    remote_exec build "mv $objdso  $builddir"
-    remote_exec build "mv $objlib $builddir"
-
     return true
 }
 
@@ -154,37 +155,43 @@ proc check_exec_file {file} {
 # SHARED is a boolean indicating whether we are testing the shared
 # library core dump test case.
 
-proc locate_exec_from_core_build_id {corefile buildid suffix \
+proc locate_exec_from_core_build_id {corefile buildid \
+					 dirname progname \
 					 sepdebug symlink shared} {
-    global testfile binfile srcfile
-
     clean_restart
 
     # Set up the build-id directory and symlink the binary there.
+    set d "debugdir"
+    if {$shared} {
+	set d "${d}_shared"
+    } else {
+	set d "${d}_not-shared"
+    }
     if {$symlink} {
-	set d "symlinkdir"
+	set d "${d}_symlink"
     } else {
-	set d "debugdir"
+	set d "${d}_copy"
     }
-    set debugdir [standard_output_file $d-$suffix]
-    remote_exec build "rm -rf $debugdir"
+    if {$sepdebug} {
+	set d "${d}_stripped"
+    } else {
+	set d "${d}_not-stripped"
+    }
+
+    set debugdir [standard_output_file $d]
     remote_exec build \
 	"mkdir -p [file join $debugdir [file dirname $buildid]]"
 
     set files_list {}
-    lappend files_list $binfile $buildid
+    lappend files_list [file join $dirname [file tail $progname]] \
+	$buildid
     if {$sepdebug} {
-	lappend files_list "$binfile.debug" "$buildid.debug"
-    }
-    if {$shared} {
-	global sharedir
-	set builddir [standard_output_file $sharedir]
-    } else {
-	global execdir
-	set builddir [standard_output_file $execdir]
+	lappend files_list [file join $dirname [file tail $progname]].debug \
+	    "$buildid.debug"
     }
+
     foreach {target name} $files_list {
-	set t [file join $builddir [file tail $target]]
+	set t [file join $dirname [file tail $target]]
 	if {$symlink} {
 	    remote_exec build "ln -s $t [file join $debugdir $name]"
 	} else {
@@ -198,109 +205,66 @@ proc locate_exec_from_core_build_id {corefile buildid suffix \
     gdb_test "core-file $corefile" "Program terminated with .*" \
 	"load core file"
     if {$symlink} {
-	set expected_file [file join $builddir [file tail $binfile]]
+	set expected_file [file join $dirname [file tail $progname]]
     } else {
 	set expected_file $buildid
     }
     check_exec_file [file join $debugdir $expected_file]
 }
 
-# Run a build-id tests on a core file.
-# Supported options: "-shared" and "-sepdebug" for running tests
-# of shared and/or stripped/.debug executables.
-
-proc do_corefile_buildid_tests {args} {
-    global binfile testfile srcfile execdir sharedir hex
-
-    # Parse options.
-    parse_args [list {sepdebug} {shared}]
+foreach_with_prefix mode { exec shared } {
+    # Build the executable.
+    set progname ${binfile}-$mode
+    set build_proc build_corefile_buildid_${mode}
+    if { ![$build_proc $progname] } {
+	return -1
+    }
 
-    # PROGRAM to run to generate core file.  This could be different
-    # than the program that was originally built, e.g., for a stripped
-    # executable.
-    if {$shared} {
-	set builddir [standard_output_file $sharedir]
-    } else {
-	set builddir [standard_output_file $execdir]
+    # Generate a corefile.
+    set corefile [create_core_file $progname]
+    if { $corefile eq "" } {
+	return -1
     }
-    set program_to_run [file join $builddir [file tail $binfile]]
 
-    # A list of suffixes to use to describe the test and the .build-id
-    # directory for the test.  The suffix will be used, joined with spaces,
-    # to prefix all tests for the given run.  It will be used, joined with
-    # dashes, to create a unique build-id directory.
-    set suffix {}
-    if {$shared} {
-	lappend suffix "shared"
-    } else {
-	lappend suffix "exec"
+    # Get the build-id filename without ".debug" on the end.  This
+    # will have the format: '.build-id/xx/xxxxx'
+    set buildid [build_id_debug_filename_get $progname ""]
+    if {$buildid == ""} {
+	untested "binary has no build-id"
+	return
     }
+    verbose -log "build-id is $buildid"
 
-    if {$sepdebug} {
-	# Strip debuginfo into its own file.
-	if {[gdb_gnu_strip_debug [standard_output_file $program_to_run] \
-		 no-debuglink] != 0} {
-	    untested "could not strip executable  for [join $suffix \ ]"
-	    return
-	}
+    # Create a directory for the non-stripped test.
+    set combined_dirname [standard_output_file ${mode}_non-stripped]
+    remote_exec build "mkdir -p $combined_dirname"
+    remote_exec build "cp $progname $combined_dirname"
 
-	lappend suffix "sepdebug"
+    # Create a directory for the stripped test.
+    if {[gdb_gnu_strip_debug [standard_output_file $progname] no-debuglink] != 0} {
+	untested "could not strip executable for $mode"
+	return
     }
-
-    with_test_prefix "[join $suffix \ ]" {
-	# Find the core file.
-	set corefile [core_find $program_to_run]
-	if {$corefile == ""} {
-	    untested "could not generate core file"
-	    return
-	}
-	verbose -log "corefile is $corefile"
-
-	if { [catch "exec [gdb_find_eu-unstrip] -n --core $corefile" output] == 0 } {
-	    set line [lindex [split $output "\n"] 0]
-	    set binfile_re (?:[string_to_regexp $program_to_run]|\\\[(?:exe|pie)\\\])
-	    if { ![regexp "^${hex}\\+${hex} \[a-f0-9\]+@${hex}.*$binfile_re$" $line] } {
-		unsupported "build id for exec"
-		return
-	    }
+    set sepdebug_dirname [standard_output_file ${mode}_stripped]
+    remote_exec build "mkdir -p $sepdebug_dirname"
+    remote_exec build "mv $progname $sepdebug_dirname"
+    remote_exec build "mv ${progname}.debug $sepdebug_dirname"
+
+    # Now do the actual testing part.  Fill out a debug directory with
+    # build-id related files (copies or symlinks) and then load the
+    # corefile.  Check GDB finds the executable and debug information
+    # via the build-id related debug directory contents.
+    foreach_with_prefix sepdebug { false true } {
+	if { $sepdebug } {
+	    set dirname $sepdebug_dirname
 	} else {
-	    unsupported "eu-unstrip execution"
-	    return
-	}
-
-	# Get the build-id filename without ".debug" on the end.  This
-	# will have the format: '.build-id/xx/xxxxx'
-	set buildid [build_id_debug_filename_get $program_to_run ""]
-	if {$buildid == ""} {
-	    untested "binary has no build-id"
-	    return
+	    set dirname $combined_dirname
 	}
-	verbose -log "build-id is $buildid"
-
-	locate_exec_from_core_build_id $corefile $buildid \
-	    [join $suffix -] $sepdebug false $shared
 
-	with_test_prefix "symlink" {
+	foreach_with_prefix symlink { false true } {
 	    locate_exec_from_core_build_id $corefile $buildid \
-		[join $suffix -] $sepdebug true $shared
+		$dirname $progname \
+		$sepdebug $symlink [expr {$mode eq "shared"}]
 	}
     }
 }
-
-# Directories where executables will be moved before testing.
-set execdir "build-exec"
-set sharedir "build-shared"
-
-#
-# Do tests
-#
-
-build_corefile_buildid_exec
-do_corefile_buildid_tests
-do_corefile_buildid_tests -sepdebug
-
-if {[allow_shlib_tests]} {
-    build_corefile_buildid_shared
-    do_corefile_buildid_tests -shared
-    do_corefile_buildid_tests -shared -sepdebug
-}
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index dc3c8b1dfc8..2111aa66d7d 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -348,6 +348,15 @@ proc corefile_test_attach {} {
 	gdb_start
 
 	gdb_test "core-file $corefile" "Core was generated by .*" "attach: load core again"
+
+	# If GDB managed to auto-load an executable based on the core
+	# file, then unload it now.
+	gdb_test "with confirm off -- file" \
+	    [multi_line \
+		 "^No executable file now\\." \
+		 "No symbol file now\\."] \
+	    "ensure no executable is loaded"
+
 	gdb_test "info files" "\r\nLocal core dump file:\r\n.*" "attach: sanity check we see the core file"
 
 	gdb_test "attach $pid" "Attaching to process $pid\r\n.*" "attach: with core"
-- 
2.25.4



* [PATCHv3 4/5] gdb: improve GDB's ability to auto-load the exec for a core file
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                       ` (2 preceding siblings ...)
  2024-10-29 14:08     ` [PATCHv3 3/5] gdb/testsuite: make some of the core file / build-id tests harder Andrew Burgess
@ 2024-10-29 14:08     ` Andrew Burgess
  2024-10-29 14:08     ` [PATCHv3 5/5] gdb/freebsd: port core file context parsing to FreeBSD Andrew Burgess
  4 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

GDB already has a limited mechanism for auto-loading the executable
corresponding to a core file, this can be found in the function
locate_exec_from_corefile_build_id in corelow.c.

However, this approach uses the build-id recorded in the core file to
find the executable, either by looking in the debug directory (for a
symlink back to the executable) or by asking debuginfod.  This is
great, and works fine if the core file came from a "system" binary,
but often, when I'm debugging a core file, it's part of my development
cycle, so there's no build-id symlink in the debug directory, and
debuginfod doesn't know about the binary either, so GDB can't
auto-load the executable....

... but the executable is right there!

This commit builds on the earlier commits in this series to make GDB
smarter.

On GNU/Linux, when we parse the execution context from the core
file (see linux-tdep.c), we already grab the command pointed to by
AT_EXECFN.  If this is an absolute path then GDB can use this to
locate the executable, a build-id check ensures we've found the
correct file.  With this small change GDB suddenly becomes a lot
better at auto-loading the executable for a core file.

But we can do better!  Often the AT_EXECFN is not an absolute path.

If it is a relative path then we check for this path relative to the
core file.  This helps if a user does something like:

  $ ./build/bin/some_prog
  Aborted (core dumped)
  $ gdb -c corefile

In this case the core file in the current directory will have an
AT_EXECFN value of './build/bin/some_prog', so if we look for that
path relative to the location of the core file this might result in a
hit, again, a build-id check ensures we found the right file.
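
The absolute and relative cases described so far can be sketched as
below.  This is only an illustration under some assumptions, not the
code in corelow.c: execfn_candidate is a made-up name, only POSIX
style '/' separators are handled, and the build-id check that must
follow is left out.

```cpp
#include <string>

// Hypothetical helper, not part of GDB: compute the path at which to
// look for the executable named by AT_EXECFN, given the path of the
// core file.  Only '/' separators are understood.
std::string
execfn_candidate (const std::string &core_path, const std::string &execfn)
{
  // An absolute AT_EXECFN can be tried exactly as recorded.
  if (!execfn.empty () && execfn[0] == '/')
    return execfn;

  // Otherwise interpret it relative to the directory holding the core
  // file.  If the core path has no directory part then the core file is
  // in the current directory and EXECFN can be used unchanged.
  std::string::size_type slash = core_path.rfind ('/');
  if (slash == std::string::npos)
    return execfn;

  return core_path.substr (0, slash + 1) + execfn;
}
```

For a core file at /home/user/corefile and an AT_EXECFN of
./build/bin/some_prog this gives /home/user/./build/bin/some_prog.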

But we can do better still!  What if the user moves the core file?  Or
the user is using some tool to manage core files (e.g. the systemd
core file management tool), and the user downloads the core file to a
location from which the relative path no longer works?

Well in this case we can make use of the core file's mapped file
information (the NT_FILE note).  The executable will be included in
the mapped file list, and the path within the mapped file list will be
an absolute path.  We can search for mapped file information based on
an address within the mapped file, and the auxv vector happens to
include an AT_ENTRY value, which is the entry address in the main
executable.  If we look up the mapped file containing this address
we'll have the absolute path to the main executable, a build-id check
ensures this really is the file we're looking for.
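
The lookup by address can be sketched as follows.  This is a
simplified model and not GDB's data structures: the mapped_file struct
and find_mapped_file are made-up names standing in for the parsed
NT_FILE entries, and the address passed in plays the role of AT_ENTRY.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for one entry from the NT_FILE note: an address
// range and the absolute filename mapped there.
struct mapped_file
{
  std::uint64_t start;
  std::uint64_t end;    // One past the last mapped address.
  std::string filename;
};

// Hypothetical helper: return the filename of the mapping that contains
// ADDR, or the empty string if no mapping covers ADDR.
std::string
find_mapped_file (const std::vector<mapped_file> &mappings,
                  std::uint64_t addr)
{
  for (const mapped_file &m : mappings)
    if (addr >= m.start && addr < m.end)
      return m.filename;

  return {};
}
```

Passing the AT_ENTRY value as ADDR would then return the absolute path
of the main executable, which still needs the build-id check.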

It might be tempting to jump straight to the third approach, however,
there is one small downside to the third approach: if the executable
is a symlink then the AT_EXECFN string will be the name of the
symlink, that is, the thing the user asked to run.  The mapped file
entry will be the name of the actual file, i.e. the symlink target.
When we auto-load the executable based on the third approach, the file
loaded might have a different name to that which the user expects,
though the build-id check (almost) guarantees that we've loaded the
correct binary.

But there's one more thing we can check for!

If the user has placed the core file and the executable into a
directory together, for example, as might happen with a bug report,
then neither the absolute path check, nor the relative path check
will find the executable.  So GDB will also look for a file with the
right name in the same directory as the core file.  Again, a build-id
check is performed to ensure we find the correct file.

Of course, it's still possible that GDB is unable to find the
executable using any of these approaches.  In this case, nothing
changes, GDB will check in the debug info directory for a build-id
based link back to the executable, and if that fails, GDB will ask
debuginfod for the executable.  If this all fails, then, as usual, the
user is able to load the correct executable with the 'file' command,
but hopefully, this should be needed far less from now on.
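
Putting the pieces together, the overall search might look like the
sketch below.  This is an illustration only, not the implementation in
corelow.c.  The name locate_exec, the build_id_ok callback, and in
particular the order in which the candidates are tried are all
assumptions made for the example.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch, not GDB's implementation.  Try each candidate
// location for the executable and return the first one accepted by
// BUILD_ID_OK, or the empty string if none matches.  The candidate
// order here is an assumption.
std::string
locate_exec (const std::string &core_dir,
             const std::string &execfn,
             const std::string &mapped_filename,
             const std::function<bool (const std::string &)> &build_id_ok)
{
  std::vector<std::string> candidates;

  // AT_EXECFN as recorded if absolute, else relative to the directory
  // containing the core file.
  if (!execfn.empty () && execfn[0] == '/')
    candidates.push_back (execfn);
  else
    candidates.push_back (core_dir + "/" + execfn);

  // A file with the same basename placed next to the core file.
  std::string::size_type slash = execfn.rfind ('/');
  std::string base
    = (slash == std::string::npos) ? execfn : execfn.substr (slash + 1);
  candidates.push_back (core_dir + "/" + base);

  // The absolute path taken from the mapped file list, if known.
  if (!mapped_filename.empty ())
    candidates.push_back (mapped_filename);

  for (const std::string &path : candidates)
    if (build_id_ok (path))
      return path;

  return {};
}
```

An empty result corresponds to the fallback described above: the debug
directory, then debuginfod, and finally the 'file' command.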
---
 gdb/arch-utils.h                              |  25 +-
 gdb/corelow.c                                 | 141 ++++++++--
 gdb/linux-tdep.c                              |  22 ++
 gdb/testsuite/gdb.base/corefile-find-exec.c   |  25 ++
 gdb/testsuite/gdb.base/corefile-find-exec.exp | 242 ++++++++++++++++++
 5 files changed, 438 insertions(+), 17 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.c
 create mode 100644 gdb/testsuite/gdb.base/corefile-find-exec.exp

diff --git a/gdb/arch-utils.h b/gdb/arch-utils.h
index 1c33bfb4704..fb4a3ef9c5b 100644
--- a/gdb/arch-utils.h
+++ b/gdb/arch-utils.h
@@ -22,6 +22,7 @@
 
 #include "gdbarch.h"
 #include "gdbsupport/environ.h"
+#include "filenames.h"
 
 class frame_info_ptr;
 struct minimal_symbol;
@@ -87,15 +88,23 @@ struct core_file_exec_context
      never be nullptr.  Only call this constructor if all the arguments
      have been collected successfully, i.e. if the EXEC_NAME could be
      found but not ARGV then use the no-argument constructor to create an
-     empty context object.  */
+     empty context object.
+
+     The EXEC_FILENAME must be the absolute filename of the executable
+     that generated this core file, or nullptr if the absolute filename
+     is not known.  */
   core_file_exec_context (gdb::unique_xmalloc_ptr<char> exec_name,
+			  gdb::unique_xmalloc_ptr<char> exec_filename,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> argv,
 			  std::vector<gdb::unique_xmalloc_ptr<char>> envp)
     : m_exec_name (std::move (exec_name)),
+      m_exec_filename (std::move (exec_filename)),
       m_arguments (std::move (argv)),
       m_environment (std::move (envp))
   {
     gdb_assert (m_exec_name != nullptr);
+    gdb_assert (exec_filename == nullptr
+		|| IS_ABSOLUTE_PATH (exec_filename.get ()));
   }
 
   /* Create a default context object.  In its default state a context
@@ -112,6 +121,13 @@ struct core_file_exec_context
   const char *execfn () const
   { return m_exec_name.get (); }
 
+  /* Return the absolute path to the executable if known.  This might
+     return nullptr even when execfn() returns a non-nullptr value.
+     Additionally, the file referenced here might have a different name
+     than the file returned by execfn if execfn is a symbolic link.  */
+  const char *exec_filename () const
+  { return m_exec_filename.get (); }
+
   /* Return the vector of inferior arguments as extracted from the core
      file.  This does not include argv[0] (the executable name) for that
      see the execfn() function.  */
@@ -127,6 +143,13 @@ struct core_file_exec_context
      if no executable name is found.  */
   gdb::unique_xmalloc_ptr<char> m_exec_name;
 
+  /* Full filename to the executable that was actually executed.  The name
+     within EXEC_FILENAME might not match what the user typed, e.g. if the
+     user typed ./symlinked_name which is a symlink to /tmp/real_name then
+     this is going to contain '/tmp/real_name' while EXEC_NAME above will
+     contain './symlinked_name'.  */
+  gdb::unique_xmalloc_ptr<char> m_exec_filename;
+
   /* List of arguments.  Doesn't include argv[0] which is the executable
      name, for this look at m_exec_name field.  */
   std::vector<gdb::unique_xmalloc_ptr<char>> m_arguments;
diff --git a/gdb/corelow.c b/gdb/corelow.c
index a0129f84b1c..272b86b6f33 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -828,18 +828,117 @@ rename_vmcore_idle_reg_sections (bfd *abfd, inferior *inf)
 	     replacement_lwpid_str.c_str ());
 }
 
+/* Use CTX to try and find (and open) the executable file for the core file
+   CBFD.  BUILD_ID is the build-id for CBFD which was already extracted by
+   our caller.
+
+   Will return the opened executable or nullptr if the executable couldn't
+   be found.  */
+
+static gdb_bfd_ref_ptr
+locate_exec_from_corefile_exec_context (bfd *cbfd,
+					const bfd_build_id *build_id,
+					const core_file_exec_context &ctx)
+{
+  /* CTX must be valid, and a valid context has an execfn() string.  */
+  gdb_assert (ctx.valid ());
+  gdb_assert (ctx.execfn () != nullptr);
+
+  /* EXEC_NAME will be the command used to start the inferior.  This might
+     not be an absolute path (but could be).  */
+  const char *exec_name = ctx.execfn ();
+
+  /* Function to open FILENAME and check if its build-id matches BUILD_ID
+     from this enclosing scope.  Returns the open BFD for filename if the
+     FILENAME has a matching build-id, otherwise, returns nullptr.  */
+  const auto open_and_check_build_id
+    = [&build_id] (const char *filename) -> gdb_bfd_ref_ptr
+  {
+    /* Try to open a file.  If this succeeds then we still need to perform
+       a build-id check.  */
+    gdb_bfd_ref_ptr execbfd = gdb_bfd_open (filename, gnutarget);
+
+    /* We managed to open a file, but if its build-id doesn't match
+       BUILD_ID then we just cannot trust it's the right file.  */
+    if (execbfd != nullptr)
+      {
+	const bfd_build_id *other_build_id = build_id_bfd_get (execbfd.get ());
+
+	if (other_build_id == nullptr
+	    || !build_id_equal (other_build_id, build_id))
+	  execbfd = nullptr;
+      }
+
+    return execbfd;
+  };
+
+  gdb_bfd_ref_ptr execbfd;
+
+  /* If EXEC_NAME is absolute then try to open it now.  Otherwise, see if
+     EXEC_NAME is a relative path from the location of the core file.  This
+     is just a guess, the executable might not be here, but we still rely
+     on a build-id match in order to accept any executable we find; we
+     don't accept something just because it happens to be in the right
+     location.  */
+  if (IS_ABSOLUTE_PATH (exec_name))
+    execbfd = open_and_check_build_id (exec_name);
+  else
+    {
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + exec_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If we haven't found the executable yet, then try checking to see if
+     the executable is in the same directory as the core file.  Again,
+     there's no reason why this should be the case, but it's worth a try,
+     and the build-id check should ensure we don't use an invalid file if
+     we happen to find one.  */
+  if (execbfd == nullptr)
+    {
+      const char *base_name = lbasename (exec_name);
+      std::string p = (ldirname (bfd_get_filename (cbfd))
+		       + '/'
+		       + base_name);
+      execbfd = open_and_check_build_id (p.c_str ());
+    }
+
+  /* If the above didn't provide EXECBFD then try the exec_filename from
+     the context.  This will be an absolute filename which the gdbarch code
+     figured out from the core file.  In some cases the gdbarch code might
+     not be able to figure out a suitable absolute filename though.  */
+  if (execbfd == nullptr && ctx.exec_filename () != nullptr)
+    {
+      gdb_assert (IS_ABSOLUTE_PATH (ctx.exec_filename ()));
+
+      /* Try to open a file.  If this succeeds then we still need to
+	 perform a build-id check.  */
+      execbfd = open_and_check_build_id (ctx.exec_filename ());
+    }
+
+  return execbfd;
+}
+
 /* Locate (and load) an executable file (and symbols) given the core file
    BFD ABFD.  */
 
 static void
-locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
+locate_exec_from_corefile_build_id (bfd *abfd,
+				    const core_file_exec_context &ctx,
+				    int from_tty)
 {
   const bfd_build_id *build_id = build_id_bfd_get (abfd);
   if (build_id == nullptr)
     return;
 
-  gdb_bfd_ref_ptr execbfd
-    = find_objfile_by_build_id (build_id, abfd->filename);
+  gdb_bfd_ref_ptr execbfd;
+
+  if (ctx.valid ())
+    execbfd = locate_exec_from_corefile_exec_context (abfd, build_id, ctx);
+
+  if (execbfd == nullptr)
+    execbfd = find_objfile_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
@@ -908,13 +1007,6 @@ core_target_open (const char *arg, int from_tty)
 
   validate_files ();
 
-  /* If we have no exec file, try to set the architecture from the
-     core file.  We don't do this unconditionally since an exec file
-     typically contains more information that helps us determine the
-     architecture than a core file.  */
-  if (!current_program_space->exec_bfd ())
-    set_gdbarch_from_file (current_program_space->core_bfd ());
-
   current_inferior ()->push_target (std::move (target_holder));
 
   switch_to_no_thread ();
@@ -969,9 +1061,31 @@ core_target_open (const char *arg, int from_tty)
       switch_to_thread (thread);
     }
 
+  /* In order to parse the exec context from the core file the current
+     inferior needs to have a suitable gdbarch set.  If an exec file is
+     loaded then the gdbarch will have been set based on the exec file, but
+     if not, ensure we have a suitable gdbarch in place now.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    current_inferior ()->set_arch (target->core_gdbarch ());
+
+  /* See if the gdbarch can find the executable name and argument list from
+     the core file.  */
+  core_file_exec_context ctx
+    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
+				       current_program_space->core_bfd ());
+
+  /* If we don't have an executable loaded then see if we can locate one
+     based on the core file.  */
   if (current_program_space->exec_bfd () == nullptr)
     locate_exec_from_corefile_build_id (current_program_space->core_bfd (),
-					from_tty);
+					ctx, from_tty);
+
+  /* If we have no exec file, try to set the architecture from the
+     core file.  We don't do this unconditionally since an exec file
+     typically contains more information that helps us determine the
+     architecture than a core file.  */
+  if (current_program_space->exec_bfd () == nullptr)
+    set_gdbarch_from_file (current_program_space->core_bfd ());
 
   post_create_inferior (from_tty);
 
@@ -989,11 +1103,6 @@ core_target_open (const char *arg, int from_tty)
       exception_print (gdb_stderr, except);
     }
 
-  /* See if the gdbarch can find the executable name and argument list from
-     the core file.  */
-  core_file_exec_context ctx
-    = gdbarch_core_parse_exec_context (target->core_gdbarch (),
-				       current_program_space->core_bfd ());
   if (ctx.valid ())
     {
       std::string args;
diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
index e89bda9af13..354efe8f37b 100644
--- a/gdb/linux-tdep.c
+++ b/gdb/linux-tdep.c
@@ -2090,7 +2090,29 @@ linux_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
   if (execfn == nullptr)
     return {};
 
+  /* When the core-file was loaded GDB processed the file backed mappings
+     (from the NT_FILE note).  One of these should have been for the
+     executable.  The AT_EXECFN string might not be an absolute path, but
+     the path in NT_FILE will be absolute, though if AT_EXECFN is a
+     symlink, then the NT_FILE entry will point to the actual file, not the
+     symlink.
+
+     Use the AT_ENTRY address to look for the NT_FILE entry which contains
+     that address, this should be the executable.  */
+  gdb::unique_xmalloc_ptr<char> exec_filename;
+  CORE_ADDR exec_entry_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_ENTRY, &exec_entry_addr) == 1)
+    {
+      std::optional<core_target_mapped_file_info> info
+	= core_target_find_mapped_file (nullptr, exec_entry_addr);
+      if (info.has_value () && !info->filename ().empty ()
+	  && IS_ABSOLUTE_PATH (info->filename ().c_str ()))
+	exec_filename = make_unique_xstrdup (info->filename ().c_str ());
+    }
+
   return core_file_exec_context (std::move (execfn),
+				 std::move (exec_filename),
 				 std::move (arguments),
 				 std::move (environment));
 }
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.c b/gdb/testsuite/gdb.base/corefile-find-exec.c
new file mode 100644
index 00000000000..ed4df606a2d
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+int
+main (int argc, char **argv)
+{
+  abort ();
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
new file mode 100644
index 00000000000..40324c1f01c
--- /dev/null
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -0,0 +1,242 @@
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check GDB's ability to auto-load the executable based on the file
+# names extracted from the core file.
+#
+# Currently, only Linux supports reading full executable and arguments
+# from a core file.
+require {istarget *-linux*}
+
+standard_testfile
+
+if {[build_executable $testfile.exp $testfile $srcfile {debug build-id}] == -1} {
+    untested "failed to compile"
+    return -1
+}
+
+# Load the COREFILE and confirm that GDB auto-loads the executable.
+# The symbols should be read from SYMBOL_FILE and the core file should
+# be reported as generated by GEN_FROM_FILE.
+proc test_load { corefile symbol_file gen_from_file } {
+    clean_restart
+    set saw_generated_line false
+    set saw_reading_symbols false
+
+    gdb_test_multiple "core-file $corefile" "load core file" {
+
+	-re "^Reading symbols from [string_to_regexp $symbol_file]\\.\\.\\.\r\n" {
+	    set saw_reading_symbols true
+	    exp_continue
+	}
+
+	-re "^Core was generated by `[string_to_regexp $gen_from_file]'\\.\r\n" {
+	    set saw_generated_line true
+	    exp_continue
+	}
+
+	-re "^$::gdb_prompt $" {
+	    gdb_assert { $saw_generated_line && $saw_reading_symbols} \
+		$gdb_test_name
+	}
+
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+    }
+}
+
+with_test_prefix "absolute path" {
+    # Generate a core file, this uses an absolute path to the
+    # executable.
+    with_test_prefix "to file" {
+	set corefile [core_find $binfile]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_1 "$binfile.1.core"
+	remote_exec build "mv $corefile $corefile_1"
+
+	test_load $corefile_1 $binfile $binfile
+    }
+
+    # And create a symlink, and repeat the test using an absolute path
+    # to the symlink.
+    with_test_prefix "to symlink" {
+	set symlink_name "symlink_1"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_2 "$binfile.2.core"
+	remote_exec build "mv $corefile $corefile_2"
+
+	test_load $corefile_2 $symlink $symlink
+    }
+
+    # Like the previous test, except this time, delete the symlink
+    # after generating the core file.  GDB should be smart enough to
+    # figure out that we can use the underlying TESTFILE binary.
+    with_test_prefix "to deleted symlink" {
+	set symlink_name "symlink_2"
+	set symlink [standard_output_file $symlink_name]
+
+	with_cwd [standard_output_file ""] {
+	    remote_exec build "ln -s ${testfile} $symlink_name"
+	}
+
+	set corefile [core_find $symlink]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_3 "$binfile.3.core"
+	remote_exec build "mv $corefile $corefile_3"
+
+	remote_exec build "rm -f $symlink"
+
+	test_load $corefile_3 $binfile $symlink
+    }
+
+    # Generate the core file with an absolute path to the executable,
+    # but move the core file and executable into a single directory
+    # together so GDB can't use the absolute path to find the
+    # executable.
+    #
+    # GDB should still find the executable though, by looking in the
+    # same directory as the core file.
+    with_test_prefix "in side directory" {
+	set binfile_2 [standard_output_file ${testfile}_2]
+	remote_exec build "cp $binfile $binfile_2"
+
+	set corefile [core_find $binfile_2]
+	if {$corefile == ""} {
+	    untested "unable to create corefile"
+	    return 0
+	}
+	set corefile_4 "$binfile.4.core"
+	remote_exec build "mv $corefile $corefile_4"
+
+	set side_dir [standard_output_file side_dir]
+	remote_exec build "mkdir -p $side_dir"
+	remote_exec build "mv $binfile_2 $side_dir"
+	remote_exec build "mv $corefile_4 $side_dir"
+
+	set relocated_corefile_4 [file join $side_dir [file tail $corefile_4]]
+	set relocated_binfile_2 [file join $side_dir [file tail $binfile_2]]
+	test_load $relocated_corefile_4 $relocated_binfile_2 $binfile_2
+    }
+}
+
+with_test_prefix "relative path" {
+    # Generate a core file using a relative path.  We need to work
+    # around the core_find proc a little here.  The core_find proc
+    # creates a sub-directory using standard_output_file and runs the
+    # test binary from inside that directory.
+    #
+    # Usually core_find is passed an absolute path, so there's no
+    # problem, but we want to pass a relative path.
+    #
+    # So setup a directory structure like this:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #
+    # Place a copy of BINFILE in 'reldir/' and switch to workdir, use
+    # core_find which will create a sibling directory of workdir, and
+    # run the relative path from there.  We then move the generated
+    # core file back into 'workdir/', this leaves a tree like:
+    #
+    # corefile-find-exec/
+    #    reldir/
+    #      <copy of $binfile here>
+    #    workdir/
+    #      <core file here>
+    #
+    # Now we can ask GDB to open the core file, if all goes well GDB
+    # should make use of the relative path encoded in the core file to
+    # locate the executable in 'reldir/'.
+    #
+    # We also setup a symlink in 'reldir' that points to the
+    # executable and repeat the test, but this time executing the
+    # symlink.
+    set reldir_name "reldir"
+    set reldir [standard_output_file $reldir_name]
+    remote_exec build "mkdir -p $reldir"
+
+    set alt_testfile "alt_${testfile}"
+    set binfile_3 "$reldir/${alt_testfile}"
+    remote_exec build "cp $binfile $binfile_3"
+
+    set symlink_2 "symlink_2"
+    with_cwd $reldir {
+	remote_exec build "ln -s ${alt_testfile} ${symlink_2}"
+    }
+
+    set work_dir [standard_output_file "workdir"]
+    remote_exec build "mkdir -p $work_dir"
+
+    set rel_path_to_file "../${reldir_name}/${alt_testfile}"
+    set rel_path_to_symlink_2 "../${reldir_name}/${symlink_2}"
+
+    with_cwd $work_dir {
+	with_test_prefix "to file" {
+	    set corefile [core_find $rel_path_to_file]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_5 "${work_dir}/${testfile}.5.core"
+	    remote_exec build "mv $corefile $corefile_5"
+
+	    test_load $corefile_5 \
+		[file join $work_dir $rel_path_to_file] \
+		$rel_path_to_file
+	}
+
+	with_test_prefix "to symlink" {
+	    set corefile [core_find $rel_path_to_symlink_2]
+	    if {$corefile == ""} {
+		untested "unable to create corefile"
+		return 0
+	    }
+	    set corefile_6 "${work_dir}/${testfile}.6.core"
+	    remote_exec build "mv $corefile $corefile_6"
+
+	    test_load $corefile_6 \
+		[file join $work_dir $rel_path_to_symlink_2] \
+		$rel_path_to_symlink_2
+	}
+
+	# Move the core file.  Now the relative path doesn't work so
+	# we instead rely on GDB to use information about the mapped
+	# files to help locate the executable.
+	with_test_prefix "with moved corefile" {
+	    set corefile_7 [standard_output_file "${testfile}.7.core"]
+	    remote_exec build "cp $corefile_6 $corefile_7"
+	    test_load $corefile_7 $binfile_3 $rel_path_to_symlink_2
+	}
+    }
+}
-- 
2.25.4


* [PATCHv3 5/5] gdb/freebsd: port core file context parsing to FreeBSD
  2024-10-29 14:08   ` [PATCHv3 0/5] Better executable auto-loading when opening a core file Andrew Burgess
                       ` (3 preceding siblings ...)
  2024-10-29 14:08     ` [PATCHv3 4/5] gdb: improve GDB's ability to auto-load the exec for a core file Andrew Burgess
@ 2024-10-29 14:08     ` Andrew Burgess
  4 siblings, 0 replies; 18+ messages in thread
From: Andrew Burgess @ 2024-10-29 14:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit implements the gdbarch_core_parse_exec_context method for
FreeBSD.

This is much simpler than for Linux.  On FreeBSD, at least the
version (13.x) that I have installed, there are additional entries in
the auxv vector that point directly to the argument and environment
vectors, which makes it trivial to find this information.
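
To show why a direct pointer to the vector makes this easy, here is a
minimal sketch of walking a NULL-terminated vector of string pointers.
It does not use GDB's memory-reading API: memory_image is a
hypothetical stand-in for core file memory, and little-endian pointers
are assumed for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for core file memory: BYTES holds the contents
// starting at address BASE.
struct memory_image
{
  std::uint64_t base;
  std::vector<std::uint8_t> bytes;
};

// Read a little-endian pointer of PTR_BYTES bytes at ADDR, or nullopt if
// the read falls outside the image.
std::optional<std::uint64_t>
read_pointer (const memory_image &mem, std::uint64_t addr,
              std::size_t ptr_bytes)
{
  if (ptr_bytes == 0 || ptr_bytes > 8 || addr < mem.base)
    return std::nullopt;
  const std::uint64_t off = addr - mem.base;
  if (off > mem.bytes.size () || mem.bytes.size () - off < ptr_bytes)
    return std::nullopt;
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < ptr_bytes; ++i)
    value |= std::uint64_t (mem.bytes[off + i]) << (8 * i);
  return value;
}

// Read a NUL-terminated string at ADDR, or nullopt if no terminator is
// found inside the image.
std::optional<std::string>
read_c_string (const memory_image &mem, std::uint64_t addr)
{
  if (addr < mem.base)
    return std::nullopt;
  std::string result;
  for (std::uint64_t off = addr - mem.base; off < mem.bytes.size (); ++off)
    {
      if (mem.bytes[off] == 0)
        return result;
      result.push_back (char (mem.bytes[off]));
    }
  return std::nullopt;
}

// Walk the pointer vector at VEC_ADDR, collecting strings until the
// NULL entry.  Any failed read abandons the whole walk.
std::optional<std::vector<std::string>>
read_string_vector (const memory_image &mem, std::uint64_t vec_addr,
                    std::size_t ptr_bytes)
{
  std::vector<std::string> result;
  for (;; vec_addr += ptr_bytes)
    {
      std::optional<std::uint64_t> str_addr
        = read_pointer (mem, vec_addr, ptr_bytes);
      if (!str_addr.has_value ())
        return std::nullopt;
      if (*str_addr == 0)
        return result;
      std::optional<std::string> str = read_c_string (mem, *str_addr);
      if (!str.has_value ())
        return std::nullopt;
      result.push_back (std::move (*str));
    }
}
```

Applied to the argument vector, the first string returned is argv[0],
which the patch uses as the command name; the remaining strings are the
arguments.  The environment vector is walked in the same way.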

If these extra auxv entries are not available on earlier FreeBSD, then
that's fine.  The fallback behaviour will be for GDB to act as it
always has up to this point, you'll just not get the extra
functionality.

Other differences compared to Linux are that FreeBSD has
AT_FREEBSD_EXECPATH instead of AT_EXECFN, the AT_FREEBSD_EXECPATH is
the full path to the executable.  On Linux AT_EXECFN is the command
the user typed, so this can be a relative path.

This difference is handy as on FreeBSD we don't parse the mapped files
from the core file (are they even available?).  So having the EXECPATH
means we can use that as the absolute path to the executable.

However, if the user ran a symlink then AT_FREEBSD_EXECPATH will be
the absolute path to the symlink, not to the underlying file.  This is
probably a good thing, but it does mean there is one case we test on
Linux that fails on FreeBSD.

On Linux, suppose we create a symlink to an executable, run the
symlink to generate a core file, then delete the symlink and load the
core file.  GDB will still find (and open) the original
executable.  This is because we use the mapped file information to
find the absolute path to the executable, and the mapped file
information only stores the real file names, not symlink names.
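
The distinction between the two names can be illustrated outside GDB.
The sketch below is only an analogy: real_file_name is a hypothetical
helper that uses std::filesystem::canonical to obtain a symlink-free
name, playing the role of the name recorded in the mapped file list.
The symlink path itself stops resolving once the symlink is deleted,
while the real file name stays usable.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: return the symlink-free absolute name for NAME,
// or nullopt if NAME cannot be resolved (for example, because it has
// been deleted).
std::optional<fs::path>
real_file_name (const fs::path &name)
{
  std::error_code ec;
  fs::path resolved = fs::canonical (name, ec);
  if (ec)
    return std::nullopt;
  return resolved;
}
```

A symlink-style absolute path, such as the one FreeBSD reports,
corresponds to the unresolved name, which is why the deleted symlink
case cannot be handled there.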

This is a total edge case, I only added the deleted symlink test
originally because I could see that this would work on Linux.  Though
it is neat that Linux finds this, I don't feel too bad that this fails
on FreeBSD.

Other than this, everything seems to work on x86-64 FreeBSD (13.4)
which is all I have setup right now.  I don't see why other
architectures wouldn't work too, but I haven't tested them.
---
 gdb/fbsd-tdep.c                               | 134 ++++++++++++++++++
 .../gdb.base/corefile-exec-context.exp        |   2 +-
 gdb/testsuite/gdb.base/corefile-find-exec.exp |  12 +-
 3 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/gdb/fbsd-tdep.c b/gdb/fbsd-tdep.c
index e97ff52d5bf..804a72c4205 100644
--- a/gdb/fbsd-tdep.c
+++ b/gdb/fbsd-tdep.c
@@ -33,6 +33,7 @@
 #include "elf-bfd.h"
 #include "fbsd-tdep.h"
 #include "gcore-elf.h"
+#include "arch-utils.h"
 
 /* This enum is derived from FreeBSD's <sys/signal.h>.  */
 
@@ -2361,6 +2362,137 @@ fbsd_vdso_range (struct gdbarch *gdbarch, struct mem_range *range)
   return range->length != 0;
 }
 
+/* Try to extract the inferior arguments, environment, and executable name
+   from CBFD.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context_1 (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  gdb_assert (gdbarch != nullptr);
+
+  /* If there's no core file loaded then we're done.  */
+  if (cbfd == nullptr)
+    return {};
+
+  int ptr_bytes = gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT;
+
+  /* Find the .auxv section in the core file.  The BFD library creates
+     this for us from the AUXV note when the BFD is opened.  If the section
+     can't be found then there's nothing more we can do.  */
+  struct bfd_section *section = bfd_get_section_by_name (cbfd, ".auxv");
+  if (section == nullptr)
+    return {};
+
+  /* Grab the contents of the .auxv section.  If we can't get the contents
+     then there's nothing more we can do.  */
+  bfd_size_type size = bfd_section_size (section);
+  if (bfd_section_size_insane (cbfd, section))
+    return {};
+  gdb::byte_vector contents (size);
+  if (!bfd_get_section_contents (cbfd, section, contents.data (), 0, size))
+    return {};
+
+  /* Read AT_FREEBSD_ARGV, the address of the argument string vector.  */
+  CORE_ADDR argv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ARGV, &argv_addr) != 1)
+    return {};
+
+  /* Read AT_FREEBSD_ENVV, the address of the environment string vector.  */
+  CORE_ADDR envv_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_ENVV, &envv_addr) != 1)
+    return {};
+
+  /* Read the AT_FREEBSD_EXECPATH string.  It's OK if we can't get this
+     information.  */
+  gdb::unique_xmalloc_ptr<char> execpath;
+  CORE_ADDR execpath_string_addr;
+  if (target_auxv_search (contents, current_inferior ()->top_target (),
+			  gdbarch, AT_FREEBSD_EXECPATH,
+			  &execpath_string_addr) == 1)
+    execpath = target_read_string (execpath_string_addr, INT_MAX);
+
+  /* The byte order.  */
+  enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
+
+  /* On FreeBSD the command the user ran is found in argv[0].  When we
+     read the first argument we place it into EXECFN.  */
+  gdb::unique_xmalloc_ptr<char> execfn;
+
+  /* Read strings from AT_FREEBSD_ARGV until we find a NULL marker.  The
+     first argument is placed into EXECFN as the command name.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> arguments;
+  CORE_ADDR str_addr;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (argv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      if (execfn == nullptr)
+	execfn = std::move (str);
+      else
+	arguments.emplace_back (std::move (str));
+
+      argv_addr += ptr_bytes;
+    }
+
+  /* Read strings from AT_FREEBSD_ENVV until we find a NULL marker.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> environment;
+  while ((str_addr
+	  = (CORE_ADDR) read_memory_unsigned_integer (envv_addr, ptr_bytes,
+						      byte_order)) != 0)
+    {
+      gdb::unique_xmalloc_ptr<char> str
+	= target_read_string (str_addr, INT_MAX);
+      if (str == nullptr)
+	return {};
+
+      environment.emplace_back (std::move (str));
+      envv_addr += ptr_bytes;
+    }
+
+  return core_file_exec_context (std::move (execfn),
+				 std::move (execpath),
+				 std::move (arguments),
+				 std::move (environment));
+}
+
+/* See elf-corelow.h.  */
+
+static core_file_exec_context
+fbsd_corefile_parse_exec_context (struct gdbarch *gdbarch, bfd *cbfd)
+{
+  /* Catch and discard memory errors.
+
+     If the core file format is not as we expect then we can easily trigger
+     a memory error while parsing the core file.  We don't want this to
+     prevent the user from opening the core file; the information provided
+     by this function is helpful, but not critical, debugging can continue
+     without it.  Instead just give a warning and return an empty context
+     object.  */
+  try
+    {
+      return fbsd_corefile_parse_exec_context_1 (gdbarch, cbfd);
+    }
+  catch (const gdb_exception_error &ex)
+    {
+      if (ex.error == MEMORY_ERROR)
+	{
+	  warning
+	    (_("failed to parse execution context from corefile: %s"),
+	     ex.message->c_str ());
+	  return {};
+	}
+      else
+	throw;
+    }
+}
+
 /* Return the address range of the vDSO for the current inferior.  */
 
 static int
@@ -2404,4 +2536,6 @@ fbsd_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
   /* `catch syscall' */
   set_xml_syscall_file_name (gdbarch, "syscalls/freebsd.xml");
   set_gdbarch_get_syscall_number (gdbarch, fbsd_get_syscall_number);
+  set_gdbarch_core_parse_exec_context (gdbarch,
+				       fbsd_corefile_parse_exec_context);
 }
diff --git a/gdb/testsuite/gdb.base/corefile-exec-context.exp b/gdb/testsuite/gdb.base/corefile-exec-context.exp
index ac97754fe71..73e13e60d75 100644
--- a/gdb/testsuite/gdb.base/corefile-exec-context.exp
+++ b/gdb/testsuite/gdb.base/corefile-exec-context.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
diff --git a/gdb/testsuite/gdb.base/corefile-find-exec.exp b/gdb/testsuite/gdb.base/corefile-find-exec.exp
index 40324c1f01c..07e660d85e8 100644
--- a/gdb/testsuite/gdb.base/corefile-find-exec.exp
+++ b/gdb/testsuite/gdb.base/corefile-find-exec.exp
@@ -18,7 +18,7 @@
 #
 # Currently, only Linux supports reading full executable and arguments
 # from a core file.
-require {istarget *-linux*}
+require {is_any_target "*-*-linux*" "*-*-freebsd*"}
 
 standard_testfile
 
@@ -115,6 +115,16 @@ with_test_prefix "absolute path" {
 
 	remote_exec build "rm -f $symlink"
 
+	# FreeBSD is unable to figure out the actual underlying mapped
+	# file, so when the symlink is deleted, FreeBSD is stuck.
+	#
+	# There is some argument that this shouldn't even be a
+	# failure, the user ran the symlink, and if the symlink is
+	# gone, should we really expect GDB to find the underlying
+	# file?  That we can on Linux is really just a quirk of how
+	# the mapped file list works.
+	setup_xfail "*-*-freebsd*"
+
 	test_load $corefile_3 $binfile $symlink
     }
 
-- 
2.25.4

