public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/4] More build-id checking when opening core files
@ 2024-05-17 14:18 Andrew Burgess
  2024-05-17 14:18 ` [PATCH 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-17 14:18 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This series start with patch #3.  I wanted to improve the build-id
checking that GDB does when loading shared libraries as part of
opening a core file, I found the existing build-id checks to be rather
limited.

To write #3 I ended up looking at the mapped file handling in
corelow.c, which lead to patches #1 and #2.

And patch #4 appeared when I observed that we do similar, but slightly
different things in 3 places in GDB, and these could all be merged.

---

Andrew Burgess (4):
  gdb/corefile: remove unavailable target sections
  gdb/corefile: improve file backed mapping handling
  gdb: improve shared library build-id check for core-files
  gdb: unify build-id to objfile lookup code

 gdb/build-id.c                                |  42 +-
 gdb/build-id.h                                |  21 +-
 gdb/corelow.c                                 | 379 +++++++++++++-----
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 369 ++++++++++++++---
 gdb/solib.h                                   |  39 +-
 gdb/solist.h                                  |  22 +
 gdb/testsuite/gdb.base/corefile.exp           |  39 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 ++
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 +
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 +++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 335 ++++++++++++++++
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 268 +++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  20 +-
 22 files changed, 1587 insertions(+), 164 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp


base-commit: ad666becfe075ca7c831ebbf4b44526994395e97
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] gdb/corefile: remove unavailable target sections
  2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
@ 2024-05-17 14:18 ` Andrew Burgess
  2024-05-17 14:18 ` [PATCH 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-17 14:18 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file the bfd library processes the core file and
creates sections within the bfd object to represent each of the
segments within the core file.

GDB then creates two target_section lists, m_core_section_table and
m_core_file_mappings.  These, as well as m_core_unavailable_mappings,
are used by GDB to implement core_target::xfer_partial, that is, to
allow reading from inferior memory.

The m_core_section_table list represents sections within the core file
itself.  The sections in this list can be split into two groups based
on whether the section has the SEC_HAS_CONTENTS flag set or not.

Sections that have SEC_HAS_CONTENTS have their contents copied into
the core file while sections that do not have the SEC_HAS_CONTENTS
flag will not have their contents copied into the core file when it is
created.

The m_core_file_mappings list is created when GDB parses the mapped
file list in the core file.  Every mapped file will also be covered by
entries in the m_core_section_table list, however, when parts of a
file are mapped read-only then the corresponding entry in
m_core_section_table will not have the SEC_HAS_CONTENTS flag set.  If
parts of a file are mapped read/write then the corresponding
m_core_section_table entry will have the SEC_HAS_CONTENTS flag, and
the corresponding (possibly modified contents) will be copied into the
core file.

However, GDB only creates entries in m_core_file_mappings if it is
able to find the correct on-disk file to open.  If the corresponding
file can't be found then an entry is added to
m_core_unavailable_mappings.  An attempt to read from a region covered
by this list will be redirected to the lower file target in the hope
that a read-only section from the executable (supplied by the user)
might satisfy the read.

And so, in core_target::xfer_partial, the order in which the various
lists are checked to satisfy a read are:

  1. Check m_core_section_table for sections with SEC_HAS_CONTENTS
  flag, this reads content from the core file bfd directly,

  2. Check m_core_file_mappings, this will read content from mapped
  files assuming that GDB was able to find the file on-disk,

  3. Check m_core_unavailable_mappings and redirect to the file
  target, this allows the user supplied executable to satisfy some
  reads,

  4. Check m_core_section_table for sections without SEC_HAS_CONTENTS
  flag, this returns zero for these regions.

If a file is recorded as mapped in the core file, but the file was
mapped read only, then the matching entry in m_core_section_table will
not have the SEC_HAS_CONTENTS flag set.

If GDB can find the file on disk then an entry is added to
m_core_file_mappings, and reads from the region covered by the file
will be satisfied by #2 in the list above.

However, if GDB can't find the file then an entry is added to the
m_core_unavailable_mappings list.  An attempt will be made to satisfy
the read at stage #3 in the list above, however, if this file is not
the executable, then stage #3 will not satisfy the read.

We then fall through to stage #4, the section does not have the
SEC_HAS_CONTENTS flag, and so we end up returning zero.

In this case, I think this is the wrong thing to do.  If the core file
tells us that a file is mapped in, but GDB can't find the file on
disk, then we should instead return a failure when trying to read from
the region covered by the file, anything else is just misleading.

And so, I propose that in core_target::core_target() when the core
file is initially opened and m_core_section_table is setup, we should
remove entries that correspond to m_core_unavailable_mappings entries.

This means that an attempt to read from a mapped file, when GDB is not
able to find the on-disk file, and the contents of the mapped file
were not copied into the core file, then the read will fail.  This
feels like the only sensible option.

I've extended an existing test to exercise this case.  The test maps a
single file twice, once in read-only mode, and once in read-write
mode.  We then remove the on-disk file and load the core file in GDB.
The mapping which was read-write is still accessible (the content of
this region is found in the core file bfd object), while the read-only
mapping is no longer accessible.
---
 gdb/corelow.c                       | 58 +++++++++++++++++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp | 39 +++++++++++++++++++
 2 files changed, 97 insertions(+)

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 462a05591c1..4a489d95706 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -166,6 +166,11 @@ class core_target final : public process_stratum_target
 						    ULONGEST len,
 						    ULONGEST *xfered_len);
 
+  /* Remove from m_core_section_table any entries that are covered by an
+     unavailable mapping.  This will prevent xfer_partial from returning
+     zero for unavailable mappings.  */
+  void filter_unavailable_mappings_from_section_table ();
+
   /* FIXME: kettenis/20031023: Eventually this field should
      disappear.  */
   struct gdbarch *m_core_gdbarch = NULL;
@@ -198,6 +203,59 @@ core_target::core_target ()
   m_core_section_table = build_section_table (current_program_space->core_bfd ());
 
   build_file_mappings ();
+
+  if (!m_core_unavailable_mappings.empty ())
+    filter_unavailable_mappings_from_section_table ();
+}
+
+/* See comment in class declaration.  */
+
+void
+core_target::filter_unavailable_mappings_from_section_table ()
+{
+  auto filter_cb = [&] (const target_section &tsect) -> bool
+  {
+    /* If TSECT has contents then this means that there is content to be
+       read within the core file bfd itself.  If TSECT does not have
+       content then we are relying on reading the content from some other
+       file that was mapped into the inferior at the time the core file was
+       created, but if GDB couldn't find that file then we are not able to
+       read the content and TSECT should be discarded.  */
+    if ((tsect.the_bfd_section->flags & SEC_HAS_CONTENTS) != 0)
+      return false;
+
+    /* Find the first entry in m_core_unavailable_mappings whose start
+       address is not less than or equal to the start of this target
+       section.  The previous unavailable mapping might cover TSECT.  */
+    auto it = std::lower_bound (m_core_unavailable_mappings.begin (),
+				m_core_unavailable_mappings.end (),
+				tsect.addr,
+				[] (const mem_range &mr,
+				    const CORE_ADDR &addr)
+				{
+				  return mr.start <= addr;
+				});
+
+    /* If there is a previous unavailable mapping, then that mapping's
+       start address is less than, or equal to the start of TSECT.  The
+       unavailable mapping might cover TSECT.  If it does then we will
+       delete TSECT.  */
+    if (it != m_core_unavailable_mappings.begin ())
+      {
+	--it;
+
+	if (it->start <= tsect.addr
+	    && (it->start + it->length) >= tsect.endaddr)
+	  return true;
+      }
+
+    return false;
+  };
+
+  auto end = std::remove_if (m_core_section_table.begin (),
+			     m_core_section_table.end (),
+			     filter_cb);
+  m_core_section_table.erase (end, m_core_section_table.end ());
 }
 
 /* Construct the table for file-backed mappings if they exist.
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index 28b723e1f85..ee152e629ab 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -199,6 +199,45 @@ gdb_test "up" "#\[0-9\]* *(\[0-9xa-fH'\]* in)? .* \\(.*\\).*" "up (reinit)"
 
 gdb_test "core" "No core file now."
 
+# Temporarily move coremmap.data out of the way and reload the core
+# file.  We should still be able to read buf2 as the contents of this
+# are written into the core file.  In contrast buf2ro should no longer
+# be readable as the contents of this region are not within the core
+# file, GDB relies on reading this from the coremmap.data file, which
+# can no longer be found.
+set coremmap_data_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data]
+set coremmap_data_backup_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data.backup]
+remote_exec host "mv ${coremmap_data_filename} \
+		  ${coremmap_data_backup_filename}"
+
+clean_restart $binfile
+
+# Load the core file and check we get a warning about the
+# coremmap.data file being missing.
+gdb_test_multiple "core-file $corefile" "warn about coremmap.data missing" {
+    -re -wrap "warning: Can't open file \[^\r\n\]+/coremmap.data during file-backed mapping note processing\r\n.*" {
+	pass $gdb_test_name
+    }
+}
+
+# This xfail was just copied from earlier in the script where we also
+# read from buf2.
+setup_xfail "*-*-sunos*" "*-*-aix*"
+gdb_test "x/8bd buf2" \
+    ".*:.*0.*1.*2.*3.*4.*5.*6.*7.*" \
+    "accessing mmapped data in core file with coremmap.data removed"
+
+gdb_test "x/8bd buf2ro" \
+    "$hex\[^:\]*:\\s+Cannot access memory at address $hex" \
+    "accessing read-only mmapped data in core file with coremmap.data removed"
+
+# Restore the coremmap.data file so later tests don't give warnings
+# when the core file is reloaded.
+remote_exec host "mv ${coremmap_data_backup_filename} \
+		  ${coremmap_data_filename}"
+
 # Test that we can unload the core with the "detach" command.
 
 proc_with_prefix corefile_detach {} {
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/4] gdb/corefile: improve file backed mapping handling
  2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
  2024-05-17 14:18 ` [PATCH 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
@ 2024-05-17 14:18 ` Andrew Burgess
  2024-05-17 14:18 ` [PATCH 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-17 14:18 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit improves how GDB handles file backed mappings within a
core file, specifically, this is a restructuring of the function
core_target::build_file_mapping.

The primary motivation for this commit was to put in place the
infrastructure to support the next commit in this series, but this
commit does itself make some improvements.

Currently in core_target::build_file_mapping we use
gdbarch_read_core_file_mappings to iterate over the mapped regions
within a core file.

For each region a callback is invoked which is passed details of the
mapping; the file the mapping is from, the offset into the file, and
the address range at which the mapping exists.  We are also passed the
build-id for the mapped file in some cases.

We are only told the build-id for the mapped region which actually
contains the ELF header of the mapped file.  Other regions of the same
mapped ELF will not have the build-id passed to the callback.

Within core_target::build_file_mapping, in the per-region callback, we
try to find the mapped file based on its filename.  If the file can't
be found, and if we have a build-id then we'll ask debuginfod to
download the file.

However we find the file, we cache the opened bfd object, which is
good.  Subsequent mappings from the same file will not have a build-id
set, but by that point we already have a cached open bfd object, so
the lack of build-id is irrelevant.

The problem with the above is that if we find a matching file based on
the filename, then we accept that file, even if we have a build-id,
and the build-id doesn't match.

Currently, the mapped region processing is done in a single pass, we
call gdbarch_read_core_file_mappings, and for each mapping, as we see
it, we create the data structures needed to represent that mapping.

In this commit, I will change this to a two phase process.  In the
first phase the mappings are grouped together based on the name of the
mapped file.  At the end of phase one we have a 'struct mapped_file',
a new struct, for each mapped file.  This struct associates an
optional build-id with a list of mapped regions.

In the second phase we try to find the file using its filename.  If
the file is found, and the 'struct mapped_file' has a build-id, then
we'll compare the build-id with the file we found.  This allows us to
reject on-disk files which have changed since the core file was
created.

If no suitable file was found (either no file found, or a build-id
mismatch) then we can use debuginfod to potentially download a
suitable file.

  NOTE: In the future we could potentially add additional sanity
  checks here, for example, if a data-file is mapped, and has no
  build-id, we can estimate a minimum file size based on the expected
  mappings.  If the file we find is not big enough then we can reject
  the on-disk file.  But I don't know how useful this would actually
  be, so I've not done that for now.

Having found (or not) a suitable file then we can create the data
structures for each mapped region just as we did before.

The new functionality here is the extra build-id check, and the
possibility of rejecting an on-disk file if the build-id doesn't
match.

This change could have been done within the existing single phase
approach I think, however, in the next approach I need to have all the
mapped regions associated with the expected build-id, and the new two
phase structure allows me to do that, this is the reason for such an
extensive rewrite in this commit.

There's a new test that exercises GDB's ability to find mapped files
via the build-id, and this downloading from debuginfod.
---
 gdb/corelow.c                                 | 274 +++++++++++-----
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 ++
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 ++
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 +++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 304 ++++++++++++++++++
 5 files changed, 596 insertions(+), 73 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 4a489d95706..28d619594d8 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -53,6 +53,7 @@
 #include "cli/cli-cmds.h"
 #include "xml-tdesc.h"
 #include "memtag.h"
+#include "cli/cli-style.h"
 
 #ifndef O_LARGEFILE
 #define O_LARGEFILE 0
@@ -277,6 +278,48 @@ core_target::build_file_mappings ()
   std::unordered_map<std::string, struct bfd *> bfd_map;
   std::unordered_set<std::string> unavailable_paths;
 
+  /* Type holding information about a single file mapped into the inferior
+     at the point when the core file was created.  Associates a build-id
+     with the list of regions the file is mapped into.  */
+  struct mapped_file
+  {
+    /* Type for a region of a file that was mapped into the inferior when
+       the core file was generated.  */
+    struct region
+    {
+      /* Constructor.   See member variables for argument descriptions.  */
+      region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
+	: start (start_),
+	  end (end_),
+	  file_ofs (file_ofs_)
+      { /* Nothing.  */ }
+
+      /* The inferior address for the start of the mapped region.  */
+      CORE_ADDR start;
+
+      /* The inferior address immediately after the mapped region.  */
+      CORE_ADDR end;
+
+      /* The offset within the mapped file for this content.  */
+      CORE_ADDR file_ofs;
+    };
+
+    /* If not nullptr, then this is the build-id associated with this
+       file.  */
+    const bfd_build_id *build_id = nullptr;
+
+    /* If true then we have seen multiple different build-ids associated
+       with the same filename.  The build_id field will have been set back
+       to nullptr, and we should not set build_id in future.  */
+    bool ignore_build_id_p = false;
+
+    /* All the mapped regions of this file.  */
+    std::vector<region> regions;
+  };
+
+  /* All files mapped into the core file.  The key is the filename.  */
+  std::unordered_map<std::string, mapped_file> mapped_files;
+
   /* See linux_read_core_file_mappings() in linux-tdep.c for an example
      read_core_file_mappings method.  */
   gdbarch_read_core_file_mappings (m_core_gdbarch,
@@ -297,87 +340,172 @@ core_target::build_file_mappings ()
 	   weed out non-file-backed mappings.  */
 	gdb_assert (filename != nullptr);
 
-	if (unavailable_paths.find (filename) != unavailable_paths.end ())
+	/* Add this mapped region to the data for FILENAME.  */
+	mapped_file &file_data = mapped_files[filename];
+	file_data.regions.emplace_back (start, end, file_ofs);
+	if (build_id != nullptr && !file_data.ignore_build_id_p)
 	  {
-	    /* We have already seen some mapping for FILENAME but failed to
-	       find/open the file.  There is no point in trying the same
-	       thing again so just record that the range [start, end) is
-	       unavailable.  */
-	    m_core_unavailable_mappings.emplace_back (start, end - start);
-	    return;
-	  }
-
-	struct bfd *bfd = bfd_map[filename];
-	if (bfd == nullptr)
-	  {
-	    /* Use exec_file_find() to do sysroot expansion.  It'll
-	       also strip the potential sysroot "target:" prefix.  If
-	       there is no sysroot, an equivalent (possibly more
-	       canonical) pathname will be provided.  */
-	    gdb::unique_xmalloc_ptr<char> expanded_fname
-	      = exec_file_find (filename, NULL);
-
-	    if (expanded_fname == nullptr && build_id != nullptr)
-	      debuginfod_exec_query (build_id->data, build_id->size,
-				     filename, &expanded_fname);
-
-	    if (expanded_fname == nullptr)
+	    if (file_data.build_id == nullptr)
+	      file_data.build_id = build_id;
+	    else if (!build_id_equal (build_id, file_data.build_id))
 	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s during file-backed mapping "
-			   "note processing"),
-			 filename);
-		return;
+		warning (_("Multiple build-ids found for %ps"),
+			 styled_string (file_name_style.style (), filename));
+		file_data.build_id = nullptr;
+		file_data.ignore_build_id_p = true;
 	      }
+	  }
+      });
 
-	    bfd = bfd_openr (expanded_fname.get (), "binary");
+  for (const auto &iter : mapped_files)
+    {
+      const std::string &filename = iter.first;
+      const mapped_file &file_data = iter.second;
+
+      /* Use exec_file_find() to do sysroot expansion.  It'll
+	 also strip the potential sysroot "target:" prefix.  If
+	 there is no sysroot, an equivalent (possibly more
+	 canonical) pathname will be provided.  */
+      gdb::unique_xmalloc_ptr<char> expanded_fname
+	= exec_file_find (filename.c_str (), nullptr);
+
+      bool build_id_mismatch = false;
+      if (expanded_fname != nullptr && file_data.build_id != nullptr)
+	{
+	  /* We temporarily open the bfd as a structured target, this
+	     allows us to read the build-id from the bfd if there is one.
+	     For this task it's OK if we reuse an already open bfd object,
+	     so we make this call through GDB's bfd cache.  Once we've
+	     checked the build-id (if there is one) we'll drop this
+	     reference and re-open the bfd using the "binary" target.  */
+	  gdb_bfd_ref_ptr tmp_bfd
+	    = gdb_bfd_open (expanded_fname.get (), gnutarget);
+
+	  if (tmp_bfd != nullptr
+	      && bfd_check_format (tmp_bfd.get (), bfd_object)
+	      && build_id_bfd_get (tmp_bfd.get ()) != nullptr)
+	    {
+	      /* The newly opened TMP_BFD has a build-id, and this mapped
+		 file has a build-id extracted from the core-file.  Check
+		 the build-id's match, and if not, reject TMP_BFD.  */
+	      const struct bfd_build_id *found
+		= build_id_bfd_get (tmp_bfd.get ());
+	      if (!build_id_equal (found, file_data.build_id))
+		build_id_mismatch = true;
+	    }
+	}
 
-	    if (bfd == nullptr || !bfd_check_format (bfd, bfd_object))
-	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s which was expanded to %s "
+      gdb_bfd_ref_ptr abfd;
+      if (expanded_fname != nullptr && !build_id_mismatch)
+	{
+	  struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	  abfd = gdb_bfd_ref_ptr::new_reference (b);
+	}
+
+      if ((expanded_fname == nullptr
+	   || abfd == nullptr
+	   || !bfd_check_format (abfd.get (), bfd_object))
+	  && file_data.build_id != nullptr)
+	{
+	  expanded_fname = nullptr;
+	  debuginfod_exec_query (file_data.build_id->data,
+				 file_data.build_id->size,
+				 filename.c_str (), &expanded_fname);
+	  if (expanded_fname != nullptr)
+	    {
+	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      abfd = gdb_bfd_ref_ptr::new_reference (b);
+	    }
+	}
+
+      if (expanded_fname == nullptr
+	  || abfd == nullptr
+	  || !bfd_check_format (abfd.get (), bfd_object))
+	{
+	  /* If ABFD was opened, but the wrong format, close it now.  */
+	  abfd = nullptr;
+
+	  /* Record all regions for this file as unavailable.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    m_core_unavailable_mappings.emplace_back (region.start,
+						      region.end
+						      - region.start);
+
+	  /* And give the user an appropriate warning.  */
+	  if (build_id_mismatch)
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("File %ps doesn't match build-id from core-file "
+			   "during file-backed mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("File %ps which was expanded to %ps, doesn't match "
+			   "build-id from core-file during file-backed "
+			   "mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	  else
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("Can't open file %ps during file-backed mapping "
+			   "note processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("Can't open file %ps which was expanded to %ps "
 			   "during file-backed mapping note processing"),
-			 filename, expanded_fname.get ());
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	}
+      else
+	{
+	  /* Ensure that the bfd will be closed when core_bfd is closed.
+	     This can be checked before/after a core file detach via "maint
+	     info bfds".  */
+	  gdb_bfd_record_inclusion (current_program_space->core_bfd (),
+				    abfd.get ());
+
+	  /* Create sections for each mapped region.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    {
+	      /* Make new BFD section.  All sections have the same name,
+		 which is permitted by bfd_make_section_anyway().  */
+	      asection *sec = bfd_make_section_anyway (abfd.get (), "load");
+	      if (sec == nullptr)
+		error (_("Can't make section"));
+	      sec->filepos = region.file_ofs;
+	      bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
+	      bfd_set_section_size (sec, region.end - region.start);
+	      bfd_set_section_vma (sec, region.start);
+	      bfd_set_section_lma (sec, region.start);
+	      bfd_set_section_alignment (sec, 2);
+
+	      /* Set target_section fields.  */
+	      m_core_file_mappings.emplace_back (region.start, region.end, sec);
+	    }
+	}
 
-		if (bfd != nullptr)
-		  bfd_close (bfd);
-		return;
-	      }
-	    /* Ensure that the bfd will be closed when core_bfd is closed. 
-	       This can be checked before/after a core file detach via
-	       "maint info bfds".  */
-	    gdb_bfd_record_inclusion (current_program_space->core_bfd (), bfd);
-	    bfd_map[filename] = bfd;
-	  }
+      /* If this is a bfd of a shared library, record its soname and
+	 build-id.  */
+      if (file_data.build_id != nullptr && abfd != nullptr)
+	{
+	  gdb::unique_xmalloc_ptr<char> soname
+	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
 
-	/* Make new BFD section.  All sections have the same name,
-	   which is permitted by bfd_make_section_anyway().  */
-	asection *sec = bfd_make_section_anyway (bfd, "load");
-	if (sec == nullptr)
-	  error (_("Can't make section"));
-	sec->filepos = file_ofs;
-	bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
-	bfd_set_section_size (sec, end - start);
-	bfd_set_section_vma (sec, start);
-	bfd_set_section_lma (sec, start);
-	bfd_set_section_alignment (sec, 2);
-
-	/* Set target_section fields.  */
-	m_core_file_mappings.emplace_back (start, end, sec);
-
-	/* If this is a bfd of a shared library, record its soname
-	   and build id.  */
-	if (build_id != nullptr)
-	  {
-	    gdb::unique_xmalloc_ptr<char> soname
-	      = gdb_bfd_read_elf_soname (bfd->filename);
-	    if (soname != nullptr)
-	      set_cbfd_soname_build_id (current_program_space->cbfd,
-					soname.get (), build_id);
-	  }
-      });
+	  if (soname != nullptr)
+	    set_cbfd_soname_build_id (current_program_space->cbfd,
+				      soname.get (), file_data.build_id);
+	}
+    }
 
   normalize_mem_ranges (&m_core_unavailable_mappings);
 }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
new file mode 100644
index 00000000000..dd6b3f105f2
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
@@ -0,0 +1,24 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+int
+main (void)
+{
+  int res = foo ();
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
new file mode 100644
index 00000000000..ccf35b75ef8
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
@@ -0,0 +1,22 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+volatile int *library_ptr = (int *) POINTER_VALUE;
+
+int
+foo (void)
+{
+  return *library_ptr;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
new file mode 100644
index 00000000000..45d78336930
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
@@ -0,0 +1,45 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+volatile void* library_base_address = 0;
+volatile int *ptr = 0;
+
+int
+main ()
+{
+  struct stat buf;
+  int res;
+
+  int fd = open (SHLIB_FILENAME, O_RDONLY);
+  assert (fd != -1);
+
+  res = fstat (fd, &buf);
+  assert (res == 0);
+
+  library_base_address
+    = mmap (NULL, buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+  res = *ptr;	/* Undefined behaviour here.  */
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
new file mode 100644
index 00000000000..c789be87ba3
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -0,0 +1,304 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test checks GDB's ability to use build-ids when setting up file backed
+# mappings as part of reading a core-file.
+#
+# A core-file contains a list of the files that were mapped into the process
+# at the time of the core-file creation.  If the file was mapped read-only
+# then the file contents will not be present in the core-file, but instead GDB
+# is expected to open the mapped file and read the contents from there if
+# needed.  And this is what GDB does.
+#
+# GDB (via the BFD library) will also spot if a mapped looks like a valid ELF
+# and contains a build-id, this build-id is passed back to GDB so that GDB can
+# validate the on-disk file it finds matches the file that was mapped when the
+# core-file was created.
+#
+# In addition, if the on-disk file is found to have a non-matching build-id
+# then GDB can use debuginfod to (try) and download a suitable file.
+#
+# This test is about checking that this file backed mapping mechanism works
+# correctly; that GDB will spot when the build-ids fail to match and will
+# refuse to load an incorrect file.  Additionally we check that the correct
+# file can be downloaded from debuginfod.
+#
+# The test is rather contrived though.  Instead of relying on having a shared
+# library mapped at the time of crash we mmap a shared library into the
+# process and then check this mapping within the test.
+#
+# The problem with using a normal shared library load for this test is that
+# the shared library list is processed as part of a separate step when opening
+# the core file.  Right now this separate step doesn't check the build-ids
+# correctly, so GDB will potentially open the wrong shared library file.  The
+# sections of this incorrect shared library are then added to GDB's list of
+# target sections, and are used to satisfy memory reads, which can give the
+# wrong results.
+#
+# This obviously needs fixing, but is a separate problem from the one being
+# tested here, so this test deliberately checks the mapping using a file that
+# is mmaped rather than loaded as a shared library, as such the file is in the
+# core-files list of mapped files, but is not in the shared library list.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+
+standard_testfile -1.c -2.c -3.c
+
+# Compile an executable that loads the shared library as an actual
+# shared library, then use GDB to figure out the offset of the
+# variable 'library_ptr' within the library.
+set library_filename [standard_output_file "libfoo.so"]
+set binfile2 [standard_output_file "library_loader"]
+
+if {[prepare_for_testing_full "build exec which loads the shared library" \
+	 [list $library_filename \
+	      { debug shlib build-id \
+		    additional_flags=-DPOINTER_VALUE=0x12345678 } \
+	      $srcfile2 {}] \
+	 [list $binfile2 [list debug shlib=$library_filename ] \
+	      $srcfile { debug }]] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+if { [is_address_zero_readable] } {
+    return
+}
+
+set ptr_address [get_hexadecimal_valueof "&library_ptr" "unknown"]
+
+set ptr_offset "unknown"
+gdb_test_multiple "info proc mappings" "" {
+    -re "^\\s+($hex)\\s+($hex)\\s+$hex\\s+($hex)\[^\r\n\]+$library_filename\r\n" {
+	set low_addr $expect_out(1,string)
+	set high_addr $expect_out(2,string)
+	set file_offset $expect_out(3,string)
+
+	if {[expr $ptr_address >= $low_addr] && [expr $ptr_address < $high_addr]} {
+	    set mapping_offset [expr $ptr_address - $low_addr]
+	    set ptr_offset [format 0x%x [expr $file_offset + $mapping_offset]]
+	}
+
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+    }
+
+    -re "(^\[^\r\n\]*)\r\n" {
+	set tmp $expect_out(1,string)
+	exp_continue
+    }
+}
+
+gdb_assert { $ptr_offset ne "unknown" } \
+    "found pointer offset"
+
+set ptr_size [get_integer_valueof "sizeof (library_ptr)" "unknown"]
+set ptr_format_char ""
+if { $ptr_size == 2 } {
+    set ptr_format_char "b"
+} elseif { $ptr_size == 4 } {
+    set ptr_format_char "w"
+} elseif { $ptr_size == 8 } {
+    set ptr_format_char "g"
+}
+if { $ptr_format_char eq "" } {
+    untested "could not figure out size of library_ptr variable"
+    return
+}
+
+# Helper proc to read a value from inferior memory.  Reads at address held in
+# global PTR_ADDRESS, and use PTR_FORMAT_CHAR for the size of the read.
+proc read_ptr_value {  } {
+    set value ""
+    gdb_test_multiple "x/1${::ptr_format_char}x ${::ptr_address}" "" {
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+($::hex)" {
+	    set value $expect_out(1,string)
+	}
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+Cannot access memory at address ${::hex}" {
+	    set value "unavailable"
+	}
+    }
+    return $value
+}
+
+set ptr_expected_value [read_ptr_value]
+if { $ptr_expected_value eq "" } {
+    untested "could not find expected value for library_ptr"
+    return
+}
+
+# Now compile a second executable.  This one doesn't load the shared
+# library as an actual shared library, but instead mmaps the library
+# into the executable.
+#
+# Load this executable within GDB and confirm that we can use the
+# offset we calculated previously to view the value of 'library_ptr'.
+set opts [list debug additional_flags=-DSHLIB_FILENAME=\"$library_filename\"]
+if {[prepare_for_testing "prepare second executable" $binfile \
+	 $srcfile3 $opts] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+gdb_breakpoint [gdb_get_line_number "Undefined behaviour here" $srcfile3]
+gdb_continue_to_breakpoint "run to breakpoint"
+
+set library_base_address \
+    [get_hexadecimal_valueof "library_base_address" "unknown"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable"
+
+# Now rerun the second executable outside of GDB.  The executable should crash
+# and generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# And now restart GDB, load the core-file and check that the library shows as
+# being mapped in, and that we can still read the library_ptr value from
+# memory.
+clean_restart $binfile
+
+gdb_test "core-file $corefile" \
+    [multi_line \
+	 "\\\[\[^\r\n\]+\\\]" \
+	 "Core was generated by \[^\r\n\]+" \
+	 "Program terminated with signal SIGSEGV, Segmentation fault\\." \
+	 "#0  main \\(\\) at \[^\r\n\]+" \
+	 "$decimal\\s+\[^\r\n\]+/\\* Undefined behaviour here\\.  \\*/"] \
+    "load corefile"
+
+set library_base_address [get_hexadecimal_valueof "library_base_address" \
+			      "unknown" "get library_base_address in core-file"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file"
+
+# Now move the shared library file away and restart GDB.  This time when we
+# load the core-file we should see a warning that GDB has failed to map in the
+# library file.  An attempt to read the variable from the library file should
+# fail / give a warning.
+set library_backup_filename [standard_output_file "libfoo.so.backup"]
+remote_exec host "mv \"$library_filename\" \"$library_backup_filename\""
+
+clean_restart $binfile
+
+gdb_test "core-file $corefile" \
+    [multi_line \
+	 "warning: Can't open file [string_to_regexp $library_filename] during file-backed mapping note processing" \
+	 "\\\[\[^\r\n\]+\\\]" \
+	 "Core was generated by \[^\r\n\]+" \
+	 "Program terminated with signal SIGSEGV, Segmentation fault\\." \
+	 "#0  main \\(\\) at \[^\r\n\]+" \
+	 "$decimal\\s+\[^\r\n\]+/\\* Undefined behaviour here\\.  \\*/"] \
+    "load corefile with library file missing"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with library file missing"
+
+# Build a new version of the shared library, keep the library the same size,
+# but change the contents so the build-id changes.  Then restart GDB and load
+# the core-file again.  GDB should spot that the build-id for the shared
+# library is not as expected, and should refuse to map in the shared library.
+if {[build_executable "build second version of shared library" \
+	 $library_filename $srcfile2 \
+	 { debug shlib build-id \
+	       additional_flags=-DPOINTER_VALUE=0x11223344 }] != 0} {
+    return
+}
+
+clean_restart $binfile
+
+gdb_test "core-file $corefile" \
+    [multi_line \
+	 "warning: File [string_to_regexp $library_filename] doesn't match build-id from core-file during file-backed mapping processing" \
+	 "\\\[\[^\r\n\]+\\\]" \
+	 "Core was generated by \[^\r\n\]+" \
+	 "Program terminated with signal SIGSEGV, Segmentation fault\\." \
+	 "#0  main \\(\\) at \[^\r\n\]+" \
+	 "$decimal\\s+\[^\r\n\]+/\\* Undefined behaviour here\\.  \\*/"] \
+    "load corefile with wrong library in place"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with wrong library in place"
+
+# Setup a debuginfod server which can serve the original shared library file.
+# Then restart GDB and load the core-file.  GDB should download the original
+# shared library from debuginfod and use that to provide the file backed
+# mapping.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    clean_restart
+    gdb_test_no_output "set debuginfod enabled on" \
+	"enabled debuginfod for initial test"
+    gdb_load $binfile
+
+    gdb_test "core-file $corefile" \
+	[multi_line \
+	     "Downloading\[^\r\n\]* executable for [string_to_regexp $library_filename]\\.\\.\\." \
+	     "\\\[\[^\r\n\]+\\\](?:\r\nDownloading separate debug info for \[^\r\n\]+)*" \
+	     "Core was generated by \[^\r\n\]+" \
+	     "Program terminated with signal SIGSEGV, Segmentation fault\\." \
+	     "#0  main \\(\\) at \[^\r\n\]+" \
+	     "$decimal\\s+\[^\r\n\]+/\\* Undefined behaviour here\\.  \\*/"] \
+	"load corefile, download library from debuginfod"
+
+    set ptr_value [read_ptr_value]
+    gdb_assert { $ptr_value == $ptr_expected_value } \
+	"check value of pointer variable after downloading library file"
+}
+
+stop_debuginfod
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 3/4] gdb: improve shared library build-id check for core-files
  2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
  2024-05-17 14:18 ` [PATCH 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
  2024-05-17 14:18 ` [PATCH 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
@ 2024-05-17 14:18 ` Andrew Burgess
  2024-05-17 14:19 ` [PATCH 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-17 14:18 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the libraries file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-id extracted during the ::build_file_mappings phase to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this two changes need to be made to GDB.

First, rather than just recording the soname to build-id mapping in
set_cbfd_soname_build_id, I now also record, the full mapped filename
to build-id mapping, and also the memory ranges to build-id mapping
for every memory range covered by a mapped file.

Second, I've added a new callback solib_ops::find_solib_addr.  This
callback takes a solib object and returns an (optional) address within
the inferior that is part of this library.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, I also build a reverse build-id to filename map in
set_cbfd_soname_build_id.  This is useful as often a shared library is
implemented as a symbolic link to the actual shared library file.  The
mapped file information is stored based on the actual final, real file
name, while the shared library information holds the original symbolic
link file name.

If when loading the shared library, we find the symbolic link has
disappeared, then we can use the build-id to file name map to check if
the actual file is still around, if it is (and if the build-id
matches) then we can fall back to use that file.

I've also renamed set_cbfd_soname_build_id to set_cbfd_solib_build_id
and get_cbfd_soname_build_id to get_cbfd_solib_build_id as the
'soname' in the function names was now only part of the larger job
these functions do.
---
 gdb/corelow.c                                 |  33 +-
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 373 +++++++++++++++---
 gdb/solib.h                                   |  39 +-
 gdb/solist.h                                  |  22 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 242 ++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  13 +
 15 files changed, 784 insertions(+), 64 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 28d619594d8..85bc3c26bea 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -418,6 +418,10 @@ core_target::build_file_mappings ()
 	    }
 	}
 
+      std::vector<mem_range> ranges;
+      for (const mapped_file::region &region : file_data.regions)
+	ranges.emplace_back (region.start, region.end - region.start);
+
       if (expanded_fname == nullptr
 	  || abfd == nullptr
 	  || !bfd_check_format (abfd.get (), bfd_object))
@@ -494,16 +498,29 @@ core_target::build_file_mappings ()
 	    }
 	}
 
-      /* If this is a bfd of a shared library, record its soname and
-	 build-id.  */
-      if (file_data.build_id != nullptr && abfd != nullptr)
+      /* If this is a bfd with a build-id then record the filename,
+	 optional soname (DT_SONAME .dynamic attribute), and the range of
+	 addresses at which this bfd is mapped.  This information can be
+	 used to perform build-id checking when loading the shared
+	 libraries.  */
+      if (file_data.build_id != nullptr)
 	{
-	  gdb::unique_xmalloc_ptr<char> soname
-	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+	  normalize_mem_ranges (&ranges);
+
+	  const char *actual_filename = nullptr;
+	  gdb::unique_xmalloc_ptr<char> soname;
+	  if (abfd != nullptr)
+	    {
+	      actual_filename = bfd_get_filename (abfd.get ());
+	      soname = gdb_bfd_read_elf_soname (actual_filename);
+	    }
 
-	  if (soname != nullptr)
-	    set_cbfd_soname_build_id (current_program_space->cbfd,
-				      soname.get (), file_data.build_id);
+	  set_cbfd_solib_build_id (current_program_space->cbfd,
+				   soname.get (),
+				   filename.c_str (),
+				   actual_filename,
+				   std::move (ranges),
+				   file_data.build_id);
 	}
     }
 
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index a50bb165c19..a9242408137 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -689,6 +689,12 @@ const solib_ops solib_aix_so_ops =
   solib_aix_open_symbol_file_object,
   solib_aix_in_dynsym_resolve_code,
   solib_aix_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_solib_aix ();
diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index f0828fdf102..e9a85db1e5e 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -665,4 +665,10 @@ const solib_ops darwin_so_ops =
   open_symbol_file_object,
   darwin_in_dynsym_resolve_code,
   darwin_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-dsbt.c b/gdb/solib-dsbt.c
index 11225f72ed0..479f32182d2 100644
--- a/gdb/solib-dsbt.c
+++ b/gdb/solib-dsbt.c
@@ -913,6 +913,12 @@ const solib_ops dsbt_so_ops =
   open_symbol_file_object,
   dsbt_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_dsbt_solib ();
diff --git a/gdb/solib-frv.c b/gdb/solib-frv.c
index 39508fab4c8..7502fe1f560 100644
--- a/gdb/solib-frv.c
+++ b/gdb/solib-frv.c
@@ -1084,4 +1084,10 @@ const solib_ops frv_so_ops =
   open_symbol_file_object,
   frv_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 1d4a50568d7..91359293e75 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -3353,6 +3353,15 @@ svr4_iterate_over_objfiles_in_search_order
     }
 }
 
+/* See solib_ops::find_solib_addr in solist.h.  */
+
+static std::optional<CORE_ADDR>
+svr4_find_solib_addr (solib &so)
+{
+  auto *li = gdb::checked_static_cast<lm_info_svr4 *> (so.lm_info.get ());
+  return li->l_addr_inferior;
+}
+
 const struct solib_ops svr4_so_ops =
 {
   svr4_relocate_section_addresses,
@@ -3368,6 +3377,7 @@ const struct solib_ops svr4_so_ops =
   svr4_keep_data_in_core,
   svr4_update_solib_event_breakpoints,
   svr4_handle_solib_event,
+  svr4_find_solib_addr,
 };
 
 void _initialize_svr4_solib ();
diff --git a/gdb/solib-target.c b/gdb/solib-target.c
index 6563da05a47..f89761f109f 100644
--- a/gdb/solib-target.c
+++ b/gdb/solib-target.c
@@ -412,4 +412,10 @@ const solib_ops solib_target_so_ops =
   solib_target_open_symbol_file_object,
   solib_target_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib.c b/gdb/solib.c
index db2ff5ce9a3..c39dfbcc78e 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -475,56 +475,304 @@ solib_bfd_open (const char *pathname)
   return abfd;
 }
 
-/* Mapping of a core file's shared library sonames to their respective
-   build-ids.  Added to the registries of core file bfds.  */
+/* A mem_range and the build-id associated with the file mapped into the
+   given range.  */
 
-typedef std::unordered_map<std::string, std::string> soname_build_id_map;
+struct mem_range_and_build_id
+{
+  mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+    : range (r),
+      build_id (id)
+  { /* Nothing.  */ }
+
+  /* A range of memory addresses.  */
+  mem_range range;
+
+  /* The build-id of the file mapped into RANGE.  */
+  const bfd_build_id *build_id;
+};
+
+/* When a core file is opened set_cbfd_solib_build_id is called and the
+   data it is passed is used to create an instance of this class.
+
+   This class maps various information extracted from parsing the core
+   file onto the corresponding build-id.
+
+   When we latter load the shared libraries as part of the core file
+   opening process we can use this data structure to find the build-id
+   that corresponds to a shared library.
+
+   Knowing the build-id allows for two things, first, we can refuse to
+   load an incorrect shared library (a file is found, but the build-id
+   doesn't match), and second, we can use various strategies to try and
+   find the correct shared library based on the build-id (e.g. we might be
+   able to download the shared library from debuginfod).  */
+
+struct solib_core_file_build_id_data
+{
+  /* A type that maps a string to a build-id.  */
+  using string_to_build_id_map
+    = std::unordered_map<std::string, const bfd_build_id *>;
+
+  /* A type for a reverse build-id to string map.  We only store a sting
+     pointer in this map as we point back to the strings held in a
+     string_to_build_id_map, this avoids duplicating some strings.  */
+  using build_id_to_string_map
+    = std::unordered_map<const bfd_build_id *, std::string>;
+
+  /* When loading a core file, the build-ids are extracted based on the
+     file backed mappings.  This map associates the name of a file that was
+     mapped into the core file with the corresponding build-id.  When
+     opening a shared library, if the shared library filename appears in
+     this map, then we can use the corresponding build-id.  The build-id
+     pointers in this map will never be nullptr.  */
+
+  string_to_build_id_map filename_to_build_id_map;
+
+  /* Map a build-id pointer back to a filename.  The std::string pointers
+     in this map point back to the strings in FILENAME_TO_BUILD_ID_MAP.
+     If we end up figuring out the build-id from SONAME_TO_BUILD_ID_MAP or
+     from ADDRESS_TO_BUILD_ID_LIST then it might be that we can't find an
+     on-disk file.  In this case we can use this reverse map to find the
+     name of the mapped file, which might be usable.  */
+
+  build_id_to_string_map build_id_to_filename_map;
+
+  /* If the file that was mapped into the core file was a shared library
+     then it might have a DT_SONAME tag in its .dynamic section, this tag
+     contains the name of a shared object.  When opening a shared library,
+     if it's basename appears in this map then we can use the corresponding
+     build-id.
+
+     In the rare case that two different files have the same DT_SONAME
+     value then the build-id pointer in this map will be nullptr, this
+     indicates that it's not possible to find a build-id based on the
+     DT_SONAME value.  */
+
+  string_to_build_id_map soname_to_build_id_map;
+
+  /* This vector maps memory ranges onto an associated build-id.  The
+     ranges are those of the files mapped into the core file.  When opening
+     a shared library, if non of the string to build-id maps provide a
+     build-id then we can find a representative address for the library by
+     calling solib_ops::find_solib_addr() then then look in this vector for
+     a memory range that covers this address.  If a suitable match is found
+     then we use the corresponding build-id.
+
+     Entries in this vector should not overlap, and are sorted be
+     increasing memory address.  Within each entry the build-id pointer
+     will not be nullptr.  */
+
+  std::vector<mem_range_and_build_id> address_to_build_id_list;
+
+  /* The address_to_build_id_list is built over multiple calls to
+     set_cbfd_solib_build_id, each call corresponds to a single mapped
+     file and might add multiple entries to address_to_build_id_list.  It
+     would be wasteful to sort address_to_build_id_list each time some new
+     entries are added, better to record that the address_to_build_id_list
+     is unsorted, and then sort once the first time we are going to perform
+     a lookup in address_to_build_id_list, that is what this flag is for.
+     When this flag is false address_to_build_id_list is not sorted, and
+     needs sorting before we search the list.  When this flag is true the
+     address_to_build_id_list is sorted.  */
+
+  bool address_to_build_id_list_sorted = false;
+};
 
 /* Key used to associate a soname_build_id_map to a core file bfd.  */
 
-static const struct registry<bfd>::key<soname_build_id_map>
+static const struct registry<bfd>::key<solib_core_file_build_id_data>
   cbfd_soname_build_id_data_key;
 
 /* See solib.h.  */
 
 void
-set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname,
-			  const bfd_build_id *build_id)
+set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *soname,
+			 const char *expected_filename,
+			 const char *actual_filename,
+			 std::vector<mem_range> &&ranges,
+			 const bfd_build_id *build_id)
 {
-  gdb_assert (abfd.get () != nullptr);
-  gdb_assert (soname != nullptr);
+  gdb_assert (cbfd.get () != nullptr);
   gdb_assert (build_id != nullptr);
+  gdb_assert (expected_filename != nullptr);
+
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
+
+  if (lookup_data == nullptr)
+    {
+      lookup_data = cbfd_soname_build_id_data_key.emplace (cbfd.get ());
+      gdb_assert (lookup_data != nullptr);
+    }
+
+  if (soname != nullptr)
+    {
+      /* If we already have an entry with this SONAME then this indicates
+	 that the inferior has two files mapped into memory with different
+	 file names (and most likely different build-ids), but with the
+	 same DT_SONAME attribute.  In this case we can't use the
+	 DT_SONAME to figure out the expected build-id of a shared
+	 library, so poison the entry for this SONAME by setting the entry
+	 to nullptr.  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (soname);
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr
+	  && !build_id_equal (it->second, build_id))
+	lookup_data->soname_to_build_id_map[soname] = nullptr;
+      else
+	lookup_data->soname_to_build_id_map[soname] = build_id;
+    }
+
+  /* When the core file is initially opened and the mapped files are
+     parsed, we group the build-id information based on the file name.  As
+     a consequence, we should see each EXPECTED_FILENAME value exactly
+     once.  This means that each insertion should always succeed.  */
+  const auto [it, inserted]
+    = lookup_data->filename_to_build_id_map.emplace (expected_filename,
+						     build_id);
+  gdb_assert (inserted);
+
+  /* Setup the reverse build-id to file name map.  We store a pointer to
+     the string in the filename_to_build_id_map, this saves creating
+     another copy of this string.  The two maps have the same lifetime, so
+     this should be fine.  */
+  if (actual_filename != nullptr)
+    lookup_data->build_id_to_filename_map.emplace (build_id,
+						   actual_filename);
+
+  /* Setup the list of memory range to build-id objects.  */
+  for (mem_range &r : ranges)
+    lookup_data->address_to_build_id_list.emplace_back (std::move (r),
+							build_id);
+
+  /* At this point the address_to_build_id_list is unsorted (we just added
+     some entries to the end of the list).  All entries should be added
+     before any look-ups are performed, and the list is only sorted when
+     the first look-up is performed.  */
+  gdb_assert (!lookup_data->address_to_build_id_list_sorted);
+}
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+/* Helper for get_cbfd_solib_build_id.  BUILD_ID is a build-id that was
+   found in LOOKUP_DATA.  Look-up the corresponding file name in
+   LOOKUP_DATA.build_id_to_filename_map and return a pair containing
+   BUILD_ID and the file name pointer that we find.  If no corresponding
+   filename is found in the build_id_to_filename_map then the returned
+   pair contains BUILD_ID and an empty string.  If BUILD_ID is nullptr
+   then the returned pair contains nullptr and an empty string.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_make_result (solib_core_file_build_id_data *lookup_data,
+			    const bfd_build_id *build_id)
+{
+  if (build_id != nullptr)
+    {
+      gdb_assert (lookup_data != nullptr);
 
-  if (mapptr == nullptr)
-    mapptr = cbfd_soname_build_id_data_key.emplace (abfd.get ());
+      auto it = lookup_data->build_id_to_filename_map.find (build_id);
+      if (it != lookup_data->build_id_to_filename_map.end ())
+	return { build_id, it->second };
+    }
 
-  (*mapptr)[soname] = build_id_to_string (build_id);
+  return { build_id, {} };
 }
 
-/* If SONAME had a build-id associated with it in ABFD's registry by a
-   previous call to set_cbfd_soname_build_id then return the build-id
-   as a NULL-terminated hex string.  */
+/* CBFD is the current core file bfd object, which might be nullptr,
+   FILENAME is the file name of the shared library GDB is trying to load,
+   and SOLIB_ADDR is (optionally) an address within the shared library.
 
-static gdb::unique_xmalloc_ptr<char>
-get_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname)
+   Look through the data previously registered via calls to
+   set_cbfd_solib_build_id and try to find a build-id corresponding to
+   FILENAME.
+
+   If a build-id is found then return a pair, the first item is the
+   build-id and the second item is a string which contains the file name of
+   the file GDB opened to provide the file backed mapping.  This file name
+   string might be empty if GDB didn't find a file to provide the file
+   backed mapping.
+
+   If no build-id can be found for FILENAME then GDB will return a pair
+   containing nullptr (for the build-id) and an empty string for the file
+   name.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *filename,
+			 const std::optional<CORE_ADDR> &solib_addr)
 {
-  if (abfd.get () == nullptr || soname == nullptr)
-    return {};
+  if (cbfd.get () == nullptr)
+    return { nullptr, {} };
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
 
-  if (mapptr == nullptr)
-    return {};
+  if (lookup_data == nullptr)
+    return { nullptr, {} };
 
-  auto it = mapptr->find (lbasename (soname));
-  if (it == mapptr->end ())
-    return {};
+  if (filename != nullptr)
+    {
+      /* If there's a matching entry in filename_to_build_id_map then the
+	 associated build-id will not be nullptr, and can be used to
+	 validate that FILENAME is correct.  */
+      auto it = lookup_data->filename_to_build_id_map.find (filename);
+      if (it != lookup_data->filename_to_build_id_map.end ())
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  if (solib_addr.has_value ())
+    {
+      /* On the first lookup, sort the address_to_build_id_list.  */
+      if (!lookup_data->address_to_build_id_list_sorted)
+	{
+	  std::sort (lookup_data->address_to_build_id_list.begin (),
+		     lookup_data->address_to_build_id_list.end (),
+		     [] (const mem_range_and_build_id &a,
+			 const mem_range_and_build_id &b) {
+		       return a.range < b.range;
+		     });
+	  lookup_data->address_to_build_id_list_sorted = true;
+	}
+
+      /* Look for the first entry whose range's start address is not less
+	 than, or equal too, the address SOLIB_ADDR.  If we find such an
+	 entry, then the previous entry's range might contain SOLIB_ADDR.
+	 If it does then that previous entry's build-id can be used.  */
+      auto it = std::lower_bound
+	(lookup_data->address_to_build_id_list.begin (),
+	 lookup_data->address_to_build_id_list.end (),
+	 *solib_addr,
+	 [] (const mem_range_and_build_id &a,
+	     const CORE_ADDR &b) {
+	  return a.range.start <= b;
+	});
+
+      if (it != lookup_data->address_to_build_id_list.begin ())
+	{
+	  --it;
 
-  return make_unique_xstrdup (it->second.c_str ());
+	  if (it->range.contains (*solib_addr))
+	    return get_cbfd_solib_make_result (lookup_data, it->build_id);
+	}
+    }
+
+  if (filename != nullptr)
+    {
+      /* If the basename of FILENAME appears in the soname_to_build_id_map
+	 then when the mapped files were processed, we saw a file with a
+	 DT_SONAME attribute corresponding to FILENAME, use that build-id
+	 to validate FILENAME.
+
+	 However, the build-id in this map might be nullptr if we saw
+	 multiple mapped files with the same DT_SONAME attribute (though
+	 this should be pretty rare).  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (lbasename (filename));
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr)
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  return { nullptr, {} };
 }
 
 /* Given a pointer to one of the shared objects in our list of mapped
@@ -546,36 +794,53 @@ solib_map_sections (solib &so)
 
   gdb::unique_xmalloc_ptr<char> filename (tilde_expand (so.so_name.c_str ()));
   gdb_bfd_ref_ptr abfd (ops->bfd_open (filename.get ()));
-  gdb::unique_xmalloc_ptr<char> build_id_hexstr
-    = get_cbfd_soname_build_id (current_program_space->cbfd,
-				so.so_name.c_str ());
+  std::optional<CORE_ADDR> solib_addr = ops->find_solib_addr (so);
+
+  const auto [expected_build_id, mapped_filename]
+    = get_cbfd_solib_build_id (current_program_space->cbfd,
+				so.so_name.c_str (),
+				solib_addr);
 
   /* If we already know the build-id of this solib from a core file, verify
      it matches ABFD's build-id.  If there is a mismatch or the solib wasn't
      found, attempt to query debuginfod for the correct solib.  */
-  if (build_id_hexstr.get () != nullptr)
+  if (expected_build_id != nullptr)
     {
-      bool mismatch = false;
+      bool mismatch = (abfd != nullptr
+		       && build_id_bfd_get (abfd.get ()) != nullptr
+		       && !build_id_equal (expected_build_id,
+					   build_id_bfd_get (abfd.get ())));
 
-      if (abfd != nullptr && abfd->build_id != nullptr)
-	{
-	  std::string build_id = build_id_to_string (abfd->build_id);
-
-	  if (build_id != build_id_hexstr.get ())
-	    mismatch = true;
-	}
       if (abfd == nullptr || mismatch)
 	{
-	  scoped_fd fd = debuginfod_exec_query (
-	    (const unsigned char *) build_id_hexstr.get (), 0,
-	    so.so_name.c_str (), &filename);
-
-	  if (fd.get () >= 0)
-	    abfd = ops->bfd_open (filename.get ());
-	  else if (mismatch)
-	    warning (_ ("Build-id of %ps does not match core file."),
-		     styled_string (file_name_style.style (),
-				    filename.get ()));
+	  /* If GDB found a suitable file during the file mapping
+	     processing stage then lets use that.  We don't check the
+	     build-id after opening this file, either this file was found
+	     by build-id, in which case it's going to match, or this file
+	     doesn't have a build-id, so checking tells us nothing.
+	     However, if it was good enough during the mapped file
+	     processing, we assume it's good enough now.  */
+	  if (!mapped_filename.empty ())
+	    abfd = ops->bfd_open (mapped_filename.c_str ());
+	  else
+	    abfd = nullptr;
+
+	  if (abfd == nullptr)
+	    {
+	      scoped_fd fd = debuginfod_exec_query
+		(expected_build_id->data, expected_build_id->size,
+		 so.so_name.c_str (), &filename);
+
+	      if (fd.get () >= 0)
+		abfd = ops->bfd_open (filename.get ());
+	      else if (mismatch)
+		{
+		  warning (_ ("Build-id of %ps does not match core file."),
+			   styled_string (file_name_style.style (),
+					  filename.get ()));
+		  abfd = nullptr;
+		}
+	    }
 	}
     }
 
@@ -1705,6 +1970,14 @@ remove_user_added_objfile (struct objfile *objfile)
     }
 }
 
+/* See solist.h.  */
+
+std::optional<CORE_ADDR>
+default_find_solib_addr (solib &so)
+{
+  return {};
+}
+
 void _initialize_solib ();
 
 void
diff --git a/gdb/solib.h b/gdb/solib.h
index f7a93c0718f..9cd9f111cde 100644
--- a/gdb/solib.h
+++ b/gdb/solib.h
@@ -29,6 +29,7 @@ struct program_space;
 #include "gdb_bfd.h"
 #include "symfile-add-flags.h"
 #include "gdbsupport/function-view.h"
+#include "memrange.h"
 
 /* Value of the 'set debug solib' configuration variable.  */
 
@@ -136,11 +137,39 @@ extern void update_solib_breakpoints (void);
 
 extern void handle_solib_event (void);
 
-/* Associate SONAME with BUILD_ID in ABFD's registry so that it can be
-   retrieved with get_cbfd_soname_build_id.  */
+/* This function is used to pass information extracted when processing the
+   core file's mapped file section through to the shared library processing
+   parts of GDB.  When the mapped files are processed GDB is (sometimes)
+   able to find the build-ids for the mapped files.  These build-ids are
+   recorded by calling this function, and can be used during shared library
+   loading to ensure the correct shared libraries are loaded.
 
-extern void set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd,
-				      const char *soname,
-				      const bfd_build_id *build_id);
+   CBFD is the core file bfd object to record this information against.
+
+   SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+   the shared library that was mapped into the core file.  This can be
+   nullptr.
+
+   EXPECTED_FILENAME is the name of the file that was mapped into the
+   inferior as extracted from the core file, this should never be nullptr.
+
+   ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+   mapped file information, this can be nullptr if GDB failed to find a
+   suitable file.
+
+   RANGES is the list of memory ranges at which this file was mapped into
+   the inferior.
+
+   BUILD_ID this is the build-id for this mapped file, this will never be
+   nullptr.  Not every mapped file will have a build-id, but there's no
+   point calling this function if we failed to find a build-id; this is all
+   about allowing the shared library code to find the build-id.  */
+
+extern void set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd,
+				     const char *soname,
+				     const char *expected_filename,
+				     const char *actual_filename,
+				     std::vector<mem_range> &&ranges,
+				     const bfd_build_id *build_id);
 
 #endif /* SOLIB_H */
diff --git a/gdb/solist.h b/gdb/solist.h
index f0d22080de1..bd60987f49c 100644
--- a/gdb/solist.h
+++ b/gdb/solist.h
@@ -167,6 +167,23 @@ struct solib_ops
      NULL, in which case no specific preprocessing is necessary
      for this target.  */
   void (*handle_event) (void);
+
+  /* Return an address within the inferior's address space which is known
+     to be part of SO.  If there is no such address, or GDB doesn't know
+     how to figure out such an address then an empty optional is
+     returned.
+
+     The returned address can be used when loading the shared libraries
+     for a core file.  GDB knows the build-ids for (some) files mapped
+     into the inferior's address space, and knows the address ranges which
+     those mapped files cover.  If GDB can figure out a representative
+     address for the library then this can be used to match a library to a
+     mapped file, and thus to a build-id.  GDB can then use this
+     information to help locate the shared library objfile, if the objfile
+     is not in the expected place (as defined by the shared libraries file
+     name).  */
+
+  std::optional<CORE_ADDR> (*find_solib_addr) (solib &so);
 };
 
 /* A unique pointer to a so_list.  */
@@ -186,4 +203,9 @@ extern gdb_bfd_ref_ptr solib_bfd_fopen (const char *pathname, int fd);
 /* Find solib binary file and open it.  */
 extern gdb_bfd_ref_ptr solib_bfd_open (const char *in_pathname);
 
+/* A default implementation of the solib_ops::find_solib_addr callback.
+   This just returns an empty std::optional<CORE_ADDR> indicating GDB is
+   unable to find an address within the library SO.  */
+extern std::optional<CORE_ADDR> default_find_solib_addr (solib &so);
+
 #endif
diff --git a/gdb/testsuite/gdb.base/solib-search.exp b/gdb/testsuite/gdb.base/solib-search.exp
index 581046a1ed8..bbfe3ae719a 100644
--- a/gdb/testsuite/gdb.base/solib-search.exp
+++ b/gdb/testsuite/gdb.base/solib-search.exp
@@ -43,7 +43,11 @@ set right_binfile2_lib \
 set binfile1_lib [standard_output_file ${libname1}.so]
 set binfile2_lib [standard_output_file ${libname2}.so]
 
-set lib_flags [list debug ldflags=-Wl,-Bsymbolic]
+# When this test was written, GDB's ability to track down shared
+# libraries for a core file based on the build-id much poorer.  As GDB
+# has improved we now need to disable build-ids in order for this test
+# to function as expected.
+set lib_flags [list debug no-build-id ldflags=-Wl,-Bsymbolic]
 set wrong_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=1"
 set right_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=8192 additional_flags=-DRIGHT"
 
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
new file mode 100644
index 00000000000..30e9267dc01
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
@@ -0,0 +1,39 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+/* It is important that these two variables have names of the same length
+   so that the debug information in the two library versions is laid out
+   the same.  If they differ then the .dynamic section might move, which
+   will trigger a different check within GDB than the one we actually want
+   to check.  */
+
+#if LIB_VERSION == 1
+volatile int *library_1_var = (volatile int *) 0x12345678;
+#elif LIB_VERSION == 2
+volatile int *library_2_var = (volatile int *) 0x11223344;
+#else
+# error Unknown library version
+#endif
+
+int
+foo (void)
+{
+  /* This should trigger a core dump.  */
+  abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
new file mode 100644
index 00000000000..a51d48e2f2e
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
@@ -0,0 +1,41 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <stddef.h>
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+/* This is updated by the .exp file.  */
+char *libname = "libfoo_2.so";
+
+int
+main (void)
+{
+  void *handle;
+  int res, tmp;
+
+  handle = dlopen (libname, RTLD_LAZY);
+  assert (handle != NULL);
+
+  res = foo ();
+
+  tmp = dlclose (handle);
+  assert (tmp == 0);
+
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
new file mode 100644
index 00000000000..dfc78923436
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -0,0 +1,242 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test exercises GDB's ability to validate build-ids when loading
+# shared libraries for a core file.
+#
+# The test creates two "versions" of a shared library, sets up a
+# symlink to point to one version of the library, and creates a core file.
+#
+# We then try re-loading the core file and executable and check that
+# GDB is able to correctly load the shared library.  To confuse things
+# we retarget the library symlink at the other version of the library.
+#
+# After that we repeat the test, but this time deleting the symlink
+# completely.
+#
+# Then we remove the version of the library completely, at this point
+# we do expect GDB to give a warning about being unable to load the library.
+#
+# And finally, we setup debuginfod and have it serve the missing
+# library file, GDB should correctly download the library file.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+require {istarget "*-linux*"}
+
+standard_testfile -1.c -2.c
+
+# Build two similar, but slightly different versions of the shared
+# library.  Both libraries have DT_SONAME set to the generic
+# libfoo.so, we'll create a symlink with that name later.
+set library_1_filename [standard_output_file "libfoo_1.so"]
+set library_2_filename [standard_output_file "libfoo_2.so"]
+
+# The generic name for the library.
+set library_filename [standard_output_file "libfoo.so"]
+
+# When compiling a shared library the -Wl,-soname,NAME option is
+# automatically added based on the final name of the library.  We want
+# to compile libfoo_1.so, but set the soname to libfoo.so.  To achieve
+# this we first compile into libfoo.so, and then rename the library to
+# libfoo_1.so.
+if {[build_executable "build libfoo_1.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=1 }] == -1} {
+    return
+}
+remote_exec host "mv ${library_filename} ${library_1_filename}"
+
+# See the comment above, but this time we rename to libfoo_2.so.
+if {[build_executable "build libfoo_2.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=2 }] == -1} {
+    return
+}
+remote_exec host "mv ${library_filename} ${library_2_filename}"
+
+# Create libfoo.so symlink to the libfoo_1.so library.  If this
+# symlink creation fails then we assume we can't create symlinks on
+# this host.  If this succeeds then later symlink creation is required
+# to succeed, and will trigger an FAIL if it doesn't.
+set status \
+    [remote_exec host \
+	 "ln -sf ${library_1_filename} ${library_filename}"]
+if {[lindex $status 0] != 0} {
+    unsupported "host does not support symbolic links"
+    return
+}
+
+# Build the executable.  This links against libfoo.so, which is
+# poining at libfoo_1.so.  Just to confuse things even more, this
+# executable uses dlopen to load libfoo_2.so.  Weird!
+if { [build_executable "build executable" ${binfile} ${srcfile2} \
+	  [list debug shlib=${library_filename} shlib_load]] == -1 } {
+    return
+}
+
+# Run BINFILE which will generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Helper proc to load global BINFILE and then load global COREFILE.
+# If EXPECT_WARNING is true then we expect a warning about being
+# unable to load the shared library symbols, otherwise, EXPECT_WARNING
+# is false and we expect no warnings about loading shared library
+# symbols.
+proc load_exec_and_core_file { expect_warning testname } {
+    with_test_prefix $testname {
+	clean_restart $::binfile
+
+	set saw_warning false
+	gdb_test_multiple "core-file $::corefile" "load core file" {
+	    -re "^core-file \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^\\\[New \[^\r\n\]+\\\]\r\n" {
+		exp_continue
+	    }
+	    -re "^Core was generated by \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^#0\\s+\[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^\\s*\r\n" {
+		exp_continue
+	    }
+	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing" {
+		# Ignore warnings from the file backed mapping phase.
+		exp_continue
+	    }
+	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\nDo you need \"set solib-search-path\" or \"set sysroot\".\r\n" {
+		set saw_warning true
+		exp_continue
+	    }
+	    -re "^Downloading separate debug info for \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^Downloading executable for \[^\r\n\]+/libfoo_1\\.so\\.\\.\\." {
+		exp_continue
+	    }
+	    -re "^$::gdb_prompt $" {
+		gdb_assert { $saw_warning == $expect_warning } $gdb_test_name
+	    }
+	}
+
+	# If we don't expect a warning then debug symbols from the
+	# shared library should be available.  Confirm we can read a
+	# variable from the shared library.  If we do expect a warning
+	# then the shared library debug symbols have not loaded, and
+	# the library variable should not be available.
+	if { !$expect_warning } {
+	    gdb_test "print/x library_1_var" " = 0x12345678" \
+		"check library_1_var can be read"
+	} else {
+	    gdb_test "print/x library_1_var" \
+		"^No symbol \"library_1_var\" in current context\\." \
+		"check library_1_var cannot be read"
+	}
+    }
+}
+
+# Initial test, just load the executable and core file.  At this point
+# everything should load fine as everything is where we expect to find
+# it.
+load_exec_and_core_file false \
+    "load core file, all libraries as expected"
+
+# Update libfoo.so symlink to point at the second library then reload
+# the core file.  GDB should spot that the symlink points to the wrong
+# file, but should be able to figure out the correct file to load as
+# the right file will be in the mapped file list.
+set status [remote_exec host \
+		"ln -sf ${library_2_filename} ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "update library symlink to point to the wrong file"
+
+load_exec_and_core_file false \
+    "load core file, symlink points to wrong file"
+
+# Remove libfoo.so symlink and reload the core file.  As in the
+# previous test GDB should be able to figure out the correct file to
+# load as the correct file will still appear in the mapped file list.
+set status [remote_exec host "rm -f ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } "remove library symlink"
+
+load_exec_and_core_file false \
+    "load core file, symlink removed"
+
+# Remove LIBRARY_1_FILENAME.  We'll now see a warning that the mapped
+# file can't be loaded (we ignore that warning), and we'll see a
+# warning that the shared library can't be loaded.
+set library_1_backup_filename ${library_1_filename}.backup
+set status \
+    [remote_exec host \
+	 "mv ${library_1_filename} ${library_1_backup_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "remove libfoo_1.so"
+
+load_exec_and_core_file true \
+    "load core file, libfoo_1.so removed"
+
+# Setup a debuginfod server which can serve the original shared
+# library file.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_1_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    save_vars { GDBFLAGS } {
+	append GDBFLAGS " -ex \"set debuginfod enabled on\""
+
+	# Reload the executable and core file.  GDB should download
+	# the file libfoo_1.so using debuginfod during the mapped file
+	# phase, but should then reuse that download during the shared
+	# library phase.
+	load_exec_and_core_file false \
+	    "load core file, use debuginfod"
+    }
+}
+
+stop_debuginfod
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 0d78691c381..c958ff18d2a 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -5211,6 +5211,7 @@ proc quote_for_host { args } {
 #     debug information
 #   - text_segment=addr: Tell the linker to place the text segment at ADDR.
 #   - build-id: Ensure the final binary includes a build-id.
+#   - no-build-id: Ensure the final binary does not include a build-id.
 #   - column-info/no-column-info: Enable/Disable generation of column table
 #     information.
 #
@@ -5316,6 +5317,18 @@ proc gdb_compile {source dest type options} {
 	lappend new_options "additional_flags=-Wl,--build-id"
     }
 
+    # If the 'no-build-id' option is used then disable the build-id.
+    if {[lsearch -exact $options no-build-id] > 0} {
+	lappend new_options "additional_flags=-Wl,--build-id=none"
+    }
+
+    # Sanity check.  If both 'build-id' and 'no-build-id' are used
+    # then what is expected from us!
+    if {[lsearch -exact $options build-id] > 0
+	&& [lsearch -exact $options no-build-id] > 0} {
+	error "cannot use build-id and no-build-id options"
+    }
+
     # Treating .c input files as C++ is deprecated in Clang, so
     # explicitly force C++ language.
     if { !$getting_compiler_info
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 4/4] gdb: unify build-id to objfile lookup code
  2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
                   ` (2 preceding siblings ...)
  2024-05-17 14:18 ` [PATCH 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
@ 2024-05-17 14:19 ` Andrew Burgess
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-17 14:19 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

There are 3 places where we currently call debuginfod_exec_query to
lookup an objfile for a given build-id.

In one of these places we first call build_id_to_exec_bfd which also
looks up an objfile given a build-id, but this function looks on disk
for a symlink in the .build-id/ sub-directory (within the
debug-file-directory).

I can't think of any reason why we shouldn't call build_id_to_exec_bfd
before every call to debuginfod_exec_query.

So, in this commit I have added a new function in build-id.c,
find_exec_by_build_id, this function calls build_id_to_exec_bfd, and
if that fails, then calls debuginfod_exec_query.

Everywhere we call debuginfod_exec_query is updated to call the new
function, and in locate_exec_from_corefile_build_id, the existing call
to build_id_to_exec_bfd is removed as calling find_exec_by_build_id
does this for us.

One slight weird thing is in core_target::build_file_mappings, here we
call find_exec_by_build_id which returns a gdb_bfd_ref_ptr for the
opened file, however we immediately reopen the file as "binary".  The
reason for this is that all the bfds opened in ::build_file_mappings
need to be opened as "binary" (see the function comments for why).

I did consider passing a target type into find_exec_by_build_id, which
could then be forwarded to build_id_to_exec_bfd and used to open the
BFD as "binary", however, if you follow the call chain you'll end up
in build_id_to_debug_bfd_1, where we actually open the bfd.  Notice in
here that we call build_id_verify to double check the build-id of the
file we found, this requires that the bfd not be opened as "binary".

What this means is that we always have to first open the bfd using the
gnutarget target type (for the build-id check), and then we would have
to reopen it as "binary".  There seems little point pushing the reopen
logic into find_exec_by_build_id, so we just do this in the
::build_file_mappings function.

I've extended the tests to cover the two cases which actually changed
in this commit.
---
 gdb/build-id.c                                | 42 ++++++++++++++++++-
 gdb/build-id.h                                | 21 ++++++----
 gdb/corelow.c                                 | 40 ++++++------------
 gdb/solib.c                                   | 22 ++++------
 .../gdb.debuginfod/corefile-mapped-file.exp   | 31 ++++++++++++++
 .../gdb.debuginfod/solib-with-soname.exp      | 28 ++++++++++++-
 gdb/testsuite/lib/gdb.exp                     |  7 +++-
 7 files changed, 137 insertions(+), 54 deletions(-)

diff --git a/gdb/build-id.c b/gdb/build-id.c
index 41667d5e5cf..27642b58d56 100644
--- a/gdb/build-id.c
+++ b/gdb/build-id.c
@@ -26,6 +26,8 @@
 #include "filenames.h"
 #include "gdbcore.h"
 #include "cli/cli-style.h"
+#include "gdbsupport/scoped_fd.h"
+#include "debuginfod-support.h"
 
 /* See build-id.h.  */
 
@@ -198,9 +200,11 @@ build_id_to_debug_bfd (size_t build_id_len, const bfd_byte *build_id)
   return build_id_to_bfd_suffix (build_id_len, build_id, ".debug");
 }
 
-/* See build-id.h.  */
+/* Find and open a BFD for an executable file given a build-id.  If no BFD
+   can be found, return NULL.  The returned reference to the BFD must be
+   released by the caller.  */
 
-gdb_bfd_ref_ptr
+static gdb_bfd_ref_ptr
 build_id_to_exec_bfd (size_t build_id_len, const bfd_byte *build_id)
 {
   return build_id_to_bfd_suffix (build_id_len, build_id, "");
@@ -243,3 +247,37 @@ find_separate_debug_file_by_buildid (struct objfile *objfile,
 
   return std::string ();
 }
+
+/* See build-id.h.  */
+
+gdb_bfd_ref_ptr
+find_exec_by_build_id (const bfd_build_id *build_id,
+		       const char *expected_filename)
+{
+  /* Try to find the executable (or shared object) by looking for a
+     (sym)link on disk from the build-id to the object file.  */
+  gdb_bfd_ref_ptr abfd = build_id_to_exec_bfd (build_id->size,
+					       build_id->data);
+
+  if (abfd != nullptr)
+    return abfd;
+
+  /* Attempt to query debuginfod for the executable.  */
+  gdb::unique_xmalloc_ptr<char> path;
+  scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
+					expected_filename, &path);
+  if (fd.get () >= 0)
+    {
+      abfd = gdb_bfd_open (path.get (), gnutarget);
+
+      if (abfd == nullptr)
+	warning (_("\"%ps\" from debuginfod cannot be opened as bfd: %s"),
+		 styled_string (file_name_style.style (), path.get ()),
+		 gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
+      else if (!build_id_verify (abfd.get (), build_id->size,
+				 build_id->data))
+	abfd = nullptr;
+    }
+
+  return abfd;
+}
diff --git a/gdb/build-id.h b/gdb/build-id.h
index c5f20f8782e..3df122a0cbf 100644
--- a/gdb/build-id.h
+++ b/gdb/build-id.h
@@ -40,13 +40,6 @@ extern int build_id_verify (bfd *abfd,
 extern gdb_bfd_ref_ptr build_id_to_debug_bfd (size_t build_id_len,
 					      const bfd_byte *build_id);
 
-/* Find and open a BFD for an executable file given a build-id.  If no BFD
-   can be found, return NULL.  The returned reference to the BFD must be
-   released by the caller.  */
-
-extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
-					     const bfd_byte *build_id);
-
 /* Find the separate debug file for OBJFILE, by using the build-id
    associated with OBJFILE's BFD.  If successful, returns the file name for the
    separate debug file, otherwise, return an empty string.
@@ -60,6 +53,20 @@ extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
 extern std::string find_separate_debug_file_by_buildid
   (struct objfile *objfile, deferred_warnings *warnings);
 
+/* Find an executable (or shared library) that matches BUILD_ID.  This is
+   done by first checking in the debug-file-directory for a .build-id/
+   sub-directory, and looking for a symlink in there that points to the
+   required file.
+
+   If that doesn't find us a file then we call to debuginfod to see if it
+   can provide the required file.
+
+   EXPECTED_FILENAME is used in output messages from debuginfod, this
+   should be the file we were looking for but couldn't find.  */
+
+extern gdb_bfd_ref_ptr find_exec_by_build_id (const bfd_build_id *build_id,
+					      const char *expected_filename);
+
 /* Return an hex-string representation of BUILD_ID.  */
 
 static inline std::string
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 85bc3c26bea..6975719c1f2 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -47,7 +47,6 @@
 #include "gdbsupport/pathstuff.h"
 #include "gdbsupport/scoped_fd.h"
 #include "gdbsupport/x86-xstate.h"
-#include "debuginfod-support.h"
 #include <unordered_map>
 #include <unordered_set>
 #include "cli/cli-cmds.h"
@@ -407,13 +406,19 @@ core_target::build_file_mappings ()
 	   || !bfd_check_format (abfd.get (), bfd_object))
 	  && file_data.build_id != nullptr)
 	{
-	  expanded_fname = nullptr;
-	  debuginfod_exec_query (file_data.build_id->data,
-				 file_data.build_id->size,
-				 filename.c_str (), &expanded_fname);
-	  if (expanded_fname != nullptr)
+	  abfd = find_exec_by_build_id (file_data.build_id,
+					filename.c_str ());
+
+	  if (abfd != nullptr)
 	    {
+	      /* The find_exec_by_build_id will have opened ABFD using the
+		 GNUTARGET global bfd type, however, we need the bfd opened
+		 as the binary type (see the function's header comment), so
+		 now we reopen ABFD with the desired binary type.  */
+	      expanded_fname
+		= make_unique_xstrdup (bfd_get_filename (abfd.get ()));
 	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      gdb_assert (b != nullptr);
 	      abfd = gdb_bfd_ref_ptr::new_reference (b);
 	    }
 	}
@@ -774,28 +779,7 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
     return;
 
   gdb_bfd_ref_ptr execbfd
-    = build_id_to_exec_bfd (build_id->size, build_id->data);
-
-  if (execbfd == nullptr)
-    {
-      /* Attempt to query debuginfod for the executable.  */
-      gdb::unique_xmalloc_ptr<char> execpath;
-      scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
-					    abfd->filename, &execpath);
-
-      if (fd.get () >= 0)
-	{
-	  execbfd = gdb_bfd_open (execpath.get (), gnutarget);
-
-	  if (execbfd == nullptr)
-	    warning (_("\"%s\" from debuginfod cannot be opened as bfd: %s"),
-		     execpath.get (),
-		     gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
-	  else if (!build_id_verify (execbfd.get (), build_id->size,
-				     build_id->data))
-	    execbfd.reset (nullptr);
-	}
-    }
+    = find_exec_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
diff --git a/gdb/solib.c b/gdb/solib.c
index c39dfbcc78e..3292f361176 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -45,7 +45,6 @@
 #include "gdb_bfd.h"
 #include "gdbsupport/filestuff.h"
 #include "gdbsupport/scoped_fd.h"
-#include "debuginfod-support.h"
 #include "source.h"
 #include "cli/cli-style.h"
 
@@ -826,20 +825,15 @@ solib_map_sections (solib &so)
 	    abfd = nullptr;
 
 	  if (abfd == nullptr)
-	    {
-	      scoped_fd fd = debuginfod_exec_query
-		(expected_build_id->data, expected_build_id->size,
-		 so.so_name.c_str (), &filename);
+	    abfd = find_exec_by_build_id (expected_build_id,
+					  so.so_name.c_str ());
 
-	      if (fd.get () >= 0)
-		abfd = ops->bfd_open (filename.get ());
-	      else if (mismatch)
-		{
-		  warning (_ ("Build-id of %ps does not match core file."),
-			   styled_string (file_name_style.style (),
-					  filename.get ()));
-		  abfd = nullptr;
-		}
+	  if (abfd == nullptr && mismatch)
+	    {
+	      warning (_ ("Build-id of %ps does not match core file."),
+		       styled_string (file_name_style.style (),
+				      filename.get ()));
+	      abfd = nullptr;
 	    }
 	}
     }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
index c789be87ba3..553d68e4332 100644
--- a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -230,6 +230,37 @@ set ptr_value [read_ptr_value]
 gdb_assert { $ptr_value eq "unavailable" } \
     "check value of pointer is unavailable with library file missing"
 
+# Now symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to library we just moved aside.  Restart GDB and setup the
+# debug-file-directory before loading the core file.
+#
+# GDB should lookup the file to map via the build-id link in the
+# .build-id/ directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_backup_filename ""]
+
+remote_exec host "mkdir -p [file dirname $build_id_filename]"
+remote_exec host "ln -sf $library_backup_filename $build_id_filename"
+
+clean_restart $binfile
+
+gdb_test_no_output "set debug-file-directory $debugdir" \
+    "set debug-file-directory"
+
+gdb_test "core-file $corefile" \
+    [multi_line \
+	 "\\\[\[^\r\n\]+\\\]" \
+	 "Core was generated by \[^\r\n\]+" \
+	 "Program terminated with signal SIGSEGV, Segmentation fault\\." \
+	 "#0  main \\(\\) at \[^\r\n\]+" \
+	 "$decimal\\s+\[^\r\n\]+/\\* Undefined behaviour here\\.  \\*/"] \
+    "load corefile, lookup in debug-file-directory"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file, lookup in debug-file-directory"
+
 # Build a new version of the shared library, keep the library the same size,
 # but change the contents so the build-id changes.  Then restart GDB and load
 # the core-file again.  GDB should spot that the build-id for the shared
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
index dfc78923436..9d311be8ebc 100644
--- a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -106,10 +106,15 @@ if {$corefile eq ""} {
 # unable to load the shared library symbols, otherwise, EXPECT_WARNING
 # is false and we expect no warnings about loading shared library
 # symbols.
-proc load_exec_and_core_file { expect_warning testname } {
+proc load_exec_and_core_file { expect_warning testname {debugdir ""}} {
     with_test_prefix $testname {
 	clean_restart $::binfile
 
+	if { $debugdir ne "" } {
+	    gdb_test_no_output "set debug-file-directory $debugdir" \
+		"set debug directory"
+	}
+
 	set saw_warning false
 	gdb_test_multiple "core-file $::corefile" "load core file" {
 	    -re "^core-file \[^\r\n\]+\r\n" {
@@ -205,6 +210,27 @@ gdb_assert { [lindex $status 0] == 0 } \
 load_exec_and_core_file true \
     "load core file, libfoo_1.so removed"
 
+# Symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to LIBRARY_1_BACKUP_FILENAME, now when we restart GDB it
+# should find the missing library within the debug directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_1_backup_filename ""]
+set status \
+    [remote_exec host \
+	 "mkdir -p [file dirname $build_id_filename]"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create sub-directory within the debug directory"
+set status \
+    [remote_exec host \
+	 "ln -sf $library_1_backup_filename $build_id_filename"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create symlink within the debug directory "
+
+load_exec_and_core_file false \
+    "load core file, find libfoo_1.so through debug-file-directory" \
+    $debugdir
+
 # Setup a debuginfod server which can serve the original shared
 # library file.
 if {![allow_debuginfod_tests]} {
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index c958ff18d2a..e369b0be96a 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -8019,14 +8019,17 @@ proc get_build_id { filename } {
 
 # Return the build-id hex string (usually 160 bits as 40 hex characters)
 # converted to the form: .build-id/ab/cdef1234...89.debug
+#
+# The '.debug' suffix can be changed by passing the SUFFIX argument.
+#
 # Return "" if no build-id found.
-proc build_id_debug_filename_get { filename } {
+proc build_id_debug_filename_get { filename {suffix ".debug"} } {
     set data [get_build_id $filename]
     if { $data == "" } {
 	return ""
     }
     regsub {^..} $data {\0/} data
-    return ".build-id/${data}.debug"
+    return ".build-id/${data}${suffix}"
 }
 
 # DEST should be a file compiled with debug information.  This proc
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 0/4] More build-id checking when opening core files
  2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
                   ` (3 preceding siblings ...)
  2024-05-17 14:19 ` [PATCH 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
@ 2024-05-20 13:08 ` Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
                     ` (4 more replies)
  4 siblings, 5 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-20 13:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In v2:

  - Failures were reported in the new tests from Linaro CI testing.
    Nothing serious; Linaro was seeing some additional output that the
    regexp I'd written were not accounting for.  I've revised the new
    tests to try and not be too strict, hopefully this should address
    the issues that were reported.

  - I've cleaned up some build vs host mistakes I think I made in some
    remote_exec calls.

  - And I've run the tests agsint 'make check-all-boards' and resolved
    some issues that cropped up.

Andrew Burgess (4):
  gdb/corefile: remove unavailable target sections
  gdb/corefile: improve file backed mapping handling
  gdb: improve shared library build-id check for core-files
  gdb: unify build-id to objfile lookup code

 gdb/build-id.c                                |  42 +-
 gdb/build-id.h                                |  21 +-
 gdb/corelow.c                                 | 379 ++++++++++++-----
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 369 ++++++++++++++---
 gdb/solib.h                                   |  39 +-
 gdb/solist.h                                  |  22 +
 gdb/testsuite/gdb.base/corefile.exp           |  39 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 ++
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 +
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 +++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 380 ++++++++++++++++++
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 290 +++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  20 +-
 22 files changed, 1654 insertions(+), 164 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp


base-commit: ad666becfe075ca7c831ebbf4b44526994395e97
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 1/4] gdb/corefile: remove unavailable target sections
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
@ 2024-05-20 13:08   ` Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-20 13:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file the bfd library processes the core file and
creates sections within the bfd object to represent each of the
segments within the core file.

GDB then creates two target_section lists, m_core_section_table and
m_core_file_mappings.  These, as well as m_core_unavailable_mappings,
are used by GDB to implement core_target::xfer_partial, that is, to
allow reading from inferior memory.

The m_core_section_table list represents sections within the core file
itself.  The sections in this list can be split into two groups based
on whether the section has the SEC_HAS_CONTENTS flag set or not.

Sections that have SEC_HAS_CONTENTS have their contents copied into
the core file while sections that do not have the SEC_HAS_CONTENTS
flag will not have their contents copied into the core file when it is
created.

The m_core_file_mappings list is created when GDB parses the mapped
file list in the core file.  Every mapped file will also be covered by
entries in the m_core_section_table list, however, when parts of a
file are mapped read-only then the corresponding entry in
m_core_section_table will not have the SEC_HAS_CONTENTS flag set.  If
parts of a file are mapped read/write then the corresponding
m_core_section_table entry will have the SEC_HAS_CONTENTS flag, and
the corresponding (possibly modified contents) will be copied into the
core file.

However, GDB only creates entries in m_core_file_mappings if it is
able to find the correct on-disk file to open.  If the corresponding
file can't be found then an entry is added to
m_core_unavailable_mappings.  An attempt to read from a region covered
by this list will be redirected to the lower file target in the hope
that a read-only section from the executable (supplied by the user)
might satisfy the read.

And so, in core_target::xfer_partial, the order in which the various
lists are checked to satisfy a read are:

  1. Check m_core_section_table for sections with SEC_HAS_CONTENTS
  flag, this reads content from the core file bfd directly,

  2. Check m_core_file_mappings, this will read content from mapped
  files assuming that GDB was able to find the file on-disk,

  3. Check m_core_unavailable_mappings and redirect to the file
  target, this allows the user supplied executable to satisfy some
  reads,

  4. Check m_core_section_table for sections without SEC_HAS_CONTENTS
  flag, this returns zero for these regions.

If a file is recorded as mapped in the core file, but the file was
mapped read only, then the matching entry in m_core_section_table will
not have the SEC_HAS_CONTENTS flag set.

If GDB can find the file on disk then an entry is added to
m_core_file_mappings, and reads from the region covered by the file
will be satisfied by #2 in the list above.

However, if GDB can't find the file then an entry is added to the
m_core_unavailable_mappings list.  An attempt will be made to satisfy
the read at stage #3 in the list above, however, if this file is not
the executable, then stage #3 will not satisfy the read.

We then fall through to stage #4, the section does not have the
SEC_HAS_CONTENTS flag, and so we end up returning zero.

In this case, I think this is the wrong thing to do.  If the core file
tells us that a file is mapped in, but GDB can't find the file on
disk, then we should instead return a failure when trying to read from
the region covered by the file, anything else is just misleading.

And so, I propose that in core_target::core_target() when the core
file is initially opened and m_core_section_table is setup, we should
remove entries that correspond to m_core_unavailable_mappings entries.

This means that an attempt to read from a mapped file, when GDB is not
able to find the on-disk file, and the contents of the mapped file
were not copied into the core file, then the read will fail.  This
feels like the only sensible option.

I've extended an existing test to exercise this case.  The test maps a
single file twice, once in read-only mode, and once in read-write
mode.  We then remove the on-disk file and load the core file in GDB.
The mapping which was read-write is still accessible (the content of
this region is found in the core file bfd object), while the read-only
mapping is no longer accessible.
---
 gdb/corelow.c                       | 58 +++++++++++++++++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp | 39 +++++++++++++++++++
 2 files changed, 97 insertions(+)

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 462a05591c1..4a489d95706 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -166,6 +166,11 @@ class core_target final : public process_stratum_target
 						    ULONGEST len,
 						    ULONGEST *xfered_len);
 
+  /* Remove from m_core_section_table any entries that are covered by an
+     unavailable mapping.  This will prevent xfer_partial from returning
+     zero for unavailable mappings.  */
+  void filter_unavailable_mappings_from_section_table ();
+
   /* FIXME: kettenis/20031023: Eventually this field should
      disappear.  */
   struct gdbarch *m_core_gdbarch = NULL;
@@ -198,6 +203,59 @@ core_target::core_target ()
   m_core_section_table = build_section_table (current_program_space->core_bfd ());
 
   build_file_mappings ();
+
+  if (!m_core_unavailable_mappings.empty ())
+    filter_unavailable_mappings_from_section_table ();
+}
+
+/* See comment in class declaration.  */
+
+void
+core_target::filter_unavailable_mappings_from_section_table ()
+{
+  auto filter_cb = [&] (const target_section &tsect) -> bool
+  {
+    /* If TSECT has contents then this means that there is content to be
+       read within the core file bfd itself.  If TSECT does not have
+       content then we are relying on reading the content from some other
+       file that was mapped into the inferior at the time the core file was
+       created, but if GDB couldn't find that file then we are not able to
+       read the content and TSECT should be discarded.  */
+    if ((tsect.the_bfd_section->flags & SEC_HAS_CONTENTS) != 0)
+      return false;
+
+    /* Find the first entry in m_core_unavailable_mappings whose start
+       address is not less than or equal to the start of this target
+       section.  The previous unavailable mapping might cover TSECT.  */
+    auto it = std::lower_bound (m_core_unavailable_mappings.begin (),
+				m_core_unavailable_mappings.end (),
+				tsect.addr,
+				[] (const mem_range &mr,
+				    const CORE_ADDR &addr)
+				{
+				  return mr.start <= addr;
+				});
+
+    /* If there is a previous unavailable mapping, then that mapping's
+       start address is less than, or equal to the start of TSECT.  The
+       unavailable mapping might cover TSECT.  If it does then we will
+       delete TSECT.  */
+    if (it != m_core_unavailable_mappings.begin ())
+      {
+	--it;
+
+	if (it->start <= tsect.addr
+	    && (it->start + it->length) >= tsect.endaddr)
+	  return true;
+      }
+
+    return false;
+  };
+
+  auto end = std::remove_if (m_core_section_table.begin (),
+			     m_core_section_table.end (),
+			     filter_cb);
+  m_core_section_table.erase (end, m_core_section_table.end ());
 }
 
 /* Construct the table for file-backed mappings if they exist.
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index 28b723e1f85..ee152e629ab 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -199,6 +199,45 @@ gdb_test "up" "#\[0-9\]* *(\[0-9xa-fH'\]* in)? .* \\(.*\\).*" "up (reinit)"
 
 gdb_test "core" "No core file now."
 
+# Temporarily move coremmap.data out of the way and reload the core
+# file.  We should still be able to read buf2 as the contents of this
+# are written into the core file.  In contrast buf2ro should no longer
+# be readable as the contents of this region are not within the core
+# file, GDB relies on reading this from the coremmap.data file, which
+# can no longer be found.
+set coremmap_data_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data]
+set coremmap_data_backup_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data.backup]
+remote_exec host "mv ${coremmap_data_filename} \
+		  ${coremmap_data_backup_filename}"
+
+clean_restart $binfile
+
+# Load the core file and check we get a warning about the
+# coremmap.data file being missing.
+gdb_test_multiple "core-file $corefile" "warn about coremmap.data missing" {
+    -re -wrap "warning: Can't open file \[^\r\n\]+/coremmap.data during file-backed mapping note processing\r\n.*" {
+	pass $gdb_test_name
+    }
+}
+
+# This xfail was just copied from earlier in the script where we also
+# read from buf2.
+setup_xfail "*-*-sunos*" "*-*-aix*"
+gdb_test "x/8bd buf2" \
+    ".*:.*0.*1.*2.*3.*4.*5.*6.*7.*" \
+    "accessing mmapped data in core file with coremmap.data removed"
+
+gdb_test "x/8bd buf2ro" \
+    "$hex\[^:\]*:\\s+Cannot access memory at address $hex" \
+    "accessing read-only mmapped data in core file with coremmap.data removed"
+
+# Restore the coremmap.data file so later tests don't give warnings
+# when the core file is reloaded.
+remote_exec host "mv ${coremmap_data_backup_filename} \
+		  ${coremmap_data_filename}"
+
 # Test that we can unload the core with the "detach" command.
 
 proc_with_prefix corefile_detach {} {
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 2/4] gdb/corefile: improve file backed mapping handling
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
@ 2024-05-20 13:08   ` Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-20 13:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit improves how GDB handles file backed mappings within a
core file, specifically, this is a restructuring of the function
core_target::build_file_mapping.

The primary motivation for this commit was to put in place the
infrastructure to support the next commit in this series, but this
commit does itself make some improvements.

Currently in core_target::build_file_mapping we use
gdbarch_read_core_file_mappings to iterate over the mapped regions
within a core file.

For each region a callback is invoked which is passed details of the
mapping; the file the mapping is from, the offset into the file, and
the address range at which the mapping exists.  We are also passed the
build-id for the mapped file in some cases.

We are only told the build-id for the mapped region which actually
contains the ELF header of the mapped file.  Other regions of the same
mapped ELF will not have the build-id passed to the callback.

Within core_target::build_file_mapping, in the per-region callback, we
try to find the mapped file based on its filename.  If the file can't
be found, and if we have a build-id then we'll ask debuginfod to
download the file.

However we find the file, we cache the opened bfd object, which is
good.  Subsequent mappings from the same file will not have a build-id
set, but by that point we already have a cached open bfd object, so
the lack of build-id is irrelevant.

The problem with the above is that if we find a matching file based on
the filename, then we accept that file, even if we have a build-id,
and the build-id doesn't match.

Currently, the mapped region processing is done in a single pass, we
call gdbarch_read_core_file_mappings, and for each mapping, as we see
it, we create the data structures needed to represent that mapping.

In this commit, I will change this to a two phase process.  In the
first phase the mappings are grouped together based on the name of the
mapped file.  At the end of phase one we have a 'struct mapped_file',
a new struct, for each mapped file.  This struct associates an
optional build-id with a list of mapped regions.

In the second phase we try to find the file using its filename.  If
the file is found, and the 'struct mapped_file' has a build-id, then
we'll compare the build-id with the file we found.  This allows us to
reject on-disk files which have changed since the core file was
created.

If no suitable file was found (either no file found, or a build-id
mismatch) then we can use debuginfod to potentially download a
suitable file.

  NOTE: In the future we could potentially add additional sanity
  checks here, for example, if a data-file is mapped, and has no
  build-id, we can estimate a minimum file size based on the expected
  mappings.  If the file we find is not big enough then we can reject
  the on-disk file.  But I don't know how useful this would actually
  be, so I've not done that for now.

Having found (or not) a suitable file then we can create the data
structures for each mapped region just as we did before.

The new functionality here is the extra build-id check, and the
possibility of rejecting an on-disk file if the build-id doesn't
match.

This change could have been done within the existing single phase
approach I think, however, in the next approach I need to have all the
mapped regions associated with the expected build-id, and the new two
phase structure allows me to do that, this is the reason for such an
extensive rewrite in this commit.

There's a new test that exercises GDB's ability to find mapped files
via the build-id, and this downloading from debuginfod.
---
 gdb/corelow.c                                 | 274 ++++++++++----
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 ++
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 ++
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 +++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 356 ++++++++++++++++++
 5 files changed, 648 insertions(+), 73 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 4a489d95706..28d619594d8 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -53,6 +53,7 @@
 #include "cli/cli-cmds.h"
 #include "xml-tdesc.h"
 #include "memtag.h"
+#include "cli/cli-style.h"
 
 #ifndef O_LARGEFILE
 #define O_LARGEFILE 0
@@ -277,6 +278,48 @@ core_target::build_file_mappings ()
   std::unordered_map<std::string, struct bfd *> bfd_map;
   std::unordered_set<std::string> unavailable_paths;
 
+  /* Type holding information about a single file mapped into the inferior
+     at the point when the core file was created.  Associates a build-id
+     with the list of regions the file is mapped into.  */
+  struct mapped_file
+  {
+    /* Type for a region of a file that was mapped into the inferior when
+       the core file was generated.  */
+    struct region
+    {
+      /* Constructor.   See member variables for argument descriptions.  */
+      region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
+	: start (start_),
+	  end (end_),
+	  file_ofs (file_ofs_)
+      { /* Nothing.  */ }
+
+      /* The inferior address for the start of the mapped region.  */
+      CORE_ADDR start;
+
+      /* The inferior address immediately after the mapped region.  */
+      CORE_ADDR end;
+
+      /* The offset within the mapped file for this content.  */
+      CORE_ADDR file_ofs;
+    };
+
+    /* If not nullptr, then this is the build-id associated with this
+       file.  */
+    const bfd_build_id *build_id = nullptr;
+
+    /* If true then we have seen multiple different build-ids associated
+       with the same filename.  The build_id field will have been set back
+       to nullptr, and we should not set build_id in future.  */
+    bool ignore_build_id_p = false;
+
+    /* All the mapped regions of this file.  */
+    std::vector<region> regions;
+  };
+
+  /* All files mapped into the core file.  The key is the filename.  */
+  std::unordered_map<std::string, mapped_file> mapped_files;
+
   /* See linux_read_core_file_mappings() in linux-tdep.c for an example
      read_core_file_mappings method.  */
   gdbarch_read_core_file_mappings (m_core_gdbarch,
@@ -297,87 +340,172 @@ core_target::build_file_mappings ()
 	   weed out non-file-backed mappings.  */
 	gdb_assert (filename != nullptr);
 
-	if (unavailable_paths.find (filename) != unavailable_paths.end ())
+	/* Add this mapped region to the data for FILENAME.  */
+	mapped_file &file_data = mapped_files[filename];
+	file_data.regions.emplace_back (start, end, file_ofs);
+	if (build_id != nullptr && !file_data.ignore_build_id_p)
 	  {
-	    /* We have already seen some mapping for FILENAME but failed to
-	       find/open the file.  There is no point in trying the same
-	       thing again so just record that the range [start, end) is
-	       unavailable.  */
-	    m_core_unavailable_mappings.emplace_back (start, end - start);
-	    return;
-	  }
-
-	struct bfd *bfd = bfd_map[filename];
-	if (bfd == nullptr)
-	  {
-	    /* Use exec_file_find() to do sysroot expansion.  It'll
-	       also strip the potential sysroot "target:" prefix.  If
-	       there is no sysroot, an equivalent (possibly more
-	       canonical) pathname will be provided.  */
-	    gdb::unique_xmalloc_ptr<char> expanded_fname
-	      = exec_file_find (filename, NULL);
-
-	    if (expanded_fname == nullptr && build_id != nullptr)
-	      debuginfod_exec_query (build_id->data, build_id->size,
-				     filename, &expanded_fname);
-
-	    if (expanded_fname == nullptr)
+	    if (file_data.build_id == nullptr)
+	      file_data.build_id = build_id;
+	    else if (!build_id_equal (build_id, file_data.build_id))
 	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s during file-backed mapping "
-			   "note processing"),
-			 filename);
-		return;
+		warning (_("Multiple build-ids found for %ps"),
+			 styled_string (file_name_style.style (), filename));
+		file_data.build_id = nullptr;
+		file_data.ignore_build_id_p = true;
 	      }
+	  }
+      });
 
-	    bfd = bfd_openr (expanded_fname.get (), "binary");
+  for (const auto &iter : mapped_files)
+    {
+      const std::string &filename = iter.first;
+      const mapped_file &file_data = iter.second;
+
+      /* Use exec_file_find() to do sysroot expansion.  It'll
+	 also strip the potential sysroot "target:" prefix.  If
+	 there is no sysroot, an equivalent (possibly more
+	 canonical) pathname will be provided.  */
+      gdb::unique_xmalloc_ptr<char> expanded_fname
+	= exec_file_find (filename.c_str (), nullptr);
+
+      bool build_id_mismatch = false;
+      if (expanded_fname != nullptr && file_data.build_id != nullptr)
+	{
+	  /* We temporarily open the bfd as a structured target, this
+	     allows us to read the build-id from the bfd if there is one.
+	     For this task it's OK if we reuse an already open bfd object,
+	     so we make this call through GDB's bfd cache.  Once we've
+	     checked the build-id (if there is one) we'll drop this
+	     reference and re-open the bfd using the "binary" target.  */
+	  gdb_bfd_ref_ptr tmp_bfd
+	    = gdb_bfd_open (expanded_fname.get (), gnutarget);
+
+	  if (tmp_bfd != nullptr
+	      && bfd_check_format (tmp_bfd.get (), bfd_object)
+	      && build_id_bfd_get (tmp_bfd.get ()) != nullptr)
+	    {
+	      /* The newly opened TMP_BFD has a build-id, and this mapped
+		 file has a build-id extracted from the core-file.  Check
+		 the build-id's match, and if not, reject TMP_BFD.  */
+	      const struct bfd_build_id *found
+		= build_id_bfd_get (tmp_bfd.get ());
+	      if (!build_id_equal (found, file_data.build_id))
+		build_id_mismatch = true;
+	    }
+	}
 
-	    if (bfd == nullptr || !bfd_check_format (bfd, bfd_object))
-	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s which was expanded to %s "
+      gdb_bfd_ref_ptr abfd;
+      if (expanded_fname != nullptr && !build_id_mismatch)
+	{
+	  struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	  abfd = gdb_bfd_ref_ptr::new_reference (b);
+	}
+
+      if ((expanded_fname == nullptr
+	   || abfd == nullptr
+	   || !bfd_check_format (abfd.get (), bfd_object))
+	  && file_data.build_id != nullptr)
+	{
+	  expanded_fname = nullptr;
+	  debuginfod_exec_query (file_data.build_id->data,
+				 file_data.build_id->size,
+				 filename.c_str (), &expanded_fname);
+	  if (expanded_fname != nullptr)
+	    {
+	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      abfd = gdb_bfd_ref_ptr::new_reference (b);
+	    }
+	}
+
+      if (expanded_fname == nullptr
+	  || abfd == nullptr
+	  || !bfd_check_format (abfd.get (), bfd_object))
+	{
+	  /* If ABFD was opened, but the wrong format, close it now.  */
+	  abfd = nullptr;
+
+	  /* Record all regions for this file as unavailable.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    m_core_unavailable_mappings.emplace_back (region.start,
+						      region.end
+						      - region.start);
+
+	  /* And give the user an appropriate warning.  */
+	  if (build_id_mismatch)
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("File %ps doesn't match build-id from core-file "
+			   "during file-backed mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("File %ps which was expanded to %ps, doesn't match "
+			   "build-id from core-file during file-backed "
+			   "mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	  else
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("Can't open file %ps during file-backed mapping "
+			   "note processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("Can't open file %ps which was expanded to %ps "
 			   "during file-backed mapping note processing"),
-			 filename, expanded_fname.get ());
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	}
+      else
+	{
+	  /* Ensure that the bfd will be closed when core_bfd is closed.
+	     This can be checked before/after a core file detach via "maint
+	     info bfds".  */
+	  gdb_bfd_record_inclusion (current_program_space->core_bfd (),
+				    abfd.get ());
+
+	  /* Create sections for each mapped region.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    {
+	      /* Make new BFD section.  All sections have the same name,
+		 which is permitted by bfd_make_section_anyway().  */
+	      asection *sec = bfd_make_section_anyway (abfd.get (), "load");
+	      if (sec == nullptr)
+		error (_("Can't make section"));
+	      sec->filepos = region.file_ofs;
+	      bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
+	      bfd_set_section_size (sec, region.end - region.start);
+	      bfd_set_section_vma (sec, region.start);
+	      bfd_set_section_lma (sec, region.start);
+	      bfd_set_section_alignment (sec, 2);
+
+	      /* Set target_section fields.  */
+	      m_core_file_mappings.emplace_back (region.start, region.end, sec);
+	    }
+	}
 
-		if (bfd != nullptr)
-		  bfd_close (bfd);
-		return;
-	      }
-	    /* Ensure that the bfd will be closed when core_bfd is closed. 
-	       This can be checked before/after a core file detach via
-	       "maint info bfds".  */
-	    gdb_bfd_record_inclusion (current_program_space->core_bfd (), bfd);
-	    bfd_map[filename] = bfd;
-	  }
+      /* If this is a bfd of a shared library, record its soname and
+	 build-id.  */
+      if (file_data.build_id != nullptr && abfd != nullptr)
+	{
+	  gdb::unique_xmalloc_ptr<char> soname
+	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
 
-	/* Make new BFD section.  All sections have the same name,
-	   which is permitted by bfd_make_section_anyway().  */
-	asection *sec = bfd_make_section_anyway (bfd, "load");
-	if (sec == nullptr)
-	  error (_("Can't make section"));
-	sec->filepos = file_ofs;
-	bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
-	bfd_set_section_size (sec, end - start);
-	bfd_set_section_vma (sec, start);
-	bfd_set_section_lma (sec, start);
-	bfd_set_section_alignment (sec, 2);
-
-	/* Set target_section fields.  */
-	m_core_file_mappings.emplace_back (start, end, sec);
-
-	/* If this is a bfd of a shared library, record its soname
-	   and build id.  */
-	if (build_id != nullptr)
-	  {
-	    gdb::unique_xmalloc_ptr<char> soname
-	      = gdb_bfd_read_elf_soname (bfd->filename);
-	    if (soname != nullptr)
-	      set_cbfd_soname_build_id (current_program_space->cbfd,
-					soname.get (), build_id);
-	  }
-      });
+	  if (soname != nullptr)
+	    set_cbfd_soname_build_id (current_program_space->cbfd,
+				      soname.get (), file_data.build_id);
+	}
+    }
 
   normalize_mem_ranges (&m_core_unavailable_mappings);
 }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
new file mode 100644
index 00000000000..dd6b3f105f2
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
@@ -0,0 +1,24 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+int
+main (void)
+{
+  int res = foo ();
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
new file mode 100644
index 00000000000..ccf35b75ef8
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
@@ -0,0 +1,22 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+volatile int *library_ptr = (int *) POINTER_VALUE;
+
+int
+foo (void)
+{
+  return *library_ptr;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
new file mode 100644
index 00000000000..45d78336930
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
@@ -0,0 +1,45 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+volatile void* library_base_address = 0;
+volatile int *ptr = 0;
+
+int
+main ()
+{
+  struct stat buf;
+  int res;
+
+  int fd = open (SHLIB_FILENAME, O_RDONLY);
+  assert (fd != -1);
+
+  res = fstat (fd, &buf);
+  assert (res == 0);
+
+  library_base_address
+    = mmap (NULL, buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+  res = *ptr;	/* Undefined behaviour here.  */
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
new file mode 100644
index 00000000000..6e3301e1c8d
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -0,0 +1,356 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test checks GDB's ability to use build-ids when setting up file backed
+# mappings as part of reading a core-file.
+#
+# A core-file contains a list of the files that were mapped into the process
+# at the time of the core-file creation.  If the file was mapped read-only
+# then the file contents will not be present in the core-file, but instead GDB
+# is expected to open the mapped file and read the contents from there if
+# needed.  And this is what GDB does.
+#
+# GDB (via the BFD library) will also spot if a mapped looks like a valid ELF
+# and contains a build-id, this build-id is passed back to GDB so that GDB can
+# validate the on-disk file it finds matches the file that was mapped when the
+# core-file was created.
+#
+# In addition, if the on-disk file is found to have a non-matching build-id
+# then GDB can use debuginfod to (try) and download a suitable file.
+#
+# This test is about checking that this file backed mapping mechanism works
+# correctly; that GDB will spot when the build-ids fail to match and will
+# refuse to load an incorrect file.  Additionally we check that the correct
+# file can be downloaded from debuginfod.
+#
+# The test is rather contrived though.  Instead of relying on having a shared
+# library mapped at the time of crash we mmap a shared library into the
+# process and then check this mapping within the test.
+#
+# The problem with using a normal shared library load for this test is that
+# the shared library list is processed as part of a separate step when opening
+# the core file.  Right now this separate step doesn't check the build-ids
+# correctly, so GDB will potentially open the wrong shared library file.  The
+# sections of this incorrect shared library are then added to GDB's list of
+# target sections, and are used to satisfy memory reads, which can give the
+# wrong results.
+#
+# This obviously needs fixing, but is a separate problem from the one being
+# tested here, so this test deliberately checks the mapping using a file that
+# is mmaped rather than loaded as a shared library, as such the file is in the
+# core-files list of mapped files, but is not in the shared library list.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+require {!is_remote host}
+require {!is_remote target}
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+
+standard_testfile -1.c -2.c -3.c
+
+# Compile an executable that loads the shared library as an actual
+# shared library, then use GDB to figure out the offset of the
+# variable 'library_ptr' within the library.
+set library_filename [standard_output_file "libfoo.so"]
+set binfile2 [standard_output_file "library_loader"]
+
+if {[prepare_for_testing_full "build exec which loads the shared library" \
+	 [list $library_filename \
+	      { debug shlib build-id \
+		    additional_flags=-DPOINTER_VALUE=0x12345678 } \
+	      $srcfile2 {}] \
+	 [list $binfile2 [list debug shlib=$library_filename ] \
+	      $srcfile { debug }]] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+if { [is_address_zero_readable] } {
+    return
+}
+
+set ptr_address [get_hexadecimal_valueof "&library_ptr" "unknown"]
+
+set ptr_offset "unknown"
+gdb_test_multiple "info proc mappings" "" {
+    -re "^\\s+($hex)\\s+($hex)\\s+$hex\\s+($hex)\[^\r\n\]+$library_filename\r\n" {
+	set low_addr $expect_out(1,string)
+	set high_addr $expect_out(2,string)
+	set file_offset $expect_out(3,string)
+
+	if {[expr $ptr_address >= $low_addr] && [expr $ptr_address < $high_addr]} {
+	    set mapping_offset [expr $ptr_address - $low_addr]
+	    set ptr_offset [format 0x%x [expr $file_offset + $mapping_offset]]
+	}
+
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+    }
+
+    -re "(^\[^\r\n\]*)\r\n" {
+	set tmp $expect_out(1,string)
+	exp_continue
+    }
+}
+
+gdb_assert { $ptr_offset ne "unknown" } \
+    "found pointer offset"
+
+set ptr_size [get_integer_valueof "sizeof (library_ptr)" "unknown"]
+set ptr_format_char ""
+if { $ptr_size == 2 } {
+    set ptr_format_char "b"
+} elseif { $ptr_size == 4 } {
+    set ptr_format_char "w"
+} elseif { $ptr_size == 8 } {
+    set ptr_format_char "g"
+}
+if { $ptr_format_char eq "" } {
+    untested "could not figure out size of library_ptr variable"
+    return
+}
+
+# Helper proc to read a value from inferior memory.  Reads at address held in
+# global PTR_ADDRESS, and use PTR_FORMAT_CHAR for the size of the read.
+proc read_ptr_value { } {
+    set value ""
+    gdb_test_multiple "x/1${::ptr_format_char}x ${::ptr_address}" "" {
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+($::hex)" {
+	    set value $expect_out(1,string)
+	}
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+Cannot access memory at address ${::hex}" {
+	    set value "unavailable"
+	}
+    }
+    return $value
+}
+
+set ptr_expected_value [read_ptr_value]
+if { $ptr_expected_value eq "" } {
+    untested "could not find expected value for library_ptr"
+    return
+}
+
+# Now compile a second executable.  This one doesn't load the shared
+# library as an actual shared library, but instead mmaps the library
+# into the executable.
+#
+# Load this executable within GDB and confirm that we can use the
+# offset we calculated previously to view the value of 'library_ptr'.
+set opts [list debug additional_flags=-DSHLIB_FILENAME=\"$library_filename\"]
+if {[prepare_for_testing "prepare second executable" $binfile \
+	 $srcfile3 $opts] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+gdb_breakpoint [gdb_get_line_number "Undefined behaviour here" $srcfile3]
+gdb_continue_to_breakpoint "run to breakpoint"
+
+set library_base_address \
+    [get_hexadecimal_valueof "library_base_address" "unknown"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable"
+
+# Now rerun the second executable outside of GDB.  The executable should crash
+# and generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Load a core file from the global COREFILE.  Use TESTNAME as the name
+# of the test.
+#
+# If LINE_RE is not the empty string then this is a regexp for a line
+# that we expect to see in the output when loading the core file, if
+# the line is not present then this test will fail.
+#
+# Any lines beginning with 'warning: ' will cause this test to fail.
+#
+# A couple of other standard lines that are produced when loading a
+# core file are also checked for, just to make sure the core file
+# loading has progressed as expected.
+proc load_core_file { testname { line_re "" } } {
+    set code {}
+
+    if { $line_re ne "" } {
+	append code {
+	    -re "^$line_re\r\n" {
+		set saw_expected_line true
+		exp_continue
+	    }
+	}
+	set saw_expected_line false
+    } else {
+	set saw_expected_line true
+    }
+
+    set saw_unknown_warning false
+    set saw_generated_by_line false
+    set saw_prog_terminated_line false
+
+    append code {
+	-re "^warning: \[^\r\n\]+\r\n" {
+	    set saw_unknown_warning true
+	    exp_continue
+	}
+
+	-re "^Core was generated by \[^\r\n\]+\r\n" {
+	    set saw_generated_by_line true
+	    exp_continue
+	}
+
+	-re "^Program terminated with signal SIGSEGV, Segmentation fault\\.\r\n" {
+	    set saw_prog_terminated_line true
+	    exp_continue
+	}
+
+	-re "^$::gdb_prompt $" {
+	    gdb_assert {$saw_generated_by_line \
+			    && $saw_prog_terminated_line \
+			    && $saw_expected_line \
+			    && !$saw_unknown_warning} \
+		$gdb_test_name
+	}
+
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+    }
+
+    set res [catch { return [gdb_test_multiple "core-file $::corefile" \
+				 "$testname" $code] } string]
+
+    if {$res == 1} {
+	global errorInfo errorCode
+	return -code error -errorinfo $errorInfo -errorcode $errorCode $string
+    } elseif {$res == 2} {
+	return $string
+    } else {
+	# We expect RES to be 2 (TCL_RETURN) or 1 (TCL_ERROR).  If we get
+	# here then somehow the 'catch' above finished without hitting
+	# either of those cases, which is .... weird.
+	perror "unexepcted return value, code = $res, value = $string"
+	return -1
+    }
+}
+
+# And now restart GDB, load the core-file and check that the library shows as
+# being mapped in, and that we can still read the library_ptr value from
+# memory.
+clean_restart $binfile
+
+load_core_file "load core file"
+
+set library_base_address [get_hexadecimal_valueof "library_base_address" \
+			      "unknown" "get library_base_address in core-file"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file"
+
+# Now move the shared library file away and restart GDB.  This time when we
+# load the core-file we should see a warning that GDB has failed to map in the
+# library file.  An attempt to read the variable from the library file should
+# fail / give a warning.
+set library_backup_filename [standard_output_file "libfoo.so.backup"]
+remote_exec build "mv \"$library_filename\" \"$library_backup_filename\""
+
+clean_restart $binfile
+
+load_core_file "load corefile with library file missing" \
+    "warning: Can't open file [string_to_regexp $library_filename] during file-backed mapping note processing"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with library file missing"
+
+# Build a new version of the shared library, keep the library the same size,
+# but change the contents so the build-id changes.  Then restart GDB and load
+# the core-file again.  GDB should spot that the build-id for the shared
+# library is not as expected, and should refuse to map in the shared library.
+if {[build_executable "build second version of shared library" \
+	 $library_filename $srcfile2 \
+	 { debug shlib build-id \
+	       additional_flags=-DPOINTER_VALUE=0x11223344 }] != 0} {
+    return
+}
+
+clean_restart $binfile
+
+load_core_file "load corefile with wrong library in place" \
+    "warning: File [string_to_regexp $library_filename] doesn't match build-id from core-file during file-backed mapping processing"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with wrong library in place"
+
+# Setup a debuginfod server which can serve the original shared library file.
+# Then restart GDB and load the core-file.  GDB should download the original
+# shared library from debuginfod and use that to provide the file backed
+# mapping.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    clean_restart
+    gdb_test_no_output "set debuginfod enabled on" \
+	"enabled debuginfod for initial test"
+    gdb_load $binfile
+
+    load_core_file "load corefile, download library from debuginfod" \
+	"Downloading\[^\r\n\]* executable for [string_to_regexp $library_filename]\\.\\.\\."
+
+    set ptr_value [read_ptr_value]
+    gdb_assert { $ptr_value == $ptr_expected_value } \
+	"check value of pointer variable after downloading library file"
+}
+
+stop_debuginfod
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 3/4] gdb: improve shared library build-id check for core-files
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
@ 2024-05-20 13:08   ` Andrew Burgess
  2024-05-21 19:00     ` Andrew Burgess
  2024-05-20 13:08   ` [PATCHv2 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
  4 siblings, 1 reply; 16+ messages in thread
From: Andrew Burgess @ 2024-05-20 13:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the libraries file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-id extracted during the ::build_file_mappings phase to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this two changes need to be made to GDB.

First, rather than just recording the soname to build-id mapping in
set_cbfd_soname_build_id, I now also record, the full mapped filename
to build-id mapping, and also the memory ranges to build-id mapping
for every memory range covered by a mapped file.

Second, I've added a new callback solib_ops::find_solib_addr.  This
callback takes a solib object and returns an (optional) address within
the inferior that is part of this library.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, I also build a reverse build-id to filename map in
set_cbfd_soname_build_id.  This is useful as often a shared library is
implemented as a symbolic link to the actual shared library file.  The
mapped file information is stored based on the actual final, real file
name, while the shared library information holds the original symbolic
link file name.

If when loading the shared library, we find the symbolic link has
disappeared, then we can use the build-id to file name map to check if
the actual file is still around, if it is (and if the build-id
matches) then we can fall back to use that file.

I've also renamed set_cbfd_soname_build_id to set_cbfd_solib_build_id
and get_cbfd_soname_build_id to get_cbfd_solib_build_id as the
'soname' in the function names was now only part of the larger job
these functions do.
---
 gdb/corelow.c                                 |  33 +-
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 373 +++++++++++++++---
 gdb/solib.h                                   |  39 +-
 gdb/solist.h                                  |  22 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 260 ++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  13 +
 15 files changed, 802 insertions(+), 64 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 28d619594d8..85bc3c26bea 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -418,6 +418,10 @@ core_target::build_file_mappings ()
 	    }
 	}
 
+      std::vector<mem_range> ranges;
+      for (const mapped_file::region &region : file_data.regions)
+	ranges.emplace_back (region.start, region.end - region.start);
+
       if (expanded_fname == nullptr
 	  || abfd == nullptr
 	  || !bfd_check_format (abfd.get (), bfd_object))
@@ -494,16 +498,29 @@ core_target::build_file_mappings ()
 	    }
 	}
 
-      /* If this is a bfd of a shared library, record its soname and
-	 build-id.  */
-      if (file_data.build_id != nullptr && abfd != nullptr)
+      /* If this is a bfd with a build-id then record the filename,
+	 optional soname (DT_SONAME .dynamic attribute), and the range of
+	 addresses at which this bfd is mapped.  This information can be
+	 used to perform build-id checking when loading the shared
+	 libraries.  */
+      if (file_data.build_id != nullptr)
 	{
-	  gdb::unique_xmalloc_ptr<char> soname
-	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+	  normalize_mem_ranges (&ranges);
+
+	  const char *actual_filename = nullptr;
+	  gdb::unique_xmalloc_ptr<char> soname;
+	  if (abfd != nullptr)
+	    {
+	      actual_filename = bfd_get_filename (abfd.get ());
+	      soname = gdb_bfd_read_elf_soname (actual_filename);
+	    }
 
-	  if (soname != nullptr)
-	    set_cbfd_soname_build_id (current_program_space->cbfd,
-				      soname.get (), file_data.build_id);
+	  set_cbfd_solib_build_id (current_program_space->cbfd,
+				   soname.get (),
+				   filename.c_str (),
+				   actual_filename,
+				   std::move (ranges),
+				   file_data.build_id);
 	}
     }
 
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index a50bb165c19..a9242408137 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -689,6 +689,12 @@ const solib_ops solib_aix_so_ops =
   solib_aix_open_symbol_file_object,
   solib_aix_in_dynsym_resolve_code,
   solib_aix_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_solib_aix ();
diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index f0828fdf102..e9a85db1e5e 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -665,4 +665,10 @@ const solib_ops darwin_so_ops =
   open_symbol_file_object,
   darwin_in_dynsym_resolve_code,
   darwin_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-dsbt.c b/gdb/solib-dsbt.c
index 11225f72ed0..479f32182d2 100644
--- a/gdb/solib-dsbt.c
+++ b/gdb/solib-dsbt.c
@@ -913,6 +913,12 @@ const solib_ops dsbt_so_ops =
   open_symbol_file_object,
   dsbt_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_dsbt_solib ();
diff --git a/gdb/solib-frv.c b/gdb/solib-frv.c
index 39508fab4c8..7502fe1f560 100644
--- a/gdb/solib-frv.c
+++ b/gdb/solib-frv.c
@@ -1084,4 +1084,10 @@ const solib_ops frv_so_ops =
   open_symbol_file_object,
   frv_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 1d4a50568d7..91359293e75 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -3353,6 +3353,15 @@ svr4_iterate_over_objfiles_in_search_order
     }
 }
 
+/* See solib_ops::find_solib_addr in solist.h.  */
+
+static std::optional<CORE_ADDR>
+svr4_find_solib_addr (solib &so)
+{
+  auto *li = gdb::checked_static_cast<lm_info_svr4 *> (so.lm_info.get ());
+  return li->l_addr_inferior;
+}
+
 const struct solib_ops svr4_so_ops =
 {
   svr4_relocate_section_addresses,
@@ -3368,6 +3377,7 @@ const struct solib_ops svr4_so_ops =
   svr4_keep_data_in_core,
   svr4_update_solib_event_breakpoints,
   svr4_handle_solib_event,
+  svr4_find_solib_addr,
 };
 
 void _initialize_svr4_solib ();
diff --git a/gdb/solib-target.c b/gdb/solib-target.c
index 6563da05a47..f89761f109f 100644
--- a/gdb/solib-target.c
+++ b/gdb/solib-target.c
@@ -412,4 +412,10 @@ const solib_ops solib_target_so_ops =
   solib_target_open_symbol_file_object,
   solib_target_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib.c b/gdb/solib.c
index db2ff5ce9a3..c39dfbcc78e 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -475,56 +475,304 @@ solib_bfd_open (const char *pathname)
   return abfd;
 }
 
-/* Mapping of a core file's shared library sonames to their respective
-   build-ids.  Added to the registries of core file bfds.  */
+/* A mem_range and the build-id associated with the file mapped into the
+   given range.  */
 
-typedef std::unordered_map<std::string, std::string> soname_build_id_map;
+struct mem_range_and_build_id
+{
+  mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+    : range (r),
+      build_id (id)
+  { /* Nothing.  */ }
+
+  /* A range of memory addresses.  */
+  mem_range range;
+
+  /* The build-id of the file mapped into RANGE.  */
+  const bfd_build_id *build_id;
+};
+
+/* When a core file is opened set_cbfd_solib_build_id is called and the
+   data it is passed is used to create an instance of this class.
+
+   This class maps various information extracted from parsing the core
+   file onto the corresponding build-id.
+
+   When we latter load the shared libraries as part of the core file
+   opening process we can use this data structure to find the build-id
+   that corresponds to a shared library.
+
+   Knowing the build-id allows for two things, first, we can refuse to
+   load an incorrect shared library (a file is found, but the build-id
+   doesn't match), and second, we can use various strategies to try and
+   find the correct shared library based on the build-id (e.g. we might be
+   able to download the shared library from debuginfod).  */
+
+struct solib_core_file_build_id_data
+{
+  /* A type that maps a string to a build-id.  */
+  using string_to_build_id_map
+    = std::unordered_map<std::string, const bfd_build_id *>;
+
+  /* A type for a reverse build-id to string map.  We only store a sting
+     pointer in this map as we point back to the strings held in a
+     string_to_build_id_map, this avoids duplicating some strings.  */
+  using build_id_to_string_map
+    = std::unordered_map<const bfd_build_id *, std::string>;
+
+  /* When loading a core file, the build-ids are extracted based on the
+     file backed mappings.  This map associates the name of a file that was
+     mapped into the core file with the corresponding build-id.  When
+     opening a shared library, if the shared library filename appears in
+     this map, then we can use the corresponding build-id.  The build-id
+     pointers in this map will never be nullptr.  */
+
+  string_to_build_id_map filename_to_build_id_map;
+
+  /* Map a build-id pointer back to a filename.  The std::string pointers
+     in this map point back to the strings in FILENAME_TO_BUILD_ID_MAP.
+     If we end up figuring out the build-id from SONAME_TO_BUILD_ID_MAP or
+     from ADDRESS_TO_BUILD_ID_LIST then it might be that we can't find an
+     on-disk file.  In this case we can use this reverse map to find the
+     name of the mapped file, which might be usable.  */
+
+  build_id_to_string_map build_id_to_filename_map;
+
+  /* If the file that was mapped into the core file was a shared library
+     then it might have a DT_SONAME tag in its .dynamic section, this tag
+     contains the name of a shared object.  When opening a shared library,
+     if it's basename appears in this map then we can use the corresponding
+     build-id.
+
+     In the rare case that two different files have the same DT_SONAME
+     value then the build-id pointer in this map will be nullptr, this
+     indicates that it's not possible to find a build-id based on the
+     DT_SONAME value.  */
+
+  string_to_build_id_map soname_to_build_id_map;
+
+  /* This vector maps memory ranges onto an associated build-id.  The
+     ranges are those of the files mapped into the core file.  When opening
+     a shared library, if non of the string to build-id maps provide a
+     build-id then we can find a representative address for the library by
+     calling solib_ops::find_solib_addr() then then look in this vector for
+     a memory range that covers this address.  If a suitable match is found
+     then we use the corresponding build-id.
+
+     Entries in this vector should not overlap, and are sorted be
+     increasing memory address.  Within each entry the build-id pointer
+     will not be nullptr.  */
+
+  std::vector<mem_range_and_build_id> address_to_build_id_list;
+
+  /* The address_to_build_id_list is built over multiple calls to
+     set_cbfd_solib_build_id, each call corresponds to a single mapped
+     file and might add multiple entries to address_to_build_id_list.  It
+     would be wasteful to sort address_to_build_id_list each time some new
+     entries are added, better to record that the address_to_build_id_list
+     is unsorted, and then sort once the first time we are going to perform
+     a lookup in address_to_build_id_list, that is what this flag is for.
+     When this flag is false address_to_build_id_list is not sorted, and
+     needs sorting before we search the list.  When this flag is true the
+     address_to_build_id_list is sorted.  */
+
+  bool address_to_build_id_list_sorted = false;
+};
 
 /* Key used to associate a soname_build_id_map to a core file bfd.  */
 
-static const struct registry<bfd>::key<soname_build_id_map>
+static const struct registry<bfd>::key<solib_core_file_build_id_data>
   cbfd_soname_build_id_data_key;
 
 /* See solib.h.  */
 
 void
-set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname,
-			  const bfd_build_id *build_id)
+set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *soname,
+			 const char *expected_filename,
+			 const char *actual_filename,
+			 std::vector<mem_range> &&ranges,
+			 const bfd_build_id *build_id)
 {
-  gdb_assert (abfd.get () != nullptr);
-  gdb_assert (soname != nullptr);
+  gdb_assert (cbfd.get () != nullptr);
   gdb_assert (build_id != nullptr);
+  gdb_assert (expected_filename != nullptr);
+
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
+
+  if (lookup_data == nullptr)
+    {
+      lookup_data = cbfd_soname_build_id_data_key.emplace (cbfd.get ());
+      gdb_assert (lookup_data != nullptr);
+    }
+
+  if (soname != nullptr)
+    {
+      /* If we already have an entry with this SONAME then this indicates
+	 that the inferior has two files mapped into memory with different
+	 file names (and most likely different build-ids), but with the
+	 same DT_SONAME attribute.  In this case we can't use the
+	 DT_SONAME to figure out the expected build-id of a shared
+	 library, so poison the entry for this SONAME by setting the entry
+	 to nullptr.  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (soname);
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr
+	  && !build_id_equal (it->second, build_id))
+	lookup_data->soname_to_build_id_map[soname] = nullptr;
+      else
+	lookup_data->soname_to_build_id_map[soname] = build_id;
+    }
+
+  /* When the core file is initially opened and the mapped files are
+     parsed, we group the build-id information based on the file name.  As
+     a consequence, we should see each EXPECTED_FILENAME value exactly
+     once.  This means that each insertion should always succeed.  */
+  const auto [it, inserted]
+    = lookup_data->filename_to_build_id_map.emplace (expected_filename,
+						     build_id);
+  gdb_assert (inserted);
+
+  /* Setup the reverse build-id to file name map.  We store a pointer to
+     the string in the filename_to_build_id_map, this saves creating
+     another copy of this string.  The two maps have the same lifetime, so
+     this should be fine.  */
+  if (actual_filename != nullptr)
+    lookup_data->build_id_to_filename_map.emplace (build_id,
+						   actual_filename);
+
+  /* Setup the list of memory range to build-id objects.  */
+  for (mem_range &r : ranges)
+    lookup_data->address_to_build_id_list.emplace_back (std::move (r),
+							build_id);
+
+  /* At this point the address_to_build_id_list is unsorted (we just added
+     some entries to the end of the list).  All entries should be added
+     before any look-ups are performed, and the list is only sorted when
+     the first look-up is performed.  */
+  gdb_assert (!lookup_data->address_to_build_id_list_sorted);
+}
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+/* Helper for get_cbfd_solib_build_id.  BUILD_ID is a build-id that was
+   found in LOOKUP_DATA.  Look-up the corresponding file name in
+   LOOKUP_DATA.build_id_to_filename_map and return a pair containing
+   BUILD_ID and the file name pointer that we find.  If no corresponding
+   filename is found in the build_id_to_filename_map then the returned
+   pair contains BUILD_ID and an empty string.  If BUILD_ID is nullptr
+   then the returned pair contains nullptr and an empty string.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_make_result (solib_core_file_build_id_data *lookup_data,
+			    const bfd_build_id *build_id)
+{
+  if (build_id != nullptr)
+    {
+      gdb_assert (lookup_data != nullptr);
 
-  if (mapptr == nullptr)
-    mapptr = cbfd_soname_build_id_data_key.emplace (abfd.get ());
+      auto it = lookup_data->build_id_to_filename_map.find (build_id);
+      if (it != lookup_data->build_id_to_filename_map.end ())
+	return { build_id, it->second };
+    }
 
-  (*mapptr)[soname] = build_id_to_string (build_id);
+  return { build_id, {} };
 }
 
-/* If SONAME had a build-id associated with it in ABFD's registry by a
-   previous call to set_cbfd_soname_build_id then return the build-id
-   as a NULL-terminated hex string.  */
+/* CBFD is the current core file bfd object, which might be nullptr,
+   FILENAME is the file name of the shared library GDB is trying to load,
+   and SOLIB_ADDR is (optionally) an address within the shared library.
 
-static gdb::unique_xmalloc_ptr<char>
-get_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname)
+   Look through the data previously registered via calls to
+   set_cbfd_solib_build_id and try to find a build-id corresponding to
+   FILENAME.
+
+   If a build-id is found then return a pair, the first item is the
+   build-id and the second item is a string which contains the file name of
+   the file GDB opened to provide the file backed mapping.  This file name
+   string might be empty if GDB didn't find a file to provide the file
+   backed mapping.
+
+   If no build-id can be found for FILENAME then GDB will return a pair
+   containing nullptr (for the build-id) and an empty string for the file
+   name.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *filename,
+			 const std::optional<CORE_ADDR> &solib_addr)
 {
-  if (abfd.get () == nullptr || soname == nullptr)
-    return {};
+  if (cbfd.get () == nullptr)
+    return { nullptr, {} };
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
 
-  if (mapptr == nullptr)
-    return {};
+  if (lookup_data == nullptr)
+    return { nullptr, {} };
 
-  auto it = mapptr->find (lbasename (soname));
-  if (it == mapptr->end ())
-    return {};
+  if (filename != nullptr)
+    {
+      /* If there's a matching entry in filename_to_build_id_map then the
+	 associated build-id will not be nullptr, and can be used to
+	 validate that FILENAME is correct.  */
+      auto it = lookup_data->filename_to_build_id_map.find (filename);
+      if (it != lookup_data->filename_to_build_id_map.end ())
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  if (solib_addr.has_value ())
+    {
+      /* On the first lookup, sort the address_to_build_id_list.  */
+      if (!lookup_data->address_to_build_id_list_sorted)
+	{
+	  std::sort (lookup_data->address_to_build_id_list.begin (),
+		     lookup_data->address_to_build_id_list.end (),
+		     [] (const mem_range_and_build_id &a,
+			 const mem_range_and_build_id &b) {
+		       return a.range < b.range;
+		     });
+	  lookup_data->address_to_build_id_list_sorted = true;
+	}
+
+      /* Look for the first entry whose range's start address is not less
+	 than, or equal too, the address SOLIB_ADDR.  If we find such an
+	 entry, then the previous entry's range might contain SOLIB_ADDR.
+	 If it does then that previous entry's build-id can be used.  */
+      auto it = std::lower_bound
+	(lookup_data->address_to_build_id_list.begin (),
+	 lookup_data->address_to_build_id_list.end (),
+	 *solib_addr,
+	 [] (const mem_range_and_build_id &a,
+	     const CORE_ADDR &b) {
+	  return a.range.start <= b;
+	});
+
+      if (it != lookup_data->address_to_build_id_list.begin ())
+	{
+	  --it;
 
-  return make_unique_xstrdup (it->second.c_str ());
+	  if (it->range.contains (*solib_addr))
+	    return get_cbfd_solib_make_result (lookup_data, it->build_id);
+	}
+    }
+
+  if (filename != nullptr)
+    {
+      /* If the basename of FILENAME appears in the soname_to_build_id_map
+	 then when the mapped files were processed, we saw a file with a
+	 DT_SONAME attribute corresponding to FILENAME, use that build-id
+	 to validate FILENAME.
+
+	 However, the build-id in this map might be nullptr if we saw
+	 multiple mapped files with the same DT_SONAME attribute (though
+	 this should be pretty rare).  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (lbasename (filename));
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr)
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  return { nullptr, {} };
 }
 
 /* Given a pointer to one of the shared objects in our list of mapped
@@ -546,36 +794,53 @@ solib_map_sections (solib &so)
 
   gdb::unique_xmalloc_ptr<char> filename (tilde_expand (so.so_name.c_str ()));
   gdb_bfd_ref_ptr abfd (ops->bfd_open (filename.get ()));
-  gdb::unique_xmalloc_ptr<char> build_id_hexstr
-    = get_cbfd_soname_build_id (current_program_space->cbfd,
-				so.so_name.c_str ());
+  std::optional<CORE_ADDR> solib_addr = ops->find_solib_addr (so);
+
+  const auto [expected_build_id, mapped_filename]
+    = get_cbfd_solib_build_id (current_program_space->cbfd,
+				so.so_name.c_str (),
+				solib_addr);
 
   /* If we already know the build-id of this solib from a core file, verify
      it matches ABFD's build-id.  If there is a mismatch or the solib wasn't
      found, attempt to query debuginfod for the correct solib.  */
-  if (build_id_hexstr.get () != nullptr)
+  if (expected_build_id != nullptr)
     {
-      bool mismatch = false;
+      bool mismatch = (abfd != nullptr
+		       && build_id_bfd_get (abfd.get ()) != nullptr
+		       && !build_id_equal (expected_build_id,
+					   build_id_bfd_get (abfd.get ())));
 
-      if (abfd != nullptr && abfd->build_id != nullptr)
-	{
-	  std::string build_id = build_id_to_string (abfd->build_id);
-
-	  if (build_id != build_id_hexstr.get ())
-	    mismatch = true;
-	}
       if (abfd == nullptr || mismatch)
 	{
-	  scoped_fd fd = debuginfod_exec_query (
-	    (const unsigned char *) build_id_hexstr.get (), 0,
-	    so.so_name.c_str (), &filename);
-
-	  if (fd.get () >= 0)
-	    abfd = ops->bfd_open (filename.get ());
-	  else if (mismatch)
-	    warning (_ ("Build-id of %ps does not match core file."),
-		     styled_string (file_name_style.style (),
-				    filename.get ()));
+	  /* If GDB found a suitable file during the file mapping
+	     processing stage then lets use that.  We don't check the
+	     build-id after opening this file, either this file was found
+	     by build-id, in which case it's going to match, or this file
+	     doesn't have a build-id, so checking tells us nothing.
+	     However, if it was good enough during the mapped file
+	     processing, we assume it's good enough now.  */
+	  if (!mapped_filename.empty ())
+	    abfd = ops->bfd_open (mapped_filename.c_str ());
+	  else
+	    abfd = nullptr;
+
+	  if (abfd == nullptr)
+	    {
+	      scoped_fd fd = debuginfod_exec_query
+		(expected_build_id->data, expected_build_id->size,
+		 so.so_name.c_str (), &filename);
+
+	      if (fd.get () >= 0)
+		abfd = ops->bfd_open (filename.get ());
+	      else if (mismatch)
+		{
+		  warning (_ ("Build-id of %ps does not match core file."),
+			   styled_string (file_name_style.style (),
+					  filename.get ()));
+		  abfd = nullptr;
+		}
+	    }
 	}
     }
 
@@ -1705,6 +1970,14 @@ remove_user_added_objfile (struct objfile *objfile)
     }
 }
 
+/* See solist.h.  */
+
+std::optional<CORE_ADDR>
+default_find_solib_addr (solib &so)
+{
+  return {};
+}
+
 void _initialize_solib ();
 
 void
diff --git a/gdb/solib.h b/gdb/solib.h
index f7a93c0718f..9cd9f111cde 100644
--- a/gdb/solib.h
+++ b/gdb/solib.h
@@ -29,6 +29,7 @@ struct program_space;
 #include "gdb_bfd.h"
 #include "symfile-add-flags.h"
 #include "gdbsupport/function-view.h"
+#include "memrange.h"
 
 /* Value of the 'set debug solib' configuration variable.  */
 
@@ -136,11 +137,39 @@ extern void update_solib_breakpoints (void);
 
 extern void handle_solib_event (void);
 
-/* Associate SONAME with BUILD_ID in ABFD's registry so that it can be
-   retrieved with get_cbfd_soname_build_id.  */
+/* This function is used to pass information extracted when processing the
+   core file's mapped file section through to the shared library processing
+   parts of GDB.  When the mapped files are processed GDB is (sometimes)
+   able to find the build-ids for the mapped files.  These build-ids are
+   recorded by calling this function, and can be used during shared library
+   loading to ensure the correct shared libraries are loaded.
 
-extern void set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd,
-				      const char *soname,
-				      const bfd_build_id *build_id);
+   CBFD is the core file bfd object to record this information against.
+
+   SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+   the shared library that was mapped into the core file.  This can be
+   nullptr.
+
+   EXPECTED_FILENAME is the name of the file that was mapped into the
+   inferior as extracted from the core file, this should never be nullptr.
+
+   ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+   mapped file information, this can be nullptr if GDB failed to find a
+   suitable file.
+
+   RANGES is the list of memory ranges at which this file was mapped into
+   the inferior.
+
+   BUILD_ID this is the build-id for this mapped file, this will never be
+   nullptr.  Not every mapped file will have a build-id, but there's no
+   point calling this function if we failed to find a build-id; this is all
+   about allowing the shared library code to find the build-id.  */
+
+extern void set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd,
+				     const char *soname,
+				     const char *expected_filename,
+				     const char *actual_filename,
+				     std::vector<mem_range> &&ranges,
+				     const bfd_build_id *build_id);
 
 #endif /* SOLIB_H */
diff --git a/gdb/solist.h b/gdb/solist.h
index f0d22080de1..bd60987f49c 100644
--- a/gdb/solist.h
+++ b/gdb/solist.h
@@ -167,6 +167,23 @@ struct solib_ops
      NULL, in which case no specific preprocessing is necessary
      for this target.  */
   void (*handle_event) (void);
+
+  /* Return an address within the inferior's address space which is known
+     to be part of SO.  If there is no such address, or GDB doesn't know
+     how to figure out such an address then an empty optional is
+     returned.
+
+     The returned address can be used when loading the shared libraries
+     for a core file.  GDB knows the build-ids for (some) files mapped
+     into the inferior's address space, and knows the address ranges which
+     those mapped files cover.  If GDB can figure out a representative
+     address for the library then this can be used to match a library to a
+     mapped file, and thus to a build-id.  GDB can then use this
+     information to help locate the shared library objfile, if the objfile
+     is not in the expected place (as defined by the shared libraries file
+     name).  */
+
+  std::optional<CORE_ADDR> (*find_solib_addr) (solib &so);
 };
 
 /* A unique pointer to a so_list.  */
@@ -186,4 +203,9 @@ extern gdb_bfd_ref_ptr solib_bfd_fopen (const char *pathname, int fd);
 /* Find solib binary file and open it.  */
 extern gdb_bfd_ref_ptr solib_bfd_open (const char *in_pathname);
 
+/* A default implementation of the solib_ops::find_solib_addr callback.
+   This just returns an empty std::optional<CORE_ADDR> indicating GDB is
+   unable to find an address within the library SO.  */
+extern std::optional<CORE_ADDR> default_find_solib_addr (solib &so);
+
 #endif
diff --git a/gdb/testsuite/gdb.base/solib-search.exp b/gdb/testsuite/gdb.base/solib-search.exp
index 581046a1ed8..bbfe3ae719a 100644
--- a/gdb/testsuite/gdb.base/solib-search.exp
+++ b/gdb/testsuite/gdb.base/solib-search.exp
@@ -43,7 +43,11 @@ set right_binfile2_lib \
 set binfile1_lib [standard_output_file ${libname1}.so]
 set binfile2_lib [standard_output_file ${libname2}.so]
 
-set lib_flags [list debug ldflags=-Wl,-Bsymbolic]
+# When this test was written, GDB's ability to track down shared
+# libraries for a core file based on the build-id much poorer.  As GDB
+# has improved we now need to disable build-ids in order for this test
+# to function as expected.
+set lib_flags [list debug no-build-id ldflags=-Wl,-Bsymbolic]
 set wrong_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=1"
 set right_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=8192 additional_flags=-DRIGHT"
 
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
new file mode 100644
index 00000000000..30e9267dc01
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
@@ -0,0 +1,39 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+/* It is important that these two variables have names of the same length
+   so that the debug information in the two library versions is laid out
+   the same.  If they differ then the .dynamic section might move, which
+   will trigger a different check within GDB than the one we actually want
+   to check.  */
+
+#if LIB_VERSION == 1
+volatile int *library_1_var = (volatile int *) 0x12345678;
+#elif LIB_VERSION == 2
+volatile int *library_2_var = (volatile int *) 0x11223344;
+#else
+# error Unknown library version
+#endif
+
+int
+foo (void)
+{
+  /* This should trigger a core dump.  */
+  abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
new file mode 100644
index 00000000000..a51d48e2f2e
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
@@ -0,0 +1,41 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <stddef.h>
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+/* This is updated by the .exp file.  */
+char *libname = "libfoo_2.so";
+
+int
+main (void)
+{
+  void *handle;
+  int res, tmp;
+
+  handle = dlopen (libname, RTLD_LAZY);
+  assert (handle != NULL);
+
+  res = foo ();
+
+  tmp = dlclose (handle);
+  assert (tmp == 0);
+
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
new file mode 100644
index 00000000000..44c4268edd3
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -0,0 +1,260 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test exercises GDB's ability to validate build-ids when loading
+# shared libraries for a core file.
+#
+# The test creates two "versions" of a shared library, sets up a
+# symlink to point to one version of the library, and creates a core file.
+#
+# We then try re-loading the core file and executable and check that
+# GDB is able to correctly load the shared library.  To confuse things
+# we retarget the library symlink at the other version of the library.
+#
+# After that we repeat the test, but this time deleting the symlink
+# completely.
+#
+# Then we remove the version of the library completely, at this point
+# we do expect GDB to give a warning about being unable to load the library.
+#
+# And finally, we setup debuginfod and have it serve the missing
+# library file, GDB should correctly download the library file.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+require {istarget "*-linux*"}
+require {!is_remote host}
+require {!using_fission}
+
+standard_testfile -1.c -2.c
+
+# Build two similar, but slightly different versions of the shared
+# library.  Both libraries have DT_SONAME set to the generic
+# libfoo.so, we'll create a symlink with that name later.
+set library_1_filename [standard_output_file "libfoo_1.so"]
+set library_2_filename [standard_output_file "libfoo_2.so"]
+
+# The generic name for the library.
+set library_filename [standard_output_file "libfoo.so"]
+
+# When compiling a shared library the -Wl,-soname,NAME option is
+# automatically added based on the final name of the library.  We want
+# to compile libfoo_1.so, but set the soname to libfoo.so.  To achieve
+# this we first compile into libfoo.so, and then rename the library to
+# libfoo_1.so.
+if {[build_executable "build libfoo_1.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=1 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_1_filename}"
+
+# See the comment above, but this time we rename to libfoo_2.so.
+if {[build_executable "build libfoo_2.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=2 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_2_filename}"
+
+# Create libfoo.so symlink to the libfoo_1.so library.  If this
+# symlink creation fails then we assume we can't create symlinks on
+# this host.  If this succeeds then later symlink creation is required
+# to succeed, and will trigger an FAIL if it doesn't.
+set status \
+    [remote_exec build \
+	 "ln -sf ${library_1_filename} ${library_filename}"]
+if {[lindex $status 0] != 0} {
+    unsupported "host does not support symbolic links"
+    return
+}
+
+# Build the executable.  This links against libfoo.so, which is
+# poining at libfoo_1.so.  Just to confuse things even more, this
+# executable uses dlopen to load libfoo_2.so.  Weird!
+if { [build_executable "build executable" ${binfile} ${srcfile2} \
+	  [list debug shlib=${library_filename} shlib_load]] == -1 } {
+    return
+}
+
+# If the board file is automatically splitting the debug information
+# into a separate file (e.g. the cc-with-gnu-debuglink.exp board) then
+# this test isn't going to work.
+clean_restart
+gdb_file_cmd $binfile
+if {$gdb_file_cmd_debug_info ne "debug"} {
+    unsupported "failed to find debug information"
+    return
+}
+if {[regexp "${testfile}.debug" $gdb_file_cmd_msg]} {
+    unsupported "debug information has been split to a separate file"
+    return
+}
+
+# Run BINFILE which will generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Helper proc to load global BINFILE and then load global COREFILE.
+#
+# If EXPECT_WARNING is true then we require a warning about being
+# unable to load the shared library symbols, otherwise, EXPECT_WARNING
+# is false and we require no warning.
+#
+# If EXPECT_DOWNLOAD is true then we require a line indicating that
+# the shared library is being downloaded from debuginfod, otherwise
+# the shared library should not be downloaded.
+proc load_exec_and_core_file { expect_warning expect_download testname } {
+    with_test_prefix $testname {
+	clean_restart $::binfile
+
+	set saw_warning false
+	set saw_download false
+	set saw_generated false
+	set saw_terminated false
+
+	gdb_test_multiple "core-file $::corefile" "load core file" {
+	    -re "^Core was generated by \[^\r\n\]+\r\n" {
+		set saw_generated true
+		exp_continue
+	    }
+	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
+		set saw_terminated true
+		exp_continue
+	    }
+	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing\r\n" {
+		# Ignore warnings from the file backed mapping phase.
+		exp_continue
+	    }
+	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\nDo you need \"set solib-search-path\" or \"set sysroot\".\r\n" {
+		set saw_warning true
+		exp_continue
+	    }
+	    -re "^Downloading executable for \[^\r\n\]+/libfoo_1\\.so\\.\\.\\.\r\n" {
+		set saw_download true
+		exp_continue
+	    }
+	    -re "^$::gdb_prompt $" {
+		gdb_assert { $saw_generated && $saw_terminated \
+				 && $saw_warning == $expect_warning \
+				 && $saw_download == $expect_download } \
+		    $gdb_test_name
+	    }
+	    -re "^\[^\r\n\]*\r\n" {
+		exp_continue
+	    }
+	}
+
+	# If we don't expect a warning then debug symbols from the
+	# shared library should be available.  Confirm we can read a
+	# variable from the shared library.  If we do expect a warning
+	# then the shared library debug symbols have not loaded, and
+	# the library variable should not be available.
+	if { !$expect_warning } {
+	    gdb_test "print/x library_1_var" " = 0x12345678" \
+		"check library_1_var can be read"
+	} else {
+	    gdb_test "print/x library_1_var" \
+		"^No symbol \"library_1_var\" in current context\\." \
+		"check library_1_var cannot be read"
+	}
+    }
+}
+
+# Initial test, just load the executable and core file.  At this point
+# everything should load fine as everything is where we expect to find
+# it.
+load_exec_and_core_file false false \
+    "load core file, all libraries as expected"
+
+# Update libfoo.so symlink to point at the second library then reload
+# the core file.  GDB should spot that the symlink points to the wrong
+# file, but should be able to figure out the correct file to load as
+# the right file will be in the mapped file list.
+set status [remote_exec build \
+		"ln -sf ${library_2_filename} ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "update library symlink to point to the wrong file"
+
+load_exec_and_core_file false false \
+    "load core file, symlink points to wrong file"
+
+# Remove libfoo.so symlink and reload the core file.  As in the
+# previous test GDB should be able to figure out the correct file to
+# load as the correct file will still appear in the mapped file list.
+set status [remote_exec build "rm -f ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } "remove library symlink"
+
+load_exec_and_core_file false false \
+    "load core file, symlink removed"
+
+# Remove LIBRARY_1_FILENAME.  We'll now see a warning that the mapped
+# file can't be loaded (we ignore that warning), and we'll see a
+# warning that the shared library can't be loaded.
+set library_1_backup_filename ${library_1_filename}.backup
+set status \
+    [remote_exec build \
+	 "mv ${library_1_filename} ${library_1_backup_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "remove libfoo_1.so"
+
+load_exec_and_core_file true false \
+    "load core file, libfoo_1.so removed"
+
+# Setup a debuginfod server which can serve the original shared
+# library file.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_1_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    save_vars { GDBFLAGS } {
+	append GDBFLAGS " -ex \"set debuginfod enabled on\""
+
+	# Reload the executable and core file.  GDB should download
+	# the file libfoo_1.so using debuginfod during the mapped file
+	# phase, but should then reuse that download during the shared
+	# library phase.
+	load_exec_and_core_file false true \
+	    "load core file, use debuginfod"
+    }
+}
+
+stop_debuginfod
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 0d78691c381..c958ff18d2a 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -5211,6 +5211,7 @@ proc quote_for_host { args } {
 #     debug information
 #   - text_segment=addr: Tell the linker to place the text segment at ADDR.
 #   - build-id: Ensure the final binary includes a build-id.
+#   - no-build-id: Ensure the final binary does not include a build-id.
 #   - column-info/no-column-info: Enable/Disable generation of column table
 #     information.
 #
@@ -5316,6 +5317,18 @@ proc gdb_compile {source dest type options} {
 	lappend new_options "additional_flags=-Wl,--build-id"
     }
 
+    # If the 'no-build-id' option is used then disable the build-id.
+    if {[lsearch -exact $options no-build-id] > 0} {
+	lappend new_options "additional_flags=-Wl,--build-id=none"
+    }
+
+    # Sanity check.  If both 'build-id' and 'no-build-id' are used
+    # then what is expected from us!
+    if {[lsearch -exact $options build-id] > 0
+	&& [lsearch -exact $options no-build-id] > 0} {
+	error "cannot use build-id and no-build-id options"
+    }
+
     # Treating .c input files as C++ is deprecated in Clang, so
     # explicitly force C++ language.
     if { !$getting_compiler_info
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv2 4/4] gdb: unify build-id to objfile lookup code
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
                     ` (2 preceding siblings ...)
  2024-05-20 13:08   ` [PATCHv2 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
@ 2024-05-20 13:08   ` Andrew Burgess
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
  4 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-20 13:08 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

There are 3 places where we currently call debuginfod_exec_query to
lookup an objfile for a given build-id.

In one of these places we first call build_id_to_exec_bfd which also
looks up an objfile given a build-id, but this function looks on disk
for a symlink in the .build-id/ sub-directory (within the
debug-file-directory).

I can't think of any reason why we shouldn't call build_id_to_exec_bfd
before every call to debuginfod_exec_query.

So, in this commit I have added a new function in build-id.c,
find_exec_by_build_id, this function calls build_id_to_exec_bfd, and
if that fails, then calls debuginfod_exec_query.

Everywhere we call debuginfod_exec_query is updated to call the new
function, and in locate_exec_from_corefile_build_id, the existing call
to build_id_to_exec_bfd is removed as calling find_exec_by_build_id
does this for us.

One slight weird thing is in core_target::build_file_mappings, here we
call find_exec_by_build_id which returns a gdb_bfd_ref_ptr for the
opened file, however we immediately reopen the file as "binary".  The
reason for this is that all the bfds opened in ::build_file_mappings
need to be opened as "binary" (see the function comments for why).

I did consider passing a target type into find_exec_by_build_id, which
could then be forwarded to build_id_to_exec_bfd and used to open the
BFD as "binary", however, if you follow the call chain you'll end up
in build_id_to_debug_bfd_1, where we actually open the bfd.  Notice in
here that we call build_id_verify to double check the build-id of the
file we found, this requires that the bfd not be opened as "binary".

What this means is that we always have to first open the bfd using the
gnutarget target type (for the build-id check), and then we would have
to reopen it as "binary".  There seems little point pushing the reopen
logic into find_exec_by_build_id, so we just do this in the
::build_file_mappings function.

I've extended the tests to cover the two cases which actually changed
in this commit.
---
 gdb/build-id.c                                | 42 ++++++++++++++++++-
 gdb/build-id.h                                | 21 ++++++----
 gdb/corelow.c                                 | 40 ++++++------------
 gdb/solib.c                                   | 22 ++++------
 .../gdb.debuginfod/corefile-mapped-file.exp   | 24 +++++++++++
 .../gdb.debuginfod/solib-with-soname.exp      | 32 +++++++++++++-
 gdb/testsuite/lib/gdb.exp                     |  7 +++-
 7 files changed, 134 insertions(+), 54 deletions(-)

diff --git a/gdb/build-id.c b/gdb/build-id.c
index 41667d5e5cf..27642b58d56 100644
--- a/gdb/build-id.c
+++ b/gdb/build-id.c
@@ -26,6 +26,8 @@
 #include "filenames.h"
 #include "gdbcore.h"
 #include "cli/cli-style.h"
+#include "gdbsupport/scoped_fd.h"
+#include "debuginfod-support.h"
 
 /* See build-id.h.  */
 
@@ -198,9 +200,11 @@ build_id_to_debug_bfd (size_t build_id_len, const bfd_byte *build_id)
   return build_id_to_bfd_suffix (build_id_len, build_id, ".debug");
 }
 
-/* See build-id.h.  */
+/* Find and open a BFD for an executable file given a build-id.  If no BFD
+   can be found, return NULL.  The returned reference to the BFD must be
+   released by the caller.  */
 
-gdb_bfd_ref_ptr
+static gdb_bfd_ref_ptr
 build_id_to_exec_bfd (size_t build_id_len, const bfd_byte *build_id)
 {
   return build_id_to_bfd_suffix (build_id_len, build_id, "");
@@ -243,3 +247,37 @@ find_separate_debug_file_by_buildid (struct objfile *objfile,
 
   return std::string ();
 }
+
+/* See build-id.h.  */
+
+gdb_bfd_ref_ptr
+find_exec_by_build_id (const bfd_build_id *build_id,
+		       const char *expected_filename)
+{
+  /* Try to find the executable (or shared object) by looking for a
+     (sym)link on disk from the build-id to the object file.  */
+  gdb_bfd_ref_ptr abfd = build_id_to_exec_bfd (build_id->size,
+					       build_id->data);
+
+  if (abfd != nullptr)
+    return abfd;
+
+  /* Attempt to query debuginfod for the executable.  */
+  gdb::unique_xmalloc_ptr<char> path;
+  scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
+					expected_filename, &path);
+  if (fd.get () >= 0)
+    {
+      abfd = gdb_bfd_open (path.get (), gnutarget);
+
+      if (abfd == nullptr)
+	warning (_("\"%ps\" from debuginfod cannot be opened as bfd: %s"),
+		 styled_string (file_name_style.style (), path.get ()),
+		 gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
+      else if (!build_id_verify (abfd.get (), build_id->size,
+				 build_id->data))
+	abfd = nullptr;
+    }
+
+  return abfd;
+}
diff --git a/gdb/build-id.h b/gdb/build-id.h
index c5f20f8782e..3df122a0cbf 100644
--- a/gdb/build-id.h
+++ b/gdb/build-id.h
@@ -40,13 +40,6 @@ extern int build_id_verify (bfd *abfd,
 extern gdb_bfd_ref_ptr build_id_to_debug_bfd (size_t build_id_len,
 					      const bfd_byte *build_id);
 
-/* Find and open a BFD for an executable file given a build-id.  If no BFD
-   can be found, return NULL.  The returned reference to the BFD must be
-   released by the caller.  */
-
-extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
-					     const bfd_byte *build_id);
-
 /* Find the separate debug file for OBJFILE, by using the build-id
    associated with OBJFILE's BFD.  If successful, returns the file name for the
    separate debug file, otherwise, return an empty string.
@@ -60,6 +53,20 @@ extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
 extern std::string find_separate_debug_file_by_buildid
   (struct objfile *objfile, deferred_warnings *warnings);
 
+/* Find an executable (or shared library) that matches BUILD_ID.  This is
+   done by first checking in the debug-file-directory for a .build-id/
+   sub-directory, and looking for a symlink in there that points to the
+   required file.
+
+   If that doesn't find us a file then we call to debuginfod to see if it
+   can provide the required file.
+
+   EXPECTED_FILENAME is used in output messages from debuginfod, this
+   should be the file we were looking for but couldn't find.  */
+
+extern gdb_bfd_ref_ptr find_exec_by_build_id (const bfd_build_id *build_id,
+					      const char *expected_filename);
+
 /* Return an hex-string representation of BUILD_ID.  */
 
 static inline std::string
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 85bc3c26bea..6975719c1f2 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -47,7 +47,6 @@
 #include "gdbsupport/pathstuff.h"
 #include "gdbsupport/scoped_fd.h"
 #include "gdbsupport/x86-xstate.h"
-#include "debuginfod-support.h"
 #include <unordered_map>
 #include <unordered_set>
 #include "cli/cli-cmds.h"
@@ -407,13 +406,19 @@ core_target::build_file_mappings ()
 	   || !bfd_check_format (abfd.get (), bfd_object))
 	  && file_data.build_id != nullptr)
 	{
-	  expanded_fname = nullptr;
-	  debuginfod_exec_query (file_data.build_id->data,
-				 file_data.build_id->size,
-				 filename.c_str (), &expanded_fname);
-	  if (expanded_fname != nullptr)
+	  abfd = find_exec_by_build_id (file_data.build_id,
+					filename.c_str ());
+
+	  if (abfd != nullptr)
 	    {
+	      /* The find_exec_by_build_id will have opened ABFD using the
+		 GNUTARGET global bfd type, however, we need the bfd opened
+		 as the binary type (see the function's header comment), so
+		 now we reopen ABFD with the desired binary type.  */
+	      expanded_fname
+		= make_unique_xstrdup (bfd_get_filename (abfd.get ()));
 	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      gdb_assert (b != nullptr);
 	      abfd = gdb_bfd_ref_ptr::new_reference (b);
 	    }
 	}
@@ -774,28 +779,7 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
     return;
 
   gdb_bfd_ref_ptr execbfd
-    = build_id_to_exec_bfd (build_id->size, build_id->data);
-
-  if (execbfd == nullptr)
-    {
-      /* Attempt to query debuginfod for the executable.  */
-      gdb::unique_xmalloc_ptr<char> execpath;
-      scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
-					    abfd->filename, &execpath);
-
-      if (fd.get () >= 0)
-	{
-	  execbfd = gdb_bfd_open (execpath.get (), gnutarget);
-
-	  if (execbfd == nullptr)
-	    warning (_("\"%s\" from debuginfod cannot be opened as bfd: %s"),
-		     execpath.get (),
-		     gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
-	  else if (!build_id_verify (execbfd.get (), build_id->size,
-				     build_id->data))
-	    execbfd.reset (nullptr);
-	}
-    }
+    = find_exec_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
diff --git a/gdb/solib.c b/gdb/solib.c
index c39dfbcc78e..3292f361176 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -45,7 +45,6 @@
 #include "gdb_bfd.h"
 #include "gdbsupport/filestuff.h"
 #include "gdbsupport/scoped_fd.h"
-#include "debuginfod-support.h"
 #include "source.h"
 #include "cli/cli-style.h"
 
@@ -826,20 +825,15 @@ solib_map_sections (solib &so)
 	    abfd = nullptr;
 
 	  if (abfd == nullptr)
-	    {
-	      scoped_fd fd = debuginfod_exec_query
-		(expected_build_id->data, expected_build_id->size,
-		 so.so_name.c_str (), &filename);
+	    abfd = find_exec_by_build_id (expected_build_id,
+					  so.so_name.c_str ());
 
-	      if (fd.get () >= 0)
-		abfd = ops->bfd_open (filename.get ());
-	      else if (mismatch)
-		{
-		  warning (_ ("Build-id of %ps does not match core file."),
-			   styled_string (file_name_style.style (),
-					  filename.get ()));
-		  abfd = nullptr;
-		}
+	  if (abfd == nullptr && mismatch)
+	    {
+	      warning (_ ("Build-id of %ps does not match core file."),
+		       styled_string (file_name_style.style (),
+				      filename.get ()));
+	      abfd = nullptr;
 	    }
 	}
     }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
index 6e3301e1c8d..b5dee228ca0 100644
--- a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -296,6 +296,30 @@ set ptr_value [read_ptr_value]
 gdb_assert { $ptr_value eq "unavailable" } \
     "check value of pointer is unavailable with library file missing"
 
+# Now symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to library we just moved aside.  Restart GDB and setup the
+# debug-file-directory before loading the core file.
+#
+# GDB should lookup the file to map via the build-id link in the
+# .build-id/ directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_backup_filename ""]
+
+remote_exec build "mkdir -p [file dirname $build_id_filename]"
+remote_exec build "ln -sf $library_backup_filename $build_id_filename"
+
+clean_restart $binfile
+
+gdb_test_no_output "set debug-file-directory $debugdir" \
+    "set debug-file-directory"
+
+load_core_file "load corefile, lookup in debug-file-directory"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file, lookup in debug-file-directory"
+
 # Build a new version of the shared library, keep the library the same size,
 # but change the contents so the build-id changes.  Then restart GDB and load
 # the core-file again.  GDB should spot that the build-id for the shared
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
index 44c4268edd3..cdd9730a22c 100644
--- a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -126,10 +126,19 @@ if {$corefile eq ""} {
 # If EXPECT_DOWNLOAD is true then we require a line indicating that
 # the shared library is being downloaded from debuginfod, otherwise
 # the shared library should not be downloaded.
-proc load_exec_and_core_file { expect_warning expect_download testname } {
+#
+# If DEBUGDIR is not the empty string then 'debug-file-directory' is
+# set to the value of DEBUGDIR.
+proc load_exec_and_core_file { expect_warning expect_download testname \
+				   {debugdir ""} } {
     with_test_prefix $testname {
 	clean_restart $::binfile
 
+	if { $debugdir ne "" } {
+	    gdb_test_no_output "set debug-file-directory $debugdir" \
+		"set debug directory"
+	}
+
 	set saw_warning false
 	set saw_download false
 	set saw_generated false
@@ -223,6 +232,27 @@ gdb_assert { [lindex $status 0] == 0 } \
 load_exec_and_core_file true false \
     "load core file, libfoo_1.so removed"
 
+# Symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to LIBRARY_1_BACKUP_FILENAME, now when we restart GDB it
+# should find the missing library within the debug directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_1_backup_filename ""]
+set status \
+    [remote_exec build \
+	 "mkdir -p [file dirname $build_id_filename]"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create sub-directory within the debug directory"
+set status \
+    [remote_exec build \
+	 "ln -sf $library_1_backup_filename $build_id_filename"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create symlink within the debug directory "
+
+load_exec_and_core_file false false \
+    "load core file, find libfoo_1.so through debug-file-directory" \
+    $debugdir
+
 # Setup a debuginfod server which can serve the original shared
 # library file.
 if {![allow_debuginfod_tests]} {
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index c958ff18d2a..e369b0be96a 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -8019,14 +8019,17 @@ proc get_build_id { filename } {
 
 # Return the build-id hex string (usually 160 bits as 40 hex characters)
 # converted to the form: .build-id/ab/cdef1234...89.debug
+#
+# The '.debug' suffix can be changed by passing the SUFFIX argument.
+#
 # Return "" if no build-id found.
-proc build_id_debug_filename_get { filename } {
+proc build_id_debug_filename_get { filename {suffix ".debug"} } {
     set data [get_build_id $filename]
     if { $data == "" } {
 	return ""
     }
     regsub {^..} $data {\0/} data
-    return ".build-id/${data}.debug"
+    return ".build-id/${data}${suffix}"
 }
 
 # DEST should be a file compiled with debug information.  This proc
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCHv2 3/4] gdb: improve shared library build-id check for core-files
  2024-05-20 13:08   ` [PATCHv2 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
@ 2024-05-21 19:00     ` Andrew Burgess
  0 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-05-21 19:00 UTC (permalink / raw)
  To: gdb-patches

Andrew Burgess <aburgess@redhat.com> writes:

> +
> +# Helper proc to load global BINFILE and then load global COREFILE.
> +#
> +# If EXPECT_WARNING is true then we require a warning about being
> +# unable to load the shared library symbols, otherwise, EXPECT_WARNING
> +# is false and we require no warning.
> +#
> +# If EXPECT_DOWNLOAD is true then we require a line indicating that
> +# the shared library is being downloaded from debuginfod, otherwise
> +# the shared library should not be downloaded.
> +proc load_exec_and_core_file { expect_warning expect_download testname } {
> +    with_test_prefix $testname {
> +	clean_restart $::binfile
> +
> +	set saw_warning false
> +	set saw_download false
> +	set saw_generated false
> +	set saw_terminated false
> +
> +	gdb_test_multiple "core-file $::corefile" "load core file" {
> +	    -re "^Core was generated by \[^\r\n\]+\r\n" {
> +		set saw_generated true
> +		exp_continue
> +	    }
> +	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
> +		set saw_terminated true
> +		exp_continue
> +	    }
> +	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing\r\n" {
> +		# Ignore warnings from the file backed mapping phase.
> +		exp_continue
> +	    }
> +	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\nDo you need \"set solib-search-path\" or \"set sysroot\".\r\n" {
> +		set saw_warning true
> +		exp_continue
> +	    }

This line still causes a failure when testing with READ1.  The fix is to
just drop the:

  Do you need \"set solib-search-path\" or \"set sysroot\".\r\n

part from this regexp.  I've made this change locally.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv3 0/4] More build-id checking when opening core files
  2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
                     ` (3 preceding siblings ...)
  2024-05-20 13:08   ` [PATCHv2 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
@ 2024-06-10 21:49   ` Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
                       ` (3 more replies)
  4 siblings, 4 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-06-10 21:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In v3:

  - I started working on the next set of patches in this series, not
    ready for posting just yet, but they will eventually apply on top
    of this series.  While doing that I made some tweaks to patch #3
    in this series, nothing huge, just moved some code around to make
    it more sharable later on, the underlying logic remains the same.

  - Patches #1, #2, and #4 are unchanged.

  - Rebased and retested.

In v2:

  - Failures were reported in the new tests from Linaro CI testing.
    Nothing serious; Linaro was seeing some additional output that the
    regexp I'd written were not accounting for.  I've revised the new
    tests to try and not be too strict, hopefully this should address
    the issues that were reported.

  - I've cleaned up some build vs host mistakes I think I made in some
    remote_exec calls.

  - And I've run the tests agsint 'make check-all-boards' and resolved
    some issues that cropped up.

---

Andrew Burgess (4):
  gdb/corefile: remove unavailable target sections
  gdb/corefile: improve file backed mapping handling
  gdb: improve shared library build-id check for core-files
  gdb: unify build-id to objfile lookup code

 gdb/build-id.c                                |  42 +-
 gdb/build-id.h                                |  21 +-
 gdb/corelow.c                                 | 680 +++++++++++++++---
 gdb/gdbcore.h                                 |  66 ++
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 123 ++--
 gdb/solib.h                                   |   8 +-
 gdb/solist.h                                  |  22 +
 gdb/testsuite/gdb.base/corefile.exp           |  39 +
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 +
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 +
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 ++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 380 ++++++++++
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 +
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 290 ++++++++
 gdb/testsuite/lib/gdb.exp                     |   7 +-
 23 files changed, 1704 insertions(+), 191 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp


base-commit: de4edfcb949c0d04ec8b49fdfe06267f92618589
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv3 1/4] gdb/corefile: remove unavailable target sections
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
@ 2024-06-10 21:49     ` Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-06-10 21:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file the bfd library processes the core file and
creates sections within the bfd object to represent each of the
segments within the core file.

GDB then creates two target_section lists, m_core_section_table and
m_core_file_mappings.  These, as well as m_core_unavailable_mappings,
are used by GDB to implement core_target::xfer_partial, that is, to
allow reading from inferior memory.

The m_core_section_table list represents sections within the core file
itself.  The sections in this list can be split into two groups based
on whether the section has the SEC_HAS_CONTENTS flag set or not.

Sections that have SEC_HAS_CONTENTS have their contents copied into
the core file while sections that do not have the SEC_HAS_CONTENTS
flag will not have their contents copied into the core file when it is
created.

The m_core_file_mappings list is created when GDB parses the mapped
file list in the core file.  Every mapped file will also be covered by
entries in the m_core_section_table list, however, when parts of a
file are mapped read-only then the corresponding entry in
m_core_section_table will not have the SEC_HAS_CONTENTS flag set.  If
parts of a file are mapped read/write then the corresponding
m_core_section_table entry will have the SEC_HAS_CONTENTS flag, and
the corresponding (possibly modified contents) will be copied into the
core file.

However, GDB only creates entries in m_core_file_mappings if it is
able to find the correct on-disk file to open.  If the corresponding
file can't be found then an entry is added to
m_core_unavailable_mappings.  An attempt to read from a region covered
by this list will be redirected to the lower file target in the hope
that a read-only section from the executable (supplied by the user)
might satisfy the read.

And so, in core_target::xfer_partial, the order in which the various
lists are checked to satisfy a read are:

  1. Check m_core_section_table for sections with SEC_HAS_CONTENTS
  flag, this reads content from the core file bfd directly,

  2. Check m_core_file_mappings, this will read content from mapped
  files assuming that GDB was able to find the file on-disk,

  3. Check m_core_unavailable_mappings and redirect to the file
  target, this allows the user supplied executable to satisfy some
  reads,

  4. Check m_core_section_table for sections without SEC_HAS_CONTENTS
  flag, this returns zero for these regions.

If a file is recorded as mapped in the core file, but the file was
mapped read only, then the matching entry in m_core_section_table will
not have the SEC_HAS_CONTENTS flag set.

If GDB can find the file on disk then an entry is added to
m_core_file_mappings, and reads from the region covered by the file
will be satisfied by #2 in the list above.

However, if GDB can't find the file then an entry is added to the
m_core_unavailable_mappings list.  An attempt will be made to satisfy
the read at stage #3 in the list above, however, if this file is not
the executable, then stage #3 will not satisfy the read.

We then fall through to stage #4, the section does not have the
SEC_HAS_CONTENTS flag, and so we end up returning zero.

In this case, I think this is the wrong thing to do.  If the core file
tells us that a file is mapped in, but GDB can't find the file on
disk, then we should instead return a failure when trying to read from
the region covered by the file, anything else is just misleading.

And so, I propose that in core_target::core_target() when the core
file is initially opened and m_core_section_table is setup, we should
remove entries that correspond to m_core_unavailable_mappings entries.

This means that an attempt to read from a mapped file, when GDB is not
able to find the on-disk file, and the contents of the mapped file
were not copied into the core file, then the read will fail.  This
feels like the only sensible option.

I've extended an existing test to exercise this case.  The test maps a
single file twice, once in read-only mode, and once in read-write
mode.  We then remove the on-disk file and load the core file in GDB.
The mapping which was read-write is still accessible (the content of
this region is found in the core file bfd object), while the read-only
mapping is no longer accessible.
---
 gdb/corelow.c                       | 58 +++++++++++++++++++++++++++++
 gdb/testsuite/gdb.base/corefile.exp | 39 +++++++++++++++++++
 2 files changed, 97 insertions(+)

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 49da9be07ef..d26955b1c08 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -167,6 +167,11 @@ class core_target final : public process_stratum_target
 						    ULONGEST len,
 						    ULONGEST *xfered_len);
 
+  /* Remove from m_core_section_table any entries that are covered by an
+     unavailable mapping.  This will prevent xfer_partial from returning
+     zero for unavailable mappings.  */
+  void filter_unavailable_mappings_from_section_table ();
+
   /* FIXME: kettenis/20031023: Eventually this field should
      disappear.  */
   struct gdbarch *m_core_gdbarch = NULL;
@@ -199,6 +204,59 @@ core_target::core_target ()
   m_core_section_table = build_section_table (current_program_space->core_bfd ());
 
   build_file_mappings ();
+
+  if (!m_core_unavailable_mappings.empty ())
+    filter_unavailable_mappings_from_section_table ();
+}
+
+/* See comment in class declaration.  */
+
+void
+core_target::filter_unavailable_mappings_from_section_table ()
+{
+  auto filter_cb = [&] (const target_section &tsect) -> bool
+  {
+    /* If TSECT has contents then this means that there is content to be
+       read within the core file bfd itself.  If TSECT does not have
+       content then we are relying on reading the content from some other
+       file that was mapped into the inferior at the time the core file was
+       created, but if GDB couldn't find that file then we are not able to
+       read the content and TSECT should be discarded.  */
+    if ((tsect.the_bfd_section->flags & SEC_HAS_CONTENTS) != 0)
+      return false;
+
+    /* Find the first entry in m_core_unavailable_mappings whose start
+       address is not less than or equal to the start of this target
+       section.  The previous unavailable mapping might cover TSECT.  */
+    auto it = std::lower_bound (m_core_unavailable_mappings.begin (),
+				m_core_unavailable_mappings.end (),
+				tsect.addr,
+				[] (const mem_range &mr,
+				    const CORE_ADDR &addr)
+				{
+				  return mr.start <= addr;
+				});
+
+    /* If there is a previous unavailable mapping, then that mapping's
+       start address is less than, or equal to the start of TSECT.  The
+       unavailable mapping might cover TSECT.  If it does then we will
+       delete TSECT.  */
+    if (it != m_core_unavailable_mappings.begin ())
+      {
+	--it;
+
+	if (it->start <= tsect.addr
+	    && (it->start + it->length) >= tsect.endaddr)
+	  return true;
+      }
+
+    return false;
+  };
+
+  auto end = std::remove_if (m_core_section_table.begin (),
+			     m_core_section_table.end (),
+			     filter_cb);
+  m_core_section_table.erase (end, m_core_section_table.end ());
 }
 
 /* Construct the table for file-backed mappings if they exist.
diff --git a/gdb/testsuite/gdb.base/corefile.exp b/gdb/testsuite/gdb.base/corefile.exp
index 28b723e1f85..ee152e629ab 100644
--- a/gdb/testsuite/gdb.base/corefile.exp
+++ b/gdb/testsuite/gdb.base/corefile.exp
@@ -199,6 +199,45 @@ gdb_test "up" "#\[0-9\]* *(\[0-9xa-fH'\]* in)? .* \\(.*\\).*" "up (reinit)"
 
 gdb_test "core" "No core file now."
 
+# Temporarily move coremmap.data out of the way and reload the core
+# file.  We should still be able to read buf2 as the contents of this
+# are written into the core file.  In contrast buf2ro should no longer
+# be readable as the contents of this region are not within the core
+# file, GDB relies on reading this from the coremmap.data file, which
+# can no longer be found.
+set coremmap_data_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data]
+set coremmap_data_backup_filename \
+    [standard_output_file coredir.[getpid]/coremmap.data.backup]
+remote_exec host "mv ${coremmap_data_filename} \
+		  ${coremmap_data_backup_filename}"
+
+clean_restart $binfile
+
+# Load the core file and check we get a warning about the
+# coremmap.data file being missing.
+gdb_test_multiple "core-file $corefile" "warn about coremmap.data missing" {
+    -re -wrap "warning: Can't open file \[^\r\n\]+/coremmap.data during file-backed mapping note processing\r\n.*" {
+	pass $gdb_test_name
+    }
+}
+
+# This xfail was just copied from earlier in the script where we also
+# read from buf2.
+setup_xfail "*-*-sunos*" "*-*-aix*"
+gdb_test "x/8bd buf2" \
+    ".*:.*0.*1.*2.*3.*4.*5.*6.*7.*" \
+    "accessing mmapped data in core file with coremmap.data removed"
+
+gdb_test "x/8bd buf2ro" \
+    "$hex\[^:\]*:\\s+Cannot access memory at address $hex" \
+    "accessing read-only mmapped data in core file with coremmap.data removed"
+
+# Restore the coremmap.data file so later tests don't give warnings
+# when the core file is reloaded.
+remote_exec host "mv ${coremmap_data_backup_filename} \
+		  ${coremmap_data_filename}"
+
 # Test that we can unload the core with the "detach" command.
 
 proc_with_prefix corefile_detach {} {
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv3 2/4] gdb/corefile: improve file backed mapping handling
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
@ 2024-06-10 21:49     ` Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
  3 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-06-10 21:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit improves how GDB handles file backed mappings within a
core file, specifically, this is a restructuring of the function
core_target::build_file_mapping.

The primary motivation for this commit was to put in place the
infrastructure to support the next commit in this series, but this
commit does itself make some improvements.

Currently in core_target::build_file_mapping we use
gdbarch_read_core_file_mappings to iterate over the mapped regions
within a core file.

For each region a callback is invoked which is passed details of the
mapping; the file the mapping is from, the offset into the file, and
the address range at which the mapping exists.  We are also passed the
build-id for the mapped file in some cases.

We are only told the build-id for the mapped region which actually
contains the ELF header of the mapped file.  Other regions of the same
mapped ELF will not have the build-id passed to the callback.

Within core_target::build_file_mapping, in the per-region callback, we
try to find the mapped file based on its filename.  If the file can't
be found, and if we have a build-id then we'll ask debuginfod to
download the file.

However we find the file, we cache the opened bfd object, which is
good.  Subsequent mappings from the same file will not have a build-id
set, but by that point we already have a cached open bfd object, so
the lack of build-id is irrelevant.

The problem with the above is that if we find a matching file based on
the filename, then we accept that file, even if we have a build-id,
and the build-id doesn't match.

Currently, the mapped region processing is done in a single pass, we
call gdbarch_read_core_file_mappings, and for each mapping, as we see
it, we create the data structures needed to represent that mapping.

In this commit, I will change this to a two phase process.  In the
first phase the mappings are grouped together based on the name of the
mapped file.  At the end of phase one we have a 'struct mapped_file',
a new struct, for each mapped file.  This struct associates an
optional build-id with a list of mapped regions.

In the second phase we try to find the file using its filename.  If
the file is found, and the 'struct mapped_file' has a build-id, then
we'll compare the build-id with the file we found.  This allows us to
reject on-disk files which have changed since the core file was
created.

If no suitable file was found (either no file found, or a build-id
mismatch) then we can use debuginfod to potentially download a
suitable file.

  NOTE: In the future we could potentially add additional sanity
  checks here, for example, if a data-file is mapped, and has no
  build-id, we can estimate a minimum file size based on the expected
  mappings.  If the file we find is not big enough then we can reject
  the on-disk file.  But I don't know how useful this would actually
  be, so I've not done that for now.

Having found (or not) a suitable file then we can create the data
structures for each mapped region just as we did before.

The new functionality here is the extra build-id check, and the
possibility of rejecting an on-disk file if the build-id doesn't
match.

This change could have been done within the existing single phase
approach I think, however, in the next approach I need to have all the
mapped regions associated with the expected build-id, and the new two
phase structure allows me to do that, this is the reason for such an
extensive rewrite in this commit.

There's a new test that exercises GDB's ability to find mapped files
via the build-id, and this downloading from debuginfod.
---
 gdb/corelow.c                                 | 274 ++++++++++----
 .../gdb.debuginfod/corefile-mapped-file-1.c   |  24 ++
 .../gdb.debuginfod/corefile-mapped-file-2.c   |  22 ++
 .../gdb.debuginfod/corefile-mapped-file-3.c   |  45 +++
 .../gdb.debuginfod/corefile-mapped-file.exp   | 356 ++++++++++++++++++
 5 files changed, 648 insertions(+), 73 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index d26955b1c08..9182a84b1c0 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -54,6 +54,7 @@
 #include "cli/cli-cmds.h"
 #include "xml-tdesc.h"
 #include "memtag.h"
+#include "cli/cli-style.h"
 
 #ifndef O_LARGEFILE
 #define O_LARGEFILE 0
@@ -278,6 +279,48 @@ core_target::build_file_mappings ()
   std::unordered_map<std::string, struct bfd *> bfd_map;
   std::unordered_set<std::string> unavailable_paths;
 
+  /* Type holding information about a single file mapped into the inferior
+     at the point when the core file was created.  Associates a build-id
+     with the list of regions the file is mapped into.  */
+  struct mapped_file
+  {
+    /* Type for a region of a file that was mapped into the inferior when
+       the core file was generated.  */
+    struct region
+    {
+      /* Constructor.   See member variables for argument descriptions.  */
+      region (CORE_ADDR start_, CORE_ADDR end_, CORE_ADDR file_ofs_)
+	: start (start_),
+	  end (end_),
+	  file_ofs (file_ofs_)
+      { /* Nothing.  */ }
+
+      /* The inferior address for the start of the mapped region.  */
+      CORE_ADDR start;
+
+      /* The inferior address immediately after the mapped region.  */
+      CORE_ADDR end;
+
+      /* The offset within the mapped file for this content.  */
+      CORE_ADDR file_ofs;
+    };
+
+    /* If not nullptr, then this is the build-id associated with this
+       file.  */
+    const bfd_build_id *build_id = nullptr;
+
+    /* If true then we have seen multiple different build-ids associated
+       with the same filename.  The build_id field will have been set back
+       to nullptr, and we should not set build_id in future.  */
+    bool ignore_build_id_p = false;
+
+    /* All the mapped regions of this file.  */
+    std::vector<region> regions;
+  };
+
+  /* All files mapped into the core file.  The key is the filename.  */
+  std::unordered_map<std::string, mapped_file> mapped_files;
+
   /* See linux_read_core_file_mappings() in linux-tdep.c for an example
      read_core_file_mappings method.  */
   gdbarch_read_core_file_mappings (m_core_gdbarch,
@@ -298,87 +341,172 @@ core_target::build_file_mappings ()
 	   weed out non-file-backed mappings.  */
 	gdb_assert (filename != nullptr);
 
-	if (unavailable_paths.find (filename) != unavailable_paths.end ())
+	/* Add this mapped region to the data for FILENAME.  */
+	mapped_file &file_data = mapped_files[filename];
+	file_data.regions.emplace_back (start, end, file_ofs);
+	if (build_id != nullptr && !file_data.ignore_build_id_p)
 	  {
-	    /* We have already seen some mapping for FILENAME but failed to
-	       find/open the file.  There is no point in trying the same
-	       thing again so just record that the range [start, end) is
-	       unavailable.  */
-	    m_core_unavailable_mappings.emplace_back (start, end - start);
-	    return;
-	  }
-
-	struct bfd *bfd = bfd_map[filename];
-	if (bfd == nullptr)
-	  {
-	    /* Use exec_file_find() to do sysroot expansion.  It'll
-	       also strip the potential sysroot "target:" prefix.  If
-	       there is no sysroot, an equivalent (possibly more
-	       canonical) pathname will be provided.  */
-	    gdb::unique_xmalloc_ptr<char> expanded_fname
-	      = exec_file_find (filename, NULL);
-
-	    if (expanded_fname == nullptr && build_id != nullptr)
-	      debuginfod_exec_query (build_id->data, build_id->size,
-				     filename, &expanded_fname);
-
-	    if (expanded_fname == nullptr)
+	    if (file_data.build_id == nullptr)
+	      file_data.build_id = build_id;
+	    else if (!build_id_equal (build_id, file_data.build_id))
 	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s during file-backed mapping "
-			   "note processing"),
-			 filename);
-		return;
+		warning (_("Multiple build-ids found for %ps"),
+			 styled_string (file_name_style.style (), filename));
+		file_data.build_id = nullptr;
+		file_data.ignore_build_id_p = true;
 	      }
+	  }
+      });
 
-	    bfd = bfd_openr (expanded_fname.get (), "binary");
+  for (const auto &iter : mapped_files)
+    {
+      const std::string &filename = iter.first;
+      const mapped_file &file_data = iter.second;
+
+      /* Use exec_file_find() to do sysroot expansion.  It'll
+	 also strip the potential sysroot "target:" prefix.  If
+	 there is no sysroot, an equivalent (possibly more
+	 canonical) pathname will be provided.  */
+      gdb::unique_xmalloc_ptr<char> expanded_fname
+	= exec_file_find (filename.c_str (), nullptr);
+
+      bool build_id_mismatch = false;
+      if (expanded_fname != nullptr && file_data.build_id != nullptr)
+	{
+	  /* We temporarily open the bfd as a structured target, this
+	     allows us to read the build-id from the bfd if there is one.
+	     For this task it's OK if we reuse an already open bfd object,
+	     so we make this call through GDB's bfd cache.  Once we've
+	     checked the build-id (if there is one) we'll drop this
+	     reference and re-open the bfd using the "binary" target.  */
+	  gdb_bfd_ref_ptr tmp_bfd
+	    = gdb_bfd_open (expanded_fname.get (), gnutarget);
+
+	  if (tmp_bfd != nullptr
+	      && bfd_check_format (tmp_bfd.get (), bfd_object)
+	      && build_id_bfd_get (tmp_bfd.get ()) != nullptr)
+	    {
+	      /* The newly opened TMP_BFD has a build-id, and this mapped
+		 file has a build-id extracted from the core-file.  Check
+		 the build-id's match, and if not, reject TMP_BFD.  */
+	      const struct bfd_build_id *found
+		= build_id_bfd_get (tmp_bfd.get ());
+	      if (!build_id_equal (found, file_data.build_id))
+		build_id_mismatch = true;
+	    }
+	}
 
-	    if (bfd == nullptr || !bfd_check_format (bfd, bfd_object))
-	      {
-		m_core_unavailable_mappings.emplace_back (start, end - start);
-		unavailable_paths.insert (filename);
-		warning (_("Can't open file %s which was expanded to %s "
+      gdb_bfd_ref_ptr abfd;
+      if (expanded_fname != nullptr && !build_id_mismatch)
+	{
+	  struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	  abfd = gdb_bfd_ref_ptr::new_reference (b);
+	}
+
+      if ((expanded_fname == nullptr
+	   || abfd == nullptr
+	   || !bfd_check_format (abfd.get (), bfd_object))
+	  && file_data.build_id != nullptr)
+	{
+	  expanded_fname = nullptr;
+	  debuginfod_exec_query (file_data.build_id->data,
+				 file_data.build_id->size,
+				 filename.c_str (), &expanded_fname);
+	  if (expanded_fname != nullptr)
+	    {
+	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      abfd = gdb_bfd_ref_ptr::new_reference (b);
+	    }
+	}
+
+      if (expanded_fname == nullptr
+	  || abfd == nullptr
+	  || !bfd_check_format (abfd.get (), bfd_object))
+	{
+	  /* If ABFD was opened, but the wrong format, close it now.  */
+	  abfd = nullptr;
+
+	  /* Record all regions for this file as unavailable.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    m_core_unavailable_mappings.emplace_back (region.start,
+						      region.end
+						      - region.start);
+
+	  /* And give the user an appropriate warning.  */
+	  if (build_id_mismatch)
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("File %ps doesn't match build-id from core-file "
+			   "during file-backed mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("File %ps which was expanded to %ps, doesn't match "
+			   "build-id from core-file during file-backed "
+			   "mapping processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	  else
+	    {
+	      if (expanded_fname == nullptr
+		  || filename == expanded_fname.get ())
+		warning (_("Can't open file %ps during file-backed mapping "
+			   "note processing"),
+			 styled_string (file_name_style.style (),
+					filename.c_str ()));
+	      else
+		warning (_("Can't open file %ps which was expanded to %ps "
 			   "during file-backed mapping note processing"),
-			 filename, expanded_fname.get ());
+			 styled_string (file_name_style.style (),
+					filename.c_str ()),
+			 styled_string (file_name_style.style (),
+					expanded_fname.get ()));
+	    }
+	}
+      else
+	{
+	  /* Ensure that the bfd will be closed when core_bfd is closed.
+	     This can be checked before/after a core file detach via "maint
+	     info bfds".  */
+	  gdb_bfd_record_inclusion (current_program_space->core_bfd (),
+				    abfd.get ());
+
+	  /* Create sections for each mapped region.  */
+	  for (const mapped_file::region &region : file_data.regions)
+	    {
+	      /* Make new BFD section.  All sections have the same name,
+		 which is permitted by bfd_make_section_anyway().  */
+	      asection *sec = bfd_make_section_anyway (abfd.get (), "load");
+	      if (sec == nullptr)
+		error (_("Can't make section"));
+	      sec->filepos = region.file_ofs;
+	      bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
+	      bfd_set_section_size (sec, region.end - region.start);
+	      bfd_set_section_vma (sec, region.start);
+	      bfd_set_section_lma (sec, region.start);
+	      bfd_set_section_alignment (sec, 2);
+
+	      /* Set target_section fields.  */
+	      m_core_file_mappings.emplace_back (region.start, region.end, sec);
+	    }
+	}
 
-		if (bfd != nullptr)
-		  bfd_close (bfd);
-		return;
-	      }
-	    /* Ensure that the bfd will be closed when core_bfd is closed. 
-	       This can be checked before/after a core file detach via
-	       "maint info bfds".  */
-	    gdb_bfd_record_inclusion (current_program_space->core_bfd (), bfd);
-	    bfd_map[filename] = bfd;
-	  }
+      /* If this is a bfd of a shared library, record its soname and
+	 build-id.  */
+      if (file_data.build_id != nullptr && abfd != nullptr)
+	{
+	  gdb::unique_xmalloc_ptr<char> soname
+	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
 
-	/* Make new BFD section.  All sections have the same name,
-	   which is permitted by bfd_make_section_anyway().  */
-	asection *sec = bfd_make_section_anyway (bfd, "load");
-	if (sec == nullptr)
-	  error (_("Can't make section"));
-	sec->filepos = file_ofs;
-	bfd_set_section_flags (sec, SEC_READONLY | SEC_HAS_CONTENTS);
-	bfd_set_section_size (sec, end - start);
-	bfd_set_section_vma (sec, start);
-	bfd_set_section_lma (sec, start);
-	bfd_set_section_alignment (sec, 2);
-
-	/* Set target_section fields.  */
-	m_core_file_mappings.emplace_back (start, end, sec);
-
-	/* If this is a bfd of a shared library, record its soname
-	   and build id.  */
-	if (build_id != nullptr)
-	  {
-	    gdb::unique_xmalloc_ptr<char> soname
-	      = gdb_bfd_read_elf_soname (bfd->filename);
-	    if (soname != nullptr)
-	      set_cbfd_soname_build_id (current_program_space->cbfd,
-					soname.get (), build_id);
-	  }
-      });
+	  if (soname != nullptr)
+	    set_cbfd_soname_build_id (current_program_space->cbfd,
+				      soname.get (), file_data.build_id);
+	}
+    }
 
   normalize_mem_ranges (&m_core_unavailable_mappings);
 }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
new file mode 100644
index 00000000000..dd6b3f105f2
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-1.c
@@ -0,0 +1,24 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+int
+main (void)
+{
+  int res = foo ();
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
new file mode 100644
index 00000000000..ccf35b75ef8
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-2.c
@@ -0,0 +1,22 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+volatile int *library_ptr = (int *) POINTER_VALUE;
+
+int
+foo (void)
+{
+  return *library_ptr;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
new file mode 100644
index 00000000000..45d78336930
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file-3.c
@@ -0,0 +1,45 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+volatile void* library_base_address = 0;
+volatile int *ptr = 0;
+
+int
+main ()
+{
+  struct stat buf;
+  int res;
+
+  int fd = open (SHLIB_FILENAME, O_RDONLY);
+  assert (fd != -1);
+
+  res = fstat (fd, &buf);
+  assert (res == 0);
+
+  library_base_address
+    = mmap (NULL, buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+
+  res = *ptr;	/* Undefined behaviour here.  */
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
new file mode 100644
index 00000000000..6e3301e1c8d
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -0,0 +1,356 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test checks GDB's ability to use build-ids when setting up file backed
+# mappings as part of reading a core-file.
+#
+# A core-file contains a list of the files that were mapped into the process
+# at the time of the core-file creation.  If the file was mapped read-only
+# then the file contents will not be present in the core-file, but instead GDB
+# is expected to open the mapped file and read the contents from there if
+# needed.  And this is what GDB does.
+#
+# GDB (via the BFD library) will also spot if a mapped looks like a valid ELF
+# and contains a build-id, this build-id is passed back to GDB so that GDB can
+# validate the on-disk file it finds matches the file that was mapped when the
+# core-file was created.
+#
+# In addition, if the on-disk file is found to have a non-matching build-id
+# then GDB can use debuginfod to (try) and download a suitable file.
+#
+# This test is about checking that this file backed mapping mechanism works
+# correctly; that GDB will spot when the build-ids fail to match and will
+# refuse to load an incorrect file.  Additionally we check that the correct
+# file can be downloaded from debuginfod.
+#
+# The test is rather contrived though.  Instead of relying on having a shared
+# library mapped at the time of crash we mmap a shared library into the
+# process and then check this mapping within the test.
+#
+# The problem with using a normal shared library load for this test is that
+# the shared library list is processed as part of a separate step when opening
+# the core file.  Right now this separate step doesn't check the build-ids
+# correctly, so GDB will potentially open the wrong shared library file.  The
+# sections of this incorrect shared library are then added to GDB's list of
+# target sections, and are used to satisfy memory reads, which can give the
+# wrong results.
+#
+# This obviously needs fixing, but is a separate problem from the one being
+# tested here, so this test deliberately checks the mapping using a file that
+# is mmaped rather than loaded as a shared library, as such the file is in the
+# core-files list of mapped files, but is not in the shared library list.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+require {!is_remote host}
+require {!is_remote target}
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+
+standard_testfile -1.c -2.c -3.c
+
+# Compile an executable that loads the shared library as an actual
+# shared library, then use GDB to figure out the offset of the
+# variable 'library_ptr' within the library.
+set library_filename [standard_output_file "libfoo.so"]
+set binfile2 [standard_output_file "library_loader"]
+
+if {[prepare_for_testing_full "build exec which loads the shared library" \
+	 [list $library_filename \
+	      { debug shlib build-id \
+		    additional_flags=-DPOINTER_VALUE=0x12345678 } \
+	      $srcfile2 {}] \
+	 [list $binfile2 [list debug shlib=$library_filename ] \
+	      $srcfile { debug }]] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+if { [is_address_zero_readable] } {
+    return
+}
+
+set ptr_address [get_hexadecimal_valueof "&library_ptr" "unknown"]
+
+set ptr_offset "unknown"
+gdb_test_multiple "info proc mappings" "" {
+    -re "^\\s+($hex)\\s+($hex)\\s+$hex\\s+($hex)\[^\r\n\]+$library_filename\r\n" {
+	set low_addr $expect_out(1,string)
+	set high_addr $expect_out(2,string)
+	set file_offset $expect_out(3,string)
+
+	if {[expr $ptr_address >= $low_addr] && [expr $ptr_address < $high_addr]} {
+	    set mapping_offset [expr $ptr_address - $low_addr]
+	    set ptr_offset [format 0x%x [expr $file_offset + $mapping_offset]]
+	}
+
+	exp_continue
+    }
+
+    -re "^$gdb_prompt $" {
+    }
+
+    -re "(^\[^\r\n\]*)\r\n" {
+	set tmp $expect_out(1,string)
+	exp_continue
+    }
+}
+
+gdb_assert { $ptr_offset ne "unknown" } \
+    "found pointer offset"
+
+set ptr_size [get_integer_valueof "sizeof (library_ptr)" "unknown"]
+set ptr_format_char ""
+if { $ptr_size == 2 } {
+    set ptr_format_char "b"
+} elseif { $ptr_size == 4 } {
+    set ptr_format_char "w"
+} elseif { $ptr_size == 8 } {
+    set ptr_format_char "g"
+}
+if { $ptr_format_char eq "" } {
+    untested "could not figure out size of library_ptr variable"
+    return
+}
+
+# Helper proc to read a value from inferior memory.  Reads at address held in
+# global PTR_ADDRESS, and use PTR_FORMAT_CHAR for the size of the read.
+proc read_ptr_value { } {
+    set value ""
+    gdb_test_multiple "x/1${::ptr_format_char}x ${::ptr_address}" "" {
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+($::hex)" {
+	    set value $expect_out(1,string)
+	}
+	-re -wrap "^${::hex}(?:\\s+<\[^>\]+>)?:\\s+Cannot access memory at address ${::hex}" {
+	    set value "unavailable"
+	}
+    }
+    return $value
+}
+
+set ptr_expected_value [read_ptr_value]
+if { $ptr_expected_value eq "" } {
+    untested "could not find expected value for library_ptr"
+    return
+}
+
+# Now compile a second executable.  This one doesn't load the shared
+# library as an actual shared library, but instead mmaps the library
+# into the executable.
+#
+# Load this executable within GDB and confirm that we can use the
+# offset we calculated previously to view the value of 'library_ptr'.
+set opts [list debug additional_flags=-DSHLIB_FILENAME=\"$library_filename\"]
+if {[prepare_for_testing "prepare second executable" $binfile \
+	 $srcfile3 $opts] != 0} {
+    return
+}
+
+if {![runto_main]} {
+    return
+}
+
+gdb_breakpoint [gdb_get_line_number "Undefined behaviour here" $srcfile3]
+gdb_continue_to_breakpoint "run to breakpoint"
+
+set library_base_address \
+    [get_hexadecimal_valueof "library_base_address" "unknown"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable"
+
+# Now rerun the second executable outside of GDB.  The executable should crash
+# and generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Load a core file from the global COREFILE.  Use TESTNAME as the name
+# of the test.
+#
+# If LINE_RE is not the empty string then this is a regexp for a line
+# that we expect to see in the output when loading the core file, if
+# the line is not present then this test will fail.
+#
+# Any lines beginning with 'warning: ' will cause this test to fail.
+#
+# A couple of other standard lines that are produced when loading a
+# core file are also checked for, just to make sure the core file
+# loading has progressed as expected.
+proc load_core_file { testname { line_re "" } } {
+    set code {}
+
+    if { $line_re ne "" } {
+	append code {
+	    -re "^$line_re\r\n" {
+		set saw_expected_line true
+		exp_continue
+	    }
+	}
+	set saw_expected_line false
+    } else {
+	set saw_expected_line true
+    }
+
+    set saw_unknown_warning false
+    set saw_generated_by_line false
+    set saw_prog_terminated_line false
+
+    append code {
+	-re "^warning: \[^\r\n\]+\r\n" {
+	    set saw_unknown_warning true
+	    exp_continue
+	}
+
+	-re "^Core was generated by \[^\r\n\]+\r\n" {
+	    set saw_generated_by_line true
+	    exp_continue
+	}
+
+	-re "^Program terminated with signal SIGSEGV, Segmentation fault\\.\r\n" {
+	    set saw_prog_terminated_line true
+	    exp_continue
+	}
+
+	-re "^$::gdb_prompt $" {
+	    gdb_assert {$saw_generated_by_line \
+			    && $saw_prog_terminated_line \
+			    && $saw_expected_line \
+			    && !$saw_unknown_warning} \
+		$gdb_test_name
+	}
+
+	-re "^\[^\r\n\]*\r\n" {
+	    exp_continue
+	}
+    }
+
+    set res [catch { return [gdb_test_multiple "core-file $::corefile" \
+				 "$testname" $code] } string]
+
+    if {$res == 1} {
+	global errorInfo errorCode
+	return -code error -errorinfo $errorInfo -errorcode $errorCode $string
+    } elseif {$res == 2} {
+	return $string
+    } else {
+	# We expect RES to be 2 (TCL_RETURN) or 1 (TCL_ERROR).  If we get
+	# here then somehow the 'catch' above finished without hitting
+	# either of those cases, which is .... weird.
+	perror "unexepcted return value, code = $res, value = $string"
+	return -1
+    }
+}
+
+# And now restart GDB, load the core-file and check that the library shows as
+# being mapped in, and that we can still read the library_ptr value from
+# memory.
+clean_restart $binfile
+
+load_core_file "load core file"
+
+set library_base_address [get_hexadecimal_valueof "library_base_address" \
+			      "unknown" "get library_base_address in core-file"]
+set ptr_address [format 0x%x [expr $library_base_address + $ptr_offset]]
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file"
+
+# Now move the shared library file away and restart GDB.  This time when we
+# load the core-file we should see a warning that GDB has failed to map in the
+# library file.  An attempt to read the variable from the library file should
+# fail / give a warning.
+set library_backup_filename [standard_output_file "libfoo.so.backup"]
+remote_exec build "mv \"$library_filename\" \"$library_backup_filename\""
+
+clean_restart $binfile
+
+load_core_file "load corefile with library file missing" \
+    "warning: Can't open file [string_to_regexp $library_filename] during file-backed mapping note processing"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with library file missing"
+
+# Build a new version of the shared library, keep the library the same size,
+# but change the contents so the build-id changes.  Then restart GDB and load
+# the core-file again.  GDB should spot that the build-id for the shared
+# library is not as expected, and should refuse to map in the shared library.
+if {[build_executable "build second version of shared library" \
+	 $library_filename $srcfile2 \
+	 { debug shlib build-id \
+	       additional_flags=-DPOINTER_VALUE=0x11223344 }] != 0} {
+    return
+}
+
+clean_restart $binfile
+
+load_core_file "load corefile with wrong library in place" \
+    "warning: File [string_to_regexp $library_filename] doesn't match build-id from core-file during file-backed mapping processing"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value eq "unavailable" } \
+    "check value of pointer is unavailable with wrong library in place"
+
+# Setup a debuginfod server which can serve the original shared library file.
+# Then restart GDB and load the core-file.  GDB should download the original
+# shared library from debuginfod and use that to provide the file backed
+# mapping.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    clean_restart
+    gdb_test_no_output "set debuginfod enabled on" \
+	"enabled debuginfod for initial test"
+    gdb_load $binfile
+
+    load_core_file "load corefile, download library from debuginfod" \
+	"Downloading\[^\r\n\]* executable for [string_to_regexp $library_filename]\\.\\.\\."
+
+    set ptr_value [read_ptr_value]
+    gdb_assert { $ptr_value == $ptr_expected_value } \
+	"check value of pointer variable after downloading library file"
+}
+
+stop_debuginfod
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv3 3/4] gdb: improve shared library build-id check for core-files
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
@ 2024-06-10 21:49     ` Andrew Burgess
  2024-06-10 21:49     ` [PATCHv3 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
  3 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-06-10 21:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the libraries file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-ids extracted in core_target::build_file_mappings to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this I propose making the following changes to GDB:

(1) Rather than just recording the soname to build-id mapping in
set_cbfd_soname_build_id, we should also record, the full mapped
filename to build-id mapping, and also the memory ranges to build-id
mapping for every memory range covered by every mapped file.

(2) Add a new callback solib_ops::find_solib_addr.  This callback
takes a solib object and returns an (optional) address within the
inferior that is part of this library.  We can use this address to
find a mapped file using the stored memory ranges which will increase
the cases in which a match can be found.

(3) Move the mapped file record keeping out of solib.c and into
corelow.c.  Future commits will make use of this information from
other parts of GDB.  This information was never solib specific, it
lived in the solib.c file because that was the only user of the data,
but really, the data is all about the core file, and should be stored
in core_target, other parts of GDB can then query this data as needed.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, we also create a build-id to filename map.  This is
useful as often a shared library is implemented as a symbolic link to
the actual shared library file.  The mapped file information is stored
based on the actual, real file name, while the shared library
information holds the original symbolic link file name.

If when loading the shared library, we find the symbolic link has
disappeared, we can use the build-id to file name map to check if the
actual file is still around, if it is (and if the build-id matches)
then we can fall back to use that file.  This is another way in which
we can slightly increase the chances that GDB will find the required
files when loading a core file.

Adding all of the above required pretty much a full rewrite of the
existing set_cbfd_soname_build_id function and the corresponding
get_cbfd_soname_build_id function, so I have taken the opportunity to
move the information caching out of solib.c and into corelow.c where
it is now accessed through the function core_target_find_mapped_file.

At this point the benefit of this move is not entirely obvious, though
I don't think the new location is significantly worse than where it
was originally.  The benefit though is that the cached information is
no longer tied to the shared library loading code.

I already have a second set of patches (not in this series) that make
use of this caching from elsewhere in GDB.  I've not included those
patches in this series as this series is already pretty big, but even
if those follow up patches don't arrive, I think the new location is
just as good as the original location.

Rather that caching the information within the core file BFD via the
registry mechanism, the information used for the mapped file lookup is
now stored within the core_file target directly.
---
 gdb/corelow.c                                 | 334 +++++++++++++++++-
 gdb/gdbcore.h                                 |  66 ++++
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 128 +++----
 gdb/solib.h                                   |   8 +-
 gdb/solist.h                                  |  22 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 +++
 .../gdb.debuginfod/solib-with-soname.exp      | 260 ++++++++++++++
 15 files changed, 853 insertions(+), 91 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 9182a84b1c0..1294f78a982 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -60,6 +60,119 @@
 #define O_LARGEFILE 0
 #endif
 
+/* A mem_range and the build-id associated with the file mapped into the
+   given range.  */
+
+struct mem_range_and_build_id
+{
+  mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+    : range (r),
+      build_id (id)
+  { /* Nothing.  */ }
+
+  /* A range of memory addresses.  */
+  mem_range range;
+
+  /* The build-id of the file mapped into RANGE.  */
+  const bfd_build_id *build_id;
+};
+
+/* An instance of this class is created within the core_target and is used
+   to hold all the information that relating to mapped files, their address
+   ranges, and their corresponding build-ids.  */
+
+struct mapped_file_info
+{
+  /* See comment on function definition.  */
+
+  void add (const char *soname, const char *expected_filename,
+	    const char *actual_filename, std::vector<mem_range> &&ranges,
+	    const bfd_build_id *build_id);
+
+  /* See comment on function definition.  */
+
+  std::optional <core_target_mapped_file_info>
+  lookup (const char *filename, const std::optional<CORE_ADDR> &addr);
+
+private:
+
+  /* Helper for ::lookup.  BUILD_ID is a build-id that was found in
+     one of the data structures within this class.  Lookup the
+     corresponding filename in m_build_id_to_filename_map and return a pair
+     containing the build-id and filename.
+
+     If no corresponding filename is found in m_build_id_to_filename_map
+     then the returned pair contains BUILD_ID and an empty string.
+
+     If BUILD_ID is nullptr then the returned pair contains nullptr and an
+     empty string.  */
+
+  struct core_target_mapped_file_info
+  make_result (const bfd_build_id *build_id)
+  {
+    if (build_id != nullptr)
+      {
+      auto it = m_build_id_to_filename_map.find (build_id);
+      if (it != m_build_id_to_filename_map.end ())
+	return { build_id, it->second };
+    }
+
+    return { build_id, {} };
+  }
+
+  /* A type that maps a string to a build-id.  */
+  using string_to_build_id_map
+    = std::unordered_map<std::string, const bfd_build_id *>;
+
+  /* A type that maps a build-id to a string.  */
+  using build_id_to_string_map
+    = std::unordered_map<const bfd_build_id *, std::string>;
+
+  /* When loading a core file, the build-ids are extracted based on the
+     file backed mappings.  This map associates the name of a file that was
+     mapped into the core file with the corresponding build-id.  The
+     build-id pointers in this map will never be nullptr as we only record
+     files if they have a build-id.  */
+
+  string_to_build_id_map m_filename_to_build_id_map;
+
+  /* Map a build-id pointer back to the name of the file that was mapped
+     into the inferior's address space.  If we lookup a matching build-id
+     using either a soname or an address then this map allows us to also
+     provide a full path to a file with a matching build-id.  */
+
+  build_id_to_string_map m_build_id_to_filename_map;
+
+  /* If the file that was mapped into the core file was a shared library
+     then it might have a DT_SONAME tag in its .dynamic section, this tag
+     contains the name of a shared object.  When opening a shared library,
+     if it's basename appears in this map then we can use the corresponding
+     build-id.
+
+     In the rare case that two different files have the same DT_SONAME
+     value then the build-id pointer in this map will be nullptr, this
+     indicates that it's not possible to find a build-id based on the given
+     DT_SONAME value.  */
+
+  string_to_build_id_map m_soname_to_build_id_map;
+
+  /* This vector maps memory ranges onto an associated build-id.  The
+     ranges are those of the files mapped into the core file.
+
+     Entries in this vector must not overlap, and are sorted be increasing
+     memory address.  Within each entry the build-id pointer will not be
+     nullptr.
+
+     While building this vector the entries are not sorted, they are
+     sorted once after the table has finished being built.  */
+
+  std::vector<mem_range_and_build_id> m_address_to_build_id_list;
+
+  /* False if address_to_build_id_list is unsorted, otherwise true.  */
+
+  bool m_address_to_build_id_list_sorted = false;
+};
+
 /* The core file target.  */
 
 static const target_info core_target_info = {
@@ -136,6 +249,13 @@ class core_target final : public process_stratum_target
   /* See definition.  */
   void info_proc_mappings (struct gdbarch *gdbarch);
 
+  std::optional <core_target_mapped_file_info>
+  lookup_mapped_file_info (const char *filename,
+			   const std::optional<CORE_ADDR> &addr)
+  {
+    return m_mapped_file_info.lookup (filename, addr);
+  }
+
 private: /* per-core data */
 
   /* Get rid of the core inferior.  */
@@ -158,7 +278,13 @@ class core_target final : public process_stratum_target
      still be useful.  */
   std::vector<mem_range> m_core_unavailable_mappings;
 
-  /* Build m_core_file_mappings.  Called from the constructor.  */
+  /* Data structure that holds information mapping filenames and address
+     ranges to the corresponding build-ids as well as the reverse build-id
+     to filename mapping.  */
+  mapped_file_info m_mapped_file_info;
+
+  /* Build m_core_file_mappings and m_mapped_file_info.  Called from the
+     constructor.  */
   void build_file_mappings ();
 
   /* Helper method for xfer_partial.  */
@@ -419,6 +545,10 @@ core_target::build_file_mappings ()
 	    }
 	}
 
+      std::vector<mem_range> ranges;
+      for (const mapped_file::region &region : file_data.regions)
+	ranges.emplace_back (region.start, region.end - region.start);
+
       if (expanded_fname == nullptr
 	  || abfd == nullptr
 	  || !bfd_check_format (abfd.get (), bfd_object))
@@ -495,16 +625,26 @@ core_target::build_file_mappings ()
 	    }
 	}
 
-      /* If this is a bfd of a shared library, record its soname and
-	 build-id.  */
-      if (file_data.build_id != nullptr && abfd != nullptr)
+      /* If this is a bfd with a build-id then record the filename,
+	 optional soname (DT_SONAME .dynamic attribute), and the range of
+	 addresses at which this bfd is mapped.  This information can be
+	 used to perform build-id checking when loading the shared
+	 libraries.  */
+      if (file_data.build_id != nullptr)
 	{
-	  gdb::unique_xmalloc_ptr<char> soname
-	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+	  normalize_mem_ranges (&ranges);
+
+	  const char *actual_filename = nullptr;
+	  gdb::unique_xmalloc_ptr<char> soname;
+	  if (abfd != nullptr)
+	    {
+	      actual_filename = bfd_get_filename (abfd.get ());
+	      soname = gdb_bfd_read_elf_soname (actual_filename);
+	    }
 
-	  if (soname != nullptr)
-	    set_cbfd_soname_build_id (current_program_space->cbfd,
-				      soname.get (), file_data.build_id);
+	  m_mapped_file_info.add (soname.get (), filename.c_str (),
+				  actual_filename, std::move (ranges),
+				  file_data.build_id);
 	}
     }
 
@@ -1702,6 +1842,182 @@ maintenance_print_core_file_backed_mappings (const char *args, int from_tty)
     targ->info_proc_mappings (targ->core_gdbarch ());
 }
 
+/* Add more details discovered while processing the core-file's mapped file
+   information, we're building maps between filenames and the corresponding
+   build-ids, between address ranges and the corresponding build-ids, and
+   also a reverse map between build-id and the corresponding filename.
+
+   SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+   a shared library that was mapped into the core file.  This can be
+   nullptr if the mapped files was not a shared library, or didn't have a
+   DT_SONAME attribute.
+
+   EXPECTED_FILENAME is the name of the file that was mapped into the
+   inferior as extracted from the core file, this should never be nullptr.
+
+   ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+   mapped file information, this can be nullptr if GDB failed to find a
+   suitable file.  This might be different to EXPECTED_FILENAME, e.g. GDB
+   might have downloaded the file from debuginfod and so ACTUAL_FILENAME
+   will be a file in the debuginfod client cache.
+
+   RANGES is the list of memory ranges at which this file was mapped into
+   the inferior.
+
+   BUILD_ID is the build-id for this mapped file, this will never be
+   nullptr.  Not every mapped file will have a build-id, but there's no
+   point calling this function if we failed to find a build-id; this
+   structure only exists so we can lookup files based on their build-id.  */
+
+void
+mapped_file_info::add (const char *soname,
+		       const char *expected_filename,
+		       const char *actual_filename,
+		       std::vector<mem_range> &&ranges,
+		       const bfd_build_id *build_id)
+{
+  gdb_assert (build_id != nullptr);
+  gdb_assert (expected_filename != nullptr);
+
+  if (soname != nullptr)
+    {
+      /* If we already have an entry with this SONAME then this indicates
+	 that the inferior has two files mapped into memory with different
+	 file names (and most likely different build-ids), but with the
+	 same DT_SONAME attribute.  In this case we can't use the
+	 DT_SONAME to figure out the expected build-id of a shared
+	 library, so poison the entry for this SONAME by setting the entry
+	 to nullptr.  */
+      auto it = m_soname_to_build_id_map.find (soname);
+      if (it != m_soname_to_build_id_map.end ()
+	  && it->second != nullptr
+	  && !build_id_equal (it->second, build_id))
+	m_soname_to_build_id_map[soname] = nullptr;
+      else
+	m_soname_to_build_id_map[soname] = build_id;
+    }
+
+  /* When the core file is initially opened and the mapped files are
+     parsed, we group the build-id information based on the file name.  As
+     a consequence, we should see each EXPECTED_FILENAME value exactly
+     once.  This means that each insertion should always succeed.  */
+  const auto [it, inserted]
+    = m_filename_to_build_id_map.emplace (expected_filename, build_id);
+  gdb_assert (inserted);
+
+  /* Setup the reverse build-id to file name map.  */
+  if (actual_filename != nullptr)
+    m_build_id_to_filename_map.emplace (build_id, actual_filename);
+
+  /* Setup the list of memory range to build-id objects.  */
+  for (mem_range &r : ranges)
+    m_address_to_build_id_list.emplace_back (std::move (r), build_id);
+
+  /* At this point the m_address_to_build_id_list is unsorted (we just
+     added some entries to the end of the list).  All entries should be
+     added before any look-ups are performed, and the list is only sorted
+     when the first look-up is performed.  */
+  gdb_assert (!m_address_to_build_id_list_sorted);
+}
+
+/* FILENAME is the name of a file GDB is trying to load, and ADDR is
+   (optionally) an address within the file in the inferior's address space.
+
+   Search through the information gathered from the core-file's mapped file
+   information looking for a file named FILENAME, or for a file that covers
+   ADDR.  If a match is found then return the build-id for the file along
+   with the location where GDB found the mapped file.
+
+   The location of the mapped file might be the empty string if GDB was
+   unable to find the mapped file.
+
+   If no build-id can be found for FILENAME then GDB will return a pair
+   containing nullptr (for the build-id) and an empty string for the file
+   name.  */
+
+std::optional <core_target_mapped_file_info>
+mapped_file_info::lookup (const char *filename,
+			  const std::optional<CORE_ADDR> &addr)
+{
+  if (filename != nullptr)
+    {
+      /* If there's a matching entry in m_filename_to_build_id_map then the
+	 associated build-id will not be nullptr, and can be used to
+	 validate that FILENAME is correct.  */
+      auto it = m_filename_to_build_id_map.find (filename);
+      if (it != m_filename_to_build_id_map.end ())
+	return make_result (it->second);
+    }
+
+  if (addr.has_value ())
+    {
+      /* On the first lookup, sort the address_to_build_id_list.  */
+      if (!m_address_to_build_id_list_sorted)
+	{
+	  std::sort (m_address_to_build_id_list.begin (),
+		     m_address_to_build_id_list.end (),
+		     [] (const mem_range_and_build_id &a,
+			 const mem_range_and_build_id &b) {
+		       return a.range < b.range;
+		     });
+	  m_address_to_build_id_list_sorted = true;
+	}
+
+      /* Look for the first entry whose range's start address is not less
+	 than, or equal too, the address ADDR.  If we find such an entry,
+	 then the previous entry's range might contain ADDR.  If it does
+	 then that previous entry's build-id can be used.  */
+      auto it = std::lower_bound
+	(m_address_to_build_id_list.begin (),
+	 m_address_to_build_id_list.end (),
+	 *addr,
+	 [] (const mem_range_and_build_id &a,
+	     const CORE_ADDR &b) {
+	  return a.range.start <= b;
+	});
+
+      if (it != m_address_to_build_id_list.begin ())
+	{
+	  --it;
+
+	  if (it->range.contains (*addr))
+	    return make_result (it->build_id);
+	}
+    }
+
+  if (filename != nullptr)
+    {
+      /* If the basename of FILENAME appears in m_soname_to_build_id_map
+	 then when the mapped files were processed, we saw a file with a
+	 DT_SONAME attribute corresponding to FILENAME, use that build-id
+	 to validate FILENAME.
+
+	 However, the build-id in this map might be nullptr if we saw
+	 multiple mapped files with the same DT_SONAME attribute (though
+	 this should be pretty rare).  */
+      auto it
+	= m_soname_to_build_id_map.find (lbasename (filename));
+      if (it != m_soname_to_build_id_map.end ()
+	  && it->second != nullptr)
+	return make_result (it->second);
+    }
+
+  return {};
+}
+
+/* See gdbcore.h.  */
+
+std::optional <core_target_mapped_file_info>
+core_target_find_mapped_file (const char *filename,
+			      std::optional<CORE_ADDR> addr)
+{
+  core_target *targ = get_current_core_target ();
+  if (targ == nullptr || current_program_space->cbfd.get () == nullptr)
+    return {};
+
+  return targ->lookup_mapped_file_info (filename, addr);
+}
+
 void _initialize_corelow ();
 void
 _initialize_corelow ()
diff --git a/gdb/gdbcore.h b/gdb/gdbcore.h
index d6aeb356e37..782643acf56 100644
--- a/gdb/gdbcore.h
+++ b/gdb/gdbcore.h
@@ -196,4 +196,70 @@ class thread_section_name
   std::string m_storage;
 };
 
+/* Type returned from core_target_find_mapped_file.  Holds information
+   about a mapped file that was processed when a core file was initially
+   loaded.  */
+struct core_target_mapped_file_info
+{
+  /* Constructor.  BUILD_ID is not nullptr, and is the build-id for the
+     mapped file.  FILENAME is the location of the file that GDB loaded to
+     provide the mapped file.  This might be different from the name of the
+     mapped file mentioned in the core file, e.g. if GDB downloads a file
+     from debuginfod then FILENAME would point into the debuginfod client
+     cache.  The FILENAME can be the empty string if GDB was unable to find
+     a file to provide the mapped file.  */
+
+  core_target_mapped_file_info (const bfd_build_id *build_id,
+				const std::string filename)
+    : m_build_id (build_id),
+      m_filename (filename)
+  {
+    gdb_assert (m_build_id != nullptr);
+  }
+
+  /* The build-id for this mapped file.  */
+
+  const bfd_build_id *
+  build_id () const
+  {
+    return m_build_id;
+  }
+
+  /* The file GDB used to provide this mapped file.  */
+
+  const std::string &
+  filename () const
+  {
+    return m_filename;
+  }
+
+private:
+  const bfd_build_id *m_build_id = nullptr;
+  const std::string m_filename;
+};
+
+/* If the current inferior has a core_target for its process target, then
+   lookup information about a mapped file that was discovered when the
+   core file was loaded.
+
+   The FILENAME is the file we're looking for.  The ADDR, if provided, is a
+   mapped address within the inferior which is known to be part of the file
+   we are looking for.
+
+   As an example, when loading shared libraries this function can be
+   called, in that case FILENAME will be the name of the shared library
+   that GDB is trying to load and ADDR will be an inferior address which is
+   part of the shared library we are looking for.
+
+   This function looks for a mapped file which matches FILENAME and/or
+   which covers ADDR and returns information about that file.
+
+   The returned information includes the name of the mapped file if known
+   and the build-id for the mapped file if known.
+
+   */
+std::optional<core_target_mapped_file_info>
+core_target_find_mapped_file (const char *filename,
+			      std::optional<CORE_ADDR> addr);
+
 #endif /* !defined (GDBCORE_H) */
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index a50bb165c19..a9242408137 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -689,6 +689,12 @@ const solib_ops solib_aix_so_ops =
   solib_aix_open_symbol_file_object,
   solib_aix_in_dynsym_resolve_code,
   solib_aix_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_solib_aix ();
diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index f0828fdf102..e9a85db1e5e 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -665,4 +665,10 @@ const solib_ops darwin_so_ops =
   open_symbol_file_object,
   darwin_in_dynsym_resolve_code,
   darwin_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-dsbt.c b/gdb/solib-dsbt.c
index 908c78aebdc..7675c8c0386 100644
--- a/gdb/solib-dsbt.c
+++ b/gdb/solib-dsbt.c
@@ -914,6 +914,12 @@ const solib_ops dsbt_so_ops =
   open_symbol_file_object,
   dsbt_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_dsbt_solib ();
diff --git a/gdb/solib-frv.c b/gdb/solib-frv.c
index 39508fab4c8..7502fe1f560 100644
--- a/gdb/solib-frv.c
+++ b/gdb/solib-frv.c
@@ -1084,4 +1084,10 @@ const solib_ops frv_so_ops =
   open_symbol_file_object,
   frv_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 5c8557e9377..7b216fad1e1 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -3354,6 +3354,15 @@ svr4_iterate_over_objfiles_in_search_order
     }
 }
 
+/* See solib_ops::find_solib_addr in solist.h.  */
+
+static std::optional<CORE_ADDR>
+svr4_find_solib_addr (solib &so)
+{
+  auto *li = gdb::checked_static_cast<lm_info_svr4 *> (so.lm_info.get ());
+  return li->l_addr_inferior;
+}
+
 const struct solib_ops svr4_so_ops =
 {
   svr4_relocate_section_addresses,
@@ -3369,6 +3378,7 @@ const struct solib_ops svr4_so_ops =
   svr4_keep_data_in_core,
   svr4_update_solib_event_breakpoints,
   svr4_handle_solib_event,
+  svr4_find_solib_addr,
 };
 
 void _initialize_svr4_solib ();
diff --git a/gdb/solib-target.c b/gdb/solib-target.c
index 6563da05a47..f89761f109f 100644
--- a/gdb/solib-target.c
+++ b/gdb/solib-target.c
@@ -412,4 +412,10 @@ const solib_ops solib_target_so_ops =
   solib_target_open_symbol_file_object,
   solib_target_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib.c b/gdb/solib.c
index 881f0d96502..f0b0d2e4505 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -476,58 +476,6 @@ solib_bfd_open (const char *pathname)
   return abfd;
 }
 
-/* Mapping of a core file's shared library sonames to their respective
-   build-ids.  Added to the registries of core file bfds.  */
-
-typedef std::unordered_map<std::string, std::string> soname_build_id_map;
-
-/* Key used to associate a soname_build_id_map to a core file bfd.  */
-
-static const struct registry<bfd>::key<soname_build_id_map>
-  cbfd_soname_build_id_data_key;
-
-/* See solib.h.  */
-
-void
-set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname,
-			  const bfd_build_id *build_id)
-{
-  gdb_assert (abfd.get () != nullptr);
-  gdb_assert (soname != nullptr);
-  gdb_assert (build_id != nullptr);
-
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
-
-  if (mapptr == nullptr)
-    mapptr = cbfd_soname_build_id_data_key.emplace (abfd.get ());
-
-  (*mapptr)[soname] = build_id_to_string (build_id);
-}
-
-/* If SONAME had a build-id associated with it in ABFD's registry by a
-   previous call to set_cbfd_soname_build_id then return the build-id
-   as a NULL-terminated hex string.  */
-
-static gdb::unique_xmalloc_ptr<char>
-get_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname)
-{
-  if (abfd.get () == nullptr || soname == nullptr)
-    return {};
-
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
-
-  if (mapptr == nullptr)
-    return {};
-
-  auto it = mapptr->find (lbasename (soname));
-  if (it == mapptr->end ())
-    return {};
-
-  return make_unique_xstrdup (it->second.c_str ());
-}
-
 /* Given a pointer to one of the shared objects in our list of mapped
    objects, use the recorded name to open a bfd descriptor for the
    object, build a section table, relocate all the section addresses
@@ -547,36 +495,60 @@ solib_map_sections (solib &so)
 
   gdb::unique_xmalloc_ptr<char> filename (tilde_expand (so.so_name.c_str ()));
   gdb_bfd_ref_ptr abfd (ops->bfd_open (filename.get ()));
-  gdb::unique_xmalloc_ptr<char> build_id_hexstr
-    = get_cbfd_soname_build_id (current_program_space->cbfd,
-				so.so_name.c_str ());
+
+  /* If we have a core target then the core target might have some helpful
+     information (i.e. build-ids) about the shared libraries we are trying
+     to load.  Grab those hints now and use the below to validate or find
+     the shared libraries.
+
+     If we don't have a core target then this will return an empty struct
+     with no hint information, we then lookup the shared library based on
+     its filename.  */
+  std::optional<CORE_ADDR> solib_addr = ops->find_solib_addr (so);
+  std::optional <const core_target_mapped_file_info> mapped_file_info
+    = core_target_find_mapped_file (so.so_name.c_str (), solib_addr);
 
   /* If we already know the build-id of this solib from a core file, verify
      it matches ABFD's build-id.  If there is a mismatch or the solib wasn't
      found, attempt to query debuginfod for the correct solib.  */
-  if (build_id_hexstr.get () != nullptr)
+  if (mapped_file_info.has_value ())
     {
-      bool mismatch = false;
-
-      if (abfd != nullptr && abfd->build_id != nullptr)
-	{
-	  std::string build_id = build_id_to_string (abfd->build_id);
+      bool mismatch = (abfd != nullptr
+		       && build_id_bfd_get (abfd.get ()) != nullptr
+		       && !build_id_equal (mapped_file_info->build_id (),
+					   build_id_bfd_get (abfd.get ())));
 
-	  if (build_id != build_id_hexstr.get ())
-	    mismatch = true;
-	}
       if (abfd == nullptr || mismatch)
 	{
-	  scoped_fd fd = debuginfod_exec_query (
-	    (const unsigned char *) build_id_hexstr.get (), 0,
-	    so.so_name.c_str (), &filename);
-
-	  if (fd.get () >= 0)
-	    abfd = ops->bfd_open (filename.get ());
-	  else if (mismatch)
-	    warning (_ ("Build-id of %ps does not match core file."),
-		     styled_string (file_name_style.style (),
-				    filename.get ()));
+	  /* If GDB found a suitable file during the file mapping
+	     processing stage then lets use that.  We don't check the
+	     build-id after opening this file, either this file was found
+	     by build-id, in which case it's going to match, or this file
+	     doesn't have a build-id, so checking tells us nothing.
+	     However, if it was good enough during the mapped file
+	     processing, we assume it's good enough now.  */
+	  if (!mapped_file_info->filename ().empty ())
+	    abfd = ops->bfd_open (mapped_file_info->filename ().c_str ());
+	  else
+	    abfd = nullptr;
+
+	  if (abfd == nullptr)
+	    {
+	      scoped_fd fd = debuginfod_exec_query
+		(mapped_file_info->build_id ()->data,
+		 mapped_file_info->build_id ()->size,
+		 so.so_name.c_str (), &filename);
+
+	      if (fd.get () >= 0)
+		abfd = ops->bfd_open (filename.get ());
+	      else if (mismatch)
+		{
+		  warning (_ ("Build-id of %ps does not match core file."),
+			   styled_string (file_name_style.style (),
+					  filename.get ()));
+		  abfd = nullptr;
+		}
+	    }
 	}
     }
 
@@ -1706,6 +1678,14 @@ remove_user_added_objfile (struct objfile *objfile)
     }
 }
 
+/* See solist.h.  */
+
+std::optional<CORE_ADDR>
+default_find_solib_addr (solib &so)
+{
+  return {};
+}
+
 void _initialize_solib ();
 
 void
diff --git a/gdb/solib.h b/gdb/solib.h
index f7a93c0718f..95f4753c1ad 100644
--- a/gdb/solib.h
+++ b/gdb/solib.h
@@ -29,6 +29,7 @@ struct program_space;
 #include "gdb_bfd.h"
 #include "symfile-add-flags.h"
 #include "gdbsupport/function-view.h"
+#include "memrange.h"
 
 /* Value of the 'set debug solib' configuration variable.  */
 
@@ -136,11 +137,4 @@ extern void update_solib_breakpoints (void);
 
 extern void handle_solib_event (void);
 
-/* Associate SONAME with BUILD_ID in ABFD's registry so that it can be
-   retrieved with get_cbfd_soname_build_id.  */
-
-extern void set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd,
-				      const char *soname,
-				      const bfd_build_id *build_id);
-
 #endif /* SOLIB_H */
diff --git a/gdb/solist.h b/gdb/solist.h
index f0d22080de1..bd60987f49c 100644
--- a/gdb/solist.h
+++ b/gdb/solist.h
@@ -167,6 +167,23 @@ struct solib_ops
      NULL, in which case no specific preprocessing is necessary
      for this target.  */
   void (*handle_event) (void);
+
+  /* Return an address within the inferior's address space which is known
+     to be part of SO.  If there is no such address, or GDB doesn't know
+     how to figure out such an address then an empty optional is
+     returned.
+
+     The returned address can be used when loading the shared libraries
+     for a core file.  GDB knows the build-ids for (some) files mapped
+     into the inferior's address space, and knows the address ranges which
+     those mapped files cover.  If GDB can figure out a representative
+     address for the library then this can be used to match a library to a
+     mapped file, and thus to a build-id.  GDB can then use this
+     information to help locate the shared library objfile, if the objfile
+     is not in the expected place (as defined by the shared libraries file
+     name).  */
+
+  std::optional<CORE_ADDR> (*find_solib_addr) (solib &so);
 };
 
 /* A unique pointer to a so_list.  */
@@ -186,4 +203,9 @@ extern gdb_bfd_ref_ptr solib_bfd_fopen (const char *pathname, int fd);
 /* Find solib binary file and open it.  */
 extern gdb_bfd_ref_ptr solib_bfd_open (const char *in_pathname);
 
+/* A default implementation of the solib_ops::find_solib_addr callback.
+   This just returns an empty std::optional<CORE_ADDR> indicating GDB is
+   unable to find an address within the library SO.  */
+extern std::optional<CORE_ADDR> default_find_solib_addr (solib &so);
+
 #endif
diff --git a/gdb/testsuite/gdb.base/solib-search.exp b/gdb/testsuite/gdb.base/solib-search.exp
index 581046a1ed8..bbfe3ae719a 100644
--- a/gdb/testsuite/gdb.base/solib-search.exp
+++ b/gdb/testsuite/gdb.base/solib-search.exp
@@ -43,7 +43,11 @@ set right_binfile2_lib \
 set binfile1_lib [standard_output_file ${libname1}.so]
 set binfile2_lib [standard_output_file ${libname2}.so]
 
-set lib_flags [list debug ldflags=-Wl,-Bsymbolic]
+# When this test was written, GDB's ability to track down shared
+# libraries for a core file based on the build-id much poorer.  As GDB
+# has improved we now need to disable build-ids in order for this test
+# to function as expected.
+set lib_flags [list debug no-build-id ldflags=-Wl,-Bsymbolic]
 set wrong_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=1"
 set right_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=8192 additional_flags=-DRIGHT"
 
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
new file mode 100644
index 00000000000..30e9267dc01
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
@@ -0,0 +1,39 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+/* It is important that these two variables have names of the same length
+   so that the debug information in the two library versions is laid out
+   the same.  If they differ then the .dynamic section might move, which
+   will trigger a different check within GDB than the one we actually want
+   to check.  */
+
+#if LIB_VERSION == 1
+volatile int *library_1_var = (volatile int *) 0x12345678;
+#elif LIB_VERSION == 2
+volatile int *library_2_var = (volatile int *) 0x11223344;
+#else
+# error Unknown library version
+#endif
+
+int
+foo (void)
+{
+  /* This should trigger a core dump.  */
+  abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
new file mode 100644
index 00000000000..a51d48e2f2e
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
@@ -0,0 +1,41 @@
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <stddef.h>
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+/* This is updated by the .exp file.  */
+char *libname = "libfoo_2.so";
+
+int
+main (void)
+{
+  void *handle;
+  int res, tmp;
+
+  handle = dlopen (libname, RTLD_LAZY);
+  assert (handle != NULL);
+
+  res = foo ();
+
+  tmp = dlclose (handle);
+  assert (tmp == 0);
+
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
new file mode 100644
index 00000000000..5bf6817daf8
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -0,0 +1,260 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test exercises GDB's ability to validate build-ids when loading
+# shared libraries for a core file.
+#
+# The test creates two "versions" of a shared library, sets up a
+# symlink to point to one version of the library, and creates a core file.
+#
+# We then try re-loading the core file and executable and check that
+# GDB is able to correctly load the shared library.  To confuse things
+# we retarget the library symlink at the other version of the library.
+#
+# After that we repeat the test, but this time deleting the symlink
+# completely.
+#
+# Then we remove the version of the library completely, at this point
+# we do expect GDB to give a warning about being unable to load the library.
+#
+# And finally, we setup debuginfod and have it serve the missing
+# library file, GDB should correctly download the library file.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+require {istarget "*-linux*"}
+require {!is_remote host}
+require {!using_fission}
+
+standard_testfile -1.c -2.c
+
+# Build two similar, but slightly different versions of the shared
+# library.  Both libraries have DT_SONAME set to the generic
+# libfoo.so, we'll create a symlink with that name later.
+set library_1_filename [standard_output_file "libfoo_1.so"]
+set library_2_filename [standard_output_file "libfoo_2.so"]
+
+# The generic name for the library.
+set library_filename [standard_output_file "libfoo.so"]
+
+# When compiling a shared library the -Wl,-soname,NAME option is
+# automatically added based on the final name of the library.  We want
+# to compile libfoo_1.so, but set the soname to libfoo.so.  To achieve
+# this we first compile into libfoo.so, and then rename the library to
+# libfoo_1.so.
+if {[build_executable "build libfoo_1.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=1 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_1_filename}"
+
+# See the comment above, but this time we rename to libfoo_2.so.
+if {[build_executable "build libfoo_2.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=2 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_2_filename}"
+
+# Create libfoo.so symlink to the libfoo_1.so library.  If this
+# symlink creation fails then we assume we can't create symlinks on
+# this host.  If this succeeds then later symlink creation is required
+# to succeed, and will trigger an FAIL if it doesn't.
+set status \
+    [remote_exec build \
+	 "ln -sf ${library_1_filename} ${library_filename}"]
+if {[lindex $status 0] != 0} {
+    unsupported "host does not support symbolic links"
+    return
+}
+
+# Build the executable.  This links against libfoo.so, which is
+# poining at libfoo_1.so.  Just to confuse things even more, this
+# executable uses dlopen to load libfoo_2.so.  Weird!
+if { [build_executable "build executable" ${binfile} ${srcfile2} \
+	  [list debug shlib=${library_filename} shlib_load]] == -1 } {
+    return
+}
+
+# If the board file is automatically splitting the debug information
+# into a separate file (e.g. the cc-with-gnu-debuglink.exp board) then
+# this test isn't going to work.
+clean_restart
+gdb_file_cmd $binfile
+if {$gdb_file_cmd_debug_info ne "debug"} {
+    unsupported "failed to find debug information"
+    return
+}
+if {[regexp "${testfile}.debug" $gdb_file_cmd_msg]} {
+    unsupported "debug information has been split to a separate file"
+    return
+}
+
+# Run BINFILE which will generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Helper proc to load global BINFILE and then load global COREFILE.
+#
+# If EXPECT_WARNING is true then we require a warning about being
+# unable to load the shared library symbols, otherwise, EXPECT_WARNING
+# is false and we require no warning.
+#
+# If EXPECT_DOWNLOAD is true then we require a line indicating that
+# the shared library is being downloaded from debuginfod, otherwise
+# the shared library should not be downloaded.
+proc load_exec_and_core_file { expect_warning expect_download testname } {
+    with_test_prefix $testname {
+	clean_restart $::binfile
+
+	set saw_warning false
+	set saw_download false
+	set saw_generated false
+	set saw_terminated false
+
+	gdb_test_multiple "core-file $::corefile" "load core file" {
+	    -re "^Core was generated by \[^\r\n\]+\r\n" {
+		set saw_generated true
+		exp_continue
+	    }
+	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
+		set saw_terminated true
+		exp_continue
+	    }
+	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing\r\n" {
+		# Ignore warnings from the file backed mapping phase.
+		exp_continue
+	    }
+	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\n" {
+		set saw_warning true
+		exp_continue
+	    }
+	    -re "^Downloading executable for \[^\r\n\]+/libfoo_1\\.so\\.\\.\\.\r\n" {
+		set saw_download true
+		exp_continue
+	    }
+	    -re "^$::gdb_prompt $" {
+		gdb_assert { $saw_generated && $saw_terminated \
+				 && $saw_warning == $expect_warning \
+				 && $saw_download == $expect_download } \
+		    $gdb_test_name
+	    }
+	    -re "^\[^\r\n\]*\r\n" {
+		exp_continue
+	    }
+	}
+
+	# If we don't expect a warning then debug symbols from the
+	# shared library should be available.  Confirm we can read a
+	# variable from the shared library.  If we do expect a warning
+	# then the shared library debug symbols have not loaded, and
+	# the library variable should not be available.
+	if { !$expect_warning } {
+	    gdb_test "print/x library_1_var" " = 0x12345678" \
+		"check library_1_var can be read"
+	} else {
+	    gdb_test "print/x library_1_var" \
+		"^No symbol \"library_1_var\" in current context\\." \
+		"check library_1_var cannot be read"
+	}
+    }
+}
+
+# Initial test, just load the executable and core file.  At this point
+# everything should load fine as everything is where we expect to find
+# it.
+load_exec_and_core_file false false \
+    "load core file, all libraries as expected"
+
+# Update libfoo.so symlink to point at the second library then reload
+# the core file.  GDB should spot that the symlink points to the wrong
+# file, but should be able to figure out the correct file to load as
+# the right file will be in the mapped file list.
+set status [remote_exec build \
+		"ln -sf ${library_2_filename} ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "update library symlink to point to the wrong file"
+
+load_exec_and_core_file false false \
+    "load core file, symlink points to wrong file"
+
+# Remove libfoo.so symlink and reload the core file.  As in the
+# previous test GDB should be able to figure out the correct file to
+# load as the correct file will still appear in the mapped file list.
+set status [remote_exec build "rm -f ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } "remove library symlink"
+
+load_exec_and_core_file false false \
+    "load core file, symlink removed"
+
+# Remove LIBRARY_1_FILENAME.  We'll now see a warning that the mapped
+# file can't be loaded (we ignore that warning), and we'll see a
+# warning that the shared library can't be loaded.
+set library_1_backup_filename ${library_1_filename}.backup
+set status \
+    [remote_exec build \
+	 "mv ${library_1_filename} ${library_1_backup_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "remove libfoo_1.so"
+
+load_exec_and_core_file true false \
+    "load core file, libfoo_1.so removed"
+
+# Setup a debuginfod server which can serve the original shared
+# library file.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_1_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    save_vars { GDBFLAGS } {
+	append GDBFLAGS " -ex \"set debuginfod enabled on\""
+
+	# Reload the executable and core file.  GDB should download
+	# the file libfoo_1.so using debuginfod during the mapped file
+	# phase, but should then reuse that download during the shared
+	# library phase.
+	load_exec_and_core_file false true \
+	    "load core file, use debuginfod"
+    }
+}
+
+stop_debuginfod
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCHv3 4/4] gdb: unify build-id to objfile lookup code
  2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
                       ` (2 preceding siblings ...)
  2024-06-10 21:49     ` [PATCHv3 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
@ 2024-06-10 21:49     ` Andrew Burgess
  3 siblings, 0 replies; 16+ messages in thread
From: Andrew Burgess @ 2024-06-10 21:49 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

There are 3 places where we currently call debuginfod_exec_query to
lookup an objfile for a given build-id.

In one of these places we first call build_id_to_exec_bfd which also
looks up an objfile given a build-id, but this function looks on disk
for a symlink in the .build-id/ sub-directory (within the
debug-file-directory).

I can't think of any reason why we shouldn't call build_id_to_exec_bfd
before every call to debuginfod_exec_query.

So, in this commit I have added a new function in build-id.c,
find_exec_by_build_id, this function calls build_id_to_exec_bfd, and
if that fails, then calls debuginfod_exec_query.

Everywhere we call debuginfod_exec_query is updated to call the new
function, and in locate_exec_from_corefile_build_id, the existing call
to build_id_to_exec_bfd is removed as calling find_exec_by_build_id
does this for us.

One slight weird thing is in core_target::build_file_mappings, here we
call find_exec_by_build_id which returns a gdb_bfd_ref_ptr for the
opened file, however we immediately reopen the file as "binary".  The
reason for this is that all the bfds opened in ::build_file_mappings
need to be opened as "binary" (see the function comments for why).

I did consider passing a target type into find_exec_by_build_id, which
could then be forwarded to build_id_to_exec_bfd and used to open the
BFD as "binary", however, if you follow the call chain you'll end up
in build_id_to_debug_bfd_1, where we actually open the bfd.  Notice in
here that we call build_id_verify to double check the build-id of the
file we found, this requires that the bfd not be opened as "binary".

What this means is that we always have to first open the bfd using the
gnutarget target type (for the build-id check), and then we would have
to reopen it as "binary".  There seems little point pushing the reopen
logic into find_exec_by_build_id, so we just do this in the
::build_file_mappings function.

I've extended the tests to cover the two cases which actually changed
in this commit.
---
 gdb/build-id.c                                | 42 ++++++++++++++++++-
 gdb/build-id.h                                | 21 ++++++----
 gdb/corelow.c                                 | 40 ++++++------------
 gdb/solib.c                                   | 23 ++++------
 .../gdb.debuginfod/corefile-mapped-file.exp   | 24 +++++++++++
 .../gdb.debuginfod/solib-with-soname.exp      | 32 +++++++++++++-
 gdb/testsuite/lib/gdb.exp                     |  7 +++-
 7 files changed, 134 insertions(+), 55 deletions(-)

diff --git a/gdb/build-id.c b/gdb/build-id.c
index 41667d5e5cf..27642b58d56 100644
--- a/gdb/build-id.c
+++ b/gdb/build-id.c
@@ -26,6 +26,8 @@
 #include "filenames.h"
 #include "gdbcore.h"
 #include "cli/cli-style.h"
+#include "gdbsupport/scoped_fd.h"
+#include "debuginfod-support.h"
 
 /* See build-id.h.  */
 
@@ -198,9 +200,11 @@ build_id_to_debug_bfd (size_t build_id_len, const bfd_byte *build_id)
   return build_id_to_bfd_suffix (build_id_len, build_id, ".debug");
 }
 
-/* See build-id.h.  */
+/* Find and open a BFD for an executable file given a build-id.  If no BFD
+   can be found, return NULL.  The returned reference to the BFD must be
+   released by the caller.  */
 
-gdb_bfd_ref_ptr
+static gdb_bfd_ref_ptr
 build_id_to_exec_bfd (size_t build_id_len, const bfd_byte *build_id)
 {
   return build_id_to_bfd_suffix (build_id_len, build_id, "");
@@ -243,3 +247,37 @@ find_separate_debug_file_by_buildid (struct objfile *objfile,
 
   return std::string ();
 }
+
+/* See build-id.h.  */
+
+gdb_bfd_ref_ptr
+find_exec_by_build_id (const bfd_build_id *build_id,
+		       const char *expected_filename)
+{
+  /* Try to find the executable (or shared object) by looking for a
+     (sym)link on disk from the build-id to the object file.  */
+  gdb_bfd_ref_ptr abfd = build_id_to_exec_bfd (build_id->size,
+					       build_id->data);
+
+  if (abfd != nullptr)
+    return abfd;
+
+  /* Attempt to query debuginfod for the executable.  */
+  gdb::unique_xmalloc_ptr<char> path;
+  scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
+					expected_filename, &path);
+  if (fd.get () >= 0)
+    {
+      abfd = gdb_bfd_open (path.get (), gnutarget);
+
+      if (abfd == nullptr)
+	warning (_("\"%ps\" from debuginfod cannot be opened as bfd: %s"),
+		 styled_string (file_name_style.style (), path.get ()),
+		 gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
+      else if (!build_id_verify (abfd.get (), build_id->size,
+				 build_id->data))
+	abfd = nullptr;
+    }
+
+  return abfd;
+}
diff --git a/gdb/build-id.h b/gdb/build-id.h
index c5f20f8782e..3df122a0cbf 100644
--- a/gdb/build-id.h
+++ b/gdb/build-id.h
@@ -40,13 +40,6 @@ extern int build_id_verify (bfd *abfd,
 extern gdb_bfd_ref_ptr build_id_to_debug_bfd (size_t build_id_len,
 					      const bfd_byte *build_id);
 
-/* Find and open a BFD for an executable file given a build-id.  If no BFD
-   can be found, return NULL.  The returned reference to the BFD must be
-   released by the caller.  */
-
-extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
-					     const bfd_byte *build_id);
-
 /* Find the separate debug file for OBJFILE, by using the build-id
    associated with OBJFILE's BFD.  If successful, returns the file name for the
    separate debug file, otherwise, return an empty string.
@@ -60,6 +53,20 @@ extern gdb_bfd_ref_ptr build_id_to_exec_bfd (size_t build_id_len,
 extern std::string find_separate_debug_file_by_buildid
   (struct objfile *objfile, deferred_warnings *warnings);
 
+/* Find an executable (or shared library) that matches BUILD_ID.  This is
+   done by first checking in the debug-file-directory for a .build-id/
+   sub-directory, and looking for a symlink in there that points to the
+   required file.
+
+   If that doesn't find us a file then we call to debuginfod to see if it
+   can provide the required file.
+
+   EXPECTED_FILENAME is used in output messages from debuginfod, this
+   should be the file we were looking for but couldn't find.  */
+
+extern gdb_bfd_ref_ptr find_exec_by_build_id (const bfd_build_id *build_id,
+					      const char *expected_filename);
+
 /* Return an hex-string representation of BUILD_ID.  */
 
 static inline std::string
diff --git a/gdb/corelow.c b/gdb/corelow.c
index 1294f78a982..a7a2a1eedb1 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -48,7 +48,6 @@
 #include "gdbsupport/pathstuff.h"
 #include "gdbsupport/scoped_fd.h"
 #include "gdbsupport/x86-xstate.h"
-#include "debuginfod-support.h"
 #include <unordered_map>
 #include <unordered_set>
 #include "cli/cli-cmds.h"
@@ -534,13 +533,19 @@ core_target::build_file_mappings ()
 	   || !bfd_check_format (abfd.get (), bfd_object))
 	  && file_data.build_id != nullptr)
 	{
-	  expanded_fname = nullptr;
-	  debuginfod_exec_query (file_data.build_id->data,
-				 file_data.build_id->size,
-				 filename.c_str (), &expanded_fname);
-	  if (expanded_fname != nullptr)
+	  abfd = find_exec_by_build_id (file_data.build_id,
+					filename.c_str ());
+
+	  if (abfd != nullptr)
 	    {
+	      /* The find_exec_by_build_id will have opened ABFD using the
+		 GNUTARGET global bfd type, however, we need the bfd opened
+		 as the binary type (see the function's header comment), so
+		 now we reopen ABFD with the desired binary type.  */
+	      expanded_fname
+		= make_unique_xstrdup (bfd_get_filename (abfd.get ()));
 	      struct bfd *b = bfd_openr (expanded_fname.get (), "binary");
+	      gdb_assert (b != nullptr);
 	      abfd = gdb_bfd_ref_ptr::new_reference (b);
 	    }
 	}
@@ -898,28 +903,7 @@ locate_exec_from_corefile_build_id (bfd *abfd, int from_tty)
     return;
 
   gdb_bfd_ref_ptr execbfd
-    = build_id_to_exec_bfd (build_id->size, build_id->data);
-
-  if (execbfd == nullptr)
-    {
-      /* Attempt to query debuginfod for the executable.  */
-      gdb::unique_xmalloc_ptr<char> execpath;
-      scoped_fd fd = debuginfod_exec_query (build_id->data, build_id->size,
-					    abfd->filename, &execpath);
-
-      if (fd.get () >= 0)
-	{
-	  execbfd = gdb_bfd_open (execpath.get (), gnutarget);
-
-	  if (execbfd == nullptr)
-	    warning (_("\"%s\" from debuginfod cannot be opened as bfd: %s"),
-		     execpath.get (),
-		     gdb_bfd_errmsg (bfd_get_error (), nullptr).c_str ());
-	  else if (!build_id_verify (execbfd.get (), build_id->size,
-				     build_id->data))
-	    execbfd.reset (nullptr);
-	}
-    }
+    = find_exec_by_build_id (build_id, abfd->filename);
 
   if (execbfd != nullptr)
     {
diff --git a/gdb/solib.c b/gdb/solib.c
index f0b0d2e4505..a05d003bd20 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -46,7 +46,6 @@
 #include "gdb_bfd.h"
 #include "gdbsupport/filestuff.h"
 #include "gdbsupport/scoped_fd.h"
-#include "debuginfod-support.h"
 #include "source.h"
 #include "cli/cli-style.h"
 
@@ -533,21 +532,15 @@ solib_map_sections (solib &so)
 	    abfd = nullptr;
 
 	  if (abfd == nullptr)
+	    abfd = find_exec_by_build_id (mapped_file_info->build_id (),
+					  so.so_name.c_str ());
+
+	  if (abfd == nullptr && mismatch)
 	    {
-	      scoped_fd fd = debuginfod_exec_query
-		(mapped_file_info->build_id ()->data,
-		 mapped_file_info->build_id ()->size,
-		 so.so_name.c_str (), &filename);
-
-	      if (fd.get () >= 0)
-		abfd = ops->bfd_open (filename.get ());
-	      else if (mismatch)
-		{
-		  warning (_ ("Build-id of %ps does not match core file."),
-			   styled_string (file_name_style.style (),
-					  filename.get ()));
-		  abfd = nullptr;
-		}
+	      warning (_ ("Build-id of %ps does not match core file."),
+		       styled_string (file_name_style.style (),
+				      filename.get ()));
+	      abfd = nullptr;
 	    }
 	}
     }
diff --git a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
index 6e3301e1c8d..b5dee228ca0 100644
--- a/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
+++ b/gdb/testsuite/gdb.debuginfod/corefile-mapped-file.exp
@@ -296,6 +296,30 @@ set ptr_value [read_ptr_value]
 gdb_assert { $ptr_value eq "unavailable" } \
     "check value of pointer is unavailable with library file missing"
 
+# Now symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to library we just moved aside.  Restart GDB and setup the
+# debug-file-directory before loading the core file.
+#
+# GDB should lookup the file to map via the build-id link in the
+# .build-id/ directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_backup_filename ""]
+
+remote_exec build "mkdir -p [file dirname $build_id_filename]"
+remote_exec build "ln -sf $library_backup_filename $build_id_filename"
+
+clean_restart $binfile
+
+gdb_test_no_output "set debug-file-directory $debugdir" \
+    "set debug-file-directory"
+
+load_core_file "load corefile, lookup in debug-file-directory"
+
+set ptr_value [read_ptr_value]
+gdb_assert { $ptr_value == $ptr_expected_value } \
+    "check value of pointer variable from core-file, lookup in debug-file-directory"
+
 # Build a new version of the shared library, keep the library the same size,
 # but change the contents so the build-id changes.  Then restart GDB and load
 # the core-file again.  GDB should spot that the build-id for the shared
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
index 5bf6817daf8..9ef12041dc6 100644
--- a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -126,10 +126,19 @@ if {$corefile eq ""} {
 # If EXPECT_DOWNLOAD is true then we require a line indicating that
 # the shared library is being downloaded from debuginfod, otherwise
 # the shared library should not be downloaded.
-proc load_exec_and_core_file { expect_warning expect_download testname } {
+#
+# If DEBUGDIR is not the empty string then 'debug-file-directory' is
+# set to the value of DEBUGDIR.
+proc load_exec_and_core_file { expect_warning expect_download testname \
+				   {debugdir ""} } {
     with_test_prefix $testname {
 	clean_restart $::binfile
 
+	if { $debugdir ne "" } {
+	    gdb_test_no_output "set debug-file-directory $debugdir" \
+		"set debug directory"
+	}
+
 	set saw_warning false
 	set saw_download false
 	set saw_generated false
@@ -223,6 +232,27 @@ gdb_assert { [lindex $status 0] == 0 } \
 load_exec_and_core_file true false \
     "load core file, libfoo_1.so removed"
 
+# Symlink the .build-id/xx/xxx...xxx filename within the debug
+# directory to LIBRARY_1_BACKUP_FILENAME, now when we restart GDB it
+# should find the missing library within the debug directory.
+set debugdir [standard_output_file "debugdir"]
+set build_id_filename \
+    $debugdir/[build_id_debug_filename_get $library_1_backup_filename ""]
+set status \
+    [remote_exec build \
+	 "mkdir -p [file dirname $build_id_filename]"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create sub-directory within the debug directory"
+set status \
+    [remote_exec build \
+	 "ln -sf $library_1_backup_filename $build_id_filename"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "create symlink within the debug directory "
+
+load_exec_and_core_file false false \
+    "load core file, find libfoo_1.so through debug-file-directory" \
+    $debugdir
+
 # Setup a debuginfod server which can serve the original shared
 # library file.
 if {![allow_debuginfod_tests]} {
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 5aeaa8d09b4..4b2f061cab3 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -8059,14 +8059,17 @@ proc get_build_id { filename } {
 
 # Return the build-id hex string (usually 160 bits as 40 hex characters)
 # converted to the form: .build-id/ab/cdef1234...89.debug
+#
+# The '.debug' suffix can be changed by passing the SUFFIX argument.
+#
 # Return "" if no build-id found.
-proc build_id_debug_filename_get { filename } {
+proc build_id_debug_filename_get { filename {suffix ".debug"} } {
     set data [get_build_id $filename]
     if { $data == "" } {
 	return ""
     }
     regsub {^..} $data {\0/} data
-    return ".build-id/${data}.debug"
+    return ".build-id/${data}${suffix}"
 }
 
 # DEST should be a file compiled with debug information.  This proc
-- 
2.25.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-06-10 21:51 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-17 14:18 [PATCH 0/4] More build-id checking when opening core files Andrew Burgess
2024-05-17 14:18 ` [PATCH 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
2024-05-17 14:18 ` [PATCH 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
2024-05-17 14:18 ` [PATCH 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
2024-05-17 14:19 ` [PATCH 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
2024-05-20 13:08 ` [PATCHv2 0/4] More build-id checking when opening core files Andrew Burgess
2024-05-20 13:08   ` [PATCHv2 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
2024-05-20 13:08   ` [PATCHv2 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
2024-05-20 13:08   ` [PATCHv2 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
2024-05-21 19:00     ` Andrew Burgess
2024-05-20 13:08   ` [PATCHv2 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess
2024-06-10 21:49   ` [PATCHv3 0/4] More build-id checking when opening core files Andrew Burgess
2024-06-10 21:49     ` [PATCHv3 1/4] gdb/corefile: remove unavailable target sections Andrew Burgess
2024-06-10 21:49     ` [PATCHv3 2/4] gdb/corefile: improve file backed mapping handling Andrew Burgess
2024-06-10 21:49     ` [PATCHv3 3/4] gdb: improve shared library build-id check for core-files Andrew Burgess
2024-06-10 21:49     ` [PATCHv3 4/4] gdb: unify build-id to objfile lookup code Andrew Burgess

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).