public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Aaron Merey <amerey@redhat.com>
Cc: Tom Tromey <tom@tromey.com>,
	lsix@lancelotsix.com,
	Aaron Merey via Gdb-patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH 2/3] gdb: Add soname to build-id mapping for corefiles
Date: Thu, 17 Feb 2022 16:01:45 +0000	[thread overview]
Message-ID: <20220217160145.GP2571@redhat.com> (raw)
In-Reply-To: <CAJDtP-ThnxJZYXT5Tf7302nQGqHyoaDvF_DTvWbrVLb7rWkRrw@mail.gmail.com>

* Aaron Merey via Gdb-patches <gdb-patches@sourceware.org> [2021-11-17 16:16:13 -0500]:

> Hi Tom,
> 
> On Wed, Nov 17, 2021 at 9:29 AM Tom Tromey <tom@tromey.com> wrote:
> > If re-opening and then calling bfd_check_format sets the flags
> > correctly, then I suppose the question is just why the call in
> > build_file_mappings isn't doing that:
> >
> >             if (bfd == nullptr || !bfd_check_format (bfd, bfd_object))
> >
> > Possibly the answer is that corelow passes "binary" as the target, but
> > your code does:
> >
> > Aaron> +  gdb_bfd_ref_ptr abfd = gdb_bfd_open (bfd->filename, gnutarget);
> >
> > But then I wonder why "binary" is appropriate in corelow.
> 
> Yes it would be much better if "bfd" was simply initialised with the proper
> fields. Corelow treats bfds in a target-agnostic manner but this shouldn't
> preclude having accurate information in "bfd".
> 
> We could force "bfd" to have a format of bfd_unknown and let
> bfd_check_format figure out the correct format and fields.  This is
> basically what I do in gdb_bfd_read_elf_soname.  But really there
> should be a way to just initialise "bfd" with completely accurate
> contents at the time of creation.  If this doesn't exist maybe now
> is the time to add it.

As I understand it, the code in corelow is making a deliberate choice
_not_ to initialise the bfd with any of the "correct" information.

We want something that can handle both structured files, like shared
libraries, but also unstructured, raw data files that are mapped into
memory.

Once we've opened the bfd in core_target::build_file_mappings, we then
add sections to the bfd to cover those regions that were mapped into
memory.

One option would be to modify core_target::build_file_mappings so that
we first try to open the bfd using `gnutarget` as the type, and only
if that fails (returns nullptr), try opening the bfd with type
binary.  I think in that way you'd find the bfd had all the data you
needed.

The other option, which I might be inclined to go with, would be to
change the API for the new gdb_bfd_read_elf_soname function, make it
instead something like:

  gdb::optional<std::string>
  gdb_bfd_read_elf_soname (const char *filename)
  { ... }

Then you remove the whole ugly, pull the filename from the bfd and
reopen the bfd logic, and instead, you're just opening a file as a
bfd, and reading content from it.  You already have the filename
available in core_target::build_file_mappings.

I have some minor style issues which I've highlighted below.  But I've
only really had a good look at the one part of the patch I discuss
above, (around gdb_bfd_read_elf_soname).  I can try to look at the
rest next week, if nobody else picks this up.

Thanks,
Andrew


> From 275e69ce65aa3d5ee3b017175c73f8536a1fb559 Mon Sep 17 00:00:00 2001
> From: Aaron Merey <amerey@redhat.com>
> Date: Tue, 16 Nov 2021 17:33:52 -0500
> Subject: [PATCH 2/3] gdb: Add soname to build-id mapping for core files
> 
> Since commit aa2d5a422 gdb has been able to read executable and shared
> library build-ids within core files.
> 
> Expand this functionality so that each program_space maintains a map of
> soname to build-id for each shared library referenced in the program_space's
> core file.
> 
> This feature may be used to verify that gdb has found the correct shared
> libraries for core files and to facilitate downloading shared libaries via
> debuginfod.
> ---
>  gdb/corelow.c    | 11 +++++++++++
>  gdb/linux-tdep.c | 38 +++++++++++++++++++++++++++++++++++++-
>  gdb/progspace.c  | 32 ++++++++++++++++++++++++++++++++
>  gdb/progspace.h  | 18 ++++++++++++++++++
>  gdb/solib.c      | 35 +++++++++++++++++++++++++++++++++++
>  gdb/solib.h      |  5 +++++
>  6 files changed, 138 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/corelow.c b/gdb/corelow.c
> index 10942e6af01..d1256a9e99b 100644
> --- a/gdb/corelow.c
> +++ b/gdb/corelow.c
> @@ -282,6 +282,16 @@ core_target::build_file_mappings ()
> 
>         /* Set target_section fields.  */
>         m_core_file_mappings.emplace_back (start, end, sec);
> +
> +	/* If this is a bfd of a shared library, record its soname
> +	   and build id.  */
> +	if (build_id != nullptr)
> +	  {
> +	    gdb::optional<std::string> soname = gdb_bfd_read_elf_soname (bfd);
> +	    if (soname)

if (soname.has_value ())

> +	      current_program_space->set_cbfd_soname_build_id
> +		(std::move (*soname), build_id);
> +	  }
>        });
> 
>    normalize_mem_ranges (&m_core_unavailable_mappings);
> @@ -305,6 +315,7 @@ core_target::close ()
>          comments in clear_solib in solib.c.  */
>        clear_solib ();
> 
> +      current_program_space->clear_cbfd_soname_build_ids ();
>        current_program_space->cbfd.reset (nullptr);
>      }
> 
> diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c
> index e2cff83086a..7325ba67208 100644
> --- a/gdb/linux-tdep.c
> +++ b/gdb/linux-tdep.c
> @@ -44,6 +44,7 @@
>  #include "solib-svr4.h"
> 
>  #include <ctype.h>
> +#include <unordered_map>
> 
>  /* This enum represents the values that the user can choose when
>     informing the Linux kernel about which memory mappings will be
> @@ -1170,6 +1171,23 @@ linux_read_core_file_mappings
>    if (f != descend)
>      warning (_("malformed note - filename area is too big"));
> 
> +  const bfd_build_id *orig_build_id = cbfd->build_id;
> +  std::unordered_map<ULONGEST, const bfd_build_id *> vma_map;
> +  std::unordered_map<char *, const bfd_build_id *> filename_map;
> +
> +  /* Search for solib build-ids in the core file.  Each time one is found,
> +     map the start vma of the corresponding elf header to the build-id.  */
> +  for (bfd_section *sec = cbfd->sections; sec != nullptr; sec = sec->next)
> +    {
> +      cbfd->build_id = nullptr;
> +
> +      if (sec->flags & SEC_LOAD
> +	  && get_elf_backend_data (cbfd)->elf_backend_core_find_build_id
> +	       (cbfd, (bfd_vma) sec->filepos))
> +	vma_map[sec->vma] = cbfd->build_id;
> +    }
> +
> +  cbfd->build_id = orig_build_id;
>    pre_loop_cb (count);
> 
>    for (int i = 0; i < count; i++)
> @@ -1183,8 +1201,26 @@ linux_read_core_file_mappings
>        descdata += addr_size;
>        char * filename = filenames;
>        filenames += strlen ((char *) filenames) + 1;
> +      const bfd_build_id *build_id = nullptr;
> +      auto vma_map_it = vma_map.find (start);
> +
> +      /* Map filename to the build-id associated with this start vma,
> +	 if such a build-id was found.  Otherwise use the build-id
> +	 already associated with this filename if it exists. */

Two spaces after the '.'.

> +      if (vma_map_it != vma_map.end ())
> +	{
> +	  build_id = vma_map_it->second;
> +	  filename_map[filename] = build_id;
> +	}
> +      else
> +	{
> +	  auto filename_map_it = filename_map.find (filename);
> +
> +	  if (filename_map_it != filename_map.end ())
> +	    build_id = filename_map_it->second;
> +	}
> 
> -      loop_cb (i, start, end, file_ofs, filename, nullptr);
> +      loop_cb (i, start, end, file_ofs, filename, build_id);
>      }
>  }
> 
> diff --git a/gdb/progspace.c b/gdb/progspace.c
> index 7080bf8ee27..680c5ba8f0a 100644
> --- a/gdb/progspace.c
> +++ b/gdb/progspace.c
> @@ -18,6 +18,7 @@
>     along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> 
>  #include "defs.h"
> +#include "build-id.h"
>  #include "gdbcmd.h"
>  #include "objfiles.h"
>  #include "arch-utils.h"
> @@ -358,6 +359,37 @@ print_program_space (struct ui_out *uiout, int requested)
>      }
>  }
> 
> +/* See progspace.h.  */
> +
> +void
> +program_space::set_cbfd_soname_build_id (std::string soname,
> +					 const bfd_build_id *build_id)
> +{
> +  m_cbfd_soname_to_build_id[std::move (soname)] = build_id_to_string (build_id);
> +}
> +
> +/* See progspace.h.  */
> +
> +const char *
> +program_space::get_cbfd_soname_build_id (const char *soname)
> +{
> +  gdb_assert (soname);

gdb_assert (soname != nullptr);

> +
> +  auto it = m_cbfd_soname_to_build_id.find (basename (soname));
> +  if (it == m_cbfd_soname_to_build_id.end ())
> +    return nullptr;
> +
> +  return it->second.c_str ();
> +}
> +
> +/* See progspace.h.  */
> +
> +void
> +program_space::clear_cbfd_soname_build_ids ()
> +{
> +  m_cbfd_soname_to_build_id.clear ();
> +}
> +
>  /* Boolean test for an already-known program space id.  */
> 
>  static int
> diff --git a/gdb/progspace.h b/gdb/progspace.h
> index fb348ca7539..01af5a387e3 100644
> --- a/gdb/progspace.h
> +++ b/gdb/progspace.h
> @@ -30,6 +30,7 @@
>  #include "gdbsupport/safe-iterator.h"
>  #include <list>
>  #include <vector>
> +#include <unordered_map>
> 
>  struct target_ops;
>  struct bfd;
> @@ -324,6 +325,19 @@ struct program_space
>    /* Binary file diddling handle for the core file.  */
>    gdb_bfd_ref_ptr cbfd;
> 
> +  /* Associate a core file SONAME with BUILD_ID so that it can be retrieved
> +     with get_cbfd_soname_build_id.  */
> +  void set_cbfd_soname_build_id (std::string soname,
> +				 const bfd_build_id *build_id);
> +
> +  /* If a core file SONAME had a build-id associated with it by a previous
> +     call to set_cbfd_soname_build_id then return the build-id as a
> +     NULL-terminated hex string.  */
> +  const char *get_cbfd_soname_build_id (const char *soname);
> +
> +  /* Clear all core file soname to build-id mappings.  */
> +  void clear_cbfd_soname_build_ids ();
> +
>    /* The address space attached to this program space.  More than one
>       program space may be bound to the same address space.  In the
>       traditional unix-like debugging scenario, this will usually
> @@ -378,6 +392,10 @@ struct program_space
>    /* The set of target sections matching the sections mapped into
>       this program space.  Managed by both exec_ops and solib.c.  */
>    target_section_table m_target_sections;
> +
> +  /* Mapping of a core file's shared library sonames to their
> +     respective build-ids.  */
> +  std::unordered_map<std::string, std::string> m_cbfd_soname_to_build_id;
>  };
> 
>  /* An address space.  It is used for comparing if
> diff --git a/gdb/solib.c b/gdb/solib.c
> index 3947c2d1d2e..c0beea19d7b 100644
> --- a/gdb/solib.c
> +++ b/gdb/solib.c
> @@ -23,6 +23,7 @@
>  #include <fcntl.h>
>  #include "symtab.h"
>  #include "bfd.h"
> +#include "build-id.h"
>  #include "symfile.h"
>  #include "objfiles.h"
>  #include "gdbcore.h"
> @@ -1586,6 +1587,40 @@ gdb_bfd_scan_elf_dyntag (const int desired_dyntag, bfd *abfd, CORE_ADDR *ptr,
>    return 0;
>  }
> 
> +/* See solib.h.  */
> +
> +gdb::optional<std::string>
> +gdb_bfd_read_elf_soname (struct bfd *bfd)
> +{
> +  gdb_assert (bfd != nullptr);
> +
> +  gdb_bfd_ref_ptr abfd = gdb_bfd_open (bfd->filename, gnutarget);
> +
> +  if (abfd == nullptr)
> +    return {};
> +
> +  /* Check that bfd is an ET_DYN ELF file.  */
> +  bfd_check_format (abfd.get (), bfd_object);
> +  if (!(bfd_get_file_flags (abfd.get ()) & DYNAMIC))
> +    return {};
> +
> +  /* Determine soname of shared library.  If found map soname to build-id.  */
> +  CORE_ADDR idx;
> +  if (!gdb_bfd_scan_elf_dyntag (DT_SONAME, abfd.get (), &idx, nullptr))
> +    return {};
> +
> +  struct bfd_section *dynstr = bfd_get_section_by_name (abfd.get (), ".dynstr");
> +  if (dynstr == nullptr || bfd_section_size (dynstr) <= idx)
> +    return {};
> +
> +  /* Read the soname from the string table.  */
> +  gdb::byte_vector dynstr_buf;
> +  if (!gdb_bfd_get_full_section_contents (abfd.get (), dynstr, &dynstr_buf))
> +    return {};
> +
> +  return std::string ((const char *)dynstr_buf.data () + idx);

Missing a space after the cast.

> +}
> +
>  /* Lookup the value for a specific symbol from symbol table.  Look up symbol
>     from ABFD.  MATCH_SYM is a callback function to determine whether to pick
>     up a symbol.  DATA is the input of this callback function.  Return NULL
> diff --git a/gdb/solib.h b/gdb/solib.h
> index c50f74e06bf..51cc047463f 100644
> --- a/gdb/solib.h
> +++ b/gdb/solib.h
> @@ -118,6 +118,11 @@ extern CORE_ADDR gdb_bfd_lookup_symbol_from_symtab (bfd *abfd,
>  extern int gdb_bfd_scan_elf_dyntag (const int desired_dyntag, bfd *abfd,
>                                     CORE_ADDR *ptr, CORE_ADDR *ptr_addr);
> 
> +/* If BFD is an ELF shared object then attempt to return the string
> +   referred to by its DT_SONAME tag.   */
> +
> +extern gdb::optional<std::string> gdb_bfd_read_elf_soname (struct bfd *bfd);
> +
>  /* Enable or disable optional solib event breakpoints as appropriate.  */
> 
>  extern void update_solib_breakpoints (void);


  parent reply	other threads:[~2022-02-17 16:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10  1:47 [PATCH v4 0/3] Add debuginfod core file support Aaron Merey
2021-11-10  1:47 ` [PATCH 1/3] gdb: Add aliases for read_core_file_mappings callbacks Aaron Merey
2021-11-14  2:20   ` Simon Marchi
2021-11-17  3:39     ` Aaron Merey
2021-11-10  1:47 ` [PATCH 2/3] gdb: Add soname to build-id mapping for corefiles Aaron Merey
2021-11-14  2:36   ` Simon Marchi
2021-11-17  3:24     ` Aaron Merey
2021-11-17 14:17       ` Tom Tromey
2021-11-17 21:16         ` Aaron Merey
2022-01-26  1:40           ` Aaron Merey
2022-02-09  2:31             ` Aaron Merey
2022-02-17 16:01           ` Andrew Burgess [this message]
2021-11-10  1:47 ` [PATCH 3/3] PR gdb/27570: missing support for debuginfod in core_target::build_file_mappings Aaron Merey
2021-11-14  2:56   ` Simon Marchi
2021-11-17  3:28     ` Aaron Merey
2022-01-26  1:42       ` Aaron Merey
2022-02-09  2:31         ` Aaron Merey
  -- strict thread matches above, loose matches on Subject: below --
2021-08-12  4:24 [PATCH v3 0/3] Add debuginfod core file support Aaron Merey
2021-08-12  4:24 ` [PATCH 2/3] gdb: Add soname to build-id mapping for corefiles Aaron Merey
2021-08-15 14:51   ` Lancelot SIX
2021-08-17 13:58     ` Simon Marchi
2021-08-19  2:22       ` Aaron Merey
2021-09-29  1:12         ` Aaron Merey
2021-11-04  1:32         ` Simon Marchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220217160145.GP2571@redhat.com \
    --to=aburgess@redhat.com \
    --cc=amerey@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=lsix@lancelotsix.com \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).