public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
From: Omar Sandoval <osandov@osandov.com>
To: Mark Wielaard <mark@klomp.org>
Cc: elfutils-devel@sourceware.org
Subject: Re: [PATCH 08/14] libdw: Parse DWARF package file index sections
Date: Mon, 6 Nov 2023 22:45:08 -0800	[thread overview]
Message-ID: <ZUnc9C74lrMSN0vk@telecaster> (raw)
In-Reply-To: <20231101230704.GN8429@gnu.wildebeest.org>

On Thu, Nov 02, 2023 at 12:07:04AM +0100, Mark Wielaard wrote:
> Hi Omar,
> 
> On Wed, Sep 27, 2023 at 11:20:57AM -0700, Omar Sandoval wrote:
> > The .debug_cu_index and .debug_tu_index sections in DWARF package files
> > are basically hash tables mapping a unit's 8 byte signature to an offset
> > and size in each section used by that unit [1].  Add support for parsing
> > and doing lookups in the index sections.
> > 
> > We look up a unit in the index when we intern it and cache its hash
> > table row in Dwarf_CU.
> 
> This looks good. Thanks for the various testcases.
> 
> Do I understand correctly that binutils dwp only does the DWARF4 GNU
> extension and llvm-dwp only does the DWARF5 standardized format? Maybe
> we should create a eu-dwp that does both.

llvm-dwp can do both the DWARF4 GNU format and the DWARF5 format.
binutils dwp can only do DWARF4.

> It looks like you are very careful checking boundaries, but it would
> be bad to do some fuzzing on this.
> 
> > Then, a new function, dwarf_cu_dwp_section_info,
> > can be used to look up the section offsets and sizes for a unit.  This
> > will mostly be used internally in libdw, but it will also be needed in
> > static inline functions shared with eu-readelf.  Additionally, making it
> > public it makes dwp support much easier for external tools that do their
> > own low-level parsing of DWARF information, like drgn [2].
> 
> Although I am not against this new interface, I am not super
> enthousiastic about it.
> 
> There is one odd part, DW_SECT_TYPES should be defined in dwarf.h with
> the other DW_SECT constants, otherwise it cannot be used (unless you
> define it yourself to 2).

Yeah, I wasn't sure about adding it or not since it's not technically
defined in the DWARF5 standard. But if we can count on future DWARF
standards not reusing the value 2 (which we probably can), I agree that
it seems better to add it to dwarf.h.

> For eu-readelf we don't really need it being public, it already cheats
> a little and uses some (non-public) libdwP.h functions. That is
> actually not great either. So if there is a public function that is
> available that is actually preferred. But if it is just for
> eu-readelf, I don't think it is really needed. And it seems you
> haven't actually added the support to eu-readelf to print these
> offsets.

Right, eu-readelf only uses it indirectly via the static inline
functions modified in patch 13. It looks like eu-readelf dynamically
links to libdw (on Fedora, at least), so either some part of the dwp
implementation needs to be exported, or a big chunk of it needs to be
inlined in libdwP.h, right?

Either way, I'd still love to have this interface for drgn.

> It is fine to expose these offsets and sizes, but how exactly are you
> using them in drgn? It seems we don't have any other interfaces in
> libdw that you can then use these with.

Here's the branch I linked to in my cover letter:
https://github.com/osandov/drgn/tree/dwp. I need this interface for the
couple of places where drgn parses DWARF itself.

One of those places is in the global name indexing step drgn does at
startup. We do this by enumerating all CUs using libdw, then using our
own purpose-built DWARF parser to do a fast, parallelized scan.

Our bespoke parser needs to know the base of the abbreviation table and
the string offsets table, which is easy without dwp. This interface also
makes it easy with dwp. Without this interface, I'd need to reimplement
the CU index parser in drgn :(

> Can we split off this public interface from the rest of this patch?
> But then we also need to split off the tests. So maybe keep them together?

Yeah, I included the interface in this patch exactly so that I could
test the implementation in the same patch. I'm happy to split that up
however you'd prefer.

[snip]

> > +  Dwarf_Package_Index *index = malloc (sizeof (*index));
> > +  if (index == NULL)
> > +    {
> > +      __libdw_seterrno (DWARF_E_NOMEM);
> > +      return NULL;
> > +    }
> > +
> > +  index->dbg = dbg;
> > +  /* Set absent sections to UINT32_MAX.  */
> > +  memset (index->sections, 0xff, sizeof (index->sections));
> 
> OK, although I would have preferred a simple for loop and let the
> compiler optimize it.

Ok, I can fix that.

[snip]

> > +static int
> > +__libdw_dwp_section_info (Dwarf_Package_Index *index, uint32_t unit_row,
> > +			  unsigned int section, Dwarf_Off *offsetp,
> > +			  Dwarf_Off *sizep)
> > +{
> > +  if (index == NULL)
> > +    return -1;
> > +  if (unit_row == 0)
> > +    {
> > +      __libdw_seterrno (DWARF_E_INVALID_DWARF);
> > +      return -1;
> > +    }
> 
> OK, but why isn't the caller checking this?

Partially becase it simplified callers like __libdw_dwp_findcu_id to not
require a separate check, and partially just to be defensive.

[snip]

Thanks for taking a look!

  reply	other threads:[~2023-11-07  6:45 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27 18:20 [PATCH 00/14] elfutils: DWARF package (.dwp) file support Omar Sandoval
2023-09-27 18:20 ` [PATCH 01/14] libdw: Make try_split_file static Omar Sandoval
2023-10-03 16:09   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 02/14] libdw: Handle split DWARF in dwarf_entrypc Omar Sandoval
2023-10-03 16:10   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 03/14] libdw: Handle DW_AT_ranges in split DWARF 5 skeleton in dwarf_ranges Omar Sandoval
2023-10-03 16:10   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 04/14] libdw: Handle other string forms in dwarf_macro_param2 Omar Sandoval
2023-10-03 16:11   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 05/14] libdw: Fix dwarf_macro_getsrcfiles for DWARF 5 Omar Sandoval
2023-10-03 21:29   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 06/14] libdw: Handle split DWARF in dwarf_macro_getsrcfiles Omar Sandoval
2023-10-03 21:49   ` Mark Wielaard
2023-09-27 18:20 ` [PATCH 07/14] libdw: Recognize .debug_[ct]u_index sections in dwarf_elf_begin Omar Sandoval
2023-11-01 14:03   ` Mark Wielaard
2023-11-01 17:30     ` Omar Sandoval
2023-09-27 18:20 ` [PATCH 08/14] libdw: Parse DWARF package file index sections Omar Sandoval
2023-11-01 23:07   ` Mark Wielaard
2023-11-07  6:45     ` Omar Sandoval [this message]
2023-09-27 18:20 ` [PATCH 09/14] libdw, libdwfl: Save original path of ELF file Omar Sandoval
2023-11-02 17:04   ` Mark Wielaard
2023-11-07  6:46     ` Omar Sandoval
2023-09-27 18:20 ` [PATCH 10/14] libdw: Try .dwp file in __libdw_find_split_unit() Omar Sandoval
2023-11-02 19:56   ` Mark Wielaard
2023-11-07  7:07     ` Omar Sandoval
2023-09-27 18:21 ` [PATCH 11/14] tests: Handle DW_MACRO_{define,undef}_{strx,sup} in dwarf-getmacros Omar Sandoval
2023-11-02 20:30   ` [PATCH 11/14] tests: Handle DW_MACRO_{define, undef}_{strx, sup} " Mark Wielaard
2023-09-27 18:21 ` [PATCH 12/14] tests: Optionally dump all units " Omar Sandoval
2023-11-02 20:39   ` Mark Wielaard
2023-09-27 18:21 ` [PATCH 13/14] libdw: Apply DWARF package file section offsets where appropriate Omar Sandoval
2023-11-02 22:20   ` Mark Wielaard
2023-11-07 16:55     ` Omar Sandoval
2023-09-27 18:21 ` [PATCH 14/14] libdw: Handle overflowed DW_SECT_INFO offsets in DWARF package file indexes Omar Sandoval
2023-09-27 19:20 ` [PATCH 00/14] elfutils: DWARF package (.dwp) file support Frank Ch. Eigler
2023-09-27 20:17   ` Omar Sandoval
2023-10-03 21:54 ` Mark Wielaard
2023-11-02 23:05 ` Mark Wielaard
2023-11-07 17:13   ` Omar Sandoval

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUnc9C74lrMSN0vk@telecaster \
    --to=osandov@osandov.com \
    --cc=elfutils-devel@sourceware.org \
    --cc=mark@klomp.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).