public inbox for elfutils@sourceware.org
From: Mark Wielaard <mark@klomp.org>
To: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>,
	elfutils-devel@sourceware.org
Subject: Re: dwarf_aggregate_size doesn't work with arrays in partial CUs
Date: Wed, 29 Sep 2021 16:21:20 +0200
Message-ID: <afebae258ba067f19c025661babc6c341efc49b5.camel@klomp.org>
In-Reply-To: <CAJ7wOOvKDx2TakFm2dA82DmjsyCETuz0gKAR6taorx5eHArTBA@mail.gmail.com>

Hi KJ,

On Sat, 2021-09-25 at 17:21 +1000, KJ Tsanaktsidis via Elfutils-devel
wrote:
> I'm writing a program that uses ptrace to poke at internal OpenSSL
> data structures for another process. I'm using libdw to parse the
> DWARF data for the copy of OpenSSL actually linked in to the target
> process, so I can extract struct offsets, member sizes and the like
> and poke at the right places.
> 
> I've run into an issue where dwarf_aggregate_size can't calculate the
> size of an array, when the array is included in a partial CU
> (DW_TAG_partial_unit). If the array unit includes a DW_AT_upper_bound
> attribute, but not a DW_AT_lower_bound attribute, then
> dwarf_aggregate_size will infer the lower bound based on the
> DW_AT_language attribute of the enclosing CU (i.e. whether the language
> uses zero- or one-based indexing).
> 
> However, the debug symbols I'm looking at for OpenSSL from the Ubuntu
> repositories have the DW_AT_language on the full compilation unit
> entries, but not in the partial ones included in them. This means that
> calling dwarf_aggregate_size on the array type DIE does not work.

That is indeed a problem, since dwarf_aggregate_size offers no other
way to supply the language to use for the dwarf_default_lower_bound
call. And the default is to return a DWARF_E_UNKNOWN_LANGUAGE error.

Maybe we should change the default to assume the lower bound is zero?

> The DWARF spec doesn't really seem to have anything to say on the
> matter (all it says is "A full or partial compilation unit entry may
> have the following attributes", but doesn't say what it logically
> means if an attribute is present on the complete CU but not a partial
> one).

I think it is assumed that it inherits those attributes from the CU
from which the partial one was imported and/or from the CU of the DIE
that referenced the DIE in the partial unit. But I don't think it is
easy to track that with libdw currently.

> I guess it doesn't really make sense for a single compilation unit to
> contain multiple languages? So I wonder if dwarf_srclang (called by
> dwarf_aggregate_size) should crawl through the list of CU's to see if
> the DIE's CU is included in a CU that _does_ specify DW_AT_language
> (recursively, I suppose). Then, we can infer that the partial CU's
> language is the same as the enclosing one.
> 
> If people reckon this is a good idea (or, have a better one!), I'm
> happy to try and put together a patch.

I think that suggestion is sound, but really expensive. It is also
somewhat tricky if you have alt files: you would have to track back to
the original Dwarf to see whether it imports one of the partial units
from the alt file.
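
To illustrate the cost, here is a toy model of the crawl, not written
against libdw. struct unit, NO_LANG and inherited_lang are hypothetical
names; a unit is just an optional language plus the indices of the
units it imports, and alt files are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IMPORTS 4
#define NO_LANG (-1)

/* Toy stand-in for a (partial) unit DIE; not a libdw type.  */
struct unit
{
  int lang;			/* DW_AT_language value, or NO_LANG.  */
  size_t nimports;		/* Number of valid entries in IMPORTS.  */
  size_t imports[MAX_IMPORTS];	/* Indices of the units this one imports.  */
};

/* Return the language of unit TARGET, or else of the first unit
   found that imports it, directly or transitively.  DEPTH bounds
   the recursion so that import cycles terminate.  Returns NO_LANG
   if nothing along the way specifies a language.  */
int
inherited_lang (const struct unit *units, size_t nunits,
		size_t target, int depth)
{
  assert (units != NULL && target < nunits);

  if (units[target].lang != NO_LANG)
    return units[target].lang;
  if (depth == 0)
    return NO_LANG;

  /* Scan every unit for an import of TARGET.  */
  for (size_t i = 0; i < nunits; i++)
    for (size_t j = 0; j < units[i].nimports; j++)
      if (units[i].imports[j] == target)
	{
	  int lang = inherited_lang (units, nunits, i, depth - 1);
	  if (lang != NO_LANG)
	    return lang;
	}

  return NO_LANG;
}
```

Each level of the search scans all imports of all units, which is the
expensive part. A real version would presumably want to cache the
result per unit.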

But I also don't have a good alternative idea. We could maybe have a
variant of dwarf_aggregate_size that takes a language default value,
but that doesn't seem like a very generic solution. Or maybe a variant
of dwarf_srclang that takes any DIE (not just a CU DIE) and tries to
figure out the best language to use? If it cannot figure out the
language, it would fall back to some default value that can be passed
to dwarf_default_lower_bound to get a default lower bound (most likely
zero).

We could also ask producers (like dwz) to always include a
DW_AT_language for partial units they create. But that of course makes
the partial units bigger (and at least dwz creates them to make the
full debuginfo smaller).

Cheers,

Mark
