Subject: Re: dwarf_aggregate_size doesn't work with arrays in partial CUs
From: Mark Wielaard
To: KJ Tsanaktsidis, elfutils-devel@sourceware.org
Date: Wed, 29 Sep 2021 16:21:20 +0200
List-Id: Elfutils-devel mailing list

Hi KJ,

On Sat, 2021-09-25 at 17:21 +1000, KJ Tsanaktsidis via Elfutils-devel
wrote:
> I'm writing a program that uses ptrace to poke at internal OpenSSL
> data structures for another process.
> I'm using libdw to parse the
> DWARF data for the copy of OpenSSL actually linked in to the target
> process, so I can extract struct offsets, member sizes and the like
> and poke at the right places.
>
> I've run into an issue where dwarf_aggregate_size can't calculate the
> size of an array, when the array is included in a partial CU
> (DW_TAG_partial_unit). If the array unit includes a DW_AT_upper_bound
> attribute, but not a DW_AT_lower_bound attribute, then
> dwarf_aggregate_size will infer the lower bound based on the
> DW_AT_language attribute of the enclosing CU (i.e. whether the language
> uses zero- or one-based indexing).
>
> However, the debug symbols I'm looking at for OpenSSL from the Ubuntu
> repositories have the DW_AT_language on the full compilation unit
> entries, but not in the partial ones included in them. This means that
> calling dwarf_aggregate_size on the array type DIE does not work.

That is indeed a problem, since dwarf_aggregate_size doesn't provide
another way to supply the language to use for the
dwarf_default_lower_bound call. And the default is to return a
DWARF_E_UNKNOWN_LANGUAGE error. Maybe we should change the default to
assume the lower bound is zero?

> The DWARF spec doesn't really seem to have anything to say on the
> matter (all it says is "A full or partial compilation unit entry may
> have the following attributes", but doesn't say what it logically
> means if an attribute is present on the complete CU but not a partial
> one).

I think it is assumed that it inherits those attributes from the CU
from which the partial one was imported and/or from the CU of the DIE
that referenced the DIE in the partial unit. But I don't think it is
easy to track that with libdw currently.

> I guess it doesn't really make sense for a single compilation unit to
> contain multiple languages?
> So I wonder if dwarf_srclang (called by
> dwarf_aggregate_size) should crawl through the list of CUs to see if
> the DIE's CU is included in a CU that _does_ specify DW_AT_language
> (recursively, I suppose). Then, we can infer that the partial CU's
> language is the same as the enclosing one.
>
> If people reckon this is a good idea (or, have a better one!), I'm
> happy to try and put together a patch.

I think that suggestion is sound, but really expensive. It is also
somewhat tricky if you have alt files: you'll have to track back to
the original Dwarf to see if it imports one of the partial units from
the alt file.

But I also don't have a good alternative idea. We could maybe have a
variant of dwarf_aggregate_size that takes a language default value,
but that doesn't seem like a very generic solution. Or maybe a variant
of dwarf_srclang that takes any DIE (not just a CU DIE) and tries to
figure out the best language to use, falling back to some default
value if it cannot? That result could then be used with
dwarf_default_lower_bound to get a default lower bound (most likely
zero).

We could also ask producers (like dwz) to always include a
DW_AT_language for partial units they create. But that of course makes
the partial units bigger (and at least dwz creates them to make the
full debuginfo smaller).

Cheers,

Mark