From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============4809218731124614329==" MIME-Version: 1.0 From: Petr Machata To: elfutils-devel@lists.fedorahosted.org Subject: Re: [PATCH][v2] Support .debug_macro Date: Mon, 29 Sep 2014 17:26:11 +0200 Message-ID: In-Reply-To: 1411731889.5074.27.camel@bordewijk.wildebeest.org --===============4809218731124614329== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Mark Wielaard writes: > On Fri, 2014-09-12 at 20:03 +0200, Petr Machata wrote: >> - dwarf_getmacros is left to support only .debug_macinfo. The only >> problematic opcode really is DW_MACINFO_vendor_ext > > You could suggest that the standard reserves that value to enable > consumers to use the same parsers for both. That's worth a try. > Could you also include a small testcase? That could then also be used as > an example. I don't think it is immediately clear how you would use the > dual style die/off accessors. So an test/example would be helpful. I posted a test case patch with my previous submission. That's still active, but I didn't ping it until this one gets through: https://lists.fedorahosted.org/pipermail/elfutils-devel/2014-Septem= ber/004143.html > And I was wondering if having a readelf --debug-dump=3Ddecodedmacro using > this interface was useful. The raw macro dump doesn't handle transparent > includes (you don't have to implement that, but maybe it is useful to do > if you do decide to include a test/example anyway). Sure, why not. I actually think readelf should just use libdw/libelf interfaces instead of duplicating the decoding logic, though that's not what you are proposing. > Why you decide to a include the version and line_offset interfaces on > Dwarf_Macro? Isn't that a CU DIE concept? (See also below, maybe I just > misunderstand the data representation.) Both version and line offset belong to macro units (not CU's, each macro unit may set its own .debug_line offset, and macro units have their own version). But there's a question of how to get to that value. You need version to interpret the opcode, and you need .debug_line offset to interpret file names. One alternative is to have an interface like this: extern ptrdiff_t dwarf_getmacros_off (Dwarf *dbg, Dwarf_Off macoff, int (*callback) (Dwarf_Macro *, void *), void *arg, ptrdiff_t token, unsigned int *versionp, Dwarf_Off *line_offp); ... which would set *VERSIONP and *LINE_OFFP before invoking CALLBACK. Then use it like this: struct data { some stuff; unsigned int version; Dwarf_Off line_off; }; int callback (Dwarf_Macro *macro, void *user) { // Cast user to struct data, use version and line_off. } struct data data =3D {...}; dwarf_getmacros_off (dbg, off, callback, &data, token, &data.version, &data.line_off); But arranging this exchange smells of reinventing the wheel every time you need to use the interface, and the interface is not extensible (if we need to pass one more thing, we need to add one more parameter). You could also extend the callback signature and just pass them down directly. That's much better in my opinion, but again has the extensibility problem. For this reason I thought it best to just carry these per-unit variables with macros. That addresses both the extensibility (for each new thing that you need to pass around, you add one more private field, and one more public interface), as well as the don't-repeat-yourself concerns. The obvious downside is that you'll never know what these values are for macro-less macro unit, but for macro-less unit you don't care. Yet another possibility is to have a type like Dwarf_Macro_Unit, akin to Dwarf_CU, which could be accessed from macro, or possibly be passed together with macro to the callback, and which would carry these things, and possibly more--like the prototype table. I didn't think it necessary to go to these lengths. >> +static inline Dwarf_Macro_Op_Table * >> +get_macinfo_table (void) >> +{ >> + static Dwarf_Macro_Op_Table *ret =3D NULL; >> + if (unlikely (ret =3D=3D NULL)) >> + ret =3D init_macinfo_table (); >> + return ret; >> +} > > The statics here make me a little uncomfortable. What if the user is > working with multiple Dwarfs at the same time iterating through the > simultaneously (maybe to compare the macros each defines in their CUs). > Won't they overwrite each others Macro_Op_Table? This is a singleton macro prototype table for .debug_macinfo (not .debug_macro) section. It's the same in all Dwarf's. The reason it's there is just for uniformness of decoding both types of sections. So there's no problem if you have more than one Dwarf. One problem this could cause however is that in a multi-threaded application, there's a race between the threads for initialization of this global resource. If this is deemed a problem, we could do this: __thread Dwarf_Macro_Op_Table *ret =3D NULL; if (unlikely (ret =3D=3D NULL)) { lock; static Dwarf_Macro_Op_Table *global_table =3D NULL; if (global_table =3D=3D NULL) global_table =3D init_macinfo_table (); ret =3D global_table; unlock; } This is essentially double-checked locking, but that's not thread-safe, so I make ret thread-local, and always lock before touching the global resource. That should be both thread-safe, have reasonable performance, and not waste memory by storing one table for each thread. Initially I dismissed this problem because we all know that libdw is thread unsafe anyway. But here we potentially get thread-unsafe behavior even if we touch different Dwarf's, which might be unexpected. May be worth reconsidering. >> - switch (opcode) >> + /* A fake CU with bare minimum data to fool dwarf_formX into >> + doing the right thing with the attributes that we put >> + out. */ >> + Dwarf_CU fake_cu =3D { >> + .dbg =3D dbg, >> + .version =3D 4, >> + .offset_size =3D table->is_64bit ? 8 : 4, >> + }; > > Do we want/need to match the version here to match the CU DIE this macro > table came from? Might be hard to get at at this point though. Maybe > match it to the Macro_Op_Table version? The problem is that some macro units don't have a related CU per se. They could be accessed through offset from other macro units. There typically will be more than one CU that these independent units logically belong to, otherwise this transparent inclusion scheme wouldn't be advantageous. But I guess we should match the CU version to whatever corresponds to the .debug_macro version that this macro comes from (the two versions can in theory evolve independently. It's hard to imagine something like that happening, but there could be a version of Dwarf that doesn't change .debug_info at all, but does change .debug_macro). >> + Dwarf_Macro macro =3D { >> + .line_offset =3D table->line_offset, >> + .version =3D table->version, >> + .opcode =3D opcode, >> + .nargs =3D proto->nforms, >> + .attributes =3D attributes, >> + }; > > Couldn't you just store the table, opcode and attributes in Dwarf_Macro? > The rest can then be retrieved from the associated table. Isn't that what I'm doing? I guess I don't understand what you mean. > Although it is common for callback based functions in libdw I would > explicitly say that the Dwarf_Macro is only valid during the callback > and no information can be retrieved using the pointer after the callback > returns. OK, agreed. Thanks, PM --===============4809218731124614329==--