To: gdb
Subject: dwarves, hierarchies, and cross-references
From: David Carlton
Date: Fri, 09 May 2003 23:14:00 -0000

Right now the DWARF 2 symbol reader basically proceeds in a
hierarchical fashion: it starts at the DW_TAG_compile_unit entry, then
reads its children, and while reading those children reads their
children, and so forth.

This is good for building up fully qualified names: e.g. if I have
code like

  namespace N {
    class C {
      void foo() {}
    };
  }

then it's easy to remember that, when generating the info for 'foo',
we're within a context called 'N::C', so we should really call it
'N::C::foo'.

But that's not the whole story: sometimes one DIE refers to another
DIE somewhere else in the hierarchy.  Typically (always?), these other
DIEs are used to provide type info.  So an example might be:

  namespace N {
    class C {
    public:
      class E {};
    };

    class D : public C::E {};
  }

Here, N::D has a DW_TAG_inheritance entry that references N::C::E's
DIE.
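To make the hierarchical naming concrete, here is a minimal sketch of
a top-down walk that carries the enclosing context along.  ToyDie,
collect_names and example_names are made-up names for illustration;
this is not GDB's die_info or the real reader, just the shape of the
idea under those assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a DWARF DIE; not GDB's actual data structure.
struct ToyDie {
    std::string name;              // unqualified name, e.g. "C"
    std::vector<ToyDie> children;  // nested DIEs, in emission order
};

// Walk the tree top-down, carrying the enclosing context along, and
// record each DIE's fully qualified name in visiting order.
void collect_names(const ToyDie &die, const std::string &context,
                   std::vector<std::string> &out) {
    assert(!die.name.empty());  // this sketch ignores anonymous DIEs
    std::string qualified =
        context.empty() ? die.name : context + "::" + die.name;
    out.push_back(qualified);
    for (const ToyDie &child : die.children)
        collect_names(child, qualified, out);
}

// Builds the toy tree for: namespace N { class C { void foo() {} }; }
std::vector<std::string> example_names() {
    ToyDie foo{"foo", {}};
    ToyDie c{"C", {foo}};
    ToyDie n{"N", {c}};
    std::vector<std::string> out;
    collect_names(n, "", out);
    return out;
}
```

The context is only known because we arrived at each DIE through its
parents; a DIE reached by a cross-reference has no such context, which
is where the trouble starts.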
Now, if I've already traversed N::C::E before traversing N::D, I
probably already know everything about N::C::E.  But if the compiler
happens to emit the info for N::D before the info for N::C (and hence
N::C::E), things get hairier: the reader wants to find info about this
class called E, and it's hard to see how the reader will know that the
class is really N::C::E (as opposed to, say, E or N::D::E or N::E or
something).

My branch gets this wrong: in situations like the above, it frequently
thinks that D has a base class called N::D::E.  Fortunately, the
reader later generates a correctly named version of the debug info for
the class, so this isn't the end of the world, but it's an unfortunate
situation, because the wrong name lingers in places.

Any suggestions?  Here are some options that I've considered:

* Once we notice that we put E in the wrong context, update everybody
  who has been misled by this.  This seems complicated and potentially
  fragile to me: exactly what data would we have to maintain to make
  this work?

* When parsing E via a cross-reference, figure out its context, so we
  can name it correctly.  This seems like a plausible idea to me; I'm
  only worried that it might be a little inefficient at times.

* Break up the symbol reading into a two-stage process: first, go
  through the hierarchy of DIEs enough to initialize their type fields
  with a bare-bones type, containing enough info for future
  cross-references to be able to use it.  (What exactly needs to be
  filled in?  Certainly the name field; does anything else need to be
  filled in?)  Then go through the hierarchy a second time, filling in
  everything completely.

I think I like the third option the best.  But I'm worried that it
won't be clear what information has to be filled in on the first pass,
and that it also won't be clear exactly how far the first pass will
have to descend the tree; also, it could lead to lots of code
duplication.
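For what it's worth, here is a rough sketch of the two-stage idea on a
toy tree.  Everything in it is an assumption made for illustration:
ToyDie, its offset and base_ref fields, NameTable, and the pass
functions are invented, and the first pass records nothing but the
qualified name, which may well be less than a real bare-bones type
would need.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DIE; field names are illustrative, not GDB's.
struct ToyDie {
    int offset;                    // unique id, playing the role of a DIE offset
    std::string name;              // unqualified name
    int base_ref;                  // offset of a referenced base class, or -1
    std::vector<ToyDie> children;  // nested DIEs, in emission order
};

using NameTable = std::map<int, std::string>;

static std::string qualify(const std::string &context, const std::string &name) {
    return context.empty() ? name : context + "::" + name;
}

// Pass 1: record only the qualified name of every DIE, keyed by offset.
void first_pass(const ToyDie &die, const std::string &context, NameTable &names) {
    std::string qualified = qualify(context, die.name);
    names[die.offset] = qualified;
    for (const ToyDie &child : die.children)
        first_pass(child, qualified, names);
}

// Pass 2: resolve cross-references by offset; every target already has a name.
void second_pass(const ToyDie &die, const NameTable &names,
                 std::map<std::string, std::string> &bases) {
    if (die.base_ref >= 0) {
        auto it = names.find(die.base_ref);
        assert(it != names.end());  // pass 1 visited every DIE
        bases[names.at(die.offset)] = it->second;
    }
    for (const ToyDie &child : die.children)
        second_pass(child, names, bases);
}

// Toy DIE order N { D, C { E } }: D is emitted before C, and D's
// inheritance entry points at E's offset.
std::map<std::string, std::string> example_bases() {
    ToyDie e{3, "E", -1, {}};
    ToyDie c{2, "C", -1, {e}};
    ToyDie d{4, "D", 3, {}};
    ToyDie n{1, "N", -1, {d, c}};
    NameTable names;
    first_pass(n, "", names);
    std::map<std::string, std::string> bases;
    second_pass(n, names, bases);
    return bases;
}
```

In this sketch D's base comes out as N::C::E even though D is visited
first, because the second pass never has to guess a context.  The cost
is visible too: both passes repeat the same recursive walk.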
(We already look at the tree once for psymtabs and once for symtabs;
breaking the latter up into two passes would make that even worse.)

If the third option doesn't work, I think the second option should
work: the patches on my branch to set names properly are probably the
only place where we really depend on traversing the hierarchy in
order, in which case tackling the name issue head-on is a sensible
approach.  (Hmm.  Maybe I like the second approach the best.)

Comments?  Suggestions?  Is this explanation of the problem clear at
all?

David Carlton
carlton@bactrian.org