From: Roland McGrath
To: Sami Wagiaalla
Cc: Jan Kratochvil , archer@sourceware.org, Keith Seitz
Subject: Re: Cross-CU C++ DIE references vs. mangling
In-Reply-To: Sami Wagiaalla's message of Monday, 12 April 2010 14:46:48 -0400 <1271098008.2901.211.camel@localhost>
References: <20100310191833.GA2816@host0.dyn.jankratochvil.net> <20100310193207.GA6147@host0.dyn.jankratochvil.net> <20100311060305.B177A7D5E@magilla.sf.frob.com> <1271098008.2901.211.camel@localhost>
Message-Id: <20100427085011.9ADC0CF70@magilla.sf.frob.com>
Date: Tue, 27 Apr 2010 08:50:00 -0000

Sorry for the delay.  I'm now trying to hose down the various hornets'
nests I stirred up in DWARFland.

> So after a few (really, many) reads of this email I think I can
> summarize the issues and solutions discussed there. I just wanted to
> make sure I have a proper understanding of the issue before filing a
> gcc feature request. So, is this a correct summary:

Ok.  I don't think I stated an actual conclusion, just tried to air all
the nuances needing consideration.  Perhaps appropriate conclusions were
implied by the confluence of nuances, but I did not quite assert any.

> The goal is to help gdb find the proper location for variables where
> declarations and definitions are separated over CU's or so's.

Yes, that's the problem that we started discussing.
In my ramblings, I extended it to consider finding the proper code
address (or function descriptor, as appropriate) for functions too (the
cases with complexity analogous to the examples we've discussed with
variables being C++ methods and namespace-qualified functions).

> - It requires a search of all other CU's/so' to locate the definition.
>   Which is inefficient

"It requires" sounds like this is the only option today.  That's not so.
I think you might be conflating two different things.

One option is to search all other CUs (in all objects) to locate a DIE
that is both a defining declaration and matches the original declaration
DIE of interest.  That is inefficient because it's an exhaustive-search
kind of method.  It's incomplete for all cases where the definition you
are looking for does not have a DWARF CU you can find (due to stripping,
or due to lack of -g at compilation, etc.).

Another option today is to glean an ELF symbol name by one method or
another, and then look for that.  This has two components: coming up
with the symbol name, and searching for it.  The symbol search portion
is presumed to be more efficient than searching through DWARF, though
its largest-granularity scaling problem is the same one of searching
across all the ELF objects.

> but also inaccurate since
>
> - The scope of the declaration can be different from that of the
>   definition (e.g. class members).

That issue per se does not render the grovel-all-CUs method inaccurate,
just more complex than you might think at first blush.  In each CU, you
have to notice each matching non-defining declaration (which does indeed
have the DIE topology matching your original declaration), and check for
defining declarations whose DW_AT_specification points to the match.
This is just a detail of how it is both complex and costly to check all
CUs for a DIE that's an appropriate defining declaration.
> If DW_AT_MIPS_linkage_name is
> available it can be used to resolve this, however

The "glean an ELF symbol name" portion can be done in two ways.  One is
DW_AT_MIPS_linkage_name, which is trivially simple to code for in the
debugger and trivially cheap to extract.  The other is to apply the
language/ABI-specific symbol-name mangling algorithm to the DIE topology
of your original declaration DIE of interest.

If DW_AT_MIPS_linkage_name is available, it supplies the same answer(*)
that you get via mangling based on DIE topology.  It's not that it
"resolves inaccuracy"; it's just that a very simple and cheap procedure
(looking at the attribute) yields the same answer that the much more
complex procedure should yield.

(*) Conversely, I had the impression from Keith that GCC (at least in
the past, and maybe still today) sometimes emitted the wrong mangled
name for DW_AT_MIPS_linkage_name.  That sort of boggles the mind, but it
apparently is an issue of potential concern weighing against using
DW_AT_MIPS_linkage_name.

> - if the definition is in a stripped DSO there is indeed a definition
>   (ELF) but nowhere is there a DW_AT_location pointing to it. Also,

That is true, but it is not a "however" about using
DW_AT_MIPS_linkage_name.  Nor is it a "however" about NOT using
DW_AT_MIPS_linkage_name.  Rather, it is a "however" about using CU
grovelling to find a definition rather than gleaning an ELF symbol name.
If you rely on CU grovelling, you of course only grovel the DWARF CUs
that you have, which might not include the definition.

> - it is possible to have two names defined in two separate so's with
>   the same linkage name. eg:

Yes.  I gave the concrete example for this situation, but I consider it
part of the same point that you can also have two symbols with the same
name inside one object.  For that to happen, either one or both will be
local or hidden symbols, or the two global symbols will be in different
symbol version sets.
To be fair, this could be considered an entirely orthogonal issue.  It
applies here no differently than it does to very simple non-mangled
symbol names (e.g. from C).  If at any point you glean a symbol name and
then look it up directly by name, you can have multiple ambiguous
matches.  However, if you glean a specific ELF symbol--not the name, but
a particular symbol index in a particular ELF symbol table--then you can
disambiguate (with potentially very complex effort, but in theory you
have enough information).

In the vast majority of cases where you have a DWARF CU with a
non-defining declaration to start from, you should be able to glean the
particular ELF symbol in that object to use.  In the object containing
the non-defining declaration itself there will almost always be only one
ELF symbol by that name.  If it's local or has non-default visibility,
you can use it right there--it's the defining symbol you're looking for.
If it's global, then it has a symbol version association that you can
use to drive your ELF symbol search unambiguously.

> Proposed solution:
>
> Teach the compiler to generate a DW_AT_location for a non defining
> declaration that is applicable in that die's scope. That location
> expression would be parallel to the assembly generated for the symbol

I only sort of proposed this, and it's not a complete solution.

> The following part I don't quite understand:
>
> > We could certainly teach GCC to do this.
> > It would then be telling us more pieces of direct truth about the
> > code.  Would that not be the best thing ever?
> > Well, almost.
> >
> > First, what about a defining declaration in a PIC CU? [...]
>
> Why is there a need for a second artificial location-describing die?
> As I understand it declarationhood is specified by the die's nesting
> in the die hierarchy not its DW_AT_location. In other words, what is
> missing in the current way gcc specifies locations for defining
> declarations?
Declarationhood per se (or perhaps we should say "non-definingness") is
specified by the presence of DW_AT_declaration, not by DIE topology.
The issue is that in PIC code, what's a defining declaration in the
source might actually be acting as a non-defining declaration at
runtime.

Every defining declaration serves two purposes.

The first is to describe the declaration.  Just like a non-defining
declaration, this wants to tell the debugger what using this particular
name in this particular context (i.e. containing DIE) means in the
source program.  That's the frame of mind you want when doing things
like expression evaluation in a given context.  This corresponds to how
assembly code is generated in that context to find the address of the
entity described (data address or target PC/function descriptor).

The second purpose is to describe the definition.  This wants to tell
the debugger what piece of memory this definition is providing.  That's
the frame of mind you want when resolving someone else's non-defining
declaration, or when trying to examine initializer values before a
program is running.  This corresponds to how assembly code is generated
to create the definition and (perhaps) initialize it.

In code generated today, for all defining declarations we only have a
description of the definition.  In non-PIC code, that suffices to
describe the declaration, since it's always resolved to that selfsame
definition.  In PIC code, that declaration is resolved like non-defining
declarations and may or may not wind up matching this same definition.

Thus there is the idea for PIC code generation to emit both a
declaration DIE and a definition DIE for each defining declaration.
When looking to evaluate the named variable in that context, you'd use
the one.  When looking for a definition, you'd use the other.

> This summary does not include the part starting with "Before dynamic
> linker startup" to the end of the email.
> Mainly because I am assuming
> that the main use case is after dynamic linker startup.

Well, I have a few problems with assuming that.

Firstly, it's just not the way to do business.  If we're going to
contemplate changing or refining the contract between compiler and
debugger in subtle ways, we don't do it lightly, and we don't consider
just one use case and go and change things purely to satisfy that
purpose.  We need to thoroughly consider what is most correct for each
case we know of, and understand what methods do or don't achieve that.
We may very well decide to trade off better support for some cases
against less perfect support for cases deemed less common or important.
But we'll do that explicitly, after understanding what we're giving up
and what we're getting.

Secondly, is it really the main use case?  Well, maybe it is for
variables.  That is all you actually asked about, but I insist on
answering about what you need to know, not just what you asked.  For
variables, what it doesn't cover is printing initial values (which
ordinarily works today) and setting watchpoints.

The other half of the problem is functions (including methods).  I'll
grant that they are not quite as central in debugger expression
evaluation as variables, but they're important there.  Moreover, they
are key to a use case every bit as important to the debugging experience
as expression evaluation: setting breakpoints.

Finally, after the dynamic linker details, and where I ranted a little
about the functions, near the end is where I came closest to drawing an
actual conclusion.  That putative conclusion is more or less that all
the preceding new ideas are sufficiently incomplete in their own ways
that no such proposals really warrant pursuit, and we might as well
admit we are stuck with mangled symbols and faking the ELF dance as best
we can.
If that's the conclusion, then the only proposals are either to have a
reliable linkage_name attribute and rely on it, or to drop it as useless
and expect to construct a mangled name from DIE topology.

I still wouldn't say I have come to that conclusion quite yet.  I
described (almost) everything I understand about the possibilities and
constraints.  I was hoping for some other folks to gain that
understanding and share their opinions about how it all fits together.


Thanks,
Roland