From: Roland McGrath
To: Jan Kratochvil
Cc: archer@sourceware.org, Sami Wagiaalla, Keith Seitz
Subject: Re: Cross-CU C++ DIE references vs. mangling
In-Reply-To: Jan Kratochvil's message of Wednesday, 10 March 2010 20:32:07 +0100 <20100310193207.GA6147@host0.dyn.jankratochvil.net>
References: <20100310191833.GA2816@host0.dyn.jankratochvil.net> <20100310193207.GA6147@host0.dyn.jankratochvil.net>
Message-Id: <20100311060305.B177A7D5E@magilla.sf.frob.com>
Date: Thu, 11 Mar 2010 06:03:00 -0000

> In this case if it see DW_AT_external + DW_AT_declaration it can also global
> "S::i" defining DIE so DW_AT_MIPS_linkage_name is probably really not needed.

I'm not quite sure I'm following you. I think you meant "... it can also see the global ..." here. So, yes, it can look through all other CUs looking for a defining declaration with a DW_AT_location. It matches another CU's version of the same thing by looking at the path of levels to a .

You already know this, but I'll mention another wrinkle about that. For C++ namespaces, I think all definitions do indeed have to be inside a DW_TAG_namespace scope with matching name. So there you are looking for exactly the sequence of nested that matches. But the general case also includes class members (both methods and static member variables).
For these, the defining declaration can be lexically outside the declaration scope, either at top-level (child of CU) or inside matching DW_TAG_namespace scopes but outside the matching DW_TAG_class_type (et al). To find that case, you need to remember each DIE that matches the scope/name path but has only DW_AT_declaration. Then look at each DIE with an appropriate tag for a DW_AT_specification pointing to a matched DIE. (Here "appropriate" means DW_TAG_variable for DW_TAG_variable, DW_TAG_variable for DW_TAG_member, and DW_TAG_subprogram for DW_TAG_subprogram.)

However, you might well never find one. Consider when the definition is in a stripped DSO. (For simplicity, let's say your reference is in another DSO, so it's PIC code. I'll mention that wrinkle a little later.) (It could even just be in a single stripped (or -g0 compiled) object in the referring object, given obtuse build situations.) Then there is indeed a definition to find, but nowhere a DW_AT_location pointing to it. In that case, you really have no recourse but to discern the ELF symbol name and look that up. If you don't have (or don't trust) DW_AT_MIPS_linkage_name, then you need to construct the name correctly with the mangling algorithm applied to your DWARF-derived understanding of the declaration scope in C++ terms.

Another way to think about the subject is that there are two fundamental ways to go about answering the question, "What does DIE 0x123 refer to?" One is holistic and source-centric. The other is "local" and code-centric.

The first is what we've talked about above. We take the DWARF to describe the semantic structure of the code in source-language terms. We then look at all the CUs together and take them to form a single coherent whole as if we were a human reading all the bare source code. From that perspective, we can say, "S::i means S::i. There's an S::i reference over here, and an S::i definition over there, and there we are."
Unfortunately, that only works, at best, as well as a human reading the (preprocessed) source code--without reading the build rules, arcane linking circumstances, etc.

The second is what is embodied in theory by using DW_AT_MIPS_linkage_name all the time. We take the DWARF in a single (logical) CU to tell us exactly and only about what the compiler producing that one CU thought about the code and/or data actually emitted in that same translation unit (i.e. .o file from $compiler -c, pedantically distinct from a DWARF CU). It's local in that you confine your consideration to what the compiler knew, when it knew it (before linking). Whereas the first approach is source-centric in that you use DWARF to imagine what the source looked like and then apply source-language rules to understand it, this is code-centric in that you use DWARF to imagine what the assembly looked like and apply ELF linking rules to understand what it means at run time.

The hairy issues of the latter approach are orthogonal to the question of DW_AT_MIPS_linkage_name itself. The crux of that approach is that you are using the DWARF to surmise what a reference in assembly looks like. The sole purpose of DW_AT_MIPS_linkage_name is to make it trivial to surmise that because the compiler is just telling you, and it should well know. But if the compiler lies, it's useless. For all the issues I'm raising today, it's entirely equivalent if you get the right answer by using the DWARF to feed the mangling algorithm as the way to surmise what the assembly must have been. It's just not equivalently trivial to do. :-)

I started with a mention of the case where you can't possibly use a holistic source-centric approach: ya just ain't got all the source. That reveals the crux of the inadequacy of that approach, but it's only the tip of the iceberg. Not all CUs are created equal (some are born without the benefit of -g), and not all CUs weather the slings and arrows equally (some are mercilessly stripped bare).
But beyond that, still others have equal faculties to their brethren and appear equal yet are separate and different, not the same at all. In those cases you can have all the imagined source code you think you need, but get the wrong answers because that's not really everything you need to know.

If you remember a page ago, I made an aside about everything being PIC simplifying things. (Go figure! It sure never simplifies understanding the assembly code! But it really does simplify the situation here.) Consider a case where there is a non-PIC reference:

    $ gcc -g -xc <(echo 'int foo = 23;') -shared -o foo.so
    $ gcc -g -xc <(echo 'extern int foo;main(){return foo;}') -c -o bar.o
    $ gcc -o bar bar.o foo.so
    $ eu-readelf -sr -winfo bar foo.so

Now, if you look at all those CUs there is exactly one "foo" that has a DW_AT_location. That's in foo.so, and it gives the address of foo.so's definition of "foo" in its .data section. This address is also foo.so's st_value for "foo" in its .dynsym and .symtab. (If foo.so is stripped, you may have only .dynsym.)

But that's the wrong address. The real address is the space allocated for "foo" in the main executable (bar). The CU doesn't say that. It just says DW_AT_external, DW_AT_declaration. That's the truth the compiler knew, when it was not yet even possible to know whether "foo" would have a real definition in another CU in the same executable, or would instead have an implicit definition created by the linker. That's in fact what we have here. The resolution of "foo" from the bar.o CU is an address in bar's own .bss, for a word that never existed in bar.o at all. The linker adds the word, a R_*_COPY reloc to it, and a "foo" symbol with that st_value. (This tells the dynamic linker to look up foo.so's "foo" at startup time and copy its initializer (23) to bar's copy.)

But it's worse than that!
That's not only what "foo" means to the code in the executable, it's also what "foo" actually means to the code in foo.so itself, the code in the very CU that purports to (and does!) provide the defining declaration. So even though it gave you a DW_AT_location, you cannot stop there and take that as the explanation.

The compiler didn't lie, but it told you something quite subtle. It gave DW_AT_location, but also DW_AT_external. This means exactly, "I defined it in the assembly, but the linker is also involved." (You can read in, "... and good luck with that, sucker!") Since the linker is involved, this means you have to figure out what the assembly looked like (i.e. correct mangling or whatever), and then figure out what the linker did with that, and then figure out what that meant to the dynamic linker.

This is related to two other cases, with symbol visibility, and with symbol versioning. These are ways that are similar in how you have to look at the compiler's perspective through the lens of the linkers that followed, and in that there are two things that from CUs alone look like they go together. In the PIC case, they really do go together in the abstract view--there is just a purely mechanical wrinkle that could mislead you. In these cases, you can have things that the DWARF identifies just the same in multiple CUs, but that actually aren't the same in the slightest!
Consider:

    $ g++ -g -c -fPIC -o foo1.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { int i; };')
    $ g++ -g -c -fPIC -o foo2.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { extern int i; }; int foo () { return internal::i; }')
    $ gcc -g -shared -o foo.so foo1.o foo2.o
    $ g++ -g -c -fPIC -o bar1.o -xc++ <(echo 'namespace internal { int i; };')
    $ g++ -g -c -fPIC -o bar2.o -xc++ <(echo 'namespace internal { extern int i; }; int bar () { return internal::i; }')
    $ gcc -g -shared -o bar.so bar1.o bar2.o
    $ eu-readelf -sr -winfo foo.so bar.so

Now imagine a program linking in both foo.so and bar.so. There are two different things that are both separate but equal and both truly internal::i and both truly _ZN8internal1iE. By any method, there is no one answer to, "What is internal::i?" The only answers are context-specific.

This example is the simplest case. Here you could very well take the holistic, source-centric view and decide that means "holistic within the same linked object". When asking in a foo.so context, you take the foo1.o and foo2.o CUs together. When asking in a bar.so context, you take the bar1.o and bar2.o CUs together.

But this really is only the simplest case. Now imagine that bar1.o and bar2.o are linked into two separate DSOs. Those two DSOs are meant to work together, so one refers to the other's namespace. But foo.so has nothing to do with them. You can't tell that from the source, and you can't tell it from segregating the source along DSO boundaries. (Well, maybe you can, since the literal source has the visibility attribute to see, and perhaps the DWARF could represent that to you. But I could as well have made the example do the symbol-hiding with a linker version script, so the compiler might never have known.) There is no CU in bar2.so that defines it, so you have to look in some other DSO. How can you tell whether you wanted to look in foo.so or in bar1.so?
If the answer is to consult any ELF details about the DSOs' interrelationships (and that's the only answer I can think of), then we've come full circle to getting code-centric at the large granularity while we claimed to be holistic at the small granularity.

If instead you think locally, and code-centrically, you can get it all right. The bar2.o CU's declaration says DW_AT_external, so you know its symbol name. In bar2.so this symbol is undefined, so you have to emulate the dynamic linker's lookup to resolve it. The foo.so symbol by that name is defined, but it's STB_LOCAL and STV_HIDDEN, so it can't be the one. The bar1.so definition is STB_GLOBAL and STV_DEFAULT, so it could be the one. Luckily, it's the only such candidate in this example. When there are other candidates, you have to be sure you are following the dynamic linker's algorithm for which comes first and wins. All the corners of that are a big wad of hair, but it is concretely knowable anyway. (Mostly. Mostly.)

The case with symbol versions is much the same. The details of what ELF symbol magic you have to grok to resolve to the right symbol in the right object are all that differ.

Of course, it can always be worse than that. Even when you do everything right, you still have to bridge the gap from the symbol name DWARF has made you decide to look for to an actual ELF symbol. If you do enough fancy linker machinations, there can be more than one symbol by the same name in a single linked object. (They'll have different binding, or different symbol versions, or be in different section groups.) It could very well not be possible to figure out which one the assembly code corresponding to a given CU wound up actually referring to when all linked. (There are ways to do things that just plain lose information.) But you have to really go out of your way to wind up with that.
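
As a rough illustration of the binding/visibility filtering described above for the bar2.so reference, here is a minimal Python sketch. The Sym record, the resolve helper, the search order, and the toy symbol tables are all hypothetical stand-ins for real ELF parsing; only the STB_*, STV_* and SHN_UNDEF values come from the ELF specification. It deliberately ignores symbol versions, weak-symbol subtleties, DT_SYMBOLIC and every other corner of the real dynamic linker's algorithm.

```python
from dataclasses import dataclass

# Constant values from the ELF specification.
STB_LOCAL, STB_GLOBAL, STB_WEAK = 0, 1, 2
STV_DEFAULT, STV_INTERNAL, STV_HIDDEN, STV_PROTECTED = 0, 1, 2, 3
SHN_UNDEF = 0


@dataclass
class Sym:
    """Hypothetical, simplified view of one ELF symbol table entry."""
    name: str
    bind: int
    visibility: int
    shndx: int  # SHN_UNDEF marks a reference rather than a definition


def can_satisfy(sym):
    """True if this symbol could satisfy a reference from another object."""
    return (sym.shndx != SHN_UNDEF
            and sym.bind in (STB_GLOBAL, STB_WEAK)
            and sym.visibility in (STV_DEFAULT, STV_PROTECTED))


def resolve(name, search_order):
    """Return (object name, Sym) for the first eligible definition, else None.

    search_order is a list of (object name, [Sym, ...]) pairs in the order
    the lookup is assumed to visit the objects.
    """
    for obj, syms in search_order:
        for sym in syms:
            if sym.name == name and can_satisfy(sym):
                return obj, sym
    return None


# Toy tables mirroring the example: foo.so's copy is local and hidden,
# bar1.so defines it globally, bar2.so only refers to it.
search_order = [
    ("foo.so", [Sym("_ZN8internal1iE", STB_LOCAL, STV_HIDDEN, shndx=20)]),
    ("bar1.so", [Sym("_ZN8internal1iE", STB_GLOBAL, STV_DEFAULT, shndx=20)]),
    ("bar2.so", [Sym("_ZN8internal1iE", STB_GLOBAL, STV_DEFAULT, SHN_UNDEF)]),
]

result = resolve("_ZN8internal1iE", search_order)
print(result[0] if result else "unresolved")
```

With these made-up tables the reference resolves to bar1.so even though foo.so comes first in the assumed search order, because foo.so's symbol is local and hidden.
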
If a build process involves ld -r stages with funny options and/or linker scripts, be afraid--but even then the extreme weirdness is unlikely to come up.

Not that I'm averse to ending on a note of despair, but I'll toss in a twist of contrarian wistfulness on the finish. It could all have been so different, man, it could have been...beautiful.

There is an obvious case that the truly local and truly code-centric thing would be for the compiler to just tell you directly in the DWARF the actual truth about what the code does. Not with a symbol name, but with code, or its DWARF equivalent, which is a location. This is what Jan imagined in that IRC conversation. The DWARF spec explicitly allows this: a non-defining declaration (i.e. one with DW_AT_declaration) may have a DW_AT_location that applies only in the context of that DIE's scope and supersedes the defining declaration's DW_AT_location.

Personally, I love this as a theory. But traditional practice (at least in GCC, and perhaps everywhere) has always been to just leave it off and let DW_AT_declaration, DW_AT_external imply the need to follow ELF-driven runtime logic.

What it would mean is that the compiler emits two pieces of assembly code together in the translation unit: the compiled code that actually accesses a variable, and the DW_AT_location expression describing that access. In theory, though the instruction sets differ (one is a real machine code fragment and one is a DWARF expression stack machine program), this is assembling two versions of code that does the same thing, and uses the same symbols and relocations to do it.

This is what Jan was imagining on IRC. But I was thinking about it in the wrong way. I took Jan's "with a relocation record" comment too literally, as meaning final objects with relocations left for DWARF (akin to dynamic text relocations). I didn't think of the cool way to consider the PIC case, or else I blathered on about that hours or days later in something Jan didn't quote.
(In fact, probably I just thought about it idly an hour or two later and then never discussed the thought. Sigh.)

The key is that you can have the same(ish) relocs using the same symbols in the code and DWARF as assembled. Then whatever happens in linking stages later should be the same, as it's all the same ELF symbol references in both the .text and .debug_info relocations.

If what you do is describe in the DWARF location expression exactly the real access that is used in the emitted code, then for a PIC translation unit that's a PIC access in the DWARF stack machine program too. That is resolved fully at link time (ld -shared) and does not yield a relocation record for the final DWARF, just as it doesn't yield a text relocation for each site of access in the code.

For non-PIC code, the actual code looks like:

    movl _ZN8internal1iE(%rip), %eax

and the DWARF bit could look like:

    .byte DW_OP_addr
    .quad _ZN8internal1iE

These use different relocation types, but they mean the same thing in the context of how the code works. x86-64 globals are always PC-relative just because that's the efficient instruction, but it means the "absolute" address of the symbol. So the access uses a R_X86_64_PC32 and the DWARF uses R_X86_64_64. Since both relocs point to the same ELF symbol, you know they will "travel together". These get resolved at link time to absolute addresses, et voila. The DWARF location accurately describes what the code really does.

In a PIC access, what the final code will actually do is not really related to anything about ELF symbols. It's just memory indirection. The PIC code is:

    movq _ZN8internal1iE@GOTPCREL(%rip), %rax
    movl (%rax), %eax

This generates R_X86_64_GOTPCREL. At link time (ld -shared), this relocation goes away and it doesn't refer to the location where the _ZN8internal1iE symbol says. Instead, it's resolved to a .got slot created by the linker. When this code runs, all that's happening is loading the pointer from that memory.
So, the DWARF location can describe that too:

    .byte DW_OP_addr
    .quad _ZN8internal1iE@GOT
    .byte DW_OP_deref

This generates R_X86_64_GOT64. At link time, this too goes away and becomes the "absolute" address of the .got slot. So it accurately describes just what the code does to access internal::i. (In this case, "absolute" earns its scare quotes, because it really means relative to the load bias of the containing DSO at run time, just like all other addresses in DWARF, and in ELF symbol values, in a DSO.)

We could certainly teach GCC to do this. It would then be telling us more pieces of direct truth about the code. Would that not be the best thing ever?

Well, almost. First, what about a defining declaration in a PIC CU? In the abstract, a defining declaration can be considered as talking about two different things. One is its declarationhood, wherein it says that the containing scope has this name visible. For that purpose, it could reasonably be expected to be like a non-defining declaration: say how code in this scope accesses the variable--the truth about what's in the assembly code for any accesses in that CU. But the other thing is its definitionhood, wherein it says what data address contains the data cell and thus (optionally) implies what object file position holds the initializer image--another truth about what's in the assembly code for the definition in this CU.

In non-PIC code, these two truths match. Both use direct address constants (as relocated at link time). But in PIC code, the truth about the definition is an address constant, while the truth about the access is an indirection through .got. (If you have PIC code that uses __attribute__((visibility("hidden"))) then it's direct access, though PC-relative, and thus "non-PIC" ("absolute") for DWARF purposes, so both truths match as in truly non-PIC code.)

Personally, I would be all for having it both ways.
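
To make the two styles of location expression concrete, here is a minimal Python sketch of a stack machine that handles only DW_OP_addr and DW_OP_deref. The eval_location function, the tuple encoding of the expression, and every address below are assumptions invented for illustration; only the two opcode values are taken from the DWARF specification. The GOT slot is assumed to have already been filled in by the dynamic linker with a run-time address.

```python
# Opcode values from the DWARF specification.
DW_OP_addr = 0x03
DW_OP_deref = 0x06


def eval_location(ops, read_word, load_bias=0):
    """Evaluate a tiny location expression and return the resulting address.

    ops is a list of (opcode, operand) tuples, with operand None when the
    opcode takes none. read_word(addr) returns the pointer-sized word stored
    at a run-time address. Only the two opcodes above are supported.
    """
    stack = []
    for opcode, operand in ops:
        if opcode == DW_OP_addr:
            # Addresses in a DSO's DWARF are relative to its load bias.
            stack.append(operand + load_bias)
        elif opcode == DW_OP_deref:
            # Replace the address on top of the stack with the word it holds.
            stack.append(read_word(stack.pop()))
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    return stack[-1]


# Made-up layout: the DSO is loaded at LOAD_BIAS, its own definition sits at
# file address 0x3010, and its .got slot at 0x2000 points at a copy that
# lives in the main executable at 0x601040.
LOAD_BIAS = 0x7F0000000000
memory = {LOAD_BIAS + 0x2000: 0x601040}

definition_expr = [(DW_OP_addr, 0x3010)]
access_expr = [(DW_OP_addr, 0x2000), (DW_OP_deref, None)]

direct = eval_location(definition_expr, memory.__getitem__, LOAD_BIAS)
indirect = eval_location(access_expr, memory.__getitem__, LOAD_BIAS)
print(hex(direct), hex(indirect))
```

In this made-up layout the two expressions yield different addresses, which is the gap between the truth about the definition and the truth about the access in PIC code.
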
In a CU where a defining declaration is actually used by PIC accesses, then you could generate a second non-defining declaration (even for C). Give it DW_AT_artificial, DW_AT_declaration, DW_AT_specification pointing to the defining declaration (in lieu of DW_AT_name, DW_AT_type, et al), and then DW_AT_location with the PIC style using indirection.

With that, you could know that if you got a DW_AT_location from any DIE with DW_AT_declaration then you're done and have the real truth for accesses. If we presume no CUs from pre-apocalyptic compilers now that we are in these here end times, then we are finally free from ever having to rely on discerning the right ELF symbol from a name we surmised from DWARF (be it via DW_AT_MIPS_linkage_name or mangling). Phew!

In other words, excepting the small matter of manifest reality, we don't even need to think about ELF stuff (except for the occasional load bias for a DW_OP_addr)--the access locations are always given in DWARF expressions using pure memory access (either direct or indirect).

Well, almost. Before dynamic linker startup, what's the truth about what memory location a given name in a given scope refers to when PIC indirection is involved? The real truth is that code in that scope doesn't run yet! That question doesn't get answered until the dynamic linker has done startup.

But you might want to know. Like what if you tried "print internal::i" (or "print foo", from the first PIC C example). If you've done something like "info line func" (I think) then you've given it a context (at least a CU) to imagine what you're asking about, so it can get to the right DIE for the right "internal::i" or "foo". That should print the initialized value, even though there is no memory to read it out of, only the ELF files. If that is a non-defining declaration with a PIC-indirect DW_AT_location, then it says to load the .got slot, but that slot is not initialized yet.
Likewise, if that is a non-defining declaration for non-PIC code linked against a DSO definition, then it has an absolute-address DW_AT_location--which is the real truth about the memory to use, e.g. where to put a watchpoint--but that memory is not initialized yet.

For those cases you can look for dynamic relocs applying to the memory address from DW_AT_location. There will be a R_*_GLOB_DAT reloc or an R_*_COPY reloc, respectively. That reloc points to an ELF symbol. It points to exactly the right symbol, no name-matching to do and possibly be wrong. That is, it reduces even the "extreme weirdness" cases to the level of hair of merely every actual case you would ever have to contend with in the real world.

Secure in the knowledge that this is exactly the ELF symbol that the dynamic linker will be using to drive its resolution of this reloc, you just have to consider the binding, visibility, and version set of this symbol and correctly emulate the dynamic linker to find which exact ELF symbol in which file will be supplying the definition. In the PIC-indirect case, that is, the symbol whose adjusted address is the actual memory. In that case, of course, you have to then check that address for a copy reloc and possibly turn it into the direct address w/copy case. In that, "the definition" is the symbol whose address (as represented in the file by following its phdrs or shdrs) holds the initializer.

So, if either you want to do "offline" work of any kind, or you want to cope with DWARF from compilers that have existed so far, then you still have the little matter of emulating the dynamic linker.

I've omitted a whole little tirade (yes, omitted! it could be longer, I tell you!) about how literal the "truth" about how PIC accesses work really should be (x86-64 and its PC-relativity is the simplifying example!).

I'll merely allude to most of the tirade about non-defining declarations for code. That is, functions, including methods. Well, ok, a wee bit of tirade.
If you are trying to do correct expression evaluation, or just to work out which "foo" I meant in "break foo", in a particular context, you have all the same issues for functions and methods. A non-defining declaration is a DW_TAG_subprogram with DW_AT_external and DW_AT_declaration. If the context yields a DIE that has DW_AT_declaration, you have to discern a mangled symbol and look it up. It has no DW_AT_location like a data object would. But it could have a DW_AT_entry_pc. The DWARF spec does not mention this case explicitly for use with non-defining declarations as it does for DW_AT_location. But I read it as implicitly permitted, and naturally meaning something analogous: the truth about where "foo()" calls made in this scope would jump to. The really real truth for PIC cases is that it's a PLT entry, and what all that means is pretty much the whole rest of that tirade.

At the end of the day, more truth in the DWARF can only really save you from some pathological weirdness that isn't going to show up anyway. Just the run-of-the-mill weirdness means you really need to turn all the DW_AT_external declarations (defining ones too, given PIC!) into a known ELF symbol in the referring file and attempt to resolve that to the right true definition address by ELF rules.

Not to mention that at best we assume that our imagined post-apocalyptic compiler surely can emit the same ELF symbol in assembly for DW_AT_location that it just did in assembly code for an actual use. But, that's exactly the same symbol name string it should emit in the assembly for DW_AT_MIPS_linkage_name, just with "" instead of @GOT. So, if it can get DW_AT_MIPS_linkage_name wrong...

Ok, so by "wistfulness", I meant fantasy, disillusionment, bitterness, and resignation, and by "the finish", I meant "another four pages". What were you expecting?

Thanks,
Roland