From: Sami Wagiaalla <swagiaal@redhat.com>
To: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>,
archer@sourceware.org, Keith Seitz <keiths@redhat.com>
Subject: Re: Cross-CU C++ DIE references vs. mangling
Date: Mon, 12 Apr 2010 18:51:00 -0000
Message-ID: <1271098008.2901.211.camel@localhost>
In-Reply-To: <20100311060305.B177A7D5E@magilla.sf.frob.com>
So after a few (really, many) reads of this email I think I can
summarize the issues and solutions discussed there. I just want to
make sure I understand the issue properly before filing a gcc
feature request. So, is this a correct summary?
The goal is to help gdb find the proper location for variables whose
declarations and definitions are split across CUs or DSOs.
Why can't gdb do this by itself? Because:
- It requires a search of all other CUs/DSOs to locate the definition,
which is inefficient but also inaccurate, since
- the scope of the declaration can differ from that of the
definition (e.g. class members). If DW_AT_MIPS_linkage_name is
available it can be used to resolve this; however,
- if the definition is in a stripped DSO there is indeed a definition
(in ELF), but nowhere is there a DW_AT_location pointing to it. Also,
- it is possible to have two names defined in two separate DSOs with the
same linkage name, e.g.:
> Consider:
>
> $ g++ -g -c -fPIC -o foo1.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { int i; };')
> $ g++ -g -c -fPIC -o foo2.o -xc++ <(echo 'namespace internal __attribute__((visibility("hidden"))) { extern int i; }; int foo () { return internal::i; }')
> $ gcc -g -shared -o foo.so foo1.o foo2.o
> $ g++ -g -c -fPIC -o bar1.o -xc++ <(echo 'namespace internal { int i; };')
> $ g++ -g -c -fPIC -o bar2.o -xc++ <(echo 'namespace internal { extern int i; }; int bar () { return internal::i; }')
> $ gcc -g -shared -o bar.so bar1.o bar2.o
> $ eu-readelf -sr -winfo foo.so bar.so
>
> Now imagine a program linking in both foo.so and bar.so. There are
> two different things that are both separate but equal and both truly
> internal::i and both truly _ZN8internal1iE. By any method, there is
> no one answer to, "What is internal::i?" The only answers are
> context-specific.
>
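To check my reading of this example, here is a toy C++ model of why the
answer depends on context. It is only a sketch: the Symbol and Dso types,
the resolve() function, and the addresses used with it are all made up for
illustration, and the lookup rule (a hidden definition binds within its own
DSO, otherwise the first exported definition in load order wins) is a
simplification of what the dynamic linker really does.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified view of one dynamic symbol.
struct Symbol {
    std::string linkage_name;
    std::uint64_t address;  // made-up address, for illustration only
    bool hidden;            // true for visibility("hidden")
};

// Hypothetical, simplified view of one loaded shared object.
struct Dso {
    std::string name;
    std::vector<Symbol> symbols;
};

// Resolve linkage_name on behalf of code living in load_order[requester].
// Returns nullptr when no definition is visible from that context.
const Symbol* resolve(const std::vector<Dso>& load_order,
                      std::size_t requester,
                      const std::string& linkage_name) {
    // A hidden definition is only reachable from inside its own DSO.
    for (const Symbol& s : load_order.at(requester).symbols) {
        if (s.hidden && s.linkage_name == linkage_name) return &s;
    }
    // Otherwise take the first exported definition in load order.
    for (const Dso& d : load_order) {
        for (const Symbol& s : d.symbols) {
            if (!s.hidden && s.linkage_name == linkage_name) return &s;
        }
    }
    return nullptr;
}
```

With foo.so holding a hidden _ZN8internal1iE and bar.so an exported one,
resolve() called on behalf of code in foo.so and on behalf of code in bar.so
returns two different symbols for the same linkage name.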
Proposed solution:
Teach the compiler to generate a DW_AT_location for a non-defining
declaration that is applicable in that DIE's scope. That location
expression would parallel the assembly generated for the symbol:
> The key is that you can have the same(ish) relocs using the same
> symbols in the code and DWARF as assembled. Then whatever happens
> in linking stages later should be the same[...]
So,
> For non-PIC code, the actual code looks like:
>
> movl _ZN8internal1iE(%rip), %eax
>
> and the DWARF bit could look like:
>
> .byte DW_OP_addr
> .quad _ZN8internal1iE
>
[...]
> These get resolved at link time to absolute addresses, et voila.
And,
> In a PIC access, what the final code will actually do is not really
> related to anything about ELF symbols. It's just memory indirection.
> The PIC code is:
>
> movq _ZN8internal1iE@GOTPCREL(%rip), %rax
> movl (%rax), %eax
>
[...]
> .byte DW_OP_addr
> .quad _ZN8internal1iE@GOT
> .byte DW_OP_deref
>
> This generates R_X86_64_GOT64. At link time, this too goes away and
> becomes the "absolute" address of the .got slot.
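To make the difference between the two expressions concrete, here is a
minimal C++ sketch of a stack evaluator that handles only DW_OP_addr and
DW_OP_deref. The opcode values 0x03 and 0x06 are the ones from the DWARF
specification; the Op and Memory types and evaluate_location() are
hypothetical names, and the map merely stands in for the inferior's memory
with 8-byte words.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// DWARF opcode values for the two operations used in the quoted text.
constexpr std::uint8_t DW_OP_addr = 0x03;
constexpr std::uint8_t DW_OP_deref = 0x06;

// One step of a location expression: an opcode plus an optional operand.
struct Op {
    std::uint8_t code;
    std::uint64_t operand;  // only meaningful for DW_OP_addr
};

// Hypothetical stand-in for target memory: address -> 8-byte word.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// Evaluate the expression and return the value left on top of the stack,
// which for these expressions is the address of the variable.
std::uint64_t evaluate_location(const std::vector<Op>& expr,
                                const Memory& mem) {
    std::vector<std::uint64_t> stack;
    for (const Op& op : expr) {
        switch (op.code) {
        case DW_OP_addr:
            // Push the (already relocated) address constant.
            stack.push_back(op.operand);
            break;
        case DW_OP_deref: {
            // Pop an address and push the word stored there.
            if (stack.empty()) throw std::runtime_error("deref on empty stack");
            const std::uint64_t addr = stack.back();
            stack.pop_back();
            stack.push_back(mem.at(addr));  // throws if addr is unmapped
            break;
        }
        default:
            throw std::runtime_error("unsupported opcode");
        }
    }
    if (stack.empty()) throw std::runtime_error("empty expression");
    return stack.back();
}
```

In the non-PIC form the result is just the operand of DW_OP_addr. In the PIC
form the operand is the address of the .got slot, and DW_OP_deref replaces it
with whatever address the dynamic linker stored in that slot.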
The following part I don't quite understand:
> We could certainly teach GCC to do this.
> It would then be telling us more pieces of direct truth about the code.
> Would that not be the best thing ever?
> Well, almost.
>
> First, what about a defining declaration in a PIC CU?
>
> In the abstract, a defining declaration can be considered as talking
> about two different things. One is its declarationhood, wherein it
> says that the containing scope has this name visible. For that
> purpose, it could reasonably be expected to be like a non-defining
> declaration: say how code in this scope accesses the variable--the
> truth about what's in the assembly code for any accesses in that CU.
> But the other thing is its definitionhood, wherein it says what data
> address contains the data cell and thus (optionally) implies what
> object file position holds the initializer image--another truth about
> what's in the assembly code for the definition in this CU.
>
> In non-PIC code, these two truths match. Both use direct address
> constants (as relocated at link time). But in PIC code, the truth
> about the definition is an address constant, while the truth about the
> access is an indirection through .got. (If you have PIC code that
> uses __attribute__((visibility("hidden"))) then it's direct access,
> though PC-relative, and thus "non-PIC" ("absolute") for DWARF
> purposes, so both truths match as in truly non-PIC code.)
>
> Personally, I would be all for having it both ways. In a CU where a
> defining declaration is actually used by PIC accesses, then you could
> generate a second non-defining declaration (even for C). Give it
> DW_AT_artificial, DW_AT_declaration, DW_AT_specification pointing to
> the defining declaration (in lieu of DW_AT_name, DW_AT_type, et al),
> and then DW_AT_location with the PIC style using indirection.
>
> With that, you could know that if you got a DW_AT_location from any
> DIE with DW_AT_declaration then you're done and have the real truth
> for accesses. If we presume no CUs from pre-apocalyptic compilers now
> that we are in these here end times, then we are finally free from
> ever having to rely on discerning the right ELF symbol from a name we
> surmised from DWARF (be it via DW_AT_MIPS_linkage_name or mangling).
>
Why is there a need for a second, artificial, location-describing DIE? As I
understand it, declarationhood is specified by the DIE's nesting in the
DIE hierarchy, not by its DW_AT_location. In other words, what is missing in
the current way gcc specifies locations for defining declarations?
This summary does not include the part starting with "Before dynamic
linker startup" through the end of the email, mainly because I am assuming
that the main use case is after dynamic linker startup.