public inbox for frysk@sourceware.org
* Dwarf/libdw question
@ 2007-10-01 15:36 Sami Wagiaalla
  2007-10-02  2:20 ` Roland McGrath
  0 siblings, 1 reply; 4+ messages in thread
From: Sami Wagiaalla @ 2007-10-01 15:36 UTC
  To: Roland McGrath, frysk

Hi Roland,

I am working on implementing C++ scoping rules in frysk. Is there an
elfutils API that I can use to figure out what class/struct a function
belongs to, so that references to member variables can be resolved?

If not, I could create something, perhaps following dwarf_getscopes_die
as a guideline.

Sami


* Re: Dwarf/libdw question
  2007-10-01 15:36 Dwarf/libdw question Sami Wagiaalla
@ 2007-10-02  2:20 ` Roland McGrath
  2007-10-02 14:34   ` Elfutils: Resolving class member variables (was Re: Dwarf/libdw question) Sami Wagiaalla
  0 siblings, 1 reply; 4+ messages in thread
From: Roland McGrath @ 2007-10-02  2:20 UTC
  To: Sami Wagiaalla; +Cc: frysk

Hi Sami.  Please use more specific Subject lines in your postings.
Reading the list archives' index will not be very informative to
someone looking years from now for discussion on this particular topic.

> I am working on implementing C++ scoping rules in frysk. Is there an
> elfutils API that I can use to figure out what class/struct a function
> belongs to, so that references to member variables can be resolved?

The key is DW_AT_specification.  Let's take an example:

	class c
	{
	  int m1() { return 17; }
	  int m2();
	public:
	  int m() { return m1() + m2(); }
	};

	int c::m2() { return 23; }

	int main()
	{
	  c x;
	  return x.m();
	}

The DIE tree for this is (explanations below):

	 [     b]  compile_unit
		   macro_info           0
		   stmt_list            0
		   producer             "GNU C++ 4.1.2 20070502 (Red Hat 4.1.2-12)"
		   language             C++ (4)
		   name                 "s.cxx"
		   comp_dir             "/home/roland/build/stock-elfutils"
	 [    67]    structure_type
		     sibling              [    d4]
		     name                 "c"
		     byte_size            1
		     decl_file            1
		     decl_line            2
	 [    71]      subprogram
		       sibling              [    94]
		       external             
		       name                 "m1"
		       decl_file            1
		       decl_line            3
		       MIPS_linkage_name    "_ZN1c2m1Ev"
		       type                 [    d4]
		       accessibility        private (3)
		       declaration          
	 [    8d]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    94]      subprogram
		       sibling              [    b7]
		       external             
		       name                 "m2"
		       decl_file            1
		       decl_line            4
		       MIPS_linkage_name    "_ZN1c2m2Ev"
		       type                 [    d4]
		       accessibility        private (3)
		       declaration          
	 [    b0]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    b7]      subprogram
		       external             
		       name                 "m"
		       decl_file            1
		       decl_line            6
		       MIPS_linkage_name    "_ZN1c1mEv"
		       type                 [    d4]
		       declaration          
	 [    cc]        formal_parameter
			 type                 [    db]
			 artificial           
	 [    d4]    base_type
		     name                 "int"
		     byte_size            4
		     encoding             signed (5)
	 [    db]    pointer_type
		     byte_size            8
		     type                 [    67]
	 [    e1]    subprogram
		     sibling              [   10d]
		     specification        [    71]
		     low_pc               0x000000000040054c
		     high_pc              0x000000000040055b
		     frame_base           location list [     0]
	 [    fe]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -24
	 [   10d]    const_type
		     type                 [    db]
	 [   112]    subprogram
		     sibling              [   13f]
		     specification        [    94]
		     decl_line            9
		     low_pc               0x0000000000400528
		     high_pc              0x0000000000400537
		     frame_base           location list [    4c]
	 [   130]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -24
	 [   13f]    subprogram
		     sibling              [   16b]
		     specification        [    b7]
		     low_pc               0x000000000040055c
		     high_pc              0x0000000000400587
		     frame_base           location list [    98]
	 [   15c]      formal_parameter
		       name                 "this"
		       type                 [   10d]
		       artificial           
		       location             2 byte block
			[   0] fbreg -32
	 [   16b]    subprogram
		     external             
		     name                 "main"
		     decl_file            1
		     decl_line            11
		     type                 [    d4]
		     low_pc               0x0000000000400538
		     high_pc              0x000000000040054b
		     frame_base           location list [    e4]
	 [   18c]      variable
		       name                 "x"
		       decl_file            1
		       decl_line            13
		       type                 [    67]
		       location             2 byte block
			[   0] fbreg -17

Note that the subprogram DIEs describing actual machine code are
top-level children of the CU.  Here these are [e1], [112], [13f].  They
are not children of [67], the structure_type DIE describing the class.
This is sensible enough because these are global function definitions,
even if they have names and types with scope limited to the class.

Consider [112].  This has the attributes and children that refer to its
machine code (low_pc, high_pc, frame_base, formal_parameter).  Note it
does not have the attributes like name and type.  Instead, it has a
specification attribute that points to [94].  specification is
analogous to abstract_origin, but rather than linking a concrete code
element to an abstract inline definition, it links a concrete code
element to an abstract declaration.  So, [112] is the code for "m2",
and [94] is the specification for "m2".

dwarf_attr_integrate checks for specification as well as abstract_origin.
So, for common cases with attributes you just don't think about it.
dwarf_diename uses dwarf_attr_integrate, so you will see a name without
extra effort even if it's indirect.
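
To make that indirection concrete, here is a toy model in C of an
"integrating" name lookup.  It is only a sketch and not libdw code:
struct toy_die, toy_name_integrate and the two sample DIEs are made up
for illustration, and the 16-hop limit is an arbitrary guard against
reference loops.  It assumes a lookup that prefers the DIE's own
attribute and otherwise follows abstract_origin or specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a DIE: only the attributes this sketch needs. */
struct toy_die {
    const char *name;                      /* DW_AT_name, or NULL if absent */
    const struct toy_die *specification;   /* DW_AT_specification target */
    const struct toy_die *abstract_origin; /* DW_AT_abstract_origin target */
};

/* Return the DIE's own name if it has one; otherwise follow the
   abstract_origin or specification link and try again there. */
const char *toy_name_integrate(const struct toy_die *die)
{
    for (int hops = 0; die != NULL && hops < 16; hops++) {
        if (die->name != NULL)
            return die->name;
        die = die->abstract_origin != NULL ? die->abstract_origin
                                           : die->specification;
    }
    return NULL;
}

/* Shaped like [94] (the declaration of m2) and [112] (its code). */
const struct toy_die toy_decl_m2 = { "m2", NULL, NULL };
const struct toy_die toy_code_m2 = { NULL, &toy_decl_m2, NULL };
```

Calling toy_name_integrate (&toy_code_m2) finds no name on the code DIE
itself and picks up "m2" from the declaration it points to.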

I used [112] as the example because m2 is defined outside the class
definition.  As you can see, GCC does the same thing for m1 [e1] and m
[13f], though those definitions actually appear lexically inside the
class.  Reading the DWARF spec one would expect these cases to use a
single DIE inside the class and not use DW_AT_specification at all.  I
don't know if there is a particular reason GCC doesn't do that, and I
see no big benefit in changing what it does.  But I think that DWARF
consumers should expect that either style might be used and work the
same with either.

Note how [112] has a decl_line attribute but no decl_file, while [e1]
and [13f] have neither.  This is an example of the general rule with
specification (and abstract_origin): an attribute is elided from the
referring DIE when its value is no different from the one in the DIE
it refers to.
Since m2's body was defined outside the class, [112] refers to line 9.
If the class declaration were in a header file and the method definition
in another file, there would also be a decl_file attribute.  (If
everything were all on one line and the compiler emitted column
information, there would be a decl_column but no decl_line.  The
compiler does not yet emit decl_column attributes, but we should write
consumers as if it did.)  Since [e1] and [13f] describe bodies defined
in their selfsame specification declarations, they would never have a
decl_{file,line,column} of their own.

So now I've told you the basics to work with, but not actually answered
your question.  There are two parts to resolving class members.

First, the name resolution per se.  You start with the scopes inside a
subprogram DIE, same as in C.  When you are dealing with a class method,
the subprogram's specification attribute gives you the declaration
inside the class scope (use dwarf_formref_die (dwarf_attr (...))).  Then
use dwarf_getscopes_die on that to see the class, namespace, etc. scopes
containing it.  For each of those, see if they have DW_TAG_inheritance,
DW_TAG_imported_declaration, etc. children that contribute more scopes
to the name resolution logic for the language.  Among those you find a
member, variable, subprogram, etc. DIE by the name you are looking for.
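
The search order described above can be sketched with a toy DIE tree in
C.  This is an illustration under simplifying assumptions, not libdw
code: struct toy_die, toy_find_child and toy_resolve are hypothetical,
the data member "count" is invented (the example class has no data
members), and inheritance, imported declarations and nested lexical
blocks are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_tag {
    TOY_COMPILE_UNIT, TOY_STRUCTURE_TYPE, TOY_SUBPROGRAM,
    TOY_FORMAL_PARAMETER, TOY_MEMBER
};

/* Toy DIE with just enough links to walk scopes. */
struct toy_die {
    enum toy_tag tag;
    const char *name;              /* NULL if the DIE has no name */
    struct toy_die *parent;        /* enclosing DIE */
    struct toy_die *specification; /* declaration this DIE defines */
    struct toy_die *child;         /* first child */
    struct toy_die *sibling;       /* next sibling */
};

/* Look for a direct child of SCOPE with the given name. */
struct toy_die *toy_find_child(struct toy_die *scope, const char *name)
{
    for (struct toy_die *d = scope->child; d != NULL; d = d->sibling)
        if (d->name != NULL && strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

/* Resolve NAME as seen from the concrete subprogram DIE CODE. */
struct toy_die *toy_resolve(struct toy_die *code, const char *name)
{
    assert(code != NULL && name != NULL);
    /* 1. Scopes inside the subprogram (here: its direct children). */
    struct toy_die *found = toy_find_child(code, name);
    if (found != NULL)
        return found;
    /* 2. For a method, continue from the declaration's scope (the
          class) rather than from the CU that holds the code DIE. */
    struct toy_die *scope = code->specification != NULL
        ? code->specification->parent : code->parent;
    /* 3. Walk outward through the enclosing scopes. */
    for (; scope != NULL; scope = scope->parent) {
        found = toy_find_child(scope, name);
        if (found != NULL)
            return found;
    }
    return NULL;
}

/* Fixture: CU { class c { count; m2 (declaration) }; m2 code { this } } */
extern struct toy_die toy_cu, toy_class_c, toy_count,
                      toy_decl_m2, toy_code_m2, toy_this;
struct toy_die toy_cu =
    { TOY_COMPILE_UNIT, "s.cxx", NULL, NULL, &toy_class_c, NULL };
struct toy_die toy_class_c =
    { TOY_STRUCTURE_TYPE, "c", &toy_cu, NULL, &toy_count, &toy_code_m2 };
struct toy_die toy_count =
    { TOY_MEMBER, "count", &toy_class_c, NULL, NULL, &toy_decl_m2 };
struct toy_die toy_decl_m2 =
    { TOY_SUBPROGRAM, "m2", &toy_class_c, NULL, NULL, NULL };
struct toy_die toy_code_m2 =
    { TOY_SUBPROGRAM, NULL, &toy_cu, &toy_decl_m2, &toy_this, NULL };
struct toy_die toy_this =
    { TOY_FORMAL_PARAMETER, "this", &toy_code_m2, NULL, NULL, NULL };
```

From toy_code_m2, "this" is found inside the subprogram, "count" in the
class reached through the specification link, and "c" in the CU.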

If you found a static member (aka class variable), i.e. DW_TAG_variable,
you are done.  It gets treated just like other variable DIEs.

If you found a class member (aka instance variable), i.e. DW_TAG_member,
then it depends on how you plan to use it.  In the context of a pointer
to member (as "mem" in "type cl::*p = &cl::mem;"), you are done.
The DW_AT_data_member_location tells you what value to use.

In a static method (aka class method), referring to a regular class
member (instance variable) is invalid.

In an instance method, "mem" is resolved the same as "this->mem".  The
subprogram DIE for the method definition contains an automatically-inserted
first formal_parameter DIE, with the artificial attribute and named "this".
AFAICT, the only way to distinguish a static method from an instance method
in the DWARF tree is the presence of this first artificial formal_parameter.
(Though in practice it always has the name attribute of "this", I would
write it to detect a first formal_parameter with artificial rather than
looking at the name.)  This formal_parameter is like any other aside from
being artificial, so you combine its location attribute with the PC context
you're looking from, and the data_member_location attribute of the member
DIE, to find the member in the object from that PC context.
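
As a worked example of that arithmetic, here is a small C sketch.  It
rests on assumptions that are not part of the DIE dump above: the
location of "this" is a plain fbreg expression (as in [130]), the
frame base has already been evaluated for the current PC, the
data_member_location reduces to a constant byte offset, and the target
uses 8-byte pointers.  toy_member_address, toy_read_slot and struct
toy_memory are hypothetical names, and the fake memory holds one slot.

```c
#include <assert.h>
#include <stdint.h>

/* Callback that reads an 8-byte pointer from the debuggee. */
typedef uint64_t (*toy_read_u64)(uint64_t addr, void *ctx);

/* Address of a member seen from inside an instance method:
   1. the slot holding "this" is frame_base + fbreg offset,
   2. the value read from that slot is the object's address,
   3. the member lives at that address + data_member_location. */
uint64_t toy_member_address(uint64_t frame_base, int64_t this_fbreg,
                            uint64_t member_offset,
                            toy_read_u64 read_u64, void *ctx)
{
    /* Unsigned wraparound makes a negative offset subtract correctly. */
    uint64_t this_slot = frame_base + (uint64_t) this_fbreg;
    uint64_t this_value = read_u64(this_slot, ctx);
    return this_value + member_offset;
}

/* Fake debuggee memory with a single readable 8-byte slot. */
struct toy_memory {
    uint64_t slot_addr;
    uint64_t slot_value;
};

uint64_t toy_read_slot(uint64_t addr, void *ctx)
{
    const struct toy_memory *mem = ctx;
    assert(addr == mem->slot_addr); /* the sketch knows only one slot */
    return mem->slot_value;
}
```

With a frame base of 0x7ffc0000f000, "this" at fbreg -24 holding
0x601040, and a member at offset 4, the member is at 0x601044.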

When the name resolved to a subprogram DIE, you have to do two things to
see how to treat it.  First, if the DIE has DW_AT_declaration, then you
have to find the concrete code DIE whose DW_AT_specification points to it.
Then, you have to check (as above) whether it's a static method or an
instance method, so you know what "name(foo)" is supposed to mean if a user
gave that as a call.
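
Those two checks can be sketched in C over a toy model.  Again this is
an illustration, not libdw code: struct toy_die, toy_find_definition
and toy_is_instance_method are hypothetical, and the linear search over
a list of candidate DIEs is just the simplest thing that shows the
idea.

```c
#include <assert.h>
#include <stddef.h>

enum toy_tag { TOY_SUBPROGRAM, TOY_FORMAL_PARAMETER };

struct toy_die {
    enum toy_tag tag;
    int artificial;                      /* has DW_AT_artificial */
    int declaration;                     /* has DW_AT_declaration */
    const struct toy_die *specification; /* DW_AT_specification target */
    const struct toy_die *first_child;
};

/* If DIE is only a declaration, find the candidate whose specification
   points back at it; otherwise DIE already describes the code. */
const struct toy_die *
toy_find_definition(const struct toy_die *const *candidates, size_t count,
                    const struct toy_die *die)
{
    if (!die->declaration)
        return die;
    for (size_t i = 0; i < count; i++)
        if (candidates[i]->specification == die)
            return candidates[i];
    return NULL;
}

/* Instance method: the first child is an artificial formal_parameter.
   The parameter's name is deliberately not consulted. */
int toy_is_instance_method(const struct toy_die *subprogram)
{
    const struct toy_die *first = subprogram->first_child;
    return first != NULL && first->tag == TOY_FORMAL_PARAMETER
           && first->artificial;
}

/* Fixture shaped like [94]/[112] (m2) and [16b] (main). */
const struct toy_die toy_this_decl = { TOY_FORMAL_PARAMETER, 1, 0, NULL, NULL };
const struct toy_die toy_this_code = { TOY_FORMAL_PARAMETER, 1, 0, NULL, NULL };
const struct toy_die toy_decl_m2 =
    { TOY_SUBPROGRAM, 0, 1, NULL, &toy_this_decl };
const struct toy_die toy_code_m2 =
    { TOY_SUBPROGRAM, 0, 0, &toy_decl_m2, &toy_this_code };
const struct toy_die toy_code_main = { TOY_SUBPROGRAM, 0, 0, NULL, NULL };
const struct toy_die *const toy_top_level[] = { &toy_code_main, &toy_code_m2 };
```

Starting from the declaration toy_decl_m2, the search lands on
toy_code_m2, which then tests as an instance method; toy_code_main has
no artificial first parameter and does not.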


Thanks,
Roland


* Re: Elfutils: Resolving class member variables (was Re: Dwarf/libdw question)
  2007-10-02  2:20 ` Roland McGrath
@ 2007-10-02 14:34   ` Sami Wagiaalla
  2007-10-02 19:59     ` Sami Wagiaalla
  0 siblings, 1 reply; 4+ messages in thread
From: Sami Wagiaalla @ 2007-10-02 14:34 UTC
  To: Roland McGrath; +Cc: frysk

Hi Roland,

Thank you for the thorough reply. You have probably answered any future 
questions I might have :)

Roland McGrath wrote:
> Hi Sami.  Please use more specific Subject lines in your postings.
> Reading the list archives' index will not be very informative to
> someone looking years from now for discussion on this particular topic.
>   
Good point... will do.
> First, the name resolution per se.  First there are scopes inside a
> subprogram DIE, same as in C.  When you are dealing with a class method,
> the subprogram's specification attribute gives you the declaration
> inside the class scope (use dwarf_formref_die (dwarf_attr (...))).  Then
> use dwarf_getscopes_die on that to see the class, namespace, etc. scopes
> containing it.  For each of those, see if they have DW_TAG_inheritance,
> DW_TAG_imported_declaration, etc. children that contribute more scopes
> to the name resolution logic for the language.  Among those you find a
> member, variable, subprogram, etc. DIE by the name you are looking for.
>   
I will try that.

Thanks,
  Sami


* Re: Elfutils: Resolving class member variables (was Re: Dwarf/libdw question)
  2007-10-02 14:34   ` Elfutils: Resolving class member variables (was Re: Dwarf/libdw question) Sami Wagiaalla
@ 2007-10-02 19:59     ` Sami Wagiaalla
  0 siblings, 0 replies; 4+ messages in thread
From: Sami Wagiaalla @ 2007-10-02 19:59 UTC
  To: Roland McGrath; +Cc: frysk

Sami Wagiaalla wrote:
> Hi Roland,
>
> Thank you for the thorough reply. You have probably answered any 
> future questions I might have :)
>
> Roland McGrath wrote:
>> Hi Sami.  Please use more specific Subject lines in your postings.
>> Reading the list archives' index will not be very informative to
>> someone looking years from now for discussion on this particular topic.
>>   
> Good point... will do.
>> First, the name resolution per se.  First there are scopes inside a
>> subprogram DIE, same as in C.  When you are dealing with a class method,
>> the subprogram's specification attribute gives you the declaration
>> inside the class scope (use dwarf_formref_die (dwarf_attr (...))).  Then
>> use dwarf_getscopes_die on that to see the class, namespace, etc. scopes
>> containing it.  For each of those, see if they have DW_TAG_inheritance,
>> DW_TAG_imported_declaration, etc. children that contribute more scopes
>> to the name resolution logic for the language.  Among those you find a
>> member, variable, subprogram, etc. DIE by the name you are looking for.
>>   
> I will try that.
This works, but the following patch was needed:

Index: libdw/libdw_visit_scopes.c
===================================================================
RCS file: /cvs/frysk/frysk-imports/elfutils/libdw/libdw_visit_scopes.c,v
retrieving revision 1.4
diff -u -r1.4 libdw_visit_scopes.c
--- libdw/libdw_visit_scopes.c  22 Aug 2007 17:11:08 -0000      1.4
+++ libdw/libdw_visit_scopes.c  2 Oct 2007 19:55:06 -0000
@@ -68,6 +68,7 @@
     case DW_TAG_catch_block:
     case DW_TAG_try_block:
     case DW_TAG_entry_point:
+    case DW_TAG_structure_type:
       return match;
     case DW_TAG_inlined_subroutine:
       return match_inline;

