public inbox for binutils@sourceware.org
* linkonce and dwarf 2
@ 1999-11-16 15:13 Jason Merrill
  1999-11-16 16:23 ` Ian Lance Taylor
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Merrill @ 1999-11-16 15:13 UTC
  To: binutils; +Cc: jason

So, I've been thinking about schemes for eliminating duplicate dwarf2 debug
information without having to deal with the linker reading it all in and
writing it out again.

For those less familiar with dwarf2, debug info is divided into
"compilation units" (CUs), which don't necessarily correspond to object
files, though commonly they do.  Usually, references between DIEs (Debug
Info Entries) are via an offset from the beginning of the CU, but they can
also be via an offset from the beginning of the .debug_info section, for
exactly this purpose.
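
The two kinds of reference described above can be sketched as follows. This is
only an illustration: the CU names and offsets are made up, and the form names
in the comments (DW_FORM_ref4, DW_FORM_ref_addr) are the DWARF 2 forms these
two styles are assumed to correspond to.

```python
# Model .debug_info as one flat byte range holding two CUs back to back.
# The CU names and start offsets are hypothetical illustration values.
CU_STARTS = {"cu_a": 0x000, "cu_b": 0x400}


def resolve_cu_relative(cu_name, offset):
    """CU-relative reference (DW_FORM_ref4 style): offset from the start of the referring CU."""
    return CU_STARTS[cu_name] + offset


def resolve_section_relative(offset):
    """Section-relative reference (DW_FORM_ref_addr style): offset from the start of .debug_info."""
    return offset


# A DIE for a shared class sits at section offset 0x430, inside cu_b.
target = 0x430

# From inside cu_b, a CU-relative offset of 0x30 reaches it.
same_cu = resolve_cu_relative("cu_b", 0x30)

# From cu_a, the section-relative form names the DIE directly; its value does
# not depend on where cu_a itself ends up.
cross_cu = resolve_section_relative(0x430)

print(hex(same_cu), hex(cross_cu))
```

The section-relative value is only known once the linker has laid out
.debug_info, so in an object file it would have to be emitted through a
relocation.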

1) One scheme that ought to work fine is to put the DIE for a class into
its own CU, which would go into a linkonce section and be
pasted onto .debug_info by the linker script.  The problem with this is
that you would pay the CU overhead for each such class; it works out to at
least 16 bytes for a minimal CU header, though when we're dealing with
classes like streambuf that have 6k of debug info, maybe that's not so bad.
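
A rough worked version of that trade-off, using only the two figures quoted
above (16 bytes of per-CU overhead, about 6k of debug info for streambuf). The
smaller payload sizes are made-up comparison points, not measurements.

```python
# Figures taken from the message: about 16 bytes of overhead per CU, and
# roughly 6k of debug info for a class such as streambuf.
CU_OVERHEAD_BYTES = 16
STREAMBUF_DEBUG_BYTES = 6 * 1024


def overhead_fraction(payload_bytes, overhead_bytes=CU_OVERHEAD_BYTES):
    """Fraction of the emitted CU that is overhead rather than class info."""
    return overhead_bytes / (payload_bytes + overhead_bytes)


# 32 and 256 are hypothetical sizes for small classes, for comparison only.
for payload in (32, 256, STREAMBUF_DEBUG_BYTES):
    print(f"{payload:5d} bytes of class info: {overhead_fraction(payload):.1%} overhead")
```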

2) Another scheme would be to have only one additional CU.  Each class
would still get its own section, and the link would cause the CU header to
be wrapped around the classes which weren't discarded from that object; I'm
assuming that if you have sections in two objects like

   .debug_info_head
   .gnu.linkonce.d.1A
   .gnu.linkonce.d.f__Fv
   .debug_info_tail

and the link script says

   .debug_info : { *(.debug_info* .gnu.linkonce.d.*) }

then the section contents will retain the above order in the output, rather
than lumping all the _head bits together and so on.

The problem with this scheme is that the CU header specifies the length of
the CU, and there would be no way to compute this at compile or assembly
time; it would depend on which subsections had been kept or thrown away.
Is there currently any way to express link-time arithmetic in gas?
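
A minimal sketch of why the length is a link-time quantity. It assumes the
interleaved ordering hoped for above, models linkonce as "first section with a
given name wins", and uses made-up section sizes. It also counts every kept
byte as part of the CU, which simplifies what the real header length field
covers.

```python
def link_debug_info(objects):
    """Concatenate each object's pieces, keeping only the first copy of each linkonce section."""
    seen = set()
    layout = []
    for obj_name, sections in objects:
        kept = []
        for sec_name, size in sections:
            if sec_name.startswith(".gnu.linkonce."):
                if sec_name in seen:
                    continue  # duplicate linkonce section: discarded by the link
                seen.add(sec_name)
            kept.append((sec_name, size))
        layout.append((obj_name, kept))
    return layout


def cu_length(kept_sections):
    """Total bytes that end up between this object's head and tail, inclusive."""
    return sum(size for _, size in kept_sections)


# Section names are from the message; the sizes are hypothetical.
SECTIONS = [
    (".debug_info_head", 16),
    (".gnu.linkonce.d.1A", 200),
    (".gnu.linkonce.d.f__Fv", 80),
    (".debug_info_tail", 4),
]

# Two objects that emit identical pieces.
objects = [("a.o", SECTIONS), ("b.o", SECTIONS)]
lengths = {name: cu_length(kept) for name, kept in link_debug_info(objects)}
print(lengths)
```

Both objects were assembled from the same pieces, yet their CUs come out with
different lengths, because the second object's linkonce sections were dropped.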

3) Another scheme would be to do something like the stabs BINCL
optimization, and put all the info for a header into its own CU.  This
would also be useful for C, while the two schemes above would mainly work
for C++ (because of the stronger rules for type names in C++).  The problem
with this scheme is that it would require linkonce and symbol resolution
semantics like those of the BINCL optimization; we would only want to
discard duplicates if they were identical, and we would want references to
be bound to the appropriate instance if not all are discarded.

Hmm...it occurs to me that rather than just keying off the header name, we
could do a checksum of the debug info we're generating and include that in
the section name and symbol names.  That ought to do the trick without any
linker changes.
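
A sketch of the checksum idea, under stated assumptions: the naming layout
(prefix, header name, truncated digest), the choice of SHA-256 and the 8-digit
truncation are all hypothetical and are not taken from any compiler.

```python
import hashlib


def linkonce_name(header, debug_info, prefix=".gnu.linkonce.d."):
    """Build a section name that is equal only when header and debug info are equal.

    The layout prefix + header + "." + digest is a made-up convention.
    """
    digest = hashlib.sha256(debug_info).hexdigest()[:8]
    return f"{prefix}{header}.{digest}"


# Stand-ins for the debug info bytes generated for one header in three objects.
same_1 = linkonce_name("foo_h", b"struct A { int x; }")
same_2 = linkonce_name("foo_h", b"struct A { int x; }")
differs = linkonce_name("foo_h", b"struct A { long x; }")

print(same_1)
print(differs)
```

Identical debug info yields identical names, so an ordinary linkonce discard
removes the duplicate. Differing debug info yields different names, so both
copies survive and references to their symbols still resolve.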

Any thoughts?

Jason

* Re: linkonce and dwarf 2
  1999-11-16 15:13 linkonce and dwarf 2 Jason Merrill
@ 1999-11-16 16:23 ` Ian Lance Taylor
  1999-11-16 18:00   ` Jason Merrill
  1999-11-16 18:40   ` Richard Henderson
  0 siblings, 2 replies; 6+ messages in thread
From: Ian Lance Taylor @ 1999-11-16 16:23 UTC
  To: jason; +Cc: binutils

   Date: Tue, 16 Nov 1999 15:13:27 -0800
   From: Jason Merrill <jason@cygnus.com>

   2) Another scheme would be to have only one additional CU.  Each class
   would still get its own section, and the link would cause the CU header to
   be wrapped around the classes which weren't discarded from that object; I'm
   assuming that if you have sections in two objects like

      .debug_info_head
      .gnu.linkonce.d.1A
      .gnu.linkonce.d.f__Fv
      .debug_info_tail

   and the link script says

      .debug_info : { *(.debug_info* .gnu.linkonce.d.*) }

   then the section contents will retain the above order in the output, rather
   than lumping all the _head bits together and so on.

I think this will cause the GNU linker to output all the .debug_info*
sections followed by all the .gnu.linkonce.d.* sections.  I may be
mistaken, though.

What if you aren't using the GNU linker?

   The problem with this scheme is that the CU header specifies the length of
   the CU, and there would be no way to compute this at compile or assembly
   time; it would depend on which subsections had been kept or thrown away.
   Is there currently any way to express link-time arithmetic in gas?

Only via relocations.

   3) Another scheme would be to do something like the stabs BINCL
   optimization, and put all the info for a header into its own CU.  This
   would also be useful for C, while the two schemes above would mainly work
   for C++ (because of the stronger rules for type names in C++).  The problem
   with this scheme is that it would require linkonce and symbol resolution
   semantics like those of the BINCL optimization; we would only want to
   discard duplicates if they were identical, and we would want references to
   be bound to the appropriate instance if not all are discarded.

I'm not sure why this is a serious problem.

   Hmm...it occurs to me that rather than just keying off the header name, we
   could do a checksum of the debug info we're generating and include that in
   the section name and symbol names.  That ought to do the trick without any
   linker changes.

Cute.

Ian

* Re: linkonce and dwarf 2
  1999-11-16 16:23 ` Ian Lance Taylor
@ 1999-11-16 18:00   ` Jason Merrill
  1999-11-16 18:11     ` Ian Lance Taylor
  1999-11-16 18:40   ` Richard Henderson
  1 sibling, 1 reply; 6+ messages in thread
From: Jason Merrill @ 1999-11-16 18:00 UTC
  To: Ian Lance Taylor; +Cc: binutils

>>>>> Ian Lance Taylor <ian@zembu.com> writes:

 >    From: Jason Merrill <jason@cygnus.com>

 >       .debug_info : { *(.debug_info* .gnu.linkonce.d.*) }

 >    then the section contents will retain the above order in the output,
 >    rather than lumping all the _head bits together and so on.

 > I think this will cause the GNU linker to output all the .debug_info*
 > sections followed by all the .gnu.linkonce.d.* sections.  I may be
 > mistaken, though.

I figured that if that's what you wanted, you would write

  *(.debug_info*)
  *(.gnu.linkonce.d.*)

 > What if you aren't using the GNU linker?

Well, then you lose.  I don't see any way to do this stuff that doesn't
rely on link-time combination of sections.

 >    The problem with this scheme is that the CU header specifies the
 >    length of the CU, and there would be no way to compute this at
 >    compile or assembly time; it would depend on which subsections had
 >    been kept or thrown away.  Is there currently any way to express
 >    link-time arithmetic in gas?

 > Only via relocations.

I suppose what I was asking was "is there a relocation which could express
the difference between two arbitrary symbols, and is there gas syntax which
could generate it?"

 >    3) Another scheme would be to do something like the stabs BINCL
 >    optimization, and put all the info for a header into its own CU.
 >    This would also be useful for C, while the two schemes above would
 >    mainly work for C++ (because of the stronger rules for type names in
 >    C++).  The problem with this scheme is that it would require linkonce
 >    and symbol resolution semantics like those of the BINCL optimization;
 >    we would only want to discard duplicates if they were identical, and
 >    we would want references to be bound to the appropriate instance if
 >    not all are discarded.

 > I'm not sure why this is a serious problem.

Hmm?  Because headers might have different contents in different objects,
whether because of macros, template instantiations or whatever.  If I have
a DIE referring to a symbol defined in a comdat that is discarded and that
symbol isn't defined elsewhere, I lose.
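
The failure mode can be sketched like this. The model is an assumption: a
linkonce section is keyed purely by name, the first copy wins, and later copies
are dropped without their contents being compared. The section and symbol names
are hypothetical.

```python
def link_once_by_name(objects):
    """Keep the first section seen under each name; later same-named sections are dropped."""
    kept = {}
    for obj, sections in objects:
        for name, symbols in sections.items():
            kept.setdefault(name, (obj, symbols))
    defined = set()
    for _, symbols in kept.values():
        defined |= symbols
    return kept, defined


# Both objects include the same header, but b.o was compiled with a macro that
# makes the header declare one extra type (die_B).
objects = [
    ("a.o", {".gnu.linkonce.d.foo_h": {"die_A"}}),
    ("b.o", {".gnu.linkonce.d.foo_h": {"die_A", "die_B"}}),
]
kept, defined = link_once_by_name(objects)

# A DIE elsewhere in b.o refers to die_B, whose only definition was discarded.
referenced_by_b = {"die_B"}
undefined = referenced_by_b - defined
print(undefined)
```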

Jason

* Re: linkonce and dwarf 2
  1999-11-16 18:00   ` Jason Merrill
@ 1999-11-16 18:11     ` Ian Lance Taylor
  1999-11-17  7:57       ` Jason Merrill
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Lance Taylor @ 1999-11-16 18:11 UTC
  To: jason; +Cc: binutils

   From: Jason Merrill <jason@cygnus.com>
   Date: 16 Nov 1999 18:00:37 -0800

    >       .debug_info : { *(.debug_info* .gnu.linkonce.d.*) }

    >    then the section contents will retain the above order in the output,
    >    rather than lumping all the _head bits together and so on.

    > I think this will cause the GNU linker to output all the .debug_info*
    > sections followed by all the .gnu.linkonce.d.* sections.  I may be
    > mistaken, though.

   I figured that if that's what you wanted, you would write

     *(.debug_info*)
     *(.gnu.linkonce.d.*)

I don't think there is actually any difference between the two.
Again, I may be mistaken.  I haven't actually tried it.

I don't see anything in the documentation which promises anything
either way.
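
The two orderings under discussion can be modelled as below. This is a toy
model of the two candidate behaviours, not a statement of which one the linker
implements for either spelling. Shell-style matching via fnmatchcase stands in
for the linker's wildcard matching.

```python
from fnmatch import fnmatchcase

# Input sections in command-line order; object and section names follow the
# example in the original message.
INPUTS = [
    ("a.o", ".debug_info_head"),
    ("a.o", ".gnu.linkonce.d.1A"),
    ("a.o", ".debug_info_tail"),
    ("b.o", ".debug_info_head"),
    ("b.o", ".gnu.linkonce.d.f__Fv"),
    ("b.o", ".debug_info_tail"),
]
PATTERNS = [".debug_info*", ".gnu.linkonce.d.*"]


def interleaved(inputs, patterns):
    """Input order preserved: a section is emitted as soon as it matches any pattern."""
    return [s for s in inputs if any(fnmatchcase(s[1], p) for p in patterns)]


def grouped(inputs, patterns):
    """Pattern by pattern: every match of the first pattern, then every match of the next."""
    out = []
    for p in patterns:
        out += [s for s in inputs if fnmatchcase(s[1], p) and s not in out]
    return out


for label, order in (("interleaved", interleaved(INPUTS, PATTERNS)),
                     ("grouped", grouped(INPUTS, PATTERNS))):
    print(label, [name for _, name in order])
```

Only the interleaved ordering leaves each object's head and tail wrapped around
that object's own class sections.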

    >    The problem with this scheme is that the CU header specifies the
    >    length of the CU, and there would be no way to compute this at
    >    compile or assembly time; it would depend on which subsections had
    >    been kept or thrown away.  Is there currently any way to express
    >    link-time arithmetic in gas?

    > Only via relocations.

   I suppose what I was asking was "is there a relocation which could express
   the difference between two arbitrary symbols, and is there gas syntax which
   could generate it?"

There is no such relocation for most object file formats.  In some
cases, you can simulate it by using a PC relative relocation with a
complex addend.
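
A sketch of that simulation, assuming an ELF-style PC-relative relocation that
computes S + A - P (symbol value, addend, address of the field being patched).
The offsets and addresses are made up. The trick relies on the start symbol and
the patched field sitting in the same section, so that their distance is fixed
at assembly time.

```python
def apply_pcrel(sym_value, addend, place):
    """ELF-style PC-relative relocation: S + A - P."""
    return sym_value + addend - place


def addend_for_difference(place_offset, start_offset):
    """Assembly-time addend that makes the relocated value equal end - start.

    Only valid when the patched field and the start symbol are in the same
    section, so place - start is a known constant.
    """
    return place_offset - start_offset


# Assembly-time view: offsets within the section holding the length field.
place_off, start_off = 0, 4
addend = addend_for_difference(place_off, start_off)

# Link-time view: the section lands at some base address, and the end symbol
# lands wherever the surviving sections put it (hypothetical values).
base = 0x1000
place = base + place_off
start = base + start_off
end = 0x1234

value = apply_pcrel(end, addend, place)
print(hex(value), hex(end - start))
```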

    >    3) Another scheme would be to do something like the stabs BINCL
    >    optimization, and put all the info for a header into its own CU.
    >    This would also be useful for C, while the two schemes above would
    >    mainly work for C++ (because of the stronger rules for type names in
    >    C++).  The problem with this scheme is that it would require linkonce
    >    and symbol resolution semantics like those of the BINCL optimization;
    >    we would only want to discard duplicates if they were identical, and
    >    we would want references to be bound to the appropriate instance if
    >    not all are discarded.

    > I'm not sure why this is a serious problem.

   Hmm?  Because headers might have different contents in different objects,
   whether because of macros, template instantiations or whatever.  If I have
   a DIE referring to a symbol defined in a comdat that is discarded and that
   symbol isn't defined elsewhere, I lose.

The BINCL optimization only discards duplicate headers which are the
same.  It hashes the information; if headers have different contents,
both instances are kept.
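
The behaviour described here amounts to deduplicating on content rather than on
name alone. A minimal model, with SHA-256 standing in for whatever hash the
linker actually uses and with made-up header contents:

```python
import hashlib


def dedup_by_content(instances):
    """Keep one copy per (header name, content hash); differing contents all survive."""
    kept = {}
    for obj, header, content in instances:
        key = (header, hashlib.sha256(content).hexdigest())
        kept.setdefault(key, obj)  # first object to supply this exact content wins
    return kept


instances = [
    ("a.o", "foo.h", b"struct A { int x; }"),
    ("b.o", "foo.h", b"struct A { int x; }"),                 # identical: discarded
    ("c.o", "foo.h", b"struct A { int x; }; struct B { };"),  # differs: kept
]
kept = dedup_by_content(instances)
owners = sorted(kept.values())
print(owners)
```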

Ian

* Re: linkonce and dwarf 2
  1999-11-16 16:23 ` Ian Lance Taylor
  1999-11-16 18:00   ` Jason Merrill
@ 1999-11-16 18:40   ` Richard Henderson
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Henderson @ 1999-11-16 18:40 UTC
  To: Ian Lance Taylor; +Cc: jason, binutils

On Tue, Nov 16, 1999 at 07:21:37PM -0500, Ian Lance Taylor wrote:
>       .debug_info : { *(.debug_info* .gnu.linkonce.d.*) }
> 
> I think this will cause the GNU linker to output all the .debug_info*
> sections followed by all the .gnu.linkonce.d.* sections.  I may be
> mistaken, though.

You are correct.


r~

* Re: linkonce and dwarf 2
  1999-11-16 18:11     ` Ian Lance Taylor
@ 1999-11-17  7:57       ` Jason Merrill
  0 siblings, 0 replies; 6+ messages in thread
From: Jason Merrill @ 1999-11-17  7:57 UTC
  To: Ian Lance Taylor; +Cc: binutils

>>>>> Ian Lance Taylor <ian@zembu.com> writes:

 > The BINCL optimization only discards duplicate headers which are the
 > same.  It hashes the information; if headers have different contents,
 > both instances are kept.

Exactly.  What I was saying is that there is no way to get those semantics
for dwarf2 with SEC_LINK_ONCE.

Jason
