public inbox for binutils@sourceware.org
* RFC: Implicit DWARF relocs
@ 2012-04-25  8:37 Jakub Jelinek
  2012-04-25 19:07 ` Cary Coutant
From: Jakub Jelinek @ 2012-04-25  8:37 UTC
  To: binutils; +Cc: Cary Coutant

Hi!

I've looked at the DWARF Fission proposal, and it looks to me as though a lot
of effort is spent on decreasing the size of the debug info related
relocations in ET_REL objects, often (at least as I understand it) at the cost
of increasing the debug info itself (though by a smaller amount than is saved
on the size of relocations).

I wonder whether we couldn't use a different approach to reduce the size of
relocations against the .debug_info/.debug_types/.debug_macro sections, and
maybe .debug_loc.  The DWARF sections are structured and DWARF consumers
know where to relocate things, so why couldn't the linker?

The compiler, if it detects linker support, could (through assembler
extensions) set a new bit in sh_flags,
#define SHF_GNU_DWARF_IMPLICIT_RELOCS	(1 << 20)
set sh_info to the algorithm version for the implicit relocs (say 1 to begin
with), and then omit some relocations against other .debug_* sections in the
assembly it creates.
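
A minimal sketch of how a linker might test the proposed flag and version
field. The flag value is the one proposed above; struct sec_hdr,
IMPLICIT_RELOCS_MAX_VERSION and check_implicit_relocs are hypothetical names
used only for illustration and are not part of ld.bfd or any ELF
specification. Treating version 0 as unsupported is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value proposed in this message; it is not an existing ELF flag. */
#define SHF_GNU_DWARF_IMPLICIT_RELOCS (1u << 20)

/* Hypothetical: highest implicit-reloc algorithm version the linker knows. */
#define IMPLICIT_RELOCS_MAX_VERSION 1u

/* Minimal stand-in for the two ELF section header fields involved. */
struct sec_hdr {
    uint64_t sh_flags;
    uint32_t sh_info;
};

enum implicit_status {
    IMPLICIT_NONE,        /* flag not set: only explicit relocs apply */
    IMPLICIT_OK,          /* flag set and algorithm version understood */
    IMPLICIT_UNSUPPORTED  /* flag set but version unknown: error out */
};

static enum implicit_status check_implicit_relocs(const struct sec_hdr *sh)
{
    assert(sh != NULL);
    if ((sh->sh_flags & SHF_GNU_DWARF_IMPLICIT_RELOCS) == 0)
        return IMPLICIT_NONE;
    /* Assumption: versions start at 1, so 0 is treated as invalid. */
    if (sh->sh_info == 0 || sh->sh_info > IMPLICIT_RELOCS_MAX_VERSION)
        return IMPLICIT_UNSUPPORTED;
    return IMPLICIT_OK;
}
```

Returning a distinct "unsupported" status lets the caller fail the link
rather than silently guess at relocations it cannot derive.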

The linker (in ld.bfd I guess e.g. _bfd_elf_link_read_relocs could do that)
would then add implicit relocations where the compiler omitted them
(and, for ld -r, adjust them without storing them again), using a simple
algorithm.

The .debug_info (and similarly .debug_types) algorithm would be as follows.
For the .debug_info section, find the corresponding .debug_str, .debug_loc,
.debug_line, .debug_ranges and .debug_abbrev sections, if available.
"Corresponding" means: for a .debug_info in a comdat group, look for those
named sections in the same comdat group first, then fall back to the named
sections not in comdat; for a non-comdat .debug_info, use the non-comdat
named sections only.

Then qsort all the explicit relocations against the section by increasing
offset and walk the section.  If the abbrev offset field in the DWARF CU
header doesn't have an explicit relocation against it, add an implicit one
against the corresponding .debug_abbrev section + the addend stored in that
memory location.  Parse the abbrevs into an array, hash table or combined
data structure, then walk the CU content.  If a DW_FORM_strp location doesn't
have an explicit reloc, assume the corresponding .debug_str + the addend in
that 4 byte field.  If a DW_FORM_sec_offset location doesn't have an explicit
reloc, assume .debug_{line,ranges,loc} + the addend in that field for
DW_AT_stmt_list, {DW_AT_ranges,DW_AT_start_scope}, and other attributes,
respectively.  Wherever the implicit reloc would do the wrong thing, the
producer must supply an explicit relocation.
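
The "sort explicit relocs, then check each field" step could be sketched
roughly as below. This is only an illustration: struct reloc and all function
names are hypothetical, the section is assumed to be little-endian 32-bit
DWARF (4-byte DW_FORM_strp), and the abbrev parsing and DIE walk that would
produce the field offsets are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified view of one explicit relocation. */
struct reloc {
    uint64_t offset;  /* offset of the relocated field in the section */
    uint32_t sym;     /* symbol index (unused in this sketch) */
    int64_t addend;
};

static int cmp_reloc_offset(const void *a, const void *b)
{
    const struct reloc *ra = a, *rb = b;
    return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

/* Sort the explicit relocations by increasing offset. */
static void sort_relocs(struct reloc *relocs, size_t n)
{
    qsort(relocs, n, sizeof *relocs, cmp_reloc_offset);
}

/* Binary-search the sorted array for a reloc at exactly this offset. */
static bool has_explicit_reloc(const struct reloc *relocs, size_t n,
                               uint64_t off)
{
    struct reloc key = { .offset = off };
    return bsearch(&key, relocs, n, sizeof *relocs, cmp_reloc_offset) != NULL;
}

/* Assumption: little-endian section contents. */
static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* For a 4-byte DW_FORM_strp field at `off`: if no explicit reloc covers it,
   store the in-place value as the implicit addend against .debug_str and
   return true; otherwise return false and leave the explicit reloc alone. */
static bool implicit_strp_addend(const unsigned char *sec, size_t sec_size,
                                 const struct reloc *relocs, size_t n,
                                 uint64_t off, uint32_t *addend)
{
    assert(off <= sec_size && sec_size - off >= 4);
    if (has_explicit_reloc(relocs, n, off))
        return false;
    *addend = read_u32le(sec + off);
    return true;
}
```

Since the section is walked in increasing offset order, a real implementation
could also advance a cursor through the sorted array instead of calling
bsearch for each field.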

Of course the linker would need to be strict about erroring out on anything
unexpected (unknown DW_FORM_*, etc.).  The producer would need to avoid
using implicit relocs if it uses forms that the linker might not understand
based on the requested algorithm version.

The .debug_macro algorithm would similarly handle DW_FORM_strp values in the
section that don't have an explicit reloc.

And maybe .debug_loc could have an algorithm that, for the address fields
in the section, remembers the last relocation against an address field, if
any, and, if an address field isn't ~0 or 0, implicitly relocates it relative
to the last address field relocation - 1 (the - 1 bias is so that we never
get 0 there).  Perhaps it should do so only until the terminating 0, 0 pair,
and also stop on a ~0, something entry.  In .debug_loc we have an
alternative: for DW_AT_low_pc 0 (have_multiple_text_sections in dwarf2out.c,
which is quite common these days), let the producer emit ~0, base entries
first, but that wastes 64 or 128 bits in the section on each location list
to get rid of most of the relocations.

	Jakub


* Re: RFC: Implicit DWARF relocs
  2012-04-25  8:37 RFC: Implicit DWARF relocs Jakub Jelinek
@ 2012-04-25 19:07 ` Cary Coutant
  2012-04-25 21:59   ` Jakub Jelinek
From: Cary Coutant @ 2012-04-25 19:07 UTC
  To: Jakub Jelinek; +Cc: binutils

> I've looked at the DWARF Fission proposal, and it looks to me as though a lot
> of effort is spent on decreasing the size of the debug info related
> relocations in ET_REL objects, often (at least as I understand it) at the cost
> of increasing the debug info itself (though by a smaller amount than is saved
> on the size of relocations).

Let's break this down into separate pieces:

(1) Relocations for DW_FORM_strp. By consolidating all the string
pointers into the .debug_str_offsets section, it becomes much easier
to have implicit relocations. Effectively, I'm moving the 4-byte
string offset from the DIE to the separate offsets section, then
adding an LEB128 index to the DIE, so that extra LEB128 index would be
a net increase in size -- until you have two or more references to the
same string. If we assume that an average string index will fit in a
two-byte LEB128, we only need to have two references to one string to
break even -- we add 4 bytes to the DIE stream, but we save an extra
4-byte string offset in the string offsets section. Now if we have
only one reference to a single string, odds are the compiler would
have used DW_FORM_string for it anyway, and the Fission proposal
changes nothing. I believe the new DW_FORM_str_index form and the
string offsets section can be a valuable improvement even apart from
Fission -- i.e., even when not splitting debug info into .dwo files.

(2) Relocations for references to .debug_abbrev, .debug_line. These
are insignificant, space-wise. There's no reason to ask the linker to
do any magic just to eliminate these.

(3) Relocations for DW_FORM_sec_offset, referring to .debug_loc
(loclistptr). For these, the Fission proposal essentially does what
you suggest. The references are left as unrelocated offsets, and it's
up to the DWARF consumer to locate the base of the .debug_loc
section. (Note that we have recently made some significant changes to
the way .debug_loc is handled -- we now move it to the .dwo file. If
you haven't read the Fission wiki page since I updated it yesterday,
please take another look.) Like (1), I think this could be a valuable
improvement even apart from Fission.

(4) Relocations for DW_FORM_sec_offset, referring to .debug_ranges
(rangelistptr). Here, we have also replaced relocated values with
unrelocated offsets, at the cost of adding a single relocated
attribute to the compile_unit DIE. (Again, please take a look at the
updated wiki page to see our recent changes.) Like (1) and (3), I
think this could also be an improvement apart from Fission.

(5) Relocations into loadable text and data. There's really nothing
that can help here, other than perhaps consolidating multiple
references to the same address into a single relocation. With Fission,
the .debug_addr section is crucial to the concept, and does let us do
that consolidation. Apart from Fission, it might still be useful: the
compiler could still use normal direct form for addresses likely to be
unique, but the new DW_FORM_addr_index form for addresses likely to
need consolidation.
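
The break-even arithmetic in item (1) can be checked with a short sketch. It
assumes 32-bit DWARF (4-byte string offsets) and counts only bytes in the DIE
stream plus the string offsets section, not relocation entries or the strings
themselves; the function names are illustrative, not from any existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes needed to encode v as unsigned LEB128. */
static size_t uleb128_size(uint64_t v)
{
    size_t n = 0;
    do {
        v >>= 7;
        n++;
    } while (v != 0);
    return n;
}

/* DW_FORM_strp: every reference stores a 4-byte offset in the DIE. */
static size_t strp_cost(size_t refs)
{
    return 4 * refs;
}

/* Indexed form: each reference stores a ULEB128 index in the DIE, plus one
   shared 4-byte slot for the string in the string offsets section. */
static size_t str_index_cost(uint64_t index, size_t refs)
{
    assert(refs > 0);
    return uleb128_size(index) * refs + 4;
}
```

With an index that needs a two-byte LEB128 (128 through 16383), one reference
costs 6 bytes against 4, two references cost 8 against 8, and every further
reference saves 2 bytes.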

> I wonder whether we couldn't use a different approach to reduce the size of
> relocations against the .debug_info/.debug_types/.debug_macro sections, and
> maybe .debug_loc.  The DWARF sections are structured and DWARF consumers
> know where to relocate things, so why couldn't the linker?

I've avoided having the linker process the DWARF data for a couple of reasons:

(1) Our primary goal is to get the debug info out of the link path
completely -- it's expensive to send all that debug info to a
distributed build server. Consolidating the data that needs relocation
into the bare minimum sections -- .debug_addr, and skeleton
.debug_info/types sections -- lets us omit the rest from the .o files.

(2) The linker is slow enough without having to parse DWARF info.
That's why we've fixed GCC to generate good .debug_pubnames and
.debug_pubtypes sections so that we can generate the .gdb_index
section without having to do a full parse.

> The .debug_info (and similarly .debug_types) algorithm would be as follows.
> For the .debug_info section, find the corresponding .debug_str, .debug_loc,
> .debug_line, .debug_ranges and .debug_abbrev sections, if available.
> "Corresponding" means: for a .debug_info in a comdat group, look for those
> named sections in the same comdat group first, then fall back to the named
> sections not in comdat; for a non-comdat .debug_info, use the non-comdat
> named sections only.
>
> Then qsort all the explicit relocations against the section by increasing
> offset and walk the section.  If the abbrev offset field in the DWARF CU
> header doesn't have an explicit relocation against it, add an implicit one
> against the corresponding .debug_abbrev section + the addend stored in that
> memory location.  Parse the abbrevs into an array, hash table or combined
> data structure, then walk the CU content.  If a DW_FORM_strp location doesn't
> have an explicit reloc, assume the corresponding .debug_str + the addend in
> that 4 byte field.  If a DW_FORM_sec_offset location doesn't have an explicit
> reloc, assume .debug_{line,ranges,loc} + the addend in that field for
> DW_AT_stmt_list, {DW_AT_ranges,DW_AT_start_scope}, and other attributes,
> respectively.  Wherever the implicit reloc would do the wrong thing, the
> producer must supply an explicit relocation.

From my measurements on large C++ apps, relocations for DW_FORM_strp
dominate all others. Consolidating these into .debug_str_offsets would
allow the linker to process these relocations implicitly without any
extra overhead from parsing the DWARF DIE structure. In addition, we
gain the benefit I described above of consolidating multiple
references to the same string.

> And maybe .debug_loc could have an algorithm that, for the address fields
> in the section, remembers the last relocation against an address field, if
> any, and, if an address field isn't ~0 or 0, implicitly relocates it relative
> to the last address field relocation - 1 (the - 1 bias is so that we never
> get 0 there).  Perhaps it should do so only until the terminating 0, 0 pair,
> and also stop on a ~0, something entry.  In .debug_loc we have an
> alternative: for DW_AT_low_pc 0 (have_multiple_text_sections in dwarf2out.c,
> which is quite common these days), let the producer emit ~0, base entries
> first, but that wastes 64 or 128 bits in the section on each location list
> to get rid of most of the relocations.

Please take a look at the revised wiki page for our new treatment of .debug_loc.

-cary


* Re: RFC: Implicit DWARF relocs
  2012-04-25 19:07 ` Cary Coutant
@ 2012-04-25 21:59   ` Jakub Jelinek
From: Jakub Jelinek @ 2012-04-25 21:59 UTC
  To: Cary Coutant; +Cc: binutils

On Wed, Apr 25, 2012 at 12:02:10PM -0700, Cary Coutant wrote:
> > I've looked at the DWARF Fission proposal, and it looks to me as though a lot
> > of effort is spent on decreasing the size of the debug info related
> > relocations in ET_REL objects, often (at least as I understand it) at the cost
> > of increasing the debug info itself (though by a smaller amount than is saved
> > on the size of relocations).
> 
> Let's break this down into separate pieces:

With this I'm not trying to replace Fission in any way.  I've just been
thinking about decreasing disk space usage for debug info in cases where the
most important thing is the final debug info size of linked objects, though
it doesn't hurt if the size of the ET_REL objects/*.a archives decreases too.
Given the size of debug info in commonly linked shared libraries/binaries in
the distributions, and from my work on dwz, I think a single pass over
selected debug sections, just parsing the forms, isn't that expensive; the
sections will largely be paged in during relocation processing anyway.

On my latest gcc build, looking at libbackend.a, there is ~ 70MB total of
.rela.debug_info sections, ~ 70MB total of .rela.debug_loc sections and
~ 100MB total of .rela.debug_macro sections.  By adding implicit relocations
there, the first 5 kinds of .rela.debug_info relocs, the first kind of
.rela.debug_macro relocs, and a big part of the .text relocs in
.rela.debug_loc (see the table below) could be saved, so libbackend.a could
shrink from ~ 750MB by, I'd guess, 200MB or slightly more than that.

Apart from .debug_loc, DW_FORM_strp is really the most important one, but
I'm afraid that adding an indirection table will make debug info grow
unnecessarily in the final executable.  From looking at several larger *.o
files, most DW_FORM_strp strings are usually referenced just once, so even
if you use uleb128 references into the indirection table, for large CUs a
string reference will mostly cost 6 bytes instead of just 4.

			number of relocations
.rela.debug_info
        .debug_str      1643281
        .debug_loc       363564
        .debug_ranges    163042
        .debug_line         305 
        .debug_abbrev       305
        others           769945
.rela.debug_macro
        .debug_str      4083259
        .debug_macro      68995
        .debug_line         305
.rela.debug_loc
        .text           2718726
        others           212556
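
As a rough cross-check of the table against the section sizes quoted earlier
in this message, the sketch below multiplies the relocation counts by the
size of one relocation entry. It assumes 64-bit ELF with RELA entries
(24 bytes each, the size of Elf64_Rela); the message itself does not state
the target, and the names used here are only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit ELF, so one RELA entry (Elf64_Rela) is 24 bytes. */
#define ELF64_RELA_SIZE 24u

struct reloc_count {
    const char *target;  /* what the relocations point at */
    uint64_t count;      /* number of relocations, from the table */
};

static const struct reloc_count debug_info_rows[] = {
    { ".debug_str", 1643281 }, { ".debug_loc", 363564 },
    { ".debug_ranges", 163042 }, { ".debug_line", 305 },
    { ".debug_abbrev", 305 }, { "others", 769945 },
};

static const struct reloc_count debug_macro_rows[] = {
    { ".debug_str", 4083259 }, { ".debug_macro", 68995 },
    { ".debug_line", 305 },
};

static const struct reloc_count debug_loc_rows[] = {
    { ".text", 2718726 }, { "others", 212556 },
};

/* Sum the relocation counts of one .rela.* section. */
static uint64_t total_relocs(const struct reloc_count *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += rows[i].count;
    return total;
}

/* Bytes occupied by `count` RELA entries under the assumption above. */
static uint64_t rela_bytes(uint64_t count)
{
    return count * ELF64_RELA_SIZE;
}
```

Under that assumption the totals come to about 70.6MB for .rela.debug_info,
99.7MB for .rela.debug_macro and 70.4MB for .rela.debug_loc, in line with
the ~ 70MB, ~ 100MB and ~ 70MB figures above.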

	Jakub
