* Range lists, zero-length functions, linker gc
@ 2020-05-31 18:55 Fangrui Song
  2020-05-31 19:15 ` Fangrui Song
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Fangrui Song @ 2020-05-31 18:55 UTC (permalink / raw)
  To: binutils, gdb, elfutils-devel

It is being discussed on llvm-dev
(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
what linkers should do regarding relocations referencing dropped functions (due
to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*

As an example:

   __attribute__((section(".text.x"))) void f1() { }
   __attribute__((section(".text.x"))) void f2() { }
   int main() { }

Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
symbols are collected):

   0x00000043:   DW_TAG_subprogram [2]
                 ###### relocated by .text.x + 10
                 DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
                 DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
                 DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
                 DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
                 DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")

With ld --gc-sections:

* DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
   This can cause overlapping address ranges with normal text sections. {{overlap}}
* [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
   See bfd/reloc.c (behavior introduced in
   https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )

   [0, 0) cannot be used because it terminates the list entry.
   [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
   subsequent address offset pairs.
* .debug_loc address offset pairs have similar problem to .debug_ranges
* In DWARF v5, the abnormal values can be in a separate section .debug_addr

---

To save your time, I have a summary of the discussions. I am eager to know what you think
of the ideas from binutils/gdb/elfutils's perspective.

* {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
   All (undef + addend) in .debug_* are resolved to -1.

   We have to ignore the addend. With __attribute__((section(".text.x"))),
   the address offset pair may be something like [.text.x + 16, .text.x + 24)
   I have to resolve the whole (.text.x + 16) to the special value.

   (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
   (0 and -1 cannot be used due to the reasons above).

* Refined formula for a relocated value in a non-SHF_ALLOC section:

     if is_defined(sym)
        return addr(sym) + addend
     if relocated_section is .debug_ranges or .debug_loc
        return -2 # addend is intentionally ignored

     // Every DWARF v5 section falls here
     return -1 {{zero}}

* {{zero}} Can we resolve (undef + addend) to 0?

   https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html

   > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
   > quirky/interesting use cases (admittedly - such platforms could equally want to make their
   > executable code way up in the address space near max or max - 1, etc?).

   Question: is address 0 meaningful for code in some binary formats?

* {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend.
   For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large
   enough, it may overlap with a normal text address range (for example [0x10000, *))

   This can cause problems in debuggers. How does gdb solve the problem?

* {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
   `addend`. For non-debug sections (open-ended), do we have needs resolving such
   values to `base` or `base+addend` where base is customizable?
   (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )
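The "refined formula" in the message above can be written out as a small runnable sketch. This is only an illustration of the proposal as stated in the thread, not code from any linker. The function name `relocate`, the use of `None` for a discarded symbol, and the 64-bit address width are assumptions made for the example.

```python
# Illustrative sketch of the proposed formula; not taken from GNU ld, gold or LLD.
ADDR_BITS = 64                    # assumption: 8-byte addresses, as with R_X86_64_64
MASK = (1 << ADDR_BITS) - 1

# Pre-DWARF v5 list sections, where 0 and -1 already have a special meaning.
LIST_SECTIONS = {".debug_ranges", ".debug_loc"}


def relocate(sym_addr, addend, relocated_section):
    """Value written for a relocation in a non-SHF_ALLOC section.

    sym_addr is None when the referenced symbol's section was discarded
    (hypothetical convention for this sketch).
    """
    if sym_addr is not None:
        return (sym_addr + addend) & MASK
    if relocated_section in LIST_SECTIONS:
        return -2 & MASK          # addend is intentionally ignored
    return -1 & MASK              # every DWARF v5 section falls here


live = relocate(0x401000, 0x10, ".debug_info")
dead_info = relocate(None, 0x10, ".debug_info")
dead_ranges = relocate(None, 0x10, ".debug_ranges")
print(hex(live), hex(dead_info), hex(dead_ranges))
```

The negative values are masked to the address width because the relocated field is an unsigned address-sized slot; with 8-byte addresses, -1 and -2 become the two highest representable addresses.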
* Re: Range lists, zero-length functions, linker gc 2020-05-31 18:55 Range lists, zero-length functions, linker gc Fangrui Song @ 2020-05-31 19:15 ` Fangrui Song 2020-05-31 20:10 ` Mark Wielaard ` (2 subsequent siblings) 3 siblings, 0 replies; 25+ messages in thread From: Fangrui Song @ 2020-05-31 19:15 UTC (permalink / raw) To: binutils, gdb, elfutils-devel On 2020-05-31, Fangrui Song wrote: >It is being discussed on llvm-dev >(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA) >what linkers should do regarding relocations referencing dropped functions (due >to section group rules, --gc-sections, /DISCARD/, etc) in .debug_* > >As an example: > > __attribute__((section(".text.x"))) void f1() { } > __attribute__((section(".text.x"))) void f2() { } > int main() { } > >Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION >symbols are collected): > > 0x00000043: DW_TAG_subprogram [2] > ###### relocated by .text.x + 10 > DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x") > DW_AT_high_pc [DW_FORM_data4] (0x00000006) > DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP) > DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v") > DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2") > > >With ld --gc-sections: > >* DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend > This can cause overlapping address ranges with normal text sections. {{overlap}} >* [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend). > See bfd/reloc.c (behavior introduced in > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 ) > > [0, 0) cannot be used because it terminates the list entry. > [-1, -1) cannot be used because -1 represents a base address selection entry which will affect > subsequent address offset pairs. 
>* .debug_loc address offset pairs have similar problem to .debug_ranges >* In DWARF v5, the abnormal values can be in a separate section .debug_addr > >--- > >To save your time, I have a summary of the discussions. I am eager to know what you think >of the ideas from binutils/gdb/elfutils's perspective. > >* {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address. > All (undef + addend) in .debug_* are resolved to -1. > > We have to ignore the addend. With __attribute__((section(".text.x"))), > the address offset pair may be something like [.text.x + 16, .text.x + 24) > I have to resolve the whole (.text.x + 16) to the special value. > > (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2 > (0 and -1 cannot be used due to the reasons above). > >* Refined formula for a relocated value in a non-SHF_ALLOC section: > > if is_defined(sym) > return addr(sym) + addend > if relocated_section is .debug_ranges or .debug_loc > return -2 # addend is intentionally ignored > > // Every DWARF v5 section falls here > return -1 {{zero}} > >* {{zero}} Can we resolve (undef + addend) to 0? > > https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html > > > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to > > quirky/interesting use cases (admittedly - such platforms could equally want to make their > > executable code way up in the address space near max or max - 1, etc?). > > Question: is address 0 meaningful for code in some binary formats? > >* {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend. > For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large > enough, it may overlap with a normal text address range (for example [0x10000, *)) > > This can cause problems in debuggers. How does gdb solve the problem? 
>
>* {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
>   `addend`. For non-debug sections (open-ended), do we have needs resolving such
>   values to `base` or `base+addend` where base is customizable?
>   (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )

Forgot to mention

* {{compatibility}} Do we need an option if we change the computed value of (undef + addend) to
   -2 (.debug_loc,.debug_ranges)/-1 (other .debug_*)
   (or 0 (other .debug_*), but it might not be nice to some binary formats {{reserved_address}})

   https://lists.llvm.org/pipermail/llvm-dev/2020-May/141958.html

   > If we end up blessing it as part of the DWARF spec, we probably
   > wouldn't want it to be user-configurable for the .debug_ sections, so
   > I'd hesitate to add that configurability to the linker lest we have to
   > revoke it to conform to DWARF (breaking flag compatibility with
   > previous versions of the linker, etc). Admittedly we'll be breaking
   > output compatibility with this change regardless, so potentially
   > having the flag as an escape hatch could be useful.

   I hope we don't need to have a linker option. But if some not-so-old versions
   of gdb / binutils programs / elfutils programs can't cope with -2/-1/0 {{reserved_address}},
   we may have to invent a linker option. I hope GNU ld, gold and LLD can have a compatible option.
   (As an LLD contributor, I'd be happy to implement the option in LLD)
* Re: Range lists, zero-length functions, linker gc 2020-05-31 18:55 Range lists, zero-length functions, linker gc Fangrui Song 2020-05-31 19:15 ` Fangrui Song @ 2020-05-31 20:10 ` Mark Wielaard 2020-05-31 20:47 ` Fangrui Song 2020-05-31 20:49 ` David Blaikie 2020-05-31 21:33 ` David Blaikie 2020-06-01 16:25 ` Andrew Burgess 3 siblings, 2 replies; 25+ messages in thread From: Mark Wielaard @ 2020-05-31 20:10 UTC (permalink / raw) To: Fangrui Song; +Cc: binutils, gdb, elfutils-devel Hi, On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote: > what linkers should do regarding relocations referencing dropped > functions (due to section group rules, --gc-sections, /DISCARD/, > etc) in .debug_* > > As an example: > > __attribute__((section(".text.x"))) void f1() { } > __attribute__((section(".text.x"))) void f2() { } > int main() { } > > Some .debug_* sections are relocated by R_X86_64_64 referencing > undefined symbols (the STT_SECTION symbols are collected): > > 0x00000043: DW_TAG_subprogram [2] > ###### relocated by .text.x + 10 > DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x") > DW_AT_high_pc [DW_FORM_data4] (0x00000006) > DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP) > DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v") > DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2") > > > With ld --gc-sections: > > * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + > addend This can cause overlapping address ranges with normal text > sections. {{overlap}} * [beginning address offset, ending address > offset) in .debug_ranges are resolved to 1 (ignoring addend). See > bfd/reloc.c (behavior introduced in > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 > ) > > [0, 0) cannot be used because it terminates the list entry. 
> [-1, -1) cannot be used because -1 represents a base address > selection entry which will affect subsequent address offset > pairs. > * .debug_loc address offset pairs have similar problem to .debug_ranges > * In DWARF v5, the abnormal values can be in a separate section .debug_addr > > --- > > I am eager to know what you think > of the ideas from binutils/gdb/elfutils's perspective. I think this is a producer problem. If a (code) section can be totally dropped then the associated (.debug) sections should have been generated together with that (code) section in a COMDAT group. That way when the linker drops that section, all the associated sections in that COMDAT group will get dropped with it. If you don't do that, then the DWARF is malformed and there is not much a consumer can do about it. Said otherwise, I don't think it is correct for the linker (with --gc-sections) to drop any sections that have references to it (through relocation symbols) from other (.debug) sections. Cheers, Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-05-31 20:10 ` Mark Wielaard @ 2020-05-31 20:47 ` Fangrui Song 2020-05-31 22:11 ` Mark Wielaard 2020-05-31 20:49 ` David Blaikie 1 sibling, 1 reply; 25+ messages in thread From: Fangrui Song @ 2020-05-31 20:47 UTC (permalink / raw) To: Mark Wielaard; +Cc: binutils, gdb, elfutils-devel On 2020-05-31, Mark Wielaard wrote: >Hi, > >On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote: >> what linkers should do regarding relocations referencing dropped >> functions (due to section group rules, --gc-sections, /DISCARD/, >> etc) in .debug_* >> >> As an example: >> >> __attribute__((section(".text.x"))) void f1() { } >> __attribute__((section(".text.x"))) void f2() { } >> int main() { } >> >> Some .debug_* sections are relocated by R_X86_64_64 referencing >> undefined symbols (the STT_SECTION symbols are collected): >> >> 0x00000043: DW_TAG_subprogram [2] >> ###### relocated by .text.x + 10 >> DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x") >> DW_AT_high_pc [DW_FORM_data4] (0x00000006) >> DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP) >> DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v") >> DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2") >> >> >> With ld --gc-sections: >> >> * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + >> addend This can cause overlapping address ranges with normal text >> sections. {{overlap}} * [beginning address offset, ending address >> offset) in .debug_ranges are resolved to 1 (ignoring addend). See >> bfd/reloc.c (behavior introduced in >> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 >> ) >> >> [0, 0) cannot be used because it terminates the list entry. >> [-1, -1) cannot be used because -1 represents a base address >> selection entry which will affect subsequent address offset >> pairs. 
>> * .debug_loc address offset pairs have similar problem to .debug_ranges >> * In DWARF v5, the abnormal values can be in a separate section .debug_addr >> >> --- >> >> I am eager to know what you think >> of the ideas from binutils/gdb/elfutils's perspective. > >I think this is a producer problem. If a (code) section can be totally >dropped then the associated (.debug) sections should have been >generated together with that (code) section in a COMDAT group. That >way when the linker drops that section, all the associated sections in >that COMDAT group will get dropped with it. If you don't do that, then >the DWARF is malformed and there is not much a consumer can do about >it. > >Said otherwise, I don't think it is correct for the linker (with >--gc-sections) to drop any sections that have references to it >(through relocation symbols) from other (.debug) sections. I would love if we could solve the problem using ELF features, but putting DW_TAG_subprogram in the same section group is not an unqualified win (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141926.html) (Cost: sizeof(Elf64_Shdr) = 64, Elf_Word for the entry in .group, plus a string in .strtab unless you use the string ".debug_info" (reusing the string requires https://sourceware.org/bugzilla/show_bug.cgi?id=25380)) According to Peter Smith in the thread https://groups.google.com/forum/#!msg/generic-abi/A-1rbP8hFCA/EDA7Sf3KBwAJ , Arm Compiler 5 splits up DWARF v3 debugging information and puts these sections into comdat groups: "This approach did produce significantly more debug information than gcc did. For small microcontroller projects this wasn't a problem. For larger feature phone problems we had to put a lot of work into keeping the linker's memory usage down as many of our customers at the time were using 32-bit Windows machines with a default maximum virtual memory of 2Gb." See Ben, Ali and others' comments in the thread. Fragmented .debug_* may not be practical. 
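The per-section cost quoted above (a 64-byte Elf64_Shdr, one Elf_Word in .group, and a name string unless it is reused) can be turned into rough arithmetic. The sketch below is a back-of-the-envelope estimate only. The function names and the section name ".debug_info.f2" are hypothetical, and the estimate ignores other costs such as extra relocation sections and DWARF unit headers.

```python
ELF64_SHDR_SIZE = 64   # sizeof(Elf64_Shdr), as stated in the message
ELF_WORD_SIZE = 4      # one Elf_Word entry in the .group section


def per_section_overhead(name, reuse_name=False):
    """ELF metadata bytes for one extra debug section placed in a section group."""
    # A reused name (e.g. ".debug_info") costs no new string bytes;
    # otherwise count the name plus its NUL terminator.
    name_bytes = 0 if reuse_name else len(name.encode()) + 1
    return ELF64_SHDR_SIZE + ELF_WORD_SIZE + name_bytes


def total_overhead(num_functions, name=".debug_info", reuse_name=True):
    """Overhead when every function gets its own fragment of one debug section."""
    return num_functions * per_section_overhead(name, reuse_name)


shared_name = per_section_overhead(".debug_info", reuse_name=True)
unique_name = per_section_overhead(".debug_info.f2")   # hypothetical per-function name
print(shared_name, unique_name, total_overhead(10_000))
```

Even with the name reused, 10,000 functions would add several hundred kilobytes of section headers and group entries for a single fragmented debug section, which is the kind of object-size cost the thread is weighing.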
* Re: Range lists, zero-length functions, linker gc 2020-05-31 20:47 ` Fangrui Song @ 2020-05-31 22:11 ` Mark Wielaard 2020-05-31 23:17 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-05-31 22:11 UTC (permalink / raw) To: Fangrui Song; +Cc: gdb, elfutils-devel, binutils Hi, On Sun, May 31, 2020 at 01:47:38PM -0700, Fangrui Song via Elfutils-devel wrote: > On 2020-05-31, Mark Wielaard wrote: > > I think this is a producer problem. If a (code) section can be totally > > dropped then the associated (.debug) sections should have been > > generated together with that (code) section in a COMDAT group. That > > way when the linker drops that section, all the associated sections in > > that COMDAT group will get dropped with it. If you don't do that, then > > the DWARF is malformed and there is not much a consumer can do about > > it. > > > > Said otherwise, I don't think it is correct for the linker (with > > --gc-sections) to drop any sections that have references to it > > (through relocation symbols) from other (.debug) sections. > > I would love if we could solve the problem using ELF features, but > putting DW_TAG_subprogram in the same section group is not an > unqualified win Sorry for pushing back a little, but as a DWARF consumer this feels a little like the DWARF producer hasn't tried hard enough to produce valid DWARF and now tries to pass the problems off onto the DWARF consumer. Or when looking at it from the perspective of the linker, the compiler gave it an impossible problem to solve because it didn't really get all the pieces of the puzzle (the compiler already fused some independent parts together). I certainly appreciate the issue on 32-bit systems. It seems we already have reached the limits for some programs to be linked (or produce all the debuginfo) when all you got is 32-bits. But maybe that means that the problem is actually that the compiler already produced too much code/data. 
And the issue really is that it passes some problems, like unused code
elimination, off to the linker, while the compiler really should have a
better view of that and should do that job itself. If it did, then it
would never even produce the debuginfo in the first place.

GCC used to produce horrible DWARF years ago with early LTO
implementations, because they just handed it all off to the linker to
figure out. But they solved it by generating DWARF in phases: the DWARF
was only produced once it was known to be valid/consistent. So if some
code was actually eliminated, the linker never even sees any "code
ranges" for code that disappeared. See Early Debug:
https://gcc.gnu.org/wiki/early-debug

Might some similar technique, where the compiler does a bit more work,
so that it actually produces less DWARF to be processed by the linker,
be used here?

Sorry for pushing the problem back to the producer side, but as a
consumer I think that is the more correct place to solve this.

Cheers,

Mark
* Re: Range lists, zero-length functions, linker gc 2020-05-31 22:11 ` Mark Wielaard @ 2020-05-31 23:17 ` David Blaikie 0 siblings, 0 replies; 25+ messages in thread From: David Blaikie @ 2020-05-31 23:17 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils On Sun, May 31, 2020 at 3:42 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sun, May 31, 2020 at 01:47:38PM -0700, Fangrui Song via Elfutils-devel wrote: > > On 2020-05-31, Mark Wielaard wrote: > > > I think this is a producer problem. If a (code) section can be totally > > > dropped then the associated (.debug) sections should have been > > > generated together with that (code) section in a COMDAT group. That > > > way when the linker drops that section, all the associated sections in > > > that COMDAT group will get dropped with it. If you don't do that, then > > > the DWARF is malformed and there is not much a consumer can do about > > > it. > > > > > > Said otherwise, I don't think it is correct for the linker (with > > > --gc-sections) to drop any sections that have references to it > > > (through relocation symbols) from other (.debug) sections. > > > > I would love if we could solve the problem using ELF features, but > > putting DW_TAG_subprogram in the same section group is not an > > unqualified win > > Sorry for pushing back a little, No worries - so long as other people engage with the rest of the thread, hopefully - happy to/worthwhile discussing all the edges. > but as a DWARF consumer this feels a > little like the DWARF producer hasn't tried hard enough to produce > valid DWARF and now tries to pass the problems off onto the DWARF > consumer. I think the fact that it's been this way across multiple compilers, linkers, and debuggers for decades is pretty strong evidence that it's at least a strategy producers do use/probably want to/will continue using. 
> Or when looking at it from the perspective of the linker, > the compiler gave it an impossible problem to solve because it didn't > really get all the pieces of the puzzle (the compiler already fused > some independent parts together). > > I certainly appreciate the issue on 32-bit systems. It seems we > already have reached the limits for some programs to be linked (or > produce all the debuginfo) when all you got is 32-bits. > > But maybe that means that the problem is actually that the compiler > already produced too much code/data. And the issue really is that it > passes some problems, like unused code elimination, off to the > linker. While the compiler really should have a better view of that, > and should do that job itself. Something like LLVM's ThinLTO does help here - avoiding duplication in object files, but doesn't entirely eliminate code removal in the final link step. Anything that attempts to improve this (including ThinLTO) comes at the cost of "pinch points" in the compilation - places where global knowledge is required to decide how to remove the redundancy - which complicates and potentially slows down the build (if you want cross-file optimizations, you're willing to pay some slowdown there - but if you're looking for a quick interactive build, this sort of extra pinch point is going to be unfortunate (ThinLTO helps mitigate some of the huge cost of LTO, but it's still extra steps)). > If it did, then it would never even > produce the debuginfo in the first place. > > GCC used to produce horrible DWARF years ago with early LTO > implementations, because they just handed it all off to the linker to > figure out. But they solved it by generating DWARF in phases, only > when it was known the DWARF was valid/consistent did it get > produced. So that if some code was actually eliminated then the linker > never even see any "code ranges" for code that disappeared. 
> See Early Debug: https://gcc.gnu.org/wiki/early-debug

Ah, interesting read - thanks for the link! Yeah, LLVM took a
different path there - the serializable IR (GIMPLE equivalent, I guess)
includes a semantic representation of DWARF, essentially - so while
we've dealt with various issues around IR+IR linking for (Thin & full)
LTO, it wasn't such a hard break/architectural issue as GCC dealt with
there.

Though we have discussed/entertained the idea of doing something more
like GCC does - generating static DWARF earlier in the front-end and
serializing a blob of relatively opaque DWARF in the IR except for the
bits of IR (variable locations, etc) that the compiler needs
visibility into. That particular way GCC used of separating the CUs is
an interesting one to know about/keep in mind if we go down that route
(though might hit the Split DWARF/multiple CUs issue if we did).

> Might some similar technique, where the compiler does a bit more work,
> so that it actually produces less DWARF to be processed by the linker,
> be used here?

Not while we're looking at the "classic" compilation model (compile
source files to object files, link object files), that I'm aware of.

> Sorry for pushing the problem back to the producer side, but as a
> consumer I think that is the more correct place to solve this.

No worries - and I think there might be some interesting approaches to
consider, but I think the history of this issue is long enough that
some producers, in some use-cases, will continue to want this
functionality that's been (in some cases explicitly (eg: bfd's support
for debug_ranges looks very explicitly to support DWARF in this
situation), in some cases de facto) supported for quite a while.

- Dave
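The bfd handling of debug_ranges mentioned above (both halves of a discarded pair resolved to 1) depends on how a pre-DWARF v5 range list is read. The sketch below is a minimal reader written for illustration, assuming 8-byte addresses and a list already decoded into (begin, end) pairs. It is not gdb's, binutils' or elfutils' actual code, and the two live ranges are made up.

```python
MAX_ADDR = (1 << 64) - 1   # the "-1" value for an 8-byte address


def read_range_list(pairs, cu_base=0):
    """Interpret (begin, end) pairs the way a DWARF v2-v4 range list is defined."""
    base = cu_base
    ranges = []
    for begin, end in pairs:
        if begin == 0 and end == 0:
            break                  # end-of-list entry
        if begin == MAX_ADDR:
            base = end             # base address selection entry
            continue
        if begin == end:
            continue               # empty range, contributes nothing
        ranges.append(((base + begin) & MAX_ADDR, (base + end) & MAX_ADDR))
    return ranges


live_a = (0x1000, 0x1010)          # hypothetical surviving functions
live_b = (0x2000, 0x2020)


def with_dead(value):
    """A list whose middle pair belonged to a discarded function."""
    return [live_a, (value, value), live_b, (0, 0)]


as_one = read_range_list(with_dead(1))              # ld.bfd's choice
as_minus2 = read_range_list(with_dead(MAX_ADDR - 1))  # the proposed -2
as_zero = read_range_list(with_dead(0))
as_minus1 = read_range_list(with_dead(MAX_ADDR))
print(as_one, as_minus2, as_zero, as_minus1)
```

In this reader, 1 and -2 both produce an empty range that is skipped, while 0 truncates the list and -1 changes the base address for the entries that follow. Whether existing consumers treat -2 this cleanly is exactly the compatibility question raised in the thread.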
* Re: Range lists, zero-length functions, linker gc 2020-05-31 20:10 ` Mark Wielaard 2020-05-31 20:47 ` Fangrui Song @ 2020-05-31 20:49 ` David Blaikie 2020-05-31 22:29 ` Mark Wielaard 1 sibling, 1 reply; 25+ messages in thread From: David Blaikie @ 2020-05-31 20:49 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils On Sun, May 31, 2020 at 1:41 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote: > > what linkers should do regarding relocations referencing dropped > > functions (due to section group rules, --gc-sections, /DISCARD/, > > etc) in .debug_* > > > > As an example: > > > > __attribute__((section(".text.x"))) void f1() { } > > __attribute__((section(".text.x"))) void f2() { } > > int main() { } > > > > Some .debug_* sections are relocated by R_X86_64_64 referencing > > undefined symbols (the STT_SECTION symbols are collected): > > > > 0x00000043: DW_TAG_subprogram [2] > > ###### relocated by .text.x + 10 > > DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x") > > DW_AT_high_pc [DW_FORM_data4] (0x00000006) > > DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP) > > DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v") > > DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2") > > > > > > With ld --gc-sections: > > > > * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + > > addend This can cause overlapping address ranges with normal text > > sections. {{overlap}} * [beginning address offset, ending address > > offset) in .debug_ranges are resolved to 1 (ignoring addend). See > > bfd/reloc.c (behavior introduced in > > https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 > > ) > > > > [0, 0) cannot be used because it terminates the list entry. 
> > [-1, -1) cannot be used because -1 represents a base address > > selection entry which will affect subsequent address offset > > pairs. > > * .debug_loc address offset pairs have similar problem to .debug_ranges > > * In DWARF v5, the abnormal values can be in a separate section .debug_addr > > > > --- > > > > I am eager to know what you think > > of the ideas from binutils/gdb/elfutils's perspective. > > I think this is a producer problem. If a (code) section can be totally > dropped then the associated (.debug) sections should have been > generated together with that (code) section in a COMDAT group. That > way when the linker drops that section, all the associated sections in > that COMDAT group will get dropped with it. If you don't do that, then > the DWARF is malformed and there is not much a consumer can do about > it. > > Said otherwise, I don't think it is correct for the linker (with > --gc-sections) to drop any sections that have references to it > (through relocation symbols) from other (.debug) sections. That's probably not practical for at least some users - the easiest/most thorough counter-example is Split DWARF - the DWARF is in another file the linker can't see. All the linker sees is a list of addresses (debug_addr). All 3 linkers have (modulo bugs) supported this situation, to varying degrees, for decades (ld.bfd: resolve to zero everywhere, resolve to 1 in debug_ranges, lld/gold: resolve to 0+addend) & this is an attempt to fix the bugs & maybe make the solution a bit more robust/work for more cases/be more intentional. 
Even setting Split DWARF aside, creating DWARF that can be dropped by a
non-DWARF-aware linker involves larger DWARF, which isn't always a great
tradeoff - some users care a lot more about object size than executable
size (and the extra sections may also increase link time). By
"non-DWARF-aware" I mean a linker that doesn't have to parse/rebuild all
the DWARF at link time, which would be super expensive (though someone's
prototyping that in lld for those willing to pay that tradeoff).
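The "0+addend" resolution mentioned above is what produces the overlap described earlier in the thread. The sketch below replays that example with plain half-open ranges. The helper names are made up for illustration, and the upper bound of the live text range is an assumption, since the thread only gives its start (0x10000).

```python
def resolve_to_addend(addend):
    """Models the "0 + addend" result for a relocation against a discarded symbol."""
    return 0 + addend


def overlaps(a, b):
    """True when half-open ranges a = [lo, hi) and b = [lo, hi) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def owners(pc, ranges):
    """Names of all ranges that claim the given program counter."""
    return [name for name, (lo, hi) in ranges.items() if lo <= pc < hi]


# [.text + 0, .text + 0x10010) from the example, with .text discarded.
dead = (resolve_to_addend(0x0), resolve_to_addend(0x10010))
# Hypothetical live text range; only its start comes from the thread.
live_text = (0x10000, 0x20000)

ranges = {"dead_function": dead, "live_text": live_text}
ambiguous = owners(0x10008, ranges)
print(overlaps(dead, live_text), ambiguous)
```

A consumer that looks up 0x10008 finds two candidates, one of which describes code that no longer exists. A small dead range such as [0x10, 0x16) would not collide here, which is why the problem only shows up when the addend is large enough.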
* Re: Range lists, zero-length functions, linker gc 2020-05-31 20:49 ` David Blaikie @ 2020-05-31 22:29 ` Mark Wielaard 2020-05-31 22:36 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-05-31 22:29 UTC (permalink / raw) To: David Blaikie; +Cc: Fangrui Song, gdb, elfutils-devel, binutils Hi, On Sun, May 31, 2020 at 01:49:12PM -0700, David Blaikie wrote: > On Sun, May 31, 2020 at 1:41 PM Mark Wielaard <mark@klomp.org> wrote: > > On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote: > > > I am eager to know what you think > > > of the ideas from binutils/gdb/elfutils's perspective. > > > > I think this is a producer problem. If a (code) section can be totally > > dropped then the associated (.debug) sections should have been > > generated together with that (code) section in a COMDAT group. That > > way when the linker drops that section, all the associated sections in > > that COMDAT group will get dropped with it. If you don't do that, then > > the DWARF is malformed and there is not much a consumer can do about > > it. > > > > Said otherwise, I don't think it is correct for the linker (with > > --gc-sections) to drop any sections that have references to it > > (through relocation symbols) from other (.debug) sections. > > That's probably not practical for at least some users - the > easiest/most thorough counter-example is Split DWARF - the DWARF is in > another file the linker can't see. All the linker sees is a list of > addresses (debug_addr). I might be missing something, but I think this works fine with Split DWARF. As long as you make sure that the .dwo files/sections are separated along the same lines as the ELF section groups are. That means each section group either gets its own .dwo file, or you generate the .dwo sections in the same section group in the same object file using the SHF_EXCLUDED trick. That way each .debug.dwo uses their own index into the separate .debug_addr tables. 
If that group, with the .debug_addr table, gets discarded, then the
reference to the .dwo also disappears and it simply won't be used.

Cheers,

Mark
* Re: Range lists, zero-length functions, linker gc 2020-05-31 22:29 ` Mark Wielaard @ 2020-05-31 22:36 ` David Blaikie 2020-06-01 9:31 ` Mark Wielaard 0 siblings, 1 reply; 25+ messages in thread From: David Blaikie @ 2020-05-31 22:36 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils On Sun, May 31, 2020 at 3:30 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sun, May 31, 2020 at 01:49:12PM -0700, David Blaikie wrote: > > On Sun, May 31, 2020 at 1:41 PM Mark Wielaard <mark@klomp.org> wrote: > > > On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote: > > > > I am eager to know what you think > > > > of the ideas from binutils/gdb/elfutils's perspective. > > > > > > I think this is a producer problem. If a (code) section can be totally > > > dropped then the associated (.debug) sections should have been > > > generated together with that (code) section in a COMDAT group. That > > > way when the linker drops that section, all the associated sections in > > > that COMDAT group will get dropped with it. If you don't do that, then > > > the DWARF is malformed and there is not much a consumer can do about > > > it. > > > > > > Said otherwise, I don't think it is correct for the linker (with > > > --gc-sections) to drop any sections that have references to it > > > (through relocation symbols) from other (.debug) sections. > > > > That's probably not practical for at least some users - the > > easiest/most thorough counter-example is Split DWARF - the DWARF is in > > another file the linker can't see. All the linker sees is a list of > > addresses (debug_addr). > > I might be missing something, but I think this works fine with Split > DWARF. As long as you make sure that the .dwo files/sections are > separated along the same lines as the ELF section groups are. 
That > means each section group either gets its own .dwo file, or you > generate the .dwo sections in the same section group in the same > object file using the SHF_EXCLUDED trick. That way each .debug.dwo > uses their own index into the separate .debug_addr tables. If that > group, with the .debug_addr table, gets discarded, then the reference > to the .dwo also disappears and it simply won't be used. Oh, a whole separate .dwo file per function? That would be pretty extreme/difficult to implement (now the compiler's producing a variable number of output files? using some naming scheme so the build system could find them again for building a .dwp if needed, etc). Certainly Bazel (& the internal Google version used to build most Google software) can't handle an unbounded/unknown number of output files from a build action. Multiple CUs in a single .dwo file is not really supported, which would be another challenge (we had to compromise debug info quality a little because of this limitation when doing ThinLTO - unable to emit multiple CUs into each thin-linked .o file) - at which point maybe the compiler'd need to produce an intermediate .dwp file of sorts... but there wouldn't be any great way for the debugger to find those intermediate .dwp files (since it's basically "either find the .dwo file that's written in the DWARF, or find the .dwp file relative to the executable name)? Not sure. & again the overhead of all those separate contributions, headers, etc, turns out to be not very desirable in any case. - Dave ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-05-31 22:36 ` David Blaikie @ 2020-06-01 9:31 ` Mark Wielaard 2020-06-01 20:18 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-06-01 9:31 UTC (permalink / raw) To: David Blaikie; +Cc: Fangrui Song, gdb, elfutils-devel, binutils Hi, On Sun, May 31, 2020 at 03:36:02PM -0700, David Blaikie wrote: > On Sun, May 31, 2020 at 3:30 PM Mark Wielaard <mark@klomp.org> wrote: > > On Sun, May 31, 2020 at 01:49:12PM -0700, David Blaikie wrote: > > > That's probably not practical for at least some users - the > > > easiest/most thorough counter-example is Split DWARF - the DWARF is in > > > another file the linker can't see. All the linker sees is a list of > > > addresses (debug_addr). > > > > I might be missing something, but I think this works fine with Split > > DWARF. As long as you make sure that the .dwo files/sections are > > separated along the same lines as the ELF section groups are. That > > means each section group either gets its own .dwo file, or you > > generate the .dwo sections in the same section group in the same > > object file using the SHF_EXCLUDED trick. That way each .debug.dwo > > uses their own index into the separate .debug_addr tables. If that > > group, with the .debug_addr table, gets discarded, then the reference > > to the .dwo also disappears and it simply won't be used. > > Oh, a whole separate .dwo file per function? That would be pretty > extreme/difficult to implement (now the compiler's producing a > variable number of output files? using some naming scheme so the build > system could find them again for building a .dwp if needed, etc). Each skeleton compilation unit has a DW_AT_dwo_name attribute which indicates the .dwo file where the split unit sections can be found. 
It actually seems easier to generate a different one for each skeleton compilation unit than trying to combine them for all the different skeleton compilation units you produce. > Certainly Bazel (& the internal Google version used to build most > Google software) can't handle an unbounded/unknown number of output > files from a build action. Yes, in principle .dwo files seem troublesome for build systems in general. Especially since to do things properly you would need to read the actual dwo_name attribute to make the connection from object/skeleton file to split dwarf object file. And there is no easy way to map back from .dwo to main ELF file. Because of that I am actually a fan of the SHF_EXCLUDED hack that simply places the split .dwo sections in the same object file. For the above that would mean, just place them in the same section group. > Multiple CUs in a single .dwo file is not really supported, which > would be another challenge (we had to compromise debug info quality a > little because of this limitation when doing ThinLTO - unable to emit > multiple CUs into each thin-linked .o file) - at which point maybe the > compiler'd need to produce an intermediate .dwp file of sorts... Are you sure? Each CU would have a separate dwo_id field to distinguish them. At least that is how elfutils figures out which CU in a dwo file matches a given skeleton DIE. This should work the same as for type units, you can have multiple type units in the same file and distinguish which one you need by matching the signature. > & again the overhead of all those separate contributions, headers, > etc, turns out to be not very desirable in any case. Yes, I agree with that. But as said earlier, maybe the compiler shouldn't have generated the code/data in the first place? Cheers, Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
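The dwo_id matching described in the message above can be sketched roughly as follows. This is only a toy model of "several split CUs in one .dwo file, selected by ID"; `SplitUnit`, `find_split_unit`, and the sample IDs are hypothetical names made up for illustration and are not the elfutils API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SplitUnit:
    """Hypothetical stand-in for a split CU header found in a .dwo file."""
    dwo_id: int  # 64-bit ID shared by a skeleton CU and its split CU
    name: str    # illustrative label only


def find_split_unit(units: Iterable[SplitUnit],
                    skeleton_dwo_id: int) -> Optional[SplitUnit]:
    """Return the split unit whose dwo_id equals the skeleton's, if any."""
    for unit in units:
        if unit.dwo_id == skeleton_dwo_id:
            return unit
    return None


# Two split CUs living in one (made-up) .dwo file.
dwo_units = [
    SplitUnit(dwo_id=0x1111AAAA2222BBBB, name="foo.cc"),
    SplitUnit(dwo_id=0x3333CCCC4444DDDD, name="bar.cc"),
]

match = find_split_unit(dwo_units, 0x3333CCCC4444DDDD)
missing = find_split_unit(dwo_units, 0xDEADBEEF)
```

The linear scan is the point of contention later in the thread: it works, but a consumer has to walk the units in the file rather than jump straight to one.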
* Re: Range lists, zero-length functions, linker gc 2020-06-01 9:31 ` Mark Wielaard @ 2020-06-01 20:18 ` David Blaikie 2020-06-02 16:50 ` Mark Wielaard 0 siblings, 1 reply; 25+ messages in thread From: David Blaikie @ 2020-06-01 20:18 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils On Mon, Jun 1, 2020 at 2:31 AM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Sun, May 31, 2020 at 03:36:02PM -0700, David Blaikie wrote: > > On Sun, May 31, 2020 at 3:30 PM Mark Wielaard <mark@klomp.org> wrote: > > > On Sun, May 31, 2020 at 01:49:12PM -0700, David Blaikie wrote: > > > > That's probably not practical for at least some users - the > > > > easiest/most thorough counter-example is Split DWARF - the DWARF is in > > > > another file the linker can't see. All the linker sees is a list of > > > > addresses (debug_addr). > > > > > > I might be missing something, but I think this works fine with Split > > > DWARF. As long as you make sure that the .dwo files/sections are > > > separated along the same lines as the ELF section groups are. That > > > means each section group either gets its own .dwo file, or you > > > generate the .dwo sections in the same section group in the same > > > object file using the SHF_EXCLUDED trick. That way each .debug.dwo > > > uses their own index into the separate .debug_addr tables. If that > > > group, with the .debug_addr table, gets discarded, then the reference > > > to the .dwo also disappears and it simply won't be used. > > > > Oh, a whole separate .dwo file per function? That would be pretty > > extreme/difficult to implement (now the compiler's producing a > > variable number of output files? using some naming scheme so the build > > system could find them again for building a .dwp if needed, etc). > > Each skeleton compilation unit has a DW_AT_dwo_name attribute which > indicates the .dwo file where the split unit sections can be found. 
It > actually seems seems easier to generate a different one for each > skeleton compilation unit than trying to combine them for all the > different skeleton compilation units you produce. > > > Certainly Bazel (& the internal Google version used to build most > > Google software) can't handle an unbounded/unknown number of output > > files from a build action. > > Yes, in principle .dwo files seems troublesome for build systems in > general. They're pretty practical when they're generated right next to the .o file & that's guaranteed by the compiler. "if you generate x.o, there will be x.dwo next to it" - that's certainly how Bazel deals with this. It doesn't parse the DWARF at all - knowing where the .dwo files are along with the .o files. > Especially since to do things properly you would need to read > the actual dwo_name attribute to make the connection from > object/skeleton file to split dwarf object file. And there is no easy > way to map back from .dwo to main ELF file. I don't think those issues have come up as problems for Google's deployment of Split DWARF which we've been using since the early prototypes. > Because of that I am > actually a fan of the SHF_EXCLUDED hack that simply places the split > .dwo sections in the same object file. For the above that would mean, > just place them in the same section group. This was a newer feature added during standardization of Split DWARF, which is handy for some users - but doesn't address the needs of the original design of Split DWARF (for Google) - a distributed build system that is trying to avoid moving more bytes than it must to one machine to run the link step. So not having to ship all the DWARF bytes to one machine for interactive debugging (pulling down from a distributed file system only the needed .dwo files during debugging - not all of them) - or at least being able to ship all the .dwo files to one machine to make a .dwp, and ship all the .o files to another machine for the link. 
> > > Multiple CUs in a single .dwo file is not really supported, which > > would be another challenge (we had to compromise debug info quality a > > little because of this limitation when doing ThinLTO - unable to emit > > multiple CUs into each thin-linked .o file) - at which point maybe the > > compiler'd need to produce an intermediate .dwp file of sorts... > > Are you sure? Fairly sure - I worked in depth on the implementation of ThinLTO & considered a variety of options trying to support Split DWARF in that situation. > Each CU would have a separate dwo_id field to > distinquish them. At least that is how elfutils figures out which CU > in a dwo file matches a given skeleton DIE. This should work the same > as for type units, you can have multiple type untis in the same file > and distinquish which one you need by matching the signature. One of the complications is that it increased the complexity of making a .dwp file - Split DWARF is spec'd to ensure that the linking process is as lightweight as possible. Not having the size overhead of relocations (though trading off more indirection through the cu_index, debug_str_offsets, etc). Oh right... that was the critical issue: There was no way I could think of to do cross-CU references in Split DWARF (cross-CU references being critical to LTO - inlining from one CU into another, etc). Because there was no relocation processing in dwp generation. Arguably maybe one could use a sec_offset that's resolved relative to a local range within the contributions described by the cu_index - but the cu_index must have one entry per unit (the entries are keyed on unit) - I guess you could have a single entry per CU, but have those entries overlap (so all the CUs from one dwo file get separate index entries that contain the same contribution ranges). Then consumers would have to search through the debug_info contribution to find the right unit.... defeating some of the value of the index. 
> > & again the overhead of all those separate contributions, headers, > > etc, turns out to be not very desirable in any case. > > Yes, I agree with that. But as said earlier, maybe the compiler > shouldn't have generated to code/data in the first place? In the (especially) C++ compilation model, I don't believe that's possible - inline functions, templates, etc, require duplication - unless you have a more complicated build process that can gather the potential duplication, then fan back out again to compile, etc. ThinLTO does some of this - at a cost of a more complicated build system, etc. - Dave ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-06-01 20:18 ` David Blaikie @ 2020-06-02 16:50 ` Mark Wielaard 2020-06-02 18:06 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-06-02 16:50 UTC (permalink / raw) To: David Blaikie; +Cc: Fangrui Song, gdb, elfutils-devel, binutils Hi, On Mon, 2020-06-01 at 13:18 -0700, David Blaikie wrote: > On Mon, Jun 1, 2020 at 2:31 AM Mark Wielaard <mark@klomp.org> wrote: > > Each skeleton compilation unit has a DW_AT_dwo_name attribute which > > indicates the .dwo file where the split unit sections can be found. It > > actually seems easier to generate a different one for each > > skeleton compilation unit than trying to combine them for all the > > different skeleton compilation units you produce. > > > > > Certainly Bazel (& the internal Google version used to build most > > > Google software) can't handle an unbounded/unknown number of output > > > files from a build action. > > > > Yes, in principle .dwo files seem troublesome for build systems in > > general. > > They're pretty practical when they're generated right next to the .o > file & that's guaranteed by the compiler. "if you generate x.o, there > will be x.dwo next to it" - that's certainly how Bazel deals with > this. It doesn't parse the DWARF at all - knowing where the .dwo files > are along with the .o files. The DWARF spec makes it clear that a DWO is per CU, not per object file. So when an object file contains multiple CUs, it might also be associated with multiple .dwo files (as is also the case with a linked executable or shared library). The spec says the DW_AT_dwo_name can contain either a (relative) file name or a path to the associated DWO file. Which means that relying on a one-to-one mapping from .o to .dwo is fragile and is likely to break when tools start using multiple CUs or different naming heuristics.
> > Because of that I am > > actually a fan of the SHF_EXCLUDED hack that simply places the split > > .dwo sections in the same object file. For the above that would mean, > > just place them in the same section group. > > This was a newer feature added during standardization of Split DWARF, > which is handy for some users Although it is used in practice by some producers, it is not standardized (yet). Also because SHF_EXCLUDED isn't standardized (although it is used consistently for those arches that support it). > - but doesn't address the needs of the > original design of Split DWARF (for Google) - a distributed build > system that is trying to avoid moving more bytes than it must to one > machine to run the link step. So not having to ship all the DWARF > bytes to one machine for interactive debugging (pulling down from a > distributed file system only the needed .dwo files during debugging - > not all of them) - or at least being able to ship all the .dwo files > to one machine to make a .dwp, and ship all the .o files to another > machine for the link. I think that is not what most people would use split-dwarf for. The Google setup seems somewhat unique. Most people probably do compiling, linking and debugging on the same machine. The main use case (for me) is to speed up the edit-compile-debug cycle. Making sure that the linker doesn't have to deal with (most of) the .debug sections and can just leave them behind (either in the .o file, or a separate .dwo file) is the main attraction of split-dwarf IMHO. When actually producing production builds with debug you still pay the price anyway, because instead of the linker, you now need to build your dwp packages which does most of the same work the linker would have done anyway (combining the data, merging the string indexes, deduplicating debug types, etc.)
> > > Multiple CUs in a single .dwo file is not really supported, which > > > would be another challenge (we had to compromise debug info quality a > > > little because of this limitation when doing ThinLTO - unable to emit > > > multiple CUs into each thin-linked .o file) - at which point maybe the > > > compiler'd need to produce an intermediate .dwp file of sorts... > > > > Are you sure? > > Fairly sure - I worked in depth on the implementation of ThinLTO & > considered a variety of options trying to support Split DWARF in that > situation. > > > Each CU would have a separate dwo_id field to > > distinquish them. At least that is how elfutils figures out which CU > > in a dwo file matches a given skeleton DIE. This should work the same > > as for type units, you can have multiple type untis in the same file > > and distinquish which one you need by matching the signature. > > One of the complications is that it increased the complexity of making > a .dwp file - Split DWARF is spec'd to ensure that the linking process > is as lightweight as possible. Not having the size overhead of > relocations (though trading off more indirection through the cu_index, > debug_str_offsets, etc). Oh right... that was the critical issue: > There was no way I could think of to do cross-CU references in Split > DWARF (cross-CU references being critical to LTO - inlining from one > CU into another, etc). Because there was no relocation processing in > dwp generation. Arguably maybe one could use a sec_offset that's > resolved relative to a local range within the contributions described > by the cu_index - but the cu_index must have one entry per unit (the > entries are keyed on unit) - I guess you could have a single entry per > CU, but have those entries overlap (so all the CUs from one dwo file > get separate index entries that contain the same contribution ranges). > Then consumers would have to search through the debug_info > contribution to find the right unit.... 
defeating some of the value of > the index. I think we are drifting somewhat away from the original topic and/or are talking past each other. We somehow combined the topics of doing LTO with using Split DWARF, while we started with whether a DWARF producer like a compiler that generated separate functions in separate ELF sections could also generate the associated DWARF in separate sections. I believe it can, and it can even do so when generating Split DWARF. You see some practical issues, especially when combining an LTO build together with generating Split DWARF. But before we try to resolve those issues, maybe we should take a step back and see which issue we are really trying to solve. I do think combining Split DWARF and LTO might not be the best solution. When doing LTO you probably want something like GCC Early Debug, which is like Split DWARF, but different, because the Early Debug simply doesn't contain any address (ranges) yet (not even through indirection like .debug_addr). > > > & again the overhead of all those separate contributions, headers, > > > etc, turns out to be not very desirable in any case. > > > > Yes, I agree with that. But as said earlier, maybe the compiler > > shouldn't have generated to code/data in the first place? > > In the (especially) C++ compilation model, I don't believe that's > possible - inline functions, templates, etc, require duplication - > unless you have a more complicated build process that can gather the > potential duplication, then fan back out again to compile, etc. > ThinLTO does some of this - at a cost of a more complicated build > system, etc. It might be useful for the original discussion to have a few more concrete examples to show when you might have unused code that the linker might want to discard, but where the compiler could only produce DWARF in one big blob. 
Apart from the -ffunction-sections case, where I would argue the compiler simply needs to make sure that if it generates code in separate sections it also creates the DWARF in separate sections (groups). Thanks, Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-06-02 16:50 ` Mark Wielaard @ 2020-06-02 18:06 ` David Blaikie 2020-06-03 3:10 ` Alan Modra 2020-06-19 12:00 ` Range lists, zero-length functions, linker gc Mark Wielaard 0 siblings, 2 replies; 25+ messages in thread From: David Blaikie @ 2020-06-02 18:06 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Mon, 2020-06-01 at 13:18 -0700, David Blaikie wrote: > > On Mon, Jun 1, 2020 at 2:31 AM Mark Wielaard <mark@klomp.org> wrote: > > > Each skeleton compilation unit has a DW_AT_dwo_name attribute which > > > indicates the .dwo file where the split unit sections can be found. It > > > actually seems seems easier to generate a different one for each > > > skeleton compilation unit than trying to combine them for all the > > > different skeleton compilation units you produce. > > > > > > > Certainly Bazel (& the internal Google version used to build most > > > > Google software) can't handle an unbounded/unknown number of output > > > > files from a build action. > > > > > > Yes, in principle .dwo files seems troublesome for build systems in > > > general. > > > > They're pretty practical when they're generated right next to the .o > > file & that's guaranteed by the compiler. "if you generate x.o, there > > will be x.dwo next to it" - that's certainly how Bazel deals with > > this. It doesn't parse the DWARF at all - knowing where the .dwo files > > are along with the .o files. > > The DWARF spec makes it clear that a DWO is per CU, not per object > file. So when an object file contains multiple CUs, it might also be > associated with multiple .dwo files (as is also the case with a linked > executable or shared library). The spec makes says the DW_AT_dwo_name > can contain both a (relative) file or a path to the associated DWO > file. 
Which means that relying on a one-to-one mapping from .o to .dwo > is fragile and is likely to break when tools start using multiple CUs > or different naming heuristics. Yep, agreed - in the most general form there's no guarantee that one compilation would produce one .dwo and you'd have to parse the .o to find all the associated .dwos. Practically speaking that's not the reality right now (build systems rely on stronger/narrower guarantees by the compiler about how many/where the .dwo files are). > > > Because of that I am > > > actually a fan of the SHF_EXCLUDED hack that simply places the split > > > .dwo sections in the same object file. For the above that would mean, > > > just place them in the same section group. > > > > This was a newer feature added during standardization of Split DWARF, > > which is handy for some users > > Although it is used in practice by some producers, it is not > standardized (yet). Also because SHF_EXCLUDED isn't standardized > (although it is used consistently for those arches that support it). Ah, sorry, I didn't mean the specific implementation strategy of using SHF_EXCLUDED, I meant that the general concept of having a .o file be its own .dwo file is standardized: "The sections that do not require relocation, however, can be written to the relocatable object (.o) file but ignored by the linker, or they can be written to a separate DWARF object (.dwo) file that need not be accessed by the linker." > > - but doesn't address the needs of the > > original design of Split DWARF (for Google) - a distributed build > > system that is trying to avoid moving more bytes than it must to one > > machine to run the link step.
So not having to ship all the DWARF > > bytes to one machine for interactive debugging (pulling down from a > > distributed file system only the needed .dwo files during debugging - > > not all of them) - or at least being able to ship all the .dwo files > > to one machine to make a .dwp, and ship all the .o files to another > > machine for the link. > > I think that is not what most people would use split-dwarf for. Probably not - but it's the use case I care about/need to support. > The > Google setup seems somewhat unique. Most people probably do compiling, > linking and debugging on the same machine. The main use case (for me) > is to speed up the edit-compile-debug cycle. Making sure that the > linker doesn't have to deal with (most of) the .debug sections and can > just leave them behind (either in the .o file, or a separate .dwo file) > is the main attraction of split-dwarf IMHO. When actually producing > production builds with debug you still pay the price anyway, because > instead of the linker, you now need to build your dwp packages which > does most of the same work the linker would have done anyway (combining > the data, merging the string indexes, deduplicating debug types, etc.) It's still a price you can parallelize, rather than having to serialize (somewhat - lld is multithreaded for instance). And the dwp support for linking other dwp files together means you can do it iteratively (rather than taking all the .dwo files and doing one big link step - you can take a few dwos, link them into an intermediate dwp (removing duplicate type information and strings) then link again with other intermediate dwps, etc - with some distribution/parallelism benefits).
> > > > Multiple CUs in a single .dwo file is not really supported, which > > > > would be another challenge (we had to compromise debug info quality a > > > > little because of this limitation when doing ThinLTO - unable to emit > > > > multiple CUs into each thin-linked .o file) - at which point maybe the > > > > compiler'd need to produce an intermediate .dwp file of sorts... > > > > > > Are you sure? > > > > Fairly sure - I worked in depth on the implementation of ThinLTO & > > considered a variety of options trying to support Split DWARF in that > > situation. > > > > > Each CU would have a separate dwo_id field to > > > distinquish them. At least that is how elfutils figures out which CU > > > in a dwo file matches a given skeleton DIE. This should work the same > > > as for type units, you can have multiple type untis in the same file > > > and distinquish which one you need by matching the signature. > > > > One of the complications is that it increased the complexity of making > > a .dwp file - Split DWARF is spec'd to ensure that the linking process > > is as lightweight as possible. Not having the size overhead of > > relocations (though trading off more indirection through the cu_index, > > debug_str_offsets, etc). Oh right... that was the critical issue: > > There was no way I could think of to do cross-CU references in Split > > DWARF (cross-CU references being critical to LTO - inlining from one > > CU into another, etc). Because there was no relocation processing in > > dwp generation. Arguably maybe one could use a sec_offset that's > > resolved relative to a local range within the contributions described > > by the cu_index - but the cu_index must have one entry per unit (the > > entries are keyed on unit) - I guess you could have a single entry per > > CU, but have those entries overlap (so all the CUs from one dwo file > > get separate index entries that contain the same contribution ranges). 
> > Then consumers would have to search through the debug_info > > contribution to find the right unit.... defeating some of the value of > > the index. > > I think we are drifting somewhat away from the original topic and/or > are talking past each other. We somehow combined the topics of doing > LTO with using Split DWARF, while we started with whether a DWARF > producer like a compiler that generated separate functions in separate > ELF sections could also generate the associated DWARF in separate > sections. I believe it can, and it can even do so when generating Split > DWARF. You see some practical issues, especially when combining an LTO > build together with generating Split DWARF. But before we try to > resolve those issues, maybe we should take a step back and see which > issue we are really trying to solve. > > I do think combining Split DWARF and LTO might not be the best > solution. When doing LTO you probably want something like GCC Early > Debug, which is like Split DWARF, but different, because the Early > Debug simply doesn't contain any address (ranges) yet (not even through > indirection like .debug_addr). I don't think Early Debug fits here - it seems like it was specifically for DWARF that doesn't refer to any code (eg: function declarations and type definitions). I don't see how it could be used for the actual address-referencing DWARF needed to describe function definitions. > > > > & again the overhead of all those separate contributions, headers, > > > > etc, turns out to be not very desirable in any case. > > > > > > Yes, I agree with that. But as said earlier, maybe the compiler > > > shouldn't have generated to code/data in the first place? > > > > In the (especially) C++ compilation model, I don't believe that's > > possible - inline functions, templates, etc, require duplication - > > unless you have a more complicated build process that can gather the > > potential duplication, then fan back out again to compile, etc. 
> > ThinLTO does some of this - at a cost of a more complicated build > > system, etc. > > It might be useful for the original discussion to have a few more > concrete examples to show when you might have unused code that the > linker might want to discard, but where the compiler could only produce > DWARF in one big blob. Apart of the -ffunction-sections case, Function sections, inline functions, function templates are core examples. > where I > would argue the compiler simply needs to make sure that if it generates > code in separate sections it also should create the DWARF separate > section (groups). I don't think that's practical - the overhead, I believe, is too high. Headers for each section contribution (ELF headers but DWARF headers moreso - having a separate .debug_addr, .debug_line, etc section for each function would be very expensive) would make for very large object files. - Dave ^ permalink raw reply [flat|nested] 25+ messages in thread
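The iterative .dwp linking mentioned in the message above relies on the merge being order-independent. The sketch below is a toy model under stated assumptions, not the real dwp tool: `Package` and `merge` are hypothetical names, strings are modeled as a set, and type units as a mapping from signature to an opaque body.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Dict, FrozenSet


@dataclass
class Package:
    """Toy model of a .dwo or .dwp: a string pool plus type units by signature."""
    strings: FrozenSet[str] = frozenset()
    type_units: Dict[int, str] = field(default_factory=dict)


def merge(a: Package, b: Package) -> Package:
    """Combine two packages, keeping one copy of each string and type unit."""
    type_units = dict(a.type_units)
    for signature, body in b.type_units.items():
        type_units.setdefault(signature, body)  # duplicate signature: keep the first
    return Package(strings=a.strings | b.strings, type_units=type_units)


dwos = [
    Package(frozenset({"int", "foo"}), {0xA1: "struct S"}),
    Package(frozenset({"int", "bar"}), {0xA1: "struct S", 0xB2: "struct T"}),
    Package(frozenset({"baz"}), {0xB2: "struct T"}),
    Package(frozenset({"int", "qux"}), {0xC3: "struct U"}),
]

# One big link step over every input.
flat = reduce(merge, dwos, Package())

# Two intermediate packages (which could be built on different machines),
# followed by a final merge of the intermediates.
left = reduce(merge, dwos[:2], Package())
right = reduce(merge, dwos[2:], Package())
staged = merge(left, right)
```

In this model `staged` and `flat` hold the same strings and type units, which is the property that lets the work be split up and distributed.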
* Re: Range lists, zero-length functions, linker gc 2020-06-02 18:06 ` David Blaikie @ 2020-06-03 3:10 ` Alan Modra 2020-06-03 4:06 ` Fangrui Song 2020-06-03 21:50 ` David Blaikie 2020-06-19 12:00 ` Range lists, zero-length functions, linker gc Mark Wielaard 1 sibling, 2 replies; 25+ messages in thread From: Alan Modra @ 2020-06-03 3:10 UTC (permalink / raw) To: David Blaikie; +Cc: Mark Wielaard, gdb, elfutils-devel, binutils, Fangrui Song On Tue, Jun 02, 2020 at 11:06:10AM -0700, David Blaikie via Binutils wrote: > On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <mark@klomp.org> wrote: > > where I > > would argue the compiler simply needs to make sure that if it generates > > code in separate sections it also should create the DWARF separate > > section (groups). > > I don't think that's practical - the overhead, I believe, is too high. > Headers for each section contribution (ELF headers but DWARF headers > moreso - having a separate .debug_addr, .debug_line, etc section for > each function would be very expensive) would make for very large > object files. With a little linker magic I don't see the necessity of duplicating the DWARF headers. Taking .debug_line as an example, a compiler could emit the header, opcode, directory and file tables to a .debug_line section with line statements for function foo emitted to .debug_line.foo and for bar to .debug_line.bar, trusting that the linker will combine these sections in order to create an output .debug_line section. If foo code is excluded then .debug_line.foo info will also be dropped if section groups are used. -- Alan Modra Australia Development Lab, IBM ^ permalink raw reply [flat|nested] 25+ messages in thread
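A rough model of the .debug_line.foo / .debug_line.bar idea in the message above: the shared header goes in .debug_line, per-function line statements go in suffixed sections tied to a section group, and the linker concatenates whatever survives. `Section`, `link_debug_line`, and the byte strings are hypothetical placeholders invented for this sketch; no existing linker is claimed to behave this way.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Section:
    name: str
    data: bytes
    group: Optional[str] = None  # section group the input section belongs to, if any


def link_debug_line(sections: Iterable[Section], live_groups: Set[str]) -> bytes:
    """Concatenate .debug_line and .debug_line.* in input order, skipping dead groups."""
    out = bytearray()
    for sec in sections:
        if sec.name != ".debug_line" and not sec.name.startswith(".debug_line."):
            continue  # not part of the line table
        if sec.group is not None and sec.group not in live_groups:
            continue  # the whole group was discarded together with its code
        out += sec.data
    return bytes(out)


inputs = [
    Section(".debug_line", b"HDR|"),                      # header and file tables
    Section(".text.foo", b"\x90", group="foo"),
    Section(".debug_line.foo", b"foo-rows|", group="foo"),
    Section(".text.bar", b"\x90", group="bar"),
    Section(".debug_line.bar", b"bar-rows|", group="bar"),
]

# Suppose garbage collection kept only bar.
output = link_debug_line(inputs, live_groups={"bar"})
```

One thing this sketch leaves out: the line table header begins with a unit length, so a linker doing this would presumably also have to make that length match the surviving bytes.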
* Re: Range lists, zero-length functions, linker gc 2020-06-03 3:10 ` Alan Modra @ 2020-06-03 4:06 ` Fangrui Song 2020-06-03 21:50 ` David Blaikie 1 sibling, 0 replies; 25+ messages in thread From: Fangrui Song @ 2020-06-03 4:06 UTC (permalink / raw) To: Alan Modra; +Cc: David Blaikie, Mark Wielaard, gdb, elfutils-devel, binutils On 2020-06-03, Alan Modra wrote: >On Tue, Jun 02, 2020 at 11:06:10AM -0700, David Blaikie via Binutils wrote: >> On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <mark@klomp.org> wrote: >> > where I >> > would argue the compiler simply needs to make sure that if it generates >> > code in separate sections it also should create the DWARF separate >> > section (groups). >> >> I don't think that's practical - the overhead, I believe, is too high. >> Headers for each section contribution (ELF headers but DWARF headers >> moreso - having a separate .debug_addr, .debug_line, etc section for >> each function would be very expensive) would make for very large >> object files. > >With a little linker magic I don't see the neccesity of duplicating >the DWARF headers. Taking .debug_line as an example, a compiler could >emit the header, opcode, directory and file tables to a .debug_line >section with line statements for function foo emitted to >.debug_line.foo and for bar to .debug_line.bar, trusting that the >linker will combine these sections in order to create an output >.debug_line section. If foo code is excluded then .debug_line.foo >info will also be dropped if section groups are used. > >-- >Alan Modra >Australia Development Lab, IBM sizeof(Elf64_Shdr) = 64. If we create a .debug_line fragment and a .debug_info fragment for a function, we waste 128 bytes. https://sourceware.org/pipermail/binutils/2020-May/111361.html > .debug_line.bar We should use the unique linkage feature https://sourceware.org/bugzilla/show_bug.cgi?id=25380 otherwise we also waste lots of bytes for the .debug_*.* section names. 
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-06-03 3:10 ` Alan Modra 2020-06-03 4:06 ` Fangrui Song @ 2020-06-03 21:50 ` David Blaikie 2020-06-09 20:24 ` Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) Fangrui Song 1 sibling, 1 reply; 25+ messages in thread From: David Blaikie @ 2020-06-03 21:50 UTC (permalink / raw) To: Alan Modra; +Cc: Mark Wielaard, gdb, elfutils-devel, binutils, Fangrui Song On Tue, Jun 2, 2020 at 8:10 PM Alan Modra <amodra@gmail.com> wrote: > > On Tue, Jun 02, 2020 at 11:06:10AM -0700, David Blaikie via Binutils wrote: > > On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <mark@klomp.org> wrote: > > > where I > > > would argue the compiler simply needs to make sure that if it generates > > > code in separate sections it also should create the DWARF separate > > > section (groups). > > > > I don't think that's practical - the overhead, I believe, is too high. > > Headers for each section contribution (ELF headers but DWARF headers > > moreso - having a separate .debug_addr, .debug_line, etc section for > > each function would be very expensive) would make for very large > > object files. > > With a little linker magic I don't see the neccesity of duplicating > the DWARF headers. Taking .debug_line as an example, a compiler could > emit the header, opcode, directory and file tables to a .debug_line > section with line statements for function foo emitted to > .debug_line.foo and for bar to .debug_line.bar, trusting that the > linker will combine these sections in order to create an output > .debug_line section. If foo code is excluded then .debug_line.foo > info will also be dropped if section groups are used. I don't think this would apply to debug_addr - where the entries are referenced from elsewhere via index, or debug_rnglist where the rnglist header (or the debug_info directly) contains offsets into this section, so taking chunks out would break those offsets. 
(or to the file/directory name part of debug_line - where you might want to remove file/line entries that were eliminated as dead code - but that'd throw off the indexes) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) 2020-06-03 21:50 ` David Blaikie @ 2020-06-09 20:24 ` Fangrui Song 2020-06-19 20:04 ` Mark Wielaard 0 siblings, 1 reply; 25+ messages in thread From: Fangrui Song @ 2020-06-09 20:24 UTC (permalink / raw) To: gdb, elfutils-devel, binutils; +Cc: Alan Modra, Mark Wielaard, David Blaikie I want to revive the thread, but focus on whether a tombstone value (-1/-2) in .debug_* can cause trouble for various DWARF consumers (gdb, debug related tools in elfutils and other utilities I don't know about). Paul Robinson has proposed that DWARF v6 should reserve a tombstone value (the value a relocation referencing a discarded symbol in a .debug_* section should be resolved to) http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 Some comments about the proposal: > - deduplicating different functions with identical content; GNU refers > to this as ICF (Identical Code Folding); ICF (gold --icf={safe,all}) can cause DW_TAG_subprogram with different DW_AT_name to have the same range. > - functions with no callers; sometimes called dead-stripping or > garbage collection. --gc-sections can lead to tombstone values. A referenced symbol may be discarded because its containing section is garbage collected. > - functions emitted in COMDAT sections, typically C++ template > instantiations or inline functions from a header file; This can cause either tombstone values (STB_LOCAL) or duplicate DIEs (non-STB_LOCAL). On 2020-06-03, David Blaikie wrote: >On Tue, Jun 2, 2020 at 8:10 PM Alan Modra <amodra@gmail.com> wrote: >> >> On Tue, Jun 02, 2020 at 11:06:10AM -0700, David Blaikie via Binutils wrote: >> > On Tue, Jun 2, 2020 at 9:50 AM Mark Wielaard <mark@klomp.org> wrote: >> > > where I >> > > would argue the compiler simply needs to make sure that if it generates >> > > code in separate sections it also should create the DWARF separate >> > > section (groups).
>> > >> > I don't think that's practical - the overhead, I believe, is too high. >> > Headers for each section contribution (ELF headers but DWARF headers >> > moreso - having a separate .debug_addr, .debug_line, etc section for >> > each function would be very expensive) would make for very large >> > object files. >> >> With a little linker magic I don't see the neccesity of duplicating >> the DWARF headers. Taking .debug_line as an example, a compiler could >> emit the header, opcode, directory and file tables to a .debug_line >> section with line statements for function foo emitted to >> .debug_line.foo and for bar to .debug_line.bar, trusting that the >> linker will combine these sections in order to create an output >> .debug_line section. If foo code is excluded then .debug_line.foo >> info will also be dropped if section groups are used. > >I don't think this would apply to debug_addr - where the entries are >referenced from elsewhere via index, or debug_rnglist where the >rnglist header (or the debug_info directly) contains offsets into this >section, so taking chunks out would break those offsets. (or to the >file/directory name part of debug_line - where you might want to >remove file/line entries that were eliminated as dead code - but >that'd throw off the indexes) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) 2020-06-09 20:24 ` Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) Fangrui Song @ 2020-06-19 20:04 ` Mark Wielaard 2020-06-20 1:02 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-06-19 20:04 UTC (permalink / raw) To: Fangrui Song, gdb, elfutils-devel, binutils; +Cc: David Blaikie, Alan Modra Hi, On Tue, 2020-06-09 at 13:24 -0700, Fangrui Song via Elfutils-devel wrote: > I want to revive the thread, but focus on whether a tombstone value > (-1/-2) in .debug_* can cause trouble to various DWARF consumers (gdb, > debug related tools in elfutils and other utilities I don't know about). > > Paul Robinson has proposed that DWARF v6 should reserve a tombstone > value (the value a relocation referencing a discarded symbol in a > .debug_* section should be resolved to) > http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 I would appreciate having a clear "not valid" marker instead of getting a possibly bogus (but valid) address. -1 seems a reasonable value. Although I have seen (and written) code that simply assumes zero is that value. Would such an invalid address marker in an DW_AT_low_pc make the whole program scope under a DIE invalid? What about (addr, loc, rng) base addresses? Can they contain an invalid marker, does that make the whole table/range invalid? I must admit that as a DWARF consumer I am slightly worried that having a sanctioned "invalid marker" will cause DWARF producers to just not coordinate and simply assume they can always invalidate anything they emit. Even if there could be a real solution by coordinating between compiler/linker who is responsible for producing the valid DWARF entries (especially when LTO is involved). 
> Some comments about the proposal: > > > - deduplicating different functions with identical content; GNU > > refers > > to this as ICF (Identical Code Folding); > > ICF (gold --icf={safe,all}) can cause DW_TAG_subprogram with > different DW_AT_name to have the same range. Cary Coutant wrote up a general Two-Level Line Number Table proposal to address the issue of a single machine instruction corresponding to more than one source statement: http://wiki.dwarfstd.org/index.php?title=TwoLevelLineTables That seems useful in this kind of situation, but I don't know the current status of the proposal. Cheers, Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) 2020-06-19 20:04 ` Mark Wielaard @ 2020-06-20 1:02 ` David Blaikie 0 siblings, 0 replies; 25+ messages in thread From: David Blaikie @ 2020-06-20 1:02 UTC (permalink / raw) To: Mark Wielaard; +Cc: Fangrui Song, gdb, elfutils-devel, binutils, Alan Modra On Fri, Jun 19, 2020 at 1:04 PM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Tue, 2020-06-09 at 13:24 -0700, Fangrui Song via Elfutils-devel wrote: > > I want to revive the thread, but focus on whether a tombstone value > > (-1/-2) in .debug_* can cause trouble to various DWARF consumers (gdb, > > debug related tools in elfutils and other utilities I don't know about). > > > > Paul Robinson has proposed that DWARF v6 should reserve a tombstone > > value (the value a relocation referencing a discarded symbol in a > > .debug_* section should be resolved to) > > http://www.dwarfstd.org/ShowIssue.php?issue=200609.1 > > I would appreciate having a clear "not valid" marker instead of getting > a possibly bogus (but valid) address. -1 seems a reasonable value. > Although I have seen (and written) code that simply assumes zero is > that value. Yep - and zero seemed like a good one - except in debug_ranges and debug_loc where that would produce a premature list termination (bfd.ld gets around this by using 1 in debug_ranges) - or on architectures for which 0 is a valid address. 
If you use the zero+addend approach that gold uses (and lld did use/maybe still does, but is going to move away from) then you /almost/ avoid the need to special case debug_ranges and debug_loc, until you hit a zero-length function (you can create zero-length functions from code like "int f1() { }" or "void f2() { __builtin_unreachable(); }") - then you get the early list termination again. Also, zero+addend might trip up in a case like: "void f1() { } __attribute__((nodebug)) void f2() { } void f3() { }" - now f3's starting address has a non-zero addend, so it's indistinguishable from valid code at a very low address. > Would such an invalid address marker in an DW_AT_low_pc make the whole > program scope under a DIE invalid? What about (addr, loc, rng) base > addresses? Can they contain an invalid marker, does that make the whole > table/range invalid? That would be my intent, yes - any pointer derived from an invalid address would be invalid. Take the f1/f2/f3 nodebug example above - f3's starting address could be described by "invalid address + offset" (currently DWARF has no way of describing this - well, it sort of does, you could use an exprloc with an OP_addrx and the arithmetic necessary to add to that - though I doubt many consumers could handle an exprloc there - but I would like to champion that to enable reuse of address pool entries to reduce the size of .o debug info contributions when using Split DWARF - or just reduce the number of relocations/.o file size when using non-split DWARF), so it'd be important for that to be special cased in pointer arithmetic so the tombstone value propagates through arithmetic. > I must admit that as a DWARF consumer I am slightly worried that having > a sanctioned "invalid marker" will cause DWARF producers to just not > coordinate and simply assume they can always invalidate anything they > emit.
At least in my experience (8 years or so working on LLVM's DWARF emission) we've got a pretty strong incentive to reduce DWARF size already - I don't think any producers are being particularly cavalier about producing excess DWARF on the basis that it can be marked invalid. > Even if there could be a real solution by coordinating between > compiler/linker who is responsible for producing the valid DWARF > entries (especially when LTO is involved). A lot of engineering work went into restructuring LLVM's debug info IR representation for LTO to ensure LLVM doesn't produce DWARF for functions deduplicated or dropped by LTO. - Dave > > > Some comments about the proposal: > > > > > - deduplicating different functions with identical content; GNU > > > refers > > > to this as ICF (Identical Code Folding); > > > > ICF (gold --icf={safe,all}) can cause DW_TAG_subprogram with > > different DW_AT_name to have the same range. > > Cary Coutant wrote up a general Two-Level Line Number Table proposal to > address the issue of having a single machine instruction corresponds to > more than one source statement: > http://wiki.dwarfstd.org/index.php?title=TwoLevelLineTables > > Which seems useful in these kind of situations. But I don't know the > current status of the proposal. This was motivated by a desire to be able to do symbolized stack traces including inline stack frames with a smaller representation than is currently possible in DWARF - it allows the line table itself to describe inlining, to some degree, rather than relying on the DIE tree (in part this was motivated by a desire to be able to symbolized backtraces with inlining in-process when Split DWARF is used and the .dwo/.dwp files are not available). I don't think it extends to dealing with the case of deduplication like this - nor addresses the possibility of two CUs having overlapping instruction ranges. 
(it's semantically roughly equivalent to the inlined_subroutines of a subprogram - not so much related to two copies of a function being deduplicated & then being shared by CUs) - Dave ^ permalink raw reply [flat|nested] 25+ messages in thread
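The zero+addend pitfalls described in the message above can be sketched in a few lines. This is a simplification under stated assumptions, not the code of any linker: Policy, Range, resolve_range and is_end_of_list are names invented here, addresses are taken to be 64 bits wide, and -2 is the value suggested earlier in the thread for pre-DWARF v5 .debug_ranges and .debug_loc.

```cpp
#include <cstdint>
#include <utility>

enum class Policy { ZeroPlusAddend, Tombstone };

// -2 as a 64-bit value. 0 and -1 already have a meaning in .debug_ranges
// (end of list, base address selection), so neither can act as a marker.
constexpr std::uint64_t kListTombstone = ~std::uint64_t{0} - 1;

using Range = std::pair<std::uint64_t, std::uint64_t>;

// Resolve the two relocations of one .debug_ranges entry. The target
// section either starts at section_addr or was discarded (defined == false).
Range resolve_range(bool defined, std::uint64_t section_addr,
                    std::uint64_t begin_addend, std::uint64_t end_addend,
                    Policy policy) {
    if (defined)
        return {section_addr + begin_addend, section_addr + end_addend};
    if (policy == Policy::ZeroPlusAddend)
        return {begin_addend, end_addend};        // symbol value treated as 0
    return {kListTombstone, kListTombstone};      // addends intentionally ignored
}

// A (0, 0) pair is the end-of-list marker in .debug_ranges.
bool is_end_of_list(const Range& r) { return r.first == 0 && r.second == 0; }
```

For a discarded zero-length function at the start of its section both addends are 0, so zero+addend produces the pair (0, 0) and a consumer stops reading the list there, while the tombstone policy produces (-2, -2). For the f3 case, addends such as 0x10 and 0x16 (made-up numbers) come out of zero+addend as [0x10, 0x16), which looks like real code at a very low address.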
* Re: Range lists, zero-length functions, linker gc 2020-06-02 18:06 ` David Blaikie 2020-06-03 3:10 ` Alan Modra @ 2020-06-19 12:00 ` Mark Wielaard 2020-06-20 0:46 ` David Blaikie 1 sibling, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-06-19 12:00 UTC (permalink / raw) To: David Blaikie; +Cc: gdb, elfutils-devel, binutils, Fangrui Song Hi, On Tue, 2020-06-02 at 11:06 -0700, David Blaikie via Elfutils-devel wrote: > > I do think combining Split DWARF and LTO might not be the best > > solution. When doing LTO you probably want something like GCC Early > > Debug, which is like Split DWARF, but different, because the Early > > Debug simply doesn't contain any address (ranges) yet (not even through > > indirection like .debug_addr). > > I don't think Early Debug fits here - it seems like it was > specifically for DWARF that doesn't refer to any code (eg: function > declarations and type definitions). I don't see how it could be used > for the actual address-referencing DWARF needed to describe function > definitions. I think that is kind of the point of Early Debug. Only use DWARF (at first) for address/range-less data like types and program scope entries, but don't emit anything (in DWARF format) for things that might need adjustments during link/LTO phase. The problem with using DWARF with address (ranges) during early object creation is that the linker isn't capable to rewrite the DWARF. You'll need a linker plugin that calls back into the compiler to do the actual LTO and emit the actual DWARF containing address/ranges (which can then link back to the already emitted DWARF types/program scope/etc during the Early Debug phase). I think the issue you are describing is actually that you do use DWARF to describe function definitions (not just the declarations) too early. If you aren't sure yet which addresses will be used DWARF isn't really the appropriate (temporary) debug format. 
> > > > > & again the overhead of all those separate contributions, headers, > > > > > etc, turns out to be not very desirable in any case. > > > > > > > > Yes, I agree with that. But as said earlier, maybe the compiler > > > > shouldn't have generated to code/data in the first place? > > > > > > In the (especially) C++ compilation model, I don't believe that's > > > possible - inline functions, templates, etc, require duplication - > > > unless you have a more complicated build process that can gather the > > > potential duplication, then fan back out again to compile, etc. > > > ThinLTO does some of this - at a cost of a more complicated build > > > system, etc. > > > > It might be useful for the original discussion to have a few more > > concrete examples to show when you might have unused code that the > > linker might want to discard, but where the compiler could only produce > > DWARF in one big blob. Apart of the -ffunction-sections case, > > Function sections, inline functions, function templates are core examples. I understand the function sections case, but can you give actual examples of an inline function or function template source code and how a DWARF producer generates DWARF for that? Maybe some simple source code we can put through gcc or clang to see how they (mis)handle it. Not being a compiler architect I am not sure I understand why those cannot be expressed correctly. > > where I > > would argue the compiler simply needs to make sure that if it generates > > code in separate sections it also should create the DWARF separate > > section (groups). > > I don't think that's practical - the overhead, I believe, is too high. > Headers for each section contribution (ELF headers but DWARF headers > moreso - having a separate .debug_addr, .debug_line, etc section for > each function would be very expensive) would make for very large > object files. 
I see your point, but maybe this shouldn't be handled by the linker then, but maybe have a linker plugin so the compiler can fixup the DWARF (or generate it later). Cheers, Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Range lists, zero-length functions, linker gc 2020-06-19 12:00 ` Range lists, zero-length functions, linker gc Mark Wielaard @ 2020-06-20 0:46 ` David Blaikie 2020-06-24 22:21 ` Mark Wielaard 0 siblings, 1 reply; 25+ messages in thread From: David Blaikie @ 2020-06-20 0:46 UTC (permalink / raw) To: Mark Wielaard; +Cc: gdb, elfutils-devel, binutils, Fangrui Song On Fri, Jun 19, 2020 at 5:00 AM Mark Wielaard <mark@klomp.org> wrote: > > Hi, > > On Tue, 2020-06-02 at 11:06 -0700, David Blaikie via Elfutils-devel wrote: > > > I do think combining Split DWARF and LTO might not be the best > > > solution. When doing LTO you probably want something like GCC Early > > > Debug, which is like Split DWARF, but different, because the Early > > > Debug simply doesn't contain any address (ranges) yet (not even through > > > indirection like .debug_addr). > > > > I don't think Early Debug fits here - it seems like it was > > specifically for DWARF that doesn't refer to any code (eg: function > > declarations and type definitions). I don't see how it could be used > > for the actual address-referencing DWARF needed to describe function > > definitions. > > I think that is kind of the point of Early Debug. Only use DWARF (at > first) for address/range-less data like types and program scope > entries, but don't emit anything (in DWARF format) for things that > might need adjustments during link/LTO phase. The problem with using > DWARF with address (ranges) during early object creation is that the > linker isn't capable to rewrite the DWARF. You'll need a linker plugin > that calls back into the compiler to do the actual LTO and emit the > actual DWARF containing address/ranges (which can then link back to the > already emitted DWARF types/program scope/etc during the Early Debug > phase). I think the issue you are describing is actually that you do > use DWARF to describe function definitions (not just the declarations) > too early. 
If you aren't sure yet which addresses will be used DWARF > isn't really the appropriate (temporary) debug format. Sorry, I think we keep talking around each other. Not sure if we can reach a good consensus or shared understanding on this topic. DWARF in unlinked object files has been a fairly well used temporary debug format for a long time - and the DWARF spec has done a lot to ensure it is compatible with ELF in both object files and linkers forever, basically? So I don't think it'd be suitable to say "DWARF isn't an appropriate intermediate debug format to use between compilers and linkers". In the sense that I don't think either the DWARF committee members, producers, or consumers would agree with this sentiment. > > > > > > & again the overhead of all those separate contributions, headers, > > > > > > etc, turns out to be not very desirable in any case. > > > > > > > > > > Yes, I agree with that. But as said earlier, maybe the compiler > > > > > shouldn't have generated to code/data in the first place? > > > > > > > > In the (especially) C++ compilation model, I don't believe that's > > > > possible - inline functions, templates, etc, require duplication - > > > > unless you have a more complicated build process that can gather the > > > > potential duplication, then fan back out again to compile, etc. > > > > ThinLTO does some of this - at a cost of a more complicated build > > > > system, etc. > > > > > > It might be useful for the original discussion to have a few more > > > concrete examples to show when you might have unused code that the > > > linker might want to discard, but where the compiler could only produce > > > DWARF in one big blob. Apart of the -ffunction-sections case, > > > > Function sections, inline functions, function templates are core examples. > > I understand the function sections case, but can you give actual > examples of an inline function or function template source code and how > a DWARF producer generates DWARF for that? 
Maybe some simple source
> code we can put through gcc or clang to see how they (mis)handle it.
> Not being a compiler architect I am not sure I understand why those
> cannot be expressed correctly.

oh, sure! sorry.

a simple case of inline functions being deduplicated looks like this:

a.cpp:
inline void f1() { }
void f2() {
  f1();
}

b.cpp:
inline void f1() { }
void f2();
int main() {
  f1();
  f2();
}

This actually demonstrates a slightly different behavior of bfd and
gold: When the comdats are the same size (I'm told that's the
heuristic) and the local symbol names the DWARF uses to refer to the
functions (f1 in this case) - then both DWARF descriptions are
resolved to point to the same deduplicated copy of 'f1', eg:

BFD and Gold both produce this DWARF (uninteresting attributes have
been omitted):

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
     [0x0000000000401110, 0x000000000040111b)
     [0x0000000000401120, 0x0000000000401126))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401110)
    DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000009d] = "f2")

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000030
     [0x0000000000401130, 0x0000000000401142)
     [0x0000000000401120, 0x0000000000401126))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401130)
    DW_AT_high_pc [DW_FORM_data4] (0x00000012)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000b0] = "main")

  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

Now you have two CUs that have overlapping ranges, which is
interesting - if not strictly invalid (DWARF being permissive and
all). Though I think the size heuristic is risky - it's possible that
'f1' was optimized differently in the two compilations and just
happened to end up with the same size - but the DWARF descriptions may
be incorrect for the other version of the function (eg: one compiler
chose to put a constant in one register, the other compiler used
another register - same instruction sequence length, but the DWARF
would be different and incorrect to mismatch like that)

If you end up with different function lengths (which is common enough
in larger programs - different other definitions may be available,
different inlining heuristics about overall object size, etc, may kick
in) then you get BFD and Gold's current tombstoning behavior:

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
     [0x0000000000401110, 0x000000000040111b)
     [0x0000000000401120, 0x000000000040112b))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401110)
    DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000009d] = "f2")

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401120)
    DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000030
     [0x0000000000401130, 0x0000000000401142)
     [0x0000000000000001, 0x0000000000000001))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000401130)
    DW_AT_high_pc [DW_FORM_data4] (0x00000012)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000b0] = "main")

  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

In this case BFD uses the tombstone value 0 in most sections, but uses
1 in debug_ranges to ensure it doesn't produce the 0,0 that would end
the range list early (this workaround is incomplete and should also be
applied to debug_loc which is terminated by 0,0 too - but GCC (and
Clang) doesn't produce any inter-function location lists, so this
doesn't present a problem in practice/for now, except for dumping
tools which end up seeing "holes" in debug_loc that would otherwise be
dumpable)

Gold's behavior in this case is a little different, using the 0+addend
approach:

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000065] = "a.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
     [0x0000000000400540, 0x000000000040054b)
     [0x0000000000400550, 0x000000000040055b))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400540)
    DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000009d] = "f2")

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400550)
    DW_AT_high_pc [DW_FORM_data4] (0x0000000b)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

DW_TAG_compile_unit [1] *
  DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000aa] = "b.cpp")
  DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
  DW_AT_ranges [DW_FORM_sec_offset] (0x00000030
     [0x0000000000400560, 0x0000000000400572)
     [0x0000000000000000, 0x0000000000000006))

  DW_TAG_subprogram [2]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000400560)
    DW_AT_high_pc [DW_FORM_data4] (0x00000012)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000b0] = "main")

  DW_TAG_subprogram [3]
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000a7] = "f1")

I introduced an ODR violation here (by modifying a.cpp's f1 to call f2
- thus making a.cpp's f1 a different length from b.cpp's f1) just as
an easy way to demonstrate the "different lengths" issue - but this
could arise from valid code that was differently optimized in the two
translation units.

& yeah - on an LLVM thread we did dabble with what it'd look like to
use comdats without whole separate units to put these together - and
it's possible, though that doesn't apply to Split DWARF (can't piece
together the debug_addr section either - since it'd throw off the
indexes used from the Split DWARF file) - and still adds extra section
overhead. Did prototype debug_ranges/debug_rnglist comdat assembling
(so the CU's range list wouldn't have entries for the
deduplicated/gc'd functions) (but again, more ELF sections - for
little gain in linked debug info size for the cost in intermediate
object size)

> > > where I
> > > would argue the compiler simply needs to make sure that if it generates
> > > code in separate sections it also should create the DWARF separate
> > > section (groups).
> >
> > I don't think that's practical - the overhead, I believe, is too high.
> > Headers for each section contribution (ELF headers but DWARF headers
> > moreso - having a separate .debug_addr, .debug_line, etc section for
> > each function would be very expensive) would make for very large
> > object files.
>
> I see your point, but maybe this shouldn't be handled by the linker
> then, but maybe have a linker plugin so the compiler can fixup the
> DWARF (or generate it later).

This sounds like it'd still be fairly intrusive (architecturally) and
expensive (both from a software complexity and linking time/memory
usage/etc). I'm not ruling it out as a possibility - and I'm
interested in dabbling with this kind of deduplication purely
academically (my users use Split DWARF, so there's no opportunity
there to fix this - so my interest in in-.o/linked executable DWARF is
limited to personal interest).
I'm curious about just how expensive the ELF sections would be, what sort of custom scheme might be used instead (I could imagine a content-aware feature that might be more terse than generic ELF sections, but not especially invasive (wouldn't require parsing or rewriting DWARF DIEs, etc). That's being discussed in the LLVM community - but I don't expect it'll be soon, nor pervasively used even if it is built. So I come back to Split DWARF making this fairly well impossible to implement without a tombstone value, so far as I can imagine/think of. And function sections at least making it very expensive to implement (either in terms of object size and/or significant changes to the nature of linking DWARF). And this being a pretty well established use case/feature for decades now - that has some relatively small drawbacks in certain narrow cases (zero length functions, zero or low address values that are valid in some use cases) that adding an explicit tombstone is necessary in some cases and beneficial if not strictly necessary in others. - Dave > > Cheers, > > Mark ^ permalink raw reply [flat|nested] 25+ messages in thread
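To make the consumer-side effect of the range lists in the dumps above more concrete, here is a minimal sketch of how a reader of a pre-DWARF v5 .debug_ranges list might walk the (begin, end) pairs. It assumes 64-bit addresses and a list that has already been decoded into pairs; collect_ranges and Range are invented names, and real consumers such as gdb or elfutils do considerably more validation.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<std::uint64_t, std::uint64_t>;

// Walk the decoded (begin, end) pairs of one range list. `base` is the
// CU base address (DW_AT_low_pc of the unit, 0 in the dumps above).
std::vector<Range> collect_ranges(const std::vector<Range>& raw,
                                  std::uint64_t base) {
    constexpr std::uint64_t kMaxAddr = ~std::uint64_t{0};
    std::vector<Range> out;
    for (const Range& e : raw) {
        if (e.first == 0 && e.second == 0)
            break;                      // end-of-list entry
        if (e.first == kMaxAddr) {      // base address selection entry
            base = e.second;
            continue;
        }
        if (e.first == e.second)
            continue;                   // empty range, has no effect
        out.push_back({base + e.first, base + e.second});
    }
    return out;
}
```

Fed the b.cpp list from the BFD dump, {0x401130, 0x401142}, {1, 1}, {0, 0}, this yields only main's range, because [1, 1) is empty. Fed the list from the Gold dump, {0x400560, 0x400572}, {0, 6}, {0, 0}, it yields two ranges, one of them the bogus [0x0, 0x6). If a discarded zero-length function came first under zero+addend, its (0, 0) pair would end the walk before any live range was seen.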
* Re: Range lists, zero-length functions, linker gc 2020-06-20 0:46 ` David Blaikie @ 2020-06-24 22:21 ` Mark Wielaard 2020-06-25 23:45 ` David Blaikie 0 siblings, 1 reply; 25+ messages in thread From: Mark Wielaard @ 2020-06-24 22:21 UTC (permalink / raw) To: David Blaikie; +Cc: gdb, elfutils-devel, binutils, Fangrui Song Hi David, On Fri, 2020-06-19 at 17:46 -0700, David Blaikie via Elfutils-devel wrote: > On Fri, Jun 19, 2020 at 5:00 AM Mark Wielaard <mark@klomp.org> wrote: > > I think that is kind of the point of Early Debug. Only use DWARF (at > > first) for address/range-less data like types and program scope > > entries, but don't emit anything (in DWARF format) for things that > > might need adjustments during link/LTO phase. The problem with using > > DWARF with address (ranges) during early object creation is that the > > linker isn't capable to rewrite the DWARF. You'll need a linker plugin > > that calls back into the compiler to do the actual LTO and emit the > > actual DWARF containing address/ranges (which can then link back to the > > already emitted DWARF types/program scope/etc during the Early Debug > > phase). I think the issue you are describing is actually that you do > > use DWARF to describe function definitions (not just the declarations) > > too early. If you aren't sure yet which addresses will be used DWARF > > isn't really the appropriate (temporary) debug format. > > Sorry, I think we keep talking around each other. Not sure if we can > reach a good consensus or shared understanding on this topic. I think the confusion comes from the fact that we seem to cycle through a couple of different topics which are related, but not really connected directly. There is the topic of using "tombstones" in place of some pc or range attributes/tables in the case of traditional linking separate compile units/objects. 
Where we seem to agree that those are better than silently producing bad
data, but where we disagree whether there are other ways to solve the
issue (using comdat section for example, where we might see the
overhead/gains differently).

There is the topic of LTO where part of the linker optimization is done
through a (compiler) plugin. Where it isn't clear (to me at least) if
some of the traditional way of handling DWARF in object files makes
sense. I would argue that GCC shows that for LTO you need something
like Early Debug, where you only produce parts of the DWARF early that
don't contain any addresses or ranges, since you don't know yet where
code/data will end up till after the actual LTO phase, only after which
it can be produced.

Then there is the topic of Split Dwarf, where I am not sure it is
directly relevant to the above two topics. It is just a different
representation of the DWARF data, with an extra layer of indirections
used for addresses. Which in the case of the traditional model means
that you still hit the tombstones, just through an indirection table.
And for LTO it just makes some things more complicated because you have
this extra address indirection table, but since you cannot know where
the addresses end up till after the LTO phase you now have an extra
layer of indirection to fix up.

> DWARF in unlinked object files has been a fairly well used temporary
> debug format for a long time - and the DWARF spec has done a lot to
> ensure it is compatible with ELF in both object files and linkers
> forever, basically? So I don't think it'd be suitable to say "DWARF
> isn't an appropriate intermediate debug format to use between
> compilers and linkers". In the sense that I don't think either the
> DWARF committee members, producers, or consumers would agree with this
> sentiment.

I absolutely agree with that statement for the traditional linker
model, where you build up DWARF data per compile unit.
But for the LTO model, where there is a feedback loop between compiler
and linker, I don't think (all of) DWARF is an appropriate intermediate
debug format. If only because the concept of "compile unit" gets really
fuzzy. I think in that model a lot of DWARF can still be used usefully
as intermediate debug format to pass between compiler, linker, compiler,
linker during the LTO phase. Just not the part that describes the
program scope and variable/data locations represented as (ranges of)
addresses (when produced early).

> > I understand the function sections case, but can you give actual
> > examples of an inline function or function template source code and how
> > a DWARF producer generates DWARF for that? Maybe some simple source
> > code we can put through gcc or clang to see how they (mis)handle it.
> > Not being a compiler architect I am not sure I understand why those
> > cannot be expressed correctly.
>
> oh, sure! sorry.
>
> a simple case of inline functions being deduplicated looks like this:
>
> a.cpp:
>   inline void f1() { }
>   void f2() {
>     f1();
>   }
>
> b.cpp:
>   inline void f1() { }
>   void f2();
>   int main() {
>     f1();
>     f2();
>   }
>
> This actually demonstrates a slightly different behavior of bfd and
> gold: When the comdats are the same size (I'm told that's the
> heuristic) and the local symbol names the DWARF uses to refer to the
> functions (f1 in this case) - then both DWARF descriptions are
> resolved to point to the same deduplicated copy of 'f1', eg:

Thanks for the concrete example. I'll study it.

Would you mind telling which DWARF producer/compiler you used and which
command line flags you used for the compiler and linker invocations? I'd
like to replicate the produced DWARF but wasn't able to get something
that used ranges like in your examples. I also wonder about the ODR
violation: does your example depend on this being C++, or does it
produce the same issues when it is built as a C program?
Thanks,

Mark
* Re: Range lists, zero-length functions, linker gc
From: David Blaikie @ 2020-06-25 23:45 UTC
To: Mark Wielaard; +Cc: gdb, elfutils-devel, binutils, Fangrui Song

On Wed, Jun 24, 2020 at 3:22 PM Mark Wielaard <mark@klomp.org> wrote:
>
> Hi David,
>
> On Fri, 2020-06-19 at 17:46 -0700, David Blaikie via Elfutils-devel wrote:
> > On Fri, Jun 19, 2020 at 5:00 AM Mark Wielaard <mark@klomp.org> wrote:
> > > I think that is kind of the point of Early Debug. Only use DWARF (at
> > > first) for address/range-less data like types and program scope
> > > entries, but don't emit anything (in DWARF format) for things that
> > > might need adjustments during link/LTO phase. The problem with using
> > > DWARF with address (ranges) during early object creation is that the
> > > linker isn't capable to rewrite the DWARF. You'll need a linker plugin
> > > that calls back into the compiler to do the actual LTO and emit the
> > > actual DWARF containing address/ranges (which can then link back to the
> > > already emitted DWARF types/program scope/etc during the Early Debug
> > > phase). I think the issue you are describing is actually that you do
> > > use DWARF to describe function definitions (not just the declarations)
> > > too early. If you aren't sure yet which addresses will be used DWARF
> > > isn't really the appropriate (temporary) debug format.
> >
> > Sorry, I think we keep talking around each other. Not sure if we can
> > reach a good consensus or shared understanding on this topic.
>
> I think the confusion comes from the fact that we seem to cycle through
> a couple of different topics which are related, but not really
> connected directly.
>
> There is the topic of using "tombstones" in place of some pc or range
> attributes/tables in the case of traditional linking separate compile
> units/objects.
> Where we seem to agree that those are better than
> silently producing bad data, but where we disagree whether there are
> other ways to solve the issue (using comdat section for example, where
> we might see the overhead/gains differently).
>
> There is the topic of LTO where part of the linker optimization is done
> through a (compiler) plugin. Where it isn't clear (to me at least) if
> some of the traditional way of handling DWARF in object files makes
> sense.

Oh - perhaps to clarify: I don't know of any implementation that creates
DWARF in intermediate object files in LTO.

> I would argue that GCC shows that for LTO you need something
> like Early Debug, where you only produce parts of the DWARF early that
> don't contain any addresses or ranges, since you don't know yet where
> code/data will end up till after the actual LTO phase, only after which
> it can be produced.

Yeah - I guess that's the point of the name "Early Debug" - it's earlier
than usual, rather than making the rest later than usual.

In LLVM's implementation the faux .o files in LTO contain no DWARF
whatsoever - but a semantic representation something like DWARF intended
to be manipulated by compiler optimizations and designed to drop
unreferenced portions as optimizations make changes. (If you inline and
optimize away a function call, that function may get dropped - then no
DWARF is emitted for it, same as if it were never called.)

Yeah, it'd be theoretically possible to create all the DWARF up-front,
use loclists and rnglists for /everything/ (because you wouldn't know if
a variable would have a single location or multiple until after
optimizations) and then fill in those loclists and rnglists
post-optimization. I don't know of any implementation that does that,
though - it'd make for very verbose DWARF, and I agree with you that that
wouldn't be great - I think the only point of conflict there is: I don't
think that's a concern that's actually manifesting in DWARF producers
today.
Certainly not in LLVM & doesn't sound like it is in GCC. I think there's
enough incentive for compiler performance - not to produce loads of
duplicate DWARF, and to have a fairly compact/optimizable intermediate
representation - there was a lot of work that went into changing LLVM's
representation to be more amenable to LTO to ensure things got dropped
and deduplicated as soon as possible.

> Then there is the topic of Split Dwarf, where I am not sure it is
> directly relevant to the above two topics. It is just a different
> representation of the DWARF data, with an extra layer of indirections
> used for addresses. Which in the case of the traditional model means
> that you still hit the tombstones, just through an indirection table.
> And for LTO it just makes some things more complicated because you have
> this extra address indirection table, but since you cannot know where
> the addresses end up till after the LTO phase you now have an extra
> layer of indirection to fix up.

I think the point of Split DWARF is, to your first point about you and I
having perhaps different tradeoffs about object size cost (using comdats
to deduplicate/drop DWARF for dead or deduplicated functions) - in the
case of Split DWARF, it's impossible - well, it's impossible if you're
going to use fragmented DWARF (eg: use comdats to stitch together a
single CU out of droppable parts). If you were going to drop the DWARF
related to a dead or deduplicated function when using Split DWARF you'd
have to use a whole separate unit (possibly a partial_unit) - which
would add a lot more size overhead.
Perhaps enough that we'd both agree that's prohibitive (especially since
that cost would persist into the linked binary - so it wouldn't be as
much of a .o/linked executable tradeoff, but an outright growth).

>
> > DWARF in unlinked object files has been a fairly well used temporary
> > debug format for a long time - and the DWARF spec has done a lot to
> > ensure it is compatible with ELF in both object files and linkers
> > forever, basically? So I don't think it'd be suitable to say "DWARF
> > isn't an appropriate intermediate debug format to use between
> > compilers and linkers". In the sense that I don't think either the
> > DWARF committee members, producers, or consumers would agree with this
> > sentiment.
>
> I absolutely agree with that statement for the traditional linker
> model, where you build up DWARF data per compile unit.

Ah, OK - then perhaps that's all we need to really agree on to move
forward with the discussion of a tombstone value, what value it is, that
it should be in the DWARF spec and all the implementations should know
and agree on it?

> But for the LTO
> model, where there is a feedback loop between compiler and linker, I
> don't think (all of) DWARF is an appropriate intermediate debug format.

Neither do I - though if we both agree there is a need for a tombstone
in the traditional linker model, then we do leave it open for very
inefficient LTO implementations to use that feature too - though there's
lots of ways a DWARF producer could produce very inefficient DWARF & I
don't think there's a great need to mandate against it in general (if we
could avoid having the tombstone concept entirely - sure - but if we've
got to have it, I don't know that the LTO conversation goes anywhere in
terms of informing the design of the tombstone feature).

> If only because the concept of "compile unit" gets really fuzzy.
> I think in that model a lot of DWARF can still be used usefully as
> intermediate debug format to pass between compiler, linker, compiler,
> linker during the LTO phase. Just not the part that describes the
> program scope and variable/data locations represented as (ranges of)
> addresses (when produced early).
>
> > > I understand the function sections case, but can you give actual
> > > examples of an inline function or function template source code and how
> > > a DWARF producer generates DWARF for that? Maybe some simple source
> > > code we can put through gcc or clang to see how they (mis)handle it.
> > > Not being a compiler architect I am not sure I understand why those
> > > cannot be expressed correctly.
> >
> > oh, sure! sorry.
> >
> > a simple case of inline functions being deduplicated looks like this:
> >
> > a.cpp:
> >   inline void f1() { }
> >   void f2() {
> >     f1();
> >   }
> >
> > b.cpp:
> >   inline void f1() { }
> >   void f2();
> >   int main() {
> >     f1();
> >     f2();
> >   }
> >
> > This actually demonstrates a slightly different behavior of bfd and
> > gold: When the comdats are the same size (I'm told that's the
> > heuristic) and the local symbol names the DWARF uses to refer to the
> > functions (f1 in this case) - then both DWARF descriptions are
> > resolved to point to the same deduplicated copy of 'f1', eg:
>
> Thanks for the concrete example. I'll study it.
>
> Would you mind telling which DWARF producer/compiler you used and which
> command line flags you used to the compiler and linker invocations?
clang or gcc without any extra flags should suffice here.

To get the summarized DWARF I showed above, I used this complete command
line:

  $ clang++ -g a.cpp b.cpp && llvm-dwarfdump -v -debug-info a.out | grep "DW_TAG\|DW_AT_[^ ]*pc\|DW_AT_ranges\|^ *\[\|DW_AT_name" | sed -e "s/............//"

(using clang and llvm-dwarfdump from LLVM trunk)

> I'd like to replicate the produced DWARF but wasn't able to get something
> that used ranges like in your examples. I also wonder about the ODR
> violation, does your example depend on this being C++ or does it
> produce the same issues when it is built as a C program?

I believe C has different "inline" semantics that I'm not as familiar
with - but I /believe/ the actual C standard inline semantics wouldn't
produce the kind of situation that C++ does. (In C++ you define an
inline function in every translation unit it's used in - and the
compiler can choose to inline or not, and if it doesn't actually inline
then the object file carries a deduplicable definition of the function
and then the linker picks one of those definitions from any in the input
object files - whereas in C the inline function definition, if not
inlined, is discarded by the compiler and the user must have provided a
non-inline definition in one file as usual - so there's no
duplication/deduplication.)

You could use function-sections/gc-sections to observe the "ODR
violation" sort of situation where the addresses go to zero/tombstone
rather than the "two subprograms point to one function" behavior, eg:

  $ clang -g -ffunction-sections -Wl,-gc-sections a.c && llvm-dwarfdump-tot -v -debug-info a.out | grep "DW_TAG\|DW_AT_[^ ]*pc\|DW_AT_ranges\|^ *\[\|DW_AT_name" | sed -e "s/............//"

  DW_TAG_compile_unit [1] *
    DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000065] = "a.c")
    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
    DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
       [0x0000000000000001, 0x0000000000000001)
       [0x0000000000401110, 0x0000000000401118))
    DW_TAG_subprogram [2]
      DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
      DW_AT_high_pc [DW_FORM_data4] (0x00000006)
      DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000094] = "f1")
    DW_TAG_subprogram [3]
      DW_AT_low_pc [DW_FORM_addr] (0x0000000000401110)
      DW_AT_high_pc [DW_FORM_data4] (0x00000008)
      DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000097] = "main")
    DW_TAG_base_type [4]
      DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000009c] = "int")

& I guess now we can show the full variety of tombstone behavior...

The above example was with bfd ld, using 1 as a tombstone in
debug_ranges and 0 as the tombstone elsewhere (such as in the low_pc of
the "f1" subprogram). This works unless zero or 1 (or other "small"
values) are part of the valid address range of the program, or you have
large functions (so the [0, 6) range becomes larger and starts
overlapping with the non-gc'd functions). If either is the case, then
the subprogram address ranges become ambiguous & you don't know which
function you're in.

Then we've got gold (add "-fuse-ld=gold" to the compilation command),
just snipping the relevant bit of the output:

    DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
       [0x0000000000000000, 0x0000000000000006)
       [0x0000000000400510, 0x0000000000400518))
    DW_TAG_subprogram [2]
      DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)

Here we can see gold's technique of using "0+addend" as the tombstone
value - which works, again, until your valid address range is lower or
you have large functions (or you special case zero as the tombstone -
which then works until you have zero as a valid code address, or you
have empty functions (where range and loc lists would get terminated
prematurely) or you have a function that starts at a non-zero
addend...).

Then we've got lld's new behavior (which will hopefully be adopted by
the other linkers and the DWARF standard as a more robust solution):

    DW_AT_ranges [DW_FORM_sec_offset] (0x00000000
       [0xfffffffffffffffe, 0xfffffffffffffffe)
         # this would be 0xffffffffffffffff in DWARFv5, but needs to be
         # 0xfffffffffffffffe in DWARFv4 to avoid creating unintended
         # base address selection entries in debug_loc and debug_ranges
       [0x0000000000201690, 0x0000000000201698))
    DW_TAG_subprogram [2]
      DW_AT_low_pc [DW_FORM_addr] (0xffffffffffffffff)

Which probably works about as well as the other solutions if the
consumer isn't special casing things (& isn't being too fussy about the
fact that low_pc+(data4)high_pc might overflow...) and also allows the
consumer to special case more intentionally without ruling out zero as a
valid address, etc.
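As a rough illustration of the consumer side of the message above, the
sketch below shows how a debugger-like tool might treat the three
outputs. It is only a sketch under stated assumptions: the helper names
(subprogram_range, naive_end, owners) and the "reset_handler" entry for
a target with code at address 0 are hypothetical and do not come from
gdb, elfutils or LLVM; the addresses are the ones shown in the dumps.

```python
MASK64 = (1 << 64) - 1
TOMBSTONE = MASK64  # -1 as an unsigned 64-bit address (lld behaviour described above)


def subprogram_range(low_pc, high_pc_offset):
    """Return (start, end) for a subprogram, or None if low_pc is the tombstone."""
    if low_pc == TOMBSTONE:
        return None
    return (low_pc, low_pc + high_pc_offset)


def naive_end(low_pc, high_pc_offset):
    """End address computed with 64-bit wraparound and no tombstone check."""
    return (low_pc + high_pc_offset) & MASK64


def owners(pc, subprograms):
    """Names of all subprograms whose [start, end) range contains pc."""
    hits = []
    for name, low_pc, size in subprograms:
        rng = subprogram_range(low_pc, size)
        if rng is not None and rng[0] <= pc < rng[1]:
            hits.append(name)
    return hits


# gold-style result: the discarded f1 was resolved to 0+addend.
gold = [("f1", 0x0, 0x6), ("main", 0x400510, 0x8)]
# lld-style result: the discarded f1 was resolved to the tombstone.
lld = [("f1", TOMBSTONE, 0x6), ("main", 0x201690, 0x8)]
# Hypothetical target with real code at address 0, linked gold-style.
low_target = [("f1", 0x0, 0x6), ("reset_handler", 0x0, 0x10)]

print(owners(0x2, low_target))       # two candidates: ambiguous
print(owners(0x2, lld))              # no candidates: f1 is recognised as dead
print(hex(naive_end(TOMBSTONE, 6)))  # wraps around if the tombstone is not checked
```

With 0+addend the dead function competes with live code for the same
addresses, whereas an explicit tombstone lets the consumer drop the
entry before doing any range arithmetic.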
* Re: Range lists, zero-length functions, linker gc
From: David Blaikie @ 2020-05-31 21:33 UTC
To: Fangrui Song; +Cc: binutils, gdb, elfutils-devel

Thanks for getting this conversation started here, Fangrui,

I might summarize things slightly differently (some corrections - some
just different phrasing):

Current situation:

When a linker discards code (either because it chooses a comdat copy
from another object file that's not identical (two inline functions
might be optimized differently, so DWARF can't point both descriptions
to the same code - one has to be pointed to some "null" data
essentially) or because of --gc-sections, etc) the DWARF that had
relocations to it must be given some value. But what value?

Current situation:
  bfd: 1 in debug_ranges, 0 elsewhere (debug_ or otherwise)
  lld and gold: 0+addend everywhere (debug_ or otherwise)

Problems:

  bfd uses 1 in debug_ranges to avoid creating a 0,0 range entry
  (<= DWARFv4, debug_ranges contains address pairs terminated by 0,0)
  that would terminate the list prematurely.

  bfd misses the same problem in debug_loc - though that's less
  impactful (debug_loc are usually just within the scope of one
  function, so it's usually all or nothing - if it terminates the list
  early it's not good for dumpers, but not likely a problem for
  debuggers - though in theory you could have a debug_loc across
  multiple functions/sections (if you optimize a global variable up into
  a local register through different functions) - and then terminating
  the list early would be a problem).

  lld/gold approach ends up mostly creating ranges like [0, length) -
  for sufficiently large functions, or code mapped into sufficiently low
  address ranges, this range could overlap with real code and create
  ambiguities unless the consumer special cased starting at zero... -
  except for the ".text.x" example below, where 0+addend could still
  result in a [positive, positive) address range that would be
  impossible to reliably identify in the consumer.

  lld/gold has a more severe problem in the event of empty functions
  (GCC and Clang can both produce empty functions - simplest example
  being "int f1() { }" - yeah, you can't call this validly, but still
  code that can appear and is valid so long as it isn't called - also
  (where we found this recently) "void f1() { llvm_unreachable(); }"
  creates zero-length functions too): 0+addend produces a [0, 0) entry
  in the range list which terminates it prematurely and breaks debug
  info for other code that appears after the empty function.

So, it'd be nice to improve the situation for low-range code that could
overlap with the [0+addend, 0+addend) situation in lld/gold, fix the 0,0
debug_range problem, and maybe overall make this more
explicit/intentional/consistent between producers (compilers and
linkers), consumers, and the DWARF spec itself.

-1 isn't workable in general, because it has special meaning in
debug_ranges and debug_loc - but otherwise it's probably a pretty good
"special" constant (though I guess in theory someone could map their
code to the very top of their address range? I assume that's less likely
than using zero or other "low-ish" address spaces that could overlap
with the [0+addend, 0+addend) situation of lld/gold).
Hence Fangrui's suggestion of -2 for debug_ranges and debug_loc, -1
everywhere else (at least all debug_* sections - but "all other
sections" if that turns out to be a problematic value for non-debug
sections)

On Sun, May 31, 2020 at 12:19 PM Fangrui Song via Gdb <gdb@sourceware.org> wrote:
>
> It is being discussed on llvm-dev
> (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
> what linkers should do regarding relocations referencing dropped functions (due
> to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*
>
> As an example:
>
>    __attribute__((section(".text.x"))) void f1() { }
>    __attribute__((section(".text.x"))) void f2() { }
>    int main() { }
>
> Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
> symbols are collected):
>
>    0x00000043:   DW_TAG_subprogram [2]
>                    ###### relocated by .text.x + 10
>                    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x")
>                    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
>                    DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP)
>                    DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v")
>                    DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2")
>
>
> With ld --gc-sections:
>
> * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
>    This can cause overlapping address ranges with normal text sections. {{overlap}}
> * [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
>    See bfd/reloc.c (behavior introduced in
>    https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )
>
>    [0, 0) cannot be used because it terminates the list entry.
>    [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
>    subsequent address offset pairs.
> * .debug_loc address offset pairs have similar problem to .debug_ranges
> * In DWARF v5, the abnormal values can be in a separate section .debug_addr
>
> ---
>
> To save your time, I have a summary of the discussions. I am eager to know what you think
> of the ideas from binutils/gdb/elfutils's perspective.
>
> * {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
>    All (undef + addend) in .debug_* are resolved to -1.
>
>    We have to ignore the addend. With __attribute__((section(".text.x"))),
>    the address offset pair may be something like [.text.x + 16, .text.x + 24)
>    I have to resolve the whole (.text.x + 16) to the special value.
>
>    (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
>    (0 and -1 cannot be used due to the reasons above).
>
> * Refined formula for a relocated value in a non-SHF_ALLOC section:
>
>    if is_defined(sym)
>       return addr(sym) + addend
>    if relocated_section is .debug_ranges or .debug_loc
>       return -2 # addend is intentionally ignored
>
>    // Every DWARF v5 section falls here
>    return -1 {{zero}}
>
> * {{zero}} Can we resolve (undef + addend) to 0?
>
>    https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html
>
>    > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
>    > quirky/interesting use cases (admittedly - such platforms could equally want to make their
>    > executable code way up in the address space near max or max - 1, etc?).
>
>    Question: is address 0 meaningful for code in some binary formats?
>
> * {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend.
>    For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large
>    enough, it may overlap with a normal text address range (for example [0x10000, *))
>
>    This can cause problems in debuggers. How does gdb solve the problem?
>
> * {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
>    `addend`. For non-debug sections (open-ended), do we have needs resolving such
>    values to `base` or `base+addend` where base is customizable?
>    (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )
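The list-termination and base-address problems summarized in the message
above can be made concrete with a small reader for a DWARF v4 style
.debug_ranges list. This is a minimal sketch, assuming 64-bit addresses
and a compile-unit base address of 0 (as in the dumps elsewhere in this
thread); read_ranges_v4 is a hypothetical helper, not the API of any
existing DWARF library.

```python
MAX64 = (1 << 64) - 1  # largest representable 64-bit address, i.e. -1


def read_ranges_v4(pairs, cu_base=0):
    """Decode (begin, end) pairs the way a DWARF v4 .debug_ranges list is read."""
    base = cu_base
    ranges = []
    for begin, end in pairs:
        if begin == 0 and end == 0:
            break                  # end-of-list entry
        if begin == MAX64:
            base = end             # base address selection entry
            continue
        if begin == end:
            continue               # empty range: has no effect
        ranges.append(((base + begin) & MAX64, (base + end) & MAX64))
    return ranges


END = (0, 0)

# 0+addend with a zero-length dead function: [0, 0) ends the list early
# and the live function that follows is lost.
print(read_ranges_v4([(0, 0), (0x400510, 0x400518), END]))

# bfd-style 1 and the proposed -2 both decode as harmless empty ranges.
print(read_ranges_v4([(1, 1), (0x401110, 0x401118), END]))
print(read_ranges_v4([(MAX64 - 1, MAX64 - 1), (0x201690, 0x201698), END]))

# -1 is read as a base address selection and shifts every later entry.
print(read_ranges_v4([(MAX64, MAX64), (0x10, 0x18), END]))
```

The last case is why -1 cannot be reused for the pre-v5 list sections:
the reader silently changes its base address instead of skipping the
dead entry.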
* Re: Range lists, zero-length functions, linker gc
From: Andrew Burgess @ 2020-06-01 16:25 UTC
To: Fangrui Song; +Cc: binutils, gdb, elfutils-devel

* Fangrui Song via Gdb <gdb@sourceware.org> [2020-05-31 11:55:06 -0700]:

> It is being discussed on llvm-dev
> (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
> what linkers should do regarding relocations referencing dropped functions (due
> to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*
>
> As an example:
>
>    __attribute__((section(".text.x"))) void f1() { }
>    __attribute__((section(".text.x"))) void f2() { }
>    int main() { }
>
> Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
> symbols are collected):
>
>    0x00000043:   DW_TAG_subprogram [2]
>                    ###### relocated by .text.x + 10
>                    DW_AT_low_pc [DW_FORM_addr] (0x0000000000000010 ".text.x")
>                    DW_AT_high_pc [DW_FORM_data4] (0x00000006)
>                    DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_reg6 RBP)
>                    DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v")
>                    DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2")
>
>
> With ld --gc-sections:
>
> * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
>    This can cause overlapping address ranges with normal text sections. {{overlap}}
> * [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
>    See bfd/reloc.c (behavior introduced in
>    https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )
>
>    [0, 0) cannot be used because it terminates the list entry.
>    [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
>    subsequent address offset pairs.
> * .debug_loc address offset pairs have similar problem to .debug_ranges
> * In DWARF v5, the abnormal values can be in a separate section .debug_addr
>
> ---
>
> To save your time, I have a summary of the discussions. I am eager to know what you think
> of the ideas from binutils/gdb/elfutils's perspective.
>
> * {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
>    All (undef + addend) in .debug_* are resolved to -1.
>
>    We have to ignore the addend. With __attribute__((section(".text.x"))),
>    the address offset pair may be something like [.text.x + 16, .text.x + 24)
>    I have to resolve the whole (.text.x + 16) to the special value.
>
>    (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
>    (0 and -1 cannot be used due to the reasons above).
>
> * Refined formula for a relocated value in a non-SHF_ALLOC section:
>
>    if is_defined(sym)
>       return addr(sym) + addend
>    if relocated_section is .debug_ranges or .debug_loc
>       return -2 # addend is intentionally ignored
>
>    // Every DWARF v5 section falls here
>    return -1 {{zero}}
>
> * {{zero}} Can we resolve (undef + addend) to 0?
>
>    https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html
>
>    > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
>    > quirky/interesting use cases (admittedly - such platforms could equally want to make their
>    > executable code way up in the address space near max or max - 1, etc?).
>
>    Question: is address 0 meaningful for code in some binary formats?

There are targets where 0 is a valid code address, so we should avoid
attaching special meaning to this where possible.

Thanks,
Andrew
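The refined formula quoted above can be written out as a small runnable
sketch. It assumes 64-bit relocations (such as R_X86_64_64) and models a
discarded or undefined symbol as None; the function name resolve and its
parameters are hypothetical and are not taken from any linker's source.

```python
MASK64 = (1 << 64) - 1


def resolve(sym_addr, addend, relocated_section):
    """Value written for a relocation in a non-SHF_ALLOC section.

    sym_addr is the symbol's address, or None when the symbol is undefined
    because its section was discarded (comdat rules, --gc-sections, /DISCARD/).
    """
    if sym_addr is not None:
        return (sym_addr + addend) & MASK64
    if relocated_section in (".debug_ranges", ".debug_loc"):
        return -2 & MASK64  # addend is intentionally ignored
    # Every other section lands here, including all DWARF v5 sections.
    return -1 & MASK64      # addend is ignored here as well


# Live symbol: the ordinary address + addend.
print(hex(resolve(0x401110, 0, ".debug_info")))
# Discarded .text.x + 16 referenced from a pre-v5 range list.
print(hex(resolve(None, 16, ".debug_ranges")))
# Discarded .text.x + 16 referenced from DW_AT_low_pc.
print(hex(resolve(None, 16, ".debug_info")))
```

Ignoring the addend is what keeps the [.text.x + 16, .text.x + 24) case
from turning into a plausible-looking [16, 24) range, and avoiding 0 as
the reserved value keeps the scheme usable on targets where 0 is a valid
code address.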
Thread overview: 25+ messages
2020-05-31 18:55 Range lists, zero-length functions, linker gc Fangrui Song
2020-05-31 19:15 ` Fangrui Song
2020-05-31 20:10 ` Mark Wielaard
2020-05-31 20:47 ` Fangrui Song
2020-05-31 22:11 ` Mark Wielaard
2020-05-31 23:17 ` David Blaikie
2020-05-31 20:49 ` David Blaikie
2020-05-31 22:29 ` Mark Wielaard
2020-05-31 22:36 ` David Blaikie
2020-06-01  9:31 ` Mark Wielaard
2020-06-01 20:18 ` David Blaikie
2020-06-02 16:50 ` Mark Wielaard
2020-06-02 18:06 ` David Blaikie
2020-06-03  3:10 ` Alan Modra
2020-06-03  4:06 ` Fangrui Song
2020-06-03 21:50 ` David Blaikie
2020-06-09 20:24 ` Tombstone values in debug sections (was: Range lists, zero-length functions, linker gc) Fangrui Song
2020-06-19 20:04 ` Mark Wielaard
2020-06-20  1:02 ` David Blaikie
2020-06-19 12:00 ` Range lists, zero-length functions, linker gc Mark Wielaard
2020-06-20  0:46 ` David Blaikie
2020-06-24 22:21 ` Mark Wielaard
2020-06-25 23:45 ` David Blaikie
2020-05-31 21:33 ` David Blaikie
2020-06-01 16:25 ` Andrew Burgess