From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112]) by sourceware.org (Postfix) with ESMTPS id 283E1388E802; Tue, 2 Jun 2020 16:50:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 283E1388E802 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mark@klomp.org Received: from tarox.wildebeest.org (tarox.wildebeest.org [172.31.17.39]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id 2D1A330C8B2E; Tue, 2 Jun 2020 18:50:16 +0200 (CEST) Received: by tarox.wildebeest.org (Postfix, from userid 1000) id CCBBA4708676; Tue, 2 Jun 2020 18:50:16 +0200 (CEST) Message-ID: Subject: Re: Range lists, zero-length functions, linker gc From: Mark Wielaard To: David Blaikie Cc: Fangrui Song , gdb@sourceware.org, elfutils-devel@sourceware.org, binutils@sourceware.org Date: Tue, 02 Jun 2020 18:50:16 +0200 In-Reply-To: References: <20200531185506.mp2idyczc4thye4h@google.com> <20200531201016.GJ44629@wildebeest.org> <20200531222937.GM44629@wildebeest.org> <20200601093103.GN44629@wildebeest.org> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Mailer: Evolution 3.28.5 (3.28.5-8.el7) Mime-Version: 1.0 X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gdb@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Jun 2020 16:50:20 -0000 Hi, On Mon, 2020-06-01 at 13:18 -0700, David Blaikie wrote: > On Mon, Jun 1, 2020 at 2:31 AM Mark 
Wielaard wrote:
> > Each skeleton compilation unit has a DW_AT_dwo_name attribute which
> > indicates the .dwo file where the split unit sections can be found. It
> > actually seems easier to generate a different one for each
> > skeleton compilation unit than trying to combine them for all the
> > different skeleton compilation units you produce.
> >
> > > Certainly Bazel (& the internal Google version used to build most
> > > Google software) can't handle an unbounded/unknown number of output
> > > files from a build action.
> >
> > Yes, in principle .dwo files seem troublesome for build systems in
> > general.
>
> They're pretty practical when they're generated right next to the .o
> file & that's guaranteed by the compiler. "if you generate x.o, there
> will be x.dwo next to it" - that's certainly how Bazel deals with
> this. It doesn't parse the DWARF at all - knowing where the .dwo files
> are along with the .o files.

The DWARF spec makes it clear that a DWO is per CU, not per object
file. So when an object file contains multiple CUs, it might also be
associated with multiple .dwo files (as is also the case with a linked
executable or shared library). The spec says the DW_AT_dwo_name can
contain either a (relative) file name or a path to the associated DWO
file. This means that relying on a one-to-one mapping from .o to .dwo
is fragile and is likely to break when tools start using multiple CUs
or different naming heuristics.

> > Because of that I am
> > actually a fan of the SHF_EXCLUDE hack that simply places the split
> > .dwo sections in the same object file. For the above that would mean,
> > just place them in the same section group.
>
> This was a newer feature added during standardization of Split DWARF,
> which is handy for some users

Although it is used in practice by some producers, it is not
standardized (yet). That is also because SHF_EXCLUDE itself isn't
standardized (although it is used consistently for those arches that
support it).
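
To make the per-CU point above concrete, here is a minimal sketch of how
a consumer might map skeleton CUs to .dwo paths. It is only an
illustration: the function name dwo_paths and the example paths are made
up, and it assumes the DW_AT_dwo_name and DW_AT_comp_dir strings have
already been read from each skeleton unit by some DWARF reader. A
relative DW_AT_dwo_name is resolved against DW_AT_comp_dir, an absolute
one is used as is.

```python
import posixpath


def dwo_paths(skeleton_units):
    """Return the .dwo path for each skeleton CU, in input order.

    skeleton_units: iterable of (dwo_name, comp_dir) string pairs, assumed
    to come from DW_AT_dwo_name and DW_AT_comp_dir of each skeleton CU.
    (Hypothetical helper, not part of any existing tool.)
    """
    paths = []
    for dwo_name, comp_dir in skeleton_units:
        if posixpath.isabs(dwo_name):
            # Absolute name: the producer gave the full location.
            paths.append(dwo_name)
        else:
            # Relative name: resolve against the compilation directory.
            paths.append(posixpath.normpath(posixpath.join(comp_dir, dwo_name)))
    return paths


# One object file holding two CUs may point at two different .dwo files,
# neither of which has to sit next to the .o file.
units = [("a.dwo", "/build/obj"), ("/cache/dwo/b.dwo", "/build/obj")]
print(dwo_paths(units))
```

A build system that only assumes "x.o implies x.dwo next to it" would
miss the second file in this example, which is the fragility described
above.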
> - but doesn't address the needs of the
> original design of Split DWARF (for Google) - a distributed build
> system that is trying to avoid moving more bytes than it must to one
> machine to run the link step. So not having to ship all the DWARF
> bytes to one machine for interactive debugging (pulling down from a
> distributed file system only the needed .dwo files during debugging -
> not all of them) - or at least being able to ship all the .dwo files
> to one machine to make a .dwp, and ship all the .o files to another
> machine for the link.

I think that is not what most people would use split-dwarf for. The
Google setup seems somewhat unique. Most people probably do compiling,
linking and debugging on the same machine. The main use case (for me)
is to speed up the edit-compile-debug cycle. Making sure that the
linker doesn't have to deal with (most of) the .debug sections and can
just leave them behind (either in the .o file, or a separate .dwo
file) is the main attraction of split-dwarf IMHO. When actually
producing production builds with debug info you still pay the price
anyway, because instead of the linker, you now need to build your dwp
packages, which does most of the same work the linker would have done
anyway (combining the data, merging the string indexes, deduplicating
debug types, etc.).

> > > Multiple CUs in a single .dwo file is not really supported, which
> > > would be another challenge (we had to compromise debug info quality a
> > > little because of this limitation when doing ThinLTO - unable to emit
> > > multiple CUs into each thin-linked .o file) - at which point maybe the
> > > compiler'd need to produce an intermediate .dwp file of sorts...
> >
> > Are you sure?
>
> Fairly sure - I worked in depth on the implementation of ThinLTO &
> considered a variety of options trying to support Split DWARF in that
> situation.
>
> > Each CU would have a separate dwo_id field to
> > distinguish them.
> > At least that is how elfutils figures out which CU
> > in a dwo file matches a given skeleton DIE. This should work the same
> > as for type units, you can have multiple type units in the same file
> > and distinguish which one you need by matching the signature.
>
> One of the complications is that it increased the complexity of making
> a .dwp file - Split DWARF is spec'd to ensure that the linking process
> is as lightweight as possible. Not having the size overhead of
> relocations (though trading off more indirection through the cu_index,
> debug_str_offsets, etc). Oh right... that was the critical issue:
> There was no way I could think of to do cross-CU references in Split
> DWARF (cross-CU references being critical to LTO - inlining from one
> CU into another, etc). Because there was no relocation processing in
> dwp generation. Arguably maybe one could use a sec_offset that's
> resolved relative to a local range within the contributions described
> by the cu_index - but the cu_index must have one entry per unit (the
> entries are keyed on unit) - I guess you could have a single entry per
> CU, but have those entries overlap (so all the CUs from one dwo file
> get separate index entries that contain the same contribution ranges).
> Then consumers would have to search through the debug_info
> contribution to find the right unit.... defeating some of the value of
> the index.

I think we are drifting somewhat away from the original topic and/or
are talking past each other. We somehow combined the topics of doing
LTO with using Split DWARF, while we started with whether a DWARF
producer like a compiler that generated separate functions in separate
ELF sections could also generate the associated DWARF in separate
sections. I believe it can, and it can even do so when generating Split
DWARF. You see some practical issues, especially when combining an LTO
build together with generating Split DWARF.
But before we try to resolve those issues, maybe we should take a step
back and see which issue we are really trying to solve. I do think
combining Split DWARF and LTO might not be the best solution. When
doing LTO you probably want something like GCC Early Debug, which is
like Split DWARF, but different, because the Early Debug simply doesn't
contain any address (ranges) yet (not even through indirection like
.debug_addr).

> > > & again the overhead of all those separate contributions, headers,
> > > etc, turns out to be not very desirable in any case.
> >
> > Yes, I agree with that. But as said earlier, maybe the compiler
> > shouldn't have generated the code/data in the first place?
>
> In the (especially) C++ compilation model, I don't believe that's
> possible - inline functions, templates, etc, require duplication -
> unless you have a more complicated build process that can gather the
> potential duplication, then fan back out again to compile, etc.
> ThinLTO does some of this - at a cost of a more complicated build
> system, etc.

It might be useful for the original discussion to have a few more
concrete examples to show when you might have unused code that the
linker might want to discard, but where the compiler could only produce
DWARF in one big blob. That is, apart from the -ffunction-sections
case, where I would argue the compiler simply needs to make sure that
if it generates code in separate sections it should also create the
DWARF in separate sections (or section groups).

Thanks,

Mark