From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 1 Jun 2020 11:31:03 +0200
From: Mark Wielaard <mark@klomp.org>
To: David Blaikie <dblaikie@gmail.com>
Cc: Fangrui Song <maskray@google.com>, gdb@sourceware.org,
 elfutils-devel@sourceware.org, binutils@sourceware.org
Subject: Re: Range lists, zero-length functions, linker gc
Message-ID: <20200601093103.GN44629@wildebeest.org>
References: <20200531185506.mp2idyczc4thye4h@google.com>
 <20200531201016.GJ44629@wildebeest.org>
 <CAENS6Esjx0HQpviW=ZrA4O3Bza7JDOpoqe3fLxqmLZ4TZsv-9w@mail.gmail.com>
 <20200531222937.GM44629@wildebeest.org>
 <CAENS6Es_DuMzzQi-RBzF_0vm2QCX1DbxGsQsTVDMdSz2f2h4oQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAENS6Es_DuMzzQi-RBzF_0vm2QCX1DbxGsQsTVDMdSz2f2h4oQ@mail.gmail.com>
User-Agent: Mutt/1.10.1 (2018-07-13)

Hi,

On Sun, May 31, 2020 at 03:36:02PM -0700, David Blaikie wrote:
> On Sun, May 31, 2020 at 3:30 PM Mark Wielaard <mark@klomp.org> wrote:
> > On Sun, May 31, 2020 at 01:49:12PM -0700, David Blaikie wrote:
> > > That's probably not practical for at least some users - the
> > > easiest/most thorough counter-example is Split DWARF - the DWARF is in
> > > another file the linker can't see. All the linker sees is a list of
> > > addresses (debug_addr).
> >
> > I might be missing something, but I think this works fine with Split
> > DWARF. As long as you make sure that the .dwo files/sections are
> > separated along the same lines as the ELF section groups are. That
> > means each section group either gets its own .dwo file, or you
> > generate the .dwo sections in the same section group in the same
> > object file using the SHF_EXCLUDED trick. That way each .debug.dwo
> > uses their own index into the separate .debug_addr tables. If that
> > group, with the .debug_addr table, gets discarded, then the reference
> > to the .dwo also disappears and it simply won't be used.
> 
> Oh, a whole separate .dwo file per function? That would be pretty
> extreme/difficult to implement (now the compiler's producing a
> variable number of output files? using some naming scheme so the build
> system could find them again for building a .dwp if needed, etc).

Each skeleton compilation unit has a DW_AT_dwo_name attribute which
indicates the .dwo file where the split unit sections can be found. It
actually seems easier to generate a different one for each
skeleton compilation unit than trying to combine them for all the
different skeleton compilation units you produce.

> Certainly Bazel (& the internal Google version used to build most
> Google software) can't handle an unbounded/unknown number of output
> files from a build action.

Yes, in principle .dwo files seem troublesome for build systems in
general. Especially since to do things properly you would need to read
the actual dwo_name attribute to make the connection from
object/skeleton file to split dwarf object file. And there is no easy
way to map back from .dwo to main ELF file. Because of that I am
actually a fan of the SHF_EXCLUDE hack that simply places the split
.dwo sections in the same object file. For the above that would mean,
just place them in the same section group.
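
To illustrate the mapping problem above, here is a minimal Python sketch
of the reverse lookup a build system would have to construct itself. It
assumes the (object file, dwo_name) pairs have already been read from
the skeleton units by some DWARF reader; that extraction step is not
shown, and the function name and file names are hypothetical.

```python
from collections import defaultdict


def dwo_to_objects(skeletons):
    """Build a reverse map from .dwo name to the object files naming it.

    `skeletons` is an iterable of (object_path, dwo_name) pairs. The pairs
    are assumed to come from reading DW_AT_dwo_name out of each skeleton
    compilation unit; this sketch does not parse any DWARF itself.
    """
    reverse = defaultdict(list)
    for object_path, dwo_name in skeletons:
        reverse[dwo_name].append(object_path)
    return dict(reverse)


if __name__ == "__main__":
    # Hypothetical inputs: two objects happen to name the same .dwo file.
    pairs = [("a.o", "a.dwo"), ("b.o", "b.dwo"), ("c.o", "a.dwo")]
    for dwo, objects in dwo_to_objects(pairs).items():
        print(dwo, "<-", ", ".join(objects))
```

The point is only that the forward direction is recorded in the DWARF,
while the reverse direction has to be rebuilt by scanning every object.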

> Multiple CUs in a single .dwo file is not really supported, which
> would be another challenge (we had to compromise debug info quality a
> little because of this limitation when doing ThinLTO - unable to emit
> multiple CUs into each thin-linked .o file) - at which point maybe the
> compiler'd need to produce an intermediate .dwp file of sorts...

Are you sure? Each CU would have a separate dwo_id field to
distinguish them. At least that is how elfutils figures out which CU
in a dwo file matches a given skeleton DIE. This should work the same
as for type units: you can have multiple type units in the same file
and distinguish which one you need by matching the signature.
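
As a rough sketch of that matching, here is a toy Python model. It is
not elfutils code; the class names, field names, and id values are made
up for illustration. It only shows the idea of picking the split unit
whose dwo_id equals the one recorded in the skeleton unit.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SplitUnit:
    """Toy stand-in for a split compile unit stored in a .dwo file."""
    dwo_id: int
    name: str


@dataclass(frozen=True)
class SkeletonUnit:
    """Toy stand-in for a skeleton unit in the main object file."""
    dwo_name: str
    dwo_id: int


def find_split_unit(skeleton: SkeletonUnit,
                    dwo_units: Iterable[SplitUnit]) -> Optional[SplitUnit]:
    """Return the split unit whose dwo_id matches the skeleton, or None."""
    for unit in dwo_units:
        if unit.dwo_id == skeleton.dwo_id:
            return unit
    return None


if __name__ == "__main__":
    # Hypothetical .dwo contents: two CUs living in the same file.
    units = [SplitUnit(0x1111, "a.c"), SplitUnit(0x2222, "b.c")]
    skel = SkeletonUnit("combined.dwo", 0x2222)
    print(find_split_unit(skel, units))
```

Type units would be looked up the same way, with the type signature
taking the place of the dwo_id.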

> & again the overhead of all those separate contributions, headers,
> etc, turns out to be not very desirable in any case.

Yes, I agree with that. But as said earlier, maybe the compiler
shouldn't have generated the code/data in the first place?

Cheers,

Mark