From: David Blaikie
Date: Sun, 31 May 2020 13:49:12 -0700
Subject: Re: Range lists, zero-length functions, linker gc
To: Mark Wielaard
Cc: Fangrui Song, gdb@sourceware.org, elfutils-devel@sourceware.org, binutils@sourceware.org
References: <20200531185506.mp2idyczc4thye4h@google.com> <20200531201016.GJ44629@wildebeest.org>
In-Reply-To: <20200531201016.GJ44629@wildebeest.org>

On Sun, May 31, 2020 at 1:41 PM Mark Wielaard wrote:
>
> Hi,
>
> On Sun, May 31, 2020 at 11:55:06AM -0700, Fangrui Song via Elfutils-devel wrote:
> > what linkers should do regarding relocations referencing dropped
> > functions (due to section group rules, --gc-sections, /DISCARD/,
> > etc) in .debug_*
> >
> > As an example:
> >
> >   __attribute__((section(".text.x"))) void f1() { }
> >   __attribute__((section(".text.x"))) void f2() { }
> >   int main() { }
> >
> > Some .debug_* sections are relocated by R_X86_64_64 referencing
> > undefined symbols (the STT_SECTION symbols are collected):
> >
> >   0x00000043:   DW_TAG_subprogram [2]
> >                 ###### relocated by .text.x + 10
> >                 DW_AT_low_pc [DW_FORM_addr]       (0x0000000000000010 ".text.x")
> >                 DW_AT_high_pc [DW_FORM_data4]     (0x00000006)
> >                 DW_AT_frame_base [DW_FORM_exprloc]        (DW_OP_reg6 RBP)
> >                 DW_AT_linkage_name [DW_FORM_strp] ( .debug_str[0x0000002c] = "_Z2f2v")
> >                 DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000033] = "f2")
> >
> >
> > With ld --gc-sections:
> >
> > * DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 +
> >   addend. This can cause overlapping address ranges with normal text
> >   sections. {{overlap}}
> > * [beginning address offset, ending address
> >   offset) in .debug_ranges are resolved to 1 (ignoring addend). See
> >   bfd/reloc.c (behavior introduced in
> >   https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3
> >   )
> >
> >   [0, 0) cannot be used because it terminates the list entry.
> >   [-1, -1) cannot be used because -1 represents a base address
> >   selection entry which will affect subsequent address offset
> >   pairs.
> > * .debug_loc address offset pairs have similar problem to .debug_ranges
> > * In DWARF v5, the abnormal values can be in a separate section .debug_addr
> >
> > ---
> >
> > I am eager to know what you think
> > of the ideas from binutils/gdb/elfutils's perspective.
>
> I think this is a producer problem. If a (code) section can be totally
> dropped then the associated (.debug) sections should have been
> generated together with that (code) section in a COMDAT group. That
> way when the linker drops that section, all the associated sections in
> that COMDAT group will get dropped with it. If you don't do that, then
> the DWARF is malformed and there is not much a consumer can do about
> it.
>
> Said otherwise, I don't think it is correct for the linker (with
> --gc-sections) to drop any sections that have references to it
> (through relocation symbols) from other (.debug) sections.

That's probably not practical for at least some users. The easiest and
most thorough counter-example is Split DWARF: the DWARF is in another
file that the linker can't see. All the linker sees is a list of
addresses (debug_addr).

All 3 linkers have (modulo bugs) supported this situation, to varying
degrees, for decades (ld.bfd: resolve to zero everywhere, resolve to 1
in debug_ranges; lld/gold: resolve to 0+addend). This is an attempt to
fix the bugs and maybe make the solution a bit more robust, work for
more cases, and be more intentional.

Even setting Split DWARF aside, creating DWARF that can be dropped by a
non-DWARF-aware linker (i.e. one that doesn't have to parse and rebuild
all the DWARF at link time, which would be super expensive, though
someone's prototyping that in lld for those willing to pay that
tradeoff) involves larger DWARF, which isn't always a great tradeoff.
Some users care a lot more about object size than executable size (and
maybe about increased link time, due to more sections, etc).
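
To make the quoted constraints on .debug_ranges concrete, here is a
minimal sketch of how a consumer might walk a DWARF v4 range list. It is
not taken from any of the linkers or debuggers mentioned above. The
function name decode_range_list, the sample addresses, and the
assumptions of 64-bit addresses and a CU base address of 0 are all made
up for illustration.

```python
# Illustrative only: a toy reader for DWARF v4 .debug_ranges entries.
# Assumes 64-bit addresses; each entry is a (begin, end) pair of offsets.
MAX_ADDR = 0xFFFFFFFFFFFFFFFF  # "-1" viewed as an unsigned 64-bit address


def decode_range_list(pairs, cu_base=0):
    """Return the non-empty [low, high) address ranges encoded in pairs."""
    base = cu_base
    ranges = []
    for begin, end in pairs:
        if begin == 0 and end == 0:
            break  # (0, 0) is the end-of-list entry
        if begin == MAX_ADDR:
            base = end  # base address selection entry
            continue
        if begin == end:
            continue  # empty range, contributes nothing
        # Wrap to 64 bits, as unsigned address arithmetic would.
        ranges.append(((base + begin) & MAX_ADDR, (base + end) & MAX_ADDR))
    return ranges


# A hypothetical live function, listed after the entry of a dropped one.
LIVE = (0x1000, 0x1010)
END = (0, 0)

as_one = decode_range_list([(1, 1), LIVE, END])            # dropped -> [1, 1)
as_zero = decode_range_list([(0, 0), LIVE, END])           # dropped -> [0, 0)
as_minus_one = decode_range_list([(MAX_ADDR, MAX_ADDR), LIVE, END])
as_addend = decode_range_list([(0x10, 0x16), LIVE, END])   # 0 + addend

print(as_one, as_zero, as_minus_one, as_addend)
```

In this sketch only the [1, 1) form leaves the live range intact and adds
nothing else. [0, 0) ends the list before the live entry is read, the -1
form shifts the live range by changing the base, and the 0 + addend form
reports a range near address zero that can overlap real text, which is
the {{overlap}} case in the quoted message.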