Date: Sun, 31 May 2020 12:15:32 -0700
From: Fangrui Song
To: binutils@sourceware.org, gdb@sourceware.org, elfutils-devel@sourceware.org
Subject: Re: Range lists, zero-length functions, linker gc
Message-ID: <20200531191532.albcdemzwbeyovik@google.com>
References: <20200531185506.mp2idyczc4thye4h@google.com>
In-Reply-To: <20200531185506.mp2idyczc4thye4h@google.com>
List-Id: Elfutils-devel mailing list

On 2020-05-31, Fangrui Song wrote:
>It is being discussed on llvm-dev
>(https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html https://groups.google.com/forum/#!topic/llvm-dev/i0DFx6YSqDA)
>what linkers should do regarding relocations referencing dropped functions (due
>to section group rules, --gc-sections, /DISCARD/, etc) in .debug_*
>
>As an example:
>
>  __attribute__((section(".text.x"))) void f1() { }
>  __attribute__((section(".text.x"))) void f2() { }
>  int main() { }
>
>Some .debug_* sections are relocated by R_X86_64_64 referencing undefined symbols (the STT_SECTION
>symbols are collected):
>
>  0x00000043:   DW_TAG_subprogram [2]
>                  ###### relocated by .text.x + 10
>                  DW_AT_low_pc [DW_FORM_addr]     (0x0000000000000010 ".text.x")
>                  DW_AT_high_pc [DW_FORM_data4]   (0x00000006)
>                  DW_AT_frame_base [DW_FORM_exprloc]      (DW_OP_reg6 RBP)
>                  DW_AT_linkage_name [DW_FORM_strp]       ( .debug_str[0x0000002c] = "_Z2f2v")
>                  DW_AT_name [DW_FORM_strp]       ( .debug_str[0x00000033] = "f2")
>
>
>With ld --gc-sections:
>
>* DW_AT_low_pc [DW_FORM_addr] in .debug_info are resolved to 0 + addend
>  This can cause overlapping address ranges with normal text sections. {{overlap}}
>* [beginning address offset, ending address offset) in .debug_ranges are resolved to 1 (ignoring addend).
>  See bfd/reloc.c (behavior introduced in
>  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=e4067dbb2a3368dbf908b39c5435c84d51abc9f3 )
>
>  [0, 0) cannot be used because it terminates the list entry.
>  [-1, -1) cannot be used because -1 represents a base address selection entry which will affect
>  subsequent address offset pairs.
>* .debug_loc address offset pairs have similar problem to .debug_ranges
>* In DWARF v5, the abnormal values can be in a separate section .debug_addr
>
>---
>
>To save your time, I have a summary of the discussions. I am eager to know what you think
>of the ideas from binutils/gdb/elfutils's perspective.
>
>* {{reserved_address}} Paul Robinson wants to propose that DWARF v6 reserves a special address.
>  All (undef + addend) in .debug_* are resolved to -1.
>
>  We have to ignore the addend. With __attribute__((section(".text.x"))),
>  the address offset pair may be something like [.text.x + 16, .text.x + 24)
>  I have to resolve the whole (.text.x + 16) to the special value.
>
>  (undef + addend) in pre-DWARF v5 .debug_loc and .debug_ranges are resolved to -2
>  (0 and -1 cannot be used due to the reasons above).
>
>* Refined formula for a relocated value in a non-SHF_ALLOC section:
>
>     if is_defined(sym)
>        return addr(sym) + addend
>     if relocated_section is .debug_ranges or .debug_loc
>        return -2 # addend is intentionally ignored
>
>     // Every DWARF v5 section falls here
>     return -1 {{zero}}
>
>* {{zero}} Can we resolve (undef + addend) to 0?
>
>  https://lists.llvm.org/pipermail/llvm-dev/2020-May/141967.html
>
>  > while it might not be an issue for ELF, DWARF would want a standard that's fairly resilient to
>  > quirky/interesting use cases (admittedly - such platforms could equally want to make their
>  > executable code way up in the address space near max or max - 1, etc?).
>
>  Question: is address 0 meaningful for code in some binary formats?
>
>* {{overlap}} The current situation (GNU ld, gold, LLD): (undef + addend) in .debug_* are resolved to addend.
>  For an address offset pair like [.text + 0, .text + 0x10010), if the ending address offset is large
>  enough, it may overlap with a normal text address range (for example [0x10000, *))
>
>  This can cause problems in debuggers. How does gdb solve the problem?
>
>* {{nonalloc}} Linkers resolve (undef + addend) in non-SHF_ALLOC sections to
>  `addend`. For non-debug sections (open-ended), do we have needs resolving such
>  values to `base` or `base+addend` where base is customizable?
>  (https://lists.llvm.org/pipermail/llvm-dev/2020-May/141956.html )

I forgot to mention one more point:

* {{compatibility}} Do we need an option if we change the computed value of (undef + addend) to
  -2 (.debug_loc, .debug_ranges) / -1 (other .debug_*)
  (or 0 (other .debug_*), but that might not be nice to some binary formats {{reserved_address}})?

  https://lists.llvm.org/pipermail/llvm-dev/2020-May/141958.html

  > If we end up blessing it as part of the DWARF spec, we probably
  > wouldn't want it to be user-configurable for the .debug_ sections, so
  > I'd hesitate to add that configurability to the linker lest we have to
  > revoke it to conform to DWARF (breaking flag compatibility with
  > previous versions of the linker, etc). Admittedly we'll be breaking
  > output compatibility with this change regardless, so potentially
  > having the flag as an escape hatch could be useful.

  I hope we don't need a linker option. But if some not-so-old versions of
  gdb / binutils programs / elfutils programs can't cope with -2/-1/0
  {{reserved_address}}, we may have to invent one. In that case I hope
  GNU ld, gold and LLD can agree on a compatible option.
  (As an LLD contributor, I'd be happy to implement the option in LLD.)
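
To make the trade-off concrete, here is a minimal sketch of the refined formula from the quoted summary, with the possible escape-hatch option modelled as a boolean. Everything named here is an assumption for illustration only: `resolve_debug_reloc`, the `use_tombstone` flag and the 8-byte default width are hypothetical and do not describe an existing option in GNU ld, gold or LLD. The non-tombstone branch models only the "resolve to addend" behaviour and ignores the GNU ld special case for .debug_ranges.

```python
# Hypothetical sketch of the proposed relocation rule; not a real linker API.

def resolve_debug_reloc(section, sym_addr, addend, width=8, use_tombstone=True):
    """Value written for a relocation in a non-SHF_ALLOC section.

    sym_addr is None when the referenced symbol was discarded
    (section group rules, --gc-sections, /DISCARD/, ...).
    """
    mask = (1 << (8 * width)) - 1
    if sym_addr is not None:
        return (sym_addr + addend) & mask
    if not use_tombstone:
        # Assumed "old" behaviour: (undef + addend) resolves to addend.
        return addend & mask
    if section in (".debug_ranges", ".debug_loc"):
        # 0 terminates the list and -1 is a base address selection
        # entry, so -2 is used; the addend is intentionally ignored.
        return -2 & mask
    return -1 & mask


def overlaps(a, b):
    """True if half-open ranges a = [lo, hi) and b = [lo, hi) intersect."""
    return a[0] < b[1] and b[0] < a[1]


live_text = (0x10000, 0x20000)  # assumed address range of a retained .text

# Old behaviour: [.text + 0, .text + 0x10010) of a dropped function
# collapses to [0, 0x10010), which collides with live code.
old = tuple(resolve_debug_reloc(".debug_info", None, a, use_tombstone=False)
            for a in (0, 0x10010))

# Proposed behaviour: both ends become the same reserved value, so the
# range is empty and cannot collide.
new = tuple(resolve_debug_reloc(".debug_info", None, a) for a in (0, 0x10010))

print(overlaps(old, live_text), overlaps(new, live_text))
```

Modelling the option as a single boolean is only the simplest choice; whether a real option should be per-section or take an explicit value is exactly the open question above.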