Date: Thu, 24 Sep 2020 12:39:10 +0100
From: Jozef Lawrynowicz
To: Fangrui Song
Cc: Michael Matz, Binutils, "H.J. Lu", ccoutant@gmail.com
Subject: Re: [PATCH] Support SHF_GNU_RETAIN ELF section flag
Message-ID: <20200924113910.zm2ocfura3egmq44@jozef-acer-manjaro>
In-Reply-To: <20200923232943.kasbrmqtpone4yi7@gmail.com>
List-Id: Binutils mailing list

On Wed, Sep 23, 2020 at 04:29:43PM -0700, Fangrui Song wrote:
> Hi Jozef,

Hi Fangrui,

> I saw your proposal https://sourceware.org/pipermail/gnu-gabi/2020q3/000429.html
> I did not subscribe to gnu-gabi before yesterday so it is inconvenient for me
> to reply there. Since SHF_GNU_RETAIN is a new feature, and we already have facility
> for making arbitrary sections alive with R_*_NONE, can you highlight the selling
> point of a new flag?
>
> Copying me previous reply here
>
> > We already have a way to create an artificial reference:
> >
> >   .reloc ., R_X86_64_NONE, target_symbol
> >
> > If we allow a relocation number for the second operand
> >
> >   .reloc ., 0, target_symbol
> >
> > this will be generic. You can insert the directives in a GC root (e.g.
> > _start or a symbol referenced by -u or maybe an .init_array)
>
> If you do not want to touch the section containing the -e (--entry) symbol, you
> can use:
>
>   .section .init_array.1,"a",@init_array
>   .reloc ., R_X86_64_NONE, retained_section
>
> (I find that gold has an internal error with such a relocation.)
> But GNU ld should have been supported this for a very long time.
>
> (I added these directives to llvm last year: https://reviews.llvm.org/D62014 )
>

The fact that this relies on the compiler knowing a specific section
will be present in a linker script, when we are dealing with such a
broad ecosystem of targets and operating systems, makes me uneasy.

The functionality simply breaks if the user has a custom linker script
which does not have .init_array. Many embedded applications can be
written without requiring this section. If someone has written their
linker script from scratch, only including the section directives for
the sections they actually need, why must we enforce that they have a
.init_array input section rule just so they can make use of the "retain"
attribute? It doesn't make sense - .init_array and "retain" are not
related.

Even if this approach would work and pick the right section, I think it
is nicer for the user for the "retain" attribute to have a dedicated ELF
construct which describes the requirement to retain the section, instead
of using an existing construct whose purpose is not related.
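
[Illustration, not part of the original message.] The dependency on a
GC root described above can be shown with a toy mark phase. This is only
a sketch under stated assumptions: it is not how GNU ld is implemented,
the Section class and gc_sections() are hypothetical names, and the
"retain" field merely stands in for SHF_GNU_RETAIN. The section names
.init_array.1 and retained_section are taken from the quoted example;
.text._start is made up.

```python
# Toy model of linker section garbage collection. This is NOT how GNU ld is
# implemented; the Section class and gc_sections() are hypothetical names.
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    relocs: list = field(default_factory=list)  # names of sections this one references
    retain: bool = False  # stands in for SHF_GNU_RETAIN


def gc_sections(sections, roots):
    """Return the names of the sections that survive a mark phase."""
    by_name = {s.name: s for s in sections}
    # Start from the GC roots that exist in the link, plus every section
    # carrying the retain flag.
    worklist = [name for name in roots if name in by_name]
    worklist += [s.name for s in sections if s.retain]
    kept = set()
    while worklist:
        name = worklist.pop()
        if name in kept:
            continue
        kept.add(name)
        worklist.extend(r for r in by_name[name].relocs if r in by_name)
    return kept


# Approach 1: an R_*_NONE-style reference placed in .init_array.1.
reloc_link = [
    Section(".text._start"),
    Section(".init_array.1", relocs=["retained_section"]),
    Section("retained_section"),
]
# A linker script that keeps .init_array: the reference is followed.
kept_with_init_array = gc_sections(reloc_link, roots=[".text._start", ".init_array.1"])
# A from-scratch linker script with no .init_array rule: nothing marks the target.
kept_without_init_array = gc_sections(reloc_link, roots=[".text._start"])

# Approach 2: the section itself carries the flag, so no anchor is needed.
flag_link = [
    Section(".text._start"),
    Section("retained_section", retain=True),
]
kept_with_flag = gc_sections(flag_link, roots=[".text._start"])
```

In this model the relocation only helps when the section holding it is
itself a root, whereas the flagged section survives regardless of which
roots the linker script provides.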

Your average user is going to be very confused about why there are
relocs in section X which point to various symbols in their code. If
they have written the entire application, they might be able to infer
that it is the "retain" attribute which generated these relocs, but if
someone else wrote the code, or the code is from a library or SDK, it
will not be clear.

OK, we could maybe name a reloc like BFD_RELOC_RETAIN, but then what
would the description be?

  This relocation type does not actually perform any relocation action,
  but is used to indicate that the symbol it references should not be
  discarded by linker garbage collection. It must be placed in a section
  which will definitely be present in the linked output file, and not be
  subject to garbage collection, otherwise it will not have any effect.

Can you tell me why it is preferable to use the relocation mechanism to
implement this, instead of a precisely defined new section flag?

Why must we look to workarounds to implement something like this anyway?
We can work out the details of a new section flag, specify it precisely
so that it is robust, and then developers can benefit from understanding
more about how their program has been put together.

Do we want to make life easier for ourselves, or easier for our users?
I get that ABI changes can be a bit disruptive, but this new flag in
particular really isn't complicated anyway.

> ---
>
> For a new section flag, there are a bunch of things needing thoughts
>
> * assembler
>
> The .retain directive seems to be discouraged... For section flags:
>
> .section .foo,"a"
> .section .foo,"aR" # is this a new section
> .pushsection .foo,"aR" # is this a new section

No, they are not new sections. From my original proposal:

> Alternatively, the "R" flag is recognized by the "flags" argument to the
> .section directive and will apply SHF_GNU_RETAIN to that section.
> It is intended that SHF_GNU_RETAIN does not interfere with any validation when
> switching to a section.
> It can be used to augment the section flags in a section
> which has already been created.

When you have two .section directives for the same section, GAS
"switches" between them instead of creating new sections, which is what
I referred to above.

This is why the .retain directive more precisely describes what is
happening. The compiler is telling the assembler that the section
containing the declaration of the function or data symbol should have
the SHF_GNU_RETAIN flag applied.

>
> Does the compiler need to remember that a section has the flag?
> (Think how this works with __attribute__((section(...))); many asm streamers are
> one-pass)

The compiler does not need to worry about sections beyond getting the
name of the section the declaration is in. The "retain" attribute just
means that the section containing the declaration of the function or
data object must be retained, so it emits a directive to describe that.
Once the assembler has set SHF_GNU_RETAIN on a section, it will not be
unset.

I expect the most common use case to actually be when either the
"section" attribute has been used, or the -f{function,data}-sections GCC
options have been passed. If the user is trying to make the most out of
garbage collection, they should be using -f{function,data}-sections.

>
> * linker
> - What does -r do on two sections of the same, one with the flag and the other
> without? (as HJ mentioned)

To reply to H.J. as well for this point:
I don't think this warrants any special behavior; SHF_GNU_RETAIN doesn't
need to change the behavior of section merging.

The user should put the object to retain in its own section if they
don't want large parts of their program to possibly be unnecessarily
retained. The unique section name they give their SHF_GNU_RETAIN section
will not be merged into a general output section until they perform the
final non-relocatable link.
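
[Illustration, not part of the original message.] The "switching"
behavior described above, where "R" augments an existing section rather
than creating a new one and is never unset, can be modelled roughly as
follows. This is a sketch only, not GAS code: the function
apply_section_directives() and its data layout are hypothetical, and the
mismatch error is an assumption about how non-"R" flag changes might be
rejected.

```python
# Toy model of .section switching in an assembler. This is NOT GAS code;
# apply_section_directives() and its data layout are hypothetical.
def apply_section_directives(directives):
    """directives: list of (section_name, flags_string) in source order."""
    sections = {}
    for name, flags in directives:
        wants_retain = "R" in flags
        base_flags = frozenset(flags) - {"R"}
        if name not in sections:
            sections[name] = {"flags": base_flags, "retain": wants_retain}
            continue
        existing = sections[name]
        # "R" is ignored when checking that the flags match, so it does not
        # interfere with switching back to an existing section.
        if base_flags != existing["flags"]:
            raise ValueError(f"changed section attributes for {name}")
        # Once set, the retain flag is never cleared.
        existing["retain"] = existing["retain"] or wants_retain
    return sections


# The three directives from the quoted question, applied in order.
result = apply_section_directives([
    (".foo", "a"),
    (".foo", "aR"),
    (".foo", "a"),
])
```

Here all three directives resolve to a single .foo section, and the
retain marker set by the second directive survives the third.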

A section with SHF_GNU_RETAIN applied is being retained because it
contains some information that is important to the program. So wherever
that information ends up, it needs to be retained.

> - Does the output section have the flag?

SHF_GNU_RETAIN is applied to an input section. To ensure the input
section is retained, SHF_GNU_RETAIN must be applied to any section that
input section is merged with. The flag doesn't get removed from output
sections.

> - Does the flag retain other sections in the same section group?

Yes.
From the description of section groups in the ELF spec:
  ... such groups must be included or omitted from the linked object as
  a unit.

I think potentially the only confusing part of any section flag merging
behavior is the fact that the assembly code might have different
.section directives for the same section, some with "R" and some without
(+1 for a .retain directive ;)).

Once the assembler has emitted its output, the SHF_GNU_RETAIN flag
applied to an input section behaves like any other section flag.
There is only one line of linker code which does anything specific with
SHF_GNU_RETAIN, and that is the code in bfd/elflink.c to "gc_mark" the
section.

Thanks,
Jozef

>
>
> On 2020-09-23, H.J. Lu via Binutils wrote:
> > On Wed, Sep 23, 2020 at 1:04 PM Jozef Lawrynowicz
> > wrote:
> > >
> > > On Wed, Sep 23, 2020 at 12:03:28PM -0700, H.J. Lu via Binutils wrote:
> > > > On Wed, Sep 23, 2020 at 11:47 AM Jozef Lawrynowicz
> > > > wrote:
> > > > >
> > > > > On Wed, Sep 23, 2020 at 10:13:37AM -0700, H.J. Lu via Binutils wrote:
> > > > > > On Wed, Sep 23, 2020 at 9:52 AM Jozef Lawrynowicz
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, Sep 23, 2020 at 01:51:56PM +0000, Michael Matz wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > On Wed, 23 Sep 2020, H.J. Lu via Binutils wrote:
> > > > > > > >
> > > > > > > > > > I think that:
> > > > > > > > > >
> > > > > > > > > > > .section .text,"ax"
> > > > > > > > > > > ...
> > > > > > > > > > > foo:
> > > > > > > > > > > ...
> > > > > > > > > > > .retain
> > > > > > > > > > > retained_fn:
> > > > > > > > > > > ...
> > > > > > > > > >
> > > > > > > > > > is some nice syntactic sugar compared to:
> > > > > > > > > >
> > > > > > > > > > > .section .text,"ax"
> > > > > > > > > > > ...
> > > > > > > > > > > foo:
> > > > > > > > > > > ...
> > > > > > > > > > > .section .text,"axR"
> > > > > > > > > > > retained_fn:
> > > > > > > > > > > ...
> > > > > > > > > >
> > > > > > > > > > It's also partly for convenience; we have other directives which are
> > > > > > > > > > synonyms or short-hand for each other.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > You don't need to keep the whole section when only one symbol should
> > > > > > > > > be kept. Please drop the .retain directive. GCC, as and ld should do the
> > > > > > > > > right thing with
> > > > > > > > >
> > > > > > > > > .section .text,"ax"
> > > > > > > > > ...
> > > > > > > > > foo:
> > > > > > > > > ...
> > > > > > > > > .section .text,"axR"
> > > > > > > > >
> > > > > > > > > retained_fn:
> > > > > > > > >
> > > > > > > > > where foo can be dropped and retained_fn will be kept.
> > > > > > > >
> > > > > > > > This is not what we discussed at the ABI list, the flag is per section, so
> > > > > > > > either the whole section is retained or not. What you describe is
> > > > > > > > something else that would work on a per symbol basis, which would have to
> > > > > > > > be specified in a different way and might or might not be a good idea.
> > > > > > > > But let's not conflate these two.
> > > > > > >
> > > > > > > Also, the linker cannot currently dissect a section and remove a
> > > > > > > particular unused symbol anyway. Since garbage collection only operates
> > > > > > > on the section level, marking the section itself as "retained" seems
> > > > > > > most appropriate.
> > > > > >
> > > > > > It can be done.
> > > > > > If you put your branch on
> > > > > >
> > > > > > https://gitlab.com/x86-binutils/binutils-gdb
> > > > > >
> > > > > > I can help you implement it.
> > > > >
> > > > > It's not something I have time to look into at the moment, for now the
> > > > > aim is just to prevent garbage collection of sections.
> > > >
> > > > Linker and assembler already support it. You just need to add SHF_GNU_RETAIN
> > > > to the framework. Check how SHF_GNU_MBIND works.
> > >
> > > Sorry, I don't understand.
> > >
> > > Are you saying that LD already supports the garbage collection of
> > > individual unused symbol definitions from input sections? Whilst
> > > retaining other symbol definitions which are required by the program?
> > > I cannot find any reference to this.
> > >
> > > How does that relate to SHF_GNU_MBIND? I looked at all the references
> > > to "mbind" in Binutils and nothing seemed related garbage collection of
> > > sections, since SHF_GNU_MBIND is just used to indicate a particular
> > > section should be placed in a special memory area.
> >
> > For
> >
> > section .text,"ax"
> > ...
> > foo:
> > ...
> > .section .text,"axR"
> > retained_fn:
> >
> > you need to create a new .text section with SHF_GNU_RETAIN for
> > retained_fn. See get_section in obj-elf.c. If you want to avoid
> > merging .text section with SHF_GNU_RETAIN with other .text
> > sections by ld -r, linker needs to distinguish sections of the
> > same name with and without SHF_GNU_RETAIN.