From: "H.J. Lu"
Date: Mon, 18 Jan 2021 14:53:53 -0800
Subject: Re: ifunc resolving
To: Fangrui Song
Cc: Binutils, GNU C Library, Szabolcs Nagy
In-Reply-To: <20210118220403.nzq6imfmaluuavfp@gmail.com>
List-Id: Libc-alpha mailing list

On Mon, Jan 18, 2021 at 2:04 PM Fangrui Song wrote:
>
> I have seen ifunc relocation activities on glibc and ld recently.
> https://sourceware.org/glibc/wiki/GNU_IFUNC is under-documented, some aspects
> have not been well-known, and there are a lot of differences across architectures
> supporting ifunc, so I am sending this email hoping that these aspects can be
> clarified, toolchain developers can get on the same page, and documentation can
> be improved (if developers get confused at times, how could regular users
> comfortably use them? :) )
>
>
> 1. An ifunc defined in the executable is called by a (link-time DT_NEEDED or
> runtime) shared object.
>
> From ld https://sourceware.org/bugzilla/show_bug.cgi?id=23169 (x86 only) this looks desired.
> My understanding (comment 8) is that
>
> (1) The main executable is relocated last.
> (2) By converting the main executable STT_GNU_IFUNC symbol to STT_FUNC, when
> processing relocations in a DSO, the ifunc resolver will not be called while the
> main executable is unresolved.
>
> ifunc calls from within the executable do not incur additional costs.
> ifunc calls from DSOs go through the main exe PLT and are penalized.
>
> When processing an ifunc relocation in a DSO, if the ifunc resolver is defined
> in another DSO, according to comment 9 it will be an error.
>
> This adds an executable-vs-shared difference to non-preemptible ifunc, but so be it.
>
>
> The above sounds reasonable. However, the top-of-tree ld does not make -no-pie
> and -pie behaviors consistent (note: ld does not support -no-pie yet).
>
>
> cat > ./a.s <<eof
> resolver:
> nop
>
> .globl ifunc, _start
> .type ifunc, @gnu_indirect_function
> .set ifunc, resolver
>
> _start:
> movq ifunc@GOTPCREL(%rip), %rax
> call ifunc
> # bl ifunc
> eof
> echo 'call ifunc' > ./b.s
> as a.s -o a.o
> gcc -shared -fpic b.s -o b.so
>
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
> ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
>
> ~/Dev/binutils-gdb/Debug/ld/ld-new is a top-of-tree ld.
>
> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o -o a && readelf -W -s a | grep ifunc
> 7: 0000000000401008 0 IFUNC GLOBAL DEFAULT 3 ifunc

.symtab is unused by ld.so.

> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic a.o ./b.so -o a && readelf -W -s a | grep ifunc
> 5: 0000000000401010 0 FUNC GLOBAL DEFAULT 7 ifunc
> 8: 0000000000401010 0 FUNC GLOBAL DEFAULT 7 ifunc

The KEY is that the address of the PLT entry in a PDE (position-dependent
executable) is known at link-time. No IRELATIVE is needed.
> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o -o a && readelf -W --dyn-syms a | grep ifunc
> 5: 0000000000001020 0 IFUNC GLOBAL DEFAULT 8 ifunc
> % ~/Dev/binutils-gdb/Debug/ld/ld-new --export-dynamic -pie a.o ./b.so -o a && readelf -W --dyn-syms a | grep ifunc
> 5: 0000000000001020 0 IFUNC GLOBAL DEFAULT 8 ifunc

The address of the PLT entry in PIE is unknown at link-time.

> In the four combinations, -no-pie a.o ./b.so does the conversion.
>
> Once a resolution is agreed, it'd be good to make aarch64/ppc/x86/etc consistent.
>
>
> 2. When to convert STT_GNU_IFUNC to STT_FUNC?

Only when the address of the PLT entry in the executable is known
at link-time.

> (This is more of an ld question.)
>
> In LLD, for a non-GOT-generating-non-PLT-generating relocation referencing a
> STT_GNU_IFUNC, a canonical PLT entry is created and the symbol type is changed
> to STT_FUNC. (An absolute relocation with 0 addend in a SHF_WRITE section used
> to not trigger a canonical PLT entry. https://reviews.llvm.org/D65995 dropped the
> case.) References from other modules will resolve to the PLT entry.
>
> This approach has pros and cons:
>
> * With a canonical PLT entry, the resolver of a symbol is called only once.
> * If the relocation appears in a non-SHF_WRITE section, a text relocation can be avoided.
> * Relocation types which are not valid dynamic relocation types are supported.
>   GNU ld may report the error: relocation R_X86_64_PC32 against STT_GNU_IFUNC
>   symbol `ifunc' isn't supported
> * References will bind to the canonical PLT entry. A function call needs to
>   jump to the PLT, load the value from the GOT, then do an indirect call.

This allows IFUNC in PDE with external references from DSO.

> Last time I checked, the architectures of GNU ld behaved quite differently. This
> is an area where arch consistency should be improved.

Not all targets support this.

>
> 3. Prefer .rela.dyn over .rela.plt for R_*_IRELATIVE?
>
> ld powerpc produces R_*_IRELATIVE in .rela.dyn.
> glibc powerpc32/powerpc64 do not process R_*_IRELATIVE if they are not in
> [DT_JMPREL, DT_JMPREL+DT_PLTRELSZ).
>
> This may be a good practice because R_*_IRELATIVE is by nature eagerly resolved.
> The potentially lazy .rela.plt is not suitable.
>
> I think at least aarch64 and x86 are still using .rela.plt.
>
> In LLD I followed .rela.dyn and it has been working well https://reviews.llvm.org/D65651 .
>
>
> 4. When to define __rela_iplt_start and __rela_iplt_end?

I invented these. __rela_iplt_start and __rela_iplt_end should be
defined ONLY when there are no dynamic tags. Since all ET_DYN files
have dynamic tags, ET_DYN files shouldn't define them.

> Static pie and static no-pie relocation processing is very different in glibc.
>
> * Static no-pie uses special code to process a magic array delimited by
>   __rela_iplt_start/__rela_iplt_end.
> * Static pie uses self-relocation to take care of R_*_IRELATIVE. The above
>   magic array code is executed as well. If __rela_iplt_start/__rela_iplt_end
>   are defined, we will get 0 < __rela_iplt_start < __rela_iplt_end in
>   csu/libc-start.c. ARCH_SETUP_IREL will crash when resolving the first
>   relocation which has been processed.
>
> LLD defines __rela_iplt_start/__rela_iplt_end in -pie mode (GNU ld doesn't) so

That is wrong. An ET_DYN file shouldn't define them.

> static pie elf/ldconfig segfaults. If we take the patch "Make
> _dl_relocate_static_pie return an int indicating whether it applied relocs."
> from https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/maskray/lld ,
> LLD linked static-pie glibc programs will work well (with another cleanup from
> an unrelated thing: https://sourceware.org/pipermail/libc-alpha/2020-December/121144.html).
>
> My idea is that defining __rela_iplt_start/__rela_iplt_end in -pie is justified.

You are wrong.

> I do see that GNU ld may not want a change (probably in a couple of years)

Never.
> because it does not want to gratuitously break older glibc, but taking the patch
> (probably with description rewritten) is a clarification to glibc code to me.

I strongly object to such a bogus change.

> glibc maintainers can follow up on "[PATCH 0/3] Make glibc build with LLD"
> if you accept that patch.
>
> In a few years, when the compatibility for older glibc can be dropped,
> ld can define __rela_iplt_start in -pie mode to drop the unneeded difference
> in diff -u =(ld.bfd --verbose) =(ld.bfd -pie --verbose) output.

Not going to happen.

-- 
H.J.
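
One way to read the rule stated under point 4 is as a count of how many
times each R_*_IRELATIVE entry gets applied at startup of a static program.
The Python sketch below is only a model under that reading, not glibc or
linker code: the class LinkOutput, both function names, and the reduction
of a link result to an ELF type plus one boolean are all assumptions made
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkOutput:
    """Hypothetical summary of a statically linked output file."""
    name: str
    elf_type: str           # "ET_EXEC" (static no-pie) or "ET_DYN" (static pie)
    has_dynamic_tags: bool  # the thread states that every ET_DYN file has them


def should_define_rela_iplt(out: LinkOutput) -> bool:
    """Rule from the thread: define the symbols only without dynamic tags."""
    return not out.has_dynamic_tags


def irelative_passes(out: LinkOutput, defines_rela_iplt: bool) -> int:
    """Count how often each IRELATIVE entry is applied in this model.

    Static pie (ET_DYN) applies the entries once during self-relocation.
    The startup array loop runs in both kinds of static program and applies
    them again whenever the two symbols delimit a non-empty array.
    """
    self_relocation = 1 if out.elf_type == "ET_DYN" else 0
    array_loop = 1 if defines_rela_iplt else 0
    return self_relocation + array_loop


static_no_pie = LinkOutput("static no-pie", "ET_EXEC", has_dynamic_tags=False)
static_pie = LinkOutput("static pie", "ET_DYN", has_dynamic_tags=True)

for out in (static_no_pie, static_pie):
    passes = irelative_passes(out, should_define_rela_iplt(out))
    print(f"{out.name}: rule followed, passes = {passes}")

# Defining the symbols in an ET_DYN output, as the quoted text says LLD does
# in -pie mode, makes the model apply each entry a second time.
print("static pie with symbols defined, passes =",
      irelative_passes(static_pie, defines_rela_iplt=True))
```

In this model a count of 2 corresponds to the case the quoted text reports
as a crash in ARCH_SETUP_IREL; both static link modes get a count of 1 when
the rule is followed.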