From: Adhemerval Zanella
Date: Tue, 16 Nov 2021 18:07:40 -0300
Subject: Re: Can DT_RELR catch up glibc 2.35?
To: Fangrui Song, Carlos O'Donell, "H.J. Lu"
Cc: libc-alpha@sourceware.org, Rich Felker
In-Reply-To: <20211112074723.uvmlvihlutnib6ik@google.com>

On 12/11/2021 04:47, Fangrui Song wrote:
> I am glad that https://sourceware.org/pipermail/libc-alpha/2021-October/132029.html
> ("[PATCH v2] elf: Support DT_RELR relative relocation format [BZ #27924]") gets
> some traction and many folks acknowledge the size benefit.
> (On my Arch Linux, I measured 8% decrease for my /usr/bin.)

I brought this to the weekly glibc call two weeks ago and, if I recall
correctly, the *main* issue is that we need a proper generic ABI definition
published to move this forward on the glibc side (H.J. Lu was adamant about
this).

From my side, the current status is good enough: multiple systems already
support it (Android, ChromeOS, FreeBSD), and there is a toolchain that can
build and check glibc on at least 4 different ABIs (lld 13 on x86 and arm).
The lack of proper testing when using bfd might be a drawback, since we have
no way to generate such binaries without linker support.

>
> There are two potential issues.
>
> 1. Lack of "Time travel compatibility" detector
> 2. Some folks feel that unable to test with scripts/build-many-glibcs.py is a problem.
>    (ld.lld --pack-dyn-relocs=relr (since July 2018) is the only linker implementation
>    and scripts/build-many-glibcs.py doesn't have an lld configuration)
>
> Let me address them for you.
>
> ---
>
> 1.
>
> "Time travel compatibility" means running a new object on an old system.
> A new object using DT_RELR doesn't have the R_*_RELATIVE part in
> .rel.dyn/.rela.dyn and is destined to crash.
>
> If the GNU ld implementation (which may take a while) adopts an
> undefined versioned .dynsym symbol (e.g. _dl_have_relr
> https://sourceware.org/pipermail/binutils/2021-October/118347.html),
> we can guarantee old ld.so will report an error.
> The undefined symbol needs to be versioned because ld -shared (default
> to --allow-shlib-undefined) does not error on unversioned symbols. Say
> GNU ld adopts something like _dl_have_relr@GLIBC_2.40 . Now it is funny as GNU
> ld needs to know the glibc version "GLIBC_2.40", not just the stem
> glibc-flavored symbol name "_dl_have_relr".

This might be troublesome to backport, since it would require using a higher
symbol version than the baseline one. I am not sure whether distros would be
willing, or plan, to backport such a feature, though.

>
> There are non-Linux OSes which don't like a "_dl_have_relr" symbol name.
> GNU ld would have to provide options in two flavors, one with
> _dl_have_relr@GLIBC_2.40, one without. Among glibc systems, there are
> plenty of distros there which don't rigidly require a friendly
> diagnostic for "time travel compatibility", e.g. I am pretty sure many
> Gentoo Linux folks doing aggressive optimizations know that their
> executables don't run on old systems.

I think even other Linux libcs, such as musl, won't be willing to tie DT_RELR
support to the existence of a loader/libc symbol (musl even less so, because
it explicitly does not support symbol versioning).

>
> An alternative to _dl_have_relr is EI_ABIVERSION. That is probably even
> less appealing because bumping the version locks out many ELF consumers.
> https://maskray.me/blog/2021-10-31-relative-relocations-and-relr#ei_abiversion
> In addition, I noticed that Debian ld.so 2.32 just seems to ignore EI_ABIVERSION.

The problem with EI_ABIVERSION is a limitation of glibc, which only checks
EI_ABIVERSION in open_verify(). That function is not called on default
process execution, where the kernel is the one responsible for loading both
the binary and the interpreter:

---
$ cat test.c
#include
int main () { return 0; }
$ gdb ./test
[...]
(gdb) starti
[...]
process 1420253
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555555000     0x1000        0x0 /tmp/test/test
      0x555555555000     0x555555556000     0x1000     0x1000 /tmp/test/test
      0x555555556000     0x555555557000     0x1000     0x2000 /tmp/test/test
      0x555555557000     0x555555559000     0x2000     0x2000 /tmp/test/test
      0x7ffff7fc2000     0x7ffff7fc6000     0x4000        0x0 [vvar]
      0x7ffff7fc6000     0x7ffff7fc8000     0x2000        0x0 [vdso]
      0x7ffff7fc8000     0x7ffff7fc9000     0x1000        0x0 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fc9000     0x7ffff7ff1000    0x28000     0x1000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ff1000     0x7ffff7ffb000     0xa000    0x29000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffb000     0x7ffff7fff000     0x4000    0x32000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0 [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
---

However, the check is correctly performed for any loaded library, and also
when the executable is run by invoking the loader directly:

---
$ readelf -h test
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 *04* 00 00 00 00 00 00 00
[...]
$ /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ./test
./test: error while loading shared libraries: ./test: ELF file ABI version invalid
---

I think this is a bug, since it basically defeats the EI_ABIVERSION check and
gives programs run by invoking the loader directly different semantics from
those run through the execve syscall.
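
As a rough illustration of the kind of header check under discussion, the
sketch below tests the EI_ABIVERSION byte (offset 8 of e_ident, the byte
marked *04* in the readelf output above) against the highest version a loader
claims to understand. This is not glibc's open_verify(): the helper name
abiversion_ok and its max_supported parameter are made up for this example,
and the constants are written out by hand instead of taken from <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of e_ident as given by the ELF specification.  */
enum { IDENT_SIZE = 16, IDX_ABIVERSION = 8 };

/* Hypothetical helper, not a glibc function.  Return 1 if IDENT starts a
   plausible ELF header whose EI_ABIVERSION does not exceed MAX_SUPPORTED,
   and 0 otherwise.  */
static int
abiversion_ok (const unsigned char *ident, size_t len,
               unsigned char max_supported)
{
  static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

  assert (ident != NULL);
  /* Too short to hold e_ident, or not an ELF file at all.  */
  if (len < IDENT_SIZE || memcmp (ident, magic, sizeof magic) != 0)
    return 0;
  /* Reject objects that ask for an ABI version newer than we know.  */
  return ident[IDX_ABIVERSION] <= max_supported;
}
```

With a header like the one shown by readelf above (ABI version 4) and a
max_supported of 3, the helper returns 0. The point of the sketch is only
that the test itself is trivial; the cost lies in getting hold of the header
bytes for the main executable.
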
As far as I know, the kernel does not pass such information in the auxiliary
vector (we might ask for an AT_EHDR eventually), so a potential fix will cost
us some extra syscalls on every program execution (to read and check the ELF
header with a test similar to the one done in open_verify()). However, it
does *not* help on older glibc, which will still accept such binaries.

>
> % r2 -wqc 'wx 22 @ 8' a; readelf -Wh a | grep ABI; ./a
>   OS/ABI:                            UNIX - GNU
>   ABI Version:                       34
> hello
>

I am not really sure that 'time travel compatibility' is really an issue,
although I have seen reports where users try to use a ChromeOS library on
glibc and it fails in strange ways (most likely due to DT_RELR). If a user is
deploying an *opt-in* feature that requires proper dynamic loader support, I
would expect them to know the environment they are targeting.

So I think the best course of action for this issue is indeed to fix
EI_ABIVERSION and make DT_RELR a new 'libc-abis' entry. We might backport the
EI_ABIVERSION fix to some older releases, and distros that want to use
DT_RELR should do so as well.

> ---
>
> 2.
>
> Fetching a prebuilt llvm-project 13.0.0 which supports many Linux distros is
> difficult. The accessibility of ld.lld 13.0.0 is certainly nice but I wish that
> you don't consider it a blocker as llvm-project 13.0.0 has arrived on many
> distros and will arrive on others soon.
>
> Moreover, I want to emphasize that the core logic is below 30 lines.  It is
> isolated enough and uses sufficiently few interfaces so as NOT to cause
> maintenance burden to other (tricky) parts of ld.so.

The build-many-glibcs.py support would be a nice addition, but I personally
think it should not be a blocker. I have a long-term goal to add DT_RELR
support to binutils, but since I don't have much experience with the
internals of bfd, progress is slow.

However, I think the current lld status is good enough for at least x86 and
arm (no idea about riscv besides the fact it builds).
I need to push my patch to enable powerpc support; however, I am still having
trouble targeting power10 (which seems to be an lld issue).

>
> ---
>
> I installed Gentoo Linux last weekend for fun and chatted with some Gentoo
> Linux folks who use -fuse-ld=lld. I am sending this message because I think I
> should make the feature benefit them earlier. I know some Arch/Debian Linux
> users are interested in the feature as well but they may have to wait longer
> for GNU ld (their system linker) support.
>
> I sincerely hope that the patch can catch up glibc 2.35.  By making the
> functionality available in an older consumer, we just avoid more "time
> travelling compatibility" problems.  Landing the consumer and the producer at
> about the same time is actually the bane of many compatibility problems.
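
For readers who have not looked at the format, the quoted remark that the
core logic is below 30 lines can be illustrated with a small sketch of RELR
decoding. It follows the encoding as described in the generic ABI proposal,
to the best of my understanding: an even entry is an address to relocate, and
an odd entry is a bitmap covering the following 63 words. It is not the code
from the patch under discussion. The name apply_relr is hypothetical, 64-bit
words are assumed, and a plain array stands in for the loaded object, so
offsets are relative to the start of that array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not glibc code.  Add BASE to every word of IMAGE
   selected by the COUNT entries in RELR.  */
static void
apply_relr (uint64_t *image, size_t image_words,
            const uint64_t *relr, size_t count, uint64_t base)
{
  const size_t word = sizeof (uint64_t);
  size_t next = 0;      /* Index of the word the next bitmap starts at.  */
  int have_addr = 0;    /* A bitmap is only valid after an address.  */

  for (size_t n = 0; n < count; n++)
    {
      uint64_t entry = relr[n];
      if ((entry & 1) == 0)
        {
          /* Even entry: byte offset of one word to relocate.  */
          assert (entry % word == 0 && entry / word < image_words);
          next = entry / word;
          image[next++] += base;
          have_addr = 1;
        }
      else
        {
          /* Odd entry: after dropping the marker bit, bit i selects the
             word at index next + i.  */
          assert (have_addr);
          for (size_t i = 0; (entry >>= 1) != 0; i++)
            if ((entry & 1) != 0)
              {
                assert (next + i < image_words);
                image[next + i] += base;
              }
          /* Each bitmap covers 63 words, whether or not they are set.  */
          next += CHAR_BIT * word - 1;
        }
    }
}
```

For example, with relr = { 8, 11 } the first entry relocates the word at byte
offset 8 (index 1), and the bitmap 11 (binary 1011) then relocates the words
at indices 2 and 4. A real loader works on addresses in the mapped object
instead of array indices, but the control flow is the same.
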