From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: Fangrui Song <maskray@google.com>,
Carlos O'Donell <carlos@redhat.com>,
GNU C Library <libc-alpha@sourceware.org>,
Rich Felker <dalias@libc.org>
Subject: Re: Can DT_RELR catch up glibc 2.35?
Date: Wed, 17 Nov 2021 09:46:12 -0300
Message-ID: <0732a3cc-8dad-52fb-96e3-ef5da8eb3a8e@linaro.org>
In-Reply-To: <CAMe9rOqOLWsoZK9Vxw1U=Gxu6EoBiBANWZWWM_oM5k4Hf1VTUg@mail.gmail.com>
On 16/11/2021 21:26, H.J. Lu wrote:
> On Tue, Nov 16, 2021 at 1:07 PM Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>>
>>
>>
>> On 12/11/2021 04:47, Fangrui Song wrote:
>>> I am glad that https://sourceware.org/pipermail/libc-alpha/2021-October/132029.html
>>> ("[PATCH v2] elf: Support DT_RELR relative relocation format [BZ #27924]") gets
>>> some traction and many folks acknowledge the size benefit.
>>> (On my Arch Linux, I measured 8% decrease for my /usr/bin.)
>>
>> I brought this to the weekly glibc call two weeks ago and, if I recall correctly,
>> the *main* issue is that we need a proper generic ABI definition published to move this
>> forward on the glibc side (H.J. Lu was adamant about this).
>>
>> For my part, the current status, where multiple systems already support
>> it (Android, ChromeOS, FreeBSD) and a toolchain can build and check
>> glibc on at least 4 different ABIs (lld 13 on x86 and arm), is good enough.
>>
>> The lack of proper testing with bfd might be a drawback, since we have no way
>> to generate such binaries without linker support.
>>
>>>
>>> There are two potential issues.
>>>
>>> 1. Lack of "Time travel compatibility" detector
>>> 2. Some folks feel that being unable to test with scripts/build-many-glibcs.py is a problem.
>>> (ld.lld --pack-dyn-relocs=relr (since July 2018) is the only linker implementation
>>> and scripts/build-many-glibcs.py doesn't have an lld configuration)
>>>
>>> Let me address them for you.
>>>
>>> ---
>>>
>>> 1.
>>>
>>> "Time travel compatibility" means running a new object on an old system.
>>> A new object using DT_RELR doesn't have the R_*_RELATIVE part in
>>> .rel.dyn/.rela.dyn and is destined to crash.
>>>
>>> If the GNU ld implementation (which may take a while) adopts an
>>> undefined versioned .dynsym symbol (e.g. _dl_have_relr
>>> https://sourceware.org/pipermail/binutils/2021-October/118347.html),
>>> we can guarantee old ld.so will report an error.
>>> The undefined symbol needs to be versioned because ld -shared (default
>>> to --allow-shlib-undefined) does not error on unversioned symbols. Say
>>> GNU ld adopts something like _dl_have_relr@GLIBC_2.40 . Now it is funny as GNU
>>> ld needs to know the glibc version "GLIBC_2.40", not just the stem
>>> glibc-flavored symbol name "_dl_have_relr".
>>
>> This might be troublesome to backport, since it would require using a higher
>> version than the baseline one. I am not sure whether distros will be willing or plan
>> to backport such a feature, though.
>>
>>>
>>> There are non-Linux OSes which don't like a "_dl_have_relr" symbol name.
>>> GNU ld would have to provide options in two flavors, one with
>>> _dl_have_relr@GLIBC_2.40, one without. Among glibc systems, there are
>>> plenty of distros that don't rigidly require a friendly
>>> diagnostic for "time travel compatibility", e.g. I am pretty sure many
>>> Gentoo Linux folks doing aggressive optimizations know that their
>>> executables don't run on old systems.
>>
>> I think even other Linux libcs, such as musl, won't be willing to support
>> tying DT_RELR to the existence of a loader/libc symbol (musl even less so because
>> it explicitly does not support symbol versioning).
>>
>>>
>>> An alternative to _dl_have_relr is EI_ABIVERSION. That is probably even
>>> less appealing because bumping the version locks out many ELF consumers.
>>> https://maskray.me/blog/2021-10-31-relative-relocations-and-relr#ei_abiversion
>>> In addition, I noticed that Debian ld.so 2.32 just seems to ignore EI_ABIVERSION.
>>
>> The problem with EI_ABIVERSION is a limitation of glibc, which only checks
>> EI_ABIVERSION in open_verify(), and this is not called on default process
>> execution, where the kernel is the one responsible for loading both the binary
>> and the interpreter:
>>
>> ---
>> $ cat test.c
>> #include <stdio.h>
>>
>> int main ()
>> {
>> return 0;
>> }
>> $ gdb ./test
>> [...]
>> (gdb) starti
>> [...]
>> process 1420253
>> Mapped address spaces:
>>
>> Start Addr End Addr Size Offset objfile
>> 0x555555554000 0x555555555000 0x1000 0x0 /tmp/test/test
>> 0x555555555000 0x555555556000 0x1000 0x1000 /tmp/test/test
>> 0x555555556000 0x555555557000 0x1000 0x2000 /tmp/test/test
>> 0x555555557000 0x555555559000 0x2000 0x2000 /tmp/test/test
>> 0x7ffff7fc2000 0x7ffff7fc6000 0x4000 0x0 [vvar]
>> 0x7ffff7fc6000 0x7ffff7fc8000 0x2000 0x0 [vdso]
>> 0x7ffff7fc8000 0x7ffff7fc9000 0x1000 0x0 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>> 0x7ffff7fc9000 0x7ffff7ff1000 0x28000 0x1000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>> 0x7ffff7ff1000 0x7ffff7ffb000 0xa000 0x29000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>> 0x7ffff7ffb000 0x7ffff7fff000 0x4000 0x32000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>> 0x7ffffffde000 0x7ffffffff000 0x21000 0x0 [stack]
>> 0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
>> ---
>>
>> However, the check is correctly performed for any loaded library and/or if the
>> executable is run by invoking the loader directly:
>>
>> ---
>> $ readelf -h test
>> ELF Header:
>> Magic: 7f 45 4c 46 02 01 01 00 *04* 00 00 00 00 00 00 00
>> [...]
>> $ /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ./test
>> ./test: error while loading shared libraries: ./test: ELF file ABI version invalid
>> ---
>>
>> I think this is a bug, since it basically defeats the EI_ABIVERSION check
>> and gives programs executed by invoking the loader different semantics
>> from those executed through the execve syscall.
>>
>> Afaik the kernel does not pass such information in the auxv vector (we might ask
>> for an AT_EHDR eventually), so a potential fix will cost us some extra
>> syscalls on every program execution (to read and check the ELF header with
>> a test similar to the one done in open_verify()).
>>
>> However, it does *not* help on older glibc, which will still accept such binaries.
>>
>>>
>>> % r2 -wqc 'wx 22 @ 8' a; readelf -Wh a | grep ABI; ./a
>>> OS/ABI: UNIX - GNU
>>> ABI Version: 34
>>> hello
>>>
>>
>> I am not really sure whether 'time travel compatibility' is really an issue,
>> although I saw reports where users tried to use a ChromeOS library on glibc and it
>> failed in some strange ways (most likely due to DT_RELR). If a user is deploying
>> an *opt-in* feature that requires proper dynamic loader support, I would
>> expect them to know the environment they are targeting.
>>
>> So I think the best course of action for this issue is indeed to fix EI_ABIVERSION
>> and make DT_RELR a new 'libc-abis' entry. We might backport the EI_ABIVERSION
>> fix to some older releases, and distros that want to use DT_RELR should do so as well.
>
> Given that EI_ABIVERSION doesn't really work, should we revisit my
> GNU_PROPERTY_1_GLIBC_2_NEEDED proposal:
>
> https://sourceware.org/pipermail/binutils/2021-October/118292.html

GNU_PROPERTY_1_GLIBC_2_NEEDED still does not really help much if the idea
is to backport DT_RELR to older versions, and it still adds logic to the static
linker about glibc symbol versions. I would like the static linker to know as
little as possible about glibc versions; EI_ABIVERSION is way simpler and
already expresses ABI extensions.

I still think that for DT_RELR, instead of inventing another GNU extension, we might
fix EI_ABIVERSION and use it properly. Checking with the kernel, I think it should
be simple: the ELF header is located at AT_PHDR - sizeof (ElfW(Ehdr)), so we
can refactor the tests in open_verify and use them in rtld.c for the case where
execve() is called on the executable.
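
To make the AT_PHDR idea concrete, here is a rough user-space sketch, not the
actual glibc change. It assumes Linux with glibc (getauxval, <link.h>) and that
the program headers immediately follow the ELF header, i.e. e_phoff ==
sizeof (ElfW(Ehdr)), which linkers usually emit but ELF does not require. The
helper names ident_abiversion_ok and main_program_ehdr are made up for
illustration, and the real test in open_verify is more involved than this.

```c
#include <elf.h>
#include <link.h>      /* ElfW() */
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval(), AT_PHDR */

/* Hypothetical helper: return 1 if IDENT starts with the ELF magic and its
   EI_ABIVERSION byte does not exceed MAX_ABIVERSION, else 0.  */
static int
ident_abiversion_ok (const unsigned char *ident, unsigned int max_abiversion)
{
  if (memcmp (ident, ELFMAG, SELFMAG) != 0)
    return 0;
  return ident[EI_ABIVERSION] <= max_abiversion;
}

/* Hypothetical helper: guess where the main program's ELF header is mapped
   by stepping back from AT_PHDR.  This reads memory just below the program
   headers, so it relies on the first PT_LOAD segment covering file offset 0
   (an assumption).  Returns NULL if the guess does not look like a header.  */
static const ElfW(Ehdr) *
main_program_ehdr (void)
{
  unsigned long phdr = getauxval (AT_PHDR);
  if (phdr < sizeof (ElfW(Ehdr)))
    return NULL;

  const ElfW(Ehdr) *ehdr
    = (const ElfW(Ehdr) *) (phdr - sizeof (ElfW(Ehdr)));

  /* Sanity-check the guess: magic must match and the header must say the
     program headers really start right after it.  */
  if (memcmp (ehdr->e_ident, ELFMAG, SELFMAG) != 0
      || ehdr->e_phoff != sizeof (ElfW(Ehdr)))
    return NULL;
  return ehdr;
}
```

A caller in the startup path could then pass ehdr->e_ident to
ident_abiversion_ok with whatever maximum the loader supports and refuse to
continue on failure. With the header shown by readelf above (ABI version 04)
and a maximum of 0, the check would reject the binary.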