From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Fangrui Song <maskray@google.com>,
	Carlos O'Donell <carlos@redhat.com>,
	"H.J. Lu" <hjl.tools@gmail.com>
Cc: libc-alpha@sourceware.org, Rich Felker <dalias@libc.org>
Subject: Re: Can DT_RELR catch up glibc 2.35?
Date: Tue, 16 Nov 2021 18:07:40 -0300
Message-ID: <e9dd6c87-318d-8645-6a9e-46fda37cffb2@linaro.org>
In-Reply-To: <20211112074723.uvmlvihlutnib6ik@google.com>



On 12/11/2021 04:47, Fangrui Song wrote:
> I am glad that https://sourceware.org/pipermail/libc-alpha/2021-October/132029.html
> ("[PATCH v2] elf: Support DT_RELR relative relocation format [BZ #27924]") gets
> some traction and many folks acknowledge the size benefit.
> (On my Arch Linux, I measured 8% decrease for my /usr/bin.)

I brought this up on the weekly glibc call two weeks ago and, if I recall
correctly, the *main* issue is that we need a proper generic ABI definition
published to move this forward on the glibc side (H.J. Lu was adamant about
this).

For my part, the current status is good enough: multiple systems already
support it (Android, ChromeOS, FreeBSD), and there is a toolchain that can
build and check glibc on at least 4 different ABIs (lld 13 on x86 and arm).

The lack of proper testing with bfd might be a drawback, since we have no way
to generate such binaries without linker support.

> 
> There are two potential issues.
> 
> 1. Lack of "Time travel compatibility" detector
> 2. Some folks feel that being unable to test with scripts/build-many-glibcs.py is a problem.
>   (ld.lld --pack-dyn-relocs=relr (since July 2018) is the only linker implementation
>   and scripts/build-many-glibcs.py doesn't have an lld configuration)
> 
> Let me address them for you.
> 
> ---
> 
> 1.
> 
> "Time travel compatibility" means running a new object on an old system.
> A new object using DT_RELR doesn't have the R_*_RELATIVE part in
> .rel.dyn/.rela.dyn and is destined to crash.
> 
> If the GNU ld implementation (which may take a while) adopts an
> undefined versioned .dynsym symbol (e.g. _dl_have_relr
> https://sourceware.org/pipermail/binutils/2021-October/118347.html),
> we can guarantee old ld.so will report an error.
> The undefined symbol needs to be versioned because ld -shared (default
> to --allow-shlib-undefined) does not error on unversioned symbols. Say
> GNU ld adopts something like _dl_have_relr@GLIBC_2.40 . Now it is funny as GNU
> ld needs to know the glibc version "GLIBC_2.40", not just the stem
> glibc-flavored symbol name "_dl_have_relr".

This might be troublesome to backport, since it would require using a symbol
version higher than the baseline one.  I am not sure whether distros will be
willing or planning to backport such a feature, though.

> 
> There are non-Linux OSes which don't like a "_dl_have_relr" symbol name.
> GNU ld would have to provide options in two flavors, one with
> _dl_have_relr@GLIBC_2.40, one without. Among glibc systems, there are
> plenty of distros out there which don't rigidly require a friendly
> diagnostic for "time travel compatibility", e.g. I am pretty sure many
> Gentoo Linux folks doing aggressive optimizations know that their
> executables don't run on old systems.

I think even other Linux libcs, such as musl, won't be willing to tie DT_RELR
support to the existence of a loader/libc symbol (musl even less so, because
it explicitly does not support symbol versioning).

> 
> An alternative to _dl_have_relr is EI_ABIVERSION. That is probably even
> less appealing because bumping the version locks out many ELF consumers.
> https://maskray.me/blog/2021-10-31-relative-relocations-and-relr#ei_abiversion
> In addition, I noticed that Debian ld.so 2.32 just seems to ignore EI_ABIVERSION.

The problem with EI_ABIVERSION is a glibc limitation: glibc only checks
EI_ABIVERSION in open_verify(), and this is not called on default process
execution, where the kernel is the one responsible for loading both the binary
and the interpreter:

---
$ cat test.c 
#include <stdio.h>

int main ()
{
  return 0;
}
$ gdb ./test 
[...]
(gdb) starti
[...]
process 1420253
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555555000     0x1000        0x0 /tmp/test/test
      0x555555555000     0x555555556000     0x1000     0x1000 /tmp/test/test
      0x555555556000     0x555555557000     0x1000     0x2000 /tmp/test/test
      0x555555557000     0x555555559000     0x2000     0x2000 /tmp/test/test
      0x7ffff7fc2000     0x7ffff7fc6000     0x4000        0x0 [vvar]
      0x7ffff7fc6000     0x7ffff7fc8000     0x2000        0x0 [vdso]
      0x7ffff7fc8000     0x7ffff7fc9000     0x1000        0x0 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7fc9000     0x7ffff7ff1000    0x28000     0x1000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ff1000     0x7ffff7ffb000     0xa000    0x29000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffff7ffb000     0x7ffff7fff000     0x4000    0x32000 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0 [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
---

However, the check is correctly done for any loaded library and/or if the
executable is run by invoking the loader directly:

---
$ readelf -h test
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 *04* 00 00 00 00 00 00 00 
[...]
$ /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ./test
./test: error while loading shared libraries: ./test: ELF file ABI version invalid
---

I think this is a bug, since it basically defeats the EI_ABIVERSION check
and gives programs run by invoking the loader directly different semantics
from those run through the execve syscall.

Afaik the kernel does not pass such information in the auxiliary vector (we
might ask for an AT_EHDR eventually), so a potential fix will cost us some
extra syscalls on every program execution (to read and check the ELF header
with a test similar to the one done in open_verify()).

However, it does *not* help on older glibc, which will still accept such binaries.
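
To make the extra check concrete, here is a minimal sketch of an e_ident test
along the lines of what open_verify() does.  It is not glibc's code: the
helper name abiversion_ok and the limit MAX_SUPPORTED_ABIVERSION are made up
for illustration, and the acceptance rule (version 0 is always fine, higher
versions only for ELFOSABI_GNU and only up to a limit) is a simplifying
assumption.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: highest EI_ABIVERSION this (imaginary) loader understands.  */
#define MAX_SUPPORTED_ABIVERSION 3

/* Return 1 if the EI_NIDENT bytes in E_IDENT look acceptable, 0 otherwise.
   Simplified rule, assumed for this sketch only: ABI version 0 is always
   accepted; a non-zero version is accepted only for ELFOSABI_GNU and only
   up to MAX_SUPPORTED_ABIVERSION.  */
int
abiversion_ok (const unsigned char *e_ident)
{
  assert (e_ident != NULL);

  /* Reject anything that is not an ELF file at all.  */
  if (memcmp (e_ident, ELFMAG, SELFMAG) != 0)
    return 0;

  unsigned char osabi = e_ident[EI_OSABI];
  unsigned char abiversion = e_ident[EI_ABIVERSION];

  if (abiversion == 0)
    return 1;
  return osabi == ELFOSABI_GNU && abiversion <= MAX_SUPPORTED_ABIVERSION;
}
```

With this rule the header from the readelf output above (OS/ABI byte 00, ABI
version byte 04) is rejected, and so is a UNIX - GNU header with ABI version
34.  A real fix would have to get those bytes by reading the executable's
header, which is where the extra syscalls come from.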

> 
> % r2 -wqc 'wx 22 @ 8' a; readelf -Wh a | grep ABI; ./a
>   OS/ABI:                            UNIX - GNU
>   ABI Version:                       34
> hello
> 

I am not really sure whether 'time travel compatibility' is really an issue,
although I have seen reports where users try to use ChromeOS libraries on glibc
and they fail in some strange ways (most likely due to DT_RELR).  If a user is
deploying an *opt-in* feature that requires proper dynamic loader support, I
would expect them to know the environment they are targeting.

So I think the best course of action for this issue is indeed to fix
EI_ABIVERSION and make DT_RELR a new 'libc-abis' entry.  We might backport the
EI_ABIVERSION fix to some older releases, and distros that want to use DT_RELR
should do so as well.

> ---
> 
> 2.
> 
> Fetching a prebuilt llvm-project 13.0.0 which supports many Linux distros is
> difficult. The accessibility of ld.lld 13.0.0 is certainly nice but I wish that
> you don't consider it a blocker as llvm-project 13.0.0 has arrived on many
> distros and will arrive on others soon.
> 
> Moreover, I want to emphasize that the core logic is below 30 lines.  It is
> isolated enough and uses sufficiently few interfaces so as NOT to cause
> maintenance burden to other (tricky) parts of ld.so.
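
For readers unfamiliar with the encoding, the decoding loop can be sketched as
follows.  This follows the RELR encoding as described in the generic-ABI
proposal (an even entry is an address, an odd entry is a bitmap covering the
next 63 words on a 64-bit target).  It runs against a simulated image instead
of a real link map, and apply_relr and its parameters are names made up for
this illustration; it is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply RELR-encoded relative relocations to IMAGE, an array of IMAGE_WORDS
   64-bit words standing in for a loaded object.  Offsets in RELR are byte
   offsets from the start of IMAGE; BIAS plays the role of the load bias that
   is added to every relocated word.  Returns the number of words relocated.  */
size_t
apply_relr (uint64_t *image, size_t image_words,
            const uint64_t *relr, size_t relr_count, uint64_t bias)
{
  const size_t bits_per_bitmap = 8 * sizeof (uint64_t) - 1;  /* 63 */
  size_t where = 0;     /* Word index the next bitmap entry starts at.  */
  size_t applied = 0;

  for (size_t i = 0; i < relr_count; i++)
    {
      uint64_t entry = relr[i];

      if ((entry & 1) == 0)
        {
          /* Even entry: the address of one word to relocate.  */
          where = entry / sizeof (uint64_t);
          assert (where < image_words);
          image[where] += bias;
          applied++;
          where++;
        }
      else
        {
          /* Odd entry: bit N+1 set means word WHERE+N is relocated.  */
          uint64_t bitmap = entry >> 1;
          for (size_t j = 0; bitmap != 0; j++, bitmap >>= 1)
            if (bitmap & 1)
              {
                assert (where + j < image_words);
                image[where + j] += bias;
                applied++;
              }
          where += bits_per_bitmap;
        }
    }
  return applied;
}
```

For example, the two entries 0x8 and 0xb relocate the word at byte offset 8
and then, through the bitmap, the words at byte offsets 16 and 32.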

The build-many-glibcs.py support would be a nice addition, but I personally
think it should not be a blocker.  I have a long-term goal of adding DT_RELR
support to binutils, but since I don't have much experience with the internals
of bfd, progress is slow.

However, I think the current lld status is good enough for at least x86
and arm (no idea about riscv besides the fact that it builds).  I need to push
my patch to enable powerpc support; however, I am still having trouble
targeting power10 (which seems to be an lld issue).

> 
> ---
> 
> I installed Gentoo Linux last weekend for fun and chatted with some Gentoo
> Linux folks who use -fuse-ld=lld. I am sending this message because I think I
> should make the feature benefit them earlier. I know some Arch/Debian Linux
> users are interested in the feature as well but they may have to wait longer
> for GNU ld (their system linker) support.
> 
> I sincerely hope that the patch can catch up glibc 2.35.  By making the
> functionality available in an older consumer, we just avoid more "time
> travelling compatibility" problems.  Landing the consumer and the producer at
> about the same time is actually the bane of many compatibility problems.
