public inbox for libc-alpha@sourceware.org
From: Lukasz Majewski <lukma@denx.de>
To: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>,
	Fangrui Song <maskray@google.com>,
	Florian Weimer <fweimer@redhat.com>,
	Joseph Myers <joseph@codesourcery.com>,
	Carlos O'Donell <carlos@redhat.com>,
	Andreas Schwab <schwab@linux-m68k.org>,
	libc-alpha <libc-alpha@sourceware.org>
Subject: Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
Date: Fri, 15 Oct 2021 15:59:19 +0200
Message-ID: <20211015155919.20392561@ktm>
In-Reply-To: <20211015120915.GD1982710@arm.com>


Hi Szabolcs,

> The 10/15/2021 09:54, Lukasz Majewski wrote:
> > This change is a partial revert of commit
> > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > usage of __ehdr_start linker variable to get the address of loaded
> > program.
> > 
> > The elf_machine_load_address() function is declared in the
> > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > _dl_start() entry point for the program. It shall return the load
> > address of the dynamic linker program.
> > 
> > With this revert the 'adr' assembler instruction is used instead of
> > a place holder:
> > 
> > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > 
> > which is pre-set by binutils.
> > 
> > The problem starts when one runs 'prelink' on the rootfs created
> > with for example OE/Yocto.
> > Then the __ehdr_start stays as 0x0, but the ELF header's sections
> > have different addresses - for example 0x41000000 instead of the
> > originally set 0x0.
> > 
> > This is crucial when /sbin/init is executed. Value set in
> > __ehdr_start symbol is not updated. This causes the program to
> > crash very early when ld-linux-armhf.so.3's _dl_start is executed,
> > as calculated offset for loader relocation is going to hit the
> > kernel space (0xf7xxyyyy).
> > 
> > It looks like the correct way to obtain the _dl_start offset on ARM
> > is to use assembler instruction 'adr' at execution time (so the
> > prelink assigned offset is taken into consideration) instead of
> > __ehdr_start.
> > 
> > With this patch we only modify the elf_machine_load_address()
> > function, as it is called very early, before the
> > ld-linux-armhf.so.3 is performing relocation (also its own one).  
> 
> i'd use an explanation like:
> 
> __ehdr_start is a linker created symbol that points to the elf header.
> The elf header is at the beginning of the elf file and normally its
> virtual address is 0 in a shared library.  This means the runtime
> address of __ehdr_start is the load address of the module.  However if
> prelinking is applied to ld.so then all virtual addresses are moved by
> an offset so the runtime address of the elf header becomes the load
> address + prelink offset.  The kernel does not treat prelinked ld.so
> specially so the load address is not 0, it still has to be computed,
> but simply using __ehdr_start no longer gives a correct value for
> that.
> 
> This issue affects all targets with prelinking support, but so far we
> only got reports from OE/Yocto builds for arm that has prelinked
> ld.so.
> 

Thanks for a very detailed description.
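
To make sure I read this correctly, here is a minimal sketch of the
arithmetic in C. It is not glibc code: the helper names are made up for
illustration, and 32-bit addresses are assumed (as on arm).

```c
#include <stdint.h>

/* Hypothetical helper, not glibc code: the run-time address of a symbol
   is its link-time virtual address plus the load bias chosen by the
   kernel when it maps ld.so.  */
static uint32_t
runtime_address (uint32_t load_bias, uint32_t link_time_vaddr)
{
  /* Unsigned 32-bit arithmetic, like Elf32_Addr.  */
  return load_bias + link_time_vaddr;
}

/* What "return &__ehdr_start" effectively yields.  It equals the load
   bias only when the ELF header's link-time address is 0.  */
static uint32_t
load_address_from_ehdr (uint32_t load_bias, uint32_t ehdr_link_vaddr)
{
  return runtime_address (load_bias, ehdr_link_vaddr);
}
```

With the ELF header linked at 0 the second helper returns the load bias
itself; once prelink moves the header to e.g. 0x41000000, the result is
too large by exactly that amount.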

> but i think a better fix is possible than revert:
> 
> ElfW(Addr)
> elf_machine_load_address ()
> {
>   extern ElfW(Dyn) _DYNAMIC[] attribute_hidden;
>   extern ElfW(Dyn) extern_DYNAMIC[] asm ("_DYNAMIC");
> 

So _DYNAMIC = GOT[0] (and it points into the .dynamic section):

objdump -d -j .got ld-linux-armhf.so.3

Disassembly of section .got:

41036fbc <.got+0x41000000>:
41036fbc:       41036ef4        .word   0x41036ef4

So it indeed points into the .dynamic section:
  [16] .dynamic          DYNAMIC         41036ef4

>   /* Uses pc-relative address computation.  */
>   ElfW(Addr) runtime_addr = (ElfW(Addr)) &_DYNAMIC;

I guess that &_DYNAMIC gives an address near the current $pc
(e.g. 0xb6fc9504), i.e. the actual run-time address of the running program.

(A side question - is there any way to read the _DYNAMIC symbol value
directly - via e.g. readelf or objdump?)

> 
>   /* Loads an unrelocated GOT entry.  */
>   ElfW(Addr) linktime_addr = (ElfW(Addr)) &extern_DYNAMIC;
> 

This is the prelink'ed address -> 0x41036ef4 in our case?

>   return runtime_addr - linktime_addr;

And the address to which we shall relocate would be:

0xb6fc9504 - 0x41036ef4 = 0x75f92610 - which is the address to which
ld.so (ld-linux-armhf.so.3) will re-relocate itself?
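
The subtraction above can be written down as a tiny C sketch. The
function name is hypothetical and the values are just the example
numbers from this mail, not output from a real run of ld.so:

```c
#include <stdint.h>

/* Hypothetical stand-in for the proposed elf_machine_load_address():
   the difference between where _DYNAMIC is at run time and where the
   (prelinked) file says it is.  */
static uint32_t
load_bias (uint32_t runtime_addr, uint32_t linktime_addr)
{
  /* Unsigned arithmetic wraps modulo 2^32, which matches Elf32_Addr.  */
  return runtime_addr - linktime_addr;
}

/* Example with the numbers used in this mail:
   load_bias (0xb6fc9504, 0x41036ef4) == 0x75f92610  */
```

For a ld.so that is not prelinked, the link-time address of _DYNAMIC is
just its offset from 0, so the same subtraction still gives the load
address.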

> }
> 
> I expect this to work on most targets and very similar to the code
> that was originally used on other targets: only a new GOT entry is
> introduced instead of using GOT[0].

In fact we only rely on the _DYNAMIC symbol, which points into the
.dynamic section.

> (that new got entry will have a
> relative relocation which means there must be a dynamic section even
> in a static PIE, so i expect _DYNAMIC to be defined.

Ok.

> this also means
> that it's slightly more expensive than &__ehdr_start, so it is for
> targets that want to support prelinked ld.so)
> 
> The original arm code used _dl_start symbol, likely because that's
> within range for the adr instruction for more efficient pc-relative
> computation. But that's a function symbol that requires fixups due
> to thumb interworking issues and is not available in static PIE, so
> using _DYNAMIC sounds better even on arm.

+1.

> 
> > 
> > HW:
> > Hardware name:
> > 	- ARM-Versatile Express (Run with QEMU)
> > 	- Beagle Bone Black
> > 
> > Build Environment: OE/Yocto -> poky
> > SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > 
> > Fixes: BZ #28293
> > ---
> >  sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> >  1 file changed, 25 insertions(+), 3 deletions(-)
> > 
> > diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> > index dfa05eee44..d6e5f1d5ec 100644
> > --- a/sysdeps/arm/dl-machine.h
> > +++ b/sysdeps/arm/dl-machine.h
> > @@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
> >  }
> >  
> >  /* Return the run-time load address of the shared object.  */
> > -static inline ElfW(Addr) __attribute__ ((unused))
> > +static inline Elf32_Addr __attribute__ ((unused))
> >  elf_machine_load_address (void)
> >  {
> > -  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> > -  return (ElfW(Addr)) &__ehdr_start;
> > +  Elf32_Addr pcrel_addr;
> > +#ifdef SHARED
> > +  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> > +  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> > +#else
> > +  extern Elf32_Addr __dl_relocate_static_pie (void *)
> > +    asm ("_dl_relocate_static_pie") attribute_hidden;
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> > +  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> > +#endif
> > +#ifdef __thumb__
> > +  /* Clear the low bit of the function address.
> > +
> > +     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
> > +     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
> > +     which will finish during assembling.  GAS assembler before the fix for
> > +     PR gas/21458 was not setting the lsb but does after that.  Always do the
> > +     strip for both, so the code works with various combinations of glibc and
> > +     Binutils.  */
> > +  got_addr &= ~(Elf32_Addr) 1;
> > +  pcrel_addr &= ~(Elf32_Addr) 1;
> > +#endif
> > +  return pcrel_addr - got_addr;
> >  }
> >  
> >  /* Return the link-time address of _DYNAMIC.  */
> > -- 
> > 2.20.1
> >   




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de


