From: Fangrui Song <maskray@google.com>
To: Lukasz Majewski <lukma@denx.de>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>,
Florian Weimer <fweimer@redhat.com>,
Joseph Myers <joseph@codesourcery.com>,
Carlos O'Donell <carlos@redhat.com>,
libc-alpha <libc-alpha@sourceware.org>
Subject: Re: [PATCH] dl: Use "adr" assembler command to get proper load address
Date: Tue, 7 Sep 2021 10:44:17 -0700 [thread overview]
Message-ID: <20210907174417.sctsswphsyae4mpc@google.com> (raw)
In-Reply-To: <20210907193227.6047f9cc@ktm>
On 2021-09-07, Lukasz Majewski wrote:
>Hi Fangrui,
>
>> On 2021-09-07, Lukasz Majewski wrote:
>> >This change is a partial revert of commit
>> >bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
>> >__ehdr_start linker variable to get the address of loaded program.
>> >
>> >The elf_machine_load_address() function is declared in the
>> >sysdeps/arm/dl-machine.h header. It is called from _dl_start() entry
>> >point for the program. It shall return the load address of the
>> >dynamic linker program.
>>
>> Yes.
>>
>> >With this revert the 'adr' assembler instruction is used instead of a
>> >place holder:
>> >
>> >arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
>> >00000000 l .note.gnu.build-id 00000000 __ehdr_start
>> >
>> >which shall be pre-set by binutils.
>>
>> I don't understand this. The sh_addr field of the binutils defined
>> __ehdr_start is 0.
>>
>> Declararing __ehdr_start as hidden and accessing it with PC-relative
>> addressing generates computes the runtime address of __ehdr_start,
>> which equals the load base.
>
>Maybe I don't get it - but shouldn't this be filled in by binutils
>(linker ?) when the program is assembled before run?
The linker just sets its link-time address (sh_addr): 0.
The address is computed at runtime.
>Or is the __ehdr_start used just as a relative offset to get the proper
>position of ld-linux-armh.so when called?
It's a relative offset.
% cat a.c
__attribute__((visibility("hidden")))
extern char __ehdr_start[];
char *load_address() { return __ehdr_start; }
% arm-linux-gnueabi-gcc -O1 -fpic a.c -nostdlib
% arm-linux-gnueabi-objdump -dr a.out
a.out: file format elf32-littlearm
Disassembly of section .text:
00000198 <load_address>:
198: e59f0004 ldr r0, [pc, #4] ; 1a4 <load_address+0xc>
19c: e08f0000 add r0, pc, r0
1a0: e12fff1e bx lr
1a4: fffffe5c .word 0xfffffe5c
At runtime, `load_address` will compute the load address.
Note that there is no runtime relocation.
>>
>> What's wrong with it? The objdump -t line doesn't tell why this change
>> is good.
>
>I'm also puzzled why this causes the QEMU versalite board to break.
>(/sbin/init dies with SIGSEGFAULT = 0xb [*])
>
>It looks like the &__ehdr_start is not equal in the result to 'adr'
>assembler code in QEMU.
>
>I also thought that the issue is caused by fs/binfmt_elf.c changes in
>the Linux kernel (as mine is 5.1), so I've tested it with 5.10 and
>5.14. No change.
The kernel change must be unrelated.
The kernel doesn't even know __ehdr_start as a symbol exists.
>The QEMU is pretty recent - it is 6.0.0 version for ARM.
>
>> Can you compare objdump -dr output and show the incorrect
>> code sequences with C assessing __ehdr_start?
>
>Ok. I will explicitly compare _dl_start function with and without this
>patch.
In case it helps, perhaps replace always_inline with noinline
and check why the function is incorrect under your qemu setup.
>>
>> >This is crucial in the QEMU ARM environment for which (when
>> >/sbin/init is executed) values set in __ehdr_start symbol are wrong.
>> >This causes the program to crash very early - when the
>> >/lib/ld-linux-armhf.so.3 is executed as a prerequisite to /sbin/init
>> >execution.
>>
>> >The kernel's fs/binfmt_elf.c is though responsible for setting
>> >up execution environment, not binutils.
>>
>> Whether the symbol entry __ehdr_start exists doesn't matter.
>> The hidden definition does not have any associated relocation.
>
>I'm also wondering if similar issue is visible with "normal" - i.e. not
>QEMU ARM run init.
>
>Maybe the "rough" environment in which /sbin/init is run is the culprit?
>
I had tested running ld.so and elf/ldconfig which covered the usage.
I don't know. It needs some debugging from you:)
>>
>> >It looks like the only robust way to obtain the _dl_start offset is
>> >to use assembler instruction - not rely on values provided by
>> >binutils.
>> >
>> >HW:
>> >Hardware name: ARM-Versatile Express (Run with QEMU)
>> >Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
>> >
>> >When the /sbin/init is setup for run from Linux kernel's very small
>> >environment with LD_DEBUG=all the __ehdr_start is not shown at all.
>> >
>> >Fixes: BZ #28293
>> >---
>> > sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
>> > 1 file changed, 25 insertions(+), 3 deletions(-)
>> >
>> >diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
>> >index eb13cb8b57..58ebef6ecd 100644
>> >--- a/sysdeps/arm/dl-machine.h
>> >+++ b/sysdeps/arm/dl-machine.h
>> >@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
>> > }
>> >
>> > /* Return the run-time load address of the shared object. */
>> >-static inline ElfW(Addr) __attribute__ ((unused))
>> >+static inline Elf32_Addr __attribute__ ((unused))
>> > elf_machine_load_address (void)
>> > {
>> >- extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
>> >- return (ElfW(Addr)) &__ehdr_start;
>> >+ Elf32_Addr pcrel_addr;
>> >+#ifdef SHARED
>> >+ extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
>> >+ Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
>> >+ asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
>> >+#else
>> >+ extern Elf32_Addr __dl_relocate_static_pie (void *)
>> >+ asm ("_dl_relocate_static_pie") attribute_hidden;
>> >+ Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
>> >+ asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
>> >+#endif
>> >+#ifdef __thumb__
>> >+ /* Clear the low bit of the function address.
>> >+
>> >+ NOTE: got_addr is from GOT table whose lsb is always set by
>> >linker if it's
>> >+ Thumb function address. PCREL_ADDR comes from PC-relative
>> >calculation
>> >+ which will finish during assembling. GAS assembler before the
>> >fix for
>> >+ PR gas/21458 was not setting the lsb but does after that.
>> >Always do the
>> >+ strip for both, so the code works with various combinations of
>> >glibc and
>> >+ Binutils. */
>> >+ got_addr &= ~(Elf32_Addr) 1;
>> >+ pcrel_addr &= ~(Elf32_Addr) 1;
>> >+#endif
>> >+ return pcrel_addr - got_addr;
>> > }
>> >
>> > /* Return the link-time address of _DYNAMIC. */
>> >--
>> >2.20.1
>> >
>
>Note:
>
>[*] - the QEMU debugging is pretty difficult with /sbin/init. Qemu has
>good support for kernel debugging (-s -S switches). Then for user space
>program one could use the "on-board" gdb or gdbserver.
>
>However, /sbin/init needs to be debugged from linux and scheduler
>context switches needs to be taken into account to review how the code
>is executed. And such debugging scenario has issue with QEMU-arm.
>
>
>Best regards,
>
>Lukasz Majewski
>
>--
>
>DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
>HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
>Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
next prev parent reply other threads:[~2021-09-07 17:44 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-07 13:16 Lukasz Majewski
2021-09-07 16:49 ` Fangrui Song
2021-09-07 17:32 ` Lukasz Majewski
2021-09-07 17:44 ` Fangrui Song [this message]
2021-09-08 15:05 ` Lukasz Majewski
2021-09-08 17:41 ` Fāng-ruì Sòng
2021-09-08 19:19 ` Adhemerval Zanella
2021-09-08 20:34 ` Lukasz Majewski
2021-09-09 7:18 ` Lukasz Majewski
2021-09-09 9:49 ` Lukasz Majewski
2021-09-10 10:10 ` Lukasz Majewski
2021-09-17 8:29 ` Lukasz Majewski
2021-09-17 13:27 ` Joseph Myers
2021-09-17 16:17 ` Andreas Schwab
2021-09-26 19:58 ` Lukasz Majewski
2021-09-27 16:00 ` Joseph Myers
2021-10-05 7:45 ` Lukasz Majewski
2021-10-06 7:57 ` Fangrui Song
2021-10-06 9:03 ` Lukasz Majewski
2021-10-06 11:43 ` Lukasz Majewski
2021-10-06 12:55 ` Szabolcs Nagy
2021-10-07 9:19 ` Lukasz Majewski
2021-10-07 10:00 ` Lukasz Majewski
2021-10-07 14:15 ` Szabolcs Nagy
2021-10-07 14:58 ` Lukasz Majewski
2021-10-07 14:16 ` Adhemerval Zanella
2021-10-07 14:29 ` H.J. Lu
2021-10-07 15:57 ` Szabolcs Nagy
2021-10-07 16:22 ` H.J. Lu
2021-10-07 16:53 ` Adhemerval Zanella
2021-10-07 17:05 ` H.J. Lu
2021-10-07 17:24 ` Fāng-ruì Sòng
2021-10-08 9:15 ` Szabolcs Nagy
2021-10-11 8:56 ` Lukasz Majewski
2021-10-11 10:18 ` Szabolcs Nagy
2021-10-11 11:47 ` Lukasz Majewski
2021-10-11 12:01 ` H.J. Lu
2021-10-11 13:10 ` Lukasz Majewski
2021-10-11 13:22 ` H.J. Lu
2021-10-11 14:31 ` Lukasz Majewski
2021-10-11 13:34 ` Adhemerval Zanella
2021-10-11 12:48 ` Szabolcs Nagy
2021-10-15 7:54 ` [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM Lukasz Majewski
2021-10-15 12:09 ` Szabolcs Nagy
2021-10-15 12:21 ` H.J. Lu
2021-10-15 12:59 ` Lukasz Majewski
2021-10-15 23:53 ` Fāng-ruì Sòng
2021-10-18 11:08 ` Szabolcs Nagy
2021-10-18 11:35 ` Florian Weimer
2021-10-19 12:03 ` Lukasz Majewski
2021-10-25 10:18 ` Lukasz Majewski
2021-10-25 10:25 ` Florian Weimer
2021-10-25 10:53 ` Lukasz Majewski
2021-10-25 13:34 ` Szabolcs Nagy
2021-10-25 14:04 ` Lukasz Majewski
2021-10-25 15:09 ` Szabolcs Nagy
2021-10-25 17:26 ` Joseph Myers
2021-10-26 13:52 ` Lukasz Majewski
2021-10-26 20:55 ` Joseph Myers
2021-10-27 9:38 ` Szabolcs Nagy
2021-10-25 18:25 ` Lukasz Majewski
2021-10-15 13:59 ` Lukasz Majewski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210907174417.sctsswphsyae4mpc@google.com \
--to=maskray@google.com \
--cc=adhemerval.zanella@linaro.org \
--cc=carlos@redhat.com \
--cc=fweimer@redhat.com \
--cc=joseph@codesourcery.com \
--cc=libc-alpha@sourceware.org \
--cc=lukma@denx.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).