public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] dl: Use "adr" assembler command to get proper load address
@ 2021-09-07 13:16 Lukasz Majewski
  2021-09-07 16:49 ` Fangrui Song
  2021-10-15  7:54 ` [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-07 13:16 UTC (permalink / raw)
  To: Adhemerval Zanella, Fangrui Song, Florian Weimer
  Cc: Joseph Myers, Carlos O'Donell, libc-alpha, Lukasz Majewski

This change is a partial revert of commit
bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
__ehdr_start linker variable to get the address of loaded program.

The elf_machine_load_address() function is declared in the
sysdeps/arm/dl-machine.h header. It is called from _dl_start() entry
point for the program. It shall return the load address of the dynamic
linker program.
With this revert the 'adr' assembler instruction is used instead of a
place holder:

arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
00000000 l       .note.gnu.build-id     00000000      __ehdr_start

which shall be pre-set by binutils.

This is crucial in the QEMU ARM environment for which (when /sbin/init
is executed) values set in __ehdr_start symbol are wrong. This causes
the program to crash very early - when the /lib/ld-linux-armhf.so.3 is
executed as a prerequisite to /sbin/init execution.
The kernel's fs/binfmt_elf.c is though responsible for setting
up execution environment, not binutils.

It looks like the only robust way to obtain the _dl_start offset is to
use assembler instruction - not rely on values provided by binutils.

HW:
Hardware name: ARM-Versatile Express (Run with QEMU)
Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1

When the /sbin/init is setup for run from Linux kernel's very small
environment with LD_DEBUG=all the __ehdr_start is not shown at all.

Fixes: BZ #28293
---
 sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
index eb13cb8b57..58ebef6ecd 100644
--- a/sysdeps/arm/dl-machine.h
+++ b/sysdeps/arm/dl-machine.h
@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
 }
 
 /* Return the run-time load address of the shared object.  */
-static inline ElfW(Addr) __attribute__ ((unused))
+static inline Elf32_Addr __attribute__ ((unused))
 elf_machine_load_address (void)
 {
-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
-  return (ElfW(Addr)) &__ehdr_start;
+  Elf32_Addr pcrel_addr;
+#ifdef SHARED
+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
+#else
+  extern Elf32_Addr __dl_relocate_static_pie (void *)
+    asm ("_dl_relocate_static_pie") attribute_hidden;
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
+#endif
+#ifdef __thumb__
+  /* Clear the low bit of the function address.
+
+     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
+     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
+     which will finish during assembling.  GAS assembler before the fix for
+     PR gas/21458 was not setting the lsb but does after that.  Always do the
+     strip for both, so the code works with various combinations of glibc and
+     Binutils.  */
+  got_addr &= ~(Elf32_Addr) 1;
+  pcrel_addr &= ~(Elf32_Addr) 1;
+#endif
+  return pcrel_addr - got_addr;
 }
 
 /* Return the link-time address of _DYNAMIC.  */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-07 13:16 [PATCH] dl: Use "adr" assembler command to get proper load address Lukasz Majewski
@ 2021-09-07 16:49 ` Fangrui Song
  2021-09-07 17:32   ` Lukasz Majewski
  2021-10-15  7:54 ` [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM Lukasz Majewski
  1 sibling, 1 reply; 62+ messages in thread
From: Fangrui Song @ 2021-09-07 16:49 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

On 2021-09-07, Lukasz Majewski wrote:
>This change is a partial revert of commit
>bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
>__ehdr_start linker variable to get the address of loaded program.
>
>The elf_machine_load_address() function is declared in the
>sysdeps/arm/dl-machine.h header. It is called from _dl_start() entry
>point for the program. It shall return the load address of the dynamic
>linker program.

Yes.

>With this revert the 'adr' assembler instruction is used instead of a
>place holder:
>
>arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
>00000000 l       .note.gnu.build-id     00000000      __ehdr_start
>
>which shall be pre-set by binutils.

I don't understand this. The sh_addr field of the binutils defined
__ehdr_start is 0.

Declararing __ehdr_start as hidden and accessing it with PC-relative
addressing generates computes the runtime address of __ehdr_start, which
equals the load base.

What's wrong with it? The objdump -t line doesn't tell why this change
is good. Can you compare objdump -dr output and show the incorrect code
sequences with C assessing __ehdr_start?

>This is crucial in the QEMU ARM environment for which (when /sbin/init
>is executed) values set in __ehdr_start symbol are wrong. This causes
>the program to crash very early - when the /lib/ld-linux-armhf.so.3 is
>executed as a prerequisite to /sbin/init execution.

>The kernel's fs/binfmt_elf.c is though responsible for setting
>up execution environment, not binutils.

Whether the symbol entry __ehdr_start exists doesn't matter.
The hidden definition does not have any associated relocation.

>It looks like the only robust way to obtain the _dl_start offset is to
>use assembler instruction - not rely on values provided by binutils.
>
>HW:
>Hardware name: ARM-Versatile Express (Run with QEMU)
>Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
>
>When the /sbin/init is setup for run from Linux kernel's very small
>environment with LD_DEBUG=all the __ehdr_start is not shown at all.
>
>Fixes: BZ #28293
>---
> sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> 1 file changed, 25 insertions(+), 3 deletions(-)
>
>diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
>index eb13cb8b57..58ebef6ecd 100644
>--- a/sysdeps/arm/dl-machine.h
>+++ b/sysdeps/arm/dl-machine.h
>@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
> }
>
> /* Return the run-time load address of the shared object.  */
>-static inline ElfW(Addr) __attribute__ ((unused))
>+static inline Elf32_Addr __attribute__ ((unused))
> elf_machine_load_address (void)
> {
>-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
>-  return (ElfW(Addr)) &__ehdr_start;
>+  Elf32_Addr pcrel_addr;
>+#ifdef SHARED
>+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
>+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
>+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
>+#else
>+  extern Elf32_Addr __dl_relocate_static_pie (void *)
>+    asm ("_dl_relocate_static_pie") attribute_hidden;
>+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
>+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
>+#endif
>+#ifdef __thumb__
>+  /* Clear the low bit of the function address.
>+
>+     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
>+     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
>+     which will finish during assembling.  GAS assembler before the fix for
>+     PR gas/21458 was not setting the lsb but does after that.  Always do the
>+     strip for both, so the code works with various combinations of glibc and
>+     Binutils.  */
>+  got_addr &= ~(Elf32_Addr) 1;
>+  pcrel_addr &= ~(Elf32_Addr) 1;
>+#endif
>+  return pcrel_addr - got_addr;
> }
>
> /* Return the link-time address of _DYNAMIC.  */
>-- 
>2.20.1
>

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-07 16:49 ` Fangrui Song
@ 2021-09-07 17:32   ` Lukasz Majewski
  2021-09-07 17:44     ` Fangrui Song
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-07 17:32 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 5845 bytes --]

Hi Fangrui,

> On 2021-09-07, Lukasz Majewski wrote:
> >This change is a partial revert of commit
> >bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
> >__ehdr_start linker variable to get the address of loaded program.
> >
> >The elf_machine_load_address() function is declared in the
> >sysdeps/arm/dl-machine.h header. It is called from _dl_start() entry
> >point for the program. It shall return the load address of the
> >dynamic linker program.  
> 
> Yes.
> 
> >With this revert the 'adr' assembler instruction is used instead of a
> >place holder:
> >
> >arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> >00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> >
> >which shall be pre-set by binutils.  
> 
> I don't understand this. The sh_addr field of the binutils defined
> __ehdr_start is 0.
> 
> Declararing __ehdr_start as hidden and accessing it with PC-relative
> addressing generates computes the runtime address of __ehdr_start,
> which equals the load base.

Maybe I don't get it - but shouldn't this be filled in by binutils
(linker ?) when the program is assembled before run?

Or is the __ehdr_start used just as a relative offset to get the proper
position of ld-linux-armh.so when called?

> 
> What's wrong with it? The objdump -t line doesn't tell why this change
> is good.

I'm also puzzled why this causes the QEMU versalite board to break.
(/sbin/init dies with SIGSEGFAULT = 0xb [*])

It looks like the &__ehdr_start is not equal in the result to 'adr'
assembler code in QEMU.

I also thought that the issue is caused by fs/binfmt_elf.c changes in
the Linux kernel (as mine is 5.1), so I've tested it with 5.10 and
5.14. No change.

The QEMU is pretty recent - it is 6.0.0 version for ARM.

> Can you compare objdump -dr output and show the incorrect
> code sequences with C assessing __ehdr_start?

Ok. I will explicitly compare _dl_start function with and without this
patch.

> 
> >This is crucial in the QEMU ARM environment for which (when
> >/sbin/init is executed) values set in __ehdr_start symbol are wrong.
> >This causes the program to crash very early - when the
> >/lib/ld-linux-armhf.so.3 is executed as a prerequisite to /sbin/init
> >execution.  
> 
> >The kernel's fs/binfmt_elf.c is though responsible for setting
> >up execution environment, not binutils.  
> 
> Whether the symbol entry __ehdr_start exists doesn't matter.
> The hidden definition does not have any associated relocation.

I'm also wondering if similar issue is visible with "normal" - i.e. not
QEMU ARM run init.

Maybe the "rough" environment in which /sbin/init is run is the culprit?

> 
> >It looks like the only robust way to obtain the _dl_start offset is
> >to use assembler instruction - not rely on values provided by
> >binutils.
> >
> >HW:
> >Hardware name: ARM-Versatile Express (Run with QEMU)
> >Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
> >
> >When the /sbin/init is setup for run from Linux kernel's very small
> >environment with LD_DEBUG=all the __ehdr_start is not shown at all.
> >
> >Fixes: BZ #28293
> >---
> > sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> > 1 file changed, 25 insertions(+), 3 deletions(-)
> >
> >diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> >index eb13cb8b57..58ebef6ecd 100644
> >--- a/sysdeps/arm/dl-machine.h
> >+++ b/sysdeps/arm/dl-machine.h
> >@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
> > }
> >
> > /* Return the run-time load address of the shared object.  */
> >-static inline ElfW(Addr) __attribute__ ((unused))
> >+static inline Elf32_Addr __attribute__ ((unused))
> > elf_machine_load_address (void)
> > {
> >-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> >-  return (ElfW(Addr)) &__ehdr_start;
> >+  Elf32_Addr pcrel_addr;
> >+#ifdef SHARED
> >+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> >+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> >+#else
> >+  extern Elf32_Addr __dl_relocate_static_pie (void *)
> >+    asm ("_dl_relocate_static_pie") attribute_hidden;
> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> >+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> >+#endif
> >+#ifdef __thumb__
> >+  /* Clear the low bit of the function address.
> >+
> >+     NOTE: got_addr is from GOT table whose lsb is always set by
> >linker if it's
> >+     Thumb function address.  PCREL_ADDR comes from PC-relative
> >calculation
> >+     which will finish during assembling.  GAS assembler before the
> >fix for
> >+     PR gas/21458 was not setting the lsb but does after that.
> >Always do the
> >+     strip for both, so the code works with various combinations of
> >glibc and
> >+     Binutils.  */
> >+  got_addr &= ~(Elf32_Addr) 1;
> >+  pcrel_addr &= ~(Elf32_Addr) 1;
> >+#endif
> >+  return pcrel_addr - got_addr;
> > }
> >
> > /* Return the link-time address of _DYNAMIC.  */
> >-- 
> >2.20.1
> >  

Note:

[*] - the QEMU debugging is pretty difficult with /sbin/init. Qemu has
good support for kernel debugging (-s -S switches). Then for user space
program one could use the "on-board" gdb or gdbserver.

However, /sbin/init needs to be debugged from linux and scheduler
context switches needs to be taken into account to review how the code
is executed. And such debugging scenario has issue with QEMU-arm.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-07 17:32   ` Lukasz Majewski
@ 2021-09-07 17:44     ` Fangrui Song
  2021-09-08 15:05       ` Lukasz Majewski
  2021-10-05  7:45       ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Fangrui Song @ 2021-09-07 17:44 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha


On 2021-09-07, Lukasz Majewski wrote:
>Hi Fangrui,
>
>> On 2021-09-07, Lukasz Majewski wrote:
>> >This change is a partial revert of commit
>> >bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
>> >__ehdr_start linker variable to get the address of loaded program.
>> >
>> >The elf_machine_load_address() function is declared in the
>> >sysdeps/arm/dl-machine.h header. It is called from _dl_start() entry
>> >point for the program. It shall return the load address of the
>> >dynamic linker program.
>>
>> Yes.
>>
>> >With this revert the 'adr' assembler instruction is used instead of a
>> >place holder:
>> >
>> >arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
>> >00000000 l       .note.gnu.build-id     00000000      __ehdr_start
>> >
>> >which shall be pre-set by binutils.
>>
>> I don't understand this. The sh_addr field of the binutils defined
>> __ehdr_start is 0.
>>
>> Declararing __ehdr_start as hidden and accessing it with PC-relative
>> addressing generates computes the runtime address of __ehdr_start,
>> which equals the load base.
>
>Maybe I don't get it - but shouldn't this be filled in by binutils
>(linker ?) when the program is assembled before run?

The linker just sets its link-time address (sh_addr): 0.
The address is computed at runtime.

>Or is the __ehdr_start used just as a relative offset to get the proper
>position of ld-linux-armh.so when called?

It's a relative offset.

% cat a.c
__attribute__((visibility("hidden")))
extern char __ehdr_start[];

char *load_address() { return __ehdr_start; }

% arm-linux-gnueabi-gcc -O1 -fpic a.c -nostdlib
% arm-linux-gnueabi-objdump -dr a.out

a.out:     file format elf32-littlearm


Disassembly of section .text:

00000198 <load_address>:
  198:   e59f0004        ldr     r0, [pc, #4]    ; 1a4 <load_address+0xc>
  19c:   e08f0000        add     r0, pc, r0
  1a0:   e12fff1e        bx      lr
  1a4:   fffffe5c        .word   0xfffffe5c

At runtime, `load_address` will compute the load address.
Note that there is no runtime relocation.


>>
>> What's wrong with it? The objdump -t line doesn't tell why this change
>> is good.
>
>I'm also puzzled why this causes the QEMU versalite board to break.
>(/sbin/init dies with SIGSEGFAULT = 0xb [*])
>
>It looks like the &__ehdr_start is not equal in the result to 'adr'
>assembler code in QEMU.
>
>I also thought that the issue is caused by fs/binfmt_elf.c changes in
>the Linux kernel (as mine is 5.1), so I've tested it with 5.10 and
>5.14. No change.

The kernel change must be unrelated.
The kernel doesn't even know __ehdr_start as a symbol exists.

>The QEMU is pretty recent - it is 6.0.0 version for ARM.
>
>> Can you compare objdump -dr output and show the incorrect
>> code sequences with C assessing __ehdr_start?
>
>Ok. I will explicitly compare _dl_start function with and without this
>patch.

In case it helps, perhaps replace always_inline with noinline
and check why the function is incorrect under your qemu setup.

>>
>> >This is crucial in the QEMU ARM environment for which (when
>> >/sbin/init is executed) values set in __ehdr_start symbol are wrong.
>> >This causes the program to crash very early - when the
>> >/lib/ld-linux-armhf.so.3 is executed as a prerequisite to /sbin/init
>> >execution.
>>
>> >The kernel's fs/binfmt_elf.c is though responsible for setting
>> >up execution environment, not binutils.
>>
>> Whether the symbol entry __ehdr_start exists doesn't matter.
>> The hidden definition does not have any associated relocation.
>
>I'm also wondering if similar issue is visible with "normal" - i.e. not
>QEMU ARM run init.
>
>Maybe the "rough" environment in which /sbin/init is run is the culprit?
>

I had tested running ld.so and elf/ldconfig which covered the usage.

I don't know. It needs some debugging from you:)

>>
>> >It looks like the only robust way to obtain the _dl_start offset is
>> >to use assembler instruction - not rely on values provided by
>> >binutils.
>> >
>> >HW:
>> >Hardware name: ARM-Versatile Express (Run with QEMU)
>> >Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
>> >
>> >When the /sbin/init is setup for run from Linux kernel's very small
>> >environment with LD_DEBUG=all the __ehdr_start is not shown at all.
>> >
>> >Fixes: BZ #28293
>> >---
>> > sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
>> > 1 file changed, 25 insertions(+), 3 deletions(-)
>> >
>> >diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
>> >index eb13cb8b57..58ebef6ecd 100644
>> >--- a/sysdeps/arm/dl-machine.h
>> >+++ b/sysdeps/arm/dl-machine.h
>> >@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
>> > }
>> >
>> > /* Return the run-time load address of the shared object.  */
>> >-static inline ElfW(Addr) __attribute__ ((unused))
>> >+static inline Elf32_Addr __attribute__ ((unused))
>> > elf_machine_load_address (void)
>> > {
>> >-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
>> >-  return (ElfW(Addr)) &__ehdr_start;
>> >+  Elf32_Addr pcrel_addr;
>> >+#ifdef SHARED
>> >+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
>> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
>> >+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
>> >+#else
>> >+  extern Elf32_Addr __dl_relocate_static_pie (void *)
>> >+    asm ("_dl_relocate_static_pie") attribute_hidden;
>> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
>> >+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
>> >+#endif
>> >+#ifdef __thumb__
>> >+  /* Clear the low bit of the function address.
>> >+
>> >+     NOTE: got_addr is from GOT table whose lsb is always set by
>> >linker if it's
>> >+     Thumb function address.  PCREL_ADDR comes from PC-relative
>> >calculation
>> >+     which will finish during assembling.  GAS assembler before the
>> >fix for
>> >+     PR gas/21458 was not setting the lsb but does after that.
>> >Always do the
>> >+     strip for both, so the code works with various combinations of
>> >glibc and
>> >+     Binutils.  */
>> >+  got_addr &= ~(Elf32_Addr) 1;
>> >+  pcrel_addr &= ~(Elf32_Addr) 1;
>> >+#endif
>> >+  return pcrel_addr - got_addr;
>> > }
>> >
>> > /* Return the link-time address of _DYNAMIC.  */
>> >--
>> >2.20.1
>> >
>
>Note:
>
>[*] - the QEMU debugging is pretty difficult with /sbin/init. Qemu has
>good support for kernel debugging (-s -S switches). Then for user space
>program one could use the "on-board" gdb or gdbserver.
>
>However, /sbin/init needs to be debugged from linux and scheduler
>context switches needs to be taken into account to review how the code
>is executed. And such debugging scenario has issue with QEMU-arm.
>
>
>Best regards,
>
>Lukasz Majewski
>
>--
>
>DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
>HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
>Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de



^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-07 17:44     ` Fangrui Song
@ 2021-09-08 15:05       ` Lukasz Majewski
  2021-09-08 17:41         ` Fāng-ruì Sòng
  2021-09-08 19:19         ` Adhemerval Zanella
  2021-10-05  7:45       ` Lukasz Majewski
  1 sibling, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-08 15:05 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 9892 bytes --]

Hi Fangrui,

> On 2021-09-07, Lukasz Majewski wrote:
> >Hi Fangrui,
> >  
> >> On 2021-09-07, Lukasz Majewski wrote:  
> >> >This change is a partial revert of commit
> >> >bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
> >> >__ehdr_start linker variable to get the address of loaded program.
> >> >
> >> >The elf_machine_load_address() function is declared in the
> >> >sysdeps/arm/dl-machine.h header. It is called from _dl_start()
> >> >entry point for the program. It shall return the load address of
> >> >the dynamic linker program.  
> >>
> >> Yes.
> >>  
> >> >With this revert the 'adr' assembler instruction is used instead
> >> >of a place holder:
> >> >
> >> >arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> >> >00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> >> >
> >> >which shall be pre-set by binutils.  
> >>
> >> I don't understand this. The sh_addr field of the binutils defined
> >> __ehdr_start is 0.
> >>
> >> Declararing __ehdr_start as hidden and accessing it with
> >> PC-relative addressing generates computes the runtime address of
> >> __ehdr_start, which equals the load base.  
> >
> >Maybe I don't get it - but shouldn't this be filled in by binutils
> >(linker ?) when the program is assembled before run?  
> 
> The linker just sets its link-time address (sh_addr): 0.
> The address is computed at runtime.
> 
> >Or is the __ehdr_start used just as a relative offset to get the
> >proper position of ld-linux-armh.so when called?  
> 
> It's a relative offset.
> 
> % cat a.c
> __attribute__((visibility("hidden")))
> extern char __ehdr_start[];
> 
> char *load_address() { return __ehdr_start; }
> 
> % arm-linux-gnueabi-gcc -O1 -fpic a.c -nostdlib
> % arm-linux-gnueabi-objdump -dr a.out
> 
> a.out:     file format elf32-littlearm
> 
> 
> Disassembly of section .text:
> 
> 00000198 <load_address>:
>   198:   e59f0004        ldr     r0, [pc, #4]    ; 1a4
> <load_address+0xc> 19c:   e08f0000        add     r0, pc, r0
>   1a0:   e12fff1e        bx      lr
>   1a4:   fffffe5c        .word   0xfffffe5c
> 
> At runtime, `load_address` will compute the load address.
> Note that there is no runtime relocation.

I think that the problem may be with having the negative value
calculated.

The relevant snipet:

    116c:       bf00            nop
    116e:       bf00            nop
    1170:       bf00            nop
    1172:       f8df 3508       ldr.w   r3, [pc, #1288] ; 167c <_dl_start+0x520>

    1176:       f8df 1508       ldr.w   r1, [pc, #1288] ; 1680 <_dl_start+0x524>

    117a:       447b            add     r3, pc
    117c:       4479            add     r1, pc
    117e:       f8c3 1598       str.w   r1, [r3, #1432] ; 0x598
    1182:       bf00            nop
    1184:       bf00            nop
    1186:       bf00            nop
    1188:       bf00            nop
    118a:       bf00            nop
    118c:       bf00            nop
    118e:       f8df 24f4       ldr.w   r2, [pc, #1268] ; 1684 <_dl_start+0x528>
    1192:       f8d3 5598       ldr.w   r5, [r3, #1432] ; 0x598
    1196:       447a            add     r2, pc
    1198:       442a            add     r2, r5
    119a:       1a52            subs    r2, r2, r1
    119c:       f8c3 25a0       str.w   r2, [r3, #1440] ; 0x5a0
    11a0:       6813            ldr     r3, [r2, #0]


    167c:       0002be92        .word   0x0002be92
    1680:       ffffee80        .word   0xffffee80

The r1 gets the 0xffffee80 (negative offset) value. It is then added to pc
and used to calculate r2.

For working code (with this patch applied) - there are NO such large
values (like aforementioned 0xffffee80). The arithmetic is done on 

   1690:       00000020        .word   0x00000020
   1694:       0002be7e        .word   0x0002be7e

which seems to work.

This shouldn't be a problem as with U2 the arithmetic shall work.
However, I've noticed (with passing LD_DEBUG=all) that the
ld-linux-armhf.so.3 (and init) are relocated twice before execution.

Why do we need to relocate it?

Another question is why on this particular case the large (i.e.
negative) offset matters?

> 
> 
> >>
> >> What's wrong with it? The objdump -t line doesn't tell why this
> >> change is good.  
> >
> >I'm also puzzled why this causes the QEMU versalite board to break.
> >(/sbin/init dies with SIGSEGFAULT = 0xb [*])
> >
> >It looks like the &__ehdr_start is not equal in the result to 'adr'
> >assembler code in QEMU.
> >
> >I also thought that the issue is caused by fs/binfmt_elf.c changes in
> >the Linux kernel (as mine is 5.1), so I've tested it with 5.10 and
> >5.14. No change.  
> 
> The kernel change must be unrelated.
> The kernel doesn't even know __ehdr_start as a symbol exists.
> 
> >The QEMU is pretty recent - it is 6.0.0 version for ARM.
> >  
> >> Can you compare objdump -dr output and show the incorrect
> >> code sequences with C assessing __ehdr_start?  
> >
> >Ok. I will explicitly compare _dl_start function with and without
> >this patch.  
> 
> In case it helps, perhaps replace always_inline with noinline
> and check why the function is incorrect under your qemu setup.
> 
> >>  
> >> >This is crucial in the QEMU ARM environment for which (when
> >> >/sbin/init is executed) values set in __ehdr_start symbol are
> >> >wrong. This causes the program to crash very early - when the
> >> >/lib/ld-linux-armhf.so.3 is executed as a prerequisite to
> >> >/sbin/init execution.  
> >>  
> >> >The kernel's fs/binfmt_elf.c is though responsible for setting
> >> >up execution environment, not binutils.  
> >>
> >> Whether the symbol entry __ehdr_start exists doesn't matter.
> >> The hidden definition does not have any associated relocation.  
> >
> >I'm also wondering if similar issue is visible with "normal" - i.e.
> >not QEMU ARM run init.
> >
> >Maybe the "rough" environment in which /sbin/init is run is the
> >culprit?
> >  
> 
> I had tested running ld.so and elf/ldconfig which covered the usage.
> 
> I don't know. It needs some debugging from you:)
> 
> >>  
> >> >It looks like the only robust way to obtain the _dl_start offset
> >> >is to use assembler instruction - not rely on values provided by
> >> >binutils.
> >> >
> >> >HW:
> >> >Hardware name: ARM-Versatile Express (Run with QEMU)
> >> >Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
> >> >
> >> >When the /sbin/init is setup for run from Linux kernel's very
> >> >small environment with LD_DEBUG=all the __ehdr_start is not shown
> >> >at all.
> >> >
> >> >Fixes: BZ #28293
> >> >---
> >> > sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> >> > 1 file changed, 25 insertions(+), 3 deletions(-)
> >> >
> >> >diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> >> >index eb13cb8b57..58ebef6ecd 100644
> >> >--- a/sysdeps/arm/dl-machine.h
> >> >+++ b/sysdeps/arm/dl-machine.h
> >> >@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr
> >> >*ehdr)
> >> > }
> >> >
> >> > /* Return the run-time load address of the shared object.  */
> >> >-static inline ElfW(Addr) __attribute__ ((unused))
> >> >+static inline Elf32_Addr __attribute__ ((unused))
> >> > elf_machine_load_address (void)
> >> > {
> >> >-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> >> >-  return (ElfW(Addr)) &__ehdr_start;
> >> >+  Elf32_Addr pcrel_addr;
> >> >+#ifdef SHARED
> >> >+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> >> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> >> >+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> >> >+#else
> >> >+  extern Elf32_Addr __dl_relocate_static_pie (void *)
> >> >+    asm ("_dl_relocate_static_pie") attribute_hidden;
> >> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> >> >+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> >> >+#endif
> >> >+#ifdef __thumb__
> >> >+  /* Clear the low bit of the function address.
> >> >+
> >> >+     NOTE: got_addr is from GOT table whose lsb is always set by
> >> >linker if it's
> >> >+     Thumb function address.  PCREL_ADDR comes from PC-relative
> >> >calculation
> >> >+     which will finish during assembling.  GAS assembler before
> >> >the fix for
> >> >+     PR gas/21458 was not setting the lsb but does after that.
> >> >Always do the
> >> >+     strip for both, so the code works with various combinations
> >> >of glibc and
> >> >+     Binutils.  */
> >> >+  got_addr &= ~(Elf32_Addr) 1;
> >> >+  pcrel_addr &= ~(Elf32_Addr) 1;
> >> >+#endif
> >> >+  return pcrel_addr - got_addr;
> >> > }
> >> >
> >> > /* Return the link-time address of _DYNAMIC.  */
> >> >--
> >> >2.20.1
> >> >  
> >
> >Note:
> >
> >[*] - the QEMU debugging is pretty difficult with /sbin/init. Qemu
> >has good support for kernel debugging (-s -S switches). Then for
> >user space program one could use the "on-board" gdb or gdbserver.
> >
> >However, /sbin/init needs to be debugged from linux and scheduler
> >context switches needs to be taken into account to review how the
> >code is executed. And such debugging scenario has issue with
> >QEMU-arm.
> >
> >
> >Best regards,
> >
> >Lukasz Majewski
> >
> >--
> >
> >DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> >HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> >Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> >lukma@denx.de  
> 
> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-08 15:05       ` Lukasz Majewski
@ 2021-09-08 17:41         ` Fāng-ruì Sòng
  2021-09-08 19:19         ` Adhemerval Zanella
  1 sibling, 0 replies; 62+ messages in thread
From: Fāng-ruì Sòng @ 2021-09-08 17:41 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

On Wed, Sep 8, 2021 at 8:05 AM Lukasz Majewski <lukma@denx.de> wrote:
>
> Hi Fangrui,
>
> > On 2021-09-07, Lukasz Majewski wrote:
> > >Hi Fangrui,
> > >
> > >> On 2021-09-07, Lukasz Majewski wrote:
> > >> >This change is a partial revert of commit
> > >> >bca0f5cbc9257c13322b99e55235c4f21ba0bd82 which imposed usage of
> > >> >__ehdr_start linker variable to get the address of loaded program.
> > >> >
> > >> >The elf_machine_load_address() function is declared in the
> > >> >sysdeps/arm/dl-machine.h header. It is called from _dl_start()
> > >> >entry point for the program. It shall return the load address of
> > >> >the dynamic linker program.
> > >>
> > >> Yes.
> > >>
> > >> >With this revert the 'adr' assembler instruction is used instead
> > >> >of a place holder:
> > >> >
> > >> >arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > >> >00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > >> >
> > >> >which shall be pre-set by binutils.
> > >>
> > >> I don't understand this. The sh_addr field of the binutils defined
> > >> __ehdr_start is 0.
> > >>
> > >> Declararing __ehdr_start as hidden and accessing it with
> > >> PC-relative addressing generates computes the runtime address of
> > >> __ehdr_start, which equals the load base.
> > >
> > >Maybe I don't get it - but shouldn't this be filled in by binutils
> > >(linker ?) when the program is assembled before run?
> >
> > The linker just sets its link-time address (sh_addr): 0.
> > The address is computed at runtime.
> >
> > >Or is the __ehdr_start used just as a relative offset to get the
> > >proper position of ld-linux-armh.so when called?
> >
> > It's a relative offset.
> >
> > % cat a.c
> > __attribute__((visibility("hidden")))
> > extern char __ehdr_start[];
> >
> > char *load_address() { return __ehdr_start; }
> >
> > % arm-linux-gnueabi-gcc -O1 -fpic a.c -nostdlib
> > % arm-linux-gnueabi-objdump -dr a.out
> >
> > a.out:     file format elf32-littlearm
> >
> >
> > Disassembly of section .text:
> >
> > 00000198 <load_address>:
> >   198:   e59f0004        ldr     r0, [pc, #4]    ; 1a4
> > <load_address+0xc> 19c:   e08f0000        add     r0, pc, r0
> >   1a0:   e12fff1e        bx      lr
> >   1a4:   fffffe5c        .word   0xfffffe5c
> >
> > At runtime, `load_address` will compute the load address.
> > Note that there is no runtime relocation.
>
> I think that the problem may be with having the negative value
> calculated.

Unfamiliar with arm, but I don't see why a negative value can be a problem.
The address computation just wraps around as expected.

> The relevant snipet:
>
>     116c:       bf00            nop
>     116e:       bf00            nop
>     1170:       bf00            nop
>     1172:       f8df 3508       ldr.w   r3, [pc, #1288] ; 167c <_dl_start+0x520>
>
>     1176:       f8df 1508       ldr.w   r1, [pc, #1288] ; 1680 <_dl_start+0x524>
>
>     117a:       447b            add     r3, pc
>     117c:       4479            add     r1, pc
>     117e:       f8c3 1598       str.w   r1, [r3, #1432] ; 0x598
>     1182:       bf00            nop
>     1184:       bf00            nop
>     1186:       bf00            nop
>     1188:       bf00            nop
>     118a:       bf00            nop
>     118c:       bf00            nop
>     118e:       f8df 24f4       ldr.w   r2, [pc, #1268] ; 1684 <_dl_start+0x528>
>     1192:       f8d3 5598       ldr.w   r5, [r3, #1432] ; 0x598
>     1196:       447a            add     r2, pc
>     1198:       442a            add     r2, r5
>     119a:       1a52            subs    r2, r2, r1
>     119c:       f8c3 25a0       str.w   r2, [r3, #1440] ; 0x5a0
>     11a0:       6813            ldr     r3, [r2, #0]
>
>
>     167c:       0002be92        .word   0x0002be92
>     1680:       ffffee80        .word   0xffffee80
>
> The r1 gets the 0xffffee80 (negative offset) value. It is then added to pc
> and used to calculate r2.
>
> For working code (with this patch applied) - there are NO such large
> values (like aforementioned 0xffffee80). The arithmetic is done on
>
>    1690:       00000020        .word   0x00000020
>    1694:       0002be7e        .word   0x0002be7e
>
> which seems to work.
>
> This shouldn't be a problem as with U2 the arithmetic shall work.
> However, I've noticed (with passing LD_DEBUG=all) that the
> ld-linux-armhf.so.3 (and init) are relocated twice before execution.
>
> Why do we need to relocate it?

No idea...

> Another question is why on this particular case the large (i.e.
> negative) offset matters?

No idea...

> >
> >
> > >>
> > >> What's wrong with it? The objdump -t line doesn't tell why this
> > >> change is good.
> > >
> > >I'm also puzzled why this causes the QEMU versalite board to break.
> > >(/sbin/init dies with SIGSEGFAULT = 0xb [*])
> > >
> > >It looks like the &__ehdr_start is not equal in the result to 'adr'
> > >assembler code in QEMU.
> > >
> > >I also thought that the issue is caused by fs/binfmt_elf.c changes in
> > >the Linux kernel (as mine is 5.1), so I've tested it with 5.10 and
> > >5.14. No change.
> >
> > The kernel change must be unrelated.
> > The kernel doesn't even know __ehdr_start as a symbol exists.
> >
> > >The QEMU is pretty recent - it is 6.0.0 version for ARM.
> > >
> > >> Can you compare objdump -dr output and show the incorrect
> > >> code sequences with C assessing __ehdr_start?
> > >
> > >Ok. I will explicitly compare _dl_start function with and without
> > >this patch.
> >
> > In case it helps, perhaps replace always_inline with noinline
> > and check why the function is incorrect under your qemu setup.
> >
> > >>
> > >> >This is crucial in the QEMU ARM environment for which (when
> > >> >/sbin/init is executed) values set in __ehdr_start symbol are
> > >> >wrong. This causes the program to crash very early - when the
> > >> >/lib/ld-linux-armhf.so.3 is executed as a prerequisite to
> > >> >/sbin/init execution.
> > >>
> > >> >The kernel's fs/binfmt_elf.c is though responsible for setting
> > >> >up execution environment, not binutils.
> > >>
> > >> Whether the symbol entry __ehdr_start exists doesn't matter.
> > >> The hidden definition does not have any associated relocation.
> > >
> > >I'm also wondering if similar issue is visible with "normal" - i.e.
> > >not QEMU ARM run init.
> > >
> > >Maybe the "rough" environment in which /sbin/init is run is the
> > >culprit?
> > >
> >
> > I had tested running ld.so and elf/ldconfig which covered the usage.
> >
> > I don't know. It needs some debugging from you:)
> >
> > >>
> > >> >It looks like the only robust way to obtain the _dl_start offset
> > >> >is to use assembler instruction - not rely on values provided by
> > >> >binutils.
> > >> >
> > >> >HW:
> > >> >Hardware name: ARM-Versatile Express (Run with QEMU)
> > >> >Tested (affected) kernels v5.1.12, v5.10.62 and v5.14.1
> > >> >
> > >> >When the /sbin/init is setup for run from Linux kernel's very
> > >> >small environment with LD_DEBUG=all the __ehdr_start is not shown
> > >> >at all.
> > >> >
> > >> >Fixes: BZ #28293
> > >> >---
> > >> > sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> > >> > 1 file changed, 25 insertions(+), 3 deletions(-)
> > >> >
> > >> >diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> > >> >index eb13cb8b57..58ebef6ecd 100644
> > >> >--- a/sysdeps/arm/dl-machine.h
> > >> >+++ b/sysdeps/arm/dl-machine.h
> > >> >@@ -38,11 +38,33 @@ elf_machine_matches_host (const Elf32_Ehdr
> > >> >*ehdr)
> > >> > }
> > >> >
> > >> > /* Return the run-time load address of the shared object.  */
> > >> >-static inline ElfW(Addr) __attribute__ ((unused))
> > >> >+static inline Elf32_Addr __attribute__ ((unused))
> > >> > elf_machine_load_address (void)
> > >> > {
> > >> >-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> > >> >-  return (ElfW(Addr)) &__ehdr_start;
> > >> >+  Elf32_Addr pcrel_addr;
> > >> >+#ifdef SHARED
> > >> >+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> > >> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> > >> >+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> > >> >+#else
> > >> >+  extern Elf32_Addr __dl_relocate_static_pie (void *)
> > >> >+    asm ("_dl_relocate_static_pie") attribute_hidden;
> > >> >+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> > >> >+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> > >> >+#endif
> > >> >+#ifdef __thumb__
> > >> >+  /* Clear the low bit of the function address.
> > >> >+
> > >> >+     NOTE: got_addr is from GOT table whose lsb is always set by
> > >> >linker if it's
> > >> >+     Thumb function address.  PCREL_ADDR comes from PC-relative
> > >> >calculation
> > >> >+     which will finish during assembling.  GAS assembler before
> > >> >the fix for
> > >> >+     PR gas/21458 was not setting the lsb but does after that.
> > >> >Always do the
> > >> >+     strip for both, so the code works with various combinations
> > >> >of glibc and
> > >> >+     Binutils.  */
> > >> >+  got_addr &= ~(Elf32_Addr) 1;
> > >> >+  pcrel_addr &= ~(Elf32_Addr) 1;
> > >> >+#endif
> > >> >+  return pcrel_addr - got_addr;
> > >> > }
> > >> >
> > >> > /* Return the link-time address of _DYNAMIC.  */
> > >> >--
> > >> >2.20.1
> > >> >
> > >
> > >Note:
> > >
> > >[*] - the QEMU debugging is pretty difficult with /sbin/init. Qemu
> > >has good support for kernel debugging (-s -S switches). Then for
> > >user space program one could use the "on-board" gdb or gdbserver.
> > >
> > >However, /sbin/init needs to be debugged from linux and scheduler
> > >context switches needs to be taken into account to review how the
> > >code is executed. And such debugging scenario has issue with
> > >QEMU-arm.
> > >
> > >
> > >Best regards,
> > >
> > >Lukasz Majewski
> > >
> > >--
> > >
> > >DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> > >HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> > >Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > >lukma@denx.de
> >
> >
>
>
>
>
> Best regards,
>
> Lukasz Majewski
>
> --
>
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de



-- 
宋方睿

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-08 15:05       ` Lukasz Majewski
  2021-09-08 17:41         ` Fāng-ruì Sòng
@ 2021-09-08 19:19         ` Adhemerval Zanella
  2021-09-08 20:34           ` Lukasz Majewski
  1 sibling, 1 reply; 62+ messages in thread
From: Adhemerval Zanella @ 2021-09-08 19:19 UTC (permalink / raw)
  To: Lukasz Majewski, Fangrui Song
  Cc: Florian Weimer, Joseph Myers, Carlos O'Donell, libc-alpha



On 08/09/2021 12:05, Lukasz Majewski wrote:
> 
> The r1 gets the 0xffffee80 (negative offset) value. It is then added to pc
> and used to calculate r2.
> 
> For working code (with this patch applied) - there are NO such large
> values (like aforementioned 0xffffee80). The arithmetic is done on 
> 
>    1690:       00000020        .word   0x00000020
>    1694:       0002be7e        .word   0x0002be7e
> 
> which seems to work.
> 
> This shouldn't be a problem as with U2 the arithmetic shall work.
> However, I've noticed (with passing LD_DEBUG=all) that the
> ld-linux-armhf.so.3 (and init) are relocated twice before execution.
> 
> Why do we need to relocate it?
> 
> Another question is why on this particular case the large (i.e.
> negative) offset matters?

I think it is highly unlikely the negative offset plays any role here.
Do you have a working example to trigger this issue?  I am currently
testing arm for different compilers (gcc 6.2, gcc 10, gcc 11) and
with different configurations (armv5, armv6, armv7, with and without
thumb) and I haven't see any issue so far.

It might binutils related, which version are you using?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-08 19:19         ` Adhemerval Zanella
@ 2021-09-08 20:34           ` Lukasz Majewski
  2021-09-09  7:18             ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-08 20:34 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: Fangrui Song, Florian Weimer, Joseph Myers, Carlos O'Donell,
	libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1758 bytes --]

Hi Adhemerval,

> On 08/09/2021 12:05, Lukasz Majewski wrote:
> > 
> > The r1 gets the 0xffffee80 (negative offset) value. It is then
> > added to pc and used to calculate r2.
> > 
> > For working code (with this patch applied) - there are NO such large
> > values (like aforementioned 0xffffee80). The arithmetic is done on 
> > 
> >    1690:       00000020        .word   0x00000020
> >    1694:       0002be7e        .word   0x0002be7e
> > 
> > which seems to work.
> > 
> > This shouldn't be a problem as with U2 the arithmetic shall work.
> > However, I've noticed (with passing LD_DEBUG=all) that the
> > ld-linux-armhf.so.3 (and init) are relocated twice before execution.
> > 
> > Why do we need to relocate it?
> > 
> > Another question is why on this particular case the large (i.e.
> > negative) offset matters?  
> 
> I think it is highly unlikely the negative offset plays any role here.
> Do you have a working example to trigger this issue? 

I'm just using QEMU (runqemu -d y2038-image-devel nographic) from
meta-y2038 build.

Recent versions: 
poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
qemu_6.0.0
binutils_2.37
gcc_11.2

The QEMU board uses Cortex-A9 core.

> I am currently
> testing arm for different compilers (gcc 6.2, gcc 10, gcc 11) and
> with different configurations (armv5, armv6, armv7, with and without
> thumb) and I haven't see any issue so far.
> 
> It might binutils related, which version are you using?




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-08 20:34           ` Lukasz Majewski
@ 2021-09-09  7:18             ` Lukasz Majewski
  2021-09-09  9:49               ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-09  7:18 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Florian Weimer, libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 2479 bytes --]

On Wed, 8 Sep 2021 22:34:21 +0200
Lukasz Majewski <lukma@denx.de> wrote:

> Hi Adhemerval,
> 
> > On 08/09/2021 12:05, Lukasz Majewski wrote:  
> > > 
> > > The r1 gets the 0xffffee80 (negative offset) value. It is then
> > > added to pc and used to calculate r2.
> > > 
> > > For working code (with this patch applied) - there are NO such
> > > large values (like aforementioned 0xffffee80). The arithmetic is
> > > done on 
> > > 
> > >    1690:       00000020        .word   0x00000020
> > >    1694:       0002be7e        .word   0x0002be7e
> > > 
> > > which seems to work.
> > > 
> > > This shouldn't be a problem as with U2 the arithmetic shall work.
> > > However, I've noticed (with passing LD_DEBUG=all) that the
> > > ld-linux-armhf.so.3 (and init) are relocated twice before
> > > execution.
> > > 
> > > Why do we need to relocate it?
> > > 
> > > Another question is why on this particular case the large (i.e.
> > > negative) offset matters?    
> > 
> > I think it is highly unlikely the negative offset plays any role
> > here. Do you have a working example to trigger this issue?   
> 
> I'm just using QEMU (runqemu -d y2038-image-devel nographic) from
> meta-y2038 build.
> 
> Recent versions: 
> poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> qemu_6.0.0
> binutils_2.37
> gcc_11.2
> 
> The QEMU board uses Cortex-A9 core.

I've updated the poky to be the newest -master:
master:50293eec9a7d0602b748220ab100b070b8d3432a

No change.

I will try the same setup with Cortex-A5 core - maybe there is some
kind of subtle handling of such emulation in qemu?

> 
> > I am currently
> > testing arm for different compilers (gcc 6.2, gcc 10, gcc 11) and
> > with different configurations (armv5, armv6, armv7, with and without
> > thumb) and I haven't see any issue so far.
> > 
> > It might binutils related, which version are you using?  
> 
> 
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-09  7:18             ` Lukasz Majewski
@ 2021-09-09  9:49               ` Lukasz Majewski
  2021-09-10 10:10                 ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-09  9:49 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Florian Weimer, libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 3153 bytes --]

On Thu, 9 Sep 2021 09:18:06 +0200
Lukasz Majewski <lukma@denx.de> wrote:

> On Wed, 8 Sep 2021 22:34:21 +0200
> Lukasz Majewski <lukma@denx.de> wrote:
> 
> > Hi Adhemerval,
> >   
> > > On 08/09/2021 12:05, Lukasz Majewski wrote:    
> > > > 
> > > > The r1 gets the 0xffffee80 (negative offset) value. It is then
> > > > added to pc and used to calculate r2.
> > > > 
> > > > For working code (with this patch applied) - there are NO such
> > > > large values (like aforementioned 0xffffee80). The arithmetic is
> > > > done on 
> > > > 
> > > >    1690:       00000020        .word   0x00000020
> > > >    1694:       0002be7e        .word   0x0002be7e
> > > > 
> > > > which seems to work.
> > > > 
> > > > This shouldn't be a problem as with U2 the arithmetic shall
> > > > work. However, I've noticed (with passing LD_DEBUG=all) that the
> > > > ld-linux-armhf.so.3 (and init) are relocated twice before
> > > > execution.
> > > > 
> > > > Why do we need to relocate it?
> > > > 
> > > > Another question is why on this particular case the large (i.e.
> > > > negative) offset matters?      
> > > 
> > > I think it is highly unlikely the negative offset plays any role
> > > here. Do you have a working example to trigger this issue?     
> > 
> > I'm just using QEMU (runqemu -d y2038-image-devel nographic) from
> > meta-y2038 build.
> > 
> > Recent versions: 
> > poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > qemu_6.0.0
> > binutils_2.37
> > gcc_11.2
> > 
> > The QEMU board uses Cortex-A9 core.  
> 
> I've updated the poky to be the newest -master:
> master:50293eec9a7d0602b748220ab100b070b8d3432a
> 
> No change.
> 
> I will try the same setup with Cortex-A5 core - maybe there is some
> kind of subtle handling of such emulation in qemu?

No change. On the qemuarm yocto/OE MACHINE the same situation is
observed.

After applying this patch the board works without issues.

> 
> >   
> > > I am currently
> > > testing arm for different compilers (gcc 6.2, gcc 10, gcc 11) and
> > > with different configurations (armv5, armv6, armv7, with and
> > > without thumb) and I haven't see any issue so far.
> > > 
> > > It might binutils related, which version are you using?    
> > 
> > 
> > 
> > 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > --
> > 
> > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > lukma@denx.de  
> 
> 
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-09  9:49               ` Lukasz Majewski
@ 2021-09-10 10:10                 ` Lukasz Majewski
  2021-09-17  8:29                   ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-10 10:10 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Florian Weimer, libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 3819 bytes --]

On Thu, 9 Sep 2021 11:49:36 +0200
Lukasz Majewski <lukma@denx.de> wrote:

> On Thu, 9 Sep 2021 09:18:06 +0200
> Lukasz Majewski <lukma@denx.de> wrote:
> 
> > On Wed, 8 Sep 2021 22:34:21 +0200
> > Lukasz Majewski <lukma@denx.de> wrote:
> >   
> > > Hi Adhemerval,
> > >     
> > > > On 08/09/2021 12:05, Lukasz Majewski wrote:      
> > > > > 
> > > > > The r1 gets the 0xffffee80 (negative offset) value. It is then
> > > > > added to pc and used to calculate r2.
> > > > > 
> > > > > For working code (with this patch applied) - there are NO such
> > > > > large values (like aforementioned 0xffffee80). The arithmetic
> > > > > is done on 
> > > > > 
> > > > >    1690:       00000020        .word   0x00000020
> > > > >    1694:       0002be7e        .word   0x0002be7e
> > > > > 
> > > > > which seems to work.
> > > > > 
> > > > > This shouldn't be a problem as with U2 the arithmetic shall
> > > > > work. However, I've noticed (with passing LD_DEBUG=all) that
> > > > > the ld-linux-armhf.so.3 (and init) are relocated twice before
> > > > > execution.
> > > > > 
> > > > > Why do we need to relocate it?
> > > > > 
> > > > > Another question is why on this particular case the large
> > > > > (i.e. negative) offset matters?        
> > > > 
> > > > I think it is highly unlikely the negative offset plays any role
> > > > here. Do you have a working example to trigger this issue?
> > > >  
> > > 
> > > I'm just using QEMU (runqemu -d y2038-image-devel nographic) from
> > > meta-y2038 build.
> > > 
> > > Recent versions: 
> > > poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > > qemu_6.0.0
> > > binutils_2.37
> > > gcc_11.2
> > > 
> > > The QEMU board uses Cortex-A9 core.    
> > 
> > I've updated the poky to be the newest -master:
> > master:50293eec9a7d0602b748220ab100b070b8d3432a
> > 
> > No change.
> > 
> > I will try the same setup with Cortex-A5 core - maybe there is some
> > kind of subtle handling of such emulation in qemu?  
> 
> No change. On the qemuarm yocto/OE MACHINE the same situation is
> observed.
> 

Do you have any more ideas where to look for solution of this problem?

> After applying this patch the board works without issues.
> 
> >   
> > >     
> > > > I am currently
> > > > testing arm for different compilers (gcc 6.2, gcc 10, gcc 11)
> > > > and with different configurations (armv5, armv6, armv7, with and
> > > > without thumb) and I haven't see any issue so far.
> > > > 
> > > > It might binutils related, which version are you using?      
> > > 
> > > 
> > > 
> > > 
> > > Best regards,
> > > 
> > > Lukasz Majewski
> > > 
> > > --
> > > 
> > > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > > lukma@denx.de    
> > 
> > 
> > 
> > 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > --
> > 
> > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > lukma@denx.de  
> 
> 
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-10 10:10                 ` Lukasz Majewski
@ 2021-09-17  8:29                   ` Lukasz Majewski
  2021-09-17 13:27                     ` Joseph Myers
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-17  8:29 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: Florian Weimer, libc-alpha, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 4992 bytes --]

Dear Community,

> On Thu, 9 Sep 2021 11:49:36 +0200
> Lukasz Majewski <lukma@denx.de> wrote:
> 
> > On Thu, 9 Sep 2021 09:18:06 +0200
> > Lukasz Majewski <lukma@denx.de> wrote:
> >   
> > > On Wed, 8 Sep 2021 22:34:21 +0200
> > > Lukasz Majewski <lukma@denx.de> wrote:
> > >     
> > > > Hi Adhemerval,
> > > >       
> > > > > On 08/09/2021 12:05, Lukasz Majewski wrote:        
> > > > > > 
> > > > > > The r1 gets the 0xffffee80 (negative offset) value. It is
> > > > > > then added to pc and used to calculate r2.
> > > > > > 
> > > > > > For working code (with this patch applied) - there are NO
> > > > > > such large values (like aforementioned 0xffffee80). The
> > > > > > arithmetic is done on 
> > > > > > 
> > > > > >    1690:       00000020        .word   0x00000020
> > > > > >    1694:       0002be7e        .word   0x0002be7e
> > > > > > 
> > > > > > which seems to work.
> > > > > > 
> > > > > > This shouldn't be a problem as with U2 the arithmetic shall
> > > > > > work. However, I've noticed (with passing LD_DEBUG=all) that
> > > > > > the ld-linux-armhf.so.3 (and init) are relocated twice
> > > > > > before execution.
> > > > > > 
> > > > > > Why do we need to relocate it?
> > > > > > 
> > > > > > Another question is why on this particular case the large
> > > > > > (i.e. negative) offset matters?          
> > > > > 
> > > > > I think it is highly unlikely the negative offset plays any
> > > > > role here. Do you have a working example to trigger this
> > > > > issue? 
> > > > 
> > > > I'm just using QEMU (runqemu -d y2038-image-devel nographic)
> > > > from meta-y2038 build.
> > > > 
> > > > Recent versions: 
> > > > poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > > > qemu_6.0.0
> > > > binutils_2.37
> > > > gcc_11.2
> > > > 
> > > > The QEMU board uses Cortex-A9 core.      
> > > 
> > > I've updated the poky to be the newest -master:
> > > master:50293eec9a7d0602b748220ab100b070b8d3432a
> > > 
> > > No change.
> > > 
> > > I will try the same setup with Cortex-A5 core - maybe there is
> > > some kind of subtle handling of such emulation in qemu?    
> > 
> > No change. On the qemuarm yocto/OE MACHINE the same situation is
> > observed.
> >   
> 
> Do you have any more ideas where to look for solution of this problem?

Can we make a decision regarding this fix?

The only finding, which I get, is the large negative offset added to the
pc in the glibc broken (current master) code, which causes QEMU system
to fail. 

I've shared this problem with the QEMU community [1], but no reply
received till today.


Shall we:

1. Apply this patch or

2. Revert the conversion (whole) patch [2] for ARM.

3. Wait till QEMU response for the problem?



Links:

[1] -
https://lists.nongnu.org/archive/html/qemu-devel/2021-09/msg03332.html

[2] - SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82

> 
> > After applying this patch the board works without issues.
> >   
> > >     
> > > >       
> > > > > I am currently
> > > > > testing arm for different compilers (gcc 6.2, gcc 10, gcc 11)
> > > > > and with different configurations (armv5, armv6, armv7, with
> > > > > and without thumb) and I haven't see any issue so far.
> > > > > 
> > > > > It might binutils related, which version are you using?
> > > > >  
> > > > 
> > > > 
> > > > 
> > > > 
> > > > Best regards,
> > > > 
> > > > Lukasz Majewski
> > > > 
> > > > --
> > > > 
> > > > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > > > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194
> > > > Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax:
> > > > (+49)-8142-66989-80 Email: lukma@denx.de      
> > > 
> > > 
> > > 
> > > 
> > > Best regards,
> > > 
> > > Lukasz Majewski
> > > 
> > > --
> > > 
> > > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > > lukma@denx.de    
> > 
> > 
> > 
> > 
> > Best regards,
> > 
> > Lukasz Majewski
> > 
> > --
> > 
> > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > lukma@denx.de  
> 
> 
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-17  8:29                   ` Lukasz Majewski
@ 2021-09-17 13:27                     ` Joseph Myers
  2021-09-17 16:17                       ` Andreas Schwab
  2021-09-26 19:58                       ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Joseph Myers @ 2021-09-17 13:27 UTC (permalink / raw)
  To: Lukasz Majewski; +Cc: Adhemerval Zanella, Florian Weimer, libc-alpha

On Fri, 17 Sep 2021, Lukasz Majewski wrote:

> Can we make a decision regarding this fix?

I don't think we've yet seen a thorough analysis of the issue.

Involvement of QEMU is probably not relevant.  Either the dynamic linker 
binary generated is correct, according to the instruction semantics in the 
Arm ARM and the relocation semantics in AAELF, or it isn't.  If it's 
incorrect, either the (static) linker inputs are correct and there's a 
linker bug, or the linker inputs are incorrect ("correct" here might be a 
bit fuzzier, involving things that are not fully specified).

So start from working out whether the generated binary is correct or not, 
with an appropriate description of the relevant instruction sequences 
executed and why they do or do not work in each case.

QEMU only becomes relevant if you have a binary that works when executed 
on hardware but not on QEMU (or vice versa).

(a) Do you have such a binary working on hardware but not on QEMU?

(b) Have you tested using the same binaries with both QEMU and hardware?

(c) Have you tested with the glibc testsuite, using a prebuilt system 
image with known good glibc rather than building init with the new QEMU?  
That's a better starting point than building a whole system with the new 
glibc, though if the result is "all tests fail execution" you may not be 
much further forward (but at least you can run the dynamic linker, good 
and bad versions, under gdbserver, and step individual instructions to see 
exactly what happens when the problem code is executed).

(d) Likewise, but with --enable-hardcoded-path-in-tests?  The point of 
this question is that one thing that's different by default between a 
glibc testsuite run and running binaries outside the glibc testsuite is 
whether new binaries are run directly or via ld.so --library-path.  It's 
possible some dynamic linker bugs might show up in one configuration but 
not the other.  So if the glibc testsuite passes in a default 
configuration (which I assume it did, if the submitter of the original 
patch tested it properly), but init fails in a newly built system image, 
one possible explanation would be such a bug - and running the glibc 
testsuite with --enable-hardcoded-path-in-tests is one way of checking for 
that kind of issue.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-17 13:27                     ` Joseph Myers
@ 2021-09-17 16:17                       ` Andreas Schwab
  2021-09-26 19:58                       ` Lukasz Majewski
  1 sibling, 0 replies; 62+ messages in thread
From: Andreas Schwab @ 2021-09-17 16:17 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Lukasz Majewski, Florian Weimer, libc-alpha

On Sep 17 2021, Joseph Myers wrote:

> QEMU only becomes relevant if you have a binary that works when executed 
> on hardware but not on QEMU (or vice versa).

Given that glibc is working fine on hardware, it is unlikely a bug in
glibc.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-17 13:27                     ` Joseph Myers
  2021-09-17 16:17                       ` Andreas Schwab
@ 2021-09-26 19:58                       ` Lukasz Majewski
  2021-09-27 16:00                         ` Joseph Myers
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-09-26 19:58 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Adhemerval Zanella, Florian Weimer, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 2973 bytes --]

Hi Joseph,

Thanks for your input.

> On Fri, 17 Sep 2021, Lukasz Majewski wrote:
> 
> > Can we make a decision regarding this fix?  
> 
> I don't think we've yet seen a thorough analysis of the issue.
> 
> Involvement of QEMU is probably not relevant.  Either the dynamic
> linker binary generated is correct, according to the instruction
> semantics in the Arm ARM and the relocation semantics in AAELF, or it
> isn't.  If it's incorrect, either the (static) linker inputs are
> correct and there's a linker bug, or the linker inputs are incorrect
> ("correct" here might be a bit fuzzier, involving things that are not
> fully specified).
> 
> So start from working out whether the generated binary is correct or
> not, with an appropriate description of the relevant instruction
> sequences executed and why they do or do not work in each case.
> 
> QEMU only becomes relevant if you have a binary that works when
> executed on hardware but not on QEMU (or vice versa).
> 
> (a) Do you have such a binary working on hardware but not on QEMU?

Yes. I can run Beagle Bone generated image on the HW (without the fix),
but it breaks down on QEMU.

> 
> (b) Have you tested using the same binaries with both QEMU and
> hardware?

Yes. OOPs are only present on the QEMU.

> 
> (c) Have you tested with the glibc testsuite, using a prebuilt system 
> image with known good glibc rather than building init with the new
> QEMU?

No, not yet. I try to debug working vs broken ld-linux.so on working
QEMU setup with LD_PRELOAD.

> That's a better starting point than building a whole system
> with the new glibc, though if the result is "all tests fail
> execution" you may not be much further forward (but at least you can
> run the dynamic linker, good and bad versions, under gdbserver, and
> step individual instructions to see exactly what happens when the
> problem code is executed).
> 
> (d) Likewise, but with --enable-hardcoded-path-in-tests?  The point
> of this question is that one thing that's different by default
> between a glibc testsuite run and running binaries outside the glibc
> testsuite is whether new binaries are run directly or via ld.so
> --library-path.  It's possible some dynamic linker bugs might show up
> in one configuration but not the other.  So if the glibc testsuite
> passes in a default configuration (which I assume it did, if the
> submitter of the original patch tested it properly), but init fails
> in a newly built system image, one possible explanation would be such
> a bug - and running the glibc testsuite with
> --enable-hardcoded-path-in-tests is one way of checking for that kind
> of issue.
> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-26 19:58                       ` Lukasz Majewski
@ 2021-09-27 16:00                         ` Joseph Myers
  0 siblings, 0 replies; 62+ messages in thread
From: Joseph Myers @ 2021-09-27 16:00 UTC (permalink / raw)
  To: Lukasz Majewski; +Cc: Adhemerval Zanella, Florian Weimer, libc-alpha

On Sun, 26 Sep 2021, Lukasz Majewski wrote:

> > QEMU only becomes relevant if you have a binary that works when
> > executed on hardware but not on QEMU (or vice versa).
> > 
> > (a) Do you have such a binary working on hardware but not on QEMU?
> 
> Yes. I can run Beagle Bone generated image on the HW (without the fix),
> but it breaks down on QEMU.

If this is system QEMU emulation, that strongly suggests a QEMU bug, not a 
glibc bug - meaning you should find where the execution diverges between 
QEMU and hardware to identify the mis-emulated instruction.

(For QEMU usermode emulation, different address space layout compared to 
running natively under the Linux kernel could be an issue.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-09-07 17:44     ` Fangrui Song
  2021-09-08 15:05       ` Lukasz Majewski
@ 2021-10-05  7:45       ` Lukasz Majewski
  2021-10-06  7:57         ` Fangrui Song
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-05  7:45 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 670 bytes --]

Hi Fangrui,

> >
> >I'm also wondering if similar issue is visible with "normal" - i.e.
> >not QEMU ARM run init.
> >
> >Maybe the "rough" environment in which /sbin/init is run is the
> >culprit?
> >  
> 
> I had tested running ld.so and elf/ldconfig which covered the usage.
> 
> I don't know. It needs some debugging from you:)

Could you share on which hardware (ARM) you run this change?


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-05  7:45       ` Lukasz Majewski
@ 2021-10-06  7:57         ` Fangrui Song
  2021-10-06  9:03           ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Fangrui Song @ 2021-10-06  7:57 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

On 2021-10-05, Lukasz Majewski wrote:
>Hi Fangrui,
>
>> >
>> >I'm also wondering if similar issue is visible with "normal" - i.e.
>> >not QEMU ARM run init.
>> >
>> >Maybe the "rough" environment in which /sbin/init is run is the
>> >culprit?
>> >
>>
>> I had tested running ld.so and elf/ldconfig which covered the usage.
>>
>> I don't know. It needs some debugging from you:)
>
>Could you share on which hardware (ARM) you run this change?
>
>
>Best regards,
>
>Lukasz Majewski

I ran ld.so and elf/ldconfig with qemu-arm-static (built from qemu/linux-user) (via binfmt_misc) on x86-64 Linux (Debian testing).
I don't have real ARM hardware.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-06  7:57         ` Fangrui Song
@ 2021-10-06  9:03           ` Lukasz Majewski
  2021-10-06 11:43             ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-06  9:03 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Carlos O'Donell, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

Hi Fangrui,

> On 2021-10-05, Lukasz Majewski wrote:
> >Hi Fangrui,
> >  
> >> >
> >> >I'm also wondering if similar issue is visible with "normal" -
> >> >i.e. not QEMU ARM run init.
> >> >
> >> >Maybe the "rough" environment in which /sbin/init is run is the
> >> >culprit?
> >> >  
> >>
> >> I had tested running ld.so and elf/ldconfig which covered the
> >> usage.
> >>
> >> I don't know. It needs some debugging from you:)  
> >
> >Could you share on which hardware (ARM) you run this change?
> >
> >
> >Best regards,
> >
> >Lukasz Majewski  
> 

Thanks for the info.

> I ran ld.so and elf/ldconfig with qemu-arm-static (built from

Ok, so this was only the user mode emulation, not qemu-system-arm ?

> qemu/linux-user) (via binfmt_misc) on x86-64 Linux (Debian testing).
> I don't have real ARM hardware.

The built image for BBB (Beagle Bone Black) also shows OOPs on the real
HW.

I'm debugging this now and share results asap.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-06  9:03           ` Lukasz Majewski
@ 2021-10-06 11:43             ` Lukasz Majewski
  2021-10-06 12:55               ` Szabolcs Nagy
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-06 11:43 UTC (permalink / raw)
  To: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab
  Cc: Carlos O'Donell, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 5907 bytes --]

Dear Community,

Please find in-depth analyze about the issue:

It was tested with Beagle Bone Black (BBB) and QEMU (the same binary
rootfs images).
(If needed I will upload images and script to run QEMU to some server
for reproduction).
Branch: https://github.com/lmajewski/y2038_glibc/commits/y2038_edge

On working setup to trigger the core dump:
/home/root/ld-linux-armhf.so.3 /sbin/init
gdb ./ld-linux-armhf.so.3 core

(and the /home/root/ld-linux-armhf.so.3 is the "broken" one).


Not working (patch [1] not applied):
====================================

All the code is located in _dl_start in elf/rtld.c and
elf/get-dynamic-info.h files:

(gdb) p/x $r5
$46 = 0xb6fc8000
r5 is set from the elf_machine_load_address()

Then we enter the elf_get_dynamic_info()

0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
   0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2, #4]
=> 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3, r3, r5
   0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3, [r2, #4]
(gdb) p/x $r3
$63 = 0x410003f4
(gdb) p/x $r5
$64 = 0xb6fc8000
(gdb) si
0xb6fc9600      99            ADJUST_DYN_INFO (DT_SYMTAB);
   0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2, #4]
   0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3, r3, r5
=> 0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3, [r2, #4]
(gdb) p/x $r3
$65 = 0xf7fc83f4

The above address is the kernel space one - so accessing it causes:
"Program received signal SIGSEGV, Segmentation fault."

(gdb) p/x $r3
$55 = 0xf7fc88e4
112                 DO_ELF_MACHINE_REL_RELATIVE (map, l_addr, relative);
=> 0xb6fc979c <_dl_start+728>:  08 10 13 e5     ldr     r1, [r3, #-8]


The problem is that elf_machine_load_address() returns wrong value - or
maybe to say it differently - the one used afterwards by glibc is wrong.

The correct one is, which corresponds to using 'adr' command (with
patch [1] applied):

0xb6fc9508 in _dl_start (arg=0xbefffdf0) at rtld.c:545
545       bootstrap_map.l_addr = elf_machine_load_address ();
=> 0xb6fc9508 <_dl_start+68>:   98 55 81 e5     str     r5, [r1, #1432]
; 0x598 (gdb) p/x $r5
$12 = 0x75fc8000


Another issue is the way the assembler code is generated. With the
working code one would have:

(gdb) si
0xb6fc9608      99            ADJUST_DYN_INFO (DT_SYMTAB);
   0xb6fc9600 <_dl_start+316>:  00 37 9f e5     ldr     r3, [pc, #1792]
; 0xb6fc9604 <_dl_start+320>:  03 30 8f e0     add     r3, pc, r3
=> 0xb6fc9608 <_dl_start+324>:  d0 35 93 e5     ldr     r3, [r3, #1488]
; 0x5d0 0xb6fc960c <_dl_start+328>:  00 00 53 e3     cmp     r3, #0

(gdb) p/x $r3
$13 = 0xb6fff010
(gdb) p/x $pc
$14 = 0xb6fc9608
(gdb) 

In the above example the $r3 value is relative to $pc and $r5 is not
used at all.


The code which fails now looks like:
112                 DO_ELF_MACHINE_REL_RELATIVE (map, l_addr, relative);
=> 0xb6fc9870 <_dl_start+940>:  08 10 13 e5     ldr     r1, [r3, #-8]

(gdb) p/x l_addr
$15 = 0x75fc8000

and everything works as expected.


Questions:
==========

1. I guess that different addressing assembler generated ($r3
calculation) is related to the address to be relocated (when the offset
is too large the register is used - not the $pc itself).

2. I don't understand why values provided by elf_machine_load_address()
differ? I would expect that those are equal.

And help and input appreciated.





Links:
[1] -
https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5

readelf -e ld-linux-armhf.so.3 

[10] .plt              PROGBITS        41000994 000994 000050 04  AX  0   0  4
[11] .text             PROGBITS        41000a00 000a00 01fed0 00  AX  0   0 64
[12] .rodata           PROGBITS        410208d0 0208d0 004b24 00   A  0   0  4
[13] .ARM.extab        PROGBITS        410253f4 0253f4 000018 00   A  0   0  4
[14] .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11   0  4
[15] .data.rel.ro      PROGBITS        41036200 026200 000cf4 00  WA  0   0  8
[16] .dynamic          DYNAMIC         41036ef4 026ef4 0000c8 08  WA  5   0  4
[17] .got              PROGBITS        41036fbc 026fbc 000040 04  WA  0   0  4


> Hi Fangrui,
> 
> > On 2021-10-05, Lukasz Majewski wrote:  
> > >Hi Fangrui,
> > >    
> > >> >
> > >> >I'm also wondering if similar issue is visible with "normal" -
> > >> >i.e. not QEMU ARM run init.
> > >> >
> > >> >Maybe the "rough" environment in which /sbin/init is run is the
> > >> >culprit?
> > >> >    
> > >>
> > >> I had tested running ld.so and elf/ldconfig which covered the
> > >> usage.
> > >>
> > >> I don't know. It needs some debugging from you:)    
> > >
> > >Could you share on which hardware (ARM) you run this change?
> > >
> > >
> > >Best regards,
> > >
> > >Lukasz Majewski    
> >   
> 
> Thanks for the info.
> 
> > I ran ld.so and elf/ldconfig with qemu-arm-static (built from  
> 
> Ok, so this was only the user mode emulation, not qemu-system-arm ?
> 
> > qemu/linux-user) (via binfmt_misc) on x86-64 Linux (Debian testing).
> > I don't have real ARM hardware.  
> 
> The built image for BBB (Beagle Bone Black) also shows OOPs on the
> real HW.
> 
> I'm debugging this now and share results asap.
> 
> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-06 11:43             ` Lukasz Majewski
@ 2021-10-06 12:55               ` Szabolcs Nagy
  2021-10-07  9:19                 ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-06 12:55 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab, libc-alpha

The 10/06/2021 13:43, Lukasz Majewski wrote:
> Please find in-depth analyze about the issue:
> 
> It was tested with Beagle Bone Black (BBB) and QEMU (the same binary
> rootfs images).
> (If needed I will upload images and script to run QEMU to some server
> for reproduction).
> Branch: https://github.com/lmajewski/y2038_glibc/commits/y2038_edge

i think it is easier to look at if you upload the broken
ld.so binary. or at least readelf -aW ld.so output.

> On working setup to trigger the core dump:
> /home/root/ld-linux-armhf.so.3 /sbin/init
> gdb ./ld-linux-armhf.so.3 core
> 
> (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
> 
> 
> Not working (patch [1] not applied):
> ====================================
> 
> All the code is located in _dl_start in elf/rtld.c and
> elf/get-dynamic-info.h files:
> 
> (gdb) p/x $r5
> $46 = 0xb6fc8000
> r5 is set from the elf_machine_load_address()
> 
> Then we enter the elf_get_dynamic_info()
> 
> 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
>    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2, #4]
> => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3, r3, r5
>    0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3, [r2, #4]
> (gdb) p/x $r3
> $63 = 0x410003f4
> (gdb) p/x $r5
> $64 = 0xb6fc8000

it seems r5 is already wrong here, it's not the runtime
address of 0. (looks more like the runtime address of
0x41000000)

likely something goes wrong with the computation of r5.

> Links:
> [1] -
> https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> 
> readelf -e ld-linux-armhf.so.3 
> 
> [10] .plt              PROGBITS        41000994 000994 000050 04  AX  0   0  4
> [11] .text             PROGBITS        41000a00 000a00 01fed0 00  AX  0   0 64
> [12] .rodata           PROGBITS        410208d0 0208d0 004b24 00   A  0   0  4
> [13] .ARM.extab        PROGBITS        410253f4 0253f4 000018 00   A  0   0  4
> [14] .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11   0  4
> [15] .data.rel.ro      PROGBITS        41036200 026200 000cf4 00  WA  0   0  8
> [16] .dynamic          DYNAMIC         41036ef4 026ef4 0000c8 08  WA  5   0  4
> [17] .got              PROGBITS        41036fbc 026fbc 000040 04  WA  0   0  4

why are all addresses >0x41000000 ?
in a shared library i expect all those addresses
to be close to 0.

is this made by some modified binutils?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-06 12:55               ` Szabolcs Nagy
@ 2021-10-07  9:19                 ` Lukasz Majewski
  2021-10-07 10:00                   ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-07  9:19 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 4084 bytes --]

Hi Szabolcs,

> The 10/06/2021 13:43, Lukasz Majewski wrote:
> > Please find in-depth analyze about the issue:
> > 
> > It was tested with Beagle Bone Black (BBB) and QEMU (the same binary
> > rootfs images).
> > (If needed I will upload images and script to run QEMU to some
> > server for reproduction).
> > Branch: https://github.com/lmajewski/y2038_glibc/commits/y2038_edge
> >  
> 
> i think it is easier to look at if you upload the broken
> ld.so binary. or at least readelf -aW ld.so output.

Please find working and broken ld-linux-armhf.so.3:
https://owncloud.denx.de/s/wtAfktG6pXLffSA

> 
> > On working setup to trigger the core dump:
> > /home/root/ld-linux-armhf.so.3 /sbin/init
> > gdb ./ld-linux-armhf.so.3 core
> > 
> > (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
> > 
> > 
> > Not working (patch [1] not applied):
> > ====================================
> > 
> > All the code is located in _dl_start in elf/rtld.c and
> > elf/get-dynamic-info.h files:
> > 
> > (gdb) p/x $r5
> > $46 = 0xb6fc8000
> > r5 is set from the elf_machine_load_address()
> > 
> > Then we enter the elf_get_dynamic_info()
> > 
> > 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
> >    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2,
> > #4] => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3, r3,
> > r5 0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3, [r2, #4]
> > (gdb) p/x $r3
> > $63 = 0x410003f4
> > (gdb) p/x $r5
> > $64 = 0xb6fc8000  
> 
> it seems r5 is already wrong here, it's not the runtime
> address of 0. (looks more like the runtime address of
> 0x41000000)

The r5 is set according to the change in patch which now I'm trying to
fix.

To be more specific - r5 is set according to code in patch SHA1:
bca0f5cbc9257c13322b99e55235c4f21ba0bd82

> 
> likely something goes wrong with the computation of r5.

Before the SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82 patch r5 was
computed with using 'adr' assembler instruction, not set from some
constant value.

asm ("adr %0, _dl_start" : "=r" (pcrel_addr));

And r5 value was 0x75fc8000

> 
> > Links:
> > [1] -
> > https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> > 
> > readelf -e ld-linux-armhf.so.3 
> > 
> > [10] .plt              PROGBITS        41000994 000994 000050 04
> > AX  0   0  4 [11] .text             PROGBITS        41000a00 000a00
> > 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
> > 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
> > PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
> > .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11
> >  0  4 [15] .data.rel.ro      PROGBITS        41036200 026200 000cf4
> > 00  WA  0   0  8 [16] .dynamic          DYNAMIC         41036ef4
> > 026ef4 0000c8 08  WA  5   0  4 [17] .got              PROGBITS
> >   41036fbc 026fbc 000040 04  WA  0   0  4  
> 
> why are all addresses >0x41000000 ?
> in a shared library i expect all those addresses
> to be close to 0.

On this real HW system (the rootfs which is running) - libc.so.6 also
has address > 0x41000000
libm.so.6 also has the value > 0x41200000
(Entry point address:               0x412c9190)

The offset of > 0x41000000 looks a bit strange indeed, but it is still
less than the kernel space. Moreover, with position independent code it
shall not matter.

> 
> is this made by some modified binutils?

I've double checked the ld-linux-armhf.so.3 and after build it has:
(Entry point address:               0xa00) which seems to be correct.

So it looks like during installation of the glibc (on the Yocto/OE
created rootfs) the elf header is modified and this 0x41000000 offset is
added.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07  9:19                 ` Lukasz Majewski
@ 2021-10-07 10:00                   ` Lukasz Majewski
  2021-10-07 14:15                     ` Szabolcs Nagy
  2021-10-07 14:16                     ` Adhemerval Zanella
  0 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-07 10:00 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 5075 bytes --]

On Thu, 7 Oct 2021 11:19:26 +0200
Lukasz Majewski <lukma@denx.de> wrote:

> Hi Szabolcs,
> 
> > The 10/06/2021 13:43, Lukasz Majewski wrote:  
> > > Please find in-depth analyze about the issue:
> > > 
> > > It was tested with Beagle Bone Black (BBB) and QEMU (the same
> > > binary rootfs images).
> > > (If needed I will upload images and script to run QEMU to some
> > > server for reproduction).
> > > Branch:
> > > https://github.com/lmajewski/y2038_glibc/commits/y2038_edge 
> > 
> > i think it is easier to look at if you upload the broken
> > ld.so binary. or at least readelf -aW ld.so output.  
> 
> Please find working and broken ld-linux-armhf.so.3:
> https://owncloud.denx.de/s/wtAfktG6pXLffSA
> 
> >   
> > > On working setup to trigger the core dump:
> > > /home/root/ld-linux-armhf.so.3 /sbin/init
> > > gdb ./ld-linux-armhf.so.3 core
> > > 
> > > (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
> > > 
> > > 
> > > Not working (patch [1] not applied):
> > > ====================================
> > > 
> > > All the code is located in _dl_start in elf/rtld.c and
> > > elf/get-dynamic-info.h files:
> > > 
> > > (gdb) p/x $r5
> > > $46 = 0xb6fc8000
> > > r5 is set from the elf_machine_load_address()
> > > 
> > > Then we enter the elf_get_dynamic_info()
> > > 
> > > 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
> > >    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2,
> > > #4] => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3,
> > > r3, r5 0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3,
> > > [r2, #4] (gdb) p/x $r3
> > > $63 = 0x410003f4
> > > (gdb) p/x $r5
> > > $64 = 0xb6fc8000    
> > 
> > it seems r5 is already wrong here, it's not the runtime
> > address of 0. (looks more like the runtime address of
> > 0x41000000)  
> 
> The r5 is set according to the change in patch which now I'm trying to
> fix.
> 
> To be more specific - r5 is set according to code in patch SHA1:
> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> 
> > 
> > likely something goes wrong with the computation of r5.  
> 
> Before the SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82 patch r5 was
> computed with using 'adr' assembler instruction, not set from some
> constant value.
> 
> asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> 
> And r5 value was 0x75fc8000
> 
> >   
> > > Links:
> > > [1] -
> > > https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> > > 
> > > readelf -e ld-linux-armhf.so.3 
> > > 
> > > [10] .plt              PROGBITS        41000994 000994 000050 04
> > > AX  0   0  4 [11] .text             PROGBITS        41000a00
> > > 000a00 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
> > > 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
> > > PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
> > > .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11
> > >  0  4 [15] .data.rel.ro      PROGBITS        41036200 026200
> > > 000cf4 00  WA  0   0  8 [16] .dynamic          DYNAMIC
> > > 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
> > > PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4    
> > 
> > why are all addresses >0x41000000 ?
> > in a shared library i expect all those addresses
> > to be close to 0.  
> 
> On this real HW system (the rootfs which is running) - libc.so.6 also
> has address > 0x41000000
> libm.so.6 also has the value > 0x41200000
> (Entry point address:               0x412c9190)
> 
> The offset of > 0x41000000 looks a bit strange indeed, but it is still
> less than the kernel space. Moreover, with position independent code
> it shall not matter.
> 
> > 
> > is this made by some modified binutils?  
> 
> I've double checked the ld-linux-armhf.so.3 and after build it has:
> (Entry point address:               0xa00) which seems to be correct.
> 
> So it looks like during installation of the glibc (on the Yocto/OE
> created rootfs) the elf header is modified and this 0x41000000 offset
> is added.
> 

And indeed it is the case. Yocto/OE by default perform prelinking (use
prelink program) to speedup start time of dynamic program.

The prelink [1] itself assigns some virtual addresses to all required
shared objects (in our case for /sbin/init), so no clashes are
encountered.

And using prelink is a _default_ behaviour in Yocto/OE poky distro.

Links:
[1] - https://linux.die.net/man/8/prelink

> 
> Best regards,
> 
> Lukasz Majewski
> 
> --
> 
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> lukma@denx.de




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 10:00                   ` Lukasz Majewski
@ 2021-10-07 14:15                     ` Szabolcs Nagy
  2021-10-07 14:58                       ` Lukasz Majewski
  2021-10-07 14:16                     ` Adhemerval Zanella
  1 sibling, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-07 14:15 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab, libc-alpha

The 10/07/2021 12:00, Lukasz Majewski wrote:
> On Thu, 7 Oct 2021 11:19:26 +0200
> Lukasz Majewski <lukma@denx.de> wrote:
> > Please find working and broken ld-linux-armhf.so.3:
> > https://owncloud.denx.de/s/wtAfktG6pXLffSA

thanks.

i can confirm that this crashes on an aarch64 machine too
(that supports running arm binaries), so not just a qemu
problem.


> > > > Links:
> > > > [1] -
> > > > https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> > > > 
> > > > readelf -e ld-linux-armhf.so.3 
> > > > 
> > > > [10] .plt              PROGBITS        41000994 000994 000050 04
> > > > AX  0   0  4 [11] .text             PROGBITS        41000a00
> > > > 000a00 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
> > > > 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
> > > > PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
> > > > .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11
> > > >  0  4 [15] .data.rel.ro      PROGBITS        41036200 026200
> > > > 000cf4 00  WA  0   0  8 [16] .dynamic          DYNAMIC
> > > > 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
> > > > PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4    
> > > 
> > > why are all addresses >0x41000000 ?
> > > in a shared library i expect all those addresses
> > > to be close to 0.  
> > 
> > On this real HW system (the rootfs which is running) - libc.so.6 also
> > has address > 0x41000000
> > libm.so.6 also has the value > 0x41200000
> > (Entry point address:               0x412c9190)
> > 
> > The offset of > 0x41000000 looks a bit strange indeed, but it is still
> > less than the kernel space. Moreover, with position independent code
> > it shall not matter.
> > 
> > > 
> > > is this made by some modified binutils?  
> > 
> > I've double checked the ld-linux-armhf.so.3 and after build it has:
> > (Entry point address:               0xa00) which seems to be correct.
> > 
> > So it looks like during installation of the glibc (on the Yocto/OE
> > created rootfs) the elf header is modified and this 0x41000000 offset
> > is added.
> > 
> 
> And indeed it is the case. Yocto/OE by default perform prelinking (use
> prelink program) to speedup start time of dynamic program.
> 
> The prelink [1] itself assigns some virtual addresses to all required
> shared objects (in our case for /sbin/init), so no clashes are
> encountered.
> 
> And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> 
> Links:
> [1] - https://linux.die.net/man/8/prelink

ah prelinking.

ok so &__ehdr_start is supposed to point to the
elf header, which supposed to start at address 0
within the elf image.

but with prelinking everthing is moved by a fixed
offset (including the elf header), so __ehdr_start
is no longer has 0 address.

i don't immediately know what's the best fix for this.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 10:00                   ` Lukasz Majewski
  2021-10-07 14:15                     ` Szabolcs Nagy
@ 2021-10-07 14:16                     ` Adhemerval Zanella
  2021-10-07 14:29                       ` H.J. Lu
  1 sibling, 1 reply; 62+ messages in thread
From: Adhemerval Zanella @ 2021-10-07 14:16 UTC (permalink / raw)
  To: Lukasz Majewski, Szabolcs Nagy
  Cc: Fangrui Song, Florian Weimer, Joseph Myers, Andreas Schwab, libc-alpha



On 07/10/2021 07:00, Lukasz Majewski wrote:
> On Thu, 7 Oct 2021 11:19:26 +0200
> Lukasz Majewski <lukma@denx.de> wrote:
> 
>> Hi Szabolcs,
>>
>>> The 10/06/2021 13:43, Lukasz Majewski wrote:  
>>>> Please find in-depth analyze about the issue:
>>>>
>>>> It was tested with Beagle Bone Black (BBB) and QEMU (the same
>>>> binary rootfs images).
>>>> (If needed I will upload images and script to run QEMU to some
>>>> server for reproduction).
>>>> Branch:
>>>> https://github.com/lmajewski/y2038_glibc/commits/y2038_edge 
>>>
>>> i think it is easier to look at if you upload the broken
>>> ld.so binary. or at least readelf -aW ld.so output.  
>>
>> Please find working and broken ld-linux-armhf.so.3:
>> https://owncloud.denx.de/s/wtAfktG6pXLffSA
>>
>>>   
>>>> On working setup to trigger the core dump:
>>>> /home/root/ld-linux-armhf.so.3 /sbin/init
>>>> gdb ./ld-linux-armhf.so.3 core
>>>>
>>>> (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
>>>>
>>>>
>>>> Not working (patch [1] not applied):
>>>> ====================================
>>>>
>>>> All the code is located in _dl_start in elf/rtld.c and
>>>> elf/get-dynamic-info.h files:
>>>>
>>>> (gdb) p/x $r5
>>>> $46 = 0xb6fc8000
>>>> r5 is set from the elf_machine_load_address()
>>>>
>>>> Then we enter the elf_get_dynamic_info()
>>>>
>>>> 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
>>>>    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2,
>>>> #4] => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3,
>>>> r3, r5 0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3,
>>>> [r2, #4] (gdb) p/x $r3
>>>> $63 = 0x410003f4
>>>> (gdb) p/x $r5
>>>> $64 = 0xb6fc8000    
>>>
>>> it seems r5 is already wrong here, it's not the runtime
>>> address of 0. (looks more like the runtime address of
>>> 0x41000000)  
>>
>> The r5 is set according to the change in patch which now I'm trying to
>> fix.
>>
>> To be more specific - r5 is set according to code in patch SHA1:
>> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
>>
>>>
>>> likely something goes wrong with the computation of r5.  
>>
>> Before the SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82 patch r5 was
>> computed with using 'adr' assembler instruction, not set from some
>> constant value.
>>
>> asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
>>
>> And r5 value was 0x75fc8000
>>
>>>   
>>>> Links:
>>>> [1] -
>>>> https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
>>>>
>>>> readelf -e ld-linux-armhf.so.3 
>>>>
>>>> [10] .plt              PROGBITS        41000994 000994 000050 04
>>>> AX  0   0  4 [11] .text             PROGBITS        41000a00
>>>> 000a00 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
>>>> 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
>>>> PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
>>>> .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11
>>>>  0  4 [15] .data.rel.ro      PROGBITS        41036200 026200
>>>> 000cf4 00  WA  0   0  8 [16] .dynamic          DYNAMIC
>>>> 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
>>>> PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4    
>>>
>>> why are all addresses >0x41000000 ?
>>> in a shared library i expect all those addresses
>>> to be close to 0.  
>>
>> On this real HW system (the rootfs which is running) - libc.so.6 also
>> has address > 0x41000000
>> libm.so.6 also has the value > 0x41200000
>> (Entry point address:               0x412c9190)
>>
>> The offset of > 0x41000000 looks a bit strange indeed, but it is still
>> less than the kernel space. Moreover, with position independent code
>> it shall not matter.
>>
>>>
>>> is this made by some modified binutils?  
>>
>> I've double checked the ld-linux-armhf.so.3 and after build it has:
>> (Entry point address:               0xa00) which seems to be correct.
>>
>> So it looks like during installation of the glibc (on the Yocto/OE
>> created rootfs) the elf header is modified and this 0x41000000 offset
>> is added.
>>
> 
> And indeed it is the case. Yocto/OE by default perform prelinking (use
> prelink program) to speedup start time of dynamic program.
> 
> The prelink [1] itself assigns some virtual addresses to all required
> shared objects (in our case for /sbin/init), so no clashes are
> encountered.
> 
> And using prelink is a _default_ behaviour in Yocto/OE poky distro.

Does it work without prelink? Also, does it fail with prelink in real
hardware?

It indeed might be a prelink issue in fact.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 14:16                     ` Adhemerval Zanella
@ 2021-10-07 14:29                       ` H.J. Lu
  2021-10-07 15:57                         ` Szabolcs Nagy
  2021-10-11  8:56                         ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: H.J. Lu @ 2021-10-07 14:29 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: Lukasz Majewski, Szabolcs Nagy, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
>
>
> On 07/10/2021 07:00, Lukasz Majewski wrote:
> > On Thu, 7 Oct 2021 11:19:26 +0200
> > Lukasz Majewski <lukma@denx.de> wrote:
> >
> >> Hi Szabolcs,
> >>
> >>> The 10/06/2021 13:43, Lukasz Majewski wrote:
> >>>> Please find in-depth analyze about the issue:
> >>>>
> >>>> It was tested with Beagle Bone Black (BBB) and QEMU (the same
> >>>> binary rootfs images).
> >>>> (If needed I will upload images and script to run QEMU to some
> >>>> server for reproduction).
> >>>> Branch:
> >>>> https://github.com/lmajewski/y2038_glibc/commits/y2038_edge
> >>>
> >>> i think it is easier to look at if you upload the broken
> >>> ld.so binary. or at least readelf -aW ld.so output.
> >>
> >> Please find working and broken ld-linux-armhf.so.3:
> >> https://owncloud.denx.de/s/wtAfktG6pXLffSA
> >>
> >>>
> >>>> On working setup to trigger the core dump:
> >>>> /home/root/ld-linux-armhf.so.3 /sbin/init
> >>>> gdb ./ld-linux-armhf.so.3 core
> >>>>
> >>>> (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
> >>>>
> >>>>
> >>>> Not working (patch [1] not applied):
> >>>> ====================================
> >>>>
> >>>> All the code is located in _dl_start in elf/rtld.c and
> >>>> elf/get-dynamic-info.h files:
> >>>>
> >>>> (gdb) p/x $r5
> >>>> $46 = 0xb6fc8000
> >>>> r5 is set from the elf_machine_load_address()
> >>>>
> >>>> Then we enter the elf_get_dynamic_info()
> >>>>
> >>>> 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
> >>>>    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3, [r2,
> >>>> #4] => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne   r3,
> >>>> r3, r5 0xb6fc9600 <_dl_start+316>:  04 30 82 15     strne   r3,
> >>>> [r2, #4] (gdb) p/x $r3
> >>>> $63 = 0x410003f4
> >>>> (gdb) p/x $r5
> >>>> $64 = 0xb6fc8000
> >>>
> >>> it seems r5 is already wrong here, it's not the runtime
> >>> address of 0. (looks more like the runtime address of
> >>> 0x41000000)
> >>
> >> The r5 is set according to the change in patch which now I'm trying to
> >> fix.
> >>
> >> To be more specific - r5 is set according to code in patch SHA1:
> >> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> >>
> >>>
> >>> likely something goes wrong with the computation of r5.
> >>
> >> Before the SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82 patch r5 was
> >> computed with using 'adr' assembler instruction, not set from some
> >> constant value.
> >>
> >> asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> >>
> >> And r5 value was 0x75fc8000
> >>
> >>>
> >>>> Links:
> >>>> [1] -
> >>>> https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> >>>>
> >>>> readelf -e ld-linux-armhf.so.3
> >>>>
> >>>> [10] .plt              PROGBITS        41000994 000994 000050 04
> >>>> AX  0   0  4 [11] .text             PROGBITS        41000a00
> >>>> 000a00 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
> >>>> 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
> >>>> PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
> >>>> .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00  AL 11
> >>>>  0  4 [15] .data.rel.ro      PROGBITS        41036200 026200
> >>>> 000cf4 00  WA  0   0  8 [16] .dynamic          DYNAMIC
> >>>> 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
> >>>> PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4
> >>>
> >>> why are all addresses >0x41000000 ?
> >>> in a shared library i expect all those addresses
> >>> to be close to 0.
> >>
> >> On this real HW system (the rootfs which is running) - libc.so.6 also
> >> has address > 0x41000000
> >> libm.so.6 also has the value > 0x41200000
> >> (Entry point address:               0x412c9190)
> >>
> >> The offset of > 0x41000000 looks a bit strange indeed, but it is still
> >> less than the kernel space. Moreover, with position independent code
> >> it shall not matter.
> >>
> >>>
> >>> is this made by some modified binutils?
> >>
> >> I've double checked the ld-linux-armhf.so.3 and after build it has:
> >> (Entry point address:               0xa00) which seems to be correct.
> >>
> >> So it looks like during installation of the glibc (on the Yocto/OE
> >> created rootfs) the elf header is modified and this 0x41000000 offset
> >> is added.
> >>
> >
> > And indeed it is the case. Yocto/OE by default perform prelinking (use
> > prelink program) to speedup start time of dynamic program.
> >
> > The prelink [1] itself assigns some virtual addresses to all required
> > shared objects (in our case for /sbin/init), so no clashes are
> > encountered.
> >
> > And using prelink is a _default_ behaviour in Yocto/OE poky distro.
>
> Does it work without prelink? Also, does it fail with prelink in real
> hardware?
>
> It indeed might be a prelink issue in fact.

This will fail everywhere if prelink is used.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 14:15                     ` Szabolcs Nagy
@ 2021-10-07 14:58                       ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-07 14:58 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Fangrui Song, Adhemerval Zanella, Florian Weimer, Joseph Myers,
	Andreas Schwab, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 3550 bytes --]

Hi Szabolcs,

> The 10/07/2021 12:00, Lukasz Majewski wrote:
> > On Thu, 7 Oct 2021 11:19:26 +0200
> > Lukasz Majewski <lukma@denx.de> wrote:  
> > > Please find working and broken ld-linux-armhf.so.3:
> > > https://owncloud.denx.de/s/wtAfktG6pXLffSA  
> 
> thanks.
> 
> i can confirm that this crashes on an aarch64 machine too
> (that supports running arm binaries), so not just a qemu
> problem.
> 
> 
> > > > > Links:
> > > > > [1] -
> > > > > https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> > > > > 
> > > > > readelf -e ld-linux-armhf.so.3 
> > > > > 
> > > > > [10] .plt              PROGBITS        41000994 000994 000050
> > > > > 04 AX  0   0  4 [11] .text             PROGBITS
> > > > > 41000a00 000a00 01fed0 00  AX  0   0 64 [12] .rodata
> > > > >  PROGBITS 410208d0 0208d0 004b24 00   A  0   0  4 [13]
> > > > > .ARM.extab PROGBITS        410253f4 0253f4 000018 00   A  0
> > > > > 0  4 [14] .ARM.exidx        ARM_EXIDX       4102540c 02540c
> > > > > 0000c8 00  AL 11 0  4 [15] .data.rel.ro      PROGBITS
> > > > > 41036200 026200 000cf4 00  WA  0   0  8 [16] .dynamic
> > > > >  DYNAMIC 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
> > > > > PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4      
> > > > 
> > > > why are all addresses >0x41000000 ?
> > > > in a shared library i expect all those addresses
> > > > to be close to 0.    
> > > 
> > > On this real HW system (the rootfs which is running) - libc.so.6
> > > also has address > 0x41000000
> > > libm.so.6 also has the value > 0x41200000
> > > (Entry point address:               0x412c9190)
> > > 
> > > The offset of > 0x41000000 looks a bit strange indeed, but it is
> > > still less than the kernel space. Moreover, with position
> > > independent code it shall not matter.
> > >   
> > > > 
> > > > is this made by some modified binutils?    
> > > 
> > > I've double checked the ld-linux-armhf.so.3 and after build it
> > > has: (Entry point address:               0xa00) which seems to be
> > > correct.
> > > 
> > > So it looks like during installation of the glibc (on the Yocto/OE
> > > created rootfs) the elf header is modified and this 0x41000000
> > > offset is added.
> > >   
> > 
> > And indeed it is the case. Yocto/OE by default perform prelinking
> > (use prelink program) to speedup start time of dynamic program.
> > 
> > The prelink [1] itself assigns some virtual addresses to all
> > required shared objects (in our case for /sbin/init), so no clashes
> > are encountered.
> > 
> > And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> > 
> > Links:
> > [1] - https://linux.die.net/man/8/prelink  
> 
> ah prelinking.
> 
> ok so &__ehdr_start is supposed to point to the
> elf header, which supposed to start at address 0
> within the elf image.
> 
> but with prelinking everthing is moved by a fixed
> offset (including the elf header), so __ehdr_start
> is no longer has 0 address.

I think that __ehdr_start now points to > 0x41000000.

> 
> i don't immediately know what's the best fix for this.

I think that getting the address with 'adr' ASM instruction was more
robust, as it was not relying on the set constant (__ehdr_start).


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 14:29                       ` H.J. Lu
@ 2021-10-07 15:57                         ` Szabolcs Nagy
  2021-10-07 16:22                           ` H.J. Lu
  2021-10-11  8:56                         ` Lukasz Majewski
  1 sibling, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-07 15:57 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Adhemerval Zanella, Florian Weimer, libc-alpha, Andreas Schwab,
	Joseph Myers

The 10/07/2021 07:29, H.J. Lu via Libc-alpha wrote:
> On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> > On 07/10/2021 07:00, Lukasz Majewski wrote:
> > > On Thu, 7 Oct 2021 11:19:26 +0200
> > > Lukasz Majewski <lukma@denx.de> wrote:
> > > And indeed it is the case. Yocto/OE by default perform prelinking (use
> > > prelink program) to speedup start time of dynamic program.
> > >
> > > The prelink [1] itself assigns some virtual addresses to all required
> > > shared objects (in our case for /sbin/init), so no clashes are
> > > encountered.
> > >
> > > And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> >
> > Does it work without prelink? Also, does it fail with prelink in real
> > hardware?
> >
> > It indeed might be a prelink issue in fact.
> 
> This will fail everywhere if prelink is used.

i thought the point of prelinking is that the vaddr in
the elf image is the runtime address so you don't have
to process relative relocs or adjust pointers in the
dynamic array with += l_addr at all (and then l_addr
does not have to be correct, ld.so would still work).

but here ld.so is loaded to some random offset on top
of the static prelink offset. is this expected?
does it make sense to prelink ld.so this way?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 15:57                         ` Szabolcs Nagy
@ 2021-10-07 16:22                           ` H.J. Lu
  2021-10-07 16:53                             ` Adhemerval Zanella
  0 siblings, 1 reply; 62+ messages in thread
From: H.J. Lu @ 2021-10-07 16:22 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Adhemerval Zanella, Florian Weimer, libc-alpha, Andreas Schwab,
	Joseph Myers

On Thu, Oct 7, 2021 at 8:57 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>
> The 10/07/2021 07:29, H.J. Lu via Libc-alpha wrote:
> > On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > > On 07/10/2021 07:00, Lukasz Majewski wrote:
> > > > On Thu, 7 Oct 2021 11:19:26 +0200
> > > > Lukasz Majewski <lukma@denx.de> wrote:
> > > > And indeed it is the case. Yocto/OE by default perform prelinking (use
> > > > prelink program) to speedup start time of dynamic program.
> > > >
> > > > The prelink [1] itself assigns some virtual addresses to all required
> > > > shared objects (in our case for /sbin/init), so no clashes are
> > > > encountered.
> > > >
> > > > And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> > >
> > > Does it work without prelink? Also, does it fail with prelink in real
> > > hardware?
> > >
> > > It indeed might be a prelink issue in fact.
> >
> > This will fail everywhere if prelink is used.
>
> i thought the point of prelinking is that the vaddr in
> the elf image is the runtime address so you don't have
> to process relative relocs or adjust pointers in the
> dynamic array with += l_addr at all (and then l_addr
> does not have to be correct, ld.so would still work).
>
> but here ld.so is loaded to some random offset on top
> of the static prelink offset. is this expected?
> does it make sense to prelink ld.so this way?

ld.so is loaded by the kernel.  It makes no sense to
prelink ld.so.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 16:22                           ` H.J. Lu
@ 2021-10-07 16:53                             ` Adhemerval Zanella
  2021-10-07 17:05                               ` H.J. Lu
  2021-10-07 17:24                               ` Fāng-ruì Sòng
  0 siblings, 2 replies; 62+ messages in thread
From: Adhemerval Zanella @ 2021-10-07 16:53 UTC (permalink / raw)
  To: H.J. Lu, Szabolcs Nagy
  Cc: Florian Weimer, libc-alpha, Andreas Schwab, Joseph Myers



On 07/10/2021 13:22, H.J. Lu wrote:
> On Thu, Oct 7, 2021 at 8:57 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
>>
>> The 10/07/2021 07:29, H.J. Lu via Libc-alpha wrote:
>>> On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
>>> <libc-alpha@sourceware.org> wrote:
>>>> On 07/10/2021 07:00, Lukasz Majewski wrote:
>>>>> On Thu, 7 Oct 2021 11:19:26 +0200
>>>>> Lukasz Majewski <lukma@denx.de> wrote:
>>>>> And indeed it is the case. Yocto/OE by default perform prelinking (use
>>>>> prelink program) to speedup start time of dynamic program.
>>>>>
>>>>> The prelink [1] itself assigns some virtual addresses to all required
>>>>> shared objects (in our case for /sbin/init), so no clashes are
>>>>> encountered.
>>>>>
>>>>> And using prelink is a _default_ behaviour in Yocto/OE poky distro.
>>>>
>>>> Does it work without prelink? Also, does it fail with prelink in real
>>>> hardware?
>>>>
>>>> It indeed might be a prelink issue in fact.
>>>
>>> This will fail everywhere if prelink is used.
>>
>> i thought the point of prelinking is that the vaddr in
>> the elf image is the runtime address so you don't have
>> to process relative relocs or adjust pointers in the
>> dynamic array with += l_addr at all (and then l_addr
>> does not have to be correct, ld.so would still work).
>>
>> but here ld.so is loaded to some random offset on top
>> of the static prelink offset. is this expected?
>> does it make sense to prelink ld.so this way?
> 
> ld.so is loaded by the kernel.  It makes no sense to
> prelink ld.so.
> 

The question is whether we consider this a regression or if
we don't support it and instruct consumers to not do it.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 16:53                             ` Adhemerval Zanella
@ 2021-10-07 17:05                               ` H.J. Lu
  2021-10-07 17:24                               ` Fāng-ruì Sòng
  1 sibling, 0 replies; 62+ messages in thread
From: H.J. Lu @ 2021-10-07 17:05 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: Szabolcs Nagy, Florian Weimer, libc-alpha, Andreas Schwab, Joseph Myers

On Thu, Oct 7, 2021 at 9:53 AM Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 07/10/2021 13:22, H.J. Lu wrote:
> > On Thu, Oct 7, 2021 at 8:57 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >>
> >> The 10/07/2021 07:29, H.J. Lu via Libc-alpha wrote:
> >>> On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
> >>> <libc-alpha@sourceware.org> wrote:
> >>>> On 07/10/2021 07:00, Lukasz Majewski wrote:
> >>>>> On Thu, 7 Oct 2021 11:19:26 +0200
> >>>>> Lukasz Majewski <lukma@denx.de> wrote:
> >>>>> And indeed it is the case. Yocto/OE by default perform prelinking (use
> >>>>> prelink program) to speedup start time of dynamic program.
> >>>>>
> >>>>> The prelink [1] itself assigns some virtual addresses to all required
> >>>>> shared objects (in our case for /sbin/init), so no clashes are
> >>>>> encountered.
> >>>>>
> >>>>> And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> >>>>
> >>>> Does it work without prelink? Also, does it fail with prelink in real
> >>>> hardware?
> >>>>
> >>>> It indeed might be a prelink issue in fact.
> >>>
> >>> This will fail everywhere if prelink is used.
> >>
> >> i thought the point of prelinking is that the vaddr in
> >> the elf image is the runtime address so you don't have
> >> to process relative relocs or adjust pointers in the
> >> dynamic array with += l_addr at all (and then l_addr
> >> does not have to be correct, ld.so would still work).
> >>
> >> but here ld.so is loaded to some random offset on top
> >> of the static prelink offset. is this expected?
> >> does it make sense to prelink ld.so this way?
> >
> > ld.so is loaded by the kernel.  It makes no sense to
> > prelink ld.so.
> >
>
> The question is whether we consider this a regression or if
> we don't support it and instruct consumers to not do it.

I consider it a pilot error.  We can add a check in ld.so
to error out if it has been prelinked.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 16:53                             ` Adhemerval Zanella
  2021-10-07 17:05                               ` H.J. Lu
@ 2021-10-07 17:24                               ` Fāng-ruì Sòng
  2021-10-08  9:15                                 ` Szabolcs Nagy
  1 sibling, 1 reply; 62+ messages in thread
From: Fāng-ruì Sòng @ 2021-10-07 17:24 UTC (permalink / raw)
  To: Adhemerval Zanella
  Cc: H.J. Lu, Szabolcs Nagy, Florian Weimer, Andreas Schwab,
	libc-alpha, Joseph Myers

On Thu, Oct 7, 2021 at 9:54 AM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
>
>
> On 07/10/2021 13:22, H.J. Lu wrote:
> > On Thu, Oct 7, 2021 at 8:57 AM Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> >>
> >> The 10/07/2021 07:29, H.J. Lu via Libc-alpha wrote:
> >>> On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
> >>> <libc-alpha@sourceware.org> wrote:
> >>>> On 07/10/2021 07:00, Lukasz Majewski wrote:
> >>>>> On Thu, 7 Oct 2021 11:19:26 +0200
> >>>>> Lukasz Majewski <lukma@denx.de> wrote:
> >>>>> And indeed it is the case. Yocto/OE by default perform prelinking (use
> >>>>> prelink program) to speedup start time of dynamic program.
> >>>>>
> >>>>> The prelink [1] itself assigns some virtual addresses to all required
> >>>>> shared objects (in our case for /sbin/init), so no clashes are
> >>>>> encountered.
> >>>>>
> >>>>> And using prelink is a _default_ behaviour in Yocto/OE poky distro.
> >>>>
> >>>> Does it work without prelink? Also, does it fail with prelink in real
> >>>> hardware?
> >>>>
> >>>> It indeed might be a prelink issue in fact.
> >>>
> >>> This will fail everywhere if prelink is used.
> >>
> >> i thought the point of prelinking is that the vaddr in
> >> the elf image is the runtime address so you don't have
> >> to process relative relocs or adjust pointers in the
> >> dynamic array with += l_addr at all (and then l_addr
> >> does not have to be correct, ld.so would still work).
> >>
> >> but here ld.so is loaded to some random offset on top
> >> of the static prelink offset. is this expected?
> >> does it make sense to prelink ld.so this way?
> >
> > ld.so is loaded by the kernel.  It makes no sense to
> > prelink ld.so.
> >
>
> The question is whether we consider this a regression or if
> we don't support it and instruct consumers to not do it.

&__ehdr_start was used long before "elf: Unconditionally use __ehdr_start"
https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=elf/rtld.c;h=878e6480f4980b2fef468d8e271cd68acd4fc2d0;hp=d733359eaf808b8a5579908a6cd1153062c02919;hb=302247c89121e8d4c7629e589edbb4974fff6edb;hpb=13710e7e6af6c8965cc9a63a0660cb4ce1966557

It is surprising if that code could do something useful with prelinking.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 17:24                               ` Fāng-ruì Sòng
@ 2021-10-08  9:15                                 ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-08  9:15 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Adhemerval Zanella, H.J. Lu, Florian Weimer, Andreas Schwab,
	libc-alpha, Joseph Myers

The 10/07/2021 10:24, Fāng-ruì Sòng wrote:
> &__ehdr_start was used long before "elf: Unconditionally use __ehdr_start"
> https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=elf/rtld.c;h=878e6480f4980b2fef468d8e271cd68acd4fc2d0;hp=d733359eaf808b8a5579908a6cd1153062c02919;hb=302247c89121e8d4c7629e589edbb4974fff6edb;hpb=13710e7e6af6c8965cc9a63a0660cb4ce1966557
> 
> It is surprising if that code could do something useful with prelinking.

__ehdr_start was used as the start of the elf header.
that's true even after prelinking does its offsets.

the new use of __ehdr_start is to mean the base address
by which local symbol vaddr needs to be offset to get
the runtime address. this use relies on __ehdr_start to
have 0 vaddr wihch is broken by prelinking that moves
everything to some offset.

in principle the elf object after prelinking is valid,
but since this is the ld.so we can say that we only
support the case when the elf header is at 0 vaddr.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-07 14:29                       ` H.J. Lu
  2021-10-07 15:57                         ` Szabolcs Nagy
@ 2021-10-11  8:56                         ` Lukasz Majewski
  2021-10-11 10:18                           ` Szabolcs Nagy
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-11  8:56 UTC (permalink / raw)
  To: H.J. Lu, Adhemerval Zanella, Szabolcs Nagy
  Cc: Florian Weimer, libc-alpha, Andreas Schwab, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 5934 bytes --]

Hi H.J, Adhemerval, Szabolcs,

> On Thu, Oct 7, 2021 at 7:18 AM Adhemerval Zanella via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> >
> >
> > On 07/10/2021 07:00, Lukasz Majewski wrote:  
> > > On Thu, 7 Oct 2021 11:19:26 +0200
> > > Lukasz Majewski <lukma@denx.de> wrote:
> > >  
> > >> Hi Szabolcs,
> > >>  
> > >>> The 10/06/2021 13:43, Lukasz Majewski wrote:  
> > >>>> Please find in-depth analyze about the issue:
> > >>>>
> > >>>> It was tested with Beagle Bone Black (BBB) and QEMU (the same
> > >>>> binary rootfs images).
> > >>>> (If needed I will upload images and script to run QEMU to some
> > >>>> server for reproduction).
> > >>>> Branch:
> > >>>> https://github.com/lmajewski/y2038_glibc/commits/y2038_edge  
> > >>>
> > >>> i think it is easier to look at if you upload the broken
> > >>> ld.so binary. or at least readelf -aW ld.so output.  
> > >>
> > >> Please find working and broken ld-linux-armhf.so.3:
> > >> https://owncloud.denx.de/s/wtAfktG6pXLffSA
> > >>  
> > >>>  
> > >>>> On working setup to trigger the core dump:
> > >>>> /home/root/ld-linux-armhf.so.3 /sbin/init
> > >>>> gdb ./ld-linux-armhf.so.3 core
> > >>>>
> > >>>> (and the /home/root/ld-linux-armhf.so.3 is the "broken" one).
> > >>>>
> > >>>>
> > >>>> Not working (patch [1] not applied):
> > >>>> ====================================
> > >>>>
> > >>>> All the code is located in _dl_start in elf/rtld.c and
> > >>>> elf/get-dynamic-info.h files:
> > >>>>
> > >>>> (gdb) p/x $r5
> > >>>> $46 = 0xb6fc8000
> > >>>> r5 is set from the elf_machine_load_address()
> > >>>>
> > >>>> Then we enter the elf_get_dynamic_info()
> > >>>>
> > >>>> 0xb6fc95fc      99            ADJUST_DYN_INFO (DT_SYMTAB);
> > >>>>    0xb6fc95f8 <_dl_start+308>:  04 30 92 15     ldrne   r3,
> > >>>> [r2, #4] => 0xb6fc95fc <_dl_start+312>:  05 30 83 10     addne
> > >>>>   r3, r3, r5 0xb6fc9600 <_dl_start+316>:  04 30 82 15
> > >>>> strne   r3, [r2, #4] (gdb) p/x $r3
> > >>>> $63 = 0x410003f4
> > >>>> (gdb) p/x $r5
> > >>>> $64 = 0xb6fc8000  
> > >>>
> > >>> it seems r5 is already wrong here, it's not the runtime
> > >>> address of 0. (looks more like the runtime address of
> > >>> 0x41000000)  
> > >>
> > >> The r5 is set according to the change in patch which now I'm
> > >> trying to fix.
> > >>
> > >> To be more specific - r5 is set according to code in patch SHA1:
> > >> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > >>  
> > >>>
> > >>> likely something goes wrong with the computation of r5.  
> > >>
> > >> Before the SHA1: bca0f5cbc9257c13322b99e55235c4f21ba0bd82 patch
> > >> r5 was computed with using 'adr' assembler instruction, not set
> > >> from some constant value.
> > >>
> > >> asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> > >>
> > >> And r5 value was 0x75fc8000
> > >>  
> > >>>  
> > >>>> Links:
> > >>>> [1] -
> > >>>> https://github.com/lmajewski/y2038_glibc/commit/e67e0f589b88a44be8f8b9b770b08950dd7e5bd5
> > >>>>
> > >>>> readelf -e ld-linux-armhf.so.3
> > >>>>
> > >>>> [10] .plt              PROGBITS        41000994 000994 000050
> > >>>> 04 AX  0   0  4 [11] .text             PROGBITS        41000a00
> > >>>> 000a00 01fed0 00  AX  0   0 64 [12] .rodata           PROGBITS
> > >>>> 410208d0 0208d0 004b24 00   A  0   0  4 [13] .ARM.extab
> > >>>> PROGBITS        410253f4 0253f4 000018 00   A  0   0  4 [14]
> > >>>> .ARM.exidx        ARM_EXIDX       4102540c 02540c 0000c8 00
> > >>>> AL 11 0  4 [15] .data.rel.ro      PROGBITS        41036200
> > >>>> 026200 000cf4 00  WA  0   0  8 [16] .dynamic          DYNAMIC
> > >>>> 41036ef4 026ef4 0000c8 08  WA  5   0  4 [17] .got
> > >>>> PROGBITS 41036fbc 026fbc 000040 04  WA  0   0  4  
> > >>>
> > >>> why are all addresses >0x41000000 ?
> > >>> in a shared library i expect all those addresses
> > >>> to be close to 0.  
> > >>
> > >> On this real HW system (the rootfs which is running) - libc.so.6
> > >> also has address > 0x41000000
> > >> libm.so.6 also has the value > 0x41200000
> > >> (Entry point address:               0x412c9190)
> > >>
> > >> The offset of > 0x41000000 looks a bit strange indeed, but it is
> > >> still less than the kernel space. Moreover, with position
> > >> independent code it shall not matter.
> > >>  
> > >>>
> > >>> is this made by some modified binutils?  
> > >>
> > >> I've double checked the ld-linux-armhf.so.3 and after build it
> > >> has: (Entry point address:               0xa00) which seems to
> > >> be correct.
> > >>
> > >> So it looks like during installation of the glibc (on the
> > >> Yocto/OE created rootfs) the elf header is modified and this
> > >> 0x41000000 offset is added.
> > >>  
> > >
> > > And indeed it is the case. Yocto/OE by default perform prelinking
> > > (use prelink program) to speedup start time of dynamic program.
> > >
> > > The prelink [1] itself assigns some virtual addresses to all
> > > required shared objects (in our case for /sbin/init), so no
> > > clashes are encountered.
> > >
> > > And using prelink is a _default_ behaviour in Yocto/OE poky
> > > distro.  
> >
> > Does it work without prelink? Also, does it fail with prelink in
> > real hardware?
> >
> > It indeed might be a prelink issue in fact.  
> 
> This will fail everywhere if prelink is used.
> 

Do we have _any_ plan how to fix it? BSPs for many boards are built
with Yocto/OE nowadays...

The approach with using 'adr' (pseudo) assembler instruction to
calculate the on-runtime addresses was working previously.

Shall we revert changes which were introduced recently (to use
__ehdr_start)?


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11  8:56                         ` Lukasz Majewski
@ 2021-10-11 10:18                           ` Szabolcs Nagy
  2021-10-11 11:47                             ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-11 10:18 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: H.J. Lu, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

The 10/11/2021 10:56, Lukasz Majewski wrote:
> Do we have _any_ plan how to fix it? BSPs for many boards are built
> with Yocto/OE nowadays...
> 
> The approach with using 'adr' (pseudo) assembler instruction to
> calculate the on-runtime addresses was working previously.
> 
> Shall we revert changes which were introduced recently (to use
> __ehdr_start)?

the reason for the change was to avoid relying on GOT[0].

i guess we can revert elf_machine_load_address on arm, but
keep the new elf_machine_dynamic that does not rely on GOT[0].

it is also useful to have the same c code across targets
for the load address. if we want to do that and support
vaddr!=0 for ehdr then i think it can be (untested):

__attribute__ ((const)) ElfW(Addr)
load_addr (void)
{
  extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility ("hidden")));
  ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
  ElfW(Word) phnum = __ehdr_start.e_phnum;
  const ElfW(Phdr) *phdr = (const void *) (addr + __ehdr_start.e_phoff);
  const ElfW(Phdr) *ph;
  assert (__ehdr_start.e_phentsize == sizeof *phdr);
  for (ph = phdr; ph < &phdr[phnum]; ++ph)
    if (ph->p_type == PT_LOAD && ph->p_offset == 0)
      {
        addr -= ph->p_vaddr;
        break;
      }
  return addr;
}

i don't have a strong preference either way:
1) fix yocto
2) partial revert
3) new way to compute the load address.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 10:18                           ` Szabolcs Nagy
@ 2021-10-11 11:47                             ` Lukasz Majewski
  2021-10-11 12:01                               ` H.J. Lu
  2021-10-11 12:48                               ` Szabolcs Nagy
  0 siblings, 2 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-11 11:47 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: H.J. Lu, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 2433 bytes --]

Hi Szabolcs,

Thank you very much for your input.

> The 10/11/2021 10:56, Lukasz Majewski wrote:
> > Do we have _any_ plan how to fix it? BSPs for many boards are built
> > with Yocto/OE nowadays...
> > 
> > The approach with using 'adr' (pseudo) assembler instruction to
> > calculate the on-runtime addresses was working previously.
> > 
> > Shall we revert changes which were introduced recently (to use
> > __ehdr_start)?  
> 
> the reason for the change was to avoid relying on GOT[0].

And why it is requested to not relying on GOT[0]?

> 
> i guess we can revert elf_machine_load_address on arm, but
> keep the new elf_machine_dynamic that does not rely on GOT[0].

This was the approach proposed by this patch.

> 
> it is also useful to have the same c code across targets
> for the load address. if we want to do that and support
> vaddr!=0 for ehdr then i think it can be (untested):
> 
> __attribute__ ((const)) ElfW(Addr)
> load_addr (void)
> {
>   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
> ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
>   ElfW(Word) phnum = __ehdr_start.e_phnum;
>   const ElfW(Phdr) *phdr = (const void *) (addr +
> __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
>   assert (__ehdr_start.e_phentsize == sizeof *phdr);
>   for (ph = phdr; ph < &phdr[phnum]; ++ph)
>     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
>       {
>         addr -= ph->p_vaddr;

The above line is a bit strange for me. Isn't the p_offset set by
prelink as well?

And most of all - why do we need to perform relocation in such very
early stage of the ld-linux-armhf.so.3 ?

>         break;
>       }
>   return addr;
> }
> 
> i don't have a strong preference either way:
> 1) fix yocto

Yocto by default uses prelinkg from the very long time. I think that it
would be very difficult to change that.

(moreover, IMHO running prelink on any binary and glibc is a valid use
case)

> 2) partial revert

Ok.

> 3) new way to compute the load address.

This would require some input from developers deeply involved in ELF
loader development. (Florian IIRC).

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 11:47                             ` Lukasz Majewski
@ 2021-10-11 12:01                               ` H.J. Lu
  2021-10-11 13:10                                 ` Lukasz Majewski
  2021-10-11 13:34                                 ` Adhemerval Zanella
  2021-10-11 12:48                               ` Szabolcs Nagy
  1 sibling, 2 replies; 62+ messages in thread
From: H.J. Lu @ 2021-10-11 12:01 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Szabolcs Nagy, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

On Mon, Oct 11, 2021 at 4:47 AM Lukasz Majewski <lukma@denx.de> wrote:
>
> Hi Szabolcs,
>
> Thank you very much for your input.
>
> > The 10/11/2021 10:56, Lukasz Majewski wrote:
> > > Do we have _any_ plan how to fix it? BSPs for many boards are built
> > > with Yocto/OE nowadays...
> > >
> > > The approach with using 'adr' (pseudo) assembler instruction to
> > > calculate the on-runtime addresses was working previously.
> > >
> > > Shall we revert changes which were introduced recently (to use
> > > __ehdr_start)?
> >
> > the reason for the change was to avoid relying on GOT[0].
>
> And why it is requested to not relying on GOT[0]?
>
> >
> > i guess we can revert elf_machine_load_address on arm, but
> > keep the new elf_machine_dynamic that does not rely on GOT[0].
>
> This was the approach proposed by this patch.
>
> >
> > it is also useful to have the same c code across targets
> > for the load address. if we want to do that and support
> > vaddr!=0 for ehdr then i think it can be (untested):
> >
> > __attribute__ ((const)) ElfW(Addr)
> > load_addr (void)
> > {
> >   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
> > ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
> >   ElfW(Word) phnum = __ehdr_start.e_phnum;
> >   const ElfW(Phdr) *phdr = (const void *) (addr +
> > __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
> >   assert (__ehdr_start.e_phentsize == sizeof *phdr);
> >   for (ph = phdr; ph < &phdr[phnum]; ++ph)
> >     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
> >       {
> >         addr -= ph->p_vaddr;
>
> The above line is a bit strange for me. Isn't the p_offset set by
> prelink as well?
>
> And most of all - why do we need to perform relocation in such very
> early stage of the ld-linux-armhf.so.3 ?
>
> >         break;
> >       }
> >   return addr;
> > }
> >
> > i don't have a strong preference either way:
> > 1) fix yocto
>
> Yocto by default uses prelinkg from the very long time. I think that it
> would be very difficult to change that.

Just don't run prelink on ld.so since the load address is controlled
by kernel.

> (moreover, IMHO running prelink on any binary and glibc is a valid use
> case)
>
> > 2) partial revert
>
> Ok.
>
> > 3) new way to compute the load address.
>
> This would require some input from developers deeply involved in ELF
> loader development. (Florian IIRC).
>
> Best regards,
>
> Lukasz Majewski
>
> --
>
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de



-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 11:47                             ` Lukasz Majewski
  2021-10-11 12:01                               ` H.J. Lu
@ 2021-10-11 12:48                               ` Szabolcs Nagy
  1 sibling, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-11 12:48 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: H.J. Lu, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

The 10/11/2021 13:47, Lukasz Majewski wrote:
> Hi Szabolcs,
> 
> Thank you very much for your input.
> 
> > The 10/11/2021 10:56, Lukasz Majewski wrote:
> > > Do we have _any_ plan how to fix it? BSPs for many boards are built
> > > with Yocto/OE nowadays...
> > > 
> > > The approach with using 'adr' (pseudo) assembler instruction to
> > > calculate the on-runtime addresses was working previously.
> > > 
> > > Shall we revert changes which were introduced recently (to use
> > > __ehdr_start)?  
> > 
> > the reason for the change was to avoid relying on GOT[0].
> 
> And why it is requested to not relying on GOT[0]?

to support lld.

> > 
> > i guess we can revert elf_machine_load_address on arm, but
> > keep the new elf_machine_dynamic that does not rely on GOT[0].
> 
> This was the approach proposed by this patch.

i see, then i think only the commit message needs
update to explain the ld.so prelink issue.

> > 
> > it is also useful to have the same c code across targets
> > for the load address. if we want to do that and support
> > vaddr!=0 for ehdr then i think it can be (untested):
> > 
> > __attribute__ ((const)) ElfW(Addr)
> > load_addr (void)
> > {
> >   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
> > ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
> >   ElfW(Word) phnum = __ehdr_start.e_phnum;
> >   const ElfW(Phdr) *phdr = (const void *) (addr +
> > __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
> >   assert (__ehdr_start.e_phentsize == sizeof *phdr);
> >   for (ph = phdr; ph < &phdr[phnum]; ++ph)
> >     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
> >       {
> >         addr -= ph->p_vaddr;
> 
> The above line is a bit strange for me. Isn't the p_offset set by
> prelink as well?
> 
> And most of all - why do we need to perform relocation in such very
> early stage of the ld-linux-armhf.so.3 ?

ld.so itself relies on dynamic relocs, so it has to
self relocate (as early as possible). (one ugliness
of the old arm load address computation is that it
relies on running before self relocation so the GOT
entry of _dl_start still contains the link time addr.
on other targets the computation still works later.)

the p_offset is the offset in the file, prelinking
does not change that and the elf header starts at
offset==0 which is normally part of a load segment:

$ readelf -lW BROKEN_ld-linux-armhf.so.3

Elf file type is DYN (Shared object file)
Entry point 0x41000a00
There are 7 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  EXIDX          0x02540c 0x4102540c 0x4102540c 0x000c8 0x000c8 R   0x4
  LOAD           0x000000 0x41000000 0x41000000 0x254d4 0x254d4 R E 0x10000
  LOAD           0x026200 0x41036200 0x41036200 0x016d8 0x017c0 RW  0x10000
  DYNAMIC        0x026ef4 0x41036ef4 0x41036ef4 0x000c8 0x000c8 RW  0x4
  NOTE           0x000114 0x41000114 0x41000114 0x00024 0x00024 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x026200 0x41036200 0x41036200 0x00e00 0x00e00 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     .ARM.exidx
   01     .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .rel.dyn .rel.plt .plt .text .rodata .ARM.extab .ARM.exidx
   02     .data.rel.ro .dynamic .got .data .bss
   03     .dynamic
   04     .note.gnu.build-id
   05
   06     .data.rel.ro .dynamic .got

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 12:01                               ` H.J. Lu
@ 2021-10-11 13:10                                 ` Lukasz Majewski
  2021-10-11 13:22                                   ` H.J. Lu
  2021-10-11 13:34                                 ` Adhemerval Zanella
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-11 13:10 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Szabolcs Nagy, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 3562 bytes --]

On Mon, 11 Oct 2021 05:01:23 -0700
"H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Mon, Oct 11, 2021 at 4:47 AM Lukasz Majewski <lukma@denx.de> wrote:
> >
> > Hi Szabolcs,
> >
> > Thank you very much for your input.
> >  
> > > The 10/11/2021 10:56, Lukasz Majewski wrote:  
> > > > Do we have _any_ plan how to fix it? BSPs for many boards are
> > > > built with Yocto/OE nowadays...
> > > >
> > > > The approach with using 'adr' (pseudo) assembler instruction to
> > > > calculate the on-runtime addresses was working previously.
> > > >
> > > > Shall we revert changes which were introduced recently (to use
> > > > __ehdr_start)?  
> > >
> > > the reason for the change was to avoid relying on GOT[0].  
> >
> > And why it is requested to not relying on GOT[0]?
> >  
> > >
> > > i guess we can revert elf_machine_load_address on arm, but
> > > keep the new elf_machine_dynamic that does not rely on GOT[0].  
> >
> > This was the approach proposed by this patch.
> >  
> > >
> > > it is also useful to have the same c code across targets
> > > for the load address. if we want to do that and support
> > > vaddr!=0 for ehdr then i think it can be (untested):
> > >
> > > __attribute__ ((const)) ElfW(Addr)
> > > load_addr (void)
> > > {
> > >   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
> > > ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
> > >   ElfW(Word) phnum = __ehdr_start.e_phnum;
> > >   const ElfW(Phdr) *phdr = (const void *) (addr +
> > > __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
> > >   assert (__ehdr_start.e_phentsize == sizeof *phdr);
> > >   for (ph = phdr; ph < &phdr[phnum]; ++ph)
> > >     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
> > >       {
> > >         addr -= ph->p_vaddr;  
> >
> > The above line is a bit strange for me. Isn't the p_offset set by
> > prelink as well?
> >
> > And most of all - why do we need to perform relocation in such very
> > early stage of the ld-linux-armhf.so.3 ?
> >  
> > >         break;
> > >       }
> > >   return addr;
> > > }
> > >
> > > i don't have a strong preference either way:
> > > 1) fix yocto  
> >
> > Yocto by default uses prelinkg from the very long time. I think
> > that it would be very difficult to change that.  
> 
> Just don't run prelink on ld.so since the load address is controlled
> by kernel.

prelink is used also to speed things up. In some use cases - boot time
optimization (like kiosk, or automotive) the bootup time is crucial.

IMHO, the change with __ehdr_start introduced regression and hence
shall be fixed so the previous behaviour is preserved.

> 
> > (moreover, IMHO running prelink on any binary and glibc is a valid
> > use case)
> >  
> > > 2) partial revert  
> >
> > Ok.
> >  
> > > 3) new way to compute the load address.  
> >
> > This would require some input from developers deeply involved in ELF
> > loader development. (Florian IIRC).
> >
> > Best regards,
> >
> > Lukasz Majewski
> >
> > --
> >
> > DENX Software Engineering GmbH,      Managing Director: Wolfgang
> > Denk HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell,
> > Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email:
> > lukma@denx.de  
> 
> 
> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 13:10                                 ` Lukasz Majewski
@ 2021-10-11 13:22                                   ` H.J. Lu
  2021-10-11 14:31                                     ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: H.J. Lu @ 2021-10-11 13:22 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Szabolcs Nagy, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

On Mon, Oct 11, 2021 at 6:10 AM Lukasz Majewski <lukma@denx.de> wrote:
>
> On Mon, 11 Oct 2021 05:01:23 -0700
> "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
> > On Mon, Oct 11, 2021 at 4:47 AM Lukasz Majewski <lukma@denx.de> wrote:
> > >
> > > Hi Szabolcs,
> > >
> > > Thank you very much for your input.
> > >
> > > > The 10/11/2021 10:56, Lukasz Majewski wrote:
> > > > > Do we have _any_ plan how to fix it? BSPs for many boards are
> > > > > built with Yocto/OE nowadays...
> > > > >
> > > > > The approach with using 'adr' (pseudo) assembler instruction to
> > > > > calculate the on-runtime addresses was working previously.
> > > > >
> > > > > Shall we revert changes which were introduced recently (to use
> > > > > __ehdr_start)?
> > > >
> > > > the reason for the change was to avoid relying on GOT[0].
> > >
> > > And why it is requested to not relying on GOT[0]?
> > >
> > > >
> > > > i guess we can revert elf_machine_load_address on arm, but
> > > > keep the new elf_machine_dynamic that does not rely on GOT[0].
> > >
> > > This was the approach proposed by this patch.
> > >
> > > >
> > > > it is also useful to have the same c code across targets
> > > > for the load address. if we want to do that and support
> > > > vaddr!=0 for ehdr then i think it can be (untested):
> > > >
> > > > __attribute__ ((const)) ElfW(Addr)
> > > > load_addr (void)
> > > > {
> > > >   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
> > > > ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
> > > >   ElfW(Word) phnum = __ehdr_start.e_phnum;
> > > >   const ElfW(Phdr) *phdr = (const void *) (addr +
> > > > __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
> > > >   assert (__ehdr_start.e_phentsize == sizeof *phdr);
> > > >   for (ph = phdr; ph < &phdr[phnum]; ++ph)
> > > >     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
> > > >       {
> > > >         addr -= ph->p_vaddr;
> > >
> > > The above line is a bit strange for me. Isn't the p_offset set by
> > > prelink as well?
> > >
> > > And most of all - why do we need to perform relocation in such very
> > > early stage of the ld-linux-armhf.so.3 ?
> > >
> > > >         break;
> > > >       }
> > > >   return addr;
> > > > }
> > > >
> > > > i don't have a strong preference either way:
> > > > 1) fix yocto
> > >
> > > Yocto by default uses prelinkg from the very long time. I think
> > > that it would be very difficult to change that.
> >
> > Just don't run prelink on ld.so since the load address is controlled
> > by kernel.
>
> prelink is used also to speed things up. In some use cases - boot time
> optimization (like kiosk, or automotive) the bootup time is crucial.

I am NOT asking you to remove prelink.  I am only asking not to run
prelink on ld.so.  Do you have bootup time data on ld.so with and without
prelink?

> IMHO, the change with __ehdr_start introduced regression and hence
> shall be fixed so the previous behaviour is preserved.
>

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 12:01                               ` H.J. Lu
  2021-10-11 13:10                                 ` Lukasz Majewski
@ 2021-10-11 13:34                                 ` Adhemerval Zanella
  1 sibling, 0 replies; 62+ messages in thread
From: Adhemerval Zanella @ 2021-10-11 13:34 UTC (permalink / raw)
  To: H.J. Lu, Lukasz Majewski
  Cc: Szabolcs Nagy, Florian Weimer, libc-alpha, Andreas Schwab, Joseph Myers



On 11/10/2021 09:01, H.J. Lu wrote:
> On Mon, Oct 11, 2021 at 4:47 AM Lukasz Majewski <lukma@denx.de> wrote:
>>
>> Hi Szabolcs,
>>
>> Thank you very much for your input.
>>
>>> The 10/11/2021 10:56, Lukasz Majewski wrote:
>>>> Do we have _any_ plan how to fix it? BSPs for many boards are built
>>>> with Yocto/OE nowadays...
>>>>
>>>> The approach with using 'adr' (pseudo) assembler instruction to
>>>> calculate the on-runtime addresses was working previously.
>>>>
>>>> Shall we revert changes which were introduced recently (to use
>>>> __ehdr_start)?
>>>
>>> the reason for the change was to avoid relying on GOT[0].
>>
>> And why it is requested to not relying on GOT[0]?
>>
>>>
>>> i guess we can revert elf_machine_load_address on arm, but
>>> keep the new elf_machine_dynamic that does not rely on GOT[0].
>>
>> This was the approach proposed by this patch.
>>
>>>
>>> it is also useful to have the same c code across targets
>>> for the load address. if we want to do that and support
>>> vaddr!=0 for ehdr then i think it can be (untested):
>>>
>>> __attribute__ ((const)) ElfW(Addr)
>>> load_addr (void)
>>> {
>>>   extern const ElfW(Ehdr) __ehdr_start __attribute__ ((visibility
>>> ("hidden"))); ElfW(Addr) addr = (ElfW(Addr)) &__ehdr_start;
>>>   ElfW(Word) phnum = __ehdr_start.e_phnum;
>>>   const ElfW(Phdr) *phdr = (const void *) (addr +
>>> __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
>>>   assert (__ehdr_start.e_phentsize == sizeof *phdr);
>>>   for (ph = phdr; ph < &phdr[phnum]; ++ph)
>>>     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
>>>       {
>>>         addr -= ph->p_vaddr;
>>
>> The above line is a bit strange for me. Isn't the p_offset set by
>> prelink as well?
>>
>> And most of all - why do we need to perform relocation in such very
>> early stage of the ld-linux-armhf.so.3 ?
>>
>>>         break;
>>>       }
>>>   return addr;
>>> }
>>>
>>> i don't have a strong preference either way:
>>> 1) fix yocto
>>
>> Yocto by default uses prelinkg from the very long time. I think that it
>> would be very difficult to change that.
> 
> Just don't run prelink on ld.so since the load address is controlled
> by kernel.

I think it would be better to fail early on ld.so with a meaningful
warning to avoid the possible issues.  Another solution would be
to ignore the entrypoint value set by prelink for the loader and
always use 0.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH] dl: Use "adr" assembler command to get proper load address
  2021-10-11 13:22                                   ` H.J. Lu
@ 2021-10-11 14:31                                     ` Lukasz Majewski
  0 siblings, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-11 14:31 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Szabolcs Nagy, Adhemerval Zanella, Florian Weimer, libc-alpha,
	Andreas Schwab, Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 4019 bytes --]

On Mon, 11 Oct 2021 06:22:53 -0700
"H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Mon, Oct 11, 2021 at 6:10 AM Lukasz Majewski <lukma@denx.de> wrote:
> >
> > On Mon, 11 Oct 2021 05:01:23 -0700
> > "H.J. Lu" <hjl.tools@gmail.com> wrote:
> >  
> > > On Mon, Oct 11, 2021 at 4:47 AM Lukasz Majewski <lukma@denx.de>
> > > wrote:  
> > > >
> > > > Hi Szabolcs,
> > > >
> > > > Thank you very much for your input.
> > > >  
> > > > > The 10/11/2021 10:56, Lukasz Majewski wrote:  
> > > > > > Do we have _any_ plan how to fix it? BSPs for many boards
> > > > > > are built with Yocto/OE nowadays...
> > > > > >
> > > > > > The approach with using 'adr' (pseudo) assembler
> > > > > > instruction to calculate the on-runtime addresses was
> > > > > > working previously.
> > > > > >
> > > > > > Shall we revert changes which were introduced recently (to
> > > > > > use __ehdr_start)?  
> > > > >
> > > > > the reason for the change was to avoid relying on GOT[0].  
> > > >
> > > > And why it is requested to not relying on GOT[0]?
> > > >  
> > > > >
> > > > > i guess we can revert elf_machine_load_address on arm, but
> > > > > keep the new elf_machine_dynamic that does not rely on
> > > > > GOT[0].  
> > > >
> > > > This was the approach proposed by this patch.
> > > >  
> > > > >
> > > > > it is also useful to have the same c code across targets
> > > > > for the load address. if we want to do that and support
> > > > > vaddr!=0 for ehdr then i think it can be (untested):
> > > > >
> > > > > __attribute__ ((const)) ElfW(Addr)
> > > > > load_addr (void)
> > > > > {
> > > > >   extern const ElfW(Ehdr) __ehdr_start __attribute__
> > > > > ((visibility ("hidden"))); ElfW(Addr) addr = (ElfW(Addr))
> > > > > &__ehdr_start; ElfW(Word) phnum = __ehdr_start.e_phnum;
> > > > >   const ElfW(Phdr) *phdr = (const void *) (addr +
> > > > > __ehdr_start.e_phoff); const ElfW(Phdr) *ph;
> > > > >   assert (__ehdr_start.e_phentsize == sizeof *phdr);
> > > > >   for (ph = phdr; ph < &phdr[phnum]; ++ph)
> > > > >     if (ph->p_type == PT_LOAD && ph->p_offset == 0)
> > > > >       {
> > > > >         addr -= ph->p_vaddr;  
> > > >
> > > > The above line is a bit strange for me. Isn't the p_offset set
> > > > by prelink as well?
> > > >
> > > > And most of all - why do we need to perform relocation in such
> > > > very early stage of the ld-linux-armhf.so.3 ?
> > > >  
> > > > >         break;
> > > > >       }
> > > > >   return addr;
> > > > > }
> > > > >
> > > > > i don't have a strong preference either way:
> > > > > 1) fix yocto  
> > > >
> > > > Yocto by default uses prelinkg from the very long time. I think
> > > > that it would be very difficult to change that.  
> > >
> > > Just don't run prelink on ld.so since the load address is
> > > controlled by kernel.  
> >
> > prelink is used also to speed things up. In some use cases - boot
> > time optimization (like kiosk, or automotive) the bootup time is
> > crucial.  
> 
> I am NOT asking you to remove prelink.  I am only asking not to run
> prelink on ld.so.  Do you have bootup time data on ld.so with and
> without prelink?

The prelink is run on /sbin/init binary, so the libc.so.6 and
ld-linux-armhf.so.3 are automatically affected.

To put it in other words - up till this issue everything was working
(maybe it was just working by a pure coincidence).

Now, because of change in glibc (__ehdr_start usage) you ask if
Yocto/OE can stop using prelink on certain libraries to avoid this
error.

Have I understood your comment correctly?

> 
> > IMHO, the change with __ehdr_start introduced regression and hence
> > shall be fixed so the previous behaviour is preserved.
> >  
> 




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-09-07 13:16 [PATCH] dl: Use "adr" assembler command to get proper load address Lukasz Majewski
  2021-09-07 16:49 ` Fangrui Song
@ 2021-10-15  7:54 ` Lukasz Majewski
  2021-10-15 12:09   ` Szabolcs Nagy
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-15  7:54 UTC (permalink / raw)
  To: Adhemerval Zanella, Fangrui Song, Florian Weimer
  Cc: Joseph Myers, Carlos O'Donell, Szabolcs Nagy, Andreas Schwab,
	libc-alpha, Lukasz Majewski

This change is a partial revert of commit
bca0f5cbc9257c13322b99e55235c4f21ba0bd82
"arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
of __ehdr_start linker variable to get the address of loaded program.

The elf_machine_load_address() function is declared in the
sysdeps/arm/dl-machine.h header. It is called from (very early)
_dl_start() entry point for the program. It shall return the load
address of the dynamic linker program.

With this revert the 'adr' assembler instruction is used instead of a
place holder:

arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
00000000 l       .note.gnu.build-id     00000000      __ehdr_start

which is pre-set by binutils.

The problem starts when one runs 'prelink' on the rootfs created with
for example OE/Yocto.
Then the _ehdr_start stays as 0x0, but the ELF header's sections have
different addresses - for example 0x41000000 instead of the originally
set 0x0.

This is crucial when /sbin/init is executed. Value set in __ehdr_start
symbol is not updated. This causes the program to crash very early
when ld-linux-armhf.so.3's _dl_start is executed, as calculated offset
for loader relocation is going to hit the kernel space (0xf7xxyyyy).

It looks like the correct way to obtain the _dl_start offset on ARM is
to use assembler instruction 'adr' at execution time (so the prelink
assigned offset is taken into consideration) instead of __ehdr_start.

With this patch we only modify the elf_machine_load_address() function,
as it is called very early, before the ld-linux-armhf.so.3 is performing
relocation (also its own one).

HW:
Hardware name:
	- ARM-Versatile Express (Run with QEMU)
	- Beagle Bone Black

Build Environment: OE/Yocto -> poky
SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1

Fixes: BZ #28293
---
 sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
index dfa05eee44..d6e5f1d5ec 100644
--- a/sysdeps/arm/dl-machine.h
+++ b/sysdeps/arm/dl-machine.h
@@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
 }
 
 /* Return the run-time load address of the shared object.  */
-static inline ElfW(Addr) __attribute__ ((unused))
+static inline Elf32_Addr __attribute__ ((unused))
 elf_machine_load_address (void)
 {
-  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
-  return (ElfW(Addr)) &__ehdr_start;
+  Elf32_Addr pcrel_addr;
+#ifdef SHARED
+  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
+  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
+#else
+  extern Elf32_Addr __dl_relocate_static_pie (void *)
+    asm ("_dl_relocate_static_pie") attribute_hidden;
+  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
+  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
+#endif
+#ifdef __thumb__
+  /* Clear the low bit of the function address.
+
+     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
+     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
+     which will finish during assembling.  GAS assembler before the fix for
+     PR gas/21458 was not setting the lsb but does after that.  Always do the
+     strip for both, so the code works with various combinations of glibc and
+     Binutils.  */
+  got_addr &= ~(Elf32_Addr) 1;
+  pcrel_addr &= ~(Elf32_Addr) 1;
+#endif
+  return pcrel_addr - got_addr;
 }
 
 /* Return the link-time address of _DYNAMIC.  */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15  7:54 ` [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM Lukasz Majewski
@ 2021-10-15 12:09   ` Szabolcs Nagy
  2021-10-15 12:21     ` H.J. Lu
  2021-10-15 13:59     ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-15 12:09 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Adhemerval Zanella, Fangrui Song, Florian Weimer, Joseph Myers,
	Carlos O'Donell, Andreas Schwab, libc-alpha

The 10/15/2021 09:54, Lukasz Majewski wrote:
> This change is a partial revert of commit
> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> "arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
> of __ehdr_start linker variable to get the address of loaded program.
> 
> The elf_machine_load_address() function is declared in the
> sysdeps/arm/dl-machine.h header. It is called from (very early)
> _dl_start() entry point for the program. It shall return the load
> address of the dynamic linker program.
> 
> With this revert the 'adr' assembler instruction is used instead of a
> place holder:
> 
> arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> 
> which is pre-set by binutils.
> 
> The problem starts when one runs 'prelink' on the rootfs created with
> for example OE/Yocto.
> Then the _ehdr_start stays as 0x0, but the ELF header's sections have
> different addresses - for example 0x41000000 instead of the originally
> set 0x0.
> 
> This is crucial when /sbin/init is executed. Value set in __ehdr_start
> symbol is not updated. This causes the program to crash very early
> when ld-linux-armhf.so.3's _dl_start is executed, as calculated offset
> for loader relocation is going to hit the kernel space (0xf7xxyyyy).
> 
> It looks like the correct way to obtain the _dl_start offset on ARM is
> to use assembler instruction 'adr' at execution time (so the prelink
> assigned offset is taken into consideration) instead of __ehdr_start.
> 
> With this patch we only modify the elf_machine_load_address() function,
> as it is called very early, before the ld-linux-armhf.so.3 is performing
> relocation (also its own one).

i'd use an explanation like:

__ehdr_start is a linker created symbol that points to the elf header.
The elf header is at the beginning of the elf file and normally its
virtual address is 0 in a shared library.  This means the runtime
address of __ehdr_start is the load address of the module.  However if
prelinking is applied to ld.so then all virtual addresses are moved by
an offset so the runtime address of the elf header becomes the load
address + prelink offset.  The kernel does not treat prelinked ld.so
specially so the load address is not 0, it still has to be computed,
but simply using __ehdr_start no longer gives a correct value for that.

This issue affects all targets with prelinking support, but so far we
only got reports from OE/Yocto builds for arm that has prelinked ld.so.

but i think a better fix is possible than revert:

ElfW(Addr)
elf_machine_load_address ()
{
  extern ElfW(Dyn) _DYNAMIC[] attribute_hidden;
  extern ElfW(Dyn) extern_DYNAMIC[] asm ("_DYNAMIC");

  /* Uses pc-relative address computation.  */
  ElfW(Addr) runtime_addr = (ElfW(Addr)) &_DYNAMIC;

  /* Loads an unrelocated GOT entry.  */
  ElfW(Addr) linktime_addr = (ElfW(Addr)) &extern_DYNAMIC;

  return runtime_addr - linktime_addr;
}

I expect this to work on most targets and very similar to the code
that was originally used on other targets: only a new GOT entry is
introduced instead of using GOT[0]. (that new got entry will have a
relative relocation which means there must be a dynamic section even
in a static PIE, so i expect _DYNAMIC to be defined. this also means
that it's slightly more expensive than &__ehdr_start, so it is for
targets that want to support prelinked ld.so)

The original arm code used _dl_start symbol, likely because that's
within range for the adr instruction for more efficient pc-relative
computation. But that's a function symbol that requires fixups due
to thumb interworking issues and is not available in static PIE, so
using _DYNAMIC sounds better even on arm.

> 
> HW:
> Hardware name:
> 	- ARM-Versatile Express (Run with QEMU)
> 	- Beagle Bone Black
> 
> Build Environment: OE/Yocto -> poky
> SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> 
> Fixes: BZ #28293
> ---
>  sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> index dfa05eee44..d6e5f1d5ec 100644
> --- a/sysdeps/arm/dl-machine.h
> +++ b/sysdeps/arm/dl-machine.h
> @@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr *ehdr)
>  }
>  
>  /* Return the run-time load address of the shared object.  */
> -static inline ElfW(Addr) __attribute__ ((unused))
> +static inline Elf32_Addr __attribute__ ((unused))
>  elf_machine_load_address (void)
>  {
> -  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> -  return (ElfW(Addr)) &__ehdr_start;
> +  Elf32_Addr pcrel_addr;
> +#ifdef SHARED
> +  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> +  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> +#else
> +  extern Elf32_Addr __dl_relocate_static_pie (void *)
> +    asm ("_dl_relocate_static_pie") attribute_hidden;
> +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> +  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> +#endif
> +#ifdef __thumb__
> +  /* Clear the low bit of the function address.
> +
> +     NOTE: got_addr is from GOT table whose lsb is always set by linker if it's
> +     Thumb function address.  PCREL_ADDR comes from PC-relative calculation
> +     which will finish during assembling.  GAS assembler before the fix for
> +     PR gas/21458 was not setting the lsb but does after that.  Always do the
> +     strip for both, so the code works with various combinations of glibc and
> +     Binutils.  */
> +  got_addr &= ~(Elf32_Addr) 1;
> +  pcrel_addr &= ~(Elf32_Addr) 1;
> +#endif
> +  return pcrel_addr - got_addr;
>  }
>  
>  /* Return the link-time address of _DYNAMIC.  */
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15 12:09   ` Szabolcs Nagy
@ 2021-10-15 12:21     ` H.J. Lu
  2021-10-15 12:59       ` Lukasz Majewski
  2021-10-15 13:59     ` Lukasz Majewski
  1 sibling, 1 reply; 62+ messages in thread
From: H.J. Lu @ 2021-10-15 12:21 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Lukasz Majewski, Florian Weimer, libc-alpha, Andreas Schwab,
	Joseph Myers

On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The 10/15/2021 09:54, Lukasz Majewski wrote:
> > This change is a partial revert of commit
> > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed usage
> > of __ehdr_start linker variable to get the address of loaded program.
> >
> > The elf_machine_load_address() function is declared in the
> > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > _dl_start() entry point for the program. It shall return the load
> > address of the dynamic linker program.
> >
> > With this revert the 'adr' assembler instruction is used instead of a
> > place holder:
> >
> > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> >
> > which is pre-set by binutils.
> >
> > The problem starts when one runs 'prelink' on the rootfs created with
> > for example OE/Yocto.
> > Then the _ehdr_start stays as 0x0, but the ELF header's sections have
> > different addresses - for example 0x41000000 instead of the originally
> > set 0x0.
> >
> > This is crucial when /sbin/init is executed. Value set in __ehdr_start
> > symbol is not updated. This causes the program to crash very early
> > when ld-linux-armhf.so.3's _dl_start is executed, as calculated offset
> > for loader relocation is going to hit the kernel space (0xf7xxyyyy).
> >
> > It looks like the correct way to obtain the _dl_start offset on ARM is
> > to use assembler instruction 'adr' at execution time (so the prelink
> > assigned offset is taken into consideration) instead of __ehdr_start.
> >
> > With this patch we only modify the elf_machine_load_address() function,
> > as it is called very early, before the ld-linux-armhf.so.3 is performing
> > relocation (also its own one).
>
> i'd use an explanation like:
>
> __ehdr_start is a linker created symbol that points to the elf header.
> The elf header is at the beginning of the elf file and normally its
> virtual address is 0 in a shared library.  This means the runtime
> address of __ehdr_start is the load address of the module.  However if
> prelinking is applied to ld.so then all virtual addresses are moved by
> an offset so the runtime address of the elf header becomes the load
> address + prelink offset.  The kernel does not treat prelinked ld.so
> specially so the load address is not 0, it still has to be computed,
> but simply using __ehdr_start no longer gives a correct value for that.
>
> This issue affects all targets with prelinking support, but so far we
> only got reports from OE/Yocto builds for arm that has prelinked ld.so.
>
> but i think a better fix is possible than revert:

I think either prelink should be fixed not to prelink ld.so or Yocto
should be fixed not to prelink ld.so.

-- 
H.J.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15 12:21     ` H.J. Lu
@ 2021-10-15 12:59       ` Lukasz Majewski
  2021-10-15 23:53         ` Fāng-ruì Sòng
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-15 12:59 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Szabolcs Nagy, Florian Weimer, libc-alpha, Andreas Schwab,
	Joseph Myers, Patches and discussions about the oe-core layer,
	Khem Raj

[-- Attachment #1: Type: text/plain, Size: 3850 bytes --]

On Fri, 15 Oct 2021 05:21:23 -0700
"H.J. Lu" <hjl.tools@gmail.com> wrote:

> On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > The 10/15/2021 09:54, Lukasz Majewski wrote:  
> > > This change is a partial revert of commit
> > > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > > usage of __ehdr_start linker variable to get the address of
> > > loaded program.
> > >
> > > The elf_machine_load_address() function is declared in the
> > > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > > _dl_start() entry point for the program. It shall return the load
> > > address of the dynamic linker program.
> > >
> > > With this revert the 'adr' assembler instruction is used instead
> > > of a place holder:
> > >
> > > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > >
> > > which is pre-set by binutils.
> > >
> > > The problem starts when one runs 'prelink' on the rootfs created
> > > with for example OE/Yocto.
> > > Then the _ehdr_start stays as 0x0, but the ELF header's sections
> > > have different addresses - for example 0x41000000 instead of the
> > > originally set 0x0.
> > >
> > > This is crucial when /sbin/init is executed. Value set in
> > > __ehdr_start symbol is not updated. This causes the program to
> > > crash very early when ld-linux-armhf.so.3's _dl_start is
> > > executed, as calculated offset for loader relocation is going to
> > > hit the kernel space (0xf7xxyyyy).
> > >
> > > It looks like the correct way to obtain the _dl_start offset on
> > > ARM is to use assembler instruction 'adr' at execution time (so
> > > the prelink assigned offset is taken into consideration) instead
> > > of __ehdr_start.
> > >
> > > With this patch we only modify the elf_machine_load_address()
> > > function, as it is called very early, before the
> > > ld-linux-armhf.so.3 is performing relocation (also its own one).  
> >
> > i'd use an explanation like:
> >
> > __ehdr_start is a linker created symbol that points to the elf
> > header. The elf header is at the beginning of the elf file and
> > normally its virtual address is 0 in a shared library.  This means
> > the runtime address of __ehdr_start is the load address of the
> > module.  However if prelinking is applied to ld.so then all virtual
> > addresses are moved by an offset so the runtime address of the elf
> > header becomes the load address + prelink offset.  The kernel does
> > not treat prelinked ld.so specially so the load address is not 0,
> > it still has to be computed, but simply using __ehdr_start no
> > longer gives a correct value for that.
> >
> > This issue affects all targets with prelinking support, but so far
> > we only got reports from OE/Yocto builds for arm that has prelinked
> > ld.so.
> >
> > but i think a better fix is possible than revert:  
> 
> I think either prelink should be fixed not to prelink ld.so or Yocto
> should be fixed not to prelink ld.so.
> 

Could you explain why?

Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
bug from the very beginning and it was apparent just now?

From my point of view - the original change to use __ehdr_start broke
working setups, so it is a regression and shall be fixed in glibc.

Anyway, it would be beneficial to have input from other glibc
developers how to proceed with this issue.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15 12:09   ` Szabolcs Nagy
  2021-10-15 12:21     ` H.J. Lu
@ 2021-10-15 13:59     ` Lukasz Majewski
  1 sibling, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-15 13:59 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Adhemerval Zanella, Fangrui Song, Florian Weimer, Joseph Myers,
	Carlos O'Donell, Andreas Schwab, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 7521 bytes --]

Hi Szabolcs,

> The 10/15/2021 09:54, Lukasz Majewski wrote:
> > This change is a partial revert of commit
> > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > usage of __ehdr_start linker variable to get the address of loaded
> > program.
> > 
> > The elf_machine_load_address() function is declared in the
> > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > _dl_start() entry point for the program. It shall return the load
> > address of the dynamic linker program.
> > 
> > With this revert the 'adr' assembler instruction is used instead of
> > a place holder:
> > 
> > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > 
> > which is pre-set by binutils.
> > 
> > The problem starts when one runs 'prelink' on the rootfs created
> > with for example OE/Yocto.
> > Then the _ehdr_start stays as 0x0, but the ELF header's sections
> > have different addresses - for example 0x41000000 instead of the
> > originally set 0x0.
> > 
> > This is crucial when /sbin/init is executed. Value set in
> > __ehdr_start symbol is not updated. This causes the program to
> > crash very early when ld-linux-armhf.so.3's _dl_start is executed,
> > as calculated offset for loader relocation is going to hit the
> > kernel space (0xf7xxyyyy).
> > 
> > It looks like the correct way to obtain the _dl_start offset on ARM
> > is to use assembler instruction 'adr' at execution time (so the
> > prelink assigned offset is taken into consideration) instead of
> > __ehdr_start.
> > 
> > With this patch we only modify the elf_machine_load_address()
> > function, as it is called very early, before the
> > ld-linux-armhf.so.3 is performing relocation (also its own one).  
> 
> i'd use an explanation like:
> 
> __ehdr_start is a linker created symbol that points to the elf header.
> The elf header is at the beginning of the elf file and normally its
> virtual address is 0 in a shared library.  This means the runtime
> address of __ehdr_start is the load address of the module.  However if
> prelinking is applied to ld.so then all virtual addresses are moved by
> an offset so the runtime address of the elf header becomes the load
> address + prelink offset.  The kernel does not treat prelinked ld.so
> specially so the load address is not 0, it still has to be computed,
> but simply using __ehdr_start no longer gives a correct value for
> that.
> 
> This issue affects all targets with prelinking support, but so far we
> only got reports from OE/Yocto builds for arm that has prelinked
> ld.so.
> 

Thanks for a very detailed description.

> but i think a better fix is possible than revert:
> 
> ElfW(Addr)
> elf_machine_load_address ()
> {
>   extern ElfW(Dyn) _DYNAMIC[] attribute_hidden;
>   extern ElfW(Dyn) extern_DYNAMIC[] asm ("_DYNAMIC");
> 

So the _DYNAMIC = GOT[0] (and it points into the .dynamic section)
objdump -d -j .got ld-linux-armhf.so.3

Disassembly of section .got:

41036fbc <.got+0x41000000>:
41036fbc:       41036ef4        .word   0x41036ef4

So it indeed points into the
  [16] .dynamic          DYNAMIC         41036ef4

>   /* Uses pc-relative address computation.  */
>   ElfW(Addr) runtime_addr = (ElfW(Addr)) &_DYNAMIC;

I guess that the &_DYNAMIC gives the address around which $pc runs 
(e.g. 0xb6fc9504) - this is the actual address of run program.

(A side question - is there any way to read the _DYNAMIC symbol value
directly - via e.g. readelf or objdump?)

> 
>   /* Loads an unrelocated GOT entry.  */
>   ElfW(Addr) linktime_addr = (ElfW(Addr)) &extern_DYNAMIC;
> 

This is the prelink'ed address -> 0x41036ef4 in our case?

>   return runtime_addr - linktime_addr;

And the address to which we shall relocated would be:

0xb6fc9504 - 0x41036ef4 = 0x75f92610 - which is the address to which
the ld.so (ld-linux-armhf.so.3) will re-relocate itself?

> }
> 
> I expect this to work on most targets and very similar to the code
> that was originally used on other targets: only a new GOT entry is
> introduced instead of using GOT[0].

In fact we only rely on _DYNAMIC symbol -> which points into .dynamic
section.

> (that new got entry will have a
> relative relocation which means there must be a dynamic section even
> in a static PIE, so i expect _DYNAMIC to be defined.

Ok.

> this also means
> that it's slightly more expensive than &__ehdr_start, so it is for
> targets that want to support prelinked ld.so)
> 
> The original arm code used _dl_start symbol, likely because that's
> within range for the adr instruction for more efficient pc-relative
> computation. But that's a function symbol that requires fixups due
> to thumb interworking issues and is not available in static PIE, so
> using _DYNAMIC sounds better even on arm.

+1.

> 
> > 
> > HW:
> > Hardware name:
> > 	- ARM-Versatile Express (Run with QEMU)
> > 	- Beagle Bone Black
> > 
> > Build Environment: OE/Yocto -> poky
> > SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
> > 
> > Fixes: BZ #28293
> > ---
> >  sysdeps/arm/dl-machine.h | 28 +++++++++++++++++++++++++---
> >  1 file changed, 25 insertions(+), 3 deletions(-)
> > 
> > diff --git a/sysdeps/arm/dl-machine.h b/sysdeps/arm/dl-machine.h
> > index dfa05eee44..d6e5f1d5ec 100644
> > --- a/sysdeps/arm/dl-machine.h
> > +++ b/sysdeps/arm/dl-machine.h
> > @@ -39,11 +39,33 @@ elf_machine_matches_host (const Elf32_Ehdr
> > *ehdr) }
> >  
> >  /* Return the run-time load address of the shared object.  */
> > -static inline ElfW(Addr) __attribute__ ((unused))
> > +static inline Elf32_Addr __attribute__ ((unused))
> >  elf_machine_load_address (void)
> >  {
> > -  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> > -  return (ElfW(Addr)) &__ehdr_start;
> > +  Elf32_Addr pcrel_addr;
> > +#ifdef SHARED
> > +  extern Elf32_Addr __dl_start (void *) asm ("_dl_start");
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_start;
> > +  asm ("adr %0, _dl_start" : "=r" (pcrel_addr));
> > +#else
> > +  extern Elf32_Addr __dl_relocate_static_pie (void *)
> > +    asm ("_dl_relocate_static_pie") attribute_hidden;
> > +  Elf32_Addr got_addr = (Elf32_Addr) &__dl_relocate_static_pie;
> > +  asm ("adr %0, _dl_relocate_static_pie" : "=r" (pcrel_addr));
> > +#endif
> > +#ifdef __thumb__
> > +  /* Clear the low bit of the function address.
> > +
> > +     NOTE: got_addr is from GOT table whose lsb is always set by
> > linker if it's
> > +     Thumb function address.  PCREL_ADDR comes from PC-relative
> > calculation
> > +     which will finish during assembling.  GAS assembler before
> > the fix for
> > +     PR gas/21458 was not setting the lsb but does after that.
> > Always do the
> > +     strip for both, so the code works with various combinations
> > of glibc and
> > +     Binutils.  */
> > +  got_addr &= ~(Elf32_Addr) 1;
> > +  pcrel_addr &= ~(Elf32_Addr) 1;
> > +#endif
> > +  return pcrel_addr - got_addr;
> >  }
> >  
> >  /* Return the link-time address of _DYNAMIC.  */
> > -- 
> > 2.20.1
> >   




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15 12:59       ` Lukasz Majewski
@ 2021-10-15 23:53         ` Fāng-ruì Sòng
  2021-10-18 11:08           ` Szabolcs Nagy
  0 siblings, 1 reply; 62+ messages in thread
From: Fāng-ruì Sòng @ 2021-10-15 23:53 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: H.J. Lu, Florian Weimer, libc-alpha, Szabolcs Nagy,
	Patches and discussions about the oe-core layer, Andreas Schwab,
	Joseph Myers

On Fri, Oct 15, 2021 at 6:00 AM Lukasz Majewski <lukma@denx.de> wrote:
>
> On Fri, 15 Oct 2021 05:21:23 -0700
> "H.J. Lu" <hjl.tools@gmail.com> wrote:
>
> > On Fri, Oct 15, 2021 at 5:09 AM Szabolcs Nagy via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > >
> > > The 10/15/2021 09:54, Lukasz Majewski wrote:
> > > > This change is a partial revert of commit
> > > > bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> > > > "arm: Simplify elf_machine_{load_address,dynamic}" which imposed
> > > > usage of __ehdr_start linker variable to get the address of
> > > > loaded program.
> > > >
> > > > The elf_machine_load_address() function is declared in the
> > > > sysdeps/arm/dl-machine.h header. It is called from (very early)
> > > > _dl_start() entry point for the program. It shall return the load
> > > > address of the dynamic linker program.
> > > >
> > > > With this revert the 'adr' assembler instruction is used instead
> > > > of a place holder:
> > > >
> > > > arm-poky-linux-gnueabi-objdump -t ld-linux-armhf.so.3 | grep ehdr
> > > > 00000000 l       .note.gnu.build-id     00000000      __ehdr_start
> > > >
> > > > which is pre-set by binutils.
> > > >
> > > > The problem starts when one runs 'prelink' on the rootfs created
> > > > with for example OE/Yocto.
> > > > Then the _ehdr_start stays as 0x0, but the ELF header's sections
> > > > have different addresses - for example 0x41000000 instead of the
> > > > originally set 0x0.
> > > >
> > > > This is crucial when /sbin/init is executed. Value set in
> > > > __ehdr_start symbol is not updated. This causes the program to
> > > > crash very early when ld-linux-armhf.so.3's _dl_start is
> > > > executed, as calculated offset for loader relocation is going to
> > > > hit the kernel space (0xf7xxyyyy).
> > > >
> > > > It looks like the correct way to obtain the _dl_start offset on
> > > > ARM is to use assembler instruction 'adr' at execution time (so
> > > > the prelink assigned offset is taken into consideration) instead
> > > > of __ehdr_start.
> > > >
> > > > With this patch we only modify the elf_machine_load_address()
> > > > function, as it is called very early, before the
> > > > ld-linux-armhf.so.3 is performing relocation (also its own one).
> > >
> > > i'd use an explanation like:
> > >
> > > __ehdr_start is a linker created symbol that points to the elf
> > > header. The elf header is at the beginning of the elf file and
> > > normally its virtual address is 0 in a shared library.  This means
> > > the runtime address of __ehdr_start is the load address of the
> > > module.  However if prelinking is applied to ld.so then all virtual
> > > addresses are moved by an offset so the runtime address of the elf
> > > header becomes the load address + prelink offset.  The kernel does
> > > not treat prelinked ld.so specially so the load address is not 0,
> > > it still has to be computed, but simply using __ehdr_start no
> > > longer gives a correct value for that.
> > >
> > > This issue affects all targets with prelinking support, but so far
> > > we only got reports from OE/Yocto builds for arm that has prelinked
> > > ld.so.
> > >
> > > but i think a better fix is possible than revert:
> >
> > I think either prelink should be fixed not to prelink ld.so or Yocto
> > should be fixed not to prelink ld.so.
> >
>
> Could you explain why?
>
> Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
> bug from the very beginning and it was apparent just now?

Prelinking improves application relocation performance but prelinking
ld.so itself doesn't provide any saving.
It is very likely that the prelink program doesn't intend to prelink
ld.so. It just doesn't provide a diagnostic.
If we look at the problem from this angle, prelinking ld.so is a pilot
error: OE/Yocto used an unsupported thing which happened to work in
the past.
Now, the unsupported (well, it can be supported if prelink correctly
prelinks ld.so) thing fails.
I sent the original commit trying to untangle the messy arm code.
Although Szabolcs's version is still short, I'd prefer we don't work
around glibc for error/prelink errors.

> From my point of view - the original change to use __ehdr_start broke
> working setups, so it is a regression and shall be fixed in glibc.
>
> Anyway, it would be beneficial to have input from other glibc
> developers how to proceed with this issue.
>
>
> Best regards,
>
> Lukasz Majewski
>
> --
>
> DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-15 23:53         ` Fāng-ruì Sòng
@ 2021-10-18 11:08           ` Szabolcs Nagy
  2021-10-18 11:35             ` Florian Weimer
  0 siblings, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-18 11:08 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Lukasz Majewski, H.J. Lu, Florian Weimer, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab,
	Joseph Myers

The 10/15/2021 16:53, Fāng-ruì Sòng wrote:
> On Fri, Oct 15, 2021 at 6:00 AM Lukasz Majewski <lukma@denx.de> wrote:
> > On Fri, 15 Oct 2021 05:21:23 -0700
> > "H.J. Lu" <hjl.tools@gmail.com> wrote:
> > > I think either prelink should be fixed not to prelink ld.so or Yocto
> > > should be fixed not to prelink ld.so.
> >
> > Could you explain why?
> >
> > Was the relocation of ld.so (I guess that ld.so = ld-linux-arm.so) a
> > bug from the very beginning and it was apparent just now?
> 
> Prelinking improves application relocation performance but prelinking
> ld.so itself doesn't provide any saving.
> It is very likely that the prelink program doesn't intend to prelink
> ld.so. It just doesn't provide a diagnostic.
> If we look at the problem from this angle, prelinking ld.so is a pilot
> error: OE/Yocto used an unsupported thing which happened to work in
> the past.
> Now, the unsupported (well, it can be supported if prelink correctly
> prelinks ld.so) thing fails.
> I sent the original commit trying to untangle the messy arm code.
> Although Szabolcs's version is still short, I'd prefer we don't work
> around glibc for error/prelink errors.

i don't know much about pelinking, but i'd expect that ld.so
has to be prelinked for it to work:

if the kernel can load ld.so anywhere it will conflict with
other libraries that prelinking allocated to a fixed location.

instead ld.so has to be prelinked to an offset that comes after
all other prelinked libraries in the system, then the kernel
will place it after all other libraries at runtime.

i don't have a prelinked system to check if this is the case.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-18 11:08           ` Szabolcs Nagy
@ 2021-10-18 11:35             ` Florian Weimer
  2021-10-19 12:03               ` Lukasz Majewski
  2021-10-25 10:18               ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Florian Weimer @ 2021-10-18 11:35 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Fāng-ruì Sòng, Lukasz Majewski, H.J. Lu,
	libc-alpha, Patches and discussions about the oe-core layer,
	Andreas Schwab, Joseph Myers

* Szabolcs Nagy:

> i don't know much about pelinking, but i'd expect that ld.so
> has to be prelinked for it to work:
>
> if the kernel can load ld.so anywhere it will conflict with
> other libraries that prelinking allocated to a fixed location.

I think ld.so can back out prelinking if it detects any conflicts.
(ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
prelinking.)

> instead ld.so has to be prelinked to an offset that comes after
> all other prelinked libraries in the system, then the kernel
> will place it after all other libraries at runtime.
>
> i don't have a prelinked system to check if this is the case.

I tried on glibc 2.12-based system with prelink enabled and got this:

# ldd /bin/bash
	linux-vdso.so.1 =>  (0x00007ffc7e798000)
	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
# ldd /bin/bash
	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
# eu-readelf -d /lib64/ld-linux-x86-64.so.2

Dynamic segment contains 25 entries:
 Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5] '.dynstr'
  Type              Value
  SONAME            Library soname: [ld-linux-x86-64.so.2]
  HASH              0x0000003da70001f0
  GNU_HASH          0x0000003da70002a8
  STRTAB            0x0000003da7000608
  SYMTAB            0x0000003da7000380
  STRSZ             380 (bytes)
  SYMENT            24 (bytes)
  PLTGOT            0x0000003da7220f80
  PLTRELSZ          144 (bytes)
  PLTREL            RELA
  JMPREL            0x0000003da7000a30
  RELA              0x0000003da7000868
  RELASZ            456 (bytes)
  RELAENT           24 (bytes)
  VERDEF            0x0000003da70007c0
  VERDEFNUM         5
  BIND_NOW          
  FLAGS_1           NOW
  VERSYM            0x0000003da7000784
  RELACOUNT         16
  CHECKSUM          0x00000000e90e92bc
  GNU_PRELINKED     0x00000000616d5a26
  NULL              
  NULL              
  NULL
# 

As expected based on the previous discussion here, the kernel maps ld.so
at random addresses even though it has been prelinked.

This looks like another place where ASLR layout as to be tweaked
carefully to avoid obscure failure modes.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-18 11:35             ` Florian Weimer
@ 2021-10-19 12:03               ` Lukasz Majewski
  2021-10-25 10:18               ` Lukasz Majewski
  1 sibling, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-19 12:03 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Szabolcs Nagy, Fāng-ruì Sòng, H.J. Lu, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab,
	Joseph Myers

[-- Attachment #1: Type: text/plain, Size: 3149 bytes --]

Hi Florian,

> * Szabolcs Nagy:
> 
> > i don't know much about pelinking, but i'd expect that ld.so
> > has to be prelinked for it to work:
> >
> > if the kernel can load ld.so anywhere it will conflict with
> > other libraries that prelinking allocated to a fixed location.  
> 
> I think ld.so can back out prelinking if it detects any conflicts.
> (ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
> prelinking.)
> 
> > instead ld.so has to be prelinked to an offset that comes after
> > all other prelinked libraries in the system, then the kernel
> > will place it after all other libraries at runtime.
> >
> > i don't have a prelinked system to check if this is the case.  
> 
> I tried on glibc 2.12-based system with prelink enabled and got this:
> 
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffc7e798000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
> # eu-readelf -d /lib64/ld-linux-x86-64.so.2
> 
> Dynamic segment contains 25 entries:
>  Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5]
> '.dynstr' Type              Value
>   SONAME            Library soname: [ld-linux-x86-64.so.2]
>   HASH              0x0000003da70001f0
>   GNU_HASH          0x0000003da70002a8
>   STRTAB            0x0000003da7000608
>   SYMTAB            0x0000003da7000380
>   STRSZ             380 (bytes)
>   SYMENT            24 (bytes)
>   PLTGOT            0x0000003da7220f80
>   PLTRELSZ          144 (bytes)
>   PLTREL            RELA
>   JMPREL            0x0000003da7000a30
>   RELA              0x0000003da7000868
>   RELASZ            456 (bytes)
>   RELAENT           24 (bytes)
>   VERDEF            0x0000003da70007c0
>   VERDEFNUM         5
>   BIND_NOW          
>   FLAGS_1           NOW
>   VERSYM            0x0000003da7000784
>   RELACOUNT         16
>   CHECKSUM          0x00000000e90e92bc
>   GNU_PRELINKED     0x00000000616d5a26
>   NULL              
>   NULL              
>   NULL
> # 
> 
> As expected based on the previous discussion here, the kernel maps
> ld.so at random addresses even though it has been prelinked.
> 
> This looks like another place where ASLR layout as to be tweaked
> carefully to avoid obscure failure modes.

Is the approach proposed by Szabolcs acceptable for 32 bit ARM? Or
shall we look for other way to proceed with this issue?

> 
> Thanks,
> Florian
> 


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-18 11:35             ` Florian Weimer
  2021-10-19 12:03               ` Lukasz Majewski
@ 2021-10-25 10:18               ` Lukasz Majewski
  2021-10-25 10:25                 ` Florian Weimer
  1 sibling, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-25 10:18 UTC (permalink / raw)
  To: Florian Weimer, Szabolcs Nagy
  Cc: Fāng-ruì Sòng, H.J. Lu, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab,
	Joseph Myers, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 3087 bytes --]

Dear Community,

> * Szabolcs Nagy:
> 
> > i don't know much about pelinking, but i'd expect that ld.so
> > has to be prelinked for it to work:
> >
> > if the kernel can load ld.so anywhere it will conflict with
> > other libraries that prelinking allocated to a fixed location.  
> 
> I think ld.so can back out prelinking if it detects any conflicts.
> (ld.so doesn't use MAP_FIXED for the initial ET_DYN mapping even when
> prelinking.)
> 
> > instead ld.so has to be prelinked to an offset that comes after
> > all other prelinked libraries in the system, then the kernel
> > will place it after all other libraries at runtime.
> >
> > i don't have a prelinked system to check if this is the case.  
> 
> I tried on glibc 2.12-based system with prelink enabled and got this:
> 
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffc7e798000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007f8dc919c000)
> # ldd /bin/bash
> 	linux-vdso.so.1 =>  (0x00007ffef3bf4000)
> 	libtinfo.so.5 => /lib64/libtinfo.so.5 (0x0000003da9800000)
> 	libdl.so.2 => /lib64/libdl.so.2 (0x0000003da7400000)
> 	libc.so.6 => /lib64/libc.so.6 (0x0000003da7800000)
> 	/lib64/ld-linux-x86-64.so.2 (0x00007ff9e66a6000)
> # eu-readelf -d /lib64/ld-linux-x86-64.so.2
> 
> Dynamic segment contains 25 entries:
>  Addr: 0x0000003da7220df0  Offset: 0x020df0  Link to section: [ 5]
> '.dynstr' Type              Value
>   SONAME            Library soname: [ld-linux-x86-64.so.2]
>   HASH              0x0000003da70001f0
>   GNU_HASH          0x0000003da70002a8
>   STRTAB            0x0000003da7000608
>   SYMTAB            0x0000003da7000380
>   STRSZ             380 (bytes)
>   SYMENT            24 (bytes)
>   PLTGOT            0x0000003da7220f80
>   PLTRELSZ          144 (bytes)
>   PLTREL            RELA
>   JMPREL            0x0000003da7000a30
>   RELA              0x0000003da7000868
>   RELASZ            456 (bytes)
>   RELAENT           24 (bytes)
>   VERDEF            0x0000003da70007c0
>   VERDEFNUM         5
>   BIND_NOW          
>   FLAGS_1           NOW
>   VERSYM            0x0000003da7000784
>   RELACOUNT         16
>   CHECKSUM          0x00000000e90e92bc
>   GNU_PRELINKED     0x00000000616d5a26
>   NULL              
>   NULL              
>   NULL
> # 
> 
> As expected based on the previous discussion here, the kernel maps
> ld.so at random addresses even though it has been prelinked.
> 
> This looks like another place where ASLR layout as to be tweaked
> carefully to avoid obscure failure modes.
> 

Do we have any idea on how to move forward with this issue?

> Thanks,
> Florian
> 

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 10:18               ` Lukasz Majewski
@ 2021-10-25 10:25                 ` Florian Weimer
  2021-10-25 10:53                   ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Florian Weimer @ 2021-10-25 10:25 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Szabolcs Nagy, Fāng-ruì Sòng, H.J. Lu, libc-alpha

* Lukasz Majewski:

> Do we have any idea on how to move forward with this issue?

Either fix the prelink tool not to prelink shared objects that do not
have a dependency on libc.so.6, or fix the dynamic loader to work if
prelinked on AArch64.  I do not have a strong opinion.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 10:25                 ` Florian Weimer
@ 2021-10-25 10:53                   ` Lukasz Majewski
  2021-10-25 13:34                     ` Szabolcs Nagy
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-25 10:53 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Szabolcs Nagy, Fāng-ruì Sòng, H.J. Lu, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab,
	Joseph Myers, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 761 bytes --]

Hi Florian,

> * Lukasz Majewski:
> 
> > Do we have any idea on how to move forward with this issue?  
> 
> Either fix the prelink tool not to prelink shared objects that do not
> have a dependency on libc.so.6, or fix the dynamic loader to work if
> prelinked on AArch64.

Just for the correctness - both 64 and 32 bit ARMs are affected.

> I do not have a strong opinion.

Thanks for your opinion. Let's wait for other community members
opinions.

> 
> Thanks,
> Florian
> 



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 10:53                   ` Lukasz Majewski
@ 2021-10-25 13:34                     ` Szabolcs Nagy
  2021-10-25 14:04                       ` Lukasz Majewski
  0 siblings, 1 reply; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-25 13:34 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Florian Weimer, Fāng-ruì Sòng, H.J. Lu,
	libc-alpha, Patches and discussions about the oe-core layer,
	Andreas Schwab, Joseph Myers, Carlos O'Donell

The 10/25/2021 12:53, Lukasz Majewski wrote:
> Hi Florian,
> 
> > * Lukasz Majewski:
> > 
> > > Do we have any idea on how to move forward with this issue?  
> > 
> > Either fix the prelink tool not to prelink shared objects that do not
> > have a dependency on libc.so.6, or fix the dynamic loader to work if
> > prelinked on AArch64.
> 
> Just for the correctness - both 64 and 32 bit ARMs are affected.

last time i looked, prelinking did not support tlsdesc
correctly so it is unusable for aarch64.

does yocto/oe use prelinking on aarch64?

> 
> > I do not have a strong opinion.
> 
> Thanks for your opinion. Let's wait for other community members
> opinions.

i think fixing the arm load address computation makes
sense (small extra cost of a relative reloc). i think
the c code proposal i made in the thread is nicer than
the old asm.

(i'm happy to make the same change on aarch64 too if
prelinking is used there, but i think that's broken.)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 13:34                     ` Szabolcs Nagy
@ 2021-10-25 14:04                       ` Lukasz Majewski
  2021-10-25 15:09                         ` Szabolcs Nagy
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-25 14:04 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, Fāng-ruì Sòng, H.J. Lu,
	libc-alpha, Patches and discussions about the oe-core layer,
	Andreas Schwab, Joseph Myers, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 1447 bytes --]

Hi Szabolcs,

> The 10/25/2021 12:53, Lukasz Majewski wrote:
> > Hi Florian,
> >   
> > > * Lukasz Majewski:
> > >   
> > > > Do we have any idea on how to move forward with this issue?    
> > > 
> > > Either fix the prelink tool not to prelink shared objects that do
> > > not have a dependency on libc.so.6, or fix the dynamic loader to
> > > work if prelinked on AArch64.  
> > 
> > Just for the correctness - both 64 and 32 bit ARMs are affected.  
> 
> last time i looked, prelinking did not support tlsdesc
> correctly so it is unusable for aarch64.
> 
> does yocto/oe use prelinking on aarch64?

I think yes - the 
USER_CLASSES ?= "buildstats image-prelink"

is added by default to local.conf

> 
> >   
> > > I do not have a strong opinion.  
> > 
> > Thanks for your opinion. Let's wait for other community members
> > opinions.  
> 
> i think fixing the arm load address computation makes
> sense (small extra cost of a relative reloc). i think
> the c code proposal i made in the thread is nicer than
> the old asm.
> 
> (i'm happy to make the same change on aarch64 too if
> prelinking is used there, but i think that's broken.)

+1

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 14:04                       ` Lukasz Majewski
@ 2021-10-25 15:09                         ` Szabolcs Nagy
  2021-10-25 17:26                           ` Joseph Myers
  2021-10-25 18:25                           ` Lukasz Majewski
  0 siblings, 2 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-25 15:09 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Florian Weimer, Fāng-ruì Sòng, H.J. Lu,
	libc-alpha, Patches and discussions about the oe-core layer,
	Andreas Schwab, Joseph Myers, Carlos O'Donell

The 10/25/2021 16:04, Lukasz Majewski wrote:
> > > > Either fix the prelink tool not to prelink shared objects that do
> > > > not have a dependency on libc.so.6, or fix the dynamic loader to
> > > > work if prelinked on AArch64.  
> > > 
> > > Just for the correctness - both 64 and 32 bit ARMs are affected.  
> > 
> > last time i looked, prelinking did not support tlsdesc
> > correctly so it is unusable for aarch64.
> > 
> > does yocto/oe use prelinking on aarch64?
> 
> I think yes - the 
> USER_CLASSES ?= "buildstats image-prelink"
> 
> is added by default to local.conf

ok, i think we need the patches upstream for that like
https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html

> > > > I do not have a strong opinion.  
> > > 
> > > Thanks for your opinion. Let's wait for other community members
> > > opinions.  
> > 
> > i think fixing the arm load address computation makes
> > sense (small extra cost of a relative reloc). i think
> > the c code proposal i made in the thread is nicer than
> > the old asm.
> > 
> > (i'm happy to make the same change on aarch64 too if
> > prelinking is used there, but i think that's broken.)
> 
> +1

since you have a prelink setup, can you prepare the
arm and aarch64 patches?

(i suspect x86 would need the same fix, but probably
prelink is not used there anymore..?)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 15:09                         ` Szabolcs Nagy
@ 2021-10-25 17:26                           ` Joseph Myers
  2021-10-26 13:52                             ` Lukasz Majewski
  2021-10-25 18:25                           ` Lukasz Majewski
  1 sibling, 1 reply; 62+ messages in thread
From: Joseph Myers @ 2021-10-25 17:26 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Lukasz Majewski, Florian Weimer, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab

On Mon, 25 Oct 2021, Szabolcs Nagy via Libc-alpha wrote:

> ok, i think we need the patches upstream for that like
> https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html

The AArch64 prelink support isn't in the upstream Yocto cross-prelink, and 
the version written by Samsung in 2015 and on the cross_prelink_aarch64 
branch has various problems resulting in test failures, in my experience.

I sent patches (on top of a merge of the upstream cross_prelink and 
cross_prelink_aarch64 branches) to the maintainer in May 2020 (the Yocto 
project mailing list doesn't accept email from non-subscribers, so won't 
have seen those patches), which made it work well enough to get clean 
prelink test results, but so far they haven't been committed to the 
upstream cross_prelink branch (or any other upstream branch).  I've now 
made those available at https://github.com/jsm28/prelink 
(cross_prelink_aarch64_fixes branch).

Note however that, like the original patches from Samsung, this version 
does indeed depend on a hack in _dl_tlsdesc_undefweak to work with TLS 
descriptors.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 15:09                         ` Szabolcs Nagy
  2021-10-25 17:26                           ` Joseph Myers
@ 2021-10-25 18:25                           ` Lukasz Majewski
  1 sibling, 0 replies; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-25 18:25 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Florian Weimer, Fāng-ruì Sòng, H.J. Lu,
	libc-alpha, Patches and discussions about the oe-core layer,
	Andreas Schwab, Joseph Myers, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

Hi Szabolcs,

> The 10/25/2021 16:04, Lukasz Majewski wrote:
> > > > > Either fix the prelink tool not to prelink shared objects
> > > > > that do not have a dependency on libc.so.6, or fix the
> > > > > dynamic loader to work if prelinked on AArch64.    
> > > > 
> > > > Just for the correctness - both 64 and 32 bit ARMs are
> > > > affected.    
> > > 
> > > last time i looked, prelinking did not support tlsdesc
> > > correctly so it is unusable for aarch64.
> > > 
> > > does yocto/oe use prelinking on aarch64?  
> > 
> > I think yes - the 
> > USER_CLASSES ?= "buildstats image-prelink"
> > 
> > is added by default to local.conf  
> 
> ok, i think we need the patches upstream for that like
> https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html
> 

Oh... I see.

> > > > > I do not have a strong opinion.    
> > > > 
> > > > Thanks for your opinion. Let's wait for other community members
> > > > opinions.    
> > > 
> > > i think fixing the arm load address computation makes
> > > sense (small extra cost of a relative reloc). i think
> > > the c code proposal i made in the thread is nicer than
> > > the old asm.
> > > 
> > > (i'm happy to make the same change on aarch64 too if
> > > prelinking is used there, but i think that's broken.)  
> > 
> > +1  
> 
> since you have a prelink setup, can you prepare the
> arm and aarch64 patches?
> 

I can prepare the patch - no problem.

Beforehand, I would like to hear from the community if we do have a
consensus about this solution...

> (i suspect x86 would need the same fix, but probably
> prelink is not used there anymore..?)

I do assume that in yocto at least it would use prelink by default as
well.

Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-25 17:26                           ` Joseph Myers
@ 2021-10-26 13:52                             ` Lukasz Majewski
  2021-10-26 20:55                               ` Joseph Myers
  0 siblings, 1 reply; 62+ messages in thread
From: Lukasz Majewski @ 2021-10-26 13:52 UTC (permalink / raw)
  To: Joseph Myers, Szabolcs Nagy
  Cc: Florian Weimer, libc-alpha,
	Patches and discussions about the oe-core layer, Andreas Schwab

[-- Attachment #1: Type: text/plain, Size: 1970 bytes --]

Hi Joseph, Szabolcs 

> On Mon, 25 Oct 2021, Szabolcs Nagy via Libc-alpha wrote:
> 
> > ok, i think we need the patches upstream for that like
> > https://sourceware.org/pipermail/libc-alpha/2015-November/066153.html
> >  
> 
> The AArch64 prelink support isn't in the upstream Yocto
> cross-prelink, and the version written by Samsung in 2015 and on the
> cross_prelink_aarch64 branch has various problems resulting in test
> failures, in my experience.

Ok.

> 
> I sent patches (on top of a merge of the upstream cross_prelink and 
> cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> Yocto project mailing list doesn't accept email from non-subscribers,
> so won't have seen those patches), which made it work well enough to
> get clean prelink test results, but so far they haven't been
> committed to the upstream cross_prelink branch (or any other upstream
> branch).  I've now made those available at
> https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> 

So this branch shall be pulled by yocto's cross-prelink maintainer.
Without it the cross-prelink doesn't support aarch64?

> Note however that, like the original patches from Samsung, this
> version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> with TLS descriptors.
> 

To properly solve this issue we shall:

1. For arm
- Fix the cross-prelink (no patches available)

or

- Fix glibc (as proposed by Szabolcs)

2. For aarch64

- Try to upstream patches from Joseph to OE/Yocto's cross-prelink

or

- Fix glibc (if required)

or

- Do nothing (aarch64 will not be prelinked in OE/Yocto, which means
  that it will work correctly)


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-26 13:52                             ` Lukasz Majewski
@ 2021-10-26 20:55                               ` Joseph Myers
  2021-10-27  9:38                                 ` Szabolcs Nagy
  0 siblings, 1 reply; 62+ messages in thread
From: Joseph Myers @ 2021-10-26 20:55 UTC (permalink / raw)
  To: Lukasz Majewski
  Cc: Szabolcs Nagy, Florian Weimer, Andreas Schwab, libc-alpha,
	Patches and discussions about the oe-core layer

On Tue, 26 Oct 2021, Lukasz Majewski wrote:

> > I sent patches (on top of a merge of the upstream cross_prelink and 
> > cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> > Yocto project mailing list doesn't accept email from non-subscribers,
> > so won't have seen those patches), which made it work well enough to
> > get clean prelink test results, but so far they haven't been
> > committed to the upstream cross_prelink branch (or any other upstream
> > branch).  I've now made those available at
> > https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> 
> So this branch shall be pulled by yocto's cross-prelink maintainer.

It should be.

> Without it the cross-prelink doesn't support aarch64?

Correct.

With that branch, basic AArch64 support is there, but the following still 
applies:

> > Note however that, like the original patches from Samsung, this
> > version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> > with TLS descriptors.

(And given that hack, test results with my branch should be clear on 
AArch64.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM
  2021-10-26 20:55                               ` Joseph Myers
@ 2021-10-27  9:38                                 ` Szabolcs Nagy
  0 siblings, 0 replies; 62+ messages in thread
From: Szabolcs Nagy @ 2021-10-27  9:38 UTC (permalink / raw)
  To: Joseph Myers
  Cc: Lukasz Majewski, Florian Weimer, Andreas Schwab, libc-alpha,
	Patches and discussions about the oe-core layer

The 10/26/2021 20:55, Joseph Myers wrote:
> On Tue, 26 Oct 2021, Lukasz Majewski wrote:
> 
> > > I sent patches (on top of a merge of the upstream cross_prelink and 
> > > cross_prelink_aarch64 branches) to the maintainer in May 2020 (the
> > > Yocto project mailing list doesn't accept email from non-subscribers,
> > > so won't have seen those patches), which made it work well enough to
> > > get clean prelink test results, but so far they haven't been
> > > committed to the upstream cross_prelink branch (or any other upstream
> > > branch).  I've now made those available at
> > > https://github.com/jsm28/prelink (cross_prelink_aarch64_fixes branch).
> > 
> > So this branch shall be pulled by yocto's cross-prelink maintainer.
> 
> It should be.
> 
> > Without it the cross-prelink doesn't support aarch64?
> 
> Correct.
> 
> With that branch, basic AArch64 support is there, but the following still 
> applies:
> 
> > > Note however that, like the original patches from Samsung, this
> > > version does indeed depend on a hack in _dl_tlsdesc_undefweak to work
> > > with TLS descriptors.
> 
> (And given that hack, test results with my branch should be clear on 
> AArch64.)

i think the undefweak hack is acceptable.
(in the non-prelinked case that path is rare)

and then i'm happy to take the load address change
to support prelinked ld.so.


^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2021-10-27  9:38 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-07 13:16 [PATCH] dl: Use "adr" assembler command to get proper load address Lukasz Majewski
2021-09-07 16:49 ` Fangrui Song
2021-09-07 17:32   ` Lukasz Majewski
2021-09-07 17:44     ` Fangrui Song
2021-09-08 15:05       ` Lukasz Majewski
2021-09-08 17:41         ` Fāng-ruì Sòng
2021-09-08 19:19         ` Adhemerval Zanella
2021-09-08 20:34           ` Lukasz Majewski
2021-09-09  7:18             ` Lukasz Majewski
2021-09-09  9:49               ` Lukasz Majewski
2021-09-10 10:10                 ` Lukasz Majewski
2021-09-17  8:29                   ` Lukasz Majewski
2021-09-17 13:27                     ` Joseph Myers
2021-09-17 16:17                       ` Andreas Schwab
2021-09-26 19:58                       ` Lukasz Majewski
2021-09-27 16:00                         ` Joseph Myers
2021-10-05  7:45       ` Lukasz Majewski
2021-10-06  7:57         ` Fangrui Song
2021-10-06  9:03           ` Lukasz Majewski
2021-10-06 11:43             ` Lukasz Majewski
2021-10-06 12:55               ` Szabolcs Nagy
2021-10-07  9:19                 ` Lukasz Majewski
2021-10-07 10:00                   ` Lukasz Majewski
2021-10-07 14:15                     ` Szabolcs Nagy
2021-10-07 14:58                       ` Lukasz Majewski
2021-10-07 14:16                     ` Adhemerval Zanella
2021-10-07 14:29                       ` H.J. Lu
2021-10-07 15:57                         ` Szabolcs Nagy
2021-10-07 16:22                           ` H.J. Lu
2021-10-07 16:53                             ` Adhemerval Zanella
2021-10-07 17:05                               ` H.J. Lu
2021-10-07 17:24                               ` Fāng-ruì Sòng
2021-10-08  9:15                                 ` Szabolcs Nagy
2021-10-11  8:56                         ` Lukasz Majewski
2021-10-11 10:18                           ` Szabolcs Nagy
2021-10-11 11:47                             ` Lukasz Majewski
2021-10-11 12:01                               ` H.J. Lu
2021-10-11 13:10                                 ` Lukasz Majewski
2021-10-11 13:22                                   ` H.J. Lu
2021-10-11 14:31                                     ` Lukasz Majewski
2021-10-11 13:34                                 ` Adhemerval Zanella
2021-10-11 12:48                               ` Szabolcs Nagy
2021-10-15  7:54 ` [PATCH v2] dl: Use "adr" assembler command to get proper load address on ARM Lukasz Majewski
2021-10-15 12:09   ` Szabolcs Nagy
2021-10-15 12:21     ` H.J. Lu
2021-10-15 12:59       ` Lukasz Majewski
2021-10-15 23:53         ` Fāng-ruì Sòng
2021-10-18 11:08           ` Szabolcs Nagy
2021-10-18 11:35             ` Florian Weimer
2021-10-19 12:03               ` Lukasz Majewski
2021-10-25 10:18               ` Lukasz Majewski
2021-10-25 10:25                 ` Florian Weimer
2021-10-25 10:53                   ` Lukasz Majewski
2021-10-25 13:34                     ` Szabolcs Nagy
2021-10-25 14:04                       ` Lukasz Majewski
2021-10-25 15:09                         ` Szabolcs Nagy
2021-10-25 17:26                           ` Joseph Myers
2021-10-26 13:52                             ` Lukasz Majewski
2021-10-26 20:55                               ` Joseph Myers
2021-10-27  9:38                                 ` Szabolcs Nagy
2021-10-25 18:25                           ` Lukasz Majewski
2021-10-15 13:59     ` Lukasz Majewski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).