public inbox for libc-alpha@sourceware.org
* global pointer gets overwritten with dlopen(3) on RISC-V
       [not found] <CGME20230512142114eucas1p112d969a89ad2480a0c10a532bd6d8440@eucas1p1.samsung.com>
@ 2023-05-12 14:21 ` Lukasz Stelmach
  2023-05-12 15:13   ` Florian Weimer
  2023-05-12 19:50   ` Fangrui Song
  0 siblings, 2 replies; 19+ messages in thread
From: Lukasz Stelmach @ 2023-05-12 14:21 UTC (permalink / raw)
  To: libc-alpha
  Cc: schwab, maskray, fweimer, palmer, adhemerval.zanella, joseph,
	binutils, Marek Pikula, Marek Szyprowski, Karol Lewandowski


Hi,

We've encountered an issue of a program misbehaving due to its gp value
being overwritten. Let me present our setup and the exact sequence of
events.

We've got a program (the testee) written in C that we test with another
one (a testing harness, the tester) written in C++ with gtest. So far,
so good. To make the testing and inspection of the internal state of the
testee easier the tester does not start the testee as a separate process
but loads it with dlopen(3) and calls the testee's main() function. Data
structures of the testee get initialised but the main() exits (as
desired) due to some unmet requirements. But this is fine. The code of
the testee remains usable and the tester starts calling it function by
function.

Alas, this is the point where things go south. What is worse, they do so
in a semi-random fashion. We've seen several different behaviours; they
were consistent between runs but sometimes changed after recompilation.
Long story short, both the tester and the testee were compiled and linked
with linker relaxation turned on. Each chunk of code assumed a different
value of the gp register, of course.

What happens — step by step:

1. The tester starts and sets its gp value in _start (see sysdeps/riscv/start.S)

2. The tester loads the testee with dlopen(3)

3. The dlopen(3) calls load_gp via preinit_array (see sysdeps/riscv/start.S)

4. The testee's code works fine, because the gp register holds the value
   loaded by the testee's load_gp.

5. The tester's code fails in many curious ways (e.g. stdio doesn't work,
   functions other than the intended ones get called because of
   overwritten GOT entries, etc.). Even in situations when the tester
   didn't fail before the end of its main(), it always caught SIGSEGV in
   __do_global_dtors_aux().
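
The failure in step 5 can be pictured with a toy model. This is only a
sketch of the arithmetic, not loader or linker code: the function names
and the addresses used in the note below are made up for illustration.
A relaxed access computes its address as gp plus an offset fixed at link
time, so the same instruction lands somewhere else once gp holds another
module's value.

```c
#include <stdint.h>

/* Toy model (hypothetical names): a gp-relative access computes
   address = gp + offset, where offset was fixed when the module was linked. */
uintptr_t gp_relative_address(uintptr_t gp, int32_t offset)
{
    return gp + (uintptr_t)(intptr_t)offset;
}

/* Distance between the address the tester's instruction was meant to hit
   and the address it actually hits once gp holds the testee's value. */
intptr_t misdirection(uintptr_t gp_tester, uintptr_t gp_testee, int32_t offset)
{
    return (intptr_t)(gp_relative_address(gp_testee, offset)
                      - gp_relative_address(gp_tester, offset));
}
```

With made-up values such as gp_tester = 0x12800 and gp_testee = 0x3f800,
every relaxed access in the tester is off by the difference of the two gp
values, which would be consistent with failures that stay the same between
runs and change after a relink.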

Our fix was to link the tester with the -mno-relax option set. And it
worked. However, it took us a few days to understand all the details, and
we think something needs to be done to avoid the confusing semi-random
failure mode, even though we recognise our use case is somewhat unusual.

Possible general solutions:

1. Make -mno-relax the default for ld(1) (on Linux?). We have no
benchmarks whatsoever, but global variables aren't very popular in
application code these days and the gp register allows access to a
single memory page (4kB) only. No big deal really.

2. Make dlopen(3) (or any appropriate piece of code deep down in glibc)
recognise the situation where the gp has been set and may be overwritten, and
report an error. Neither overwriting the gp nor loading a binary without
setting it (e.g. by removing load_gp from preinit_array; why is it there in
the first place?) would give us working code.

The above solutions aren't mutually exclusive and implementing both of
them seems like a good idea.

Are there any other ways to avoid misbehaviour when a process dlopens an
executable binary and calls its code?

Kind regards,
-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics



* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 14:21 ` global pointer gets overwritten with dlopen(3) on RISC-V Lukasz Stelmach
@ 2023-05-12 15:13   ` Florian Weimer
  2023-05-15 13:47     ` Palmer Dabbelt
  2023-05-12 19:50   ` Fangrui Song
  1 sibling, 1 reply; 19+ messages in thread
From: Florian Weimer @ 2023-05-12 15:13 UTC (permalink / raw)
  To: Lukasz Stelmach via Libc-alpha
  Cc: Lukasz Stelmach, schwab, maskray, fweimer, palmer,
	adhemerval.zanella, joseph, binutils, Marek Pikula,
	Marek Szyprowski, Karol Lewandowski

* Lukasz Stelmach via Libc-alpha:

> We've got a program (the testee) written in C that we test with another
> one (a testing harness, the tester) written in C++ with gtest. So far,
> so good. To make the testing and inspection of the internal state of the
> testee easier the tester does not start the testee as a separate process
> but loads it with dlopen(3) and calls the testee's main() function.

We specifically disallow this in current glibc because it does not
work in general—unless the application is really loadable as a shared
object (compiled as PIC and linked with -shared).
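
A minimal sketch of the supported arrangement, assuming the code under
test is additionally built as a real shared object (for example with
something like "cc -fPIC -shared") and that the function of interest has
the signature int f(void). The helper name, library path and symbol name
are hypothetical.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature of the function under test. */
typedef int (*testee_fn)(void);

/* Hypothetical helper: load a shared-object build of the testee, call one
   function by name and store its return value in *result.
   Returns 0 on success, -1 if loading or symbol lookup fails. */
int call_testee(const char *lib_path, const char *symbol, int *result)
{
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;                  /* dlerror() describes the failure */

    testee_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);  /* cast idiom from dlsym(3) */
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn();
    dlclose(handle);
    return 0;
}
```

A harness would call it as call_testee("./libtestee.so", "testee_init", &rc),
where both strings are placeholders.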


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 14:21 ` global pointer gets overwritten with dlopen(3) on RISC-V Lukasz Stelmach
  2023-05-12 15:13   ` Florian Weimer
@ 2023-05-12 19:50   ` Fangrui Song
  2023-05-12 20:11     ` Florian Weimer
  1 sibling, 1 reply; 19+ messages in thread
From: Fangrui Song @ 2023-05-12 19:50 UTC (permalink / raw)
  To: Lukasz Stelmach
  Cc: libc-alpha, schwab, fweimer, palmer, adhemerval.zanella, joseph,
	binutils, Marek Pikula, Marek Szyprowski, Karol Lewandowski

On 2023-05-12, Lukasz Stelmach wrote:
>Hi,
>
>We've encountered an issue of a program misbehaving due to its gp value
>being overwritten. Let me present our setup and the exact sequence of
>events.
>
>We've got a program (the testee) written in C that we test with another
>one (a testing harness, the tester) written in C++ with gtest. So far,
>so good. To make the testing and inspection of the internal state of the
>testee easier the tester does not start the testee as a separate process
>but loads it with dlopen(3) and calls the testee's main() function.

As Florian replied, dlopening an executable is disallowed.
dlopen will give an error (dlerror()):
"cannot dynamically load executable" or
"cannot dynamically load position-independent executable".
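
A harness can surface that diagnostic itself. The following is only a
sketch with a hypothetical helper name; it copies whatever dlerror()
reports into a caller-supplied buffer, and the exact text depends on the
glibc version and on the file being loaded.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: try to dlopen `path`.  On success the handle is
   closed again and 0 is returned with an empty message.  On failure the
   dlerror() text is copied into `msg` and -1 is returned. */
int probe_dlopen(const char *path, char *msg, size_t msg_size)
{
    dlerror();                              /* clear any stale error state */
    void *handle = dlopen(path, RTLD_LAZY);
    if (handle != NULL) {
        dlclose(handle);
        if (msg_size > 0)
            msg[0] = '\0';
        return 0;
    }

    const char *err = dlerror();
    snprintf(msg, msg_size, "%s", err != NULL ? err : "unknown dlopen error");
    return -1;
}
```

Checking the return value before calling into the loaded code is what
turns a silent misbehaviour into an explicit test failure.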

>
>Data
>structures of the testee get initialised but the main() exits (as
>desired) due to some unmet requirements. But this is fine. The code of
>the testee remains usable and the tester starts calling it function by
>function.
>
>Alas, this is the point where things go south. What is worse, they do so
>in a semi-random fashion. We've seen several different behaviours; they
>were consistent between runs but sometimes changed after recompilation.
>Long story short, both the tester and the testee were compiled and linked
>with linker relaxation turned on. Each chunk of code assumed a different
>value of the gp register, of course.
>
>What happens — step by step:
>
>1. The tester starts and sets its gp value in _start (see sysdeps/riscv/start.S)
>
>2. The tester loads the testee with dlopen(3)
>
>3. The dlopen(3) calls load_gp via preinit_array (see sysdeps/riscv/start.S)
>
>4. The testee's code works fine, because the gp register holds the value
>   loaded by the testee's load_gp.
>
>5. The tester's code fails in many curious ways (e.g. stdio doesn't work,
>   functions other than the intended ones get called because of
>   overwritten GOT entries, etc.). Even in situations when the tester
>   didn't fail before the end of its main(), it always caught SIGSEGV in
>   __do_global_dtors_aux().
>
>Our fix was to link the tester with the -mno-relax option set. And it
>worked. However, it took us a few days to understand all the details, and
>we think something needs to be done to avoid the confusing semi-random
>failure mode, even though we recognise our use case is somewhat unusual.
>
>Possible general solutions:
>
>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>benchmarks whatsoever, but global variables aren't very popular in
>application code these days and the gp register allows access to a
>single memory page (4kB) only. No big deal really.

I do agree that --no-relax-gp is a sensible default choice for GNU ld.
https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation

Perhaps you can start a separate topic on binutils? :)

According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
global pointer relaxation saves at best 0.5% size (I guess that refers
to .text. If we count all allocable sections, the percentage is likely
even smaller.)

I understand that global pointer relaxation may be more useful for
certain embedded use cases, but its saving for many other scenarios is
probably not significant enough, and using gp (x3) for platform
specific purposes
(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/371)
may provide a larger benefit.


>2. Make dlopen(3) (or any appropriate piece of code deep down in glibc)
>recognise the situation where the gp has been set and may be overwritten, and
>report an error. Neither overwriting the gp nor loading a binary without
>setting it (e.g. by removing load_gp from preinit_array; why is it there in
>the first place?) would give us working code.
>
>The above solutions aren't mutually exclusive and implementing both of
>them seems like a good idea.
>
>Are there any other ways to avoid misbehaviour when a process dlopens an
>executable binary and calls its code?
>
>Kind regards,
>--
>Łukasz Stelmach
>Samsung R&D Institute Poland
>Samsung Electronics


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 19:50   ` Fangrui Song
@ 2023-05-12 20:11     ` Florian Weimer
  2023-05-12 20:33       ` Palmer Dabbelt
  2023-05-12 20:35       ` Jeff Law
  0 siblings, 2 replies; 19+ messages in thread
From: Florian Weimer @ 2023-05-12 20:11 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Lukasz Stelmach, libc-alpha, schwab, palmer, adhemerval.zanella,
	joseph, binutils, Marek Pikula, Marek Szyprowski,
	Karol Lewandowski

* Fangrui Song:

>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>benchmarks whatsoever, but global variables aren't very popular in
>>application code these days and the gp register allows access to a
>>single memory page (4kB) only. No big deal really.
>
> I do agree that --no-relax-gp is a sensible default choice for GNU ld.
> https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>
> Perhaps you can start a separate topic on binutils? :)
>
> According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
> https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
> global pointer relaxation saves at best 0.5% size (I guess that refers
> to .text. If we count all allocable sections, the percentage is likely
> even smaller.)

For a mature toolchain, 0.5% in code size reduction would be *a lot*,
so I wouldn't dismiss that.

Do we have a reproducer?  Is the issue actually about gp relaxation for
the main executable?

Thanks,
Florian



* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 20:11     ` Florian Weimer
@ 2023-05-12 20:33       ` Palmer Dabbelt
  2023-05-12 21:09         ` Fangrui Song
  2023-05-12 20:35       ` Jeff Law
  1 sibling, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-12 20:33 UTC (permalink / raw)
  To: fweimer
  Cc: maskray, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk

On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
> * Fangrui Song:
>
>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>benchmarks whatsoever, but global variables aren't very popular in
>>>application code these days and the gp register allows access to a
>>>single memory page (4kB) only. No big deal really.
>>
>> I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>> https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>
>> Perhaps you can start a separate topic on binutils? :)
>>
>> According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>> https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>> global pointer relaxation saves at best 0.5% size (I guess that refers
>> to .text. If we count all allocable sections, the percentage is likely
>> even smaller.)
>
> For a mature toolchain, 0.5% in code size reduction would be *a lot*,
> so I wouldn't dismiss that.

That's broadly speaking why it sticks around.  We've got a bunch of 
headaches related to relaxation, GP or otherwise, but they improve 
performance and nobody's figured out how to replace that yet.

> Do we have a reproducer?  Is the issue actually about gp relaxation for
> the main executable?

In general we don't reference GP from shared libraries as we don't have 
a GP save/restore scheme.  There may be a bug floating around here 
somewhere, in which case we should fix it, but the original post sounds 
like it wasn't a supported use case.

> Thanks,
> Florian


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 20:11     ` Florian Weimer
  2023-05-12 20:33       ` Palmer Dabbelt
@ 2023-05-12 20:35       ` Jeff Law
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff Law @ 2023-05-12 20:35 UTC (permalink / raw)
  To: libc-alpha



On 5/12/23 14:11, Florian Weimer via Libc-alpha wrote:
> * Fangrui Song:
> 
>>> 1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>> benchmarks whatsoever, but global variables aren't very popular in
>>> application code these days and the gp register allows access to a
>>> single memory page (4kB) only. No big deal really.
>>
>> I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>> https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>
>> Perhaps you can start a separate topic on binutils? :)
>>
>> According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>> https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>> global pointer relaxation saves at best 0.5% size (I guess that refers
>> to .text. If we count all allocable sections, the percentage is likely
>> even smaller.)
> 
> For a mature toolchain, 0.5% in code size reduction would be *a lot*,
> so I wouldn't dismiss that.
However, that value is probably a little low.

There is a really interesting little bug in the interaction between how
we place gp and how we determine if an address computation is relaxable.
The net result is that for relatively small data segments we can fail to
relax address computations that we should be able to trivially cover
with gp-based addressing.

This showed up as a clearly measurable runtime performance regression in 
spec2017.  leela or deepsjeng, I can't remember which offhand.  It would 
also likely be a size issue, though I rarely look at those size issues.

Knowing that, I still question whether it is worth the effort to relax.
Others have much more strongly held positions.

Jeff




> 
> Do we have a reproducer?  Is the issue actually about gp relaxation for
> the main executable?
> 
> Thanks,
> Florian
> 


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 20:33       ` Palmer Dabbelt
@ 2023-05-12 21:09         ` Fangrui Song
  2023-05-12 21:57           ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Fangrui Song @ 2023-05-12 21:09 UTC (permalink / raw)
  To: fweimer, Palmer Dabbelt
  Cc: l.stelmach, libc-alpha, schwab, adhemerval.zanella, joseph,
	binutils, m.pikula, m.szyprowski, k.lewandowsk, Jeff Law

On 2023-05-12, Palmer Dabbelt wrote:
>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>* Fangrui Song:
>>
>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>application code these days and the gp register allows access to a
>>>>single memory page (4kB) only. No big deal really.
>>>
>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>
>>>Perhaps you can start a separate topic on binutils? :)
>>>
>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>to .text. If we count all allocable sections, the percentage is likely
>>>even smaller.)
>>
>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>so I wouldn't dismiss that.
>
>That's broadly speaking why it sticks around.  We've got a bunch of 
>headaches related to relaxation, GP or otherwise, but they improve 
>performance and nobody's figured out how to replace that yet.
>
>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>the main executable?
>
>In general we don't reference GP from shared libraries as we don't 
>have a GP save/restore scheme.  There may be a bug floating around 
>here somewhere, in which case we should fix it, but the original post 
>sounds like it wasn't a supported use case.
>
>>Thanks,
>>Florian

Global pointer relaxation only applies to data within ±2 KiB of __global_pointer$ (= .sdata + 0x800).
The area that can potentially benefit from global pointer relaxation is very small.
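
As a rough illustration of that window (a sketch with hypothetical
function names, not linker code): the gp-relative form uses a signed
12-bit immediate, so with __global_pointer$ at .sdata + 0x800 only
addresses from the start of .sdata up to .sdata + 0xfff can be reached.

```c
#include <stdbool.h>
#include <stdint.h>

/* __global_pointer$ as described above: start of .sdata plus 0x800. */
uint64_t global_pointer(uint64_t sdata_start)
{
    return sdata_start + 0x800;
}

/* A gp-relative access encodes its offset in a signed 12-bit immediate,
   so only offsets in [-2048, 2047] from gp can be expressed. */
bool gp_reachable(uint64_t sdata_start, uint64_t sym_addr)
{
    int64_t offset = (int64_t)(sym_addr - global_pointer(sdata_start));
    return offset >= -2048 && offset <= 2047;
}
```

For a made-up .sdata start of 0x10000, a symbol at 0x10fff is reachable
while one at 0x11000 is not.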

0.5% code size reduction (relative to .text?) is the best case. I
suspect the program somehow has a lot of global variable accesses and
placing these variables in .sdata helps.

I've got results from Yingwei Zheng at PLCT lab using many
configurations. The saving is like 0.1%.
https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109

On the binutils side, we occasionally see patches to fix global pointer
relaxation bugs, e.g. the patch just sent a few hours ago:
https://sourceware.org/pipermail/binutils/2023-May/127413.html

I do not know the embedded toolchain well, but for Linux desktop/server,
disabling global pointer relaxation seems like a sensible choice. If we
discover a better way to utilize GP (x3) in the future, disabling global
pointer relaxation today will result in fewer compatibility issues.

Haiku
(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 21:09         ` Fangrui Song
@ 2023-05-12 21:57           ` Palmer Dabbelt
  2023-05-12 22:34             ` Fangrui Song
  0 siblings, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-12 21:57 UTC (permalink / raw)
  To: maskray
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
> On 2023-05-12, Palmer Dabbelt wrote:
>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>* Fangrui Song:
>>>
>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>application code these days and the gp register allows access to a
>>>>>single memory page (4kB) only. No big deal really.
>>>>
>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>
>>>>Perhaps you can start a separate topic on binutils? :)
>>>>
>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>even smaller.)
>>>
>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>so I wouldn't dismiss that.
>>
>>That's broadly speaking why it sticks around.  We've got a bunch of
>>headaches related to relaxation, GP or otherwise, but they improve
>>performance and nobody's figured out how to replace that yet.
>>
>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>the main executable?
>>
>>In general we don't reference GP from shared libraries as we don't
>>have a GP save/restore scheme.  There may be a bug floating around
>>here somewhere, in which case we should fix it, but the original post
>>sounds like it wasn't a supported use case.
>>
>>>Thanks,
>>>Florian
>
> Global pointer relaxation only applies to data within ±2 KiB of __global_pointer$ (= .sdata + 0x800).
> The area that can potentially benefit from global pointer relaxation is very small.
>
> 0.5% code size reduction (relative to .text?) is the best case. I
> suspect the program somehow has a lot of global variable accesses and
> placing these variables in .sdata helps.
>
> I've got results from Yingwei Zheng at PLCT lab using many
> configurations. The saving is like 0.1%.
> https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>
> On the binutils side, we occasionally see patches to fix global pointer
> relaxation bugs, e.g. the patch just sent a few hours ago:
> https://sourceware.org/pipermail/binutils/2023-May/127413.html
>
> I do not know the embedded toolchain well, but for Linux desktop/server,
> disabling global pointer relaxation seems like a sensible choice. If we
> discover a better way to utilize GP (x3) in the future, disabling global
> pointer relaxation today will result in fewer compatibility issues.

This comes up all the time, you're just pushing for a backdoor ABI 
break.  I get the desire to remove GP, if we were to be able to redo 
things I'd also not have it, but it's in the ABI and we can't change 
the binaries that exist.

If you want a GP-free ABI then you should just go write one up.  Then 
it'll become a distro problem, and if it turns out that users also don't 
want in then the GP ABI will rot and we can eventually deprecate it.

> Haiku
> (https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
> have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 21:57           ` Palmer Dabbelt
@ 2023-05-12 22:34             ` Fangrui Song
  2023-05-12 22:47               ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Fangrui Song @ 2023-05-12 22:34 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On 2023-05-12, Palmer Dabbelt wrote:
>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>On 2023-05-12, Palmer Dabbelt wrote:
>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>* Fangrui Song:
>>>>
>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>application code these days and the gp register allows access to a
>>>>>>single memory page (4kB) only. No big deal really.
>>>>>
>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>
>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>
>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>even smaller.)
>>>>
>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>so I wouldn't dismiss that.
>>>
>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>headaches related to relaxation, GP or otherwise, but they improve
>>>performance and nobody's figured out how to replace that yet.
>>>
>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>the main executable?
>>>
>>>In general we don't reference GP from shared libraries as we don't
>>>have a GP save/restore scheme.  There may be a bug floating around
>>>here somewhere, in which case we should fix it, but the original post
>>>sounds like it wasn't a supported use case.
>>>
>>>>Thanks,
>>>>Florian
>>
>>Global pointer relaxation only applies to data within ±2 KiB of __global_pointer$ (= .sdata + 0x800).
>>The area that can potentially benefit from global pointer relaxation is very small.
>>
>>0.5% code size reduction (relative to .text?) is the best case. I
>>suspect the program somehow has a lot of global variable accesses and
>>placing these variables in .sdata helps.
>>
>>I've got results from Yingwei Zheng at PLCT lab using many
>>configurations. The saving is like 0.1%.
>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>
>>On the binutils side, we occasionally see patches to fix global pointer
>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>
>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>disabling global pointer relaxation seems like a sensible choice. If we
>>discover a better way to utilize GP (x3) in the future, disabling global
>>pointer relaxation today will result in fewer compatibility issues.
>
>This comes up all the time, you're just pushing for a backdoor ABI 
>break.  I get the desire to remove GP, if we were to be able to redo 
>things I'd also not have it, but it's in the ABI and we can't change 
>the binaries that exist.
>
>If you want a GP-free ABI then you should just go write one up.  Then 
>it'll become a distro problem, and if it turns out that users also 
>don't want in then the GP ABI will rot and we can eventually deprecate 
>it.

I am advocating for a change in GNU ld to make --no-relax-gp the default
option, but I am not sure it can be considered an ABI break.

When using ld --no-relax-gp, the conversion of code sequences to gp is
disabled, thus eliminating an assumption related to global pointer relaxation.
If an executable is relinked without global pointer relaxation, it should still
function properly.

To the best of my knowledge, there is no official documentation that designates
linker relaxation as a mandatory feature. Relaxation schemes, including global
pointer relaxation, are optional. Making an optional feature opt-in does not
constitute an ABI break.

glibc continues to initialize gp to __global_pointer$ to accommodate users who
opt-in for global pointer relaxation. Removing the initialization of gp would
indeed be an ABI break, and I have never proposed such a change.

>>Haiku
>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 22:34             ` Fangrui Song
@ 2023-05-12 22:47               ` Palmer Dabbelt
  2023-05-13  0:05                 ` Fangrui Song
  0 siblings, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-12 22:47 UTC (permalink / raw)
  To: maskray
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
> On 2023-05-12, Palmer Dabbelt wrote:
>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>* Fangrui Song:
>>>>>
>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>application code these days and the gp register allows access to a
>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>
>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>
>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>
>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>even smaller.)
>>>>>
>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>so I wouldn't dismiss that.
>>>>
>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>performance and nobody's figured out how to replace that yet.
>>>>
>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>the main executable?
>>>>
>>>>In general we don't reference GP from shared libraries as we don't
>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>here somewhere, in which case we should fix it, but the original post
>>>>sounds like it wasn't a supported use case.
>>>>
>>>>>Thanks,
>>>>>Florian
>>>
>>>Global pointer relaxation only applies to data within ±2 KiB of __global_pointer$ (= .sdata + 0x800).
>>>The area that can potentially benefit from global pointer relaxation is very small.
>>>
>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>suspect the program somehow has a lot of global variable accesses and
>>>placing these variables in .sdata helps.
>>>
>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>configurations. The saving is like 0.1%.
>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>
>>>On the binutils side, we occasionally see patches to fix global pointer
>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>
>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>pointer relaxation today will result in fewer compatibility issues.
>>
>>This comes up all the time, you're just pushing for a backdoor ABI
>>break.  I get the desire to remove GP, if we were to be able to redo
>>things I'd also not have it, but it's in the ABI and we can't change
>>the binaries that exist.
>>
>>If you want a GP-free ABI then you should just go write one up.  Then
>>it'll become a distro problem, and if it turns out that users also
>>don't want in then the GP ABI will rot and we can eventually deprecate
>>it.
>
> I am advocating for a change in GNU ld to make --no-relax-gp the default
> option, but I am not sure it can be considered an ABI break.
>
> When using ld --no-relax-gp, the conversion of code sequences to gp is
> disabled, thus eliminating an assumption related to global pointer relaxation.
> If an executable is relinked without global pointer relaxation, it should still
> function properly.
>
> To the best of my knowledge, there is no official documentation that designates
> linker relaxation as a mandatory feature. Relaxation schemes, including global
> pointer relaxation, are optional. Making an optional feature opt-in does not
> constitute an ABI break.
>
> glibc continues to initialize gp to __global_pointer$ to accommodate users who
> opt-in for global pointer relaxation. Removing the initialization of gp would
> indeed be an ABI break, and I have never proposed such a change.

The ABI break is using GP for something else, but that always ends up 
being part of the argument for disabling GP-based relaxation.  Without 
that the argument ends up just being that the GP-relaxations are too 
complicated for the benefit they provide.

I certainly understand that argument: these (and the rest of the
relaxation stuff) are a huge amount of pain for a small amount of benefit
(which seems to be mostly based on benchmark dragracing, which is never
that strong of an argument).  That said, the last time we had this
discussion was only a few months ago.  I don't think anything has
changed since then, so I doubt things will go any differently.

Just turning GP-based relaxation off by default doesn't get rid of the 
complexity, they're still in the ABI and people will still use them 
(even if just for benchmark dragracing) so they'll still need to work.


>
>>>Haiku
>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>have mentioned that they don't use global pointer relaxation.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 22:47               ` Palmer Dabbelt
@ 2023-05-13  0:05                 ` Fangrui Song
  2023-05-13  0:35                   ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Fangrui Song @ 2023-05-13  0:05 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On 2023-05-12, Palmer Dabbelt wrote:
>On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
>>On 2023-05-12, Palmer Dabbelt wrote:
>>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>>* Fangrui Song:
>>>>>>
>>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>>application code these days and the gp register allows access to a
>>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>>
>>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>>
>>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>>
>>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>>even smaller.)
>>>>>>
>>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>>so I wouldn't dismiss that.
>>>>>
>>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>>performance and nobody's figured out how to replace that yet.
>>>>>
>>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>>the main executable?
>>>>>
>>>>>In general we don't reference GP from shared libraries as we don't
>>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>>here somewhere, in which case we should fix it, but the original post
>>>>>sounds like it wasn't a supported use case.
>>>>>
>>>>>>Thanks,
>>>>>>Florian
>>>>
>>>>Global pointer relaxation only applies to +-2KiB data relative to __global_pointer$ (= .sdata + 0x800).
>>>>The area that potentially benefits global pointer relaxation is very small.
>>>>
>>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>>suspect the program somehow has a lot of global variable accesses and
>>>>placing these variables in .sdata helps.
>>>>
>>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>>configurations. The saving is like 0.1%.
>>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>>
>>>>On the binutils side, we occasionally see patches to fix global pointer
>>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>>
>>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>>pointer relaxation today will result in fewer compatibility issues.
>>>
>>>This comes up all the time, you're just pushing for a backdoor ABI
>>>break.  I get the desire to remove GP, if we were to be able to redo
>>>things I'd also not have it, but it's in the ABI and we can't change
>>>the binaries that exist.
>>>
>>>If you want a GP-free ABI then you should just go write one up.  Then
>>>it'll become a distro problem, and if it turns out that users also
>>>don't want in then the GP ABI will rot and we can eventually deprecate
>>>it.
>>
>>I am advocating for a change in GNU ld to make --no-relax-gp the default
>>option, but I am not sure it can be considered an ABI break.
>>
>>When using ld --no-relax-gp, the conversion of code sequences to gp is
>>disabled, thus eliminating an assumption related to global pointer relaxation.
>>If an executable is relinked without global pointer relaxation, it should still
>>function properly.
>>
>>To the best of my knowledge, there is no official documentation that designates
>>linker relaxation as a mandatory feature. Relaxation schemes, including global
>>pointer relaxation, are optional. Making an optional feature opt-in does not
>>constitute an ABI break.
>>
>>glibc continues to initialize gp to __global_pointer$ to accommodate users who
>>opt-in for global pointer relaxation. Removing the initialization of gp would
>>indeed be an ABI break, and I have never proposed such a change.
>
>The ABI break is using GP for something else, but that always ends up 
>being part of the argument for disabling GP-based relaxation.  Without 
>that the argument ends up just being that the GP-relaxations are too 
>complicated for the benefit they provide.

This has happened. Some platforms are investigating or using GP for
software shadow call stack. I think they include Android and Fuchsia.  I
am not affiliated with them.

>I certainly understand that argument, these (and for the rest of the 
>relaxation stuff) is a huge amount of pain for a small amount of 
>benefit (which seems to be mostly based on benchmark dragracing, which 
>is never that strong of an argument).  That said, last time we had 
>this discussion was only a few months ago.  I don't think anything has 
>changed since then so I doubt things will go any differently.

The rest of the relaxation mechanism does offer significant savings that
should not be overlooked. I have heard claims of up to a 10% .text
reduction in certain embedded systems.

However, it is important to acknowledge the costs associated with
increased toolchain complexity and bloating of debug information, LSDA,
and custom metadata sections. I discussed these concerns in detail at
https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation
(Apologies if the title came across as derogatory. It was never my
intention, and I genuinely enjoyed delving into the intricacies of
this toolchain technology.)

>Just turning GP-based relaxation off by default doesn't get rid of the 
>complexity, they're still in the ABI and people will still use them 
>(even if just for benchmark dragracing) so they'll still need to work.

Regarding the ABI story, shared objects do not utilize global pointer
relaxation. Therefore, if only an executable requires GP for a
platform-specific purpose, I believe there is no ABI risk involved. The
risk arises when shared objects rely on the executable's initialized GP
for certain purposes.
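
To make that risk concrete, here is a minimal sketch in C of the arithmetic
behind a gp-relative access, assuming the +-2KiB reach around
__global_pointer$ mentioned earlier in the thread. The helper names and the
sample addresses in the note below are hypothetical and are not taken from
any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range of a signed 12-bit immediate, as used by RISC-V loads and stores. */
#define GP_REACH_MIN (-2048)
#define GP_REACH_MAX 2047

/* Hypothetical helper: true if `addr` can be addressed as gp + imm12. */
static bool gp_reachable(uint64_t gp, uint64_t addr)
{
    int64_t off = (int64_t)(addr - gp);
    return off >= GP_REACH_MIN && off <= GP_REACH_MAX;
}

/* Hypothetical helper: the address a gp-relative access actually touches
   for whatever value gp holds at run time. */
static uint64_t gp_effective(uint64_t gp, int32_t imm12)
{
    return gp + (uint64_t)(int64_t)imm12;
}
```

For instance, with gp = 0x12800 an access at offset -16 touches 0x127f0; if
gp has since been overwritten with 0x52800, the very same instruction
touches 0x527f0 instead.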

ABI breaks related to global pointer usage occur when a platform
initially ships something using GP and then switches to a different GP
usage.

As I mentioned earlier, many non-Linux/glibc platforms have not used global
pointer relaxation from the beginning.  Even within Linux/glibc, a new
distribution may choose not to use global pointer relaxation initially.
For these platforms, there would be no ABI break.

Changing the default behavior does indeed have an impact.
--no-relax-gp is a binutils 2.41 feature, which means that projects that
discover improved ways to utilize GP in the future need to avoid
--no-relax-gp to maintain compatibility with older GNU ld versions.
They could use --no-relax with those older GNU ld versions, but that's a big hammer.
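
As a rough illustration of that trade-off, a build script could pick the
opt-out flag from the linker version. This is only a sketch: the helper
names are hypothetical, and it assumes nothing beyond what is stated above,
namely that --no-relax-gp first appears in binutils 2.41 and that
--no-relax is the coarser fallback.

```c
#include <stdbool.h>

/* Hypothetical helper: does this GNU ld release understand --no-relax-gp?
   Assumes the option first appears in binutils 2.41. */
static bool ld_has_no_relax_gp(int major, int minor)
{
    return major > 2 || (major == 2 && minor >= 41);
}

/* Hypothetical helper: the flag a project could pass to keep gp free.
   Older linkers get --no-relax, which disables every relaxation. */
static const char *gp_opt_out_flag(int major, int minor)
{
    return ld_has_no_relax_gp(major, minor) ? "--no-relax-gp" : "--no-relax";
}
```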

>>
>>>>Haiku
>>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>>have mentioned that they don't use global pointer relaxation.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-13  0:05                 ` Fangrui Song
@ 2023-05-13  0:35                   ` Palmer Dabbelt
  2023-05-16  3:56                     ` Fangrui Song
  0 siblings, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-13  0:35 UTC (permalink / raw)
  To: maskray
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On Fri, 12 May 2023 17:05:02 PDT (-0700), maskray@google.com wrote:
> On 2023-05-12, Palmer Dabbelt wrote:
>>On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>>>* Fangrui Song:
>>>>>>>
>>>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>>>application code these days and the gp register allows access to a
>>>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>>>
>>>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>>>
>>>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>>>
>>>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>>>even smaller.)
>>>>>>>
>>>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>>>so I wouldn't dismiss that.
>>>>>>
>>>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>>>performance and nobody's figured out how to replace that yet.
>>>>>>
>>>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>>>the main executable?
>>>>>>
>>>>>>In general we don't reference GP from shared libraries as we don't
>>>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>>>here somewhere, in which case we should fix it, but the original post
>>>>>>sounds like it wasn't a supported use case.
>>>>>>
>>>>>>>Thanks,
>>>>>>>Florian
>>>>>
>>>>>Global pointer relaxation only applies to +-2KiB data relative to __global_pointer$ (= .sdata + 0x800).
>>>>>The area that potentially benefits global pointer relaxation is very small.
>>>>>
>>>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>>>suspect the program somehow has a lot of global variable accesses and
>>>>>placing these variables in .sdata helps.
>>>>>
>>>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>>>configurations. The saving is like 0.1%.
>>>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>>>
>>>>>On the binutils side, we occasionally see patches to fix global pointer
>>>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>>>
>>>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>>>pointer relaxation today will result in fewer compatibility issues.
>>>>
>>>>This comes up all the time, you're just pushing for a backdoor ABI
>>>>break.  I get the desire to remove GP, if we were to be able to redo
>>>>things I'd also not have it, but it's in the ABI and we can't change
>>>>the binaries that exist.
>>>>
>>>>If you want a GP-free ABI then you should just go write one up.  Then
>>>>it'll become a distro problem, and if it turns out that users also
>>>>don't want in then the GP ABI will rot and we can eventually deprecate
>>>>it.
>>>
>>>I am advocating for a change in GNU ld to make --no-relax-gp the default
>>>option, but I am not sure it can be considered an ABI break.
>>>
>>>When using ld --no-relax-gp, the conversion of code sequences to gp is
>>>disabled, thus eliminating an assumption related to global pointer relaxation.
>>>If an executable is relinked without global pointer relaxation, it should still
>>>function properly.
>>>
>>>To the best of my knowledge, there is no official documentation that designates
>>>linker relaxation as a mandatory feature. Relaxation schemes, including global
>>>pointer relaxation, are optional. Making an optional feature opt-in does not
>>>constitute an ABI break.
>>>
>>>glibc continues to initialize gp to __global_pointer$ to accommodate users who
>>>opt-in for global pointer relaxation. Removing the initialization of gp would
>>>indeed be an ABI break, and I have never proposed such a change.
>>
>>The ABI break is using GP for something else, but that always ends up
>>being part of the argument for disabling GP-based relaxation.  Without
>>that the argument ends up just being that the GP-relaxations are too
>>complicated for the benefit they provide.
>
> This has happened. Some platforms are investigating or using GP for
> software shadow call stack. I think they include Android and Fuchsia.  I
> am not affiliated with them.

IIUC the Android shadow stack stuff isn't 100% decided, but it's the way
they're headed.  I'd argue that's a great reason to actually write this
down in an ABI spec: we've got users who either are violating or will
soon violate the spec, and we should fix the spec rather than force those
users to fork it.

>>I certainly understand that argument, these (and for the rest of the
>>relaxation stuff) is a huge amount of pain for a small amount of
>>benefit (which seems to be mostly based on benchmark dragracing, which
>>is never that strong of an argument).  That said, last time we had
>>this discussion was only a few months ago.  I don't think anything has
>>changed since then so I doubt things will go any differently.
>
> The rest of the relaxation mechanism does offer significant savings that
> should not be overlooked. I have heard claims of up to a 10% .text
> reduction in certain embedded systems.
>
> However, it is important to acknowledge the costs associated with
> increased toolchain complexity and bloating of debug information, LSDA,
> and custom metadata sections. I discussed these concerns in detail at
> https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation
> (Apologies if the title came across as derogatory. It was never my
> intention, and I genuinely enjoyed delving into the intricacies of
> this toolchain technology.)

I'd argue we should make this new ABI just forbid all relaxation.  It's
a ton of headache for an unknown amount of benefit; we can't easily
measure it because we haven't chased down all the missed optimizations
due to relaxation.  Modern Linux distros are all PIE anyway, so it's not
like most of the relaxation does much.

Pretty much everyone who looks at RISC-V says relaxation is bad, so let's
go let folks prove that out -- and if it's right we can get rid of a
pile of complexity, which is great.

>>Just turning GP-based relaxation off by default doesn't get rid of the
>>complexity, they're still in the ABI and people will still use them
>>(even if just for benchmark dragracing) so they'll still need to work.
>
> Regarding the ABI story, shared objects do not utilize global pointer
> relaxation. Therefore, if only an executable requires GP for a
> platform-specific purpose, I believe there is no ABI risk involved. The
> risk arises when shared objects rely on the executable's initialized GP
> for certain purposes.
>
> ABI breaks related to global pointer usage occur when a platform
> initially ships something using GP and then switches to a different GP
> usage.

Using GP for anything that's not the global pointer violates the psABI.  
It's just a spec violation and I agree it's possible to violate the 
specs while still producing working systems.  At that point it's really 
up to the projects to ensure their spec-violating systems work, there's 
not much we can do on the toolchain side of things.  I'm guilty of doing 
that too (just look at a bunch of the Linux port, for example), but
that's always a complicated way to do things.

IMO it's still better to have a spec written down so the toolchain and 
users can be on the same page.

> As I mentioned earlier, many non-Linux/glibc platforms don't use global
> pointer relaxation from the beginning.  Even within Linux/glibc, a new
> distribution may choose not to use global pointer relaxation initially.
> For these platforms, there would be no ABI break.
>
> Changing the default behavior does indeed have an impact.
> --no-relax-gp is a binutils 2.41 feature, which means that projects that
> discover improved ways to utilize GP in the future need to avoid
> --no-relax-gp to maintain compatibility with older GNU ld versions.
> They could use --no-relax with GNU ld versions, but that's a big hammer.

I would describe that as an ABI break.

>
>>>
>>>>>Haiku
>>>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>>>have mentioned that they don't use global pointer relaxation.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-12 15:13   ` Florian Weimer
@ 2023-05-15 13:47     ` Palmer Dabbelt
       [not found]       ` <CGME20230516065316eucas1p17bffcd25209bb441b9a9f4d263aa8b3c@eucas1p1.samsung.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-15 13:47 UTC (permalink / raw)
  To: fw
  Cc: libc-alpha, l.stelmach, schwab, maskray, fweimer,
	adhemerval.zanella, joseph, binutils, m.pikula, m.szyprowski,
	k.lewandowsk

On Fri, 12 May 2023 08:13:10 PDT (-0700), fw@deneb.enyo.de wrote:
> * Lukasz Stelmach via Libc-alpha:
>
>> We've got a program (the testee) written in C that we test with another
>> one (a testing harness, the tester) written in C++ with gtest. So far,
>> so good. To make the testing and inspection of the internal state of the
>> testee easier the tester does not start the testee as a separate process
>> but loads it with dlopen(3) and calls the testee's main() function.
>
> We specifically disallow this in current glibc because it does not
> work in general—unless the application is really loadable as a shared
> object (compiled as PIC and linked with -shared).

Just popping back up here, as we got lost in the ABI discussions and 
were talking about it during the glibc patchwork call: essentially it 
boils down to us needing a more concrete reproducer for the bug.

If GP is used in a shared object then we've got a bug somewhere, 
probably the linker.  There's been some debate about what things like 
position independent mean in RISC-V land, so it's entirely possible 
there's something odd floating around here.  If you can reproduce that 
it's probably a bug, but probably an LD/LLD bug.

It sounds like there are no known bugs in glibc related to loading 
executables via dlopen(), as that doesn't work for any port due to a 
host of reasons (GP is just one of them).  We might have some bug 
floating around, RISC-V specific or otherwise, though.  If you have a 
reproducer for that then we can try and sort things out.
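
A reproducer could start from a tester shaped roughly like the sketch
below. It is an assumption about how such a harness might look, not the
code from the original report: the helper name and the idea of looking up
"main" by name are hypothetical, and only the standard dlopen(), dlsym(),
dlerror() and dlclose() calls are used.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

typedef int (*main_fn)(int, char **);

/* Hypothetical tester helper: load `path` and look up its main().
   On failure, prints the loader's diagnostic and returns NULL without
   touching *handle_out. */
static main_fn load_testee_main(const char *path, void **handle_out)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    /* POSIX allows converting the void * from dlsym to a function pointer. */
    main_fn fn = (main_fn)dlsym(handle, "main");
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return fn;
}
```

Printing the dlerror() text matters here: if the loader refuses the testee
outright, that message says so, which separates "unsupported use case" from
a genuine gp bug.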

Thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-13  0:35                   ` Palmer Dabbelt
@ 2023-05-16  3:56                     ` Fangrui Song
  2023-05-16 22:51                       ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Fangrui Song @ 2023-05-16  3:56 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On 2023-05-12, Palmer Dabbelt wrote:
>On Fri, 12 May 2023 17:05:02 PDT (-0700), maskray@google.com wrote:
>>On 2023-05-12, Palmer Dabbelt wrote:
>>>On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>>>>* Fangrui Song:
>>>>>>>>
>>>>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>>>>application code these days and the gp register allows access to a
>>>>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>>>>
>>>>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>>>>
>>>>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>>>>
>>>>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>>>>even smaller.)
>>>>>>>>
>>>>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>>>>so I wouldn't dismiss that.
>>>>>>>
>>>>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>>>>performance and nobody's figured out how to replace that yet.
>>>>>>>
>>>>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>>>>the main executable?
>>>>>>>
>>>>>>>In general we don't reference GP from shared libraries as we don't
>>>>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>>>>here somewhere, in which case we should fix it, but the original post
>>>>>>>sounds like it wasn't a supported use case.
>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Florian
>>>>>>
>>>>>>Global pointer relaxation only applies to +-2KiB data relative to __global_pointer$ (= .sdata + 0x800).
>>>>>>The area that potentially benefits global pointer relaxation is very small.
>>>>>>
>>>>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>>>>suspect the program somehow has a lot of global variable accesses and
>>>>>>placing these variables in .sdata helps.
>>>>>>
>>>>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>>>>configurations. The saving is like 0.1%.
>>>>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>>>>
>>>>>>On the binutils side, we occasionally see patches to fix global pointer
>>>>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>>>>
>>>>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>>>>pointer relaxation today will result in fewer compatibility issues.
>>>>>
>>>>>This comes up all the time, you're just pushing for a backdoor ABI
>>>>>break.  I get the desire to remove GP, if we were to be able to redo
>>>>>things I'd also not have it, but it's in the ABI and we can't change
>>>>>the binaries that exist.
>>>>>
>>>>>If you want a GP-free ABI then you should just go write one up.  Then
>>>>>it'll become a distro problem, and if it turns out that users also
>>>>>don't want in then the GP ABI will rot and we can eventually deprecate
>>>>>it.
>>>>
>>>>I am advocating for a change in GNU ld to make --no-relax-gp the default
>>>>option, but I am not sure it can be considered an ABI break.
>>>>
>>>>When using ld --no-relax-gp, the conversion of code sequences to gp is
>>>>disabled, thus eliminating an assumption related to global pointer relaxation.
>>>>If an executable is relinked without global pointer relaxation, it should still
>>>>function properly.
>>>>
>>>>To the best of my knowledge, there is no official documentation that designates
>>>>linker relaxation as a mandatory feature. Relaxation schemes, including global
>>>>pointer relaxation, are optional. Making an optional feature opt-in does not
>>>>constitute an ABI break.
>>>>
>>>>glibc continues to initialize gp to __global_pointer$ to accommodate users who
>>>>opt-in for global pointer relaxation. Removing the initialization of gp would
>>>>indeed be an ABI break, and I have never proposed such a change.
>>>
>>>The ABI break is using GP for something else, but that always ends up
>>>being part of the argument for disabling GP-based relaxation.  Without
>>>that the argument ends up just being that the GP-relaxations are too
>>>complicated for the benefit they provide.
>>
>>This has happened. Some platforms are investigating or using GP for
>>software shadow call stack. I think they include Android and Fuchsia.  I
>>am not affiliated with them.
>
>IIUC the Android shadow stack stuff isn't 100% decided, but it's the 
>way they're headed.  I'd argue that's a great reason to actually write 
>this down in an ABI spec: we've got users who either are or will soon 
>violate the spec, we should fix the spec rather than force those users 
>to fork it.
>
>>>I certainly understand that argument, these (and for the rest of the
>>>relaxation stuff) is a huge amount of pain for a small amount of
>>>benefit (which seems to be mostly based on benchmark dragracing, which
>>>is never that strong of an argument).  That said, last time we had
>>>this discussion was only a few months ago.  I don't think anything has
>>>changed since then so I doubt things will go any differently.
>>
>>The rest of the relaxation mechanism does offer significant savings that
>>should not be overlooked. I have heard claims of up to a 10% .text
>>reduction in certain embedded systems.
>>
>>However, it is important to acknowledge the costs associated with
>>increased toolchain complexity and bloating of debug information, LSDA,
>>and custom metadata sections. I discussed these concerns in detail at
>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation
>>(Apologies if the title came across as derogatory. It was never my
>>intention, and I genuinely enjoyed delving into the intricacies of
>>this toolchain technology.)
>
>I'd argue we should make this new ABI just forbid all relaxation.  
>It's a ton of headache for an unknown amount of benefit, we can't 
>easily measure it because we haven't chased down all the missed 
>optimizations due to relaxation.  Modern Linux distros are all PIE 
>anyway, so it's not like most of the relaxation does much.
>
>Pretty much everyone who looks at RISC-V says relaxation is bad, let's 
>go let folks prove that out -- and if it's right we can get rid of a
>pile of complexity, which is great.

It's too late to make linker relaxation opt-in. I'll do my part to fix
some LLVM assembler deficiencies, though.

>>>Just turning GP-based relaxation off by default doesn't get rid of the
>>>complexity, they're still in the ABI and people will still use them
>>>(even if just for benchmark dragracing) so they'll still need to work.
>>
>>Regarding the ABI story, shared objects do not utilize global pointer
>>relaxation. Therefore, if only an executable requires GP for a
>>platform-specific purpose, I believe there is no ABI risk involved. The
>>risk arises when shared objects rely on the executable's initialized GP
>>for certain purposes.
>>
>>ABI breaks related to global pointer usage occur when a platform
>>initially ships something using GP and then switches to a different GP
>>usage.
>
>Using GP for anything that's not the global pointer violates the 
>psABI.

I disagree.  The psABI makes it clear that linker relaxation is optional
with wording like "With linker relaxation enabled" "the linker is
permitted to use ... relaxation".

>It's just a spec violation and I agree it's possible to 
>violate the specs while still producing working systems.  At that 
>point it's really up to the projects to ensure their spec-violating 
>systems work, there's not much we can do on the toolchain side of 
>things.  I'm guilty of doing that too (just look at a bunch of the
>Linux port, for example), but that's always a complicated way to do 
>things.

"If a platform requires use of a dedicated general-purpose register for a
platform-specific purpose, it is recommended to use gp (x3). The
platform ABI specification must document the use of this register. For
such platforms, care must be taken to ensure all code (compiler
generated or otherwise) avoids using gp in a way incompatible with the
platform specific purpose, and that global
pointer relaxation is disabled in the toolchain."

>IMO it's still better to have a spec written down so the toolchain and 
>users can be on the same page.

You can make a proposal that the global pointer must not be used for other
purposes and someone will object to it :)

>>As I mentioned earlier, many non-Linux/glibc platforms don't use global
>>pointer relaxation from the beginning.  Even within Linux/glibc, a new
>>distribution may choose not to use global pointer relaxation initially.
>>For these platforms, there would be no ABI break.
>>
>>Changing the default behavior does indeed have an impact.
>>--no-relax-gp is a binutils 2.41 feature, which means that projects that
>>discover improved ways to utilize GP in the future need to avoid
>>--no-relax-gp to maintain compatibility with older GNU ld versions.
>>They could use --no-relax with GNU ld versions, but that's a big hammer.
>
>I would describe that as an ABI break.

I have provided evidence to back up my point.
Can you explain why this is an ABI break, and what exactly it breaks?

>>
>>>>
>>>>>>Haiku
>>>>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>>>>have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
       [not found]       ` <CGME20230516065316eucas1p17bffcd25209bb441b9a9f4d263aa8b3c@eucas1p1.samsung.com>
@ 2023-05-16  6:53         ` Lukasz Stelmach
  2023-05-16  7:59           ` Szabolcs Nagy
  0 siblings, 1 reply; 19+ messages in thread
From: Lukasz Stelmach @ 2023-05-16  6:53 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: fw, libc-alpha, schwab, maskray, fweimer, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk


It was <2023-05-15 Mon 06:47>, when Palmer Dabbelt wrote:
> On Fri, 12 May 2023 08:13:10 PDT (-0700), fw@deneb.enyo.de wrote:
>> * Lukasz Stelmach via Libc-alpha:
>>
>>> We've got a program (the testee) written in C that we test with another
>>> one (a testing harness, the tester) written in C++ with gtest. So far,
>>> so good. To make the testing and inspection of the internal state of the
>>> testee easier the tester does not start the testee as a separate process
>>> but loads it with dlopen(3) and calls the testee's main() function.
>>
>> We specifically disallow this in current glibc because it does not
>> work in general—unless the application is really loadable as a shared
>> object (compiled as PIC and linked with -shared).
>
> Just popping back up here, as we got lost in the ABI discussions and
> were talking about it during the glibc patchwork call: essentially it 
> boils down to us needing a more concrete reproducer for the bug.
>
> If GP is used in a shared object then we've got a bug somewhere,

No, the file we dlopen is an executable meant to work standalone. We
dlopen it for testing, and this setup has worked for us on different
platforms (armv7l, aarch64, x86). We MAY not have encountered an error
because our glibc has been patched. I have to investigate the details, as
Florian brought it to our attention that an error should be reported.
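
For anyone checking their own binaries: whether a file is a fixed-address
executable or something at least shaped like a shared object can be read
from the e_type field of the ELF header. Below is a minimal sketch in
Python; the names elf_type and describe are made up for this illustration.
Note that ET_DYN alone does not tell a PIE apart from a real shared
library. As far as I know, linkers mark a PIE with DF_1_PIE in
DT_FLAGS_1, which this sketch does not parse.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object or position-independent executable (PIE)


def elf_type(header: bytes) -> int:
    """Return e_type, given at least the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5 of e_ident): 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    return struct.unpack_from(fmt, header, 16)[0]


def describe(header: bytes) -> str:
    """Map e_type to a short label (hypothetical helper for illustration)."""
    kind = elf_type(header)
    if kind == ET_EXEC:
        return "ET_EXEC"
    if kind == ET_DYN:
        return "ET_DYN"
    return "other"
```

Reading the first 18 bytes of the testee with open(path, "rb").read(18)
and passing them to describe() is enough to see which case applies.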

> probably the linker.  There's been some debate about what things like 
> position independent mean in RISC-V land, so it's entirely possible
> there's something odd floating around here.  If you can reproduce that 
> it's probably a bug, but probably a LD/LLD bug.
>
> It sounds like there are no known bugs in glibc related to loading
> executables via dlopen(), as that doesn't work for any port due to a 
> host of reasons (GP is just one of them).  We might have some bug
> floating around, RISC-V specific or otherwise, though.  If you have a 
> reproducer for that then we can try and sort things out.

Now that I know I should get an error I will look more closely to see
what is going on.
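
A quick way to see what a given glibc actually reports is to attempt the
load and print the loader's message. The following is only a sketch, using
Python's ctypes (whose CDLL goes through dlopen(3) on Linux) instead of
our C++ tester; dlopen_error is a hypothetical helper name and ./testee
in the note below is a placeholder path.

```python
import ctypes
from typing import Optional


def dlopen_error(path: str) -> Optional[str]:
    """Try to load `path`; return None on success, else the error text."""
    try:
        ctypes.CDLL(path)  # on Linux this ends up calling dlopen(3)
    except OSError as exc:
        # ctypes turns a failed dlopen() into OSError carrying dlerror() text.
        return str(exc)
    return None
```

Calling print(dlopen_error("./testee")) on an unpatched glibc should then
show whether the load is refused and with what message; I am deliberately
not quoting an expected message here, since it depends on the glibc
version.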

Thank you, everyone, for your comments. They will help me a lot.
-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics



* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-16  6:53         ` Lukasz Stelmach
@ 2023-05-16  7:59           ` Szabolcs Nagy
  2023-05-17  0:11             ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Szabolcs Nagy @ 2023-05-16  7:59 UTC (permalink / raw)
  To: Lukasz Stelmach, Palmer Dabbelt
  Cc: fw, libc-alpha, schwab, maskray, fweimer, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk

The 05/16/2023 08:53, Lukasz Stelmach via Binutils wrote:
> No, the file we dlopen is an executable meant to work standalone. We
> dlopen it for testing and this setup has worked for us on different
> platforms (armv7l, aarch64, x86). We MAY have not encountered an error

it is guaranteed broken on all those targets if the exe has
local exec TLS access. (initial exec TLS is broken too but
you may get lucky with that)

i think you can get into trouble with interposition, copy
relocs or canonical plts too.

but even if everything happens to work, it is just bad
design: it relies on implementation internals instead of
documented interfaces.

> because our glibc has been patched. I have to investigate the details as
> Florian brought it to our attention that an error should be reported.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-16  3:56                     ` Fangrui Song
@ 2023-05-16 22:51                       ` Palmer Dabbelt
  2023-05-16 23:21                         ` Palmer Dabbelt
  0 siblings, 1 reply; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-16 22:51 UTC (permalink / raw)
  To: maskray
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On Mon, 15 May 2023 20:56:47 PDT (-0700), maskray@google.com wrote:
> On 2023-05-12, Palmer Dabbelt wrote:
>>On Fri, 12 May 2023 17:05:02 PDT (-0700), maskray@google.com wrote:
>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>>>>>* Fangrui Song:
>>>>>>>>>
>>>>>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>>>>>application code these days and the gp register allows access to a
>>>>>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>>>>>
>>>>>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>>>>>
>>>>>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>>>>>
>>>>>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>>>>>even smaller.)
>>>>>>>>>
>>>>>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>>>>>so I wouldn't dismiss that.
>>>>>>>>
>>>>>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>>>>>performance and nobody's figured out how to replace that yet.
>>>>>>>>
>>>>>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>>>>>the main executable?
>>>>>>>>
>>>>>>>>In general we don't reference GP from shared libraries as we don't
>>>>>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>>>>>here somewhere, in which case we should fix it, but the original post
>>>>>>>>sounds like it wasn't a supported use case.
>>>>>>>>
>>>>>>>>>Thanks,
>>>>>>>>>Florian
>>>>>>>
>>>>>>>Global pointer relaxation only applies to +-2KiB data relative to __global_pointer$ (= .sdata + 0x800).
>>>>>>>The area that potentially benefits global pointer relaxation is very small.
>>>>>>>
>>>>>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>>>>>suspect the program somehow has a lot of global variable accesses and
>>>>>>>placing these variables in .sdata helps.
>>>>>>>
>>>>>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>>>>>configurations. The saving is like 0.1%.
>>>>>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>>>>>
>>>>>>>On the binutils side, we occasionally see patches to fix global pointer
>>>>>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>>>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>>>>>
>>>>>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>>>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>>>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>>>>>pointer relaxation today will result in fewer compatibility issues.
>>>>>>
>>>>>>This comes up all the time, you're just pushing for a backdoor ABI
>>>>>>break.  I get the desire to remove GP, if we were to be able to redo
>>>>>>things I'd also not have it, but it's in the ABI and we can't change
>>>>>>the binaries that exist.
>>>>>>
>>>>>>If you want a GP-free ABI then you should just go write one up.  Then
>>>>>>it'll become a distro problem, and if it turns out that users also
>>>>>>don't want it then the GP ABI will rot and we can eventually deprecate
>>>>>>it.
>>>>>
>>>>>I am advocating for a change in GNU ld to make --no-relax-gp the default
>>>>>option, but I am not sure it can be considered an ABI break.
>>>>>
>>>>>When using ld --no-relax-gp, the conversion of code sequences to gp is
>>>>>disabled, thus eliminating an assumption related to global pointer relaxation.
>>>>>If an executable is relinked without global pointer relaxation, it should still
>>>>>function properly.
>>>>>
>>>>>To the best of my knowledge, there is no official documentation that designates
>>>>>linker relaxation as a mandatory feature. Relaxation schemes, including global
>>>>>pointer relaxation, are optional. Making an optional feature opt-in does not
>>>>>constitute an ABI break.
>>>>>
>>>>>glibc continues to initialize gp to __global_pointer$ to accommodate users who
>>>>>opt-in for global pointer relaxation. Removing the initialization of gp would
>>>>>indeed be an ABI break, and I have never proposed such a change.
>>>>
>>>>The ABI break is using GP for something else, but that always ends up
>>>>being part of the argument for disabling GP-based relaxation.  Without
>>>>that the argument ends up just being that the GP-relaxations are too
>>>>complicated for the benefit they provide.
>>>
>>>This has happened. Some platforms are investigating or using GP for
>>>software shadow call stack. I think they include Android and Fuchsia.  I
>>>am not affiliated with them.
>>
>>IIUC the Android shadow stack stuff isn't 100% decided, but it's the
>>way they're headed.  I'd argue that's a great reason to actually write
>>this down in an ABI spec: we've got users who either are or will soon
>>violate the spec, we should fix the spec rather than force those users
>>to fork it.
>>
>>>>I certainly understand that argument, these (and for the rest of the
>>>>relaxation stuff) is a huge amount of pain for a small amount of
>>>>benefit (which seems to be mostly based on benchmark dragracing, which
>>>>is never that strong of an argument).  That said, last time we had
>>>>this discussion was only a few months ago.  I don't think anything has
>>>>changed since then so I doubt things will go any differently.
>>>
>>>The rest of the relaxation mechanism does offer significant savings that
>>>should not be overlooked. I have heard claims of up to a 10% .text
>>>reduction in certain embedded systems.
>>>
>>>However, it is important to acknowledge the costs associated with
>>>increased toolchain complexity and bloating of debug information, LSDA,
>>>and custom metadata sections. I discussed these concerns in detail at
>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation
>>>(Apologies if the title came across as derogatory. It was never my
>>>intention, and I genuinely enjoyed delving into the intricacies of
>>>this toolchain technology.)
>>
>>I'd argue we should make this new ABI just forbid all relaxation.
>>It's a ton of headache for an unknown amount of benefit, we can't
>>easily measure it because we haven't chased down all the missed
>>optimizations due to relaxation.  Modern Linux distros are all PIE
>>anyway, so it's not like most of the relaxation does much.
>>
>>Pretty much everyone who looks at RISC-V says relaxation is bad, let's
>>go let folks prove that out -- and if it's right we can get rid of a
>>pile of complexity, which is great.
>
> It's too late to make linker relaxation opt-in. I'll do my part to fix
> some LLVM assembler deficiency, though.
>
>>>>Just turning GP-based relaxation off by default doesn't get rid of the
>>>>complexity, they're still in the ABI and people will still use them
>>>>(even if just for benchmark dragracing) so they'll still need to work.
>>>
>>>Regarding the ABI story, shared objects do not utilize global pointer
>>>relaxation. Therefore, if only an executable requires GP for a
>>>platform-specific purpose, I believe there is no ABI risk involved. The
>>>risk arises when shared objects rely on the executable's initialized GP
>>>for certain purposes.
>>>
>>>ABI breaks related to global pointer usage occur when a platform
>>>initially ships something using GP and then switches to a different GP
>>>usage.
>>
>>Using GP for anything that's not the global pointer violates the
>>psABI.
>
> I disagree.  The psABI makes it clear that linker relaxation is optional
> with wording like "With linker relaxation enabled" "the linker is
> permitted to use ... relaxation".
>
>>It's just a spec violation and I agree it's possible to
>>violate the specs while still producing working systems.  At that
>>point it's really up to the projects to ensure their spec-violating
>>systems work, there's not much we can do on the toolchain side of
>>things.  I'm guilty of doing that too (just look at a bunch of the
>>Linux port, for example), but that's always a complicated way to do
>>things.
>
> "If a platform requires use of a dedicated general-purpose register for a
> platform-specific purpose, it is recommended to use gp (x3). The
> platform ABI specification must document the use of this register. For
> such platforms, care must be taken to ensure all code (compiler
> generated or otherwise) avoids using gp in a way incompatible with the
> platform specific purpose, and that global
> pointer relaxation is disabled in the toolchain."
>
>>IMO it's still better to have a spec written down so the toolchain and
>>users can be on the same page.
>
> You can make a proposal that global pointer must not be used for other
> purposes and someone will object to it:)

https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/379

>>>As I mentioned earlier, many non-Linux/glibc platforms don't use global
>>>pointer relaxation from the beginning.  Even within Linux/glibc, a new
>>>distribution may choose not to use global pointer relaxation initially.
>>>For these platforms, there would be no ABI break.
>>>
>>>Changing the default behavior does indeed have an impact.
>>>--no-relax-gp is a binutils 2.41 feature, which means that projects that
>>>discover improved ways to utilize GP in the future need to avoid
>>>--no-relax-gp to maintain compatibility with older GNU ld versions.
>>>They could use --no-relax with GNU ld versions, but that's a big hammer.
>>
>>I would describe that as an ABI break.
>
> I provide evidence to back up my point.
> Can you give reasons that this is an ABI break? And break what?
>
>>>
>>>>>
>>>>>>>Haiku
>>>>>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>>>>>have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-16 22:51                       ` Palmer Dabbelt
@ 2023-05-16 23:21                         ` Palmer Dabbelt
  0 siblings, 0 replies; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-16 23:21 UTC (permalink / raw)
  To: maskray
  Cc: fweimer, l.stelmach, libc-alpha, schwab, adhemerval.zanella,
	joseph, binutils, m.pikula, m.szyprowski, k.lewandowsk,
	jeffreyalaw

On Tue, 16 May 2023 15:51:50 PDT (-0700), Palmer Dabbelt wrote:
> On Mon, 15 May 2023 20:56:47 PDT (-0700), maskray@google.com wrote:
>> On 2023-05-12, Palmer Dabbelt wrote:
>>>On Fri, 12 May 2023 17:05:02 PDT (-0700), maskray@google.com wrote:
>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>On Fri, 12 May 2023 15:34:21 PDT (-0700), maskray@google.com wrote:
>>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>>On Fri, 12 May 2023 14:09:08 PDT (-0700), maskray@google.com wrote:
>>>>>>>>On 2023-05-12, Palmer Dabbelt wrote:
>>>>>>>>>On Fri, 12 May 2023 13:11:43 PDT (-0700), fweimer@redhat.com wrote:
>>>>>>>>>>* Fangrui Song:
>>>>>>>>>>
>>>>>>>>>>>>1. Make -mno-relax the default for ld(1) (on Linux?). We have no
>>>>>>>>>>>>benchmarks whatsoever, but global variables aren't very popular in
>>>>>>>>>>>>application code these days and the gp register allows access to a
>>>>>>>>>>>>single memory page (4kB) only. No big deal really.
>>>>>>>>>>>
>>>>>>>>>>>I do agree that --no-relax-gp is a sensible default choice for GNU ld.
>>>>>>>>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation#global-pointer-relaxation
>>>>>>>>>>>
>>>>>>>>>>>Perhaps you can start a separate topic on binutils? :)
>>>>>>>>>>>
>>>>>>>>>>>According to a doc from SiFive about -static -mcpu=sifive-u74 builds,
>>>>>>>>>>>https://docs.google.com/spreadsheets/d/14V7cPbyc80AcGHzsMaw9hYb232dzRbGCmTApnxj-SpU/edit#gid=0
>>>>>>>>>>>global pointer relaxation saves at best 0.5% size (I guess that refers
>>>>>>>>>>>to .text. If we count all allocable sections, the percentage is likely
>>>>>>>>>>>even smaller.)
>>>>>>>>>>
>>>>>>>>>>For a mature toolchain, 0.5% in code size reduction would be *a lot*,
>>>>>>>>>>so I wouldn't dismiss that.
>>>>>>>>>
>>>>>>>>>That's broadly speaking why it sticks around.  We've got a bunch of
>>>>>>>>>headaches related to relaxation, GP or otherwise, but they improve
>>>>>>>>>performance and nobody's figured out how to replace that yet.
>>>>>>>>>
>>>>>>>>>>Do we have a reproducer?  Is the issue actually about gp relaxation for
>>>>>>>>>>the main executable?
>>>>>>>>>
>>>>>>>>>In general we don't reference GP from shared libraries as we don't
>>>>>>>>>have a GP save/restore scheme.  There may be a bug floating around
>>>>>>>>>here somewhere, in which case we should fix it, but the original post
>>>>>>>>>sounds like it wasn't a supported use case.
>>>>>>>>>
>>>>>>>>>>Thanks,
>>>>>>>>>>Florian
>>>>>>>>
>>>>>>>>Global pointer relaxation only applies to +-2KiB data relative to __global_pointer$ (= .sdata + 0x800).
>>>>>>>>The area that potentially benefits global pointer relaxation is very small.
>>>>>>>>
>>>>>>>>0.5% code size reduction (relative to .text?) is the best case. I
>>>>>>>>suspect the program somehow has a lot of global variable accesses and
>>>>>>>>placing these variables in .sdata helps.
>>>>>>>>
>>>>>>>>I've got results from Yingwei Zheng at PLCT lab using many
>>>>>>>>configurations. The saving is like 0.1%.
>>>>>>>>https://docs.google.com/spreadsheets/d/1Gz0h-C4U0toa9qELFtRaEWT_CzauE5JD9xMsLR8RyK8/edit#gid=1721258109
>>>>>>>>
>>>>>>>>On the binutils side, we occasionally see patches to fix global pointer
>>>>>>>>relaxation bugs, e.g. the patch just sent a few hours ago:
>>>>>>>>https://sourceware.org/pipermail/binutils/2023-May/127413.html
>>>>>>>>
>>>>>>>>I do not know the embedded toolchain well, but for Linux desktop/server,
>>>>>>>>disabling global pointer relaxation seems like a sensible choice. If we
>>>>>>>>discover a better way to utilize GP (x3) in the future, disabling global
>>>>>>>>pointer relaxation today will result in fewer compatibility issues.
>>>>>>>
>>>>>>>This comes up all the time, you're just pushing for a backdoor ABI
>>>>>>>break.  I get the desire to remove GP, if we were to be able to redo
>>>>>>>things I'd also not have it, but it's in the ABI and we can't change
>>>>>>>the binaries that exist.
>>>>>>>
>>>>>>>If you want a GP-free ABI then you should just go write one up.  Then
>>>>>>>it'll become a distro problem, and if it turns out that users also
>>>>>>>don't want it then the GP ABI will rot and we can eventually deprecate
>>>>>>>it.
>>>>>>
>>>>>>I am advocating for a change in GNU ld to make --no-relax-gp the default
>>>>>>option, but I am not sure it can be considered an ABI break.
>>>>>>
>>>>>>When using ld --no-relax-gp, the conversion of code sequences to gp is
>>>>>>disabled, thus eliminating an assumption related to global pointer relaxation.
>>>>>>If an executable is relinked without global pointer relaxation, it should still
>>>>>>function properly.
>>>>>>
>>>>>>To the best of my knowledge, there is no official documentation that designates
>>>>>>linker relaxation as a mandatory feature. Relaxation schemes, including global
>>>>>>pointer relaxation, are optional. Making an optional feature opt-in does not
>>>>>>constitute an ABI break.
>>>>>>
>>>>>>glibc continues to initialize gp to __global_pointer$ to accommodate users who
>>>>>>opt-in for global pointer relaxation. Removing the initialization of gp would
>>>>>>indeed be an ABI break, and I have never proposed such a change.
>>>>>
>>>>>The ABI break is using GP for something else, but that always ends up
>>>>>being part of the argument for disabling GP-based relaxation.  Without
>>>>>that the argument ends up just being that the GP-relaxations are too
>>>>>complicated for the benefit they provide.
>>>>
>>>>This has happened. Some platforms are investigating or using GP for
>>>>software shadow call stack. I think they include Android and Fuchsia.  I
>>>>am not affiliated with them.
>>>
>>>IIUC the Android shadow stack stuff isn't 100% decided, but it's the
>>>way they're headed.  I'd argue that's a great reason to actually write
>>>this down in an ABI spec: we've got users who either are or will soon
>>>violate the spec, we should fix the spec rather than force those users
>>>to fork it.
>>>
>>>>>I certainly understand that argument, these (and for the rest of the
>>>>>relaxation stuff) is a huge amount of pain for a small amount of
>>>>>benefit (which seems to be mostly based on benchmark dragracing, which
>>>>>is never that strong of an argument).  That said, last time we had
>>>>>this discussion was only a few months ago.  I don't think anything has
>>>>>changed since then so I doubt things will go any differently.
>>>>
>>>>The rest of the relaxation mechanism does offer significant savings that
>>>>should not be overlooked. I have heard claims of up to a 10% .text
>>>>reduction in certain embedded systems.
>>>>
>>>>However, it is important to acknowledge the costs associated with
>>>>increased toolchain complexity and bloating of debug information, LSDA,
>>>>and custom metadata sections. I discussed these concerns in detail at
>>>>https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation
>>>>(Apologies if the title came across as derogatory. It was never my
>>>>intention, and I genuinely enjoyed delving into the intricacies of
>>>>this toolchain technology.)
>>>
>>>I'd argue we should make this new ABI just forbid all relaxation.
>>>It's a ton of headache for an unknown amount of benefit, we can't
>>>easily measure it because we haven't chased down all the missed
>>>optimizations due to relaxation.  Modern Linux distros are all PIE
>>>anyway, so it's not like most of the relaxation does much.
>>>
>>>Pretty much everyone who looks at RISC-V says relaxation is bad, let's
>>>go let folks prove that out -- and if it's right we can get rid of a
>>>pile of complexity, which is great.
>>
>> It's too late to make linker relaxation opt-in. I'll do my part to fix
>> some LLVM assembler deficiency, though.
>>
>>>>>Just turning GP-based relaxation off by default doesn't get rid of the
>>>>>complexity, they're still in the ABI and people will still use them
>>>>>(even if just for benchmark dragracing) so they'll still need to work.
>>>>
>>>>Regarding the ABI story, shared objects do not utilize global pointer
>>>>relaxation. Therefore, if only an executable requires GP for a
>>>>platform-specific purpose, I believe there is no ABI risk involved. The
>>>>risk arises when shared objects rely on the executable's initialized GP
>>>>for certain purposes.
>>>>
>>>>ABI breaks related to global pointer usage occur when a platform
>>>>initially ships something using GP and then switches to a different GP
>>>>usage.
>>>
>>>Using GP for anything that's not the global pointer violates the
>>>psABI.
>>
>> I disagree.  The psABI makes it clear that linker relaxation is optional
>> with wording like "With linker relaxation enabled" "the linker is
>> permitted to use ... relaxation".
>>
>>>It's just a spec violation and I agree it's possible to
>>>violate the specs while still producing working systems.  At that
>>>point it's really up to the projects to ensure their spec-violating
>>>systems work, there's not much we can do on the toolchain side of
>>>things.  I'm guilty of doing that too (just look at a bunch of the
>>>Linux port, for example), but that's always a complicated way to do
>>>things.
>>
>> "If a platform requires use of a dedicated general-purpose register for a
>> platform-specific purpose, it is recommended to use gp (x3). The
>> platform ABI specification must document the use of this register. For
>> such platforms, care must be taken to ensure all code (compiler
>> generated or otherwise) avoids using gp in a way incompatible with the
>> platform specific purpose, and that global
>> pointer relaxation is disabled in the toolchain."
>>
>>>IMO it's still better to have a spec written down so the toolchain and
>>>users can be on the same page.
>>
>> You can make a proposal that global pointer must not be used for other
>> purposes and someone will object to it:)
>
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/379

and unsurprisingly, it's not wanted in the psABI ;)

>>>>As I mentioned earlier, many non-Linux/glibc platforms don't use global
>>>>pointer relaxation from the beginning.  Even within Linux/glibc, a new
>>>>distribution may choose not to use global pointer relaxation initially.
>>>>For these platforms, there would be no ABI break.
>>>>
>>>>Changing the default behavior does indeed have an impact.
>>>>--no-relax-gp is a binutils 2.41 feature, which means that projects that
>>>>discover improved ways to utilize GP in the future need to avoid
>>>>--no-relax-gp to maintain compatibility with older GNU ld versions.
>>>>They could use --no-relax with GNU ld versions, but that's a big hammer.
>>>
>>>I would describe that as an ABI break.
>>
>> I provide evidence to back up my point.
>> Can you give reasons that this is an ABI break? And break what?

I'm not really sure how else to describe it: the spec you're quoting
says it must be documented, and the scenario you're describing is
exactly the definition of an ABI break (changing what binaries the
toolchain emits and then removing support for the old binaries).

Like I said above, you (or anyone else) are free to violate the spec and
then go make sure the systems you're building based on that unspecified
behavior continue to work well enough for the users.

>>
>>>>
>>>>>>
>>>>>>>>Haiku
>>>>>>>>(https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/298#issuecomment-1344724796), Android, and Fuchsia
>>>>>>>>have mentioned that they don't use global pointer relaxation.


* Re: global pointer gets overwritten with dlopen(3) on RISC-V
  2023-05-16  7:59           ` Szabolcs Nagy
@ 2023-05-17  0:11             ` Palmer Dabbelt
  0 siblings, 0 replies; 19+ messages in thread
From: Palmer Dabbelt @ 2023-05-17  0:11 UTC (permalink / raw)
  To: szabolcs.nagy
  Cc: l.stelmach, fw, libc-alpha, schwab, maskray, fweimer,
	adhemerval.zanella, joseph, binutils, m.pikula, m.szyprowski,
	k.lewandowsk

On Tue, 16 May 2023 00:59:51 PDT (-0700), szabolcs.nagy@arm.com wrote:
> The 05/16/2023 08:53, Lukasz Stelmach via Binutils wrote:
>> No, the file we dlopen is an executable meant to work standalone. We
>> dlopen it for testing and this setup has worked for us on different
>> platforms (armv7l, aarch64, x86). We MAY have not encountered an error
>
> it is guaranteed broken on all those targets if the exe has
> local exec TLS access. (initial exec TLS is broken too but
> you may get lucky with that)
>
> i think you can get into trouble with interposition, copy
> relocs or canonical plts too.
>
> but even if everything happens to work, it is just bad
> design: it relies on implementation internals instead of
> documented interfaces.

OK, so it sounds like this just isn't a bug.  It's oddly similar to the
other half of this thread: users are relying on carefully constructed
binaries happening to work, but that depends on unspecified behavior and
thus isn't supportable.

>> because our glibc has been patched. I have to investigate the details as
>> Florian brought it to our attention that an error should be reported.


Thread overview: 19+ messages
     [not found] <CGME20230512142114eucas1p112d969a89ad2480a0c10a532bd6d8440@eucas1p1.samsung.com>
2023-05-12 14:21 ` global pointer gets overwritten with dlopen(3) on RISC-V Lukasz Stelmach
2023-05-12 15:13   ` Florian Weimer
2023-05-15 13:47     ` Palmer Dabbelt
     [not found]       ` <CGME20230516065316eucas1p17bffcd25209bb441b9a9f4d263aa8b3c@eucas1p1.samsung.com>
2023-05-16  6:53         ` Lukasz Stelmach
2023-05-16  7:59           ` Szabolcs Nagy
2023-05-17  0:11             ` Palmer Dabbelt
2023-05-12 19:50   ` Fangrui Song
2023-05-12 20:11     ` Florian Weimer
2023-05-12 20:33       ` Palmer Dabbelt
2023-05-12 21:09         ` Fangrui Song
2023-05-12 21:57           ` Palmer Dabbelt
2023-05-12 22:34             ` Fangrui Song
2023-05-12 22:47               ` Palmer Dabbelt
2023-05-13  0:05                 ` Fangrui Song
2023-05-13  0:35                   ` Palmer Dabbelt
2023-05-16  3:56                     ` Fangrui Song
2023-05-16 22:51                       ` Palmer Dabbelt
2023-05-16 23:21                         ` Palmer Dabbelt
2023-05-12 20:35       ` Jeff Law
