public inbox for gcc@gcc.gnu.org
* RELLEB relocation format for ELF
@ 2024-03-15  0:16 Fangrui Song
  2024-03-23  1:51 ` CREL relocation format for ELF (was: RELLEB) Fangrui Song
  0 siblings, 1 reply; 7+ messages in thread
From: Fangrui Song @ 2024-03-15  0:16 UTC (permalink / raw)
  To: binutils, gcc; +Cc: Cary Coutant

The relocation formats REL and RELA for ELF are inefficient. In a
release build of Clang for x86-64, .rela.* sections consume a
significant portion (approximately 20.9%) of the file size.

I propose RELLEB, a new format offering significant file size
reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!

Your thoughts on RELLEB are welcome!

Detailed analysis:
https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
generic ABI (ELF specification):
https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
binutils feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=31475
LLVM: https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600

Implementation primarily involves binutils changes. Any volunteers?
For GCC, a driver option like -mrelleb in my Clang prototype would be
needed. The option instructs the assembler to use RELLEB.

* CREL relocation format for ELF (was: RELLEB)
  2024-03-15  0:16 RELLEB relocation format for ELF Fangrui Song
@ 2024-03-23  1:51 ` Fangrui Song
  2024-03-28  7:43   ` Fangrui Song
  2024-03-28 13:04   ` CREL relocation format for ELF (was: RELLEB) Alan Modra
  0 siblings, 2 replies; 7+ messages in thread
From: Fangrui Song @ 2024-03-23  1:51 UTC (permalink / raw)
  To: binutils, gcc; +Cc: Cary Coutant

The format was tentatively named RELLEB. As I refine the original pure
LEB-based format, “RELLEB” might not be the most fitting name.

I have switched to SHT_CREL/DT_CREL/.crel and updated
https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
and
https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ

The new format is simpler and better than RELLEB even in the absence
of the shifted offset technique.

Dynamic relocations using CREL are even smaller than Android's packed
relocations.

// Helpers with LLVM-style signatures:
// encodeULEB128(uint64_t, raw_ostream &os);
// encodeSLEB128(int64_t, raw_ostream &os);

Elf_Addr offsetMask = 8, offset = 0, addend = 0;
uint32_t symidx = 0, type = 0;
// OR all offsets together; the initial 8 caps the shift at 3.
for (const Reloc &rel : relocs)
  offsetMask |= rel.r_offset;
int shift = std::countr_zero(offsetMask);
// Header: relocation count and the common offset shift.
encodeULEB128(relocs.size() * 4 + shift, os);
for (const Reloc &rel : relocs) {
  // Delta against the previous relocation's offset.
  Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
  offset = rel.r_offset;
  // Bits 0..2: symidx/type/addend changed; bits 3..6: low bits of the delta.
  uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
              (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
  if (deltaOffset < 0x10) {
    os << char(b);
  } else {
    // Bit 7 signals that the remaining delta bits follow as ULEB128.
    os << char(b | 0x80);
    encodeULEB128(deltaOffset >> 4, os);
  }
  if (b & 1) {
    encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
    symidx = rel.r_symidx;
  }
  if (b & 2) {
    encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
    type = rel.r_type;
  }
  if (b & 4) {
    // uint is presumably the ELF class's unsigned address-sized type.
    encodeSLEB128(std::make_signed_t<uint>(rel.r_addend - addend), os);
    addend = rel.r_addend;
  }
}
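
To check the encoder above, a matching decoder can be sketched as follows. This is only a sketch, not the binutils or LLVM implementation. The names `Reloc`, `readULEB128`, `readSLEB128` and `decodeCrel` are hypothetical. It assumes 64-bit offsets and addends, the header layout `count * 4 + shift` used in this message, and that each delta is taken against the previous relocation's offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory relocation record (64-bit fields assumed).
struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx;
  uint32_t r_type;
  int64_t r_addend;
};

// Read an unsigned LEB128 value and advance p past it.
inline uint64_t readULEB128(const uint8_t *&p) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    if (shift < 64)
      value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return value;
}

// Read a signed LEB128 value and advance p past it.
inline int64_t readSLEB128(const uint8_t *&p) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    byte = *p++;
    if (shift < 64)
      value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40))
    value |= ~uint64_t(0) << shift; // sign-extend from the last payload bit
  return int64_t(value);
}

// Decode a CREL byte stream (no bounds checking in this sketch).
inline std::vector<Reloc> decodeCrel(const uint8_t *p) {
  const uint64_t hdr = readULEB128(p);
  const size_t count = hdr / 4;
  const unsigned shift = hdr % 4;
  uint64_t offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  std::vector<Reloc> out;
  out.reserve(count);
  for (size_t i = 0; i != count; ++i) {
    const uint8_t b = *p++;
    // Bits 3..6 hold the low 4 bits of the delta offset.
    uint64_t delta = (b >> 3) & 0xf;
    if (b & 0x80)
      delta |= readULEB128(p) << 4; // remaining delta bits
    offset += delta << shift;
    if (b & 1)
      symidx += uint32_t(readSLEB128(p));
    if (b & 2)
      type += uint32_t(readSLEB128(p));
    if (b & 4)
      addend += uint64_t(readSLEB128(p));
    out.push_back({offset, symidx, type, int64_t(addend)});
  }
  return out;
}
```

For example, the six bytes `0b 17 01 02 7c 08` decode to two relocations at offsets 0x10 and 0x18, both with symbol index 1, type 2 and addend -4. The second relocation costs a single byte because only its offset changed.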

---

While alternatives like PrefixVarInt (or a suffix-based variant) might
excel when encoding larger integers, LEB128 offers advantages when
most integers fit within one or two bytes, as it avoids the need for
shift operations in the common one-byte representation.

While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
SLEB128-encoded type/addend to use ULEB128 instead, the generated code
is inferior to or on par with SLEB128 for one-byte encodings.
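
To make the comparison concrete, here is a minimal sketch of the two LEB128 encoders and the 32-bit zigzag mapping mentioned above. It writes to a `std::vector<uint8_t>` instead of LLVM's `raw_ostream`, and `zigzag32` is a hypothetical helper name, so treat it as an illustration rather than the code used in any toolchain.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append value as unsigned LEB128: 7 payload bits per byte, with the
// high bit set on every byte except the last.
inline void encodeULEB128(uint64_t value, std::vector<uint8_t> &out) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Append value as signed LEB128. Encoding stops once the remaining bits
// are pure sign extension of bit 6 of the byte just produced.
inline void encodeSLEB128(int64_t value, std::vector<uint8_t> &out) {
  bool more;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7; // arithmetic shift (guaranteed since C++20)
    more = !((value == 0 && !(byte & 0x40)) ||
             (value == -1 && (byte & 0x40)));
    if (more)
      byte |= 0x80;
    out.push_back(byte);
  } while (more);
}

// Zigzag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so that a signed
// value can be stored as ULEB128.
inline uint32_t zigzag32(int32_t i) {
  return (uint32_t(i) << 1) ^ uint32_t(i >> 31);
}
```

Both schemes cover the same range in one byte: SLEB128 encodes -64..63 in a single byte, and zigzag maps exactly that range onto 0..127, which is what one ULEB128 byte holds. So zigzag saves no space for small deltas and only adds the extra shift and xor.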

* Re: CREL relocation format for ELF (was: RELLEB)
  2024-03-23  1:51 ` CREL relocation format for ELF (was: RELLEB) Fangrui Song
@ 2024-03-28  7:43   ` Fangrui Song
  2024-03-28  9:23     ` CREL relocation format for ELF Jan Beulich
  2024-03-28 13:04   ` CREL relocation format for ELF (was: RELLEB) Alan Modra
  1 sibling, 1 reply; 7+ messages in thread
From: Fangrui Song @ 2024-03-28  7:43 UTC (permalink / raw)
  To: binutils, gcc; +Cc: Cary Coutant

We can introduce a gas option --crel, then users can specify `gcc
-Wa,--crel a.c` (-flto also gets -Wa, options).

I propose that we add another gas option --implicit-addends-for-data
(does the name look good?) to allow non-code sections to use implicit
addends to save space
(https://sourceware.org/PR31567).
Using implicit addends primarily benefits debug sections such as
.debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
data sections such as .eh_frame, .data., .data.rel.ro, .init_array.

-Wa,--implicit-addends-for-data can be used on its own (6.4% .o
reduction in a clang -g -g0 -gpubnames build) or together with
CREL to achieve an even greater size reduction: a single byte for
most .debug_* relocations!
With CREL, concerns about debug section relocations will become a thing
of the past.
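
To illustrate what "implicit addends" means, here is a small sketch of resolving a 64-bit absolute data relocation in both styles. The function `applyAbs64` is a hypothetical helper, not part of gas or any linker, and it assumes the section bytes are in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Resolve a 64-bit absolute data relocation at `offset` in `section`.
// explicitAddend holds the RELA-style addend from the relocation record;
// std::nullopt selects REL-style behaviour, where the addend is read
// from the bytes already stored at the relocated place.
inline void applyAbs64(std::vector<uint8_t> &section, size_t offset,
                       uint64_t symbolValue,
                       std::optional<int64_t> explicitAddend) {
  uint64_t addend;
  if (explicitAddend)
    addend = uint64_t(*explicitAddend);
  else
    std::memcpy(&addend, section.data() + offset, sizeof addend);
  const uint64_t result = symbolValue + addend;
  std::memcpy(section.data() + offset, &result, sizeof result);
}
```

In the REL case the relocation record needs no addend field, but the section contents carry the addend and are therefore no longer all zeros before linking.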

* Re: CREL relocation format for ELF
  2024-03-28  7:43   ` Fangrui Song
@ 2024-03-28  9:23     ` Jan Beulich
  2024-03-29  7:24       ` Fangrui Song
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2024-03-28  9:23 UTC (permalink / raw)
  To: Fangrui Song; +Cc: Cary Coutant, binutils, gcc

On 28.03.2024 08:43, Fangrui Song wrote:
> We can introduce a gas option --crel, then users can specify `gcc
> -Wa,--crel a.c` (-flto also gets -Wa, options).
> 
> I propose that we add another gas option --implicit-addends-for-data
> (does the name look good?) to allow non-code sections to use implicit
> addends to save space
> (https://sourceware.org/PR31567).
> Using implicit addends primarily benefits debug sections such as
> .debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
> data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
> 
> -Wa,--implicit-addends-for-data can be used on its own (6.4% .o
> reduction in a clang -g -g0 -gpubnames build)

And this option will then switch from RELA to REL relocation sections,
effectively in violation of most ABIs I'm aware of?

Furthermore, why just data? x86 at least could benefit almost as much
for code. Hence maybe better --implicit-addends=data, with an
option for architectures to also permit --implicit-addends=text.

Jan

>       or together with
> CREL to achieve more incredible size reduction, one single byte for
> most .debug_* relocations!
> With CREL, concerns of debug section relocations will become a thing
> of the past.


* Re: CREL relocation format for ELF (was: RELLEB)
  2024-03-23  1:51 ` CREL relocation format for ELF (was: RELLEB) Fangrui Song
  2024-03-28  7:43   ` Fangrui Song
@ 2024-03-28 13:04   ` Alan Modra
  2024-03-28 16:26     ` Fangrui Song
  1 sibling, 1 reply; 7+ messages in thread
From: Alan Modra @ 2024-03-28 13:04 UTC (permalink / raw)
  To: Fangrui Song; +Cc: binutils, gcc, Cary Coutant

On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song <maskray@gcc.gnu.org> wrote:
> > I propose RELLEB, a new format offering significant file size
> > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> >
> > Your thoughts on RELLEB are welcome!

Does anyone really care about relocatable object file size?  If they
do, wouldn't they be better off using a compressed file system?

-- 
Alan Modra
Australia Development Lab, IBM

* Re: CREL relocation format for ELF (was: RELLEB)
  2024-03-28 13:04   ` CREL relocation format for ELF (was: RELLEB) Alan Modra
@ 2024-03-28 16:26     ` Fangrui Song
  0 siblings, 0 replies; 7+ messages in thread
From: Fangrui Song @ 2024-03-28 16:26 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils, gcc, Cary Coutant

On Thu, Mar 28, 2024 at 6:04 AM Alan Modra <amodra@gmail.com> wrote:
>
> On Fri, Mar 22, 2024 at 06:51:41PM -0700, Fangrui Song wrote:
> > On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song <maskray@gcc.gnu.org> wrote:
> > > I propose RELLEB, a new format offering significant file size
> > > reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
> > >
> > > Your thoughts on RELLEB are welcome!
>
> Does anyone really care about relocatable object file size?  If they
> do, wouldn't they be better off using a compressed file system?

Yes, many people care about relocatable file sizes.

* Relocation sizes affect DWARF evolution and we were/are using an
imperfect metric due to overly bloated REL/RELA. .debug_str_offsets
does not get much traction in GCC, probably partly because it needs
relocations. DWARF v5 introduced changes to keep relocations small.
Many are good on their own, but we need to be cautious of relocation
concerns causing us to pick the wrong trade-off in the future.
* On many Linux targets, Clang emits .llvm_addrsig by default to allow
ld.lld --icf=safe. .llvm_addrsig stores symbol indexes in ULEB128
instead of using relocations to prevent a significant size increase.
* Static relocations make .a files larger.
* Some users care about the build artifact size due to limited disk space.
  + I believe part of the reason -ffunction-sections -fdata-sections
do not get more adoption is the relocatable file size concern.
  + I prefer to place build directories in Linux tmpfs. 12G vs 10G in
memory matters to me :)
  + Large .o files => more I/O. This may be more significant
when the storage is remote.

* Re: CREL relocation format for ELF
  2024-03-28  9:23     ` CREL relocation format for ELF Jan Beulich
@ 2024-03-29  7:24       ` Fangrui Song
  0 siblings, 0 replies; 7+ messages in thread
From: Fangrui Song @ 2024-03-29  7:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Cary Coutant, binutils, gcc

On Thu, Mar 28, 2024 at 2:23 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 28.03.2024 08:43, Fangrui Song wrote:
> > We can introduce a gas option --crel, then users can specify `gcc
> > -Wa,--crel a.c` (-flto also gets -Wa, options).
> >
> > I propose that we add another gas option --implicit-addends-for-data
> > (does the name look good?) to allow non-code sections to use implicit
> > addends to save space
> > (https://sourceware.org/PR31567).
> > Using implicit addends primarily benefits debug sections such as
> > .debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
> > data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
> >
> > -Wa,--implicit-addends-for-data can be used on its own (6.4% .o
> > reduction in a clang -g -g0 -gpubnames build)

> And this option will then switch from RELA to REL relocation sections,
> effectively in violation of most ABIs I'm aware of?

This does violate the x86-64 LP64 psABI and the PPC64 ELFv2 ABI. The
AArch64 psABI allows REL, while the RISC-V psABI doesn't say anything
about REL/RELA.

x86-64:

    The AMD64 LP64 ABI architecture uses only Elf64_Rela relocation
    entries with explicit addends. The r_addend member serves as the
    relocation addend.

    The AMD64 ILP32 ABI architecture uses only Elf32_Rela relocation
    entries in relocatable files. Executable files or shared objects may
    use either Elf32_Rela or Elf32_Rel relocation entries.

AArch64:

    A binary file may use ``REL`` or ``RELA`` relocations or a mixture
    of the two (but multiple relocations of the same place must use only
    one type).

    The initial addend for a ``REL``-type relocation is formed
    according to the following rules.

    - If the relocation relocates data (`Static Data relocations`_)
      the initial value in the place is sign-extended to 64 bits.

    - If the relocation relocates an instruction the immediate field
      of the instruction is extracted, scaled as required by the
      instruction field encoding, and sign-extended to 64 bits.

    A ``RELA`` format relocation must be used if the initial addend
    cannot be encoded in the place.

    There is no PC bias to accommodate in the relocation of a place
    containing an instruction that formulates a PC-relative address. The
    program counter reflects the address of the currently executing
    instruction.

PPC64 ELFv2:

    The 64-bit OpenPOWER Architecture uses Elf64_Rela relocation
    entries exclusively.

> Furthermore, why just data? x86 at least could benefit almost as much
> for code. Hence maybe better --implicit-addends=data, with an
> option for architectures to also permit --implicit-addends=text.

I agree that the design is not great. I am thinking about an option
that applies to all sections:
During fixup conversion to relocations, check if the relocation type
can accommodate the addend as a "data relocation type."
If any relocation within a section encounters an oversized addend,
switch from REL to RELA.
However, the feasibility of this approach needs evaluation regarding
implementation complexity.
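
The per-section check described above could look roughly like this. It is only a sketch of the idea, not gas code. `Fixup`, `addendFits` and `sectionCanUseRel` are hypothetical names, and it assumes the relocated field is interpreted as a signed value of a known bit width (in line with the AArch64 rule quoted earlier that the initial value is sign-extended).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical fixup: an addend plus the bit width of the relocated field.
struct Fixup {
  int64_t addend;
  unsigned fieldBits; // e.g. 32 for a 4-byte data field
};

// True if the addend can be stored in a signed field of the given width,
// so that the relocation can use an implicit (REL-style) addend.
inline bool addendFits(int64_t addend, unsigned fieldBits) {
  if (fieldBits == 0)
    return addend == 0;
  if (fieldBits >= 64)
    return true;
  const int64_t lo = -(int64_t(1) << (fieldBits - 1));
  const int64_t hi = (int64_t(1) << (fieldBits - 1)) - 1;
  return addend >= lo && addend <= hi;
}

// A section may use REL only if every addend fits in its field; a single
// oversized addend forces the whole section to fall back to RELA.
inline bool sectionCanUseRel(const std::vector<Fixup> &fixups) {
  for (const Fixup &f : fixups)
    if (!addendFits(f.addend, f.fieldBits))
      return false;
  return true;
}
```

The decision is per section rather than per relocation, so the assembler would have to see all fixups for a section before choosing the relocation section type. That is part of the implementation complexity that still needs evaluation.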

---

I have made `clang -g -gz=zstd` experiments, building lld for both
`-O0` and `-O2`:

```
.o size    | reloc size | .debug size | .debug_addr | .c?rela?.debug_addr | options
1453265896 |  467465160 |   200379733 |       77894 |            51123648 | -g -gz=zstd
1361904480 |  345821648 |   230681356 |     1628142 |            34082432 | -g -gz=zstd -Wa,--implicit-addends-for-data
1042317288 |   56517599 |   200378501 |       77894 |             5000201 | -g -gz=zstd -Wa,--crel
1057438728 |   41336040 |   230681552 |     1628142 |             3720546 | -g -gz=zstd -Wa,--crel,--implicit-addends-for-data

 626745136 |  292634688 |   225932160 |       77920 |            47820480 | -O2 -g -gz=zstd
 564322008 |  201200656 |   254962205 |     3104850 |            31880320 | -O2 -g -gz=zstd -Wa,--implicit-addends-for-data
 363224200 |   29114818 |   225930949 |       77920 |             4513572 | -O2 -g -gz=zstd -Wa,--crel
 377970016 |   14829524 |   254962382 |     3104850 |             2118037 | -O2 -g -gz=zstd -Wa,--crel,--implicit-addends-for-data
```

Observations:

* With or without -gz=zstd (another experiment not shown here), the .o
size reduction ratios with REL are close.
* Implicit addends make .debug* sections less compressible. If the
focus is .debug* and .rela.debug* sections, REL is a loss with
-gz=zstd.
* REL -gz=zstd is still smaller than RELA -gz=zstd, which is not
surprising as we compare uncompressed REL/RELA (larger difference) and
compressed non-zero/zero `.debug` contents (smaller difference).

A few points about CREL:

* For CREL -gz=zstd, using implicit addends increases .o file sizes
likely because the "less compressible" factor is more significant when
the relocation size becomes negligible.
* CREL reduction ratio becomes incredible with -gz=zstd at a high
optimization level: for -O2 -g -gz=zstd, it's a 42.0% reduction in the
.o size!
* CREL with implicit addends might not be worth doing if the priority
is debug sections.
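
The reduction ratios can be recomputed from the .o sizes in the table. A minimal sketch (`reductionPercent` is a hypothetical helper name):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Percentage by which `after` is smaller than `before`.
inline double reductionPercent(uint64_t before, uint64_t after) {
  return 100.0 * (1.0 - double(after) / double(before));
}
```

With the table's numbers, `reductionPercent(626745136, 363224200)` gives the 42.0% quoted for -O2 -g -gz=zstd with CREL, and `reductionPercent(1453265896, 1042317288)` gives about 28.3% for the corresponding -O0 build.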
