public inbox for gcc@gcc.gnu.org
From: Jan Beulich <jbeulich@suse.com>
To: Fangrui Song <maskray@gcc.gnu.org>
Cc: Cary Coutant <ccoutant@gmail.com>,
	binutils@sourceware.org, gcc@gcc.gnu.org
Subject: Re: CREL relocation format for ELF
Date: Thu, 28 Mar 2024 10:23:46 +0100
Message-ID: <dcd9daf2-f2fb-4eca-94c8-8c878683b986@suse.com>
In-Reply-To: <CAN30aBFN1jiJiMQQ63m2iyaZB+_UW_=1xoC=i1zMzTHZAJ6Jbg@mail.gmail.com>

On 28.03.2024 08:43, Fangrui Song wrote:
> On Fri, Mar 22, 2024 at 6:51 PM Fangrui Song <maskray@gcc.gnu.org> wrote:
>>
>> On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song <maskray@gcc.gnu.org> wrote:
>>>
>>> The relocation formats REL and RELA for ELF are inefficient. In a
>>> release build of Clang for x86-64, .rela.* sections consume a
>>> significant portion (approximately 20.9%) of the file size.
>>>
>>> I propose RELLEB, a new format offering significant file size
>>> reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
>>>
>>> Your thoughts on RELLEB are welcome!
>>>
>>> Detailed analysis:
>>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>>> generic ABI (ELF specification):
>>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
>>> binutils feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=31475
>>> LLVM: https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
>>>
>>> Implementation primarily involves binutils changes. Any volunteers?
>>> For GCC, a driver option like -mrelleb in my Clang prototype would be
>>> needed. The option instructs the assembler to use RELLEB.
>>
>> The format was tentatively named RELLEB. As I refine the original pure
>> LEB-based format, “RELLEB” might not be the most fitting name.
>>
>> I have switched to SHT_CREL/DT_CREL/.crel and updated
>> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
>> and
>> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ
>>
>> The new format is simpler and better than RELLEB even in the absence
>> of the shifted offset technique.
>>
>> Dynamic relocations using CREL are even smaller than Android's packed
>> relocations.
>>
>> // encodeULEB128(uint64_t, raw_ostream &os);
>> // encodeSLEB128(int64_t, raw_ostream &os);
>>
>> Elf_Addr offsetMask = 8, offset = 0, addend = 0;
>> uint32_t symidx = 0, type = 0;
>> for (const Reloc &rel : relocs)
>>   offsetMask |= rel.r_offset;
>> int shift = std::countr_zero(offsetMask);
>> encodeULEB128(relocs.size() * 4 + shift, os);
>> for (const Reloc &rel : relocs) {
>>   Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
>>   offset = rel.r_offset;
>>   uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
>>               (type != rel.r_type ? 2 : 0) + (addend != rel.r_addend ? 4 : 0);
>>   if (deltaOffset < 0x10) {
>>     os << char(b);
>>   } else {
>>     os << char(b | 0x80);
>>     encodeULEB128(deltaOffset >> 4, os);
>>   }
>>   if (b & 1) {
>>     encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
>>     symidx = rel.r_symidx;
>>   }
>>   if (b & 2) {
>>     encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
>>     type = rel.r_type;
>>   }
>>   if (b & 4) {
>>     encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
>>     addend = rel.r_addend;
>>   }
>> }
>>
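
For illustration, here is a minimal decoder sketch for the byte stream the
quoted encoder produces. It is not taken from the proposal. It assumes the
header is count * 4 + shift exactly as quoted, that the encoder advances
`offset` to each relocation's r_offset after computing the delta, and that
offsets are non-decreasing. `Reloc`, `decodeCrel` and the two LEB128 readers
are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory record; field names follow the quoted encoder.
struct Reloc {
  uint64_t r_offset;
  uint32_t r_symidx;
  uint32_t r_type;
  int64_t r_addend;
};

// Read one ULEB128 value starting at p[pos] and advance pos past it.
static uint64_t decodeULEB128(const std::vector<uint8_t> &p, size_t &pos) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    assert(pos < p.size() && "truncated ULEB128");
    byte = p[pos++];
    if (shift < 64)
      value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return value;
}

// Read one SLEB128 value; bit 6 of the last byte is the sign bit.
static int64_t decodeSLEB128(const std::vector<uint8_t> &p, size_t &pos) {
  uint64_t value = 0;
  unsigned shift = 0;
  uint8_t byte;
  do {
    assert(pos < p.size() && "truncated SLEB128");
    byte = p[pos++];
    if (shift < 64)
      value |= uint64_t(byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  if (shift < 64 && (byte & 0x40))
    value |= ~uint64_t(0) << shift; // sign-extend
  return static_cast<int64_t>(value);
}

// Decode a whole section body written by the quoted encoder (assumed layout).
std::vector<Reloc> decodeCrel(const std::vector<uint8_t> &buf) {
  size_t pos = 0;
  const uint64_t hdr = decodeULEB128(buf, pos);
  const uint64_t count = hdr / 4; // header = count * 4 + shift
  const unsigned shift = hdr % 4;
  uint64_t offset = 0, addend = 0;
  uint32_t symidx = 0, type = 0;
  std::vector<Reloc> out;
  for (uint64_t i = 0; i < count; ++i) {
    assert(pos < buf.size() && "truncated relocation");
    const uint8_t b = buf[pos++];
    // Bits 3..6 carry the low four bits of the shifted delta offset;
    // bit 7 says a ULEB128 with the remaining bits follows.
    uint64_t delta = (b >> 3) & 0xf;
    if (b & 0x80)
      delta |= decodeULEB128(buf, pos) << 4;
    offset += delta << shift;
    // Bits 0..2 flag which of symidx/type/addend changed; each is a delta.
    if (b & 1)
      symidx += static_cast<uint32_t>(decodeSLEB128(buf, pos));
    if (b & 2)
      type += static_cast<uint32_t>(decodeSLEB128(buf, pos));
    if (b & 4)
      addend += static_cast<uint64_t>(decodeSLEB128(buf, pos));
    out.push_back(Reloc{offset, symidx, type, static_cast<int64_t>(addend)});
  }
  return out;
}
```

Passing the ten bytes 0f 17 01 02 7c 08 85 02 02 04 to decodeCrel() should
give three relocations at offsets 0x10, 0x18 and 0x118. The asserts only
guard against truncated input in debug builds; a real reader would need
proper error handling.
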
>> ---
>>
>> While alternatives like PrefixVarInt (or a suffix-based variant) might
>> excel when encoding larger integers, LEB128 offers advantages when
>> most integers fit within one or two bytes, as it avoids the need for
>> shift operations in the common one-byte representation.
>>
>> While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
>> SLEB128-encoded type/addend to use ULEB128 instead, the generated code
>> is inferior to or on par with SLEB128 for one-byte encodings.
> 
> 
> We can introduce a gas option --crel; users can then specify `gcc
> -Wa,--crel a.c` (-flto also gets -Wa, options).
> 
> I propose that we add another gas option --implicit-addends-for-data
> (does the name look good?) to allow non-code sections to use implicit
> addends to save space
> (https://sourceware.org/PR31567).
> Using implicit addends primarily benefits debug sections such as
> .debug_str_offsets, .debug_names, .debug_addr, .debug_line, but also
> data sections such as .eh_frame, .data., .data.rel.ro, .init_array.
> 
> -Wa,--implicit-addends-for-data can be used on its own (6.4% .o
> reduction in a clang -g -g0 -gpubnames build)

And this option will then switch from RELA to REL relocation sections,
effectively in violation of most ABIs I'm aware of?
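
As a reminder of what that switch means, here is a rough sketch of the two
64-bit entry layouts from the ELF gABI and of where the addend comes from in
each case. The struct and function names are hypothetical stand-ins for
Elf64_Rel/Elf64_Rela, only a plain 64-bit absolute relocation (S + A) is
modelled, and the section bytes are assumed to be in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// REL entry: the addend is implicit, stored in the section at r_offset.
struct Rel64 {
  uint64_t r_offset;
  uint64_t r_info;
};

// RELA entry: the addend is carried explicitly in the relocation itself.
struct Rela64 {
  uint64_t r_offset;
  uint64_t r_info;
  int64_t r_addend;
};

static_assert(sizeof(Rel64) == 16, "REL entry is two 8-byte words");
static_assert(sizeof(Rela64) == 24, "RELA entry adds an 8-byte addend");

// Read the 64-bit word at the given offset (host byte order assumed).
int64_t readWord64(const std::vector<uint8_t> &sec, uint64_t off) {
  assert(off + sizeof(int64_t) <= sec.size());
  int64_t v;
  std::memcpy(&v, sec.data() + off, sizeof v);
  return v;
}

// Store a 64-bit word at the given offset.
void writeWord64(std::vector<uint8_t> &sec, uint64_t off, uint64_t v) {
  assert(off + sizeof v <= sec.size());
  std::memcpy(sec.data() + off, &v, sizeof v);
}

// REL: A is whatever the assembler left in the relocated field.
void applyAbs64(std::vector<uint8_t> &sec, const Rel64 &r, uint64_t symValue) {
  const int64_t addend = readWord64(sec, r.r_offset);
  writeWord64(sec, r.r_offset, symValue + static_cast<uint64_t>(addend));
}

// RELA: A comes from the entry; the old field contents are ignored.
void applyAbs64(std::vector<uint8_t> &sec, const Rela64 &r, uint64_t symValue) {
  writeWord64(sec, r.r_offset, symValue + static_cast<uint64_t>(r.r_addend));
}
```

The 8 bytes saved per entry are the size benefit of implicit addends; the
cost is that the addend has to be recovered from the section contents.
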

Furthermore, why just data? x86 at least could benefit almost as much
for code. Hence maybe better --implicit-addends=data, with an
option for architectures to also permit --implicit-addends=text.

Jan

>       or together with
> CREL to achieve more incredible size reduction, one single byte for
> most .debug_* relocations!
> With CREL, concerns of debug section relocations will become a thing
> of the past.



Thread overview: 7+ messages
2024-03-15  0:16 RELLEB " Fangrui Song
2024-03-23  1:51 ` CREL relocation format for ELF (was: RELLEB) Fangrui Song
2024-03-28  7:43   ` Fangrui Song
2024-03-28  9:23     ` Jan Beulich [this message]
2024-03-29  7:24       ` CREL relocation format for ELF Fangrui Song
2024-03-28 13:04   ` CREL relocation format for ELF (was: RELLEB) Alan Modra
2024-03-28 16:26     ` Fangrui Song
