From: Fangrui Song
Date: Fri, 22 Mar 2024 18:51:41 -0700
Subject: CREL relocation format for ELF (was: RELLEB)
To: binutils@sourceware.org, gcc@gcc.gnu.org
Cc: Cary Coutant

On Thu, Mar 14, 2024 at 5:16 PM Fangrui Song wrote:
>
> The relocation formats REL and RELA for ELF are inefficient. In a
> release build of Clang for x86-64, .rela.* sections consume a
> significant portion (approximately 20.9%) of the file size.
>
> I propose RELLEB, a new format offering significant file size
> reductions: 17.2% (x86-64), 16.5% (aarch64), and even 32.4% (riscv64)!
>
> Your thoughts on RELLEB are welcome!
>
> Detailed analysis:
> https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
> generic ABI (ELF specification):
> https://groups.google.com/g/generic-abi/c/yb0rjw56ORw
> binutils feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=31475
> LLVM: https://discourse.llvm.org/t/rfc-relleb-a-compact-relocation-format-for-elf/77600
>
> Implementation primarily involves binutils changes. Any volunteers?
> For GCC, a driver option like -mrelleb in my Clang prototype would be
> needed. The option instructs the assembler to use RELLEB.

The format was tentatively named RELLEB. As I refine the original pure
LEB-based format, “RELLEB” might not be the most fitting name.

I have switched to SHT_CREL/DT_CREL/.crel and updated
https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf
and
https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ

The new format is simpler and better than RELLEB even in the absence
of the shifted offset technique.

Dynamic relocations using CREL are even smaller than Android's packed
relocations.

// encodeULEB128(uint64_t, raw_ostream &os);
// encodeSLEB128(int64_t, raw_ostream &os);

Elf_Addr offsetMask = 8, offset = 0, addend = 0;
uint32_t symidx = 0, type = 0;
// OR all offsets together; starting from 8 caps the shift at 3.
for (const Reloc &rel : relocs)
  offsetMask |= rel.r_offset;
int shift = std::countr_zero(offsetMask);
// Header: relocation count and the shift shared by all offsets.
encodeULEB128(relocs.size() * 4 + shift, os);
for (const Reloc &rel : relocs) {
  Elf_Addr deltaOffset = (rel.r_offset - offset) >> shift;
  offset = rel.r_offset;
  // Bits 0-2 flag a changed symidx/type/addend; the rest is the delta offset.
  uint8_t b = deltaOffset * 8 + (symidx != rel.r_symidx) +
              (type != rel.r_type ? 2 : 0) +
              (addend != rel.r_addend ? 4 : 0);
  if (deltaOffset < 0x10) {
    os << char(b);
  } else {
    // Bit 7 signals that the remaining delta offset bits follow as ULEB128.
    os << char(b | 0x80);
    encodeULEB128(deltaOffset >> 4, os);
  }
  if (b & 1) {
    encodeSLEB128(static_cast<int32_t>(rel.r_symidx - symidx), os);
    symidx = rel.r_symidx;
  }
  if (b & 2) {
    encodeSLEB128(static_cast<int32_t>(rel.r_type - type), os);
    type = rel.r_type;
  }
  if (b & 4) {
    encodeSLEB128(std::make_signed_t<Elf_Addr>(rel.r_addend - addend), os);
    addend = rel.r_addend;
  }
}

---

While alternatives like PrefixVarInt (or a suffix-based variant) might
excel when encoding larger integers, LEB128 offers advantages when most
integers fit within one or two bytes, as it avoids the need for shift
operations in the common one-byte representation.

While we could utilize zigzag encoding (i>>31) ^ (i<<1) to convert
SLEB128-encoded type/addend to use ULEB128 instead, the generated code
is inferior to or on par with SLEB128 for one-byte encodings.
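
To make the zigzag comparison concrete, here is a minimal standalone
sketch. It is not the LLVM code: the helpers below write to a
std::string instead of a raw_ostream, zigzag() is the 64-bit analogue
of the 32-bit formula above, and slebSize/zigzagSize/checkOneByteRange
are hypothetical names introduced only for this illustration. It
suggests that both schemes cover the same one-byte range, [-64, 63], so
zigzag would not save any bytes there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append v as ULEB128: 7 payload bits per byte, high bit set on every
// byte except the last.
void encodeULEB128(uint64_t v, std::string &out) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0) byte |= 0x80;
    out.push_back(static_cast<char>(byte));
  } while (v != 0);
}

// Append v as SLEB128: stop once the remaining bits are pure sign
// extension of bit 6 of the byte just produced.
void encodeSLEB128(int64_t v, std::string &out) {
  bool more = true;
  while (more) {
    uint8_t byte = v & 0x7f;
    v >>= 7;  // assumes arithmetic right shift (guaranteed since C++20)
    more = !((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40)));
    if (more) byte |= 0x80;
    out.push_back(static_cast<char>(byte));
  }
}

// 64-bit zigzag: maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
uint64_t zigzag(int64_t v) {
  return (static_cast<uint64_t>(v) << 1) ^ static_cast<uint64_t>(v >> 63);
}

// Encoded size in bytes under each scheme (illustrative helpers).
std::size_t slebSize(int64_t v) {
  std::string s;
  encodeSLEB128(v, s);
  return s.size();
}
std::size_t zigzagSize(int64_t v) {
  std::string s;
  encodeULEB128(zigzag(v), s);
  return s.size();
}

// Both schemes switch from one byte to two at the same boundaries.
void checkOneByteRange() {
  assert(slebSize(63) == 1 && zigzagSize(63) == 1);
  assert(slebSize(-64) == 1 && zigzagSize(-64) == 1);
  assert(slebSize(64) == 2 && zigzagSize(64) == 2);
  assert(slebSize(-65) == 2 && zigzagSize(-65) == 2);
}
```

Since the sizes match, the choice comes down to the generated code for
the encoder and decoder, which is the point made above.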