From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 81011 invoked by alias); 7 Dec 2017 22:51:26 -0000
Mailing-List: contact gnu-gabi-help@sourceware.org; run by ezmlm
References: <8737cosnym.fsf@localhost.localdomain.i-did-not-set--mail-host-address--so-tickle-me> <7e698a5f-32d7-6549-7e23-8850b85e6c10@gmail.com>
From: "Rahul Chaudhry via gnu-gabi"
Reply-To: Rahul Chaudhry
Date: Sun, 01 Jan 2017 00:00:00 -0000
Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
To: Sriraman Tallam
Cc: hegdesmailbox@gmail.com, Florian Weimer, David Edelsohn, Rafael Avila de Espindola, Binutils Development, Alan Modra, Cary Coutant, gnu-gabi@sourceware.org, Xinliang David Li, Sterling Augustine, Paul Pluzhnikov, Ian Lance Taylor, "H.J. Lu", Luis Lozano, Peter Collingbourne, Rui Ueyama, llvm-dev@lists.llvm.org
Content-Type: text/plain; charset="UTF-8"
X-SW-Source: 2017-q4/txt/msg00005.txt.bz2

Sri and I have been working on this over the past few months, and we've made
some good progress that we'd like to share and get feedback on.

Our work is based on the 'experimental-relr' prototype from Cary that is
available on the 'users/ccoutant/experimental-relr' branch in the binutils
repository [1], and was described earlier in this thread:
https://sourceware.org/ml/gnu-gabi/2017-q2/msg00003.html

We've taken the '.relr.dyn' section from Cary's prototype, and implemented a
custom encoding to compactly represent the list of offsets. We're calling the
new compressed section '.relrz.dyn' (for relocations-relative-compressed).

The encoding used is a simple combination of delta-encoding and a bitmap of
offsets. The section consists of 64-bit entries: the upper 8 bits contain the
delta since the last offset, and the lower 56 bits contain a bitmap of the
words to which the relocation applies. This is best described by showing the
code for decoding the section:

    typedef struct
    {
      Elf64_Xword  r_data;  /* jump and bitmap for relative relocations */
    } Elf64_Relrz;

    #define ELF64_R_JUMP(val)  ((val) >> 56)
    #define ELF64_R_BITS(val)  ((val) & 0xffffffffffffff)

    #ifdef DO_RELRZ
      {
        ElfW(Addr) offset = 0;
        for (; relative < end; ++relative)
          {
            ElfW(Addr) jump = ELFW(R_JUMP) (relative->r_data);
            ElfW(Addr) bits = ELFW(R_BITS) (relative->r_data);
            /* 'jump' counts words, not bytes.  */
            offset += jump * sizeof(ElfW(Addr));
            if (jump == 0)
              {
                /* Escape: the next entry holds the full offset.  */
                ++relative;
                offset = relative->r_data;
              }
            ElfW(Addr) r_offset = offset;
            /* Bit i set: relocate the word i words past 'offset'.  */
            for (; bits != 0; bits >>= 1)
              {
                if ((bits&1) != 0)
                  elf_machine_relrz_relative (l_addr, (void *) (l_addr + r_offset));
                r_offset += sizeof(ElfW(Addr));
              }
          }
      }
    #endif

Note that the 8-bit 'jump' encodes the number of _words_ since the last
offset. The case where the jump would not fit in 8 bits is handled by setting
jump to 0 and emitting the full offset for the next relocation in the
subsequent entry.

The above code is the entirety of the implementation for decoding and
processing '.relrz.dyn' sections in the glibc dynamic loader.

This encoding can represent up to 56 relocation offsets in a single 64-bit
word.
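
To make the entry layout concrete, here is a minimal sketch of what the
producer side could look like, paired with a decoder that mirrors the loop
above but collects offsets instead of applying relocations. This is not the
ld.gold implementation; the names relrz_encode and relrz_decode, the greedy
grouping, and the requirement that offsets be strictly increasing and
word-aligned are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD      8u    /* sizeof(Elf64_Addr) on a 64-bit target */
#define MAP_BITS  56u   /* bitmap width in one 64-bit entry */
#define JUMP_MAX  255u  /* largest value that fits in the 8-bit jump */

/* Hypothetical encoder: offs[] holds n strictly increasing, word-aligned
   offsets.  out[] must have room for 2 * n entries (worst case: every
   group needs the escape form).  Returns the number of entries written.  */
size_t relrz_encode(const uint64_t *offs, size_t n, uint64_t *out)
{
  uint64_t base = 0;            /* start of the previous bitmap */
  size_t i = 0, k = 0;
  while (i < n) {
    uint64_t first = offs[i];
    uint64_t bits = 0;
    assert(first % WORD == 0 && first >= base);
    /* Greedily pack every offset within 56 words of 'first'.  */
    while (i < n && (offs[i] - first) / WORD < MAP_BITS) {
      assert(offs[i] % WORD == 0);
      bits |= (uint64_t)1 << ((offs[i] - first) / WORD);
      i++;
    }
    uint64_t jump = (first - base) / WORD;
    if (jump >= 1 && jump <= JUMP_MAX) {
      out[k++] = (jump << 56) | bits;   /* compact form */
    } else {
      out[k++] = bits;                  /* jump == 0: escape */
      out[k++] = first;                 /* full offset follows */
    }
    base = first;
  }
  return k;
}

/* Decoder mirroring the loop in the mail; writes offsets to offs[].  */
size_t relrz_decode(const uint64_t *in, size_t n_in, uint64_t *offs)
{
  uint64_t offset = 0;
  size_t k = 0;
  for (size_t i = 0; i < n_in; i++) {
    uint64_t jump = in[i] >> 56;
    uint64_t bits = in[i] & 0xffffffffffffffULL;
    offset += jump * WORD;
    if (jump == 0) {
      assert(i + 1 < n_in);
      offset = in[++i];
    }
    uint64_t r = offset;
    for (; bits != 0; bits >>= 1, r += WORD)
      if (bits & 1)
        offs[k++] = r;
  }
  return k;
}
```

In this sketch the jump is measured from the start of the previous bitmap,
which is what the decoder's 'offset' variable tracks. A real linker might
choose group boundaries differently to save entries; the greedy choice here
is only the simplest one that round-trips.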
For many of the binaries we tested, this encoding provides >40x compression
for storing offsets over the original '.relr.dyn' section.

For 32-bit targets, we use 32-bit entries: 8 bits for 'jump' and 24 bits for
the bitmap.

Here are three real-world examples that demonstrate the savings:

1) Chrome browser (x86_64, built as PIE):
   File size (stripped): 152265064 bytes (145.21MB)
   605159 relocation entries (24 bytes each) in '.rela.dyn'
   594542 are R_X86_64_RELATIVE relocations (98.25%)
   14269008 bytes (13.61MB) in use in '.rela.dyn' section
   109256 bytes (0.10MB) if moved to '.relrz.dyn' section
   Savings: 14159752 bytes, or 9.29% of original file size.

2) Go net/http test binary (x86_64, 'go test -buildmode=pie -c net/http')
   File size (stripped): 10238168 bytes (9.76MB)
   83810 relocation entries (24 bytes each) in '.rela.dyn'
   83804 are R_X86_64_RELATIVE relocations (99.99%)
   2011296 bytes (1.92MB) in use in '.rela.dyn' section
   43744 bytes (0.04MB) if moved to '.relrz.dyn' section
   Savings: 1967552 bytes, or 19.21% of original file size.

3) Vim binary in /usr/bin on my workstation (Ubuntu, x86_64)
   File size (stripped): 3030032 bytes (2.89MB)
   6680 relocation entries (24 bytes each) in '.rela.dyn'
   6272 are R_X86_64_RELATIVE relocations (93.89%)
   150528 bytes (0.14MB) in use in '.rela.dyn' section
   1992 bytes (0.00MB) if moved to '.relrz.dyn' section
   Savings: 148536 bytes, or 4.90% of original file size.

Recent releases of Debian, Ubuntu, and several other distributions build
executables as PIE by default. Suprateeka posted some statistics earlier in
this thread on the prevalence of relative relocations in executables residing
in /usr/bin:
https://sourceware.org/ml/gnu-gabi/2017-q2/msg00013.html

The third example above shows that using '.relrz.dyn' sections to encode
relative relocations can bring decent savings to executable sizes in /usr/bin
across many distributions.
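
The arithmetic behind the three examples can be checked with a few lines of
C. In each example the "in use" figure equals the R_X86_64_RELATIVE count
times 24 bytes (the size of one Elf64_Rela entry), and the savings are that
figure minus the '.relrz.dyn' size. The helper names below are made up for
this sketch and do not come from any of the patches.

```c
#include <assert.h>
#include <stdint.h>

#define RELA_ENTRY_BYTES 24u   /* sizeof(Elf64_Rela): offset, info, addend */

/* Bytes occupied in '.rela.dyn' by the relative relocations alone.  */
uint64_t rela_relative_bytes(uint64_t relative_count)
{
  return relative_count * RELA_ENTRY_BYTES;
}

/* Bytes saved by moving those relocations to '.relrz.dyn'.  */
uint64_t bytes_saved(uint64_t rela_bytes, uint64_t relrz_bytes)
{
  assert(relrz_bytes <= rela_bytes);
  return rela_bytes - relrz_bytes;
}

/* Savings expressed as a percentage of the stripped file size.  */
double percent_of_file(uint64_t saved, uint64_t file_size)
{
  return 100.0 * (double)saved / (double)file_size;
}
```

For the Chrome example, rela_relative_bytes(594542) gives 14269008 and
bytes_saved(14269008, 109256) gives 14159752, matching the numbers above.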

We have working ld.gold and ld.so implementations for arm, aarch64, and
x86_64, and would be happy to send patches to the binutils and glibc
communities for review.

However, before that can happen, we need agreement on the ABI side for the new
section type and the encoding. We haven't previously worked on a change of
this magnitude, one that touches so many different pieces: the linker, the ELF
tools, and the dynamic loader. Specifically, we need agreement and/or guidance
on where and how the new section type and its encoding should be documented.
We're proposing adding new defines for SHT_RELRZ, DT_RELRZ, DT_RELRZSZ,
DT_RELRZENT, and DT_RELRZCOUNT that all the different parts of the toolchains
can agree on.

Thanks,
Rahul

[1]: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/ccoutant/experimental-relr

On Mon, May 8, 2017 at 1:55 PM, Sriraman Tallam wrote:
> +llvm-dev
>
> Discussion here: https://sourceware.org/ml/gnu-gabi/2017-q2/msg00000.html
>
> On Tue, May 2, 2017 at 10:17 AM, Suprateeka R Hegde
> wrote:
>> On 02-May-2017 12:05 AM, Florian Weimer wrote:
>>> On 05/01/2017 08:28 PM, Suprateeka R Hegde wrote:
>>>> So the ratio shows ~96% is RELATIVE reloc. And only ~4% others. This is
>>>> not the case on HP-UX/Itanium. But as I said, this comparison does not
>>>> make sense as the runtime architecture and ISA are totally different.
>>>
>>> It could be that HP-UX was written in a way to reduce relative
>>> relocations,
>>
>> Rather, the Itanium runtime architecture itself provides a way to reduce
>> them.
>>
>>> or that the final executables aren't actually PIC anymore.
>>
>> I was referring to shlibs (PIC) on HP-UX and it was implicit in my mind.
>> Sorry for that.
>>
>> I just built a large C++ shlib both on HP-UX/Itanium with our aCC
>> compiler and Linux x86-64 using GCC-6.2. The sources are almost same
>> with only a couple of lines differing between platforms.
>>
>> (HP-UX/Linux)
>> Total: 12224/38311
>> RELATIVE: 18/6397
>>
>> I will try to check the reason for such a huge difference in RELATIVE
>> reloc count. It might be useful for this discussion (just a guess)
>>
>> --
>> Supra