From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 72223 invoked by alias); 11 Dec 2017 23:50:32 -0000 Mailing-List: contact gnu-gabi-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Post: List-Help: List-Subscribe: Sender: gnu-gabi-owner@sourceware.org Received: (qmail 69401 invoked by uid 89); 11 Dec 2017 23:50:31 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.99.2 on sourceware.org X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=Hx-spam-relays-external:sk:mail-yb, H*r:sk:mail-yb, H*RU:sk:mail-yb, HX-HELO:sk:mail-yb X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on sourceware.org X-Spam-Level: X-HELO: mail-yb0-f181.google.com Received: from mail-yb0-f181.google.com (HELO mail-yb0-f181.google.com) (209.85.213.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Dec 2017 23:50:29 +0000 Received: by mail-yb0-f181.google.com with SMTP id z11so2224684ybm.1 for ; Mon, 11 Dec 2017 15:50:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=QrwM5jxjXu8VboLBhwLRIMhbRHjNfiBjkUI7IVkKqDg=; b=rCdA9LWv4DmDvESw+iyyH5GhVmQi+F4lCfB1MVo0TYZx41S8teI0L+qpiR2xZPzjSF Y/qTtVYvY0EDgcweh7O14KeXlnFTo7G4V8De/uAm0BM157y6hZZ7Yi0TcRriPopGHenz wixcB7exgiHuwbcyVYAg15+L9PJ46wpDb4mklG4eZec94qgVG+6wJCYipjqODRntpNw7 u+bqiHDq5GD/M/mnhsY2lVUTC17DC+4tYoR/ZMCypIfwO1XczDN5v6EPAkZyzX6DJXrt S1GjbhGLaaOTklxZllyKQ/+Gv7ffiB31cb/BGoNRCYMv3BPfD5zxKmfesWs+s/PwygCE jG1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=QrwM5jxjXu8VboLBhwLRIMhbRHjNfiBjkUI7IVkKqDg=; b=ILE7ccjvWXVF48fGZ+P70Zn54DP9F4HdVxGnIKorymTQb/5LPYM+X3gywl4DD0D3qJ Lvkt8M4Qgq1GDukg9hmmy+TpwiSteIiVJo5Nw4D98vVa+OEcdIBAqbyaJTFs1b4LCIK1 8d9+ygeHsY1/67fhi+UHKwOe+xKTJ1WUBn6IkEdcrA1wLWSFuPqog/3VO/7jNQuMgk9t UzvGaadgPrcevCFwF6w9gUsQjtCafK2fIO1CZNuW9MG9/2XSfiSWOxeshrLkoUXql0K+ BkMaJl++c+8TAH3i2alBRHZgXTxtwVEIaTTJGTgyPkzgpFBvbl27iodXrPT2TMdTfZUx rEHw== X-Gm-Message-State: AKGB3mIom9wZ3tfHi4eymbZPt1AX9c3DL952Jyu2w/zl4stRZA9OzHab mGrXzFV4raOiKm3VaIi6CRIcvKH0BBCCzMD+2DBx/FK20vRahg== X-Google-Smtp-Source: ACJfBouzuU/IBdxf4qG790BM1clkvIwRVR1lyF1obPBquOZGhfSZbw4CcPKGrDMQIDuawuPOwbjzXBzGDGjj2C81t40= X-Received: by 10.37.61.7 with SMTP id k7mr1739973yba.46.1513036227414; Mon, 11 Dec 2017 15:50:27 -0800 (PST) MIME-Version: 1.0 Received: by 10.37.59.11 with HTTP; Mon, 11 Dec 2017 15:50:26 -0800 (PST) In-Reply-To: References: <8737cosnym.fsf@localhost.localdomain.i-did-not-set--mail-host-address--so-tickle-me> <7e698a5f-32d7-6549-7e23-8850b85e6c10@gmail.com> <874lozec25.fsf@mid.deneb.enyo.de> From: "Rahul Chaudhry via gnu-gabi" Reply-To: Rahul Chaudhry Date: Sun, 01 Jan 2017 00:00:00 -0000 Message-ID: Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section To: Sriraman Tallam Cc: Florian Weimer , Rahul Chaudhry via gnu-gabi , Suprateeka R Hegde , Florian Weimer , David Edelsohn , Rafael Avila de Espindola , Binutils Development , Alan Modra , Cary Coutant , Xinliang David Li , Sterling Augustine , Paul Pluzhnikov , Ian Lance Taylor , "H.J. Lu" , Luis Lozano , Peter Collingbourne , Rui Ueyama , llvm-dev@lists.llvm.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2017-q4/txt/msg00015.txt.bz2 A simple combination of delta-encoding and run_length-encoding is one of the first schemes we experimented with (32-bit entries with 24-bit 'delta' and = an 8-bit 'count'). This gave really good results, but as Sri mentions, we obse= rved several cases where the relative relocations were not on consecutive offset= s. There were common cases where the relocations applied to alternate words, a= nd that totally wrecked the scheme (a bunch of entries with delta=3D=3D16 and count=3D=3D1). I dug up some numbers on how that scheme compared with the current proposal= on the three examples I posted before: delta+run_length encoding is using 32-bit entries (24-bit delta, 8-bit coun= t). delta+bitmap encoding is using 64-bit entries (8-bit delta, 56-bit bitmap). 1. Chrome browser (x86_64, built as PIE): 605159 relocation entries (24 bytes each) in '.rela.dyn' 594542 are R_X86_64_RELATIVE relocations (98.25%) 14269008 bytes (13.61MB) in use in '.rela.dyn' section 385420 bytes (0.37MB) using delta+run_length encoding 109256 bytes (0.10MB) using delta+bitmap encoding 2. Go net/http test binary (x86_64, 'go test -buildmode=3Dpie -c net/http') 83810 relocation entries (24 bytes each) in '.rela.dyn' 83804 are R_X86_64_RELATIVE relocations (99.99%) 2011296 bytes (1.92MB) in use in .rela.dyn section 204476 bytes (0.20MB) using delta+run_length encoding 43744 bytes (0.04MB) using delta+bitmap encoding 3. Vim binary in /usr/bin on my workstation (Ubuntu, x86_64) 6680 relocation entries (24 bytes each) in '.rela.dyn' 6272 are R_X86_64_RELATIVE relocations (93.89%) 150528 bytes (0.14MB) in use in .rela.dyn section 14388 bytes (0.01MB) using delta+run_length encoding 1992 bytes (0.00MB) using delta+bitmap encoding Rahul On Mon, Dec 11, 2017 at 10:41 AM, Sriraman Tallam wro= te: > On Sat, Dec 9, 2017 at 3:06 PM, Florian Weimer wrote: >> * Rahul Chaudhry via gnu-gabi: >> >>> The encoding used is a simple combination of delta-encoding and a >>> bitmap of offsets. The section consists of 64-bit entries: higher >>> 8-bits contain delta since last offset, and lower 56-bits contain a >>> bitmap for which words to apply the relocation to. This is best >>> described by showing the code for decoding the section: >>> >>> typedef struct >>> { >>> Elf64_Xword r_data; /* jump and bitmap for relative relocations */ >>> } Elf64_Relrz; >>> >>> #define ELF64_R_JUMP(val) ((val) >> 56) >>> #define ELF64_R_BITS(val) ((val) & 0xffffffffffffff) >>> >>> #ifdef DO_RELRZ >>> { >>> ElfW(Addr) offset =3D 0; >>> for (; relative < end; ++relative) >>> { >>> ElfW(Addr) jump =3D ELFW(R_JUMP) (relative->r_data); >>> ElfW(Addr) bits =3D ELFW(R_BITS) (relative->r_data); >>> offset +=3D jump * sizeof(ElfW(Addr)); >>> if (jump =3D=3D 0) >>> { >>> ++relative; >>> offset =3D relative->r_data; >>> } >>> ElfW(Addr) r_offset =3D offset; >>> for (; bits !=3D 0; bits >>=3D 1) >>> { >>> if ((bits&1) !=3D 0) >>> elf_machine_relrz_relative (l_addr, (void *) (l_addr + r_= offset)); >>> r_offset +=3D sizeof(ElfW(Addr)); >>> } >>> } >>> } >>> #endif >> >> That data-dependent =E2=80=9Cif ((bits&1) !=3D 0)=E2=80=9D branch looks = a bit nasty. >> >> Have you investigated whether some sort of RLE-style encoding would be >> beneficial? If there are blocks of relative relocations, it might even >> be possible to use vector instructions to process them (although more >> than four relocations at a time are probably not achievable in a >> power-efficient manner on current x86-64). > > Yes, we originally investigated RLE style encoding but I guess the key > insight which led us towards the proposed encoding is the following. > The offset addresses which contain the relocations are close but not > necessarily contiguous. We experimented with an encoding strategy > where we would store the initial offset and the number of words after > that which contained dynamic relocations. This gave us good > compression numbers but the proposed scheme was way better. I will > let Rahul say more as he did quite a bit of experiments with different > strategies. > > Thanks > Sri