From: "Rahul Chaudhry via gnu-gabi"
Reply-To: Rahul Chaudhry
Date: Sun, 01 Jan 2017 00:00:00 -0000
Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
To: Cary Coutant
Cc: Roland McGrath, Sriraman Tallam, Florian Weimer, Rahul Chaudhry via gnu-gabi, Suprateeka R Hegde, Florian Weimer, David Edelsohn, Rafael Avila de Espindola, Binutils Development, Alan Modra, Xinliang David Li, Sterling Augustine, Paul Pluzhnikov, Ian Lance Taylor, "H.J. Lu", Luis Lozano, Peter Collingbourne, Rui Ueyama, llvm-dev@lists.llvm.org

On Thu, Dec 14, 2017 at 12:11 AM, Cary Coutant wrote:
>> While adding a 'stride' field is definitely an improvement over simple
>> delta+count encoding, it doesn't compare well against the bitmap based
>> encoding.
>>
>> I took a look inside the encoding for the Vim binary.
>> There are some instances
>> in the bitmap based encoding like
>>   [0x3855555555555555 0x3855555555555555 0x3855555555555555 ...]
>> that encode sequences of relocations applying to alternate words. The stride
>> based encoding works very well on these and turns it into much more compact
>>   [0x0ff010ff 0x0ff010ff 0x0ff010ff ...]
>> using stride==0x10 and count==0xff.
>
> Have you looked much at where the RELATIVE relocations are coming from?
>
> I've looked at a PIE build of gold, and they're almost all for
> vtables, which mostly have consecutive entries with 8-byte strides.
> There are a few for the GOT, a few for static constructors (in
> .init_array), and a few for other initialized data, but vtables seem
> to account for the vast majority. (Gold has almost 19,000 RELATIVE
> dynamic relocs, and only about 500 non-RELATIVE dynamic relocs.)
>
> Where do the 16-byte strides come from? Vim is plain C, right? I'm
> guessing its RELATIVE relocation count is fairly low compared to big
> C++ apps. I'm also guessing that the pattern comes from some large
> structure or structures in the source code where initialized pointers
> alternate with non-pointer values. I'm also curious about Roland's
> app.

I took a look inside vim for the source of the ..5555.. pattern (relative
relocations applying to alternate words). One of the sources is the
"builtin_termcaps" symbol, which is an array of "struct builtin_term":

  struct builtin_term
  {
    int         bt_entry;
    char        *bt_string;
  };

So the pattern makes sense. An encoding using strides will work really well
here with stride == 0x10.

There is another repeating pattern I noticed in vim ..9999...
One of the sources behind this pattern is the "cmdnames" symbol, which is an
array of "struct cmdname":

  struct cmdname
  {
    char_u      *cmd_name;      /* name of the command */
    ex_func_T   cmd_func;       /* function for this command */
    long_u      cmd_argt;       /* flags declared above */
    int         cmd_addr_type;  /* flag for address type */
  };

In this struct, the first two fields are pointers, and the next two are
scalars. This explains the ..9999.. pattern for relative relocations. This is
an example where a stride based encoding does not work well, simply because
there is no single stride. The deltas are 8,24,8,24,8,24,...

I think these two examples demonstrate the main weakness of using a simple
stride based encoding: it is too sensitive to how the data structures are laid
out in the program source.

Rahul
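
For anyone who wants to check the two layouts discussed above, here is a
minimal C sketch. It is not the proposed encoder, and it rests on stated
assumptions: an LP64 target with 8-byte words, struct sizes and pointer-field
offsets hard-coded to match that target (16 bytes with a pointer at offset 8
for "struct builtin_term", 32 bytes with pointers at offsets 0 and 8 for
"struct cmdname"), and a plain one-bit-per-word bitmap without the header bits
seen in the 0x38... entries. The names layout, reloc_offsets, single_stride and
word_bitmap are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD 8u /* assumption: LP64 target, 8-byte pointers and words */

/* Hypothetical descriptor: struct size plus offsets of its pointer fields. */
struct layout {
    size_t size;       /* sizeof the struct, in bytes */
    size_t nptrs;      /* number of pointer fields */
    size_t ptr_off[4]; /* byte offset of each pointer field */
};

/* Write the byte offset of every relocated word in an array of n structs.
   Returns the number of offsets written (n * nptrs). */
static size_t reloc_offsets(const struct layout *l, size_t n, uint64_t *out)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < l->nptrs; j++)
            out[k++] = (uint64_t)(i * l->size + l->ptr_off[j]);
    return k;
}

/* Return 1 if all consecutive deltas are equal, so that a single
   (stride, count) pair could describe the whole run; 0 otherwise. */
static int single_stride(const uint64_t *offs, size_t n)
{
    for (size_t i = 2; i < n; i++)
        if (offs[i] - offs[i - 1] != offs[1] - offs[0])
            return 0;
    return 1;
}

/* Build a bitmap with one bit per word, starting at byte offset base.
   Bit i is set if the word at base + i * WORD carries a relocation. */
static uint64_t word_bitmap(const uint64_t *offs, size_t n,
                            uint64_t base, unsigned nbits)
{
    uint64_t bits = 0;
    assert(nbits <= 64);
    for (size_t i = 0; i < n; i++) {
        if (offs[i] < base)
            continue; /* relocation lies before the bitmap window */
        uint64_t w = (offs[i] - base) / WORD;
        if (w < nbits)
            bits |= (uint64_t)1 << w;
    }
    return bits;
}
```

With the layout { 16, 1, { 8 } } and 8 elements, every delta is 16, so
single_stride reports one stride and a 16-bit bitmap starting at the first
relocation comes out as 0x5555. With { 32, 2, { 0, 8 } } and 5 elements, the
deltas alternate 8,24, single_stride reports no single stride, and a 16-bit
bitmap starting at the second relocation comes out as 0x9999. Starting the
window at a different word only rotates the bit pattern; the bitmap stays
equally compact either way.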