Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section

public inbox for gnu-gabi@sourceware.org
 help / color / mirror / Atom feed

From: "Rahul Chaudhry via gnu-gabi" <gnu-gabi@sourceware.org>
To: Roland McGrath <roland@hack.frob.com>
Cc: Sriraman Tallam <tmsriram@google.com>,
	Florian Weimer <fw@deneb.enyo.de>,
		Rahul Chaudhry via gnu-gabi <gnu-gabi@sourceware.org>,
	Suprateeka R Hegde <hegdesmailbox@gmail.com>,
		Florian Weimer <fweimer@redhat.com>,
	David Edelsohn <dje.gcc@gmail.com>,
		Rafael Avila de Espindola <rafael.espindola@gmail.com>,
	Binutils Development <binutils@sourceware.org>,
		Alan Modra <amodra@gmail.com>, Cary Coutant <ccoutant@gmail.com>,
		Xinliang David Li <davidxl@google.com>,
	Sterling Augustine <saugustine@google.com>,
		Paul Pluzhnikov <ppluzhnikov@google.com>,
	Ian Lance Taylor <iant@google.com>,
		"H.J. Lu" <hjl.tools@gmail.com>,
	Luis Lozano <llozano@google.com>,
		Peter Collingbourne <pcc@google.com>,
	Rui Ueyama <ruiu@google.com>,
	llvm-dev@lists.llvm.org
Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
Date: Sun, 01 Jan 2017 00:00:00 -0000	[thread overview]
Message-ID: <CAJRD=opERJszwQMFfaKMVdOYF-YAbqqYW0iNWWMqNp3pq2njzw@mail.gmail.com> (raw)
In-Reply-To: <CAORpzuMftCGpXUObOyoFY0=jorMBDWEDbQJ23DifTNW3v-WA6Q@mail.gmail.com>

On Mon, Dec 11, 2017 at 6:14 PM, Roland McGrath <roland@hack.frob.com> wrote:
>
> On Mon, Dec 11, 2017 at 3:50 PM Rahul Chaudhry via gnu-gabi <gnu-gabi@sourceware.org> wrote:
>>
>> A simple combination of delta-encoding and run_length-encoding is one of the
>> first schemes we experimented with (32-bit entries with 24-bit 'delta' and an
>> 8-bit 'count'). This gave really good results, but as Sri mentions, we observed
>> several cases where the relative relocations were not on consecutive offsets.
>> There were common cases where the relocations applied to alternate words, and
>> that totally wrecked the scheme (a bunch of entries with delta==16 and
>> count==1).
>
>
> For the same issue in a different context, I recently implemented a scheme using run-length-encoding but using a variable stride.  So for a run of alternate words, you still get a single entry, but with stride 16 instead of 8.  In my application, most cases of strides > 8 are a run of only 2 or 3 but there are a few cases of dozens or hundreds with a stride of 16.  My case is a solution tailored to exactly one application (a kernel), so there is a closed sample set that's all that matters and the trade-off between simplicity of the analysis and compactness of the results is different than the general case you're addressing (my "analysis" consists of a few lines of AWK).  But I wonder if it might be worthwhile to study the effect a variable-stride RLE scheme or adding the variable-stride ability into your hybrid scheme has on your sample applications.
>
> Since we're talking about specifying a new ABI that will be serving us for many years to come and will be hard to change once deployed, it seems worth spending quite a bit of effort up front to come to the most compact scheme that's feasible.

I agree. Can you share more details of the encoding scheme that you found
useful (size of each entry, number of bits used for stride/count etc.)?

I just ran some experiments with an encoding with 32-bit entries: 16-bits for
delta, 8-bits for stride, and 8-bits for count. Here are the numbers, inlined
with those from the previous schemes for comparison:


1. Chrome browser (x86_64, built as PIE):
   605159 relocation entries (24 bytes each) in '.rela.dyn'
   594542 are R_X86_64_RELATIVE relocations (98.25%)
       14269008 bytes (13.61MB) in use in '.rela.dyn' section
         385420 bytes (0.37MB) using delta+count encoding
         232540 bytes (0.22MB) using delta+stride+count encoding
         109256 bytes  (0.10MB) using jump+bitmap encoding


2. Go net/http test binary (x86_64, 'go test -buildmode=pie -c net/http')
   83810 relocation entries (24 bytes each) in '.rela.dyn'
   83804 are R_X86_64_RELATIVE relocations (99.99%)
       2011296 bytes (1.92MB) in use in .rela.dyn section
        204476 bytes (0.20MB) using delta+count encoding
        132568 bytes (0.13MB) using delta+stride+count encoding
         43744 bytes (0.04MB) using jump+bitmap encoding


3. Vim binary in /usr/bin on my workstation (Ubuntu, x86_64)
   6680 relocation entries (24 bytes each) in '.rela.dyn'
   6272 are R_X86_64_RELATIVE relocations (93.89%)
       150528 bytes (0.14MB) in use in .rela.dyn section
        14388 bytes (0.01MB) using delta+count encoding
         7000 bytes (0.01MB) using delta+stride+count encoding
         1992 bytes (0.00MB) using jump+bitmap encoding


delta+count encoding is using 32-bit entries:
  24-bit delta: number of bytes since last offset.
   8-bit count: number of relocations to apply (consecutive words).

delta+stride+count encoding is using 32-bit entries:
  16-bit delta: number of bytes since last offset.
   8-bit stride: stride (in bytes) for applying 'count' relocations.
   8-bit count: number of relocations to apply (using 'stride').

jump+bitmap encoding is using 64-bit entries:
   8-bit jump: number of words since last offset.
  56-bit bitmap: bitmap for which words to apply relocations to.


While adding a 'stride' field is definitely an improvement over simple
delta+count encoding, it doesn't compare well against the bitmap based
encoding.

I took a look inside the encoding for the Vim binary. There are some instances
in the bitmap based encoding like
  [0x3855555555555555 0x3855555555555555 0x3855555555555555 ...]
that encode sequences of relocations applying to alternate words. The stride
based encoding works very well on these and turns it into much more compact
  [0x0ff010ff 0x0ff010ff 0x0ff010ff ...]
using stride==0x10 and count==0xff.

However, for the vast majority of cases, the stride based encoding ends up with
count <= 2, and that kills it in the end.

I could try something more complex with 16-bit entries, but that can only give
2x improvement at best, so it still won't be better than the bitmap approach.

Thanks,
Rahul


> --
>
>
> Thanks,
> Roland

next prev parent reply	other threads:[~2017-12-13  0:53 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAGWvnynFwXFGLj3tAVgDatn0zmuHcWHyRNuDvR+wRZCXLnar_A@mail.gmail.com>
2017-01-01  0:00 ` Rafael Avila de Espindola
2017-01-01  0:00   ` David Edelsohn
2017-01-01  0:00     ` Suprateeka R Hegde
2017-01-01  0:00       ` Florian Weimer
2017-01-01  0:00         ` Suprateeka R Hegde
2017-01-01  0:00           ` Sriraman Tallam via gnu-gabi
2017-01-01  0:00             ` Rahul Chaudhry via gnu-gabi
2017-01-01  0:00               ` Ian Lance Taylor via gnu-gabi
2017-01-01  0:00               ` Cary Coutant
2017-01-01  0:00                 ` Rahul Chaudhry via gnu-gabi
2017-01-01  0:00                   ` Rahul Chaudhry via gnu-gabi
2017-01-01  0:00               ` Florian Weimer
2017-01-01  0:00                 ` Sriraman Tallam via gnu-gabi
2017-01-01  0:00                   ` Rahul Chaudhry via gnu-gabi
     [not found]                     ` <CAORpzuMftCGpXUObOyoFY0=jorMBDWEDbQJ23DifTNW3v-WA6Q@mail.gmail.com>
2017-01-01  0:00                       ` Rahul Chaudhry via gnu-gabi [this message]
2017-01-01  0:00                         ` Cary Coutant
2017-01-01  0:00                           ` Rahul Chaudhry via gnu-gabi
     [not found]                             ` <CAORpzuPYsSBJtypm3NDcfcgRzos3WO4JjkvgiqpyBYBhoqLVFA@mail.gmail.com>
2018-01-01  0:00                               ` Florian Weimer
2017-01-01  0:00 Sriraman Tallam
2017-01-01  0:00 ` Rafael Espíndola
2017-01-01  0:00   ` Sriraman Tallam
2017-01-01  0:00 ` Cary Coutant
2017-01-01  0:00   ` Florian Weimer
2017-01-01  0:00     ` Alan Modra
2017-01-01  0:00   ` Markus Trippelsdorf
2017-01-01  0:00   ` H.J. Lu
2017-01-01  0:00   ` Sriraman Tallam
2017-01-01  0:00 ` H.J. Lu
2017-01-01  0:00   ` Sriraman Tallam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJRD=opERJszwQMFfaKMVdOYF-YAbqqYW0iNWWMqNp3pq2njzw@mail.gmail.com' \
    --to=gnu-gabi@sourceware.org \
    --cc=amodra@gmail.com \
    --cc=binutils@sourceware.org \
    --cc=ccoutant@gmail.com \
    --cc=davidxl@google.com \
    --cc=dje.gcc@gmail.com \
    --cc=fw@deneb.enyo.de \
    --cc=fweimer@redhat.com \
    --cc=hegdesmailbox@gmail.com \
    --cc=hjl.tools@gmail.com \
    --cc=iant@google.com \
    --cc=llozano@google.com \
    --cc=llvm-dev@lists.llvm.org \
    --cc=pcc@google.com \
    --cc=ppluzhnikov@google.com \
    --cc=rafael.espindola@gmail.com \
    --cc=rahulchaudhry@google.com \
    --cc=roland@hack.frob.com \
    --cc=ruiu@google.com \
    --cc=saugustine@google.com \
    --cc=tmsriram@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).