public inbox for gnu-gabi@sourceware.org
From: "Rahul Chaudhry via gnu-gabi" <gnu-gabi@sourceware.org>
To: Sriraman Tallam <tmsriram@google.com>
Cc: Florian Weimer <fw@deneb.enyo.de>,
	Rahul Chaudhry via gnu-gabi <gnu-gabi@sourceware.org>,
		Suprateeka R Hegde <hegdesmailbox@gmail.com>,
	Florian Weimer <fweimer@redhat.com>,
		David Edelsohn <dje.gcc@gmail.com>,
	Rafael Avila de Espindola <rafael.espindola@gmail.com>,
		Binutils Development <binutils@sourceware.org>,
	Alan Modra <amodra@gmail.com>, 	Cary Coutant <ccoutant@gmail.com>,
	Xinliang David Li <davidxl@google.com>,
		Sterling Augustine <saugustine@google.com>,
	Paul Pluzhnikov <ppluzhnikov@google.com>,
		Ian Lance Taylor <iant@google.com>,
	"H.J. Lu" <hjl.tools@gmail.com>, Luis Lozano <llozano@google.com>,
		Peter Collingbourne <pcc@google.com>,
	Rui Ueyama <ruiu@google.com>,
	llvm-dev@lists.llvm.org
Subject: Re: Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
Date: Sun, 01 Jan 2017 00:00:00 -0000
Message-ID: <CAJRD=opP96vFuSKK-1d1jw3nOKeTDE1T_E5hDwj3Zy-VUeAnRA@mail.gmail.com>
In-Reply-To: <CAAs8HmwMRTjyLjvUAbP9drkagbpedonHOGGRvoFQVr1TE7wyCQ@mail.gmail.com>

A simple combination of delta encoding and run-length encoding was one of the
first schemes we experimented with (32-bit entries with a 24-bit 'delta' and an
8-bit 'count'). This gave really good results, but as Sri mentions, we observed
several cases where the relative relocations were not at consecutive offsets.
There were common cases where the relocations applied to alternate words, and
that totally wrecked the scheme (a bunch of entries with delta==16 and
count==1).
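
To make the failure mode concrete, here is a minimal sketch that counts how
many 32-bit (delta, count) entries a sorted list of offsets would need. The
exact field semantics are an assumption for illustration only: 'delta' is
taken to be the byte distance from the last relocated word of the previous
run, and 'count' the number of consecutive 8-byte words in the run. The name
rle_entries and the constants are hypothetical, not from the prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD      8u        /* sizeof (Elf64_Addr) on x86_64 */
#define MAX_COUNT 255u      /* largest value of the 8-bit count field */
#define MAX_DELTA 0xffffffu /* largest value of the 24-bit delta field */

/* Count the 32-bit (delta, count) entries needed to describe N sorted,
   word-aligned relocation offsets.  Assumed semantics: delta is measured
   from the last relocated word of the previous run.  */
static size_t
rle_entries (const uint64_t *offsets, size_t n)
{
  size_t entries = 0;
  size_t i = 0;
  uint64_t last = 0;

  while (i < n)
    {
      /* The sketch does not handle deltas that overflow 24 bits.  */
      assert (offsets[i] >= last && offsets[i] - last <= MAX_DELTA);

      /* Extend the run while the next offset is the next word.  */
      unsigned count = 1;
      while (i + count < n && count < MAX_COUNT
             && offsets[i + count] == offsets[i] + (uint64_t) count * WORD)
        ++count;

      last = offsets[i + count - 1];
      i += count;
      ++entries;
    }
  return entries;
}
```

Under these assumptions, four relocations at consecutive words collapse into
one entry, while four relocations at alternate words (0, 16, 32, 48) need
four entries, each with delta==16 and count==1.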

I dug up some numbers on how that scheme compared with the current proposal on
the three examples I posted before:

delta+run_length encoding uses 32-bit entries (24-bit delta, 8-bit count).
delta+bitmap encoding uses 64-bit entries (8-bit delta, 56-bit bitmap).
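
For comparison, here is a minimal sketch that counts the 64-bit words the
delta+bitmap encoding would need for the same kind of input. It follows the
decoding loop quoted further down in this message: the 8-bit delta is in
words relative to the base of the previous entry, and a delta of 0 means the
next 64-bit word holds a full offset. The greedy packing strategy and the
name bitmap_words are assumptions for illustration, not the actual linker
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD        8u   /* sizeof (Elf64_Addr) on x86_64 */
#define BITMAP_BITS 56u  /* words covered by one bitmap */
#define MAX_JUMP    255u /* largest value of the 8-bit delta field */

/* Count the 64-bit words needed to describe N sorted, word-aligned
   relocation offsets, packing greedily: each entry starts at the first
   offset not yet covered.  */
static size_t
bitmap_words (const uint64_t *offsets, size_t n)
{
  size_t words = 0;
  size_t i = 0;
  uint64_t base = 0;

  while (i < n)
    {
      assert (offsets[i] % WORD == 0 && offsets[i] >= base);

      /* A jump of 0 is the escape value, so it (and any jump that does
         not fit in 8 bits) costs an extra word for the full offset.  */
      uint64_t jump = (offsets[i] - base) / WORD;
      words += (jump >= 1 && jump <= MAX_JUMP) ? 1 : 2;

      /* Everything within 56 words of the new base fits in the bitmap.  */
      base = offsets[i];
      while (i < n && offsets[i] < base + BITMAP_BITS * WORD)
        ++i;
    }
  return words;
}
```

With this sketch, relocations at alternate words starting at offset 8 (8, 24,
40, 56) fit into a single 64-bit entry, which is the case that defeated the
run-length scheme.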

1. Chrome browser (x86_64, built as PIE):
   605159 relocation entries (24 bytes each) in '.rela.dyn'
   594542 are R_X86_64_RELATIVE relocations (98.25%)
       14269008 bytes (13.61MB) in use in '.rela.dyn' section
         385420 bytes (0.37MB) using delta+run_length encoding
         109256 bytes (0.10MB) using delta+bitmap encoding


2. Go net/http test binary (x86_64, 'go test -buildmode=pie -c net/http')
   83810 relocation entries (24 bytes each) in '.rela.dyn'
   83804 are R_X86_64_RELATIVE relocations (99.99%)
       2011296 bytes (1.92MB) in use in .rela.dyn section
        204476 bytes (0.20MB) using delta+run_length encoding
         43744 bytes (0.04MB) using delta+bitmap encoding


3. Vim binary in /usr/bin on my workstation (Ubuntu, x86_64)
   6680 relocation entries (24 bytes each) in '.rela.dyn'
   6272 are R_X86_64_RELATIVE relocations (93.89%)
       150528 bytes (0.14MB) in use in .rela.dyn section
        14388 bytes (0.01MB) using delta+run_length encoding
         1992 bytes (0.00MB) using delta+bitmap encoding
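
The byte counts above can be turned into entry counts and shrink factors with
a little arithmetic. The following is only a sketch for reading the numbers;
the struct and function names are hypothetical, and the entry sizes (4 bytes
for delta+run_length, 8 bytes for delta+bitmap) are the ones stated earlier
in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Section sizes in bytes, copied from the three examples above.  */
struct section_size
{
  const char *name;
  size_t rela_bytes;   /* R_X86_64_RELATIVE entries in .rela.dyn */
  size_t rle_bytes;    /* delta+run_length encoding */
  size_t bitmap_bytes; /* delta+bitmap encoding */
};

static const struct section_size sizes[] = {
  { "chrome", 14269008, 385420, 109256 },
  { "go net/http test", 2011296, 204476, 43744 },
  { "vim", 150528, 14388, 1992 },
};

/* Number of 32-bit entries in the delta+run_length encoding.  */
static size_t
rle_entry_count (const struct section_size *s)
{
  assert (s->rle_bytes % 4 == 0);
  return s->rle_bytes / 4;
}

/* Number of 64-bit entries in the delta+bitmap encoding.  */
static size_t
bitmap_entry_count (const struct section_size *s)
{
  assert (s->bitmap_bytes % 8 == 0);
  return s->bitmap_bytes / 8;
}

/* How many times smaller AFTER is than BEFORE.  */
static double
shrink_factor (size_t before, size_t after)
{
  assert (after != 0);
  return (double) before / (double) after;
}
```

For Chrome this works out to 96355 run-length entries against 13657 bitmap
entries, so the relative relocations shrink by a factor of about 37 with
delta+run_length and about 130 with delta+bitmap.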

Rahul


On Mon, Dec 11, 2017 at 10:41 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Sat, Dec 9, 2017 at 3:06 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * Rahul Chaudhry via gnu-gabi:
>>
>>> The encoding used is a simple combination of delta-encoding and a
>>> bitmap of offsets. The section consists of 64-bit entries: higher
>>> 8-bits contain delta since last offset, and lower 56-bits contain a
>>> bitmap for which words to apply the relocation to. This is best
>>> described by showing the code for decoding the section:
>>>
>>> typedef struct
>>> {
>>>   Elf64_Xword  r_data;  /* jump and bitmap for relative relocations */
>>> } Elf64_Relrz;
>>>
>>> #define ELF64_R_JUMP(val)    ((val) >> 56)
>>> #define ELF64_R_BITS(val)    ((val) & 0xffffffffffffff)
>>>
>>> #ifdef DO_RELRZ
>>>   {
>>>     ElfW(Addr) offset = 0;
>>>     for (; relative < end; ++relative)
>>>       {
>>>         ElfW(Addr) jump = ELFW(R_JUMP) (relative->r_data);
>>>         ElfW(Addr) bits = ELFW(R_BITS) (relative->r_data);
>>>         offset += jump * sizeof(ElfW(Addr));
>>>         if (jump == 0)
>>>           {
>>>             ++relative;
>>>             offset = relative->r_data;
>>>           }
>>>         ElfW(Addr) r_offset = offset;
>>>         for (; bits != 0; bits >>= 1)
>>>           {
>>>             if ((bits&1) != 0)
>>>               elf_machine_relrz_relative (l_addr, (void *) (l_addr + r_offset));
>>>             r_offset += sizeof(ElfW(Addr));
>>>           }
>>>       }
>>>   }
>>> #endif
>>
>> That data-dependent “if ((bits&1) != 0)” branch looks a bit nasty.
>>
>> Have you investigated whether some sort of RLE-style encoding would be
>> beneficial? If there are blocks of relative relocations, it might even
>> be possible to use vector instructions to process them (although more
>> than four relocations at a time are probably not achievable in a
>> power-efficient manner on current x86-64).
>
> Yes, we originally investigated RLE style encoding but I guess the key
> insight which led us towards the proposed encoding is the following.
> The offset addresses which contain the relocations are close but not
> necessarily contiguous.  We experimented with an encoding strategy
> where we would store the initial offset and the number of words after
> that which contained dynamic relocations.  This gave us good
> compression numbers but the proposed scheme was way better.  I will
> let Rahul say more as he did quite a bit of experiments with different
> strategies.
>
> Thanks
> Sri
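
The decoding loop quoted above depends on dynamic-linker internals (ElfW,
l_addr, elf_machine_relrz_relative). As a rough standalone sketch of the same
logic, the version below collects the decoded offsets into an array instead
of applying relocations. The function name relrz_decode and the output-buffer
interface are hypothetical; only the jump/bitmap handling mirrors the quoted
code, and a 64-bit target is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD 8u /* sizeof (Elf64_Addr) on a 64-bit target */

/* Decode NWORDS 64-bit entries (high 8 bits: jump in words, low 56 bits:
   bitmap) into relocation offsets.  Writes at most CAP offsets to OUT and
   returns how many were written.  */
static size_t
relrz_decode (const uint64_t *relrz, size_t nwords,
              uint64_t *out, size_t cap)
{
  size_t n = 0;
  uint64_t offset = 0;

  for (size_t i = 0; i < nwords; ++i)
    {
      uint64_t jump = relrz[i] >> 56;
      uint64_t bits = relrz[i] & 0xffffffffffffffULL;

      offset += jump * WORD;
      if (jump == 0)
        {
          /* Escape: the next word holds a full offset.  */
          assert (i + 1 < nwords);
          offset = relrz[++i];
        }

      /* Bit k set means the word at offset + k * WORD is relocated.  */
      uint64_t r_offset = offset;
      for (; bits != 0; bits >>= 1)
        {
          if ((bits & 1) != 0)
            {
              assert (n < cap);
              out[n++] = r_offset;
            }
          r_offset += WORD;
        }
    }
  return n;
}
```

For example, the four words { (1 << 56) | 0x5, 0x3, 0x1000, (2 << 56) | 0x1 }
decode to the offsets 8, 24, 0x1000, 0x1008 and 0x1010: the second word has
jump 0, so the third word supplies the new base offset for its bitmap.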
