From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ian Lance Taylor <ian@zembu.com>
To: hjl@lucon.org
Cc: bfd@cygnus.com
Subject: Re: arm questions
Date: Thu, 15 Apr 1999 13:45:00 -0000
Message-id: <19990415204531.3273.qmail@daffy.airs.com>
References: <m10XsSi-000ErMC@ocean.lucon.org>
X-SW-Source: 1999/msg00091.html

   From: hjl@lucon.org (H.J. Lu)
   Date: Thu, 15 Apr 1999 13:10:12 -0700 (PDT)

   While we are on RELA/REL, could someone kindly enough tell me the
   implementation and performance impacts between RELA and REL, on
   both static and dynamic objects?

I quote bfd/doc/bfdint.texi:

    In the absence of a supplement, it's easier to work with @samp{Rela}
    relocations.  @samp{Rela} relocations will require more space in object
    files (but not in executables, except when using dynamic linking).
    However, this is outweighed by the simplicity of addend handling when
    using @samp{Rela} relocations.  With @samp{Rel} relocations, the addend
    must be stored in the section contents, which makes relocateable links
    more complex.

    For example, consider C code like @code{i = a[1000];} where @samp{a} is
    a global array.  The instructions which load the value of @samp{a[1000]}
    will most likely use a relocation which refers to the symbol
    representing @samp{a}, with an addend that gives the offset from the
    start of @samp{a} to element @samp{1000}.  When using @samp{Rel}
    relocations, that addend must be stored in the instructions themselves.
    If you are adding support for a RISC chip which uses two or more
    instructions to load an address, then the addend may not fit in a single
    instruction, and will have to be somehow split among the instructions.
    This makes linking awkward, particularly when doing a relocateable link
    in which the addend may have to be updated.  It can be done---the MIPS
    ELF support does it---but it should be avoided when possible.

In short, it's much easier to get the implementation right when using
RELA relocations.
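
To make the difference concrete, here is a minimal sketch in Python of
applying a 32 bit absolute relocation (symbol value plus addend) both
ways.  It is not BFD code.  The function names, the address of `a',
and the 4 byte element size are assumptions made up for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF

def apply_rel32(section: bytearray, offset: int, symbol_value: int) -> None:
    """REL style: the addend is whatever is already stored at the reloc site."""
    (addend,) = struct.unpack_from("<i", section, offset)  # implicit addend
    struct.pack_into("<I", section, offset, (symbol_value + addend) & MASK32)

def apply_rela32(section: bytearray, offset: int,
                 symbol_value: int, addend: int) -> None:
    """RELA style: the addend is carried in the relocation entry itself."""
    struct.pack_into("<I", section, offset, (symbol_value + addend) & MASK32)

# i = a[1000], assuming 4 byte elements: the addend is 4000 bytes.
A_ADDR = 0x08049000  # hypothetical final address of `a'

rel_section = bytearray(struct.pack("<i", 4000))  # addend lives in the contents
rela_section = bytearray(4)                       # contents may simply be zero

apply_rel32(rel_section, 0, A_ADDR)
apply_rela32(rela_section, 0, A_ADDR, 4000)
```

Both calls leave the same bytes in the section.  The difference is
that the REL version must read and decode the section contents before
it can do anything, and a relocateable link that changes the addend
must write it back into the contents rather than into the relocation
entry.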

RELA relocations also permit more flexible handling of split
instructions.  When the addend has to be gathered together from two or
more instructions, the linker has to be able to find them all, which
makes it harder to duplicate some and even harder to move them around.
This takes away flexibility from the compiler, which may want to be
able to move the instructions around freely in order to fill delay
slots and the like.
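
As an illustration of the split case, here is a sketch in Python of an
addend stored across a high/low pair of 16 bit immediate fields, in
the style of a load-upper followed by a sign-extending add.  The
helper names and the two instruction words are assumptions chosen for
the example; this is not the MIPS ELF code in BFD.

```python
def sign_extend16(x: int) -> int:
    """Interpret the low 16 bits of x as a signed value."""
    return (x & 0x7FFF) - (x & 0x8000)

def gather_addend(hi_insn: int, lo_insn: int) -> int:
    """REL style: rebuild the addend from the two instruction words."""
    return ((hi_insn & 0xFFFF) << 16) + sign_extend16(lo_insn)

def scatter_value(hi_insn: int, lo_insn: int, value: int):
    """Write a 32 bit value back into the pair's immediate fields."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFFF
    # The low half is sign-extended when added, so round the high half up.
    hi = ((value + 0x8000) >> 16) & 0xFFFF
    return (hi_insn & 0xFFFF0000) | hi, (lo_insn & 0xFFFF0000) | lo

# Illustrative instruction words with empty immediate fields.
HI_INSN = 0x3C080000
LO_INSN = 0x25080000

hi_word, lo_word = scatter_value(HI_INSN, LO_INSN, 0x12348000)
recovered = gather_addend(hi_word, lo_word)
```

The linker can only run gather_addend if it knows which two
instructions form a pair.  With RELA each relocation entry carries the
complete addend, so no pairing is needed and the instructions can be
scheduled independently.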

A statically linked executable contains no relocation information, so
there is no performance difference between REL and RELA relocations.

A dynamically linked executable does contain relocation entries.  On
a 32 bit target a REL relocation is 8 bytes and a RELA relocation is
12 bytes; on a 64 bit target they are 16 and 24 bytes.  So a RELA
relocation is 4 bytes larger than a REL relocation (8 bytes larger
for a 64 bit target).
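
Those sizes follow from the ELF entry layouts: an offset word and an
info word, plus an addend word for RELA.  A quick Python check, with
the format strings standing in for the C structs (the dictionary name
is just for this example):

```python
import struct

# "<" selects standard sizes with no padding.
# I = 32 bit unsigned, i = 32 bit signed, Q = 64 bit unsigned, q = 64 bit signed.
ENTRY_FORMATS = {
    "Elf32_Rel":  "<II",   # r_offset, r_info
    "Elf32_Rela": "<IIi",  # r_offset, r_info, r_addend
    "Elf64_Rel":  "<QQ",
    "Elf64_Rela": "<QQq",
}

sizes = {name: struct.calcsize(fmt) for name, fmt in ENTRY_FORMATS.items()}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```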

I checked /lib/libc.so.6 on an i386 RedHat 5.2 GNU/Linux system.  It
has 1914 dynamic relocations.  With REL relocations, that takes 15312
bytes, or 4 pages (rounding up) assuming a 4K page size as on the
i386.  With RELA relocations, it would take 22968 bytes, or 6 pages.
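
The arithmetic can be reproduced with a few lines of Python.  The
function name is made up for the example, and the page count here
rounds up, so a partly filled page counts as a whole page; it also
assumes the relocations start on a page boundary.

```python
PAGE_SIZE = 4096  # 4K pages, as on the i386

def reloc_table_cost(count: int, entry_size: int, page_size: int = PAGE_SIZE):
    """Return (total bytes, whole pages touched) for a relocation table."""
    total = count * entry_size
    pages = -(-total // page_size)  # integer ceiling division
    return total, pages

rel_bytes, rel_pages = reloc_table_cost(1914, 8)     # 32 bit REL entries
rela_bytes, rela_pages = reloc_table_cost(1914, 12)  # 32 bit RELA entries
extra_pages = rela_pages - rel_pages
```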

So for this library the cost of using RELA relocations would be
reading 2 extra disk pages from the cache each time a program is run.
(The ix86 uses REL relocations, incidentally.)

Actually, it's not so simple, because the dynamic linker does not have
to process the jump table relocation entries each time.  They can be
handled lazily.  However, most programs will force at least some jump
table relocations to be processed, so it is likely that the dynamic
linker will have to read all the dynamic relocations at some point.

A complex program may conceivably have enough dynamic relocations to
force an extra page to be read, but I doubt that is common.

Ian