public inbox for gcc-bugs@sourceware.org
* [Bug target/43129]  New: Simplify global variable's address loading with option -fpic
@ 2010-02-20  8:28 carrot at google dot com
  2010-02-20 12:41 ` [Bug target/43129] " steven at gcc dot gnu dot org
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: carrot at google dot com @ 2010-02-20  8:28 UTC (permalink / raw)
  To: gcc-bugs

Compile the following code with options -march=armv5te -mthumb -Os -fpic

extern int i;
int foo(int j)
{
  int t = i;
  i = j;
  return t;
}

GCC generates the following code:

foo:
        ldr     r3, .L2        // A
        ldr     r2, .L2+4      // B
.LPIC0:
        add     r3, pc         // A
        ldr     r3, [r3, r2]   // B
        @ sp needed for prologue
        ldr     r2, [r3]
        str     r0, [r3]
        mov     r0, r2
        bx      lr
.L3:
        .align  2
.L2:
        .word   _GLOBAL_OFFSET_TABLE_-(.LPIC0+4)  // A
        .word   i(GOT)               // B

The instructions marked A compute the address of the GOT; the instructions
marked B load the global variable's GOT entry to get the variable's actual
address. In total there are 4 instructions and 2 constant pool entries. This
can be simplified by exploiting the fact that the offset from label .LPIC0 to
any GOT entry is fixed at link time. The result is:

foo:
        ldr     r3, .L2        // C
.LPIC0:
        add     r3, pc         // C
        ldr     r3, [r3]       // C
        @ sp needed for prologue
        ldr     r2, [r3]
        str     r0, [r3]
        mov     r0, r2
        bx      lr
.L3:
        .align  2
.L2:
        .word   ABS_ADDRESS_OF_GOT_ENTRY_FOR_i -(.LPIC0+4) // C

The instructions marked C load the actual address of the global variable
directly. This sequence uses only 3 instructions and 1 constant pool entry, so
it is both smaller and faster.

But sequence C is not always beneficial. If a function accesses many global
variables, sequence A is needed only once, and with sequence B each global
variable needs 2 extra instructions to load its address. With sequence C, each
global variable needs 3 instructions to load its address.

Suppose there are n global variables. Counting 2 bytes per Thumb instruction
and 4 bytes per constant pool entry, the code size needed to compute the actual
addresses with sequences A and B is:

code_size(A) + code_size(B) * n = 2*2 + 4 + (2*2 + 4) * n = 8n + 8         <1>

The code size needed by instruction sequence C is:

code_size(C) * n = (3*2 + 4) * n = 10n                                  <2>

Setting <1> = <2>, we get

   8n + 8 = 10n
        n = 4

So if more than 4 global variables are accessed, sequences A and B are smaller;
if fewer than 4 are accessed, sequence C is smaller. With exactly 4 global
variables both methods have the same code size, but sequence C has one fewer
memory load (the load in sequence A) and uses one fewer register (the register
that holds the GOT address), so sequence C is still faster.

For the ARM instruction set both methods use the same instruction sequences,
but each instruction is 4 bytes instead of 2, so now we have

   4 * 2 + 4 + (4 * 2 + 4) * n = (4 * 3 + 4) * n
                             n = 3

So the threshold value of n is 3 for the ARM instruction set.
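
The arithmetic for both instruction sets can be checked with a small sketch.
It is only an illustration of the size model used in this report: the function
names are made up here, and it assumes every instruction in a sequence has the
same width (2 bytes for Thumb, 4 bytes for ARM) and every constant pool entry
is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption from the report: each constant pool entry is one 4-byte word. */
#define POOL_BYTES 4u

/* Bytes for sequence A (emitted once) plus sequence B (once per variable).
   Both A and B consist of 2 instructions and 1 constant pool entry. */
static size_t size_ab(size_t insn_bytes, size_t n)
{
    size_t a = 2 * insn_bytes + POOL_BYTES;
    size_t b = 2 * insn_bytes + POOL_BYTES;
    return a + b * n;
}

/* Bytes for sequence C: 3 instructions and 1 pool entry per variable. */
static size_t size_c(size_t insn_bytes, size_t n)
{
    return (3 * insn_bytes + POOL_BYTES) * n;
}

/* Smallest n at which sequence C stops being strictly smaller than A+B.
   insn_bytes must be nonzero, otherwise the loop would not terminate. */
static size_t break_even(size_t insn_bytes)
{
    size_t n = 1;
    while (size_c(insn_bytes, n) < size_ab(insn_bytes, n))
        n++;
    return n;
}

/* Hypothetical self-check reproducing the two thresholds derived above. */
static void check_report_thresholds(void)
{
    assert(break_even(2) == 4);  /* Thumb: 8n + 8 = 10n */
    assert(break_even(4) == 3);  /* ARM: 12n + 12 = 16n */
}
```

Calling check_report_thresholds() from a test driver passes silently when the
model matches the numbers above. The model counts only code size; it does not
capture the register and load savings of sequence C mentioned earlier.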

Now the problem is how to represent the offset from a code label to a global
variable's GOT entry. Ian mentioned that the ARM relocation R_ARM_GOT_PREL can
be used, but I can't find how to express this relocation in the GNU assembler.

Any suggestions?


-- 
           Summary: Simplify global variable's address loading with option
                    -fpic
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43129



end of thread, other threads:[~2014-02-16 13:15 UTC | newest]

Thread overview: 14+ messages
     [not found] <bug-43129-4@http.gcc.gnu.org/bugzilla/>
2010-10-14 16:33 ` [Bug target/43129] Simplify global variable's address loading with option -fpic stephen.clarke at st dot com
     [not found] ` <201010141633.o9EGXcYL022132@rly04e.srv.mailcontrol.com>
2010-10-14 16:39   ` Ramana Radhakrishnan
2010-10-14 16:40 ` ramana.radhakrishnan at arm dot com
2010-10-14 17:02 ` stephen.clarke at st dot com
2014-02-16 13:15 ` jackie.rosen at hushmail dot com
2010-02-20  8:28 [Bug target/43129] New: " carrot at google dot com
2010-02-20 12:41 ` [Bug target/43129] " steven at gcc dot gnu dot org
2010-02-22 14:10 ` rearnsha at gcc dot gnu dot org
2010-03-16  6:24 ` carrot at google dot com
2010-03-30 14:36 ` ebotcazou at gcc dot gnu dot org
2010-03-31  8:01 ` carrot at google dot com
2010-04-07 15:45 ` rth at gcc dot gnu dot org
2010-04-08 13:32 ` carrot at google dot com
2010-04-11 12:09 ` carrot at google dot com
2010-04-11 12:13 ` carrot at google dot com
