public inbox for gcc-bugs@sourceware.org
* [Bug target/63634] New: Compiler generated R_AARCH64_TLSLE_ADD_TPREL_HI12/LO12 pair overflowed by large TP offset
@ 2014-10-23 23:08 shenhan at google dot com
  2014-11-27 14:34 ` [Bug target/63634] " ramana at gcc dot gnu.org
From: shenhan at google dot com @ 2014-10-23 23:08 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63634

            Bug ID: 63634
           Summary: Compiler generated R_AARCH64_TLSLE_ADD_TPREL_HI12/LO12
                    pair overflowed by large TP offset
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: shenhan at google dot com
                CC: jingyu at google dot com, shenhan at google dot com
            Target: aarch64

Created attachment 33798
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33798&action=edit
Test case

On trunk, the compiler by default generates the following instruction and
relocation pairs to access a TLS variable via its offset from TP:

main:
        mrs     x3, tpidr_el0
        mov     w2, 7
        add     x0, x3, #:tprel_hi12:.LANCHOR0
        add     x0, x0, #:tprel_lo12_nc:.LANCHOR0
        add     x1, x0, 1000
.L2:
        strb    w2, [x0], 1
        cmp     x0, x1
        bne     .L2
>>      add     x3, x3, #:tprel_hi12:.LANCHOR1
>>      add     x3, x3, #:tprel_lo12_nc:.LANCHOR1
        ldr     w0, [x3, 1024]
        ret


The instructions marked with ">>" add an offset to x3, which holds the TP
(thread pointer); the actual offset value is filled in by the linker at static
link time. However, each add carries only a 12-bit immediate, so the pair limits
offsets to 24 bits, and the attached test case fails as a result. (Worse, bfd
fails silently and produces a bad binary.)
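
To make the 24-bit limit concrete, here is a minimal C sketch of how a TP offset is split across the two add immediates. The helper names are hypothetical, and the model assumes that bits above bit 23 are simply dropped, which matches the silent failure described above rather than any documented linker behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two 12-bit immediates can cover offsets in [0, 2^24). */
#define TPREL_ADD_PAIR_LIMIT (UINT64_C(1) << 24)

/* Bits [23:12] of the offset: immediate of the first add (applied "lsl 12"). */
static uint32_t tprel_hi12(uint64_t offset)
{
    return (uint32_t)((offset >> 12) & 0xfff);
}

/* Bits [11:0] of the offset: immediate of the second add (no overflow check). */
static uint32_t tprel_lo12_nc(uint64_t offset)
{
    return (uint32_t)(offset & 0xfff);
}

/* Offset that the add/add pair actually applies to TP (assumed truncation). */
static uint64_t tprel_add_pair_result(uint64_t offset)
{
    uint64_t applied = ((uint64_t)tprel_hi12(offset) << 12) + tprel_lo12_nc(offset);
    assert(applied < TPREL_ADD_PAIR_LIMIT); /* the pair can never reach 2^24 */
    return applied;
}

/* True when the requested offset survives the split unchanged. */
static bool tprel_add_pair_fits(uint64_t offset)
{
    return offset < TPREL_ADD_PAIR_LIMIT;
}
```

Under these assumptions, an offset of 0x1000010 would be applied as 0x10, so the access lands in the wrong place without any diagnostic.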

To fix this, the compiler/assembler needs to generate at least the following to
support 32-bit TP offsets:

>>      movk xx, #:R_AARCH64_TLSLE_MOVW_TPREL_G1, lsl 16
>>      movk xx, #:R_AARCH64_TLSLE_MOVW_TPREL_G0
>>      add  x3, xx
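
The arithmetic behind the proposed sequence can be sketched in C as follows. This is only an illustration of the G1/G0 split under the assumption that the scratch register starts out cleared; the function names are hypothetical and do not come from the compiler or the linker.

```c
#include <assert.h>
#include <stdint.h>

/* Bits [31:16] of the TP offset: immediate of the move that uses "lsl 16". */
static uint16_t tprel_g1(uint32_t offset)
{
    return (uint16_t)(offset >> 16);
}

/* Bits [15:0] of the TP offset: immediate of the second move. */
static uint16_t tprel_g0(uint32_t offset)
{
    return (uint16_t)(offset & 0xffff);
}

/* Build the offset in a scratch register, then add it to the thread pointer. */
static uint64_t tls_address(uint64_t tp, uint32_t offset)
{
    /* First move: place the high half (scratch assumed to be zero before). */
    uint64_t scratch = (uint64_t)tprel_g1(offset) << 16;
    /* Second move: replace the low 16 bits, keep everything else. */
    scratch = (scratch & ~UINT64_C(0xffff)) | tprel_g0(offset);
    assert(scratch == offset); /* the two halves together cover 32 bits */
    return tp + scratch;
}
```

With this split, an offset such as 0x01000010, which overflows the 24-bit add pair, is reconstructed exactly.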

From: "carrot at google dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/63635] New: Reduce toc relative address computation for multiple data access
Date: Thu, 23 Oct 2014 23:23:00 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63635

            Bug ID: 63635
           Summary: Reduce toc relative address computation for multiple
                    data access
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: carrot at google dot com
            Target: powerpc64le

Currently ppc gcc generates two instructions to compute the address of each
non-local data item. If the data layout is known to the compiler, one
instruction can be saved for the second and each later address computation.
The following is an example:

#include <stdio.h>

static int a,b,c;

void bar(int x)
{
  a = b = c = x;
}

int foo()
{
  return a+b+c;
}

int aa = 1;
int bb = 2;
int cc = 3;
int asdf()
{
  return aa + bb + cc;
}

int main()
{
  printf("Hello");
  printf(", ");
  printf("world.\n");
}

Compile it with options -O2 -m64 -mvsx -mcpu=power8

Function asdf is compiled to:

asdf:
0:      addis 2,12,.TOC.-0b@ha
        addi 2,2,.TOC.-0b@l
        .localentry     asdf,.-asdf
        addis 3,2,.LANCHOR1@toc@ha       // A
        addis 10,2,.LANCHOR1+4@toc@ha    // B
        addis 9,2,.LANCHOR1+8@toc@ha     // C
        lwz 3,.LANCHOR1@toc@l(3)         // D
        lwz 10,.LANCHOR1+4@toc@l(10)     // E
        lwz 9,.LANCHOR1+8@toc@l(9)       // F
        add 3,3,10
        add 3,3,9
        extsw 3,3
        blr

...

        .globl cc
        .globl bb
        .globl aa
        .section        ".data"
        .align 2
        .set    .LANCHOR1,. + 0
        .type   aa, @object
        .size   aa, 4
aa:
        .long   1
        .type   bb, @object
        .size   bb, 4
bb:
        .long   2
        .type   cc, @object
        .size   cc, 4
cc:
        .long   3

Since the data layout of aa, bb and cc is known to the compiler and the distance
between them is less than 64k, the code sequence A-F can be optimized to:

        addis 3,2,.LANCHOR1@toc@ha
        addi  3,3,.LANCHOR1@toc@l
        lwz 10,4(3)
        lwz 9,8(3)
        lwz 3,0(3)

Other functions can be similarly optimized.
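
The condition for sharing one base address can be sketched in C. The helpers below are hypothetical; they model the usual @ha/@l split of a TOC-relative offset and the signed 16-bit displacement of a D-form load such as lwz, and assume the common two's complement narrowing behaviour for the int16_t cast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* @l: low 16 bits of the offset, interpreted as a signed displacement. */
static int16_t toc_lo(int32_t off)
{
    /* Narrowing wraps on the usual two's complement targets (assumption). */
    return (int16_t)(uint16_t)(off & 0xffff);
}

/* @ha: high part, adjusted so that (ha << 16) + sign-extended @l == off. */
static int32_t toc_ha(int32_t off)
{
    int64_t rest = (int64_t)off - toc_lo(off);
    assert(rest % 65536 == 0); /* what is left is a multiple of 2^16 */
    return (int32_t)(rest / 65536);
}

/* A D-form load can reach base + delta only if delta fits in 16 signed bits. */
static bool fits_d_form(int64_t delta)
{
    return delta >= -32768 && delta <= 32767;
}
```

In the example above, bb and cc sit at .LANCHOR1+4 and .LANCHOR1+8, so both deltas fit the displacement field and a single addis/addi pair is enough. Note that the signed field reaches only 32767 bytes forward of the base, so a distance close to 64k would need the base placed between the items.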



* [Bug target/63634] Compiler generated R_AARCH64_TLSLE_ADD_TPREL_HI12/LO12 pair overflowed by large TP offset
  2014-10-23 23:08 [Bug target/63634] New: Compiler generated R_AARCH64_TLSLE_ADD_TPREL_HI12/LO12 pair overflowed by large TP offset shenhan at google dot com
@ 2014-11-27 14:34 ` ramana at gcc dot gnu.org
From: ramana at gcc dot gnu.org @ 2014-11-27 14:34 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63634

Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                  |ASSIGNED
   Last reconfirmed|                             |2014-11-27
                 CC|                             |ramana at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org|renlin.li at arm dot com
     Ever confirmed|0                            |1

