From: "H.J. Lu" <hjl.tools@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: gcc-patches@gcc.gnu.org, Richard Henderson <rth@redhat.com>
Subject: PATCH: Properly generate X32 IE sequence
Date: Fri, 09 Mar 2012 22:26:00 -0000 [thread overview]
Message-ID: <CAMe9rOq3L51bwgiVKnzupBtdck==dKvrM9Ts5Dtv2-3OtiVUWg@mail.gmail.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 6145 bytes --]
On Mon, Mar 5, 2012 at 9:25 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Mar 5, 2012 at 6:03 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>> X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
>>>> by checking
>>>>
>>>> movq foo@gottpoff(%rip), %reg
>>>>
>>>> and
>>>>
>>>> addq foo@gottpoff(%rip), %reg
>>>>
>>>> It uses the REX prefix to avoid the last byte of the previous
>>>> instruction. With 32bit Pmode, we may not have the REX prefix and
>>>> the last byte of the previous instruction may be an offset, which
>>>> may look like a REX prefix. IE->LE optimization will generate corrupted
>>>> binary. This patch makes sure we always output an REX pfrefix for
>>>> UNSPEC_GOTNTPOFF. OK for trunk?
>>>
>>> Actually, linker has:
>>>
>>> case R_X86_64_GOTTPOFF:
>>> /* Check transition from IE access model:
>>> mov foo@gottpoff(%rip), %reg
>>> add foo@gottpoff(%rip), %reg
>>> */
>>>
>>> /* Check REX prefix first. */
>>> if (offset >= 3 && (offset + 4) <= sec->size)
>>> {
>>> val = bfd_get_8 (abfd, contents + offset - 3);
>>> if (val != 0x48 && val != 0x4c)
>>> {
>>> /* X32 may have 0x44 REX prefix or no REX prefix. */
>>> if (ABI_64_P (abfd))
>>> return FALSE;
>>> }
>>> }
>>> else
>>> {
>>> /* X32 may not have any REX prefix. */
>>> if (ABI_64_P (abfd))
>>> return FALSE;
>>> if (offset < 2 || (offset + 3) > sec->size)
>>> return FALSE;
>>> }
>>>
>>> So, it should handle the case without REX just OK. If it doesn't, then
>>> this is a bug in binutils.
>>>
>>
>> The last byte of the displacement in the previous instruction
>> may happen to look like a REX byte. In that case, linker
>> will overwrite the last byte of the previous instruction and
>> generate the wrong instruction sequence.
>>
>> I need to update linker to enforce the REX byte check.
>
> One important observation: if we want to follow the x86_64 TLS spec
> strictly, we have to use existing DImode patterns only. This also
> means that we should NOT convert other TLS patterns to Pmode, since
> they explicitly state movq and addq. If this is not the case, then we
> need new TLS specification for X32.
Here is a patch to properly generate X32 IE sequence.
This is the summary of differences between x86-64 TLS and x32 TLS:
x86-64 x32
GD
byte 0x66; leaq foo@tlsgd(%rip),%rdi; leaq foo@tlsgd(%rip),%rdi;
.word 0x6666; rex64; call __tls_get_addr@plt .word 0x6666; rex64;
call __tls_get_addr@plt
GD->IE optimization
movq %fs:0,%rax; addq x@gottpoff(%rip),%rax movl %fs:0,%eax;
addq x@gottpoff(%rip),%rax
GD->LE optimization
movq %fs:0,%rax; leaq x@tpoff(%rax),%rax movl %fs:0,%eax;
leaq x@tpoff(%rax),%rax
LD
leaq foo@tlsld(%rip),%rdi; leaq foo@tlsld(%rip),%rdi;
call __tls_get_addr@plt call __tls_get_addr@plt
LD->LE optimization
.word 0x6666; .byte 0x66; movq %fs:0, %rax nopl 0x0(%rax); movl
%fs:0, %eax
IE
movq %fs:0,%reg64; movl %fs:0,%reg32;
addq x@gottpoff(%rip),%reg64 addl x@gottpoff(%rip),%reg32
or
Not supported if
Pmode == SImode
movq x@gottpoff(%rip),%reg64; movq x@gottpoff(%rip),%reg64;
movq %fs:(%reg64),%reg32 movl %fs:(%reg64), %reg32
IE->LE optimization
movq %fs:0,%reg64; movl %fs:0,%reg32;
addq x@gottpoff(%rip),%reg64 addl x@gottpoff(%rip),%reg32
to
movq %fs:0,%reg64; movl %fs:0,%reg32;
addq foo@tpoff, %reg64 addl foo@tpoff, %reg32
movq %fs:0,%reg64; movl %fs:0,%reg32;
leaq foo@tpoff(%reg64), %reg64 leal foo@tpoff(%reg32), %reg32
or
movq x@gottpoff(%rip),%reg64 movq x@gottpoff(%rip),%reg64;
movl %fs:(%reg64),%reg32 movl %fs:(%reg64), %reg32
to
movq foo@tpoff, %reg64 movq foo@tpoff, %reg64
movl %fs:(%reeg64),%reg32 movl %fs:(%reg64), %reg32
LE
movq %fs:0,%reg64; movl %fs:0,%reg32;
leaq x@tpoff(%reg64),%reg32 leal x@tpoff(%reg32),%reg32
or
movq %fs:0,%reg64; movl %fs:0,%reg32;
addq $x@tpoff,%reg64 addl $x@tpoff,%reg32
or
movq %fs:0,%reg64; movl %fs:0,%reg32;
movl x@tpoff(%reg64),%reg32 movl x@tpoff(%reg32),%reg32
or
movl %fs:x@tpoff,%reg32 movl %fs:x@tpoff,%reg32
X32 TLS implementation is straight forward, except for IE:
1. Since address override works only on the (reg32) part in fs:(reg32),
we can't use it as memory operand. This patch changes ix86_decompose_address
to disallow fs:(reg) if Pmode != word_mode.
2. When Pmode == SImode, there may be no REX prefix for ADD. Avoid
any instructions between MOV and ADD, which may interfere linker
IE->LE optimization, since the last byte of the previous instruction
before ADD may look like a REX prefix. This patch adds tls_initial_exec_x32
to make sure that we always have
movl %fs:0, %reg32
addl xgottpoff(%rip), %reg32
so that the last byte of the previous instruction before ADD will
never be a REX byte. Tested on Linux/x32.
--
H.J.
--
2012-03-09 H.J. Lu <hongjiu.lu@intel.com>
* config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
if Pmode != word_mode.
(legitimize_tls_address): Call gen_tls_initial_exec_x32 if
Pmode == SImode for x32.
* config/i386/i386.md (UNSPEC_TLS_IE_X32): New.
(tls_initial_exec_x32): Likewise.
[-- Attachment #2: gcc-x32-tls-1.patch --]
[-- Type: text/plain, Size: 2698 bytes --]
2012-03-09 H.J. Lu <hongjiu.lu@intel.com>
* config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
if Pmode != word_mode.
(legitimize_tls_address): Call gen_tls_initial_exec_x32 if
Pmode == SImode for x32.
* config/i386/i386.md (UNSPEC_TLS_IE_X32): New.
(tls_initial_exec_x32): Likewise.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 15465c2..312b50c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11524,6 +11534,11 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
else
disp = addr; /* displacement */
+ /* Since address override works only on the (reg32) part in fs:(reg32),
+ we can't use it as memory operand. */
+ if (Pmode != word_mode && seg == SEG_FS && (base || index))
+ return 0;
+
if (index)
{
if (REG_P (index))
@@ -12618,6 +12643,17 @@ legitimize_tls_address (rtx x, enum tls_model model, bool for_mov)
emit_insn (gen_tls_initial_exec_64_sun (dest, x));
return dest;
}
+ else if (Pmode == SImode)
+ {
+ /* Always generate
+ movl %fs:0, %reg32
+ addl xgottpoff(%rip), %reg32
+ to support linker IE->LE optimization and avoid
+ fs:(%reg32) as memory operand. */
+ dest = gen_reg_rtx (Pmode);
+ emit_insn (gen_tls_initial_exec_x32 (dest, x));
+ return dest;
+ }
pic = NULL;
type = UNSPEC_GOTNTPOFF;
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 188c982..d1fa997 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -95,6 +95,7 @@
UNSPEC_TLS_LD_BASE
UNSPEC_TLSDESC
UNSPEC_TLS_IE_SUN
+ UNSPEC_TLS_IE_X32
;; Other random patterns
UNSPEC_SCAS
@@ -12775,6 +12776,28 @@
}
[(set_attr "type" "multi")])
+;; When Pmode == SImode, there may be no REX prefix for ADD. Avoid
+;; any instructions between MOV and ADD, which may interfere linker
+;; IE->LE optimization, since the last byte of the previous instruction
+;; before ADD may look like a REX prefix. This also avoids
+;; movl x@gottpoff(%rip), %reg32
+;; movl $fs:(%reg32), %reg32
+;; Since address override works only on the (reg32) part in fs:(reg32),
+;; we can't use it as memory operand.
+(define_insn "tls_initial_exec_x32"
+ [(set (match_operand:SI 0 "register_operand" "=r")
+ (unspec:SI
+ [(match_operand:SI 1 "tls_symbolic_operand" "")]
+ UNSPEC_TLS_IE_X32))
+ (clobber (reg:CC FLAGS_REG))]
+ "TARGET_X32"
+{
+ output_asm_insn
+ ("mov{l}\t{%%fs:0, %0|%0, DWORD PTR fs:0}", operands);
+ return "add{l}\t{%a1@gottpoff(%%rip), %0|%0, %a1@gottpoff[rip]}";
+}
+ [(set_attr "type" "multi")])
+
;; GNU2 TLS patterns can be split.
(define_expand "tls_dynamic_gnu2_32"
next reply other threads:[~2012-03-09 22:26 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-09 22:26 H.J. Lu [this message]
2012-03-10 13:10 ` Uros Bizjak
2012-03-10 18:50 ` H.J. Lu
2012-03-11 17:12 ` H.J. Lu
2012-03-11 17:55 ` Uros Bizjak
2012-03-11 18:16 ` H.J. Lu
2012-03-11 18:21 ` Uros Bizjak
2012-03-11 21:25 ` H.J. Lu
2012-03-12 19:39 ` Uros Bizjak
2012-03-12 22:35 ` H.J. Lu
2012-03-13 1:21 ` H.J. Lu
2012-03-13 7:11 ` Uros Bizjak
2012-03-13 10:37 ` Uros Bizjak
2012-03-13 15:47 ` H.J. Lu
2012-03-17 17:53 ` H.J. Lu
2012-03-17 18:10 ` Uros Bizjak
2012-03-17 18:19 ` H.J. Lu
2012-03-17 18:21 ` Uros Bizjak
2012-03-17 21:50 ` H.J. Lu
2012-03-18 16:02 ` Uros Bizjak
2012-03-18 20:55 ` Uros Bizjak
2012-03-19 15:51 ` H.J. Lu
2012-03-19 15:54 ` H.J. Lu
2012-03-19 16:20 ` H.J. Lu
2012-03-19 16:35 ` H.J. Lu
2012-03-19 16:38 ` Uros Bizjak
2012-03-19 16:47 ` H.J. Lu
2012-03-19 16:49 ` Uros Bizjak
2012-03-19 16:56 ` H.J. Lu
2012-03-19 17:02 ` Uros Bizjak
2012-03-19 17:30 ` Uros Bizjak
2012-03-19 17:50 ` H.J. Lu
2012-03-19 19:14 ` Uros Bizjak
2012-03-20 9:35 ` Paolo Bonzini
2012-03-19 16:47 ` Uros Bizjak
2012-03-20 8:52 ` Eric Botcazou
2012-03-20 8:59 ` Jakub Jelinek
2012-03-20 11:20 ` Jakub Jelinek
2012-03-20 15:52 ` H.J. Lu
2012-03-20 17:55 ` Uros Bizjak
2012-03-20 18:27 ` H.J. Lu
2012-03-20 18:44 ` Uros Bizjak
2012-03-20 19:26 ` H.J. Lu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMe9rOq3L51bwgiVKnzupBtdck==dKvrM9Ts5Dtv2-3OtiVUWg@mail.gmail.com' \
--to=hjl.tools@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=rth@redhat.com \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).