* [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
@ 2015-05-09 13:52 H.J. Lu
2015-05-11 7:28 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: H.J. Lu @ 2015-05-09 13:52 UTC (permalink / raw)
To: binutils
The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
jmp and jcc in 64-bit mode.
gas/testsuite/
PR binutis/18386
* gas/i386/i386.exp: Run x86-64-jump.
* gas/i386/x86-64-branch.d: Updated.
* gas/i386/ilp32/x86-64-branch.d: Likewise.
* gas/i386/x86-64-branch.s: Add tests for the operand size prefix
with call, jmp and jb.
* gas/i386/x86-64-jump.d: New file.
* gas/i386/x86-64-jump.s: Likewise.
ld/testsuite/
PR binutis/18386
* ld-x86-64/tlsgdesc.dd: Updated.
* ld-x86-64/tlspic.dd: Likewise.
opcodes/
PR binutis/18386
* i386-dis.c (X86_64_E8): New.
(X86_64_E9): Likewise.
Update comments on 'T', 'U', 'V'. Add comments for '^'.
(dis386): Replace callT/jmpT with X86_64_E8/X86_64_E9.
(x86_64_table): Add X86_64_E8 and X86_64_E9.
(mod_table): Replace {T|} with ^ on Jcall/Jmp.
(putop): Handle '^'.
(OP_J): Ignore the operand size prefix in 64-bit. Don't check
REX_W.
---
gas/testsuite/gas/i386/i386.exp | 2 ++
gas/testsuite/gas/i386/ilp32/x86-64-branch.d | 9 +++--
gas/testsuite/gas/i386/x86-64-branch.d | 9 +++--
gas/testsuite/gas/i386/x86-64-branch.s | 7 ++++
gas/testsuite/gas/i386/x86-64-jump.d | 43 ++++++++++++++++++++++++
gas/testsuite/gas/i386/x86-64-jump.s | 41 +++++++++++++++++++++++
ld/testsuite/ld-x86-64/tlsgdesc.dd | 4 +--
ld/testsuite/ld-x86-64/tlspic.dd | 8 ++---
opcodes/i386-dis.c | 50 ++++++++++++++++++++++------
9 files changed, 151 insertions(+), 22 deletions(-)
create mode 100644 gas/testsuite/gas/i386/x86-64-jump.d
create mode 100644 gas/testsuite/gas/i386/x86-64-jump.s
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index bedd84c..9492e80 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -754,6 +754,8 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
run_dump_test "x86-64-relax-2"
run_dump_test "x86-64-relax-3"
+
+ run_dump_test "x86-64-jump"
}
set ASFLAGS "$old_ASFLAGS"
diff --git a/gas/testsuite/gas/i386/ilp32/x86-64-branch.d b/gas/testsuite/gas/i386/ilp32/x86-64-branch.d
index 9118db1..9fcb8ca 100644
--- a/gas/testsuite/gas/i386/ilp32/x86-64-branch.d
+++ b/gas/testsuite/gas/i386/ilp32/x86-64-branch.d
@@ -1,7 +1,7 @@
#source: ../x86-64-branch.s
#as: -J
#objdump: -drw
-#name: x86-64 (ILP32) indirect branch
+#name: x86-64 (ILP32) branch
.*: +file format .*
@@ -20,6 +20,9 @@ Disassembly of section .text:
[ ]*[a-f0-9]+: 66 ff 20 jmpw \*\(%rax\)
[ ]*[a-f0-9]+: e8 00 00 00 00 callq 0x1f 1b: R_X86_64_PC32 \*ABS\*\+0x10003c
[ ]*[a-f0-9]+: e9 00 00 00 00 jmpq 0x24 20: R_X86_64_PC32 \*ABS\*\+0x10003c
+[ ]*[a-f0-9]+: 66 e8 00 00 00 00 data16 callq 0x2a 26: R_X86_64_PC32 foo-0x4
+[ ]*[a-f0-9]+: 66 e9 00 00 00 00 data16 jmpq 0x30 2c: R_X86_64_PC32 foo-0x4
+[ ]*[a-f0-9]+: 66 0f 82 00 00 00 00 data16 jb 0x37 33: R_X86_64_PC32 foo-0x4
[ ]*[a-f0-9]+: ff d0 callq \*%rax
[ ]*[a-f0-9]+: ff d0 callq \*%rax
[ ]*[a-f0-9]+: 66 ff d0 callw \*%ax
@@ -30,6 +33,6 @@ Disassembly of section .text:
[ ]*[a-f0-9]+: 66 ff e0 jmpw \*%ax
[ ]*[a-f0-9]+: 66 ff e0 jmpw \*%ax
[ ]*[a-f0-9]+: 66 ff 20 jmpw \*\(%rax\)
-[ ]*[a-f0-9]+: e8 00 00 00 00 callq 0x43 3f: R_X86_64_PC32 \*ABS\*\+0x10003c
-[ ]*[a-f0-9]+: e9 00 00 00 00 jmpq 0x48 44: R_X86_64_PC32 \*ABS\*\+0x10003c
+[ ]*[a-f0-9]+: e8 00 00 00 00 callq 0x56 52: R_X86_64_PC32 \*ABS\*\+0x10003c
+[ ]*[a-f0-9]+: e9 00 00 00 00 jmpq 0x5b 57: R_X86_64_PC32 \*ABS\*\+0x10003c
#pass
diff --git a/gas/testsuite/gas/i386/x86-64-branch.d b/gas/testsuite/gas/i386/x86-64-branch.d
index fee2099..49e17a4 100644
--- a/gas/testsuite/gas/i386/x86-64-branch.d
+++ b/gas/testsuite/gas/i386/x86-64-branch.d
@@ -1,6 +1,6 @@
#as: -J
#objdump: -dw
-#name: x86-64 indirect branch
+#name: x86-64 branch
.*: +file format .*
@@ -19,6 +19,9 @@ Disassembly of section .text:
[ ]*[a-f0-9]+: 66 ff 20 jmpw \*\(%rax\)
[ ]*[a-f0-9]+: e8 (00|5b) 00 (00|10) 00 callq (0x1f|10007a <.text\+0x10007a>)
[ ]*[a-f0-9]+: e9 (00|60) 00 (00|10) 00 jmpq (0x24|100084 <.text\+0x100084>)
+[ ]*[a-f0-9]+: 66 e8 00 00 00 00 data16 callq (0x2a|2a <.text\+0x2a>)
+[ ]*[a-f0-9]+: 66 e9 00 00 00 00 data16 jmpq (0x30|30 <.text\+0x30>)
+[ ]*[a-f0-9]+: 66 0f 82 00 00 00 00 data16 jb (0x37|37 <.text\+0x37>)
[ ]*[a-f0-9]+: ff d0 callq \*%rax
[ ]*[a-f0-9]+: ff d0 callq \*%rax
[ ]*[a-f0-9]+: 66 ff d0 callw \*%ax
@@ -29,6 +32,6 @@ Disassembly of section .text:
[ ]*[a-f0-9]+: 66 ff e0 jmpw \*%ax
[ ]*[a-f0-9]+: 66 ff e0 jmpw \*%ax
[ ]*[a-f0-9]+: 66 ff 20 jmpw \*\(%rax\)
-[ ]*[a-f0-9]+: e8 (00|7f) 00 (00|10) 00 callq (0x43|1000c2 <.text\+0x1000c2>)
-[ ]*[a-f0-9]+: e9 (00|84) 00 (00|10) 00 jmpq (0x48|1000cc <.text\+0x1000cc>)
+[ ]*[a-f0-9]+: e8 (00|92) 00 (00|10) 00 callq (0x56|1000e8 <.text\+0x1000e8>)
+[ ]*[a-f0-9]+: e9 (00|97) 00 (00|10) 00 jmpq (0x5b|1000f2 <.text\+0x1000f2>)
#pass
diff --git a/gas/testsuite/gas/i386/x86-64-branch.s b/gas/testsuite/gas/i386/x86-64-branch.s
index 4c1861f..9451d76 100644
--- a/gas/testsuite/gas/i386/x86-64-branch.s
+++ b/gas/testsuite/gas/i386/x86-64-branch.s
@@ -12,6 +12,13 @@
call 0x100040
jmp 0x100040
+ .byte 0x66
+ call foo
+ .byte 0x66
+ jmp foo
+ .byte 0x66
+ jb foo
+
.intel_syntax noprefix
call rax
callq rax
diff --git a/gas/testsuite/gas/i386/x86-64-jump.d b/gas/testsuite/gas/i386/x86-64-jump.d
new file mode 100644
index 0000000..edb34e6
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-jump.d
@@ -0,0 +1,43 @@
+#objdump: -drw
+#name: x86-64 jump
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <.text>:
+[ ]*[a-f0-9]+: eb fe jmp (0x0|0 <.text>)
+[ ]*[a-f0-9]+: e9 00 00 00 00 jmpq 0x7 3: R_X86_64_PC32 xxx-0x4
+[ ]*[a-f0-9]+: ff 24 25 00 00 00 00 jmpq \*0x0 a: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: ff e7 jmpq \*%rdi
+[ ]*[a-f0-9]+: ff 27 jmpq \*\(%rdi\)
+[ ]*[a-f0-9]+: ff 2c bd 00 00 00 00 ljmp \*0x0\(,%rdi,4\) 15: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: 66 ff 2c bd 00 00 00 00 ljmpw \*0x0\(,%rdi,4\) 1d: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: ff 2c 25 00 00 00 00 ljmp \*0x0 24: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: 66 ff 2c 25 00 00 00 00 ljmpw \*0x0 2c: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: e8 cb ff ff ff callq 0x0
+[ ]*[a-f0-9]+: e8 00 00 00 00 callq 0x3a 36: R_X86_64_PC32 xxx-0x4
+[ ]*[a-f0-9]+: ff 14 25 00 00 00 00 callq \*0x0 3d: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: ff d7 callq \*%rdi
+[ ]*[a-f0-9]+: ff 17 callq \*\(%rdi\)
+[ ]*[a-f0-9]+: ff 1c bd 00 00 00 00 lcall \*0x0\(,%rdi,4\) 48: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: 66 ff 1c bd 00 00 00 00 lcallw \*0x0\(,%rdi,4\) 50: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: ff 1c 25 00 00 00 00 lcall \*0x0 57: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: 66 ff 1c 25 00 00 00 00 lcallw \*0x0 5f: R_X86_64_32S xxx
+[ ]*[a-f0-9]+: 67 e3 00 jecxz 0x66 65: R_X86_64_PC8 \$\+0x2
+[ ]*[a-f0-9]+: 90 nop
+[ ]*[a-f0-9]+: e3 00 jrcxz 0x69 68: R_X86_64_PC8 \$\+0x1
+[ ]*[a-f0-9]+: 90 nop
+[ ]*[a-f0-9]+: 66 ff 13 callw \*\(%rbx\)
+[ ]*[a-f0-9]+: ff 1b lcall \*\(%rbx\)
+[ ]*[a-f0-9]+: 66 ff 23 jmpw \*\(%rbx\)
+[ ]*[a-f0-9]+: ff 2b ljmp \*\(%rbx\)
+[ ]*[a-f0-9]+: eb 00 jmp 0x76
+[ ]*[a-f0-9]+: 90 nop
+[ ]*[a-f0-9]+: 67 e3 00 jecxz 0x7a
+[ ]*[a-f0-9]+: 90 nop
+[ ]*[a-f0-9]+: e3 00 jrcxz 0x7d
+[ ]*[a-f0-9]+: 90 nop
+[ ]*[a-f0-9]+: eb 00 jmp 0x80
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-jump.s b/gas/testsuite/gas/i386/x86-64-jump.s
new file mode 100644
index 0000000..96ae66e
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-jump.s
@@ -0,0 +1,41 @@
+.psize 0
+.text
+.extern xxx
+
+1: jmp 1b
+ jmp xxx
+ jmp *xxx
+ jmp *%rdi
+ jmp *(%rdi)
+ ljmp *xxx(,%rdi,4)
+ ljmpw *xxx(,%rdi,4)
+ ljmp *xxx
+ ljmpw *xxx
+
+ call 1b
+ call xxx
+ call *xxx
+ call *%rdi
+ call *(%rdi)
+ lcall *xxx(,%rdi,4)
+ lcallw *xxx(,%rdi,4)
+ lcall *xxx
+ lcallw *xxx
+
+ jecxz 3+$
+ nop
+ jrcxz 2+$
+ nop
+
+ .intel_syntax noprefix
+ call word ptr [rbx]
+ call fword ptr [rbx]
+ jmp word ptr [rbx]
+ jmp fword ptr [rbx]
+ jmp $+2
+ nop
+ jecxz 3+$
+ nop
+ jrcxz 2+$
+ nop
+ jmp .+2
diff --git a/ld/testsuite/ld-x86-64/tlsgdesc.dd b/ld/testsuite/ld-x86-64/tlsgdesc.dd
index 85b219f..88eb953 100644
--- a/ld/testsuite/ld-x86-64/tlsgdesc.dd
+++ b/ld/testsuite/ld-x86-64/tlsgdesc.dd
@@ -41,7 +41,7 @@ Disassembly of section .text:
+[0-9a-f]+: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x180>
+[0-9a-f]+: [0-9a-f]{2} *
# -> R_X86_64_DTPMOD64 sG1
- +[0-9a-f]+: 66 66 48 e8 ([0-9a-f]{2} ){3}[ ]+data16 data16 callq [0-9a-f]+ <__tls_get_addr@plt>
+ +[0-9a-f]+: 66 66 48 e8 ([0-9a-f]{2} ){3}[ ]+data16 data16 rex.W callq [0-9a-f]+ <__tls_get_addr@plt>
+[0-9a-f]+: [0-9a-f]{2} *
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+[0-9a-f]+: 90[ ]+nop *
@@ -66,7 +66,7 @@ Disassembly of section .text:
+[0-9a-f]+: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x160>
+[0-9a-f]+: [0-9a-f]{2} *
# -> R_X86_64_DTPMOD64 sG2
- +[0-9a-f]+: 66 66 48 e8 ([0-9a-f]{2} ){3}[ ]+data16 data16 callq [0-9a-f]+ <__tls_get_addr@plt>
+ +[0-9a-f]+: 66 66 48 e8 ([0-9a-f]{2} ){3}[ ]+data16 data16 rex.W callq [0-9a-f]+ <__tls_get_addr@plt>
+[0-9a-f]+: [0-9a-f]{2} *
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+[0-9a-f]+: 90[ ]+nop *
diff --git a/ld/testsuite/ld-x86-64/tlspic.dd b/ld/testsuite/ld-x86-64/tlspic.dd
index 26d83e9..aab8181 100644
--- a/ld/testsuite/ld-x86-64/tlspic.dd
+++ b/ld/testsuite/ld-x86-64/tlspic.dd
@@ -20,7 +20,7 @@ Disassembly of section .text:
+1008: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x180>
+100f: [0-9a-f ]+
# -> R_X86_64_DTPMOD64 sg1
- +1010: 66 66 48 e8 [0-9a-f ]+data16 data16 callq [0-9a-f]+ <.*>
+ +1010: 66 66 48 e8 [0-9a-f ]+data16 data16 rex.W callq [0-9a-f]+ <.*>
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+1017: [0-9a-f ]+
+1018: 90[ ]+nop *
@@ -40,7 +40,7 @@ Disassembly of section .text:
+1030: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x130>
+1037: [0-9a-f ]+
# -> R_X86_64_DTPMOD64 [0 0x2000000000000000]
- +1038: 66 66 48 e8 [0-9a-f ]+data16 data16 callq [0-9a-f]+ <.*>
+ +1038: 66 66 48 e8 [0-9a-f ]+data16 data16 rex.W callq [0-9a-f]+ <.*>
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+103f: [0-9a-f ]+
+1040: 90[ ]+nop *
@@ -60,7 +60,7 @@ Disassembly of section .text:
+1058: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x1a8>
+105f: [0-9a-f ]+
# -> R_X86_64_DTPMOD64 [0 0x4000000000000000]
- +1060: 66 66 48 e8 [0-9a-f ]+data16 data16 callq [0-9a-f]+ <.*>
+ +1060: 66 66 48 e8 [0-9a-f ]+data16 data16 rex.W callq [0-9a-f]+ <.*>
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+1067: [0-9a-f ]+
+1068: 90[ ]+nop *
@@ -80,7 +80,7 @@ Disassembly of section .text:
+1080: 66 48 8d 3d ([0-9a-f]{2} ){3}[ ]+data16 lea 0x[0-9a-f]+\(%rip\),%rdi +# [0-9a-f]+ <_DYNAMIC\+0x160>
+1087: [0-9a-f ]+
# -> R_X86_64_DTPMOD64 [0 0x6000000000000000]
- +1088: 66 66 48 e8 [0-9a-f ]+data16 data16 callq [0-9a-f]+ <.*>
+ +1088: 66 66 48 e8 [0-9a-f ]+data16 data16 rex.W callq [0-9a-f]+ <.*>
# -> R_X86_64_JUMP_SLOT __tls_get_addr
+108f: [0-9a-f ]+
+1090: 90[ ]+nop *
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 88c1758..941f699 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1632,6 +1632,8 @@ enum
X86_64_CE,
X86_64_D4,
X86_64_D5,
+ X86_64_E8,
+ X86_64_E9,
X86_64_EA,
X86_64_0F01_REG_0,
X86_64_0F01_REG_1,
@@ -2401,9 +2403,12 @@ struct dis386 {
is true
'R' => print 'w', 'l' or 'q' ('d' for 'l' and 'e' in Intel mode)
'S' => print 'w', 'l' or 'q' if suffix_always is true
- 'T' => print 'q' in 64bit mode and behave as 'P' otherwise
- 'U' => print 'q' in 64bit mode and behave as 'Q' otherwise
- 'V' => print 'q' in 64bit mode and behave as 'S' otherwise
+ 'T' => print 'q' in 64bit mode if instruction has no operand size
+ prefix and behave as 'P' otherwise
+ 'U' => print 'q' in 64bit mode if instruction has no operand size
+ prefix and behave as 'Q' otherwise
+ 'V' => print 'q' in 64bit mode if instruction has no operand size
+ prefix and behave as 'S' otherwise
'W' => print 'b', 'w' or 'l' ('d' in Intel mode)
'X' => print 's', 'd' depending on data16 prefix (for XMM)
'Y' => 'q' if instruction has an REX 64bit overwrite prefix and
@@ -2411,6 +2416,8 @@ struct dis386 {
'Z' => print 'q' in 64bit mode and behave as 'L' otherwise
'!' => change condition from true to false or from false to true.
'%' => add 1 upper case letter to the macro.
+ '^' => print 'w' or 'l' depending on operand size prefix or
+ suffix_always is true (lcall/ljmp).
2 upper case letter macros:
"XY" => print 'x' or 'y' if suffix_always is true or no register
@@ -2697,8 +2704,8 @@ static const struct dis386 dis386[] = {
{ "outB", { Ib, AL }, 0 },
{ "outG", { Ib, zAX }, 0 },
/* e8 */
- { "callT", { Jv, BND }, 0 },
- { "jmpT", { Jv, BND }, 0 },
+ { X86_64_TABLE (X86_64_E8) },
+ { X86_64_TABLE (X86_64_E9) },
{ X86_64_TABLE (X86_64_EA) },
{ "jmp", { Jb, BND }, 0 },
{ "inB", { AL, indirDX }, 0 },
@@ -6834,6 +6841,18 @@ static const struct dis386 x86_64_table[][2] = {
{ "aad", { Ib }, 0 },
},
+ /* X86_64_E8 */
+ {
+ { "callP", { Jv, BND }, 0 },
+ { "callq", { Jv, BND }, 0 }
+ },
+
+ /* X86_64_E9 */
+ {
+ { "jmpP", { Jv, BND }, 0 },
+ { "jmpq", { Jv, BND }, 0 }
+ },
+
/* X86_64_EA */
{
{ "Jjmp{T|}", { Ap }, 0 },
@@ -11576,11 +11595,11 @@ static const struct dis386 mod_table[][2] = {
},
{
/* MOD_FF_REG_3 */
- { "Jcall{T|}", { indirEp }, 0 },
+ { "Jcall^", { indirEp }, 0 },
},
{
/* MOD_FF_REG_5 */
- { "Jjmp{T|}", { indirEp }, 0 },
+ { "Jjmp^", { indirEp }, 0 },
},
{
/* MOD_0F01_REG_0 */
@@ -14177,6 +14196,18 @@ case_S:
*obufp++ = vex.w ? 'q': 'd';
}
break;
+ case '^':
+ if (intel_syntax)
+ break;
+ if ((prefixes & PREFIX_DATA) || (sizeflag & SUFFIX_ALWAYS))
+ {
+ if (sizeflag & DFLAG)
+ *obufp++ = 'l';
+ else
+ *obufp++ = 'w';
+ used_prefixes |= (prefixes & PREFIX_DATA);
+ }
+ break;
}
alt = 0;
}
@@ -15693,8 +15724,7 @@ OP_J (int bytemode, int sizeflag)
disp -= 0x100;
break;
case v_mode:
- USED_REX (REX_W);
- if ((sizeflag & DFLAG) || (rex & REX_W))
+ if (address_mode == mode_64bit || (sizeflag & DFLAG))
disp = get32s ();
else
{
@@ -15710,7 +15740,7 @@ OP_J (int bytemode, int sizeflag)
segment = ((start_pc + codep - start_codep)
& ~((bfd_vma) 0xffff));
}
- if (!(rex & REX_W))
+ if (address_mode != mode_64bit)
used_prefixes |= (prefixes & PREFIX_DATA);
break;
default:
--
2.1.0
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-09 13:52 [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode H.J. Lu
@ 2015-05-11 7:28 ` Jan Beulich
2015-05-11 11:04 ` H.J. Lu
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2015-05-11 7:28 UTC (permalink / raw)
To: H.J. Lu; +Cc: binutils
>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
> jmp and jcc in 64-bit mode.
Now that's again a change that should have been discussed
before committing: While it is correct for the Intel implementation
of x86-64, it isn't for AMD's original (and I'd tend to say that the
latter is more correct, since afaik there's nowhere that exceptions
are being made for the meaning of the operand size prefix).
Apart from that I find it odd that you mention only 32-bit PC-relative
branches when the same behavioral difference also applies to 8-bit
displacement ones as well as near branches to absolute addresses
(including returns). And of course there are more differences
elsewhere. Perhaps this should be hidden behind an option to
select the flavor, defaulting (imo) to the original AMD behavior.
Jan
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 7:28 ` Jan Beulich
@ 2015-05-11 11:04 ` H.J. Lu
2015-05-11 11:29 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: H.J. Lu @ 2015-05-11 11:04 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, May 11, 2015 at 12:28 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
>> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
>> jmp and jcc in 64-bit mode.
>
> Now that's again a change that should have been discussed
> before committing: While it is correct for the Intel implementation
> of x86-64, it isn't for AMD's original (and I'd tend to say that the
> latter is more correct, since afaik there's nowhere that exceptions
> are being made for the meaning of the operand size prefix).
>
It is also correct for AMD. Otherwise, Linux wouldn't run on AMD
machines, since x86-64 TLS uses 0x66 as a NOP prefix before the
call.
--
H.J.
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 11:04 ` H.J. Lu
@ 2015-05-11 11:29 ` Jan Beulich
2015-05-11 12:40 ` H.J. Lu
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2015-05-11 11:29 UTC (permalink / raw)
To: H.J. Lu; +Cc: Binutils
>>> On 11.05.15 at 13:04, <hjl.tools@gmail.com> wrote:
> On Mon, May 11, 2015 at 12:28 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
>>> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
>>> jmp and jcc in 64-bit mode.
>>
>> Now that's again a change that should have been discussed
>> before committing: While it is correct for the Intel implementation
>> of x86-64, it isn't for AMD's original (and I'd tend to say that the
>> latter is more correct, since afaik there's nowhere that exceptions
>> are being made for the meaning of the operand size prefix).
>
> It is also correct for AMD.
Very certainly not, based on practical observation: This code
x86_64:
mov rcx, rsp
.byte 0x66, 0xe8, 0x00, 0x00, 0x90, 0x90
xchg rcx, rsp
ret
yields
Reading symbols from /home/jbeulich/x86-64...done.
(gdb) break x86_64
Breakpoint 1 at 0x400520
(gdb) r
Starting program: /home/jbeulich/x86-64
Failed to read a valid object file image from memory.
Breakpoint 1, 0x0000000000400520 in x86_64 ()
(gdb) x/5i $pc
=> 0x400520 <x86_64>: mov rcx,rsp
0x400523 <x86_64+3>: call 0x527
0x400527 <x86_64+7>: nop
0x400528 <x86_64+8>: nop
0x400529 <x86_64+9>: xchg rcx,rsp
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000527 in ?? ()
with RSP also decremented by just 2.
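The arithmetic behind this observation can be sketched in C. This is an illustration only, not code from binutils or gdb: the helper names (`sext`, `call_target_rel16`, `call_target_rel32`) are made up for this sketch, and the 16-bit variant assumes, as the SIGSEGV address above suggests, that the CPU reads a 2-byte displacement and truncates the resulting instruction pointer to 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v (hypothetical helper). */
int64_t sext(uint64_t v, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    uint64_t sign = (uint64_t)1 << (bits - 1);
    v &= (sign << 1) - 1;               /* keep only the low `bits` bits */
    return (int64_t)(v ^ sign) - (int64_t)sign;
}

/* Reading 1: 66 e8 <disp16>. The instruction is 4 bytes long and the
   target is truncated to 16 bits (assumed from the observation above). */
uint64_t call_target_rel16(uint64_t insn_addr, const uint8_t *insn)
{
    uint64_t disp = (uint64_t)insn[2] | ((uint64_t)insn[3] << 8);
    uint64_t next = insn_addr + 4;      /* address of the following insn */
    return (next + (uint64_t)sext(disp, 16)) & 0xffff;
}

/* Reading 2: 66 e8 <disp32>. The prefix is ignored, the instruction is
   6 bytes long and the displacement is sign-extended to 64 bits. */
uint64_t call_target_rel32(uint64_t insn_addr, const uint8_t *insn)
{
    uint64_t disp = (uint64_t)insn[2] | ((uint64_t)insn[3] << 8)
                  | ((uint64_t)insn[4] << 16) | ((uint64_t)insn[5] << 24);
    uint64_t next = insn_addr + 6;      /* address of the following insn */
    return next + (uint64_t)sext(disp, 32);
}
```

Fed the six bytes from the example at address 0x400523, `call_target_rel16` yields 0x527, which is the faulting address in the gdb session, while `call_target_rel32` yields an address far outside the low 64 KiB.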
Jan
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 11:29 ` Jan Beulich
@ 2015-05-11 12:40 ` H.J. Lu
2015-05-11 13:22 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: H.J. Lu @ 2015-05-11 12:40 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, May 11, 2015 at 4:29 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 11.05.15 at 13:04, <hjl.tools@gmail.com> wrote:
>> On Mon, May 11, 2015 at 12:28 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
>>>> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
>>>> jmp and jcc in 64-bit mode.
>>>
>>> Now that's again a change that should have been discussed
>>> before committing: While it is correct for the Intel implementation
>>> of x86-64, it isn't for AMD's original (and I'd tend to say that the
>>> latter is more correct, since afaik there's nowhere that exceptions
>>> are being made for the meaning of the operand size prefix).
>>
>> It is also correct for AMD.
>
> Very certainly not, based on practical observation: This code
>
> x86_64:
> mov rcx, rsp
> .byte 0x66, 0xe8, 0x00, 0x00, 0x90, 0x90
> xchg rcx, rsp
> ret
>
> yields
>
> Reading symbols from /home/jbeulich/x86-64...done.
> (gdb) break x86_64
> Breakpoint 1 at 0x400520
> (gdb) r
> Starting program: /home/jbeulich/x86-64
> Failed to read a valid object file image from memory.
>
> Breakpoint 1, 0x0000000000400520 in x86_64 ()
> (gdb) x/5i $pc
> => 0x400520 <x86_64>: mov rcx,rsp
> 0x400523 <x86_64+3>: call 0x527
> 0x400527 <x86_64+7>: nop
> 0x400528 <x86_64+8>: nop
> 0x400529 <x86_64+9>: xchg rcx,rsp
> (gdb) c
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000000527 in ?? ()
>
> with RSP also decremented by just 2.
>
The displacement is still 32-bit, but stack adjustment is
different. Displaying it as callq isn't accurate for AMD, and
callw is not correct for either AMD or Intel. I will leave it
as is since I don't think we should add another instruction
for this.
BTW, thanks for letting me know. I was using 0x66 as a
nop prefix for call/jmp PIC relaxation. I have changed it to
0x48 now. Will 0x48 be a problem on AMD?
H.J.
--
H.J.
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 12:40 ` H.J. Lu
@ 2015-05-11 13:22 ` Jan Beulich
2015-05-11 13:33 ` H.J. Lu
0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2015-05-11 13:22 UTC (permalink / raw)
To: H.J. Lu; +Cc: Binutils
>>> On 11.05.15 at 14:40, <hjl.tools@gmail.com> wrote:
> On Mon, May 11, 2015 at 4:29 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 11.05.15 at 13:04, <hjl.tools@gmail.com> wrote:
>>> On Mon, May 11, 2015 at 12:28 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
>>>>> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
>>>>> jmp and jcc in 64-bit mode.
>>>>
>>>> Now that's again a change that should have been discussed
>>>> before committing: While it is correct for the Intel implementation
>>>> of x86-64, it isn't for AMD's original (and I'd tend to say that the
>>>> latter is more correct, since afaik there's nowhere that exceptions
>>>> are being made for the meaning of the operand size prefix).
>>>
>>> It is also correct for AMD.
>>
>> Very certainly not, based on practical observation: This code
>>
>> x86_64:
>> mov rcx, rsp
>> .byte 0x66, 0xe8, 0x00, 0x00, 0x90, 0x90
>> xchg rcx, rsp
>> ret
>>
>> yields
>>
>> Reading symbols from /home/jbeulich/x86-64...done.
>> (gdb) break x86_64
>> Breakpoint 1 at 0x400520
>> (gdb) r
>> Starting program: /home/jbeulich/x86-64
>> Failed to read a valid object file image from memory.
>>
>> Breakpoint 1, 0x0000000000400520 in x86_64 ()
>> (gdb) x/5i $pc
>> => 0x400520 <x86_64>: mov rcx,rsp
>> 0x400523 <x86_64+3>: call 0x527
>> 0x400527 <x86_64+7>: nop
>> 0x400528 <x86_64+8>: nop
>> 0x400529 <x86_64+9>: xchg rcx,rsp
>> (gdb) c
>> Continuing.
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x0000000000000527 in ?? ()
>>
>> with RSP also decremented by just 2.
>>
>
> The displacement is still 32-bit, but stack adjustment is
No, it's not - as you can see from the SEGV happening at address
00000527. If the displacement was 32 bits, then the faulting
address would have been 90900527.
> different. Displaying it as callq isn't accurate for AMD, and
> callw is not correct for either AMD or Intel. I will leave it
> as is since I don't think we should add another instruction
> for this.
>
> BTW, thanks for letting me know. I was using 0x66 as a
> nop prefix for call/jmp PIC relaxation. I have changed it to
> 0x48 now. Will 0x48 be a problem on AMD?
Not as far as I'm aware.
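As a rough model of why a REX.W prefix (0x48) sidesteps the vendor difference, the effective operand size of a near call/jmp in 64-bit mode can be sketched as below. The enum and function names are hypothetical, and the rule that REX.W takes precedence over 0x66 is an assumption drawn from the usual 64-bit prefix rules, not something tested in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flavor selector; not an existing binutils type. */
enum x86_64_flavor { FLAVOR_AMD64, FLAVOR_INTEL64 };

/* Effective operand size, in bits, of a near call/jmp in 64-bit mode. */
int near_branch_operand_bits(enum x86_64_flavor flavor,
                             bool has_data16, bool has_rex_w)
{
    assert(flavor == FLAVOR_AMD64 || flavor == FLAVOR_INTEL64);
    if (has_rex_w)
        return 64;      /* assumed: REX.W wins over the 0x66 prefix */
    if (has_data16 && flavor == FLAVOR_AMD64)
        return 16;      /* AMD behavior as observed earlier in the thread */
    return 64;          /* Intel ignores 0x66 here; no prefix means 64-bit */
}
```

Under this model only the combination of a 0x66 prefix without REX.W on the AMD flavor gives a 16-bit operand size. That is consistent with the `66 66 48 e8` TLS sequences in the updated ld tests behaving the same on both vendors.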
Jan
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 13:22 ` Jan Beulich
@ 2015-05-11 13:33 ` H.J. Lu
2015-05-12 9:42 ` Jan Beulich
0 siblings, 1 reply; 8+ messages in thread
From: H.J. Lu @ 2015-05-11 13:33 UTC (permalink / raw)
To: Jan Beulich; +Cc: Binutils
On Mon, May 11, 2015 at 6:22 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 11.05.15 at 14:40, <hjl.tools@gmail.com> wrote:
>> On Mon, May 11, 2015 at 4:29 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 11.05.15 at 13:04, <hjl.tools@gmail.com> wrote:
>>>> On Mon, May 11, 2015 at 12:28 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 09.05.15 at 15:52, <hjl.tools@gmail.com> wrote:
>>>>>> The operand size prefix (0x66) is ignored for 32-bit PC-relative call,
>>>>>> jmp and jcc in 64-bit mode.
>>>>>
>>>>> Now that's again a change that should have been discussed
>>>>> before committing: While it is correct for the Intel implementation
>>>>> of x86-64, it isn't for AMD's original (and I'd tend to say that the
>>>>> latter is more correct, since afaik there's nowhere that exceptions
>>>>> are being made for the meaning of the operand size prefix).
>>>>
>>>> It is also correct for AMD.
>>>
>>> Very certainly not, based on practical observation: This code
>>>
>>> x86_64:
>>> mov rcx, rsp
>>> .byte 0x66, 0xe8, 0x00, 0x00, 0x90, 0x90
>>> xchg rcx, rsp
>>> ret
>>>
>>> yields
>>>
>>> Reading symbols from /home/jbeulich/x86-64...done.
>>> (gdb) break x86_64
>>> Breakpoint 1 at 0x400520
>>> (gdb) r
>>> Starting program: /home/jbeulich/x86-64
>>> Failed to read a valid object file image from memory.
>>>
>>> Breakpoint 1, 0x0000000000400520 in x86_64 ()
>>> (gdb) x/5i $pc
>>> => 0x400520 <x86_64>: mov rcx,rsp
>>> 0x400523 <x86_64+3>: call 0x527
>>> 0x400527 <x86_64+7>: nop
>>> 0x400528 <x86_64+8>: nop
>>> 0x400529 <x86_64+9>: xchg rcx,rsp
>>> (gdb) c
>>> Continuing.
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x0000000000000527 in ?? ()
>>>
>>> with RSP also decremented by just 2.
>>>
>>
>> The displacement is still 32-bit, but stack adjustment is
>
> No, it's not - as you can see from the SEGV happening at address
> 00000527. If the displacement was 32 bits, then the faulting
> address would have been 90900527.
>
>> different. Displaying it as callq isn't accurate for AMD, and
>> callw is not correct for either AMD or Intel. I will leave it
>> as is since I don't think we should add another instruction
>> for this.
>>
So AMD and Intel are different. I think
data16 callq rel32
is better than
callw rel16
foobar
--
H.J.
* Re: [committed, PATCH] PR binutis/18386: callw with 0x66 prefix incorrectly disassembled in 64-bit mode
2015-05-11 13:33 ` H.J. Lu
@ 2015-05-12 9:42 ` Jan Beulich
0 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2015-05-12 9:42 UTC (permalink / raw)
To: H.J. Lu; +Cc: Binutils
>>> On 11.05.15 at 15:33, <hjl.tools@gmail.com> wrote:
> So AMD and Intel are different. I think
>
> data16 callq rel32
>
> is better than
>
> callw rel16
> foobar
I don't think so - neither is going to result in proper disassembly of
following instructions when looked at from the opposite corner. I.e.
disassembling as instruction with 2-byte displacement when it was
written with a 4-byte one will yield rubbish for the two extra
bytes, while disassembling as instruction with 4-byte displacement
when it was written with a 2-byte one will wrongly consume the
next instruction's first two bytes. Without the user telling you (via
command line option or the like; in live gdb sessions it may also be
possible to simply default to the CPU being run on) and without a
relocation to infer the displacement size from, you just can't get it
right (and what was there before your patch was as good or as
bad as what is there now).
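The desynchronisation described above can be illustrated with a toy decoder. This is a sketch under stated assumptions, not how i386-dis.c works: the enum and `count_nops` are hypothetical names, and the decoder understands only the 0x66 prefix, the e8 near call and the 0x90 nop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flavor selector; not an existing binutils type. */
enum x86_64_flavor { FLAVOR_AMD64, FLAVOR_INTEL64 };

/* Walk buf and return how many 0x90 bytes are decoded as separate nop
   instructions, or -1 on an unknown opcode or a truncated instruction. */
int count_nops(const uint8_t *buf, size_t len, enum x86_64_flavor flavor)
{
    assert(buf != NULL);
    size_t i = 0;
    int nops = 0;

    while (i < len) {
        int data16 = 0;
        if (buf[i] == 0x66) {           /* operand size prefix */
            data16 = 1;
            if (++i == len)
                return -1;
        }
        if (buf[i] == 0xe8) {           /* near call */
            /* 2-byte displacement only for 0x66 on the AMD flavor. */
            size_t disp = (data16 && flavor == FLAVOR_AMD64) ? 2 : 4;
            if (len - i - 1 < disp)
                return -1;
            i += 1 + disp;
        } else if (buf[i] == 0x90) {    /* nop */
            nops++;
            i++;
        } else {
            return -1;
        }
    }
    return nops;
}
```

For the byte sequence `66 e8 00 00 90 90` from earlier in the thread, the AMD-flavored walk sees a call followed by two nops, while the Intel-flavored walk swallows both nop bytes as part of the displacement. Without knowing which flavor the code was written for, neither result can be called the right one.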
But of course a pretty clear conclusion here is - unless people
intentionally write vendor specific code, use of overrides with
these instructions would perhaps best be considered invalidating
the instructions altogether (i.e. an even more reasonable default
in the absence of knowing any better might be to disassemble
them just like other undefined ones).
Jan