* PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
[not found] <20050111210753.0C8CB219E0@nexus.cs.usfca.edu>
@ 2005-01-12 19:10 ` H. J. Lu
2005-01-13 3:44 ` Alan Modra
0 siblings, 1 reply; 13+ messages in thread
From: H. J. Lu @ 2005-01-12 19:10 UTC (permalink / raw)
To: Allan B. Cruse; +Cc: binutils
On Tue, Jan 11, 2005 at 01:07:53PM -0800, Allan B. Cruse wrote:
>
> //----------------------------------------------------------------
> // needsfix.s
> //
> // This program is erroneously disassembled by the 'objdump'
> // utility (version 2.15.92.0.2, dated 20040927) distributed
> // in Fedora Core 3 from Redhat. The mistake was discovered
> // by single-stepping through this code using an interactive
> // debugger and observing the effects on EAX register-value.
> // All four machine-instructions coded in hexadecimal behave
> // identically on my Pentium-III and Pentium-4 computers.
> //
> // (Assemblers would not ordinarily generate the three forms
> // of this instruction which get erroneously disassembled by
> // 'objdump', and Intel's documentation is very confusing as
> // to how the SIB-byte's 'shift' field affects these cases.)
> //
> // assemble using: $ as needsfix.s -o needsfix.o
> // then link with: $ ld needsfix.o -o needsfix
> //
> // programmer: ALLAN CRUSE
> // written on: 11 JAN 2005
> //----------------------------------------------------------------
>
> .data
> num1: .long 0x12345678
>
> .text
> _start: leal num1, %ebx
>
> # experiment #1 (correctly disassembled)
> xor %eax, %eax
> .byte 0x8B, 0x04, 0x23 # effect is: movl (%ebx), %eax
>
> # experiment #2 (incorrectly disassembled)
> xor %eax, %eax
> .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
>
> # experiment #3 (incorrectly disassembled)
> xor %eax, %eax
> .byte 0x8B, 0x04, 0xA3 # effect is: movl (%ebx), %eax
>
> # experiment #4 (incorrectly disassembled)
> xor %eax, %eax
> .byte 0x8B, 0x04, 0xE3 # effect is: movl (%ebx), %eax
>
> # exit()
> movl $1, %eax
> int $0x80
>
> .global _start
> .end
>
>
> #----- Output from: objdump -d needsfix
> #
> needsfix: file format elf32-i386
>
> Disassembly of section .text:
>
> 08048074 <_start>:
> 8048074: 8d 1d 98 90 04 08 lea 0x8049098,%ebx
> 804807a: 31 c0 xor %eax,%eax
> 804807c: 8b 04 23 mov (%ebx),%eax
> 804807f: 31 c0 xor %eax,%eax
> 8048081: 8b 04 63 mov (%ebx,2),%eax
> 8048084: 31 c0 xor %eax,%eax
> 8048086: 8b 04 a3 mov (%ebx,4),%eax
> 8048089: 31 c0 xor %eax,%eax
> 804808b: 8b 04 e3 mov (%ebx,8),%eax
> 804808e: b8 01 00 00 00 mov $0x1,%eax
> 8048093: cd 80 int $0x80
>
>
>
> #----- Output from: elfunasm needsfix
> #
> File 'fixsegs': executable file Intel-386
>
> Disassembly of section .text
>
> <_start>:
> CS:08048074 8D1D98900408 lea 0x08049098, %ebx
> CS:0804807A 31C0 xor %eax, %eax
> CS:0804807C 8B0423 mov (%ebx), %eax
> CS:0804807F 31C0 xor %eax, %eax
> CS:08048081 8B0463 mov (%ebx), %eax
> CS:08048084 31C0 xor %eax, %eax
> CS:08048086 8B04A3 mov (%ebx), %eax
> CS:08048089 31C0 xor %eax, %eax
> CS:0804808B 8B04E3 mov (%ebx), %eax
> CS:0804808E B801000000 mov $0x00000001, %eax
> CS:08048093 CD80 int $0x80
>
Thanks for your bug report. I will check in this patch to fix it.
H.J.
----
gas/testsuite/
2005-01-12 H.J. Lu <hongjiu.lu@intel.com>
* i386/i386.exp: Run "sib".
* gas/i386/sib.d: New file.
* gas/i386/sib.s: Likewise.
opcodes/
2005-01-12 H.J. Lu <hongjiu.lu@intel.com>
* i386-dis.c (OP_E): Ignore scale when index == 0x4 in SIB.
--- binutils/gas/testsuite/gas/i386/i386.exp.sib 2004-11-04 09:34:52.000000000 -0800
+++ binutils/gas/testsuite/gas/i386/i386.exp 2005-01-12 10:56:17.244574211 -0800
@@ -57,6 +57,7 @@ if [expr ([istarget "i*86-*-*"] || [ist
run_dump_test "sse2"
run_dump_test "sub"
run_dump_test "prescott"
+ run_dump_test "sib"
if {![istarget "*-*-aix*"]
&& (![is_elf_format] || [istarget "*-*-linux*"]
--- binutils/gas/testsuite/gas/i386/sib.d.sib 2005-01-12 10:54:16.743398658 -0800
+++ binutils/gas/testsuite/gas/i386/sib.d 2005-01-12 11:04:40.044545548 -0800
@@ -0,0 +1,15 @@
+#objdump: -dw
+#name: i386 SIB
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+000 <foo>:
+ 0: 8b 04 23 [ ]*mov [ ]*\(%ebx\),%eax
+ 3: 8b 04 63 [ ]*mov [ ]*\(%ebx\),%eax
+ 6: 8b 04 a3 [ ]*mov [ ]*\(%ebx\),%eax
+ 9: 8b 04 e3 [ ]*mov [ ]*\(%ebx\),%eax
+ c: 90 [ ]*nop [ ]*
+ d: 90 [ ]*nop [ ]*
+ ...
--- binutils/gas/testsuite/gas/i386/sib.s.sib 2005-01-12 10:54:14.263724294 -0800
+++ binutils/gas/testsuite/gas/i386/sib.s 2005-01-12 11:03:02.479357997 -0800
@@ -0,0 +1,11 @@
+#Test the special case of the index bits, 0x4, in SIB.
+
+ .text
+foo:
+ .byte 0x8B, 0x04, 0x23 # effect is: movl (%ebx), %eax
+ .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
+ .byte 0x8B, 0x04, 0xA3 # effect is: movl (%ebx), %eax
+ .byte 0x8B, 0x04, 0xE3 # effect is: movl (%ebx), %eax
+ nop
+ nop
+ .p2align 4,0
--- binutils/opcodes/i386-dis.c.sib 2004-11-04 09:35:19.000000000 -0800
+++ binutils/opcodes/i386-dis.c 2005-01-12 10:50:01.790879515 -0800
@@ -3191,8 +3191,10 @@ OP_E (int bytemode, int sizeflag)
{
havesib = 1;
FETCH_DATA (the_info, codep + 1);
- scale = (*codep >> 6) & 3;
index = (*codep >> 3) & 7;
+ if (index != 0x4)
+ /* When INDEX == 0x4, scale is ignored. */
+ scale = (*codep >> 6) & 3;
base = *codep & 7;
USED_REX (REX_EXTY);
USED_REX (REX_EXTZ);
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-12 19:10 ` PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report) H. J. Lu
@ 2005-01-13 3:44 ` Alan Modra
2005-01-13 17:09 ` H. J. Lu
0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2005-01-13 3:44 UTC (permalink / raw)
To: H. J. Lu; +Cc: Allan B. Cruse, binutils
On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
[snip]
> > 8048081: 8b 04 63 mov (%ebx,2),%eax
I don't agree that this is a problem. In fact, I think that this
disassembly is more accurate than "mov (%ebx),%eax". Note that gas
accepts "mov (%ebx,2),%eax" giving
Warning: scale factor of 2 without an index register
Yes, I agree that the effect of executing these byte sequences is the
same as "mov (%ebx),%eax", but that's beside the point. For example,
plenty of x86 instructions execute as a nop, but that doesn't mean they
should all be disassembled as "nop". The disassembler ought to reflect
the machine encoding as closely as possible, and in this case that means
printing the ignored scale factor.
I think this change should be reverted.
> --- binutils/opcodes/i386-dis.c.sib 2004-11-04 09:35:19.000000000 -0800
> +++ binutils/opcodes/i386-dis.c 2005-01-12 10:50:01.790879515 -0800
> @@ -3191,8 +3191,10 @@ OP_E (int bytemode, int sizeflag)
> {
> havesib = 1;
> FETCH_DATA (the_info, codep + 1);
> - scale = (*codep >> 6) & 3;
> index = (*codep >> 3) & 7;
> + if (index != 0x4)
> + /* When INDEX == 0x4, scale is ignored. */
> + scale = (*codep >> 6) & 3;
> base = *codep & 7;
> USED_REX (REX_EXTY);
> USED_REX (REX_EXTZ);
--
Alan Modra
IBM OzLabs - Linux Technology Centre
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-13 3:44 ` Alan Modra
@ 2005-01-13 17:09 ` H. J. Lu
2005-01-13 17:27 ` H. J. Lu
2005-01-14 0:06 ` Alan Modra
0 siblings, 2 replies; 13+ messages in thread
From: H. J. Lu @ 2005-01-13 17:09 UTC (permalink / raw)
To: Allan B. Cruse, binutils
On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
> [snip]
> > > 8048081: 8b 04 63 mov (%ebx,2),%eax
>
> I don't agree that this is a problem. In fact, I think that this
> disassembly is more accurate than "mov (%ebx),%eax". Note that gas
> accepts "mov (%ebx,2),%eax" giving
> Warning: scale factor of 2 without an index register
But it generates "8b 03", not "8b 04 63".
>
> Yes, I agree that the effect of executing these byte sequences is the
> same as "mov (%ebx),%eax", but that's beside the point. For example,
> plenty of x86 instructions execute as a nop, but that doesn't mean they
> should all be disassembled as "nop". The disassembler ought to reflect
> the machine encoding as closely as possible, and in this case that means
> printing the ignored scale factor.
>
> I think this change should be reverted.
>
IA-32 instruction reference manual says when INDEX == 0x4, scaled index
is "[none]". Displaying "(%ebx,2)" is simply wrong here.
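For readers following along, here is a minimal C sketch (not part of the
patch) of how a SIB byte splits into its three fields, mirroring the shifts
in OP_E. The names sib_fields, decode_sib, reg32_name and
sib_scale_is_ignored are made up for illustration and do not exist in
binutils; the sketch assumes 32-bit mode with no REX prefix.

```c
#include <assert.h>

/* Fields of an IA-32 SIB byte: scale in bits 7-6, index in bits 5-3,
   base in bits 2-0.  Hypothetical type, for illustration only.  */
struct sib_fields
{
  unsigned int scale;	/* log2 of the scale factor, 0..3 */
  unsigned int index;	/* index register number; 4 means "none" */
  unsigned int base;	/* base register number */
};

/* Split a SIB byte using the same shifts and masks as OP_E.  */
static struct sib_fields
decode_sib (unsigned char sib)
{
  struct sib_fields f;

  f.scale = (sib >> 6) & 3;
  f.index = (sib >> 3) & 7;
  f.base = sib & 7;
  return f;
}

/* AT&T name of a 32-bit general register, by encoding number.  */
static const char *
reg32_name (unsigned int regno)
{
  static const char *const names[8] = {
    "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"
  };

  assert (regno < 8);
  return names[regno];
}

/* Non-zero when there is no index register, so the scale bits cannot
   affect the effective address.  */
static int
sib_scale_is_ignored (unsigned char sib)
{
  return decode_sib (sib).index == 4;
}
```

For the four test bytes 0x23, 0x63, 0xA3 and 0xE3, decode_sib gives
index == 4 and base == 3 (%ebx) every time; only the scale field differs
(0, 1, 2 and 3).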
H.J.
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-13 17:09 ` H. J. Lu
@ 2005-01-13 17:27 ` H. J. Lu
2005-01-14 0:06 ` Alan Modra
1 sibling, 0 replies; 13+ messages in thread
From: H. J. Lu @ 2005-01-13 17:27 UTC (permalink / raw)
To: Allan B. Cruse, binutils
On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
> > [snip]
> > > > 8048081: 8b 04 63 mov (%ebx,2),%eax
> >
> > I don't agree that this is a problem. In fact, I think that this
> > disassembly is more accurate than "mov (%ebx),%eax". Note that gas
> > accepts "mov (%ebx,2),%eax" giving
> > Warning: scale factor of 2 without an index register
>
> But it generates "8b 03", not "8b 04 63".
>
> >
> > Yes, I agree that the effect of executing these byte sequences is the
> > same as "mov (%ebx),%eax", but that's beside the point. For example,
> > plenty of x86 instructions execute as a nop, but that doesn't mean they
> > should all be disassembled as "nop". The disassembler ought to reflect
> > the machine encoding as closely as possible, and in this case that means
> > printing the ignored scale factor.
> >
> > I think this change should be reverted.
> >
>
> IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> is "[none]". Displaying "(%ebx,2)" is simply wrong here.
>
I don't mind reverting my patch and making "mov (%ebx,2),%eax" a valid
instruction for "8b 04 63". I kind of like this approach.
H.J.
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-13 17:09 ` H. J. Lu
2005-01-13 17:27 ` H. J. Lu
@ 2005-01-14 0:06 ` Alan Modra
2005-01-14 0:27 ` H. J. Lu
2005-01-14 7:04 ` Bernd Jendrissek
1 sibling, 2 replies; 13+ messages in thread
From: Alan Modra @ 2005-01-14 0:06 UTC (permalink / raw)
To: H. J. Lu; +Cc: Allan B. Cruse, binutils
On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
> > [snip]
> > > > 8048081: 8b 04 63 mov (%ebx,2),%eax
> >
> > I don't agree that this is a problem. In fact, I think that this
> > disassembly is more accurate than "mov (%ebx),%eax". Note that gas
> > accepts "mov (%ebx,2),%eax" giving
> > Warning: scale factor of 2 without an index register
>
> But it generates "8b 03", not "8b 04 63".
Sure. That's an optimization, just like mov %es,%ax is assembled
without the operand size prefix as if the programmer had written
mov %es,%eax. I'm quite happy with the assembler optimizing a little
where it can. :)
> > Yes, I agree that the effect of executing these byte sequences is the
> > same as "mov (%ebx),%eax", but that's beside the point. For example,
> > plenty of x86 instructions execute as a nop, but that doesn't mean they
> > should all be disassembled as "nop". The disassembler ought to reflect
> > the machine encoding as closely as possible, and in this case that means
> > printing the ignored scale factor.
> >
> > I think this change should be reverted.
>
> IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> is "[none]". Displaying "(%ebx,2)" is simply wrong here.
The IA-32 instruction reference manual specifies both instruction
operation and instruction encoding. There isn't a one to one mapping
between encoding and operation on IA-32, sometimes multiple encodings
are available for a particular operation.
And that's where I have a philosophical disagreement with Allan Cruse.
I believe the disassembler should reflect the encoding as much as
possible, while he seems to believe the disassembler should reflect
operation. The trouble with that argument is that taken to its logical
conclusion we should disassemble
0x89,0xf6 as "nop"
0x8d,0x76,0x00 as "nop"
0x8d,0x74,0x26,0x00 as "nop"
and so on for all of the zillion different "nop" encodings. Indeed,
that might help some people. We've had the occasional bug report that
gas wasn't aligning with nops! But people use the disassembler for more
than just teaching, where instruction operation might be the primary
concern. I'd guess that programmers casually debugging programs are
most interested in instruction operation too, but more advanced analysis
might focus on execution speed and instruction scheduling where
different encodings do sometimes behave differently. There's also the
possibility of subtle cpu bugs that only show up in certain machine
encodings.
--
Alan Modra
IBM OzLabs - Linux Technology Centre
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 0:06 ` Alan Modra
@ 2005-01-14 0:27 ` H. J. Lu
2005-01-14 0:59 ` Alan Modra
2005-01-14 7:04 ` Bernd Jendrissek
1 sibling, 1 reply; 13+ messages in thread
From: H. J. Lu @ 2005-01-14 0:27 UTC (permalink / raw)
To: Allan B. Cruse, binutils; +Cc: gcc, GNU C Library
On Fri, Jan 14, 2005 at 10:35:28AM +1030, Alan Modra wrote:
> On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> > On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> > > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
> > > [snip]
> > > > > 8048081: 8b 04 63 mov (%ebx,2),%eax
> > >
> > > I don't agree that this is a problem. In fact, I think that this
> > > disassembly is more accurate than "mov (%ebx),%eax". Note that gas
> > > accepts "mov (%ebx,2),%eax" giving
> > > Warning: scale factor of 2 without an index register
> >
> > But it generates "8b 03", not "8b 04 63".
>
> Sure. That's an optimization, just like mov %es,%ax is assembled
> without the operand size prefix as if the programmer had written
> mov %es,%eax. I'm quite happy with the assembler optimizing a little
> where it can. :)
If it is an optimization, there shouldn't be a warning. I think it
may be useful to turn "leal 0xf(%eax,1), %eax" into "8d 44 20 0f".
GCC/ld use
leal foo(%reg), %eax; call ___tls_get_addr; nop
today for TLS optimization. With the change, we can use
leal foo(%reg,1), %eax; call ___tls_get_addr;
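To make the size difference concrete, here is a small C sketch
(illustrative only; encode_lea_disp8 is a made-up helper, not a gas
function) that emits "lea disp8(%base),%dst" with or without a redundant
SIB byte. It assumes 32-bit mode, an 8-bit displacement, and a base
register other than %esp, which always needs a SIB byte.

```c
#include <assert.h>
#include <stddef.h>

/* Encode "lea DISP(%BASE),%DST" with an 8-bit displacement into OUT and
   return the number of bytes written (3 or 4).  When FORCE_SIB is
   non-zero, use the longer ModRM + SIB form with no index register.
   OUT must have room for 4 bytes.  Hypothetical helper.  */
static size_t
encode_lea_disp8 (unsigned int dst, unsigned int base, signed char disp,
		  int force_sib, unsigned char *out)
{
  size_t n = 0;

  assert (dst < 8 && base < 8 && base != 4);
  out[n++] = 0x8d;			/* LEA r32, m */
  if (force_sib)
    {
      /* mod=01 (disp8), r/m=100: a SIB byte follows.  */
      out[n++] = (unsigned char) (0x40 | (dst << 3) | 4);
      /* scale=00, index=100 (none), base=BASE.  */
      out[n++] = (unsigned char) ((4 << 3) | base);
    }
  else
    /* mod=01 (disp8), r/m=BASE.  */
    out[n++] = (unsigned char) (0x40 | (dst << 3) | base);
  out[n++] = (unsigned char) disp;
  return n;
}
```

With dst = base = 0 (%eax) and disp = 0xf this yields "8d 40 0f" without
the SIB byte and "8d 44 20 0f" with it, so the longer form is one byte
bigger, which is the byte a trailing nop would otherwise have to fill.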
>
> > > Yes, I agree that the effect of executing these byte sequences is the
> > > same as "mov (%ebx),%eax", but that's beside the point. For example,
> > > plenty of x86 instructions execute as a nop, but that doesn't mean they
> > > should all be disassembled as "nop". The disassembler ought to reflect
> > > the machine encoding as closely as possible, and in this case that means
> > > printing the ignored scale factor.
> > >
> > > I think this change should be reverted.
>
> >
> > IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> > is "[none]". Displaying "(%ebx,2)" is simply wrong here.
>
> The IA-32 instruction reference manual specifies both instruction
> operation and instruction encoding. There isn't a one to one mapping
> between encoding and operation on IA-32, sometimes multiple encodings
> are available for a particular operation.
>
> And that's where I have a philosophical disagreement with Allan Cruse.
> I believe the disassembler should reflect the encoding as much as
> possible, while he seems to believe the disassembler should reflect
Then it should display
8b 04 23 mov (%ebx,1),%eax
not
8b 04 23 mov (%ebx),%eax
I am enclosing a patch here. I didn't include the testcase changes.
H.J.
----
gas/
2005-01-13 H.J. Lu <hongjiu.lu@intel.com>
* config/tc-i386.c (SCALE1_WHEN_NO_INDEX): Removed.
(_i386_insn): Add need_sib.
(build_modrm_byte): Use SIB if need_sib is not 0.
(i386_scale): Set need_sib. Disallow 0 scale.
opcodes/
2005-01-13 H.J. Lu <hongjiu.lu@intel.com>
* i386-dis.c (OP_E): Undo the 2005-01-12 change. Display scale
for SIB with INDEX == 4.
--- binutils/gas/config/tc-i386.c.sib 2004-12-22 09:30:53.000000000 -0800
+++ binutils/gas/config/tc-i386.c 2005-01-13 15:55:46.911502210 -0800
@@ -43,14 +43,6 @@
#define INFER_ADDR_PREFIX 1
#endif
-#ifndef SCALE1_WHEN_NO_INDEX
-/* Specifying a scale factor besides 1 when there is no index is
- futile. eg. `mov (%ebx,2),%al' does exactly the same as
- `mov (%ebx),%al'. To slavishly follow what the programmer
- specified, set SCALE1_WHEN_NO_INDEX to 0. */
-#define SCALE1_WHEN_NO_INDEX 1
-#endif
-
#ifndef DEFAULT_ARCH
#define DEFAULT_ARCH "i386"
#endif
@@ -162,6 +154,9 @@ struct _i386_insn
const reg_entry *index_reg;
unsigned int log2_scale_factor;
+ /* NEED_SIB is used to indicate if the SIB byte is needed. */
+ int need_sib;
+
/* SEG gives the seg_entries of this insn. They are zero unless
explicit segment overrides are given. */
const seg_entry *seg[2];
@@ -3006,11 +3001,9 @@ build_modrm_byte ()
Any base register besides %esp will not use the
extra modrm byte. */
i.sib.index = NO_INDEX_REGISTER;
-#if !SCALE1_WHEN_NO_INDEX
/* Another case where we force the second modrm byte. */
- if (i.log2_scale_factor)
+ if (i.need_sib)
i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
-#endif
}
else
{
@@ -3950,9 +3943,9 @@ i386_scale (scale)
input_line_pointer = scale;
val = get_absolute_expression ();
+ i.need_sib = 1;
switch (val)
{
- case 0:
case 1:
i.log2_scale_factor = 0;
break;
@@ -3971,14 +3964,6 @@ i386_scale (scale)
input_line_pointer = save;
return NULL;
}
- if (i.log2_scale_factor != 0 && i.index_reg == 0)
- {
- as_warn (_("scale factor of %d without an index register"),
- 1 << i.log2_scale_factor);
-#if SCALE1_WHEN_NO_INDEX
- i.log2_scale_factor = 0;
-#endif
- }
scale = input_line_pointer;
input_line_pointer = save;
return scale;
--- binutils/opcodes/i386-dis.c.sib 2005-01-13 09:41:31.000000000 -0800
+++ binutils/opcodes/i386-dis.c 2005-01-13 15:46:31.746631238 -0800
@@ -3191,10 +3191,8 @@ OP_E (int bytemode, int sizeflag)
{
havesib = 1;
FETCH_DATA (the_info, codep + 1);
+ scale = (*codep >> 6) & 3;
index = (*codep >> 3) & 7;
- if (mode_64bit || index != 0x4)
- /* When INDEX == 0x4 in 32 bit mode, SCALE is ignored. */
- scale = (*codep >> 6) & 3;
base = *codep & 7;
USED_REX (REX_EXTY);
USED_REX (REX_EXTZ);
@@ -3316,7 +3314,9 @@ OP_E (int bytemode, int sizeflag)
oappend (mode_64bit && (sizeflag & AFLAG)
? names64[index] : names32[index]);
}
- if (scale != 0 || (!intel_syntax && index != 4))
+ if (scale != 0
+ || (!intel_syntax && index != 4)
+ || (index == 4 && base != 4 && base != 5))
{
*obufp++ = scale_char;
*obufp = '\0';
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 0:27 ` H. J. Lu
@ 2005-01-14 0:59 ` Alan Modra
2005-01-14 21:49 ` H. J. Lu
0 siblings, 1 reply; 13+ messages in thread
From: Alan Modra @ 2005-01-14 0:59 UTC (permalink / raw)
To: H. J. Lu; +Cc: Allan B. Cruse, binutils, gcc, GNU C Library
On Thu, Jan 13, 2005 at 04:26:59PM -0800, H. J. Lu wrote:
> If it is an optimization, there shouldn't be a warning.
No, whether we warn or not is an entirely separate matter to whether we
optimize.
> I think it
> may be useful to turn "leal 0xf(%eax,1), %eax" into "8d 44 20 0f"
> Gcc/ld use
>
> leal foo(%reg), %eax; call ___tls_get_addr; nop
>
> today for TLS optimization. With the change, we can use
>
> leal foo(%reg,1), %eax; call ___tls_get_addr;
Hmm. So that you generate a larger instruction on purpose? Wanted for
the space needed with some of the tls transformations, I expect.
OK, that is a valid reason to support encoding of the instruction
that way. You still should warn for scale factors other than 1,
because it's easy to forget the comma in (,%reg,2) where you really
do want the register to be scaled.
> Then it should display
>
> 8b 04 23 mov (%ebx,1),%eax
Agreed.
--
Alan Modra
IBM OzLabs - Linux Technology Centre
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 0:06 ` Alan Modra
2005-01-14 0:27 ` H. J. Lu
@ 2005-01-14 7:04 ` Bernd Jendrissek
1 sibling, 0 replies; 13+ messages in thread
From: Bernd Jendrissek @ 2005-01-14 7:04 UTC (permalink / raw)
To: H. J. Lu, Allan B. Cruse, binutils
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Fri, Jan 14, 2005 at 10:35:28AM +1030, Alan Modra wrote:
> On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> > IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> > is "[none]". Displaying "(%ebx,2)" is simply wrong here.
>
> The IA-32 instruction reference manual specifies both instruction
> operation and instruction encoding. There isn't a one to one mapping
> between encoding and operation on IA-32, sometimes multiple encodings
> are available for a particular operation.
>
> And that's where I have a philosophical disagreement with Allan Cruse.
> I believe the disassembler should reflect the encoding as much as
> possible, while he seems to believe the disassembler should reflect
> operation. The trouble with that argument is that taken to its logical
> conclusion we should disassemble
> 0x89,0xf6 as "nop"
> 0x8d,0x76,0x00 as "nop"
> 0x8d,0x74,0x26,0x00 as "nop"
> and so on for all of the zillion different "nop" encodings.
Another nice-to-have is that the disassembled output can be re-assembled
to produce *exactly* the same output binary.
IOW if at all possible, I like to have *complete* control over the
encoding of the assembled instructions, without resorting to .byte et
al. Of course, this nice-to-have is already broken by addl %edx,%ebx:
is that 01 d3 or is it 03 da?
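To spell out why both byte pairs mean the same thing, here is a rough C
sketch (decode_add_rr and struct reg_add are made-up names, not binutils
code) that decodes a two-byte register-to-register 32-bit ADD. Opcode
0x01 is "ADD r/m32, r32" and opcode 0x03 is "ADD r32, r/m32", so the reg
and r/m fields of the ModRM byte swap roles between the two forms.

```c
#include <assert.h>
#include <stddef.h>

/* Source and destination register numbers of a decoded ADD.
   Hypothetical type, for illustration only.  */
struct reg_add
{
  unsigned int src;
  unsigned int dst;
};

/* Decode the two-byte instruction INSN.  Return 1 and fill *OUT if it
   is a register-to-register 32-bit ADD (opcode 0x01 or 0x03 with
   mod == 3); return 0 otherwise.  */
static int
decode_add_rr (const unsigned char *insn, struct reg_add *out)
{
  unsigned int mod, reg, rm;

  assert (insn != NULL && out != NULL);
  mod = (insn[1] >> 6) & 3;
  reg = (insn[1] >> 3) & 7;
  rm = insn[1] & 7;
  if (mod != 3)
    return 0;			/* Memory operand, not handled here.  */
  if (insn[0] == 0x01)		/* ADD r/m32, r32: reg is the source.  */
    {
      out->src = reg;
      out->dst = rm;
      return 1;
    }
  if (insn[0] == 0x03)		/* ADD r32, r/m32: reg is the destination.  */
    {
      out->src = rm;
      out->dst = reg;
      return 1;
    }
  return 0;
}
```

Both "01 d3" and "03 da" decode to src = 2 (%edx) and dst = 3 (%ebx), so
the mnemonic alone cannot tell the assembler which of the two to emit.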
As to *why* I would want such totalitarian control... well I'll just
deflect and say that anyone nitpicking over (%ebx,2) vs. (%ebx) is
already at the same level of moral turpitude as I. :-)
Or maybe introduce -Mpedagogical and -Mrealprogrammer?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQFB52zC/FmLrNfLpjMRArjeAKCk8vJJSqnBUMZmWSLjR51Av1ulKgCdF9k9
YDextHIRCcWVGPwVWIRAg88=
=Jrwd
-----END PGP SIGNATURE-----
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 0:59 ` Alan Modra
@ 2005-01-14 21:49 ` H. J. Lu
0 siblings, 0 replies; 13+ messages in thread
From: H. J. Lu @ 2005-01-14 21:49 UTC (permalink / raw)
To: Allan B. Cruse, binutils, gcc, GNU C Library
On Fri, Jan 14, 2005 at 11:29:18AM +1030, Alan Modra wrote:
> On Thu, Jan 13, 2005 at 04:26:59PM -0800, H. J. Lu wrote:
> > If it is an optimization, there shouldn't be a warning.
>
> No, whether we warn or not is an entirely separate matter to whether we
> optimize.
>
> > I think it
> > may be useful to turn "leal 0xf(%eax,1), %eax" into "8d 44 20 0f"
> > Gcc/ld use
> >
> > leal foo(%reg), %eax; call ___tls_get_addr; nop
> >
> > today for TLS optimization. With the change, we can use
> >
> > leal foo(%reg,1), %eax; call ___tls_get_addr;
>
> Hmm. So that you generate a larger instruction on purpose? Wanted for
> the space needed with some of the tls transformations, I expect.
>
> OK, that is a valid reason to support encoding of the instruction
> that way. You still should warn for scale factors other than 1,
> because it's easy to forget the comma in (,%reg,2) where you really
> do want the register to be scaled.
>
> > Then it should display
> >
> > 8b 04 23 mov (%ebx,1),%eax
>
I decided to use (%ebx,,1) for this and adjusted the assembler to accept
it. You will get warnings for (%ebx,1) as before. It will be easier
to check whether the assembler takes (%ebx,,1).
BTW, I didn't change the Intel syntax since I don't know enough
about it.
H.J.
-----
gas/
2005-01-14 H.J. Lu <hongjiu.lu@intel.com>
PR 658
* config/tc-i386.c (SCALE1_WHEN_NO_INDEX): Removed.
(_i386_insn): Add empty_index_reg.
(build_modrm_byte): Use SIB if empty_index_reg is not 0.
(i386_scale): Don't warn scale factor without index register if
empty_index_reg is not 0.
(i386_operand): Set empty_index_reg 1 if the index register is
"".
gas/testsuite/
2005-01-14 H.J. Lu <hongjiu.lu@intel.com>
PR 658
* gas/i386/sib.d: Updated.
* gas/i386/sib.s: Likewise.
* gas/i386/ssemmx2.d: Likewise.
ld/testsuite/
2005-01-14 H.J. Lu <hongjiu.lu@intel.com>
PR 658
* ld-i386/tlsbin.dd: Updated.
opcodes/
2005-01-14 H.J. Lu <hongjiu.lu@intel.com>
PR 658
* i386-dis.c (OP_E): Undo the 2005-01-12 change. Display scale
for SIB with INDEX == 4.
--- binutils/gas/config/tc-i386.c.sib 2005-01-14 11:27:06.000000000 -0800
+++ binutils/gas/config/tc-i386.c 2005-01-14 12:25:45.624584751 -0800
@@ -43,14 +43,6 @@
#define INFER_ADDR_PREFIX 1
#endif
-#ifndef SCALE1_WHEN_NO_INDEX
-/* Specifying a scale factor besides 1 when there is no index is
- futile. eg. `mov (%ebx,2),%al' does exactly the same as
- `mov (%ebx),%al'. To slavishly follow what the programmer
- specified, set SCALE1_WHEN_NO_INDEX to 0. */
-#define SCALE1_WHEN_NO_INDEX 1
-#endif
-
#ifndef DEFAULT_ARCH
#define DEFAULT_ARCH "i386"
#endif
@@ -162,6 +154,8 @@ struct _i386_insn
const reg_entry *index_reg;
unsigned int log2_scale_factor;
+ int empty_index_reg;
+
/* SEG gives the seg_entries of this insn. They are zero unless
explicit segment overrides are given. */
const seg_entry *seg[2];
@@ -3006,11 +3000,9 @@ build_modrm_byte ()
Any base register besides %esp will not use the
extra modrm byte. */
i.sib.index = NO_INDEX_REGISTER;
-#if !SCALE1_WHEN_NO_INDEX
/* Another case where we force the second modrm byte. */
- if (i.log2_scale_factor)
+ if (i.empty_index_reg)
i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING;
-#endif
}
else
{
@@ -3970,13 +3962,13 @@ i386_scale (scale)
input_line_pointer = save;
return NULL;
}
- if (i.log2_scale_factor != 0 && i.index_reg == 0)
+ if (i.log2_scale_factor != 0
+ && i.index_reg == 0
+ && i.empty_index_reg == 0)
{
as_warn (_("scale factor of %d without an index register"),
1 << i.log2_scale_factor);
-#if SCALE1_WHEN_NO_INDEX
i.log2_scale_factor = 0;
-#endif
}
scale = input_line_pointer;
input_line_pointer = save;
@@ -4430,6 +4422,12 @@ i386_operand (operand_string)
as_bad (_("bad register name `%s'"), base_string);
return 0;
}
+ else if (*base_string == ',' && i.base_reg)
+ {
+ /* Check for empty index reg. */
+ base_string++;
+ i.empty_index_reg = 1;
+ }
/* Check for scale factor. */
if (*base_string != ')')
--- binutils/gas/testsuite/gas/i386/sib.d.sib 2005-01-12 11:12:51.000000000 -0800
+++ binutils/gas/testsuite/gas/i386/sib.d 2005-01-14 12:43:29.131911163 -0800
@@ -6,10 +6,14 @@
Disassembly of section .text:
0+000 <foo>:
- 0: 8b 04 23 [ ]*mov [ ]*\(%ebx\),%eax
- 3: 8b 04 63 [ ]*mov [ ]*\(%ebx\),%eax
- 6: 8b 04 a3 [ ]*mov [ ]*\(%ebx\),%eax
- 9: 8b 04 e3 [ ]*mov [ ]*\(%ebx\),%eax
- c: 90 [ ]*nop [ ]*
- d: 90 [ ]*nop [ ]*
- ...
+ 0: 8b 03 [ ]*mov [ ]*\(%ebx\),%eax
+ 2: 8b 04 23 [ ]*mov [ ]*\(%ebx,,1\),%eax
+ 5: 8b 04 63 [ ]*mov [ ]*\(%ebx,,2\),%eax
+ 8: 8b 04 a3 [ ]*mov [ ]*\(%ebx,,4\),%eax
+ b: 8b 04 e3 [ ]*mov [ ]*\(%ebx,,8\),%eax
+ e: 8b 04 24 [ ]*mov [ ]*\(%esp\),%eax
+ 11: 8b 04 24 [ ]*mov [ ]*\(%esp\),%eax
+ 14: 8b 04 64 [ ]*mov [ ]*\(%esp,,2\),%eax
+ 17: 8b 04 a4 [ ]*mov [ ]*\(%esp,,4\),%eax
+ 1a: 8b 04 e4 [ ]*mov [ ]*\(%esp,,8\),%eax
+ 1d: 8d 76 00 [ ]*lea [ ]*0x0\(%esi\),%esi
--- binutils/gas/testsuite/gas/i386/sib.s.sib 2005-01-12 11:12:51.000000000 -0800
+++ binutils/gas/testsuite/gas/i386/sib.s 2005-01-14 13:33:54.696308591 -0800
@@ -2,10 +2,14 @@
.text
foo:
- .byte 0x8B, 0x04, 0x23 # effect is: movl (%ebx), %eax
- .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
- .byte 0x8B, 0x04, 0xA3 # effect is: movl (%ebx), %eax
- .byte 0x8B, 0x04, 0xE3 # effect is: movl (%ebx), %eax
- nop
- nop
- .p2align 4,0
+ mov (%ebx),%eax
+ mov (%ebx,,1),%eax
+ mov (%ebx,,2),%eax
+ mov (%ebx,,4),%eax
+ mov (%ebx,,8),%eax
+ mov (%esp),%eax
+ mov (%esp,,1),%eax
+ mov (%esp,,2),%eax
+ mov (%esp,,4),%eax
+ mov (%esp,,8),%eax
+ .p2align 4
--- binutils/gas/testsuite/gas/i386/ssemmx2.d.sib 2004-01-18 15:13:35.000000000 -0800
+++ binutils/gas/testsuite/gas/i386/ssemmx2.d 2005-01-14 11:11:57.000000000 -0800
@@ -85,4 +85,4 @@ Disassembly of section .text:
1f1: 66 0f fc 90 90 90 90 90 paddb[ ]+0x90909090\(%eax\),%xmm2
1f9: 66 0f fd 90 90 90 90 90 paddw[ ]+0x90909090\(%eax\),%xmm2
201: 66 0f fe 90 90 90 90 90 paddd[ ]+0x90909090\(%eax\),%xmm2
- 209: 8d b4 26 00 00 00 00 lea[ ]+0x0\(%esi\),%esi
+ 209: 8d b4 26 00 00 00 00 lea[ ]+0x0\(%esi,,1\),%esi
--- binutils/ld/testsuite/ld-i386/tlsbin.dd.sib 2004-05-11 10:08:36.000000000 -0700
+++ binutils/ld/testsuite/ld-i386/tlsbin.dd 2005-01-14 11:14:25.000000000 -0800
@@ -92,7 +92,7 @@ Disassembly of section .text:
# LD -> LE
8049085: 65 a1 00 00 00 00[ ]+mov %gs:0x0,%eax
804908b: 90[ ]+nop *
- 804908c: 8d 74 26 00[ ]+lea 0x0\(%esi\),%esi
+ 804908c: 8d 74 26 00[ ]+lea 0x0\(%esi,,1\),%esi
8049090: 90[ ]+nop *
8049091: 90[ ]+nop *
8049092: 8d 90 20 f0 ff ff[ ]+lea 0xfffff020\(%eax\),%edx
@@ -108,7 +108,7 @@ Disassembly of section .text:
# LD -> LE against hidden variables
80490a4: 65 a1 00 00 00 00[ ]+mov %gs:0x0,%eax
80490aa: 90[ ]+nop *
- 80490ab: 8d 74 26 00[ ]+lea 0x0\(%esi\),%esi
+ 80490ab: 8d 74 26 00[ ]+lea 0x0\(%esi,,1\),%esi
80490af: 90[ ]+nop *
80490b0: 90[ ]+nop *
80490b1: 8d 90 40 f0 ff ff[ ]+lea 0xfffff040\(%eax\),%edx
--- binutils/opcodes/i386-dis.c.sib 2005-01-13 09:41:31.000000000 -0800
+++ binutils/opcodes/i386-dis.c 2005-01-14 13:33:42.625890827 -0800
@@ -3191,10 +3191,8 @@ OP_E (int bytemode, int sizeflag)
{
havesib = 1;
FETCH_DATA (the_info, codep + 1);
+ scale = (*codep >> 6) & 3;
index = (*codep >> 3) & 7;
- if (mode_64bit || index != 0x4)
- /* When INDEX == 0x4 in 32 bit mode, SCALE is ignored. */
- scale = (*codep >> 6) & 3;
base = *codep & 7;
USED_REX (REX_EXTY);
USED_REX (REX_EXTZ);
@@ -3316,7 +3314,20 @@ OP_E (int bytemode, int sizeflag)
oappend (mode_64bit && (sizeflag & AFLAG)
? names64[index] : names32[index]);
}
- if (scale != 0 || (!intel_syntax && index != 4))
+ else if (!intel_syntax
+ && havebase
+ && (scale != 0
+ || ((base & 7) != 4
+ && (base & 7) != 5)))
+ {
+ *obufp++ = separator_char;
+ *obufp = '\0';
+ }
+ if (scale != 0
+ || (!intel_syntax && index != 4)
+ || (index == 4
+ && (base & 7) != 4
+ && (base & 7) != 5))
{
*obufp++ = scale_char;
*obufp = '\0';
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 7:32 ` Bernd Jendrissek
@ 2005-01-14 17:19 ` E. Weddington
0 siblings, 0 replies; 13+ messages in thread
From: E. Weddington @ 2005-01-14 17:19 UTC (permalink / raw)
To: Bernd Jendrissek; +Cc: Allan B. Cruse, binutils, hjl, gcc, libc-alpha
OT:
Bernd Jendrissek wrote:
>
>
>>and to decrypt secret messages someone might have hidden inside a
>>code-stream.
>>
>>
>
>So who's writing up a patch to fingerprint binaries with "GAS and GNU
>rulez!"? :-)
>
>
>
See this interesting project:
<http://www.crazyboy.com/hydan/>
Eric
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
2005-01-14 6:11 Allan B. Cruse
@ 2005-01-14 7:32 ` Bernd Jendrissek
2005-01-14 17:19 ` E. Weddington
0 siblings, 1 reply; 13+ messages in thread
From: Bernd Jendrissek @ 2005-01-14 7:32 UTC (permalink / raw)
To: Allan B. Cruse; +Cc: binutils, hjl, gcc, libc-alpha
On Thu, Jan 13, 2005 at 10:10:22PM -0800, Allan B. Cruse wrote:
> whereas only those having an intimate acquaintance with Intel's
> documentation would be able to quickly know that " movl (%esi,2),%eax
> " does NOT scale the index-register, contrary to what the syntax
> indicates.
For compulsive bit-fiddlers, it might make sense to allow (%esi,2,) just
like gas already allows (?) (,1,%ebx).
> and to decrypt secret messages someone might have hidden inside a
> code-stream.
So who's writing up a patch to fingerprint binaries with "GAS and GNU
rulez!"? :-)
-- 
Seen in comp.lang.c:
> cody wrote:
>> The problem is that i believe that my assertions are correct.
> Yes, that is a problem.
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
@ 2005-01-14 6:11 Allan B. Cruse
2005-01-14 7:32 ` Bernd Jendrissek
0 siblings, 1 reply; 13+ messages in thread
From: Allan B. Cruse @ 2005-01-14 6:11 UTC (permalink / raw)
To: binutils, cruse, hjl; +Cc: gcc, libc-alpha
On Fri, 14 Jan 2005, Alan Modra <amodra@bigpond.net.au> wrote:
>
> Subject: Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump
>
>
> ...And that's where I have a philosophical disagreement with Allan Cruse.
> I believe the disassembler should reflect the encoding as much as
> possible, while he seems to believe the disassembler should reflect
> operation. The trouble with that argument is that taken to its logical
> conclusion we should disassemble
> 0x89,0xf6 as "nop"
> 0x8d,0x76,0x00 as "nop"
> 0x8d,0x74,0x26,0x00 as "nop"
> and so on for all of the zillion different "nop" encodings. Indeed,
> that might help some people. We've had the occasional bug report that
> gas wasn't aligning with nops! But people use the disassembler for more
> than just teaching, where instruction operation might be the primary
> concern. I'd guess that programmers casually debugging programs are
> most interested in instruction operation too, but more advanced analysis
> might focus on execution speed and instruction scheduling where
> different encodings do sometimes behave differently. There's also the
> possibility of subtle cpu bugs that only show up in certain machine
> encodings.
>
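
To make the three quoted encodings concrete: 0x89,0xf6 decodes as mov %esi,%esi; 0x8d,0x76,0x00 as lea 0x0(%esi),%esi; and 0x8d,0x74,0x26,0x00 as a lea whose SIB byte 0x26 has index 4 (no index register) and base %esi. The sketch below recognizes only these three shapes, assuming 32-bit mode and no prefixes. `is_effective_nop` is a hypothetical helper, not a binutils function, and it covers just a small part of the nop encodings mentioned.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the bytes form one of the register-preserving encodings
   quoted above, 0 otherwise. Assumes 32-bit mode and no prefixes. */
int is_effective_nop(const uint8_t *code, size_t len)
{
    if (len < 2)
        return 0;

    unsigned mod = (code[1] >> 6) & 3;
    unsigned reg = (code[1] >> 3) & 7;
    unsigned rm  = code[1] & 7;

    /* 0x89 /r, mod == 3: mov %reg,%rm; no effect when both are one register. */
    if (code[0] == 0x89 && mod == 3)
        return len == 2 && reg == rm;

    /* 0x8d /r, mod == 1: lea disp8(...),%reg. */
    if (code[0] == 0x8d && mod == 1) {
        if (rm != 4)  /* no SIB byte: lea disp8(%rm),%reg */
            return len == 3 && reg == rm && code[2] == 0;
        if (len != 4)
            return 0;
        /* SIB byte present: with index == 4 the scale bits have no effect. */
        unsigned index = (code[2] >> 3) & 7;
        unsigned base  = code[2] & 7;
        return index == 4 && base == reg && code[3] == 0;
    }
    return 0;
}
```

The check deliberately ignores the scale bits in the SIB case, so the same lea with a nonzero scale field would also be accepted, which is exactly the encoding-versus-operation distinction under discussion.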
I think one difference between disassemblies of those 'nop' instructions
and the disassembly of " movl (%esi,2),%eax " is that programmers who
possess a general understanding of the assembly language syntax would be
quickly able to figure out that instructions like " xchg %ax,%ax " are
no-ops, whereas only those having an intimate acquaintance with Intel's
documentation would be able to quickly know that " movl (%esi,2),%eax "
does NOT scale the index-register, contrary to what the syntax indicates.
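
The point about (%esi,2) can be checked with a small address calculation. The sketch below assumes 32-bit addressing and a ModRM mod field other than 0 whenever the base field is 5, so that the base register is always used. The register-file array `regs`, indexed by hardware register number, and the name `sib_effective_address` are both hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Effective address selected by a SIB byte plus displacement.
   regs[] holds the eight 32-bit general registers in hardware order
   (eax, ecx, edx, ebx, esp, ebp, esi, edi). */
uint32_t sib_effective_address(uint8_t sib, const uint32_t regs[8],
                               uint32_t disp)
{
    unsigned scale = (sib >> 6) & 3;
    unsigned index = (sib >> 3) & 7;
    unsigned base  = sib & 7;

    uint32_t ea = regs[base] + disp;
    if (index != 4)               /* index 4 means no index register, */
        ea += regs[index] << scale; /* so the scale bits are never applied */
    return ea;
}
```

With %esi = 0x2000, the SIB byte 0x66 (scale field 1, index 4, base %esi) yields 0x2000 rather than 0x4000, even though it would be printed as (%esi,2).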
I don't dispute the valid points that Alan Modra raises, nor the purity of
his appealing philosophical vision which says that a disassembler ought to
reveal subtle distinctions between alternative machine-language encodings.
But, if one invokes the philosophical principle of "the greatest good for
the greatest number," then I would guess that there are more individuals
who are relying on 'objdump' for help with program-debugging, and with
clarifying processor-operations, than there are people who use 'objdump'
for doing esoteric code-optimizations -- and to decrypt secret messages
someone might have hidden inside a code-stream.
Helping more people be more productive with their computers, rather than
waste time figuring out misleading syntax, isn't a bad goal -- is it?
--Allan
* Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
@ 2005-01-13 17:42 Allan B. Cruse
0 siblings, 0 replies; 13+ messages in thread
From: Allan B. Cruse @ 2005-01-13 17:42 UTC (permalink / raw)
To: binutils, cruse, hjl
On Thu, 13 Jan 2005, "H. J. Lu" <hjl@lucon.org> wrote:
>
> Subject: Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump
> bug-report)
>
> ... IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> is "[none]". Displaying "(%ebx,2)" is simply wrong here.
>
> H.J.
>
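
One way to picture the rendering described in the quote is a formatter that omits the index and scale whenever the index field is 4. This is only a sketch of that operation-oriented reading, not the logic of the patch. It assumes 32-bit AT&T syntax, no displacement, and a mod field other than 0 when the base field is 5. `format_sib_operand` and `reg32_names` are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* 32-bit register names in hardware encoding order. */
static const char *const reg32_names[8] = {
    "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"
};

/* Write the AT&T-style memory operand for a SIB byte into out.
   Returns the snprintf result (length of the formatted operand). */
int format_sib_operand(uint8_t sib, char *out, size_t outlen)
{
    unsigned scale = (sib >> 6) & 3;
    unsigned index = (sib >> 3) & 7;
    unsigned base  = sib & 7;

    if (index == 4)  /* scaled index is "[none]": print the base only */
        return snprintf(out, outlen, "(%s)", reg32_names[base]);

    return snprintf(out, outlen, "(%s,%s,%u)",
                    reg32_names[base], reg32_names[index], 1u << scale);
}
```

Under these assumptions the SIB byte 0x63 is rendered as (%ebx), while 0x73, which does name an index register, is rendered as (%ebx,%esi,2).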
Thanks, H.J.
I use 'objdump' to help in teaching x86 architecture and assembly
language to college students. In the absence of a suitable AT&T-
syntax textbook that's comprehensive, the accuracy of objdump's
output, as a guide to the CPU's behavior, is an invaluable resource.
Your efforts are much appreciated!
Allan
end of thread, other threads:[~2005-01-14 21:49 UTC | newest]
Thread overview: 13+ messages
[not found] <20050111210753.0C8CB219E0@nexus.cs.usfca.edu>
2005-01-12 19:10 ` PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report) H. J. Lu
2005-01-13 3:44 ` Alan Modra
2005-01-13 17:09 ` H. J. Lu
2005-01-13 17:27 ` H. J. Lu
2005-01-14 0:06 ` Alan Modra
2005-01-14 0:27 ` H. J. Lu
2005-01-14 0:59 ` Alan Modra
2005-01-14 21:49 ` H. J. Lu
2005-01-14 7:04 ` Bernd Jendrissek
2005-01-13 17:42 Allan B. Cruse
2005-01-14 6:11 Allan B. Cruse
2005-01-14 7:32 ` Bernd Jendrissek
2005-01-14 17:19 ` E. Weddington