From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6050 invoked by alias); 14 Jan 2005 00:27:16 -0000 Mailing-List: contact binutils-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: binutils-owner@sources.redhat.com Received: (qmail 5963 invoked from network); 14 Jan 2005 00:27:01 -0000 Received: from unknown (HELO rwcrmhc12.comcast.net) (216.148.227.85) by sourceware.org with SMTP; 14 Jan 2005 00:27:01 -0000 Received: from lucon.org ([24.6.212.230]) by comcast.net (rwcrmhc12) with ESMTP id <200501140027000140016uq4e>; Fri, 14 Jan 2005 00:27:00 +0000 Received: by lucon.org (Postfix, from userid 1000) id 0599E640F4; Thu, 13 Jan 2005 16:26:59 -0800 (PST) Date: Fri, 14 Jan 2005 00:27:00 -0000 From: "H. J. Lu" To: "Allan B. Cruse" , binutils@sources.redhat.com Cc: gcc@gcc.gnu.org, GNU C Library Subject: Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report) Message-ID: <20050114002659.GA4491@lucon.org> References: <20050111210753.0C8CB219E0@nexus.cs.usfca.edu> <20050112191052.GA12463@lucon.org> <20050113034440.GG30985@bubble.modra.org> <20050113170849.GA30644@lucon.org> <20050114000528.GA3408@bubble.modra.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050114000528.GA3408@bubble.modra.org> User-Agent: Mutt/1.4.1i X-SW-Source: 2005-01/txt/msg00139.txt.bz2 On Fri, Jan 14, 2005 at 10:35:28AM +1030, Alan Modra wrote: > On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote: > > On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote: > > > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote: > > > > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax > > > [snip] > > > > > 8048081: 8b 04 63 mov (%ebx,2),%eax > > > > > > I don't agree that this is a problem. In fact, I think that this > > > disassembly is more accurate than "mov (%ebx),%eax". Note that gas > > > accepts "mov (%ebx,2),%eax" giving > > > Warning: scale factor of 2 without an index register > > > > But it generates "8b 03", not "8b 04 63". > > Sure. That's an optimization, just like mov %es,%ax is assembled > without the operand size prefix as if the programmer had written > mov %es,%eax. I'm quite happy with the assembler optimizing a little > where it can. :) If it is an optimization, there shouldn't be a warning. I think it may be useful to turn "leal 0xf(%eax,1), %eax" into "8d 44 20 0f" Gcc/ld use leal foo(%reg), %eax; call ___tls_get_addr; nop today for TLS optimization. With the change, we can use leal foo(%reg,1), %eax; call ___tls_get_addr; > > > > Yes, I agree that the effect of executing these byte sequences is the > > > same as "mov (%ebx),%eax", but that's beside the point. For example, > > > plenty of x86 instructions execute as a nop, but that doesn't mean they > > > should all be disassembled as "nop". The disassembler ought to reflect > > > the machine encoding as closely as possible, and in this case that means > > > printing the ignored scale factor. > > > > > > I think this change should be reverted. > > > > > IA-32 instruction reference manual says when INDEX == 0x4, scaled index > > is "[none]". Displaying "(%ebx,2)" is simply wrong here. > > The IA-32 instruction reference manual specifies both instruction > operation and instruction encoding. There isn't a one to one mapping > between encoding and operation on IA-32, sometimes multiple encodings > are available for a particular operation. > > And that's where I have a philosophical disagreement with Allan Cruse. > I believe the disassembler should reflect the encoding as much as > possible, while he seems to believe the disassembler should reflect Then it should display 8b 04 23 mov (%ebx,1),%eax not 8b 04 23 mov (%ebx),%eax I am enclosing a patch here. I didn't include testcase change. H.J. ---- gas/ 2005-01-13 H.J. Lu * config/tc-i386.c (SCALE1_WHEN_NO_INDEX): Removed. (_i386_insn): Add need_sib. (build_modrm_byte): Use SIB if need_sib is not 0. (i386_scale): Set i386_scale. Disallow 0 scale. opcodes/ 2005-01-13 H.J. Lu * 386-dis.c (OP_E): Undo the 2005-01-12 change. Display scale for SIB with INDEX == 4. --- binutils/gas/config/tc-i386.c.sib 2004-12-22 09:30:53.000000000 -0800 +++ binutils/gas/config/tc-i386.c 2005-01-13 15:55:46.911502210 -0800 @@ -43,14 +43,6 @@ #define INFER_ADDR_PREFIX 1 #endif -#ifndef SCALE1_WHEN_NO_INDEX -/* Specifying a scale factor besides 1 when there is no index is - futile. eg. `mov (%ebx,2),%al' does exactly the same as - `mov (%ebx),%al'. To slavishly follow what the programmer - specified, set SCALE1_WHEN_NO_INDEX to 0. */ -#define SCALE1_WHEN_NO_INDEX 1 -#endif - #ifndef DEFAULT_ARCH #define DEFAULT_ARCH "i386" #endif @@ -162,6 +154,9 @@ struct _i386_insn const reg_entry *index_reg; unsigned int log2_scale_factor; + /* NEED_SIB is used to indicate if the SIB byte is needed. */ + int need_sib; + /* SEG gives the seg_entries of this insn. They are zero unless explicit segment overrides are given. */ const seg_entry *seg[2]; @@ -3006,11 +3001,9 @@ build_modrm_byte () Any base register besides %esp will not use the extra modrm byte. */ i.sib.index = NO_INDEX_REGISTER; -#if !SCALE1_WHEN_NO_INDEX /* Another case where we force the second modrm byte. */ - if (i.log2_scale_factor) + if (i.need_sib) i.rm.regmem = ESCAPE_TO_TWO_BYTE_ADDRESSING; -#endif } else { @@ -3950,9 +3943,9 @@ i386_scale (scale) input_line_pointer = scale; val = get_absolute_expression (); + i.need_sib = 1; switch (val) { - case 0: case 1: i.log2_scale_factor = 0; break; @@ -3971,14 +3964,6 @@ i386_scale (scale) input_line_pointer = save; return NULL; } - if (i.log2_scale_factor != 0 && i.index_reg == 0) - { - as_warn (_("scale factor of %d without an index register"), - 1 << i.log2_scale_factor); -#if SCALE1_WHEN_NO_INDEX - i.log2_scale_factor = 0; -#endif - } scale = input_line_pointer; input_line_pointer = save; return scale; --- binutils/opcodes/i386-dis.c.sib 2005-01-13 09:41:31.000000000 -0800 +++ binutils/opcodes/i386-dis.c 2005-01-13 15:46:31.746631238 -0800 @@ -3191,10 +3191,8 @@ OP_E (int bytemode, int sizeflag) { havesib = 1; FETCH_DATA (the_info, codep + 1); + scale = (*codep >> 6) & 3; index = (*codep >> 3) & 7; - if (mode_64bit || index != 0x4) - /* When INDEX == 0x4 in 32 bit mode, SCALE is ignored. */ - scale = (*codep >> 6) & 3; base = *codep & 7; USED_REX (REX_EXTY); USED_REX (REX_EXTZ); @@ -3316,7 +3314,9 @@ OP_E (int bytemode, int sizeflag) oappend (mode_64bit && (sizeflag & AFLAG) ? names64[index] : names32[index]); } - if (scale != 0 || (!intel_syntax && index != 4)) + if (scale != 0 + || (!intel_syntax && index != 4) + || (index == 4 && base != 4 && base != 5)) { *obufp++ = scale_char; *obufp = '\0';