From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 10548 invoked by alias); 14 Jan 2005 00:06:49 -0000
Mailing-List: contact binutils-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: binutils-owner@sources.redhat.com
Received: (qmail 9660 invoked from network); 14 Jan 2005 00:05:30 -0000
Received: from unknown (HELO gizmo01ps.bigpond.com) (144.140.71.11)
  by sourceware.org with SMTP; 14 Jan 2005 00:05:30 -0000
Received: (qmail 25636 invoked from network); 14 Jan 2005 00:05:28 -0000
Received: from unknown (HELO psmam12.bigpond.com) (144.135.25.103)
  by gizmo01ps.bigpond.com with SMTP; 14 Jan 2005 00:05:28 -0000
Received: from cpe-144-136-221-26.sa.bigpond.net.au ([144.136.221.26])
  by psmam12.bigpond.com(MAM REL_3_4_2a 234/36274695) with SMTP id 36274695;
  Fri, 14 Jan 2005 10:05:28 +1000
Received: by bubble.modra.org (Postfix, from userid 500)
  id 42758EA879; Fri, 14 Jan 2005 10:35:28 +1030
Date: Fri, 14 Jan 2005 00:06:00 -0000
From: Alan Modra
To: "H. J. Lu"
Cc: "Allan B. Cruse" , binutils@sources.redhat.com
Subject: Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
Message-ID: <20050114000528.GA3408@bubble.modra.org>
Mail-Followup-To: "H. J. Lu" , "Allan B. Cruse" , binutils@sources.redhat.com
References: <20050111210753.0C8CB219E0@nexus.cs.usfca.edu> <20050112191052.GA12463@lucon.org> <20050113034440.GG30985@bubble.modra.org> <20050113170849.GA30644@lucon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050113170849.GA30644@lucon.org>
User-Agent: Mutt/1.4i
X-SW-Source: 2005-01/txt/msg00136.txt.bz2

On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > > .byte 0x8B, 0x04, 0x63 # effect is: movl (%ebx), %eax
> > [snip]
> > > > 8048081: 8b 04 63 mov (%ebx,2),%eax
> >
> > I don't agree that this is a problem. In fact, I think that this
> > disassembly is more accurate than "mov (%ebx),%eax". Note that gas
> > accepts "mov (%ebx,2),%eax" giving
> > Warning: scale factor of 2 without an index register
>
> But it generates "8b 03", not "8b 04 63".

Sure. That's an optimization, just like
	mov %es,%ax
is assembled without the operand size prefix as if the programmer had
written
	mov %es,%eax
I'm quite happy with the assembler optimizing a little where it can. :)

> > Yes, I agree that the effect of executing these byte sequences is the
> > same as "mov (%ebx),%eax", but that's beside the point. For example,
> > plenty of x86 instructions execute as a nop, but that doesn't mean they
> > should all be disassembled as "nop". The disassembler ought to reflect
> > the machine encoding as closely as possible, and in this case that means
> > printing the ignored scale factor.
> >
> > I think this change should be reverted.
>
> IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> is "[none]". Displaying "(%ebx,2)" is simply wrong here.

The IA-32 instruction reference manual specifies both instruction
operation and instruction encoding. There isn't a one-to-one mapping
between encoding and operation on IA-32; sometimes multiple encodings
are available for a particular operation. And that's where I have a
philosophical disagreement with Allan Cruse. I believe the disassembler
should reflect the encoding as much as possible, while he seems to
believe the disassembler should reflect operation. The trouble with
that argument is that taken to its logical conclusion we should
disassemble
	0x89,0xf6		as "nop"
	0x8d,0x76,0x00		as "nop"
	0x8d,0x74,0x26,0x00	as "nop"
and so on for all of the zillion different "nop" encodings. Indeed,
that might help some people. We've had the occasional bug report that
gas wasn't aligning with nops!

But people use the disassembler for more than just teaching, where
instruction operation might be the primary concern. I'd guess that
programmers casually debugging programs are most interested in
instruction operation too, but more advanced analysis might focus on
execution speed and instruction scheduling where different encodings do
sometimes behave differently. There's also the possibility of subtle
cpu bugs that only show up in certain machine encodings.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre
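
As a rough illustration of the two views discussed in this message, the
Python sketch below splits a SIB byte into its scale, index and base
fields and prints the memory operand either by operation (dropping a
scale that has no index) or by encoding (keeping it). The function names
and the REG32 table are hypothetical and are not taken from the binutils
sources. The special case of base == 5 with mod == 00 (disp32, no base)
is deliberately ignored.

```python
# Hypothetical sketch, not binutils code: decode an IA-32 SIB byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_sib(sib):
    """Return (scale factor, index register or None, base register)."""
    scale = 1 << (sib >> 6)      # bits 7..6 select a factor of 1, 2, 4 or 8
    index = (sib >> 3) & 0x7     # bits 5..3; the value 4 means "no index"
    base = sib & 0x7             # bits 2..0 (base == 5 special case ignored)
    index_reg = None if index == 4 else REG32[index]
    return scale, index_reg, REG32[base]


def format_operation(sib):
    """Operand text reflecting what the CPU does: a scale without an index is dropped."""
    scale, index, base = decode_sib(sib)
    if index is None:
        return f"(%{base})"
    return f"(%{base},%{index},{scale})"


def format_encoding(sib):
    """Operand text reflecting the bytes: an ignored non-1 scale is still shown."""
    scale, index, base = decode_sib(sib)
    if index is None:
        return f"(%{base})" if scale == 1 else f"(%{base},{scale})"
    return f"(%{base},%{index},{scale})"


if __name__ == "__main__":
    sib = 0x63                   # third byte of "8b 04 63" from the report
    print(format_encoding(sib))  # encoding-oriented view
    print(format_operation(sib)) # operation-oriented view
```

For the SIB byte 0x63 from the report, the encoding-oriented form gives
(%ebx,2) and the operation-oriented form gives (%ebx), which is the
difference the thread is arguing about.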