From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <binutils-return-37141-listarch-binutils=sources.redhat.com@sources.redhat.com>
Received: (qmail 10548 invoked by alias); 14 Jan 2005 00:06:49 -0000
Mailing-List: contact binutils-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:binutils-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/binutils/>
List-Post: <mailto:binutils@sources.redhat.com>
List-Help: <mailto:binutils-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: binutils-owner@sources.redhat.com
Received: (qmail 9660 invoked from network); 14 Jan 2005 00:05:30 -0000
Received: from unknown (HELO gizmo01ps.bigpond.com) (144.140.71.11)
  by sourceware.org with SMTP; 14 Jan 2005 00:05:30 -0000
Received: (qmail 25636 invoked from network); 14 Jan 2005 00:05:28 -0000
Received: from unknown (HELO psmam12.bigpond.com) (144.135.25.103)
  by gizmo01ps.bigpond.com with SMTP; 14 Jan 2005 00:05:28 -0000
Received: from cpe-144-136-221-26.sa.bigpond.net.au ([144.136.221.26]) by psmam12.bigpond.com(MAM REL_3_4_2a 234/36274695) with SMTP id 36274695; Fri, 14 Jan 2005 10:05:28 +1000
Received: by bubble.modra.org (Postfix, from userid 500)
	id 42758EA879; Fri, 14 Jan 2005 10:35:28 +1030
Date: Fri, 14 Jan 2005 00:06:00 -0000
From: Alan Modra <amodra@bigpond.net.au>
To: "H. J. Lu" <hjl@lucon.org>
Cc: "Allan B. Cruse" <cruse@cs.usfca.edu>,
	binutils@sources.redhat.com
Subject: Re: PATCH: Fix i386 disassembler with index == 0x4 in SIB (Re: objdump bug-report)
Message-ID: <20050114000528.GA3408@bubble.modra.org>
Mail-Followup-To: "H. J. Lu" <hjl@lucon.org>,
	"Allan B. Cruse" <cruse@cs.usfca.edu>, binutils@sources.redhat.com
References: <20050111210753.0C8CB219E0@nexus.cs.usfca.edu> <20050112191052.GA12463@lucon.org> <20050113034440.GG30985@bubble.modra.org> <20050113170849.GA30644@lucon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050113170849.GA30644@lucon.org>
User-Agent: Mutt/1.4i
X-SW-Source: 2005-01/txt/msg00136.txt.bz2

On Thu, Jan 13, 2005 at 09:08:49AM -0800, H. J. Lu wrote:
> On Thu, Jan 13, 2005 at 02:14:40PM +1030, Alan Modra wrote:
> > On Wed, Jan 12, 2005 at 11:10:52AM -0800, H. J. Lu wrote:
> > > > 	.byte	0x8B, 0x04, 0x63	# effect is: movl (%ebx), %eax	
> > [snip]
> > > >  8048081:	8b 04 63             	mov    (%ebx,2),%eax
> > 
> > I don't agree that this is a problem.  In fact, I think that this
> > disassembly is more accurate than "mov (%ebx),%eax".  Note that gas
> > accepts "mov (%ebx,2),%eax" giving
> > Warning: scale factor of 2 without an index register
> 
> But it generates "8b 03", not "8b 04 63".

Sure.  That's an optimization, just like mov %es,%ax is assembled
without the operand size prefix as if the programmer had written
mov %es,%eax.  I'm quite happy with the assembler optimizing a little
where it can.  :)

> > Yes, I agree that the effect of executing these byte sequences is the
> > same as "mov (%ebx),%eax", but that's beside the point.  For example,
> > plenty of x86 instructions execute as a nop, but that doesn't mean they
> > should all be disassembled as "nop".  The disassembler ought to reflect
> > the machine encoding as closely as possible, and in this case that means
> > printing the ignored scale factor.
> > 
> > I think this change should be reverted.

> 
> IA-32 instruction reference manual says when INDEX == 0x4, scaled index
> is "[none]". Displaying "(%ebx,2)" is simply wrong here.

The IA-32 instruction reference manual specifies both instruction
operation and instruction encoding.  There isn't a one-to-one mapping
between encoding and operation on IA-32; sometimes multiple encodings
are available for a particular operation.
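
To make the encoding/operation split concrete, here is a rough sketch
of how the SIB byte 0x63 from the bug report breaks down, and how the
two points of view would print it.  The function names and the output
format are my own illustration, not the code in the i386 disassembler,
and the base == 5 special case is deliberately left out.

```python
# Sketch only: names and output format are illustrative, not binutils code.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_sib(sib):
    """Split a SIB byte into (scale factor, index field, base field)."""
    scale = 1 << (sib >> 6)      # top two bits select 1, 2, 4 or 8
    index = (sib >> 3) & 7       # 4 means "no index register"
    base = sib & 7
    return scale, index, base

def render_sib(sib, reflect_encoding):
    """Print the memory operand, either encoding-first or operation-first."""
    scale, index, base = decode_sib(sib)
    if base == 5:
        # Meaning depends on the ModRM mod field; not handled in this sketch.
        raise ValueError("base == 5 is not handled here")
    if index != 4:
        return "(%%%s,%%%s,%d)" % (REG32[base], REG32[index], scale)
    if reflect_encoding:
        # No index register, but keep the scale bits visible.
        return "(%%%s,%d)" % (REG32[base], scale)
    # Operation view: the scale is ignored, so drop it.
    return "(%%%s)" % REG32[base]

print(decode_sib(0x63))          # scale 2, index 4 (none), base 3 (%ebx)
print(render_sib(0x63, True))    # encoding view
print(render_sib(0x63, False))   # operation view
```

Both renderings describe the same memory access; only the first lets
you tell 8b 04 63 apart from other bytes that load from (%ebx).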

And that's where I have a philosophical disagreement with Allan Cruse.
I believe the disassembler should reflect the encoding as much as
possible, while he seems to believe the disassembler should reflect
operation.  The trouble with that argument is that taken to its logical
conclusion we should disassemble
  0x89,0xf6 as "nop"
  0x8d,0x76,0x00 as "nop"
  0x8d,0x74,0x26,0x00 as "nop"
and so on for all of the zillion different "nop" encodings.  Indeed,
that might help some people.  We've had the occasional bug report that
gas wasn't aligning with nops!  But people use the disassembler for more
than just teaching, where instruction operation might be the primary
concern.  I'd guess that programmers casually debugging programs are
most interested in instruction operation too, but more advanced analysis
might focus on execution speed and instruction scheduling where
different encodings do sometimes behave differently.  There's also the
possibility of subtle cpu bugs that only show up in certain machine
encodings.
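
As a small illustration of why the encodings are worth telling apart,
the sketch below pulls the ModRM fields out of the three "nop"
sequences above and computes each instruction's length.  It assumes
32-bit addressing and a one-byte opcode with no immediate, which holds
for 0x89 and 0x8d; the helper names are hypothetical.

```python
# Sketch only: assumes 32-bit addressing, one-byte opcode, no immediate.
def modrm_fields(b):
    """Split a ModRM byte into (mod, reg, rm)."""
    return b >> 6, (b >> 3) & 7, b & 7

def insn_length(code):
    """Length in bytes of an opcode + ModRM instruction."""
    mod, _reg, rm = modrm_fields(code[1])
    length = 2                       # opcode + ModRM
    if mod == 3:
        return length                # register operand, nothing follows
    base = None
    if rm == 4:                      # SIB byte follows
        length += 1
        base = code[2] & 7
    if mod == 1:
        length += 1                  # disp8
    elif mod == 2:
        length += 4                  # disp32
    elif rm == 5 or base == 5:       # mod == 0 with no base register
        length += 4                  # disp32
    return length

for seq in ([0x89, 0xf6], [0x8d, 0x76, 0x00], [0x8d, 0x74, 0x26, 0x00]):
    print(" ".join("%02x" % b for b in seq),
          modrm_fields(seq[1]), insn_length(bytes(seq)))
```

The three sequences decode to different mod/rm combinations and to
lengths of 2, 3 and 4 bytes, even though none of them changes any
machine state.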

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre