RE: CGEN_DIS_HASH: how to get endianness and/or instruction size?

public inbox for cgen@sourceware.org

* RE: CGEN_DIS_HASH: how to get endianness and/or instruction size?
@ 2007-01-26 11:17 Joern Rennecke
  0 siblings, 0 replies; 7+ messages in thread
From: Joern Rennecke @ 2007-01-26 11:17 UTC (permalink / raw)
  To: Frank Ch. Eigler, Joern Rennecke; +Cc: cgen

From: Frank Ch. Eigler 

> Heck, just use those five bits.

If I only knew where they are. 
Unless otherwise expressly stated, this message does not create or vary any contractual relationship between you and ARC International. The contents of this e-mail may be confidential and if you have received it in error, please delete it from your system, destroy any hard copies and telephone the above number. Incoming emails to ARC may be subject to monitoring other than by the addressee. EL  

* Re: CGEN_DIS_HASH: how to get endianness and/or instruction size?
  2007-01-26  1:11 Joern Rennecke
@ 2007-01-26  1:31 ` Frank Ch. Eigler
  0 siblings, 0 replies; 7+ messages in thread
From: Frank Ch. Eigler @ 2007-01-26  1:31 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: cgen

Hi -

joern wrote:

> [...]
> The top 5 bits of the first 16 bit word are what is known as major
> opcode field.  [...]
> I reckon that, given a suitably endian-corrected input,
> I could make an adequate hash by using a switch based on
> the 5 bit major opcode to decide which bits from the first 16
> bit word to use.

Heck, just use those five bits.

> I had already written the hash function, only to discover that the
> endian check depended on an argument that was not passed to my
> macro/function.

You should use the *value* rather than *buf* input.

> It's easy to change this code into incorrect and/or useless (return 0)
> code that will compile [...]

It is neither incorrect nor useless.  It just means that the hash
table will be degenerate, and a linear search will be required for
each disassembled instruction.  It won't hurt anyone and will only
slightly worsen global warming.
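
A minimal sketch of that point, not taken from cgen itself: a chained
instruction table looked up through a pluggable hash function.  All names
here (sketch_insn, sketch_lookup, and so on) are hypothetical and only
illustrate that a constant hash of 0 still finds the right entry, at the
cost of walking one long chain.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKETS 32   /* hypothetical table size: one bucket per 5-bit value */

/* Hypothetical table entry; an insn matches when (word & mask) == match.  */
struct sketch_insn
{
  uint16_t mask, match;
  const char *name;
  struct sketch_insn *next;   /* chain within one bucket */
};

typedef unsigned (*sketch_hash_fn) (uint16_t);

/* Hash on the top 5 bits of the first 16-bit word.  */
static unsigned sketch_hash_major (uint16_t w) { return (w >> 11) & 0x1f; }

/* Degenerate hash: everything lands in bucket 0.  */
static unsigned sketch_hash_zero (uint16_t w) { (void) w; return 0; }

/* Prepend INSN to the bucket chosen by HASH.  Assumes the mask covers
   every bit that HASH looks at.  */
static void
sketch_insert (struct sketch_insn **table, struct sketch_insn *insn,
               sketch_hash_fn hash)
{
  unsigned h = hash (insn->match);
  insn->next = table[h];
  table[h] = insn;
}

/* Return the entry matching WORD, or NULL.  *PROBES counts how many
   entries were examined, i.e. the length of the linear search.  */
static const struct sketch_insn *
sketch_lookup (struct sketch_insn *const *table, uint16_t word,
               sketch_hash_fn hash, unsigned *probes)
{
  const struct sketch_insn *p;

  *probes = 0;
  for (p = table[hash (word)]; p != NULL; p = p->next)
    {
      ++*probes;
      if ((word & p->mask) == p->match)
        return p;
    }
  return NULL;
}
```

With sketch_hash_zero every lookup walks a single chain holding all
entries; with sketch_hash_major the chains stay short.  The result of the
lookup is the same either way.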

> Unless otherwise expressly stated, this message does not create or vary
> any contractual relationship between you and ARC International.  [...]

That's a relief!

- FChE

* RE: CGEN_DIS_HASH: how to get endianness and/or instruction size?
@ 2007-01-26  1:11 Joern Rennecke
  2007-01-26  1:31 ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: Joern Rennecke @ 2007-01-26  1:11 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: cgen

From: Frank Ch. Eigler  

>Hi -

Hi.

>Where are the bits that allow the insn to be decoded as a branch? 

There are lots of different branch patterns; for this particular one,
we have to check that the top 5 bits and the lowest bit of the first
16-bit word are zero.
(The actual data size in memory is likely different, but from a tools
 standpoint it is easiest to think of the instructions as a succession
 of 16 bit words.  The bytes within the 16 bit words are ordered according
 to target endianness.
 OTOH the manual uses insn-lsb0 #t, and describes 32 bit opcodes as 32 bit words.
 According to a comment in sh.cpu, cgen does not support insn-lsb0 #t with
 variable instruction length, so I have to translate all the bit numbers...)

>How does the hardware know whether it's a 16- or 32-bit insn?

The top 5 bits of the first 16 bit word are what is known as major
opcode field.  The top three bits of these determine if the opcode
is 16 bit, 32 bit, or if the encoding is reserved.
(However, a 16 or 32 bit opcode does not necessarily equate to a 16 or
 32 bit instruction, respectively.  When the number 62 is encoded in a six bit
 register operand field, that means that a 32 bit immediate value is
 used by this instruction, and that that immediate value is used by all
 operands which have 62 encoded in their 6 bit register field.)
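
A rough sketch of the field tests described above, assuming the first
16-bit word is already available as a host integer.  The function names
are hypothetical.  The thread does not say which values of the top three
bits mean 16 bit, 32 bit or reserved, nor where the 6-bit register fields
sit, so the sketch only extracts the bits and leaves those mappings out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Top 5 bits of the first 16-bit word: the major opcode field.  */
static unsigned
sketch_major_opcode (uint16_t word)
{
  return (word >> 11) & 0x1f;
}

/* Top 3 bits of the major opcode, which are also the top 3 bits of the
   word.  These select 16 bit / 32 bit / reserved; the mapping itself is
   not given in this thread, so it is not encoded here.  */
static unsigned
sketch_size_bits (uint16_t word)
{
  return (word >> 13) & 0x7;
}

/* The particular branch pattern mentioned above: the top 5 bits and the
   lowest bit of the first 16-bit word are all zero.  */
static bool
sketch_is_that_branch (uint16_t word)
{
  return (word & 0xf801) == 0;
}

/* A 6-bit register field holding 62 means the instruction carries a
   32-bit immediate.  The caller passes an already extracted field,
   because the field position is an unknown here.  */
static bool
sketch_uses_long_immediate (unsigned reg_field)
{
  return (reg_field & 0x3f) == 62;
}
```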

> Those are the
>kind of bits that are normally mingled into the hash. 

I reckon that, given a suitably endian-corrected input,
I could make an adequate hash by using a switch based on
the 5 bit major opcode to decide which bits from the first 16
bit word to use.
Most frequent instructions should get distinct hash values that way.
The two main remaining weak points are 32 bit opcode shifts, which will
hash all into the same bucket as atomic exchange, trap and sleep
(the bits to distinguish these are actually in the 2nd 16 bit word),
and 16 bit subroutine returns, which hash with some less frequent patterns
into a bucket of 11.
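
For illustration only, the shape of such a hash might look like the
sketch below.  It assumes the first 16-bit word is passed in host byte
order.  The name sketch_dis_hash, the bucket count, and above all the
extra bits picked in each case are made up for this example; they are not
the real ARCompact choices, which would have to come from the
instruction set manual.

```c
#include <stdint.h>

#define SKETCH_HASH_SIZE 256   /* hypothetical bucket count */

/* Hash the first 16-bit word of an instruction: always use the 5-bit
   major opcode, and let a switch on it pick a few more bits.  */
static unsigned
sketch_dis_hash (uint16_t word)
{
  unsigned major = (word >> 11) & 0x1f;
  unsigned extra;

  switch (major)
    {
    case 0x00:
      /* The lowest bit separates the branch pattern mentioned in this
         message from its neighbours.  */
      extra = word & 0x1;
      break;
    case 0x04:
      /* Placeholder choice of three bits, purely illustrative.  */
      extra = (word >> 8) & 0x7;
      break;
    default:
      /* Fall back to the major opcode alone.  */
      extra = 0;
      break;
    }

  /* extra is at most 7, so the result always fits in 8 bits; the
     modulo only guards against a smaller SKETCH_HASH_SIZE.  */
  return ((extra << 5) | major) % SKETCH_HASH_SIZE;
}
```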

> Maybe your
>base_insn designation is too small.

Having read the description of base_insn in cgen/doc/rtl.texi, I don't see
how anything but 16 bit can be correct for ARCompact.

>Indeed, but if you can accept a lesser standard of proof, you could
>leave this part of the port till the end.

Yes.  To test anything, I have to make it build first ;-)
I had already written the hash function, only to discover that the
endian check depended on an argument that was not passed to my
macro/function.

It's easy to change this code into incorrect and/or useless (return 0) code that
will compile, but there is still a lot of other stuff to fix before I have something
that I can test.
I was hoping to get there without leaving FIXMEs left and right... alas, if the infrastructure
allows no sane way to compute a hash on bi-endian processors with variable instruction size,
it seems I have to punt on this issue for now. 

* Re: CGEN_DIS_HASH: how to get endianness and/or instruction size?
  2007-01-25 20:30   ` Joern Rennecke
@ 2007-01-25 22:15     ` Frank Ch. Eigler
  0 siblings, 0 replies; 7+ messages in thread
From: Frank Ch. Eigler @ 2007-01-25 22:15 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: cgen

Hi -

On Thu, Jan 25, 2007 at 08:30:11PM +0000, Joern Rennecke wrote:
> I've seen that, but it assumes that if the top 16 bits are zero,
> the instruction can be hashed as a 16 bit instruction.  That is not
> the case for ARCompact. [...]

Where are the bits that allow the insn to be decoded as a branch?  How
does the hardware know whether it's a 16- or 32-bit insn?  Those are the
kind of bits that are normally mingled into the hash.  Maybe your
base_insn designation is too small.

> > It is important to realize though that this disassembler hashing
> > widget is strictly an optimization.  [...]
> 
> I can only verify this positively when I've completely finished the
> port so that other people can use it...

Indeed, but if you can accept a lesser standard of proof, you could
leave this part of the port till the end.


- FChE

* Re: CGEN_DIS_HASH: how to get endianness and/or instruction size?
  2007-01-25 15:00 ` Frank Ch. Eigler
@ 2007-01-25 20:30   ` Joern Rennecke
  2007-01-25 22:15     ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: Joern Rennecke @ 2007-01-25 20:30 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: cgen

On Thu, Jan 25, 2007 at 09:59:50AM -0500, Frank Ch. Eigler wrote:
> Use the "value" parameter (a host-endian copy of the "base insn")
> rather than the "buffer" parameter.
> 
> > If I want to use the passed instruction value, I need to know what
> > size it is.  [...]
> 
> See m32r_cgen_dis_hash: a hand-written baby insn classifier routine.

I've seen that, but it assumes that if the top 16 bits are zero,
the instruction can be hashed as a 16 bit instruction.  That is not
the case for ARCompact.
Branch Conditionally is a 32 bit instruction that can have all the top 16
bits zeroed, and its bottom 16 bits are all operand.

> It is important to realize though that this disassembler hashing
> widget is strictly an optimization.  You can try hard-coding the hash
> value to 0 like some other cgen platforms, and see if the performance
> is bearable.

I can only verify this positively when I've completely finished the
port so that other people can use it...

* Re: CGEN_DIS_HASH: how to get endianness and/or instruction size?
  2007-01-25 14:20 Joern Rennecke
@ 2007-01-25 15:00 ` Frank Ch. Eigler
  2007-01-25 20:30   ` Joern Rennecke
  0 siblings, 1 reply; 7+ messages in thread
From: Frank Ch. Eigler @ 2007-01-25 15:00 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: cgen

Hi -

On Thu, Jan 25, 2007 at 02:20:36PM +0000, Joern Rennecke wrote:

> [...]  So, inside CGEN_DIS_HASH, how can I get the first 16 bits of
> the instruction, represented in host byte order?

> If I want to dereference the buffer pointer, I need to know the
> target endianness.

Use the "value" parameter (a host-endian copy of the "base insn")
rather than the "buffer" parameter.

> If I want to use the passed instruction value, I need to know what
> size it is.  [...]

See m32r_cgen_dis_hash: a hand-written baby insn classifier routine.

It is important to realize though that this disassembler hashing
widget is strictly an optimization.  You can try hard-coding the hash
value to 0 like some other cgen platforms, and see if the performance
is bearable.

- FChE

* CGEN_DIS_HASH: how to get endianness and/or instruction size?
@ 2007-01-25 14:20 Joern Rennecke
  2007-01-25 15:00 ` Frank Ch. Eigler
  0 siblings, 1 reply; 7+ messages in thread
From: Joern Rennecke @ 2007-01-25 14:20 UTC (permalink / raw)
  To: cgen

The ARCompact architecture has both 16 and 32 bit opcodes, and each can
take an optional 32 bit immediate.  Moreover, this is a bi-endian
architecture.
So, inside CGEN_DIS_HASH, how can I get the first 16 bits of the instruction,
represented in host byte order?
If I want to dereference the buffer pointer, I need to know the target
endianness.
If I want to use the passed instruction value, I need to know what size it
is.  Note that there are valid 32 bit opcodes which have all upper 16 bits
cleared.
It is also not quite clear if I can use this value if there is a 32 bit
immediate attached to the opcode.
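
To make the first option concrete, here is a small sketch of what
dereferencing the buffer would need.  The enum and function name are
hypothetical; the endianness argument is exactly the piece of information
that is not available inside CGEN_DIS_HASH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical endianness flag, standing in for whatever the
   disassembler knows about the target.  */
enum sketch_endian { SKETCH_BIG_ENDIAN, SKETCH_LITTLE_ENDIAN };

/* Assemble the first 16-bit word of an instruction from raw target
   bytes.  The result is an ordinary host integer, so it does not
   depend on the byte order of the host.  */
static uint16_t
sketch_first_word (const unsigned char *buf, enum sketch_endian endian)
{
  assert (buf != NULL);
  if (endian == SKETCH_BIG_ENDIAN)
    return (uint16_t) ((buf[0] << 8) | buf[1]);
  return (uint16_t) ((buf[1] << 8) | buf[0]);
}
```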
