* libopcode endian and subarch inconsistencies?
From: Dr. David Alan Gilbert @ 2012-06-10 16:33 UTC
  To: binutils

Hi,
  I've been writing a simple disassembler app (on Android) with libopcode and
hit a couple of inconsistencies; I'm wondering what the 'right' way to
solve them is.  In my app I take a chunk of raw bytes and hand it to
libopcode, having chosen 'binary' as the target;
see the code at:
  https://github.com/penguin42/pocketdisassembler/blob/master/jni/binutilsglue.c

(I'm using binutils-2.22 tweaked to build on Android)

1) Endianness
  There seem to be two separate places to set the endianness: one via
the target and the other via disassemble_info.endian;
the code in disassemble.c uses bfd_big_endian, which pulls it from
the target.

2) The 'binary' target
  The 'binary' target has its flags set as BFD_ENDIAN_UNKNOWN;
but there again there is a comment that says it's only for
use as output not by input; so other than ignoring that
comment (as I did) - what's the right way to use libopcode
to disassemble from a byte stream?
  (I took a copy of the xvec from binary and tweaked its endian
flag - I know that's grim, better suggestions welcome - I think
I found something similar somewhere in binutils)

3) Arch/subarch/-M
  Different architectures seem to prefer things passed in different
places with different styles.  To give some examples:
   Alpha - you have a subarch like alpha:ev6 to specify that version
   ARM - subarch is xscale or armv3m (not arm:xscale, which would follow the Alpha style)
   PPC - seems to prefer the variant to be passed by -M

   It's a pity these are all different.
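
To make the endianness point concrete: for a raw byte stream, the byte
order is the only thing that decides which instruction word the decoder
sees.  Below is a minimal, self-contained sketch of that; read_insn32 and
the byte_order enum are made up for illustration and are not libopcode or
BFD API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the BFD_ENDIAN_* values; not the real enum.  */
enum byte_order { ORDER_BIG, ORDER_LITTLE };

/* Assemble one 32-bit instruction word from four raw bytes,
   honouring the requested byte order.  */
uint32_t
read_insn32 (const unsigned char *buf, enum byte_order order)
{
  assert (buf != NULL);

  if (order == ORDER_BIG)
    return ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16)
           | ((uint32_t) buf[2] << 8) | (uint32_t) buf[3];

  /* Little-endian: the first byte in memory is the least significant.  */
  return ((uint32_t) buf[3] << 24) | ((uint32_t) buf[2] << 16)
         | ((uint32_t) buf[1] << 8) | (uint32_t) buf[0];
}
```

For example, the bytes 00 00 a0 e1 read as little-endian give 0xe1a00000
(ARM "mov r0, r0"), while the same bytes read as big-endian give
0x0000a0e1, which is a different instruction word altogether.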

Dave
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\ gro.gilbert @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/


* Re: libopcode endian and subarch inconsistencies?
From: Maciej W. Rozycki @ 2012-06-11  0:37 UTC
  To: Dr. David Alan Gilbert; +Cc: binutils

On Sun, 10 Jun 2012, Dr. David Alan Gilbert wrote:

> 2) The 'binary' target
>   The 'binary' target has its flags set as BFD_ENDIAN_UNKNOWN;
> but there again there is a comment that says it's only for
> use as output not by input; so other than ignoring that
> comment (as I did) - what's the right way to use libopcode
> to disassemble from a byte stream?

 See how `objdump' drives opcodes -- it has -EB/-EL options to select the 
endianness regardless of the target chosen.  It can disassemble from the 
"binary" target just fine.

  Maciej


* Re: libopcode endian and subarch inconsistencies?
From: Dr. David Alan Gilbert @ 2012-06-16 23:55 UTC
  To: Maciej W. Rozycki; +Cc: binutils

* Maciej W. Rozycki (macro@linux-mips.org) wrote:
> On Sun, 10 Jun 2012, Dr. David Alan Gilbert wrote:

Hi Maciej,
  Thanks for the reply.

> > 2) The 'binary' target
> >   The 'binary' target has its flags set as BFD_ENDIAN_UNKNOWN;
> > but there again there is a comment that says it's only for
> > use as output not by input; so other than ignoring that
> > comment (as I did) - what's the right way to use libopcode
> > to disassemble from a byte stream?
> 
>  See how `objdump' drives opcodes -- it has -EB/-EL options to select the 
> endianness regardless of the target chosen.

Yes, I think that might have been where I originally saw the trick I used;
from the objdump code:

  if (endian != BFD_ENDIAN_UNKNOWN)
    {
      struct bfd_target *xvec;

      xvec = (struct bfd_target *) xmalloc (sizeof (struct bfd_target));
      memcpy (xvec, abfd->xvec, sizeof (struct bfd_target));
      xvec->byteorder = endian;
      abfd->xvec = xvec;
    }

and then later:

  if (bfd_big_endian (abfd))
    disasm_info.display_endian = disasm_info.endian = BFD_ENDIAN_BIG;
  else if (bfd_little_endian (abfd))
    disasm_info.display_endian = disasm_info.endian = BFD_ENDIAN_LITTLE;

My point really is that memcpying and tweaking xvec feels like a horrible
hack, especially when you also have disasm_info.endian, which feels
like the place you'd expect to set the endianness because it exists.

I'd think that there should only be one of them that a caller to libopcode
should have to set.
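
As a rough sketch of the "one place" idea: the decision could be a single
function in which an explicit request from the caller wins and the
target's byte order is only a fallback.  Every name below (endian_kind,
target_vec, effective_endian) is a hypothetical stand-in and not real BFD
or libopcode API; the real types are enum bfd_endian and struct bfd_target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BFD types, for illustration only.  */
enum endian_kind { ENDIAN_BIG, ENDIAN_LITTLE, ENDIAN_UNKNOWN };

struct target_vec
{
  const char *name;
  enum endian_kind byteorder;
};

/* Decide the byte order in one place: an explicit request from the
   caller wins; otherwise fall back to whatever the target declares.
   The target itself is never copied or modified.  */
enum endian_kind
effective_endian (const struct target_vec *target, enum endian_kind requested)
{
  assert (target != NULL);

  if (requested != ENDIAN_UNKNOWN)
    return requested;
  return target->byteorder;
}
```

With something of this shape, a 'binary'-like target whose byte order is
unknown would simply take whatever the caller asked for, without the
memcpy of the target vector.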

> It can disassemble from the 
> "binary" target just fine.

Yes, my point about 'binary' was that there is the comment in binary.c:
   'It may only be used for output, not input.'

Dave
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\ gro.gilbert @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/


* Re: libopcode endian and subarch inconsistencies?
From: Maciej W. Rozycki @ 2012-07-07 19:43 UTC
  To: Dr. David Alan Gilbert; +Cc: binutils

On Sun, 17 Jun 2012, Dr. David Alan Gilbert wrote:

> >  See how `objdump' drives opcodes -- it has -EB/-EL options to select the 
> > endianness regardless of the target chosen.
> 
> Yes, I think that might have been where I originally saw the trick I used;
> from the objdump code:
> 
>   if (endian != BFD_ENDIAN_UNKNOWN)
>     {
>       struct bfd_target *xvec;
> 
>       xvec = (struct bfd_target *) xmalloc (sizeof (struct bfd_target));
>       memcpy (xvec, abfd->xvec, sizeof (struct bfd_target));
>       xvec->byteorder = endian;
>       abfd->xvec = xvec;
>     }
> 
> and then later:
> 
>   if (bfd_big_endian (abfd))
>     disasm_info.display_endian = disasm_info.endian = BFD_ENDIAN_BIG;
>   else if (bfd_little_endian (abfd))
>     disasm_info.display_endian = disasm_info.endian = BFD_ENDIAN_LITTLE;
> 
> My point really is that memcpying and tweaking xvec feels like a horrible
> hack, especially when you also have disasm_info.endian, which feels
> like the place you'd expect to set the endianness because it exists.
> 
> I'd think that there should only be one of them that a caller to libopcode
> should have to set.

 You're welcome to propose an improvement.  This is a large code base with
a long history; people often cannot know all the deficiencies in all the
corners, let alone have the resources to address them all.  If something is of
particular concern to you, then you are warmly encouraged to address its
shortcomings.

> > It can disassemble from the 
> > "binary" target just fine.
> 
> Yes, my point about 'binary' was that there is the comment in binary.c:
>    'It may only be used for output, not input.'

 Well, the comment may be a bit of an overstatement; I have certainly used
the "binary" BFD as input both to `objdump' and `objcopy'.  What the
author meant might have been that it's certainly not easy, if possible at
all, to use this BFD as input to the linker.

 But then mixing different input BFDs in linking has never worked 
particularly reliably and I think by now it has to be considered 
essentially not supported at all -- you need to `objcopy' any odd inputs 
to the same object format that all the others use before feeding them all 
to the linker.  Or you can use GAS's .incbin pseudo-op nowadays -- it's a 
relatively recent addition meant to address a shortcoming of `objcopy' 
that has no means to set essential output BFD's file flags that are needed 
in some configurations, e.g. a flag that marks file contents compatible 
with PIC code.

 A note along the lines of: "Its main use is for output, and its usability 
for input is limited." -- or suchlike might make sense instead; again, 
feel free to propose a patch if that bothers you.

 And sorry for the high RTT -- my resources are limited too.

  Maciej
