Re: RFC: Prevent disassembly beyond symbolic boundaries

public inbox for gdb-patches@sourceware.org

From: Tristan Gingold <gingold@adacore.com>
To: Nicholas Clifton <nickc@redhat.com>
Cc: binutils@sourceware.org, gdb-patches@sourceware.org
Subject: Re: RFC: Prevent disassembly beyond symbolic boundaries
Date: Fri, 19 Jun 2015 16:33:00 -0000
Message-ID: <3F2C1B8E-BFB8-4AE4-BDCE-8B66FC208E4B@adacore.com>
In-Reply-To: <5583FFEE.6060106@redhat.com>


> On 19 Jun 2015, at 13:41, Nicholas Clifton <nickc@redhat.com> wrote:
> 
> Hi Tristan,
> 
>>>  This will disassemble as:
>>> 
>>>    0000000000000000 <foo>:
>>>       0:   24 2f                   and    $0x2f,%al
>>>       2:   83 0f ba                orl    $0xffffffba,(%rdi)
>>> 
>>>    0000000000000003 <bar>:
>>>       3:   0f ba e2 03             bt     $0x3,%edx
>>> 
>>>  Note how the instruction decoded at address 0x2 has stolen two bytes
>>>  from "bar", but these bytes are also decoded (correctly this time) as
>>>  part of the first instruction of bar.
> 
>> I am curious.  Why do you think it was a problem ?
> 
> Strangely enough, this actually causes regressions with the perf tool's testsuite:
> 
>  https://bugzilla.redhat.com/show_bug.cgi?id=1054767
> 
> What happens is that perf test 21 runs objdump on a binary, *parses* this output and compares that to the actual bytes in the binary. Because of the overrun feature shown above you actually get more bytes displayed in objdump's output than actually exist in the binary and so the perf test fails.

I can argue that this is an issue in the perf tool.  After all, the objdump output is clear that pc goes backward.
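
To make the backward step concrete, here is a rough sketch (not perf's actual test code) that parses a listing shaped like the one quoted above and reports where an instruction starts before the previous one has ended. The names LISTING, parse_listing and find_overlaps are made up for illustration, and the regular expression assumes hex byte pairs separated by single spaces, as in the quoted output.

```python
import re

# Listing copied from the example quoted earlier in this message.
LISTING = """\
0000000000000000 <foo>:
   0:   24 2f                   and    $0x2f,%al
   2:   83 0f ba                orl    $0xffffffba,(%rdi)

0000000000000003 <bar>:
   3:   0f ba e2 03             bt     $0x3,%edx
"""

# Matches instruction lines: hex address, colon, then space-separated hex bytes.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)", re.IGNORECASE)


def parse_listing(text):
    """Return (address, byte_count) for every instruction line in text."""
    insns = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            insns.append((int(m.group(1), 16), len(m.group(2).split())))
    return insns


def find_overlaps(insns):
    """Return (address, overlapping_bytes) where an instruction starts
    before the previous one has ended, i.e. where the pc goes backward."""
    overlaps = []
    end = None
    for addr, nbytes in insns:
        if end is not None and addr < end:
            overlaps.append((addr, end - addr))
        end = addr + nbytes
    return overlaps


insns = parse_listing(LISTING)
displayed = sum(n for _, n in insns)
print(insns, find_overlaps(insns), displayed)
```

On this listing the displayed byte count is 9, while the addresses only span 0x0 to 0x6 (7 bytes); the two-byte overlap at address 0x3 accounts for the difference.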

>> Even if there is a symbol in the middle of an instruction, I’d like
>> to understand what the processor will execute.
> 
> Except that even the currently displayed disassembly is not what the processor would execute.  In the example above the processor would execute the ORL instruction starting at address 0x2, but it would not continue on to execute the BT instruction at address 0x3.  Instead it would start decoding from address 0x5, whatever instruction that might be…

That’s a very good point!
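
A toy model of that observation, assuming only the instruction lengths visible in the listing (the table LENGTHS and the helper fall_through_path are hypothetical names, and none of the three instructions is a branch):

```python
# Instruction lengths keyed by start address, taken from the quoted listing.
LENGTHS = {0x0: 2, 0x2: 3, 0x3: 4}


def fall_through_path(start, lengths):
    """Follow straight-line execution from start.

    Returns the addresses decoded along the way and the first address
    for which no length is known.
    """
    visited = []
    pc = start
    while pc in lengths:
        visited.append(pc)
        pc += lengths[pc]
    return visited, pc


path, next_pc = fall_through_path(0x0, LENGTHS)
print([hex(a) for a in path], hex(next_pc))
```

Starting from 0x0 the path is 0x0, 0x2 and then 0x5, so the BT at 0x3 is only decoded if something jumps or calls to bar directly.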

>> Before the proposed
>> change, it was possible, but after it isn’t easy anymore.
> 
> True - but this only matters if the processor would execute from that piece of memory.  What if the byte(s) are actually data ?  (eg a constant pool).  Then it would make more sense to display the bytes as just byte values.

OTOH, if this is a constant pool, it is possible that objdump has already been off track for a while.

> The point being that if there is a symbol that is in the middle of an instruction then something hinky is going on.  Either the symbol is misplaced or the instruction is not really an instruction or else an assembly programmer is being extra super clever and hiding data inside instructions.

Yes.  My scenario was setting a label on a known part of an instruction like the offset in a call instruction you want to patch later.
But I agree that before and after your proposed change, objdump output is not very readable.

> How about a tweak to the patch then ?  What if the -D option (disassemble all) disables this feature, and so the disassembled instruction is displayed as before, whilst the -d option (disassemble code) leaves it enabled.  Then if you want to see bytes as instructions you can use the -D option (possibly combined with -j), but if you want a more plausible listing in which only real instructions are disassembled, use the -d option.  (Obviously the patch would need to be extended with an update to the documentation too).

It’s up to you.  I don’t insist at all on modifying your change, I was just curious about the motivation.
And the scenario I had in mind is not really affected by your proposal.

Tristan.

