public inbox for gdb@sourceware.org
 help / color / mirror / Atom feed
From: Luis Machado <luis.machado@linaro.org>
To: Zied Guermazi <zied.guermazi@trande.de>,
	"gdb@sourceware.org" <gdb@sourceware.org>
Subject: Re: A lean way for getting the size of the instruction at a given address
Date: Mon, 5 Apr 2021 19:04:51 -0300	[thread overview]
Message-ID: <0f2147c8-770d-5cb2-f415-8549b7192b36@linaro.org> (raw)
In-Reply-To: <72e584f8-2cf6-0bfa-882d-a1ba21a43931@trande.de>

Hi Zied,

On 4/5/21 6:47 PM, Zied Guermazi wrote:
> hi Luis,
> 
> thanks for your support. To experiment the impact of removing the 
> printing of the instruction on the overall performance, I commented out 
> setting and using the print function pointer in print_insn (bfd_vma pc, 
> struct disassemble_info *info, bfd_boolean little) in opcodes/arm-dis.c, 
> and the result was very interesting: The time needed to process the 
> traces dropped down from 12 minutes to 34 seconds for 64 MB of traces.

That is quite a bottleneck! I think this code path isn't exercised often.

> 
> now that we have a proof that the bottleneck was printing, we can think 
> about a way to provide a clean implementation.

I agree. A faster implementation of this particular function would be 
nice to have. It may even improve some other code paths that use this 
information.

> 
> Kind Regards
> 
> Zied Guermazi
> 
> 
> On 05.04.21 18:40, Luis Machado wrote:
>> On 4/5/21 1:17 PM, Zied Guermazi wrote:
>>> hi Luis
>>>
>>> A new member function in "class gdb_disassembler" to calculate the 
>>> instruction length only will be a good solution. In fact a big 
>>> overhead is added by the printing of instruction disassembly, which 
>>> is not needed at all. On aarch64, the decoder is optimized to issue 
>>> many instruction in one trace element, and here calculating the size 
>>> consumes more than 80% of the time. On arm, the decoder issues one 
>>> instruction after another and here getting the size consumes 50% of 
>>> the time. Considering the amount of traces this can sum up to a dozen 
>>> of minutes in some cases (64MB of traces)
>>
>> Indeed, that doesn't sound good.
>>
>>>
>>> Calculating the instruction size per se, on arm is a "rapid" 
>>> operation and consists of checking few bits in the opcode. So the 
>>> time can be drastically decreased by having a function to calculate 
>>> the size only.
>>>
>>>
>>> gdb_print_insn can be then changed as following (pseudo code):
>>>
>>> int
>>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
>>>          struct ui_file *stream, int *branch_delay_insns)
>>> {
>>>
>>>    gdb_disassembler di (gdbarch, stream);
>>>
>>>    if ( di.get_insn_size != 0)
>>>
>>>     return di.get_insn_size(memaddr);
>>>
>>>    else
>>>
>>>     return di.print_insn (memaddr, branch_delay_insns);
>>> }
>>>
>>> Is there a function in aarch64-tdep or arm-tdep doing job of 
>>> disassembly ( the lower layer handling the opcode)? are we relaying 
>>> on the bfd library for it? can someone give me a hint of where to 
>>> find those functions?
>>
>> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and 
>> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper functions 
>> and do some initial setup, but the code to disassemble lies in 
>> opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c 
>> (print_insn_aarch64).
>>
>> If you go with the route of changing "class gdb_disassembler", then 
>> you'll probably need to touch binutils/opcodes.
>>
>> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), then 
>> you only need to change GDB.
>>>
>>>
>>> Kind Regards
>>>
>>> Zied Guermazi
>>>
>>>
>>> On 05.04.21 15:01, Luis Machado wrote:
>>>> Hi Zied,
>>>>
>>>> On 4/4/21 4:59 AM, Zied Guermazi wrote:
>>>>> hi
>>>>>
>>>>> I need to get the size of the instruction at a given address. I am 
>>>>> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR 
>>>>> addr) which calls gdb_print_insn (struct gdbarch *gdbarch, 
>>>>> CORE_ADDR memaddr, struct ui_file *stream, int 
>>>>> *branch_delay_insns). and this is consuming a huge time, 
>>>>> considering that this is used in branch tracing and this gets 
>>>>> repeated up to few millions times.
>>>>>
>>>>>
>>>>> Is there a lean way for getting the size of the instruction at a 
>>>>> given address, I am using it for aarch64 and arm targets.
>>>>
>>>> At the moment I don't think there is an optimal solution for this. 
>>>> The instruction length is calculated as part of the disassemble 
>>>> process, and is tied to the function that prints instructions.
>>>>
>>>> One way to speed things up is to have a new member function in 
>>>> "class gdb_disassembler" to calculate the instruction length only.
>>>>
>>>> Another way is to have a new gdbarch hook that calculates the size 
>>>> of an instruction based on the current PC, mapping symbols etc.
>>>>
>>>>>
>>>>> Kind Regards
>>>>>
>>>>> Zied Guermazi
>>>>>
>>>>>
>>>
>>>
> 

  reply	other threads:[~2021-04-05 22:04 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <295a186e-0dd9-fb96-671a-3df0a5611dd9@trande.de>
2021-04-04  7:59 ` Zied Guermazi
2021-04-05 13:01   ` Luis Machado
2021-04-05 16:17     ` Zied Guermazi
2021-04-05 16:40       ` Luis Machado
2021-04-05 21:47         ` Zied Guermazi
2021-04-05 22:04           ` Luis Machado [this message]
2021-04-05 22:12             ` Zied Guermazi
2021-04-05 22:15               ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0f2147c8-770d-5cb2-f415-8549b7192b36@linaro.org \
    --to=luis.machado@linaro.org \
    --cc=gdb@sourceware.org \
    --cc=zied.guermazi@trande.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).