public inbox for gdb@sourceware.org
* A lean way for getting the size of the instruction at a given address
       [not found] <295a186e-0dd9-fb96-671a-3df0a5611dd9@trande.de>
@ 2021-04-04  7:59 ` Zied Guermazi
  2021-04-05 13:01   ` Luis Machado
  0 siblings, 1 reply; 8+ messages in thread
From: Zied Guermazi @ 2021-04-04  7:59 UTC (permalink / raw)
  To: gdb

hi

I need to get the size of the instruction at a given address. I am
currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR
addr), which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR
memaddr, struct ui_file *stream, int *branch_delay_insns). This takes a
huge amount of time, because it is used in branch tracing and gets
repeated up to a few million times.

Is there a lean way to get the size of the instruction at a given
address? I am using it for aarch64 and arm targets.

Kind Regards

Zied Guermazi




* Re: A lean way for getting the size of the instruction at a given address
  2021-04-04  7:59 ` A lean way for getting the size of the instruction at a given address Zied Guermazi
@ 2021-04-05 13:01   ` Luis Machado
  2021-04-05 16:17     ` Zied Guermazi
  0 siblings, 1 reply; 8+ messages in thread
From: Luis Machado @ 2021-04-05 13:01 UTC (permalink / raw)
  To: Zied Guermazi, gdb

Hi Zied,

On 4/4/21 4:59 AM, Zied Guermazi wrote:
> hi
> 
> I need to get the size of the instruction at a given address. I am
> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR
> addr), which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR
> memaddr, struct ui_file *stream, int *branch_delay_insns). This takes a
> huge amount of time, because it is used in branch tracing and gets
> repeated up to a few million times.
>
> Is there a lean way to get the size of the instruction at a given
> address? I am using it for aarch64 and arm targets.

At the moment I don't think there is an optimal solution for this. The
instruction length is calculated as part of the disassembly process and
is tied to the function that prints instructions.

One way to speed things up is to have a new member function in "class 
gdb_disassembler" to calculate the instruction length only.

Another way is to have a new gdbarch hook that calculates the size of an 
instruction based on the current PC, mapping symbols etc.


* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 13:01   ` Luis Machado
@ 2021-04-05 16:17     ` Zied Guermazi
  2021-04-05 16:40       ` Luis Machado
  0 siblings, 1 reply; 8+ messages in thread
From: Zied Guermazi @ 2021-04-05 16:17 UTC (permalink / raw)
  To: Luis Machado, gdb

hi Luis

A new member function in "class gdb_disassembler" that calculates only
the instruction length would be a good solution. In fact, a big overhead
is added by printing the instruction disassembly, which is not needed
at all. On aarch64, the decoder is optimized to issue many instructions
in one trace element, and there calculating the size consumes more than
80% of the time. On arm, the decoder issues one instruction after
another, and there getting the size consumes 50% of the time. Considering
the amount of traces, this can add up to a dozen minutes in some cases
(64 MB of traces).

Calculating the instruction size per se on arm is a "rapid" operation
that consists of checking a few bits in the opcode. So the time can be
drastically decreased by having a function that calculates the size only.
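
To illustrate the "few bits" check, here is a minimal sketch. It is not
GDB code: insn_set and insn_length are hypothetical names, and it assumes
the caller already knows which instruction set is active at the address
(in practice this needs mapping symbols or similar) and that the address
holds code, not data. A64 and A32 instructions are always 4 bytes; for
Thumb, the top five bits of the first halfword decide between 2 and 4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum for this sketch; GDB tracks the active
// instruction set in its own way.
enum class insn_set { aarch64, arm, thumb };

// Return the instruction length in bytes.  FIRST_HALFWORD is only
// consulted for Thumb and must hold the first 16 bits of the instruction.
inline int
insn_length (insn_set set, std::uint16_t first_halfword)
{
  switch (set)
    {
    case insn_set::aarch64:
    case insn_set::arm:
      // A64 and A32 instructions have a fixed 32-bit encoding.
      return 4;

    case insn_set::thumb:
      {
        // A first halfword whose top five bits are 0b11101, 0b11110
        // or 0b11111 starts a 32-bit Thumb instruction; anything
        // else is a 16-bit instruction.
        unsigned int top5 = first_halfword >> 11;
        return (top5 == 0x1d || top5 == 0x1e || top5 == 0x1f) ? 4 : 2;
      }
    }

  assert (!"unknown instruction set");
  return 0;
}
```

For example, insn_length (insn_set::thumb, 0x4770) covers "bx lr", a
16-bit encoding, while a first halfword of 0xf000 (start of a "bl") is
reported as part of a 32-bit instruction.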


gdb_print_insn can then be changed as follows (pseudo code):

int
gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
                struct ui_file *stream, int *branch_delay_insns)
{
  gdb_disassembler di (gdbarch, stream);

  /* Use the size-only path when the disassembler provides one.  */
  if (di.get_insn_size != 0)
    return di.get_insn_size (memaddr);
  else
    return di.print_insn (memaddr, branch_delay_insns);
}

Is there a function in aarch64-tdep or arm-tdep doing the job of
disassembly (the lower layer handling the opcode)? Are we relying on the
bfd library for it? Can someone give me a hint of where to find those
functions?


Kind Regards

Zied Guermazi


-- 

*Zied Guermazi*
founder

Trande UG
Leuschnerstraße 2
69469 Weinheim/Germany

Mobile: +491722645127
mailto:zied.guermazi@trande.de


* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 16:17     ` Zied Guermazi
@ 2021-04-05 16:40       ` Luis Machado
  2021-04-05 21:47         ` Zied Guermazi
  0 siblings, 1 reply; 8+ messages in thread
From: Luis Machado @ 2021-04-05 16:40 UTC (permalink / raw)
  To: Zied Guermazi, gdb

On 4/5/21 1:17 PM, Zied Guermazi wrote:
> hi Luis
> 
> A new member function in "class gdb_disassembler" that calculates only
> the instruction length would be a good solution. In fact, a big overhead
> is added by printing the instruction disassembly, which is not needed
> at all. On aarch64, the decoder is optimized to issue many instructions
> in one trace element, and there calculating the size consumes more than
> 80% of the time. On arm, the decoder issues one instruction after
> another, and there getting the size consumes 50% of the time. Considering
> the amount of traces, this can add up to a dozen minutes in some cases
> (64 MB of traces).

Indeed, that doesn't sound good.

> 
> Calculating the instruction size per se on arm is a "rapid" operation
> that consists of checking a few bits in the opcode. So the time can be
> drastically decreased by having a function that calculates the size only.
>
> gdb_print_insn can then be changed as follows (pseudo code):
>
> int
> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
>                 struct ui_file *stream, int *branch_delay_insns)
> {
>   gdb_disassembler di (gdbarch, stream);
>
>   /* Use the size-only path when the disassembler provides one.  */
>   if (di.get_insn_size != 0)
>     return di.get_insn_size (memaddr);
>   else
>     return di.print_insn (memaddr, branch_delay_insns);
> }
>
> Is there a function in aarch64-tdep or arm-tdep doing the job of
> disassembly (the lower layer handling the opcode)? Are we relying on the
> bfd library for it? Can someone give me a hint of where to find those
> functions?

The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and aarch64-tdep.c 
(aarch64_gdb_print_insn) are more like helper functions and do some 
initial setup, but the code to disassemble lies in opcodes/arm-dis.c 
(print_insn) and opcodes/aarch64-dis.c (print_insn_aarch64).

If you go with the route of changing "class gdb_disassembler", then 
you'll probably need to touch binutils/opcodes.

If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), then 
you only need to change GDB.
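
As a rough illustration of the gdbarch-hook route, the sketch below shows
an optional per-architecture "length only" hook with a fallback to the
existing disassembly path. Everything in it is hypothetical: arch_sketch,
core_addr and the two members are stand-ins invented for this example,
not GDB's struct gdbarch or its real hook names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Stand-in for GDB's CORE_ADDR in this sketch.
using core_addr = std::uint64_t;

// Hypothetical per-architecture object; not GDB's struct gdbarch.
struct arch_sketch
{
  // Optional fast path: computes the length without producing text.
  // Left empty by architectures that do not provide one.
  std::function<int (core_addr)> insn_length_fast;

  // Slow path that is always present: full disassembly, which
  // returns the length as a side product.
  std::function<int (core_addr)> insn_length_via_disassembly;
};

// Prefer the fast hook when the architecture installed one,
// otherwise fall back to the disassembly-based path.
inline int
insn_length (const arch_sketch &arch, core_addr addr)
{
  if (arch.insn_length_fast)
    return arch.insn_length_fast (addr);

  assert (arch.insn_length_via_disassembly);
  return arch.insn_length_via_disassembly (addr);
}
```

With this shape, only architectures that care about the hot path (arm and
aarch64 here) need to install the fast hook; every other target keeps its
current behaviour through the fallback.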

* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 16:40       ` Luis Machado
@ 2021-04-05 21:47         ` Zied Guermazi
  2021-04-05 22:04           ` Luis Machado
  0 siblings, 1 reply; 8+ messages in thread
From: Zied Guermazi @ 2021-04-05 21:47 UTC (permalink / raw)
  To: Luis Machado, gdb

hi Luis,

thanks for your support. To measure the impact of removing the
printing of the instruction on the overall performance, I commented out
setting and using the print function pointer in print_insn (bfd_vma pc,
struct disassemble_info *info, bfd_boolean little) in opcodes/arm-dis.c,
and the result was very interesting: the time needed to process the
traces dropped from 12 minutes to 34 seconds for 64 MB of traces.
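
A less invasive way to get a similar effect than commenting code out
might be to hand the decoder a print callback that discards its
arguments. The sketch below only illustrates that idea and is not the
opcodes code: it assumes an fprintf-like callback shape, and decode_one,
format_output and discard_output are hypothetical names.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdint>
#include <cstdio>
#include <string>

// fprintf-like callback shape assumed for this sketch.
using print_callback = int (*) (void *stream, const char *format, ...);

// Formats the text and appends it to the std::string passed as STREAM.
int
format_output (void *stream, const char *format, ...)
{
  char buf[128];
  va_list args;
  va_start (args, format);
  int n = std::vsnprintf (buf, sizeof buf, format, args);
  va_end (args);
  if (n > 0)
    static_cast<std::string *> (stream)->append (buf);
  return n;
}

// Discards the text without formatting it at all.
int
discard_output (void *, const char *, ...)
{
  return 0;
}

// Toy decoder (hypothetical): reports text through PRINT and
// returns the instruction length in bytes.
int
decode_one (std::uint32_t opcode, print_callback print, void *stream)
{
  assert (print != nullptr);
  print (stream, "insn 0x%08x", static_cast<unsigned int> (opcode));
  return 4;
}
```

A caller that only needs the length passes discard_output, so no
formatting work is done, while the decoder itself stays unchanged.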

Now that we have proof that the bottleneck was printing, we can think
about a way to provide a clean implementation.

Kind Regards

Zied Guermazi



* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 21:47         ` Zied Guermazi
@ 2021-04-05 22:04           ` Luis Machado
  2021-04-05 22:12             ` Zied Guermazi
  0 siblings, 1 reply; 8+ messages in thread
From: Luis Machado @ 2021-04-05 22:04 UTC (permalink / raw)
  To: Zied Guermazi, gdb

Hi Zied,

On 4/5/21 6:47 PM, Zied Guermazi wrote:
> hi Luis,
> 
> thanks for your support. To measure the impact of removing the
> printing of the instruction on the overall performance, I commented out
> setting and using the print function pointer in print_insn (bfd_vma pc,
> struct disassemble_info *info, bfd_boolean little) in opcodes/arm-dis.c,
> and the result was very interesting: the time needed to process the
> traces dropped from 12 minutes to 34 seconds for 64 MB of traces.

That is quite a bottleneck! I think this code path isn't exercised often.

> 
> Now that we have proof that the bottleneck was printing, we can think
> about a way to provide a clean implementation.

I agree. A faster implementation of this particular function would be 
nice to have. It may even improve some other code paths that use this 
information.


* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 22:04           ` Luis Machado
@ 2021-04-05 22:12             ` Zied Guermazi
  2021-04-05 22:15               ` Luis Machado
  0 siblings, 1 reply; 8+ messages in thread
From: Zied Guermazi @ 2021-04-05 22:12 UTC (permalink / raw)
  To: Luis Machado, gdb

Hi Luis,

Yes, I guess it was intended for processing the disassemble command. It
was not intended to be used in performance-critical use cases. Once it
was removed, the next bottleneck is the printf in
get_all_disassembler_options (a string is used as a means of passing
options). It consumes 20% of the time.
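
One possible way to take that formatting out of the hot path is to
rebuild the options string only when its inputs change. The sketch below
is an assumption about how this could look, not GDB's implementation:
options_cache and its members are hypothetical, and it assumes the final
string is the implicit and the user options joined by a comma.

```cpp
#include <string>

// Hypothetical cache for a comma-joined disassembler options string.
class options_cache
{
public:
  // Return IMPLICIT and USER joined by a comma, rebuilding the
  // string only when one of the two inputs has changed.
  const std::string &
  get (const std::string &implicit, const std::string &user)
  {
    if (!m_valid || implicit != m_implicit || user != m_user)
      {
        m_implicit = implicit;
        m_user = user;
        m_joined = implicit;
        if (!implicit.empty () && !user.empty ())
          m_joined += ',';
        m_joined += user;
        m_valid = true;
        ++m_rebuilds;
      }
    return m_joined;
  }

  // Number of times the joined string was rebuilt (for inspection).
  int rebuilds () const { return m_rebuilds; }

private:
  std::string m_implicit;
  std::string m_user;
  std::string m_joined;
  bool m_valid = false;
  int m_rebuilds = 0;
};
```

Called a few million times with unchanged inputs, get () does two string
comparisons per call and rebuilds the joined string once.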

Shall we put the changes needed to increase the performance in the "etm
for branch tracing" patch set, or in a dedicated one (a performance
improvement one)? Please advise.

/Zied


* Re: A lean way for getting the size of the instruction at a given address
  2021-04-05 22:12             ` Zied Guermazi
@ 2021-04-05 22:15               ` Luis Machado
  0 siblings, 0 replies; 8+ messages in thread
From: Luis Machado @ 2021-04-05 22:15 UTC (permalink / raw)
  To: Zied Guermazi, gdb

Zied,

On 4/5/21 7:12 PM, Zied Guermazi wrote:
> Hi Luis,
> 
> Yes, I guess it was intended for processing the disassemble command. It
> was not intended to be used in performance-critical use cases. Once it
> was removed, the next bottleneck is the printf in
> get_all_disassembler_options (a string is used as a means of passing
> options). It consumes 20% of the time.
>
> Shall we put the changes needed to increase the performance in the "etm
> for branch tracing" patch set, or in a dedicated one (a performance
> improvement one)? Please advise.

This would be best as a separate patch. It will be easier to review that 
way.

You may need to submit the change to both gdb/binutils lists, if the 
patch touches both projects.
