* A lean way for getting the size of the instruction at a given address [not found] <295a186e-0dd9-fb96-671a-3df0a5611dd9@trande.de> @ 2021-04-04 7:59 ` Zied Guermazi 2021-04-05 13:01 ` Luis Machado 0 siblings, 1 reply; 8+ messages in thread From: Zied Guermazi @ 2021-04-04 7:59 UTC (permalink / raw) To: gdb hi I need to get the size of the instruction at a given address. I am currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR addr) which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int *branch_delay_insns). and this is consuming a huge time, considering that this is used in branch tracing and this gets repeated up to few millions times. Is there a lean way for getting the size of the instruction at a given address, I am using it for aarch64 and arm targets. Kind Regards Zied Guermazi ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-04 7:59 ` A lean way for getting the size of the instruction at a given address Zied Guermazi @ 2021-04-05 13:01 ` Luis Machado 2021-04-05 16:17 ` Zied Guermazi 0 siblings, 1 reply; 8+ messages in thread From: Luis Machado @ 2021-04-05 13:01 UTC (permalink / raw) To: Zied Guermazi, gdb Hi Zied, On 4/4/21 4:59 AM, Zied Guermazi wrote: > hi > > I need to get the size of the instruction at a given address. I am > currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR > addr) which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR > memaddr, struct ui_file *stream, int *branch_delay_insns). and this is > consuming a huge time, considering that this is used in branch tracing > and this gets repeated up to few millions times. > > > Is there a lean way for getting the size of the instruction at a given > address, I am using it for aarch64 and arm targets. At the moment I don't think there is an optimal solution for this. The instruction length is calculated as part of the disassemble process, and is tied to the function that prints instructions. One way to speed things up is to have a new member function in "class gdb_disassembler" to calculate the instruction length only. Another way is to have a new gdbarch hook that calculates the size of an instruction based on the current PC, mapping symbols etc. > > Kind Regards > > Zied Guermazi > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 13:01 ` Luis Machado @ 2021-04-05 16:17 ` Zied Guermazi 2021-04-05 16:40 ` Luis Machado 0 siblings, 1 reply; 8+ messages in thread From: Zied Guermazi @ 2021-04-05 16:17 UTC (permalink / raw) To: Luis Machado, gdb hi Luis A new member function in "class gdb_disassembler" to calculate the instruction length only will be a good solution. In fact a big overhead is added by the printing of instruction disassembly, which is not needed at all. On aarch64, the decoder is optimized to issue many instruction in one trace element, and here calculating the size consumes more than 80% of the time. On arm, the decoder issues one instruction after another and here getting the size consumes 50% of the time. Considering the amount of traces this can sum up to a dozen of minutes in some cases (64MB of traces) Calculating the instruction size per se, on arm is a "rapid" operation and consists of checking few bits in the opcode. So the time can be drastically decreased by having a function to calculate the size only. gdb_print_insn can be then changed as following (pseudo code): int gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int *branch_delay_insns) { gdb_disassembler di (gdbarch, stream); if ( di.get_insn_size != 0) return di.get_insn_size(memaddr); else return di.print_insn (memaddr, branch_delay_insns); } Is there a function in aarch64-tdep or arm-tdep doing job of disassembly ( the lower layer handling the opcode)? are we relaying on the bfd library for it? can someone give me a hint of where to find those functions? Kind Regards Zied Guermazi On 05.04.21 15:01, Luis Machado wrote: > Hi Zied, > > On 4/4/21 4:59 AM, Zied Guermazi wrote: >> hi >> >> I need to get the size of the instruction at a given address. I am >> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR >> addr) which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR >> memaddr, struct ui_file *stream, int *branch_delay_insns). and this >> is consuming a huge time, considering that this is used in branch >> tracing and this gets repeated up to few millions times. >> >> >> Is there a lean way for getting the size of the instruction at a >> given address, I am using it for aarch64 and arm targets. > > At the moment I don't think there is an optimal solution for this. The > instruction length is calculated as part of the disassemble process, > and is tied to the function that prints instructions. > > One way to speed things up is to have a new member function in "class > gdb_disassembler" to calculate the instruction length only. > > Another way is to have a new gdbarch hook that calculates the size of > an instruction based on the current PC, mapping symbols etc. > >> >> Kind Regards >> >> Zied Guermazi >> >> -- *Zied Guermazi* founder Trande UG Leuschnerstraße 2 69469 Weinheim/Germany Mobile: +491722645127 mailto:zied.guermazi@trande.de *Trande UG* Leuschnerstraße 2, D-69469 Weinheim; Telefon: +491722645127 Sitz der Gesellschaft: Weinheim- Registergericht: AG Mannheim HRB 736209 - Geschäftsführung: Zied Guermazi *Confidentiality Note* This message is intended only for the use of the named recipient(s) and may contain confidential and/or privileged information. If you are not the intended recipient, please contact the sender and delete the message. Any unauthorized use of the information contained in this message is prohibited. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 16:17 ` Zied Guermazi @ 2021-04-05 16:40 ` Luis Machado 2021-04-05 21:47 ` Zied Guermazi 0 siblings, 1 reply; 8+ messages in thread From: Luis Machado @ 2021-04-05 16:40 UTC (permalink / raw) To: Zied Guermazi, gdb On 4/5/21 1:17 PM, Zied Guermazi wrote: > hi Luis > > A new member function in "class gdb_disassembler" to calculate the > instruction length only will be a good solution. In fact a big overhead > is added by the printing of instruction disassembly, which is not needed > at all. On aarch64, the decoder is optimized to issue many instruction > in one trace element, and here calculating the size consumes more than > 80% of the time. On arm, the decoder issues one instruction after > another and here getting the size consumes 50% of the time. Considering > the amount of traces this can sum up to a dozen of minutes in some cases > (64MB of traces) Indeed, that doesn't sound good. > > Calculating the instruction size per se, on arm is a "rapid" operation > and consists of checking few bits in the opcode. So the time can be > drastically decreased by having a function to calculate the size only. > > > gdb_print_insn can be then changed as following (pseudo code): > > int > gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, > struct ui_file *stream, int *branch_delay_insns) > { > > gdb_disassembler di (gdbarch, stream); > > if ( di.get_insn_size != 0) > > return di.get_insn_size(memaddr); > > else > > return di.print_insn (memaddr, branch_delay_insns); > } > > Is there a function in aarch64-tdep or arm-tdep doing job of disassembly > ( the lower layer handling the opcode)? are we relaying on the bfd > library for it? can someone give me a hint of where to find those functions? The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper functions and do some initial setup, but the code to disassemble lies in opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c (print_insn_aarch64). If you go with the route of changing "class gdb_disassembler", then you'll probably need to touch binutils/opcodes. If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), then you only need to change GDB. > > > Kind Regards > > Zied Guermazi > > > On 05.04.21 15:01, Luis Machado wrote: >> Hi Zied, >> >> On 4/4/21 4:59 AM, Zied Guermazi wrote: >>> hi >>> >>> I need to get the size of the instruction at a given address. I am >>> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR >>> addr) which calls gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR >>> memaddr, struct ui_file *stream, int *branch_delay_insns). and this >>> is consuming a huge time, considering that this is used in branch >>> tracing and this gets repeated up to few millions times. >>> >>> >>> Is there a lean way for getting the size of the instruction at a >>> given address, I am using it for aarch64 and arm targets. >> >> At the moment I don't think there is an optimal solution for this. The >> instruction length is calculated as part of the disassemble process, >> and is tied to the function that prints instructions. >> >> One way to speed things up is to have a new member function in "class >> gdb_disassembler" to calculate the instruction length only. >> >> Another way is to have a new gdbarch hook that calculates the size of >> an instruction based on the current PC, mapping symbols etc. >> >>> >>> Kind Regards >>> >>> Zied Guermazi >>> >>> > -- > > *Zied Guermazi* > founder > > Trande UG > Leuschnerstraße 2 > 69469 Weinheim/Germany > > Mobile: +491722645127 > mailto:zied.guermazi@trande.de <mailto:zied.guermazi@trande.de> > > *Trande UG* > Leuschnerstraße 2, D-69469 Weinheim; Telefon: +491722645127 > Sitz der Gesellschaft: Weinheim- Registergericht: AG Mannheim HRB 736209 > - Geschäftsführung: Zied Guermazi > > *Confidentiality Note* > This message is intended only for the use of the named recipient(s) and > may contain confidential and/or privileged information. If you are not > the intended recipient, please contact the sender and delete the > message. Any unauthorized use of the information contained in this > message is prohibited. > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 16:40 ` Luis Machado @ 2021-04-05 21:47 ` Zied Guermazi 2021-04-05 22:04 ` Luis Machado 0 siblings, 1 reply; 8+ messages in thread From: Zied Guermazi @ 2021-04-05 21:47 UTC (permalink / raw) To: Luis Machado, gdb hi Luis, thanks for your support. To experiment the impact of removing the printing of the instruction on the overall performance, I commented out setting and using the print function pointer in print_insn (bfd_vma pc, struct disassemble_info *info, bfd_boolean little) in opcodes/arm-dis.c, and the result was very interesting: The time needed to process the traces dropped down from 12 minutes to 34 seconds for 64 MB of traces. now that we have a proof that the bottleneck was printing, we can think about a way to provide a clean implementation. Kind Regards Zied Guermazi On 05.04.21 18:40, Luis Machado wrote: > On 4/5/21 1:17 PM, Zied Guermazi wrote: >> hi Luis >> >> A new member function in "class gdb_disassembler" to calculate the >> instruction length only will be a good solution. In fact a big >> overhead is added by the printing of instruction disassembly, which >> is not needed at all. On aarch64, the decoder is optimized to issue >> many instruction in one trace element, and here calculating the size >> consumes more than 80% of the time. On arm, the decoder issues one >> instruction after another and here getting the size consumes 50% of >> the time. Considering the amount of traces this can sum up to a dozen >> of minutes in some cases (64MB of traces) > > Indeed, that doesn't sound good. > >> >> Calculating the instruction size per se, on arm is a "rapid" >> operation and consists of checking few bits in the opcode. So the >> time can be drastically decreased by having a function to calculate >> the size only. >> >> >> gdb_print_insn can be then changed as following (pseudo code): >> >> int >> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, >> struct ui_file *stream, int *branch_delay_insns) >> { >> >> gdb_disassembler di (gdbarch, stream); >> >> if ( di.get_insn_size != 0) >> >> return di.get_insn_size(memaddr); >> >> else >> >> return di.print_insn (memaddr, branch_delay_insns); >> } >> >> Is there a function in aarch64-tdep or arm-tdep doing job of >> disassembly ( the lower layer handling the opcode)? are we relaying >> on the bfd library for it? can someone give me a hint of where to >> find those functions? > > The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and > aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper functions > and do some initial setup, but the code to disassemble lies in > opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c > (print_insn_aarch64). > > If you go with the route of changing "class gdb_disassembler", then > you'll probably need to touch binutils/opcodes. > > If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), then > you only need to change GDB. >> >> >> Kind Regards >> >> Zied Guermazi >> >> >> On 05.04.21 15:01, Luis Machado wrote: >>> Hi Zied, >>> >>> On 4/4/21 4:59 AM, Zied Guermazi wrote: >>>> hi >>>> >>>> I need to get the size of the instruction at a given address. I am >>>> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR >>>> addr) which calls gdb_print_insn (struct gdbarch *gdbarch, >>>> CORE_ADDR memaddr, struct ui_file *stream, int >>>> *branch_delay_insns). and this is consuming a huge time, >>>> considering that this is used in branch tracing and this gets >>>> repeated up to few millions times. >>>> >>>> >>>> Is there a lean way for getting the size of the instruction at a >>>> given address, I am using it for aarch64 and arm targets. >>> >>> At the moment I don't think there is an optimal solution for this. >>> The instruction length is calculated as part of the disassemble >>> process, and is tied to the function that prints instructions. >>> >>> One way to speed things up is to have a new member function in >>> "class gdb_disassembler" to calculate the instruction length only. >>> >>> Another way is to have a new gdbarch hook that calculates the size >>> of an instruction based on the current PC, mapping symbols etc. >>> >>>> >>>> Kind Regards >>>> >>>> Zied Guermazi >>>> >>>> >> >> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 21:47 ` Zied Guermazi @ 2021-04-05 22:04 ` Luis Machado 2021-04-05 22:12 ` Zied Guermazi 0 siblings, 1 reply; 8+ messages in thread From: Luis Machado @ 2021-04-05 22:04 UTC (permalink / raw) To: Zied Guermazi, gdb Hi Zied, On 4/5/21 6:47 PM, Zied Guermazi wrote: > hi Luis, > > thanks for your support. To experiment the impact of removing the > printing of the instruction on the overall performance, I commented out > setting and using the print function pointer in print_insn (bfd_vma pc, > struct disassemble_info *info, bfd_boolean little) in opcodes/arm-dis.c, > and the result was very interesting: The time needed to process the > traces dropped down from 12 minutes to 34 seconds for 64 MB of traces. That is quite a bottleneck! I think this code path isn't exercised often. > > now that we have a proof that the bottleneck was printing, we can think > about a way to provide a clean implementation. I agree. A faster implementation of this particular function would be nice to have. It may even improve some other code paths that use this information. > > Kind Regards > > Zied Guermazi > > > On 05.04.21 18:40, Luis Machado wrote: >> On 4/5/21 1:17 PM, Zied Guermazi wrote: >>> hi Luis >>> >>> A new member function in "class gdb_disassembler" to calculate the >>> instruction length only will be a good solution. In fact a big >>> overhead is added by the printing of instruction disassembly, which >>> is not needed at all. On aarch64, the decoder is optimized to issue >>> many instruction in one trace element, and here calculating the size >>> consumes more than 80% of the time. On arm, the decoder issues one >>> instruction after another and here getting the size consumes 50% of >>> the time. Considering the amount of traces this can sum up to a dozen >>> of minutes in some cases (64MB of traces) >> >> Indeed, that doesn't sound good. >> >>> >>> Calculating the instruction size per se, on arm is a "rapid" >>> operation and consists of checking few bits in the opcode. So the >>> time can be drastically decreased by having a function to calculate >>> the size only. >>> >>> >>> gdb_print_insn can be then changed as following (pseudo code): >>> >>> int >>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, >>> struct ui_file *stream, int *branch_delay_insns) >>> { >>> >>> gdb_disassembler di (gdbarch, stream); >>> >>> if ( di.get_insn_size != 0) >>> >>> return di.get_insn_size(memaddr); >>> >>> else >>> >>> return di.print_insn (memaddr, branch_delay_insns); >>> } >>> >>> Is there a function in aarch64-tdep or arm-tdep doing job of >>> disassembly ( the lower layer handling the opcode)? are we relaying >>> on the bfd library for it? can someone give me a hint of where to >>> find those functions? >> >> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and >> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper functions >> and do some initial setup, but the code to disassemble lies in >> opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c >> (print_insn_aarch64). >> >> If you go with the route of changing "class gdb_disassembler", then >> you'll probably need to touch binutils/opcodes. >> >> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), then >> you only need to change GDB. >>> >>> >>> Kind Regards >>> >>> Zied Guermazi >>> >>> >>> On 05.04.21 15:01, Luis Machado wrote: >>>> Hi Zied, >>>> >>>> On 4/4/21 4:59 AM, Zied Guermazi wrote: >>>>> hi >>>>> >>>>> I need to get the size of the instruction at a given address. I am >>>>> currently using gdb_insn_length (struct gdbarch *gdbarch, CORE_ADDR >>>>> addr) which calls gdb_print_insn (struct gdbarch *gdbarch, >>>>> CORE_ADDR memaddr, struct ui_file *stream, int >>>>> *branch_delay_insns). and this is consuming a huge time, >>>>> considering that this is used in branch tracing and this gets >>>>> repeated up to few millions times. >>>>> >>>>> >>>>> Is there a lean way for getting the size of the instruction at a >>>>> given address, I am using it for aarch64 and arm targets. >>>> >>>> At the moment I don't think there is an optimal solution for this. >>>> The instruction length is calculated as part of the disassemble >>>> process, and is tied to the function that prints instructions. >>>> >>>> One way to speed things up is to have a new member function in >>>> "class gdb_disassembler" to calculate the instruction length only. >>>> >>>> Another way is to have a new gdbarch hook that calculates the size >>>> of an instruction based on the current PC, mapping symbols etc. >>>> >>>>> >>>>> Kind Regards >>>>> >>>>> Zied Guermazi >>>>> >>>>> >>> >>> > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 22:04 ` Luis Machado @ 2021-04-05 22:12 ` Zied Guermazi 2021-04-05 22:15 ` Luis Machado 0 siblings, 1 reply; 8+ messages in thread From: Zied Guermazi @ 2021-04-05 22:12 UTC (permalink / raw) To: Luis Machado, gdb Hi Luis, yes, it guess it was intended for processing disassemble command. Itwas not intended to be used in performance critical use cases. Once it was removed, the next bottle neck is the printf in get_all_disassembler_options ( a string was used as a mean for passing options). it consumes 20% of the time. Shall we put the changes needed to increase the performance in the "etm for branch tracing" patch set, or in a dedicated one (performance improvement one). please advicse /Zied On 06.04.21 00:04, Luis Machado wrote: > Hi Zied, > > On 4/5/21 6:47 PM, Zied Guermazi wrote: >> hi Luis, >> >> thanks for your support. To experiment the impact of removing the >> printing of the instruction on the overall performance, I commented >> out setting and using the print function pointer in print_insn >> (bfd_vma pc, struct disassemble_info *info, bfd_boolean little) in >> opcodes/arm-dis.c, and the result was very interesting: The time >> needed to process the traces dropped down from 12 minutes to 34 >> seconds for 64 MB of traces. > > That is quite a bottleneck! I think this code path isn't exercised often. > >> >> now that we have a proof that the bottleneck was printing, we can >> think about a way to provide a clean implementation. > > I agree. A faster implementation of this particular function would be > nice to have. It may even improve some other code paths that use this > information. > >> >> Kind Regards >> >> Zied Guermazi >> >> >> On 05.04.21 18:40, Luis Machado wrote: >>> On 4/5/21 1:17 PM, Zied Guermazi wrote: >>>> hi Luis >>>> >>>> A new member function in "class gdb_disassembler" to calculate the >>>> instruction length only will be a good solution. In fact a big >>>> overhead is added by the printing of instruction disassembly, which >>>> is not needed at all. On aarch64, the decoder is optimized to issue >>>> many instruction in one trace element, and here calculating the >>>> size consumes more than 80% of the time. On arm, the decoder issues >>>> one instruction after another and here getting the size consumes >>>> 50% of the time. Considering the amount of traces this can sum up >>>> to a dozen of minutes in some cases (64MB of traces) >>> >>> Indeed, that doesn't sound good. >>> >>>> >>>> Calculating the instruction size per se, on arm is a "rapid" >>>> operation and consists of checking few bits in the opcode. So the >>>> time can be drastically decreased by having a function to calculate >>>> the size only. >>>> >>>> >>>> gdb_print_insn can be then changed as following (pseudo code): >>>> >>>> int >>>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, >>>> struct ui_file *stream, int *branch_delay_insns) >>>> { >>>> >>>> gdb_disassembler di (gdbarch, stream); >>>> >>>> if ( di.get_insn_size != 0) >>>> >>>> return di.get_insn_size(memaddr); >>>> >>>> else >>>> >>>> return di.print_insn (memaddr, branch_delay_insns); >>>> } >>>> >>>> Is there a function in aarch64-tdep or arm-tdep doing job of >>>> disassembly ( the lower layer handling the opcode)? are we relaying >>>> on the bfd library for it? can someone give me a hint of where to >>>> find those functions? >>> >>> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and >>> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper >>> functions and do some initial setup, but the code to disassemble >>> lies in opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c >>> (print_insn_aarch64). >>> >>> If you go with the route of changing "class gdb_disassembler", then >>> you'll probably need to touch binutils/opcodes. >>> >>> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), >>> then you only need to change GDB. >>>> >>>> >>>> Kind Regards >>>> >>>> Zied Guermazi >>>> >>>> >>>> On 05.04.21 15:01, Luis Machado wrote: >>>>> Hi Zied, >>>>> >>>>> On 4/4/21 4:59 AM, Zied Guermazi wrote: >>>>>> hi >>>>>> >>>>>> I need to get the size of the instruction at a given address. I >>>>>> am currently using gdb_insn_length (struct gdbarch *gdbarch, >>>>>> CORE_ADDR addr) which calls gdb_print_insn (struct gdbarch >>>>>> *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int >>>>>> *branch_delay_insns). and this is consuming a huge time, >>>>>> considering that this is used in branch tracing and this gets >>>>>> repeated up to few millions times. >>>>>> >>>>>> >>>>>> Is there a lean way for getting the size of the instruction at a >>>>>> given address, I am using it for aarch64 and arm targets. >>>>> >>>>> At the moment I don't think there is an optimal solution for this. >>>>> The instruction length is calculated as part of the disassemble >>>>> process, and is tied to the function that prints instructions. >>>>> >>>>> One way to speed things up is to have a new member function in >>>>> "class gdb_disassembler" to calculate the instruction length only. >>>>> >>>>> Another way is to have a new gdbarch hook that calculates the size >>>>> of an instruction based on the current PC, mapping symbols etc. >>>>> >>>>>> >>>>>> Kind Regards >>>>>> >>>>>> Zied Guermazi >>>>>> >>>>>> >>>> >>>> >> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: A lean way for getting the size of the instruction at a given address 2021-04-05 22:12 ` Zied Guermazi @ 2021-04-05 22:15 ` Luis Machado 0 siblings, 0 replies; 8+ messages in thread From: Luis Machado @ 2021-04-05 22:15 UTC (permalink / raw) To: Zied Guermazi, gdb Zied, On 4/5/21 7:12 PM, Zied Guermazi wrote: > Hi Luis, > > yes, it guess it was intended for processing disassemble command. Itwas > not intended to be used in performance critical use cases. Once it was > removed, the next bottle neck is the printf in > get_all_disassembler_options ( a string was used as a mean for passing > options). it consumes 20% of the time. > > Shall we put the changes needed to increase the performance in the "etm > for branch tracing" patch set, or in a dedicated one (performance > improvement one). please advicse This would be best as a separate patch. It will be easier to review that way. You may need to submit the change to both gdb/binutils lists, if the patch touches both projects. > > /Zied > > On 06.04.21 00:04, Luis Machado wrote: >> Hi Zied, >> >> On 4/5/21 6:47 PM, Zied Guermazi wrote: >>> hi Luis, >>> >>> thanks for your support. To experiment the impact of removing the >>> printing of the instruction on the overall performance, I commented >>> out setting and using the print function pointer in print_insn >>> (bfd_vma pc, struct disassemble_info *info, bfd_boolean little) in >>> opcodes/arm-dis.c, and the result was very interesting: The time >>> needed to process the traces dropped down from 12 minutes to 34 >>> seconds for 64 MB of traces. >> >> That is quite a bottleneck! I think this code path isn't exercised often. >> >>> >>> now that we have a proof that the bottleneck was printing, we can >>> think about a way to provide a clean implementation. >> >> I agree. A faster implementation of this particular function would be >> nice to have. It may even improve some other code paths that use this >> information. >> >>> >>> Kind Regards >>> >>> Zied Guermazi >>> >>> >>> On 05.04.21 18:40, Luis Machado wrote: >>>> On 4/5/21 1:17 PM, Zied Guermazi wrote: >>>>> hi Luis >>>>> >>>>> A new member function in "class gdb_disassembler" to calculate the >>>>> instruction length only will be a good solution. In fact a big >>>>> overhead is added by the printing of instruction disassembly, which >>>>> is not needed at all. On aarch64, the decoder is optimized to issue >>>>> many instruction in one trace element, and here calculating the >>>>> size consumes more than 80% of the time. On arm, the decoder issues >>>>> one instruction after another and here getting the size consumes >>>>> 50% of the time. Considering the amount of traces this can sum up >>>>> to a dozen of minutes in some cases (64MB of traces) >>>> >>>> Indeed, that doesn't sound good. >>>> >>>>> >>>>> Calculating the instruction size per se, on arm is a "rapid" >>>>> operation and consists of checking few bits in the opcode. So the >>>>> time can be drastically decreased by having a function to calculate >>>>> the size only. >>>>> >>>>> >>>>> gdb_print_insn can be then changed as following (pseudo code): >>>>> >>>>> int >>>>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr, >>>>> struct ui_file *stream, int *branch_delay_insns) >>>>> { >>>>> >>>>> gdb_disassembler di (gdbarch, stream); >>>>> >>>>> if ( di.get_insn_size != 0) >>>>> >>>>> return di.get_insn_size(memaddr); >>>>> >>>>> else >>>>> >>>>> return di.print_insn (memaddr, branch_delay_insns); >>>>> } >>>>> >>>>> Is there a function in aarch64-tdep or arm-tdep doing job of >>>>> disassembly ( the lower layer handling the opcode)? are we relaying >>>>> on the bfd library for it? can someone give me a hint of where to >>>>> find those functions? >>>> >>>> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and >>>> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper >>>> functions and do some initial setup, but the code to disassemble >>>> lies in opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c >>>> (print_insn_aarch64). >>>> >>>> If you go with the route of changing "class gdb_disassembler", then >>>> you'll probably need to touch binutils/opcodes. >>>> >>>> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep), >>>> then you only need to change GDB. >>>>> >>>>> >>>>> Kind Regards >>>>> >>>>> Zied Guermazi >>>>> >>>>> >>>>> On 05.04.21 15:01, Luis Machado wrote: >>>>>> Hi Zied, >>>>>> >>>>>> On 4/4/21 4:59 AM, Zied Guermazi wrote: >>>>>>> hi >>>>>>> >>>>>>> I need to get the size of the instruction at a given address. I >>>>>>> am currently using gdb_insn_length (struct gdbarch *gdbarch, >>>>>>> CORE_ADDR addr) which calls gdb_print_insn (struct gdbarch >>>>>>> *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int >>>>>>> *branch_delay_insns). and this is consuming a huge time, >>>>>>> considering that this is used in branch tracing and this gets >>>>>>> repeated up to few millions times. >>>>>>> >>>>>>> >>>>>>> Is there a lean way for getting the size of the instruction at a >>>>>>> given address, I am using it for aarch64 and arm targets. >>>>>> >>>>>> At the moment I don't think there is an optimal solution for this. >>>>>> The instruction length is calculated as part of the disassemble >>>>>> process, and is tied to the function that prints instructions. >>>>>> >>>>>> One way to speed things up is to have a new member function in >>>>>> "class gdb_disassembler" to calculate the instruction length only. >>>>>> >>>>>> Another way is to have a new gdbarch hook that calculates the size >>>>>> of an instruction based on the current PC, mapping symbols etc. >>>>>> >>>>>>> >>>>>>> Kind Regards >>>>>>> >>>>>>> Zied Guermazi >>>>>>> >>>>>>> >>>>> >>>>> >>> > ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-04-05 22:15 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <295a186e-0dd9-fb96-671a-3df0a5611dd9@trande.de> 2021-04-04 7:59 ` A lean way for getting the size of the instruction at a given address Zied Guermazi 2021-04-05 13:01 ` Luis Machado 2021-04-05 16:17 ` Zied Guermazi 2021-04-05 16:40 ` Luis Machado 2021-04-05 21:47 ` Zied Guermazi 2021-04-05 22:04 ` Luis Machado 2021-04-05 22:12 ` Zied Guermazi 2021-04-05 22:15 ` Luis Machado
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).