From: Zied Guermazi
To: Luis Machado, "gdb@sourceware.org"
Subject: Re: A lean way for getting the size of the instruction at a given address
Date: Tue, 6 Apr 2021 00:12:06 +0200
List-Id: Gdb mailing list

Hi Luis,

yes, I guess it was intended for processing the disassemble command. It
was not intended to be used in performance-critical use cases.

Once it was removed, the next bottleneck is the printf in
get_all_disassembler_options (a string is used as the means of passing
options). It consumes 20% of the time.

Shall we put the changes needed to increase the performance in the "etm
for branch tracing" patch set, or in a dedicated one (a performance
improvement one)?

Please advise.

/Zied

On 06.04.21 00:04, Luis Machado wrote:
> Hi Zied,
>
> On 4/5/21 6:47 PM, Zied Guermazi wrote:
>> hi Luis,
>>
>> thanks for your support. To measure the impact of removing the
>> printing of the instruction on the overall performance, I commented
>> out setting and using the print function pointer in print_insn
>> (bfd_vma pc, struct disassemble_info *info, bfd_boolean little) in
>> opcodes/arm-dis.c, and the result was very interesting: the time
>> needed to process the traces dropped from 12 minutes to 34
>> seconds for 64 MB of traces.
>
> That is quite a bottleneck! I think this code path isn't exercised often.
>
>>
>> Now that we have proof that the bottleneck was printing, we can
>> think about a way to provide a clean implementation.
>
> I agree. A faster implementation of this particular function would be
> nice to have. It may even improve some other code paths that use this
> information.
>
>>
>> Kind Regards
>>
>> Zied Guermazi
>>
>>
>> On 05.04.21 18:40, Luis Machado wrote:
>>> On 4/5/21 1:17 PM, Zied Guermazi wrote:
>>>> hi Luis
>>>>
>>>> A new member function in "class gdb_disassembler" to calculate the
>>>> instruction length only will be a good solution. In fact a big
>>>> overhead is added by the printing of instruction disassembly, which
>>>> is not needed at all.
>>>> On aarch64, the decoder is optimized to issue
>>>> many instructions in one trace element, and here calculating the
>>>> size consumes more than 80% of the time. On arm, the decoder issues
>>>> one instruction after another and here getting the size consumes
>>>> 50% of the time. Considering the amount of traces, this can sum up
>>>> to a dozen minutes in some cases (64 MB of traces).
>>>
>>> Indeed, that doesn't sound good.
>>>
>>>>
>>>> Calculating the instruction size per se on arm is a "rapid"
>>>> operation and consists of checking a few bits in the opcode. So the
>>>> time can be drastically decreased by having a function to calculate
>>>> the size only.
>>>>
>>>>
>>>> gdb_print_insn can then be changed as follows (pseudo code):
>>>>
>>>> int
>>>> gdb_print_insn (struct gdbarch *gdbarch, CORE_ADDR memaddr,
>>>>                 struct ui_file *stream, int *branch_delay_insns)
>>>> {
>>>>   gdb_disassembler di (gdbarch, stream);
>>>>
>>>>   /* Fast path: use the size-only function when one is provided.  */
>>>>   if (di.get_insn_size != 0)
>>>>     return di.get_insn_size (memaddr);
>>>>   else
>>>>     /* Fallback: full disassembly, which also returns the length.  */
>>>>     return di.print_insn (memaddr, branch_delay_insns);
>>>> }
>>>>
>>>> Is there a function in aarch64-tdep or arm-tdep doing the job of
>>>> disassembly (the lower layer handling the opcode)? Are we relying
>>>> on the bfd library for it? Can someone give me a hint of where to
>>>> find those functions?
>>>
>>> The gdbarch hooks in arm-tdep.c (gdb_print_insn_arm) and
>>> aarch64-tdep.c (aarch64_gdb_print_insn) are more like helper
>>> functions and do some initial setup, but the code to disassemble
>>> lies in opcodes/arm-dis.c (print_insn) and opcodes/aarch64-dis.c
>>> (print_insn_aarch64).
>>>
>>> If you go the route of changing "class gdb_disassembler", then
>>> you'll probably need to touch binutils/opcodes.
>>>
>>> If you decide to have a gdbarch hook (in arm-tdep/aarch64-tdep),
>>> then you only need to change GDB.
>>>>
>>>>
>>>> Kind Regards
>>>>
>>>> Zied Guermazi
>>>>
>>>>
>>>> On 05.04.21 15:01, Luis Machado wrote:
>>>>> Hi Zied,
>>>>>
>>>>> On 4/4/21 4:59 AM, Zied Guermazi wrote:
>>>>>> hi
>>>>>>
>>>>>> I need to get the size of the instruction at a given address. I
>>>>>> am currently using gdb_insn_length (struct gdbarch *gdbarch,
>>>>>> CORE_ADDR addr), which calls gdb_print_insn (struct gdbarch
>>>>>> *gdbarch, CORE_ADDR memaddr, struct ui_file *stream, int
>>>>>> *branch_delay_insns), and this is consuming a huge amount of time,
>>>>>> considering that it is used in branch tracing and gets
>>>>>> repeated up to a few million times.
>>>>>>
>>>>>>
>>>>>> Is there a lean way of getting the size of the instruction at a
>>>>>> given address? I am using it for aarch64 and arm targets.
>>>>>
>>>>> At the moment I don't think there is an optimal solution for this.
>>>>> The instruction length is calculated as part of the disassembly
>>>>> process, and is tied to the function that prints instructions.
>>>>>
>>>>> One way to speed things up is to have a new member function in
>>>>> "class gdb_disassembler" to calculate the instruction length only.
>>>>>
>>>>> Another way is to have a new gdbarch hook that calculates the size
>>>>> of an instruction based on the current PC, mapping symbols etc.
>>>>>
>>>>>>
>>>>>> Kind Regards
>>>>>>
>>>>>> Zied Guermazi
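
For reference, the "checking a few bits in the opcode" step mentioned in
the thread can be sketched as below. This is only an illustration under
stated assumptions, not code from GDB or opcodes: the names insn_set,
thumb_insn_size and insn_size are hypothetical, and the caller is assumed
to already know which instruction set is in effect at the address (for
example from mapping symbols) and to have read the first halfword from
memory. The sketch relies on the architectural encoding rule that A64 and
A32 instructions are always 4 bytes, while a Thumb instruction is 4 bytes
when bits [15:11] of its first halfword are 0b11101, 0b11110 or 0b11111,
and 2 bytes otherwise.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; none of these exist in GDB.
enum class insn_set { a64, a32, t32 };

// Size in bytes of a Thumb instruction, given its first halfword.
// Bits [15:13] must all be set and bits [12:11] must not both be
// zero for the halfword to start a 32-bit encoding.
constexpr std::size_t
thumb_insn_size (std::uint16_t first_halfword)
{
  return ((first_halfword & 0xe000) == 0xe000
          && (first_halfword & 0x1800) != 0) ? 4 : 2;
}

// Size in bytes of the instruction at some address, assuming the
// caller has already determined the instruction set in effect there.
// For A64 and A32 the halfword argument is ignored.
constexpr std::size_t
insn_size (insn_set set, std::uint16_t first_halfword)
{
  return set == insn_set::t32 ? thumb_insn_size (first_halfword) : 4;
}
```

For example, thumb_insn_size (0x4770) (the encoding of "bx lr") yields 2,
and thumb_insn_size (0xf000) (a possible first halfword of "bl") yields 4.
No string formatting or disassembler options are involved, which is the
part the measurements in the thread identify as expensive.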