From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id 29CE13858416 for ; Fri, 22 Oct 2021 13:30:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 29CE13858416 Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 19MDU1qC027579 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 22 Oct 2021 09:30:06 -0400 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 19MDU1qC027579 Received: from [10.0.0.11] (192-222-157-6.qc.cable.ebox.net [192.222.157.6]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id C3FA91ECEB; Fri, 22 Oct 2021 09:30:01 -0400 (EDT) Message-ID: <761bf1d8-293b-7232-83f4-c6c333f75f30@polymtl.ca> Date: Fri, 22 Oct 2021 09:30:01 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: [PATCH 5/5] gdb/python: implement the print_insn extension language hook Content-Language: en-US To: Andrew Burgess , gdb-patches@sourceware.org References: From: Simon Marchi In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Fri, 22 Oct 2021 13:30:01 +0000 X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Oct 2021 13:30:10 -0000 Hi Andrew, I don't have time to read all the code, so I'll just nit-pick on the public API. On 2021-10-13 17:59, Andrew Burgess wrote: > This commit extends the Python API to include disassembler support, > and additionally provides a syntax highlighting disassembler. > > The motivation for this commit was to provide an API by which the user > could write Python scripts that would augment the output of the > disassembler. > > To achieve this I have followed the model of the existing libopcodes > disassembler, that is, instructions are disassembled one by one. This > does restrict the type of things that it is possible to do from a > Python script, i.e. all additional output has to fit on a single line, > but this was all I needed, and creating something more complex would, > I think, require greater changes to how GDB's internal disassembler > operates. > > It was only once I had a working prototype that I realised I could > very easily use this to perform syntax highlighting on GDB's > disassembly output, so I've included that too. The new commands added > are: > > set style disassembly on|off > show style disassembly > > which enable or disable disassembly syntax highlighting. > > The disassembler API is contained in the new gdb.disassembler module, > which defines the following classes: > > DisassembleInfo > > Similar to libopcodes disassemble_info structure, has read-only > attributes: address, string, length, architecture, and > can_emit_style_escape. And has methods: read_memory, set_result, > and memory_error. > > Each time GDB wants an instruction disassembled, an instance of > this class is passed to a user written disassembler, by reading the > attributes, and calling the methods, the user can perform > disassembly, and set the result within the DisassembleInfo instance. > > Disassembler > > This is a base-class which user written disassemblers should > inherit from, just provides base implementations of __init__ and > __call__ which the user written disassembler should override. > > The gdb.disassembler module also provides the following functions: > > register_disassembler > > This function registers an instance of a Disassembler sub-class > as a disassembler, either for one specific architecture, or, as a > global disassembler for all architectures. > > format_address > > This wraps GDB's print_address function, converting an address > into a string that can be placed into disassembler output. > > syntax_highlight > > This adds syntax highlighting escapes to some disassembler > output, users can call this from their own custom disassemblers to > retain syntax highlighting, this function handles switching syntax > highlighting off, or the case where the pygments library is not > available. > > builtin_disassemble > > This provides access to GDB's builtin disassembler. A common > user case that I see is augmenting the existing disassembler > output. The user code can call this function to have GDB > disassemble the instruction in the normal way, and then the user can > tweak the output before returning that as the result. This function > also provides a mechanism to intercept the disassemblers reads of > memory, thus the user can adjust what GDB sees when it is > disassembling. > > The included documentation provides a more detailed description of the > API. > --- > gdb/Makefile.in | 1 + > gdb/NEWS | 42 ++ > gdb/data-directory/Makefile.in | 1 + > gdb/disasm.c | 5 +- > gdb/disasm.h | 13 +- > gdb/doc/gdb.texinfo | 14 + > gdb/doc/python.texi | 252 +++++++ > gdb/python/lib/gdb/disassembler.py | 194 ++++++ > gdb/python/py-arch.c | 9 + > gdb/python/py-disasm.c | 905 +++++++++++++++++++++++++ > gdb/python/python-internal.h | 21 + > gdb/python/python.c | 11 +- > gdb/testsuite/gdb.base/style.exp | 45 +- > gdb/testsuite/gdb.python/py-disasm.c | 25 + > gdb/testsuite/gdb.python/py-disasm.exp | 201 ++++++ > gdb/testsuite/gdb.python/py-disasm.py | 538 +++++++++++++++ > 16 files changed, 2267 insertions(+), 10 deletions(-) > create mode 100644 gdb/python/lib/gdb/disassembler.py > create mode 100644 gdb/python/py-disasm.c > create mode 100644 gdb/testsuite/gdb.python/py-disasm.c > create mode 100644 gdb/testsuite/gdb.python/py-disasm.exp > create mode 100644 gdb/testsuite/gdb.python/py-disasm.py > > diff --git a/gdb/Makefile.in b/gdb/Makefile.in > index ec5d332c145..3981cc9507c 100644 > --- a/gdb/Makefile.in > +++ b/gdb/Makefile.in > @@ -392,6 +392,7 @@ SUBDIR_PYTHON_SRCS = \ > python/py-breakpoint.c \ > python/py-cmd.c \ > python/py-continueevent.c \ > + python/py-disasm.c \ > python/py-event.c \ > python/py-evtregistry.c \ > python/py-evts.c \ > diff --git a/gdb/NEWS b/gdb/NEWS > index d001a03145d..fd1952a2f59 100644 > --- a/gdb/NEWS > +++ b/gdb/NEWS > @@ -32,6 +32,12 @@ maint show internal-warning backtrace > internal-error, or an internal-warning. This is on by default for > internal-error and off by default for internal-warning. > > +set style disassembly on|off > +show style disassembly > + If GDB is compiled with Python support, and the Python pygments > + module is available, then, when this setting is on, disassembler > + output will have styling applied. > + > * Python API > > ** New function gdb.add_history(), which takes a gdb.Value object > @@ -49,6 +55,42 @@ maint show internal-warning backtrace > containing all of the possible Architecture.name() values. Each > entry is a string. > > + ** New Python API for wrapping GDB's disassembler: > + > + - gdb.disassembler.register_disassembler(DISASSEMBLER, ARCH). > + DISASSEMBLER is a sub-class of gdb.disassembler.Disassembler. > + ARCH is either None or a string containing a bfd architecture > + name. DISASSEMBLER is registered as a disassembler for > + architecture ARCH, or for all architectures if ARCH is None. > + The previous disassembler registered for ARCH is returned, this > + can be None if no previous disassembler was registered. > + > + - gdb.disassembler.Disassembler is the class from which all > + disassemblers should inherit. Its constructor takes a string, > + a name for the disassembler, which is currently only used is > + some debug output. Sub-classes should override the __call__ > + method to perform disassembly, invoking __call__ on this base > + class will raise an exception. > + > + - gdb.disassembler.DisassembleInfo is the class used to describe > + a single disassembly request from GDB. An instace of this instace -> instance > + class is passed to the __call__ method of > + gdb.disassembler.Disassembler and has the following read-only > + attributes: 'address', 'string', 'length', 'architecture', > + 'can_emit_style_escape', and the following methods > + 'read_memory', 'set_result', and 'memory error'. Just wondering, why having a "set_result" method instead of just having the __call__ method return something? You probably mean 'memory_error' instead of 'memory error'. But can you explain when you expect users to manually call "memory_error"? I would expect that calling read_memory may raise a gdb.MemoryError, but when would the user manually generate a memory error? And regardless of the above, I think it would be more Pythonic to have the user raise an exception to signal an error, instead of calling a method. I'm not sure I understand the use case of calling set_result and / or memory_error more than once, and have one overwrite the other. > + > + - gdb.disassembler.format_address(ARCHITECTURE, ADDRESS), formats > + an address into a string so that the string can be included in > + the disassembler output. ARCHITECTURE is a gdb.Architecture > + object. Would it make sense to have that as a "gdb.Architecture.format_address(ADDRESS)" method instead? I'm thinking that you might want to use this in other contexts than disassembly. You could always use gdb.disassembly.format_address anyway, but it would be weird to use the gdb.disassembly module for something not disassembly-related. Simon