From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 326C83858439 for ; Tue, 19 Jul 2022 15:33:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 326C83858439 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6B6E51570; Tue, 19 Jul 2022 08:33:39 -0700 (PDT) Received: from [10.2.78.52] (unknown [10.2.78.52]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 767273F70D; Tue, 19 Jul 2022 08:33:38 -0700 (PDT) Message-ID: Date: Tue, 19 Jul 2022 16:33:37 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Subject: Re: [PATCHv4 2/2] libopcodes/aarch64: add support for disassembler styling Content-Language: en-GB To: Andrew Burgess , binutils@sourceware.org References: From: Richard Earnshaw In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3490.7 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, NICE_REPLY_A, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2022 15:33:41 -0000 On 08/07/2022 11:25, Andrew Burgess via Binutils wrote: > This commit enables disassembler styling for AArch64. After this > commit it is possible to have objdump style AArch64 disassembler > output (using --disassembler-color option). Once the required GDB > patches are merged, GDB will also style the disassembler output. > > The changes to support styling are mostly split between two files > opcodes/aarch64-dis.c and opcodes/aarch64-opc.c. > > The entry point for the AArch64 disassembler can be found in > aarch64-dis.c, this file handles printing the instruction mnemonics, > and assembler directives (e.g. '.byte', '.word', etc). Some operands, > mostly relating to assembler directives are also printed from this > file. This commit changes all of this to pass through suitable > styling information. > > However, for most "normal" instructions, the instruction operands are > printed using a two step process. From aarch64-dis.c, in the > print_operands function, the function aarch64_print_operand is called, > this function is in aarch64-opc.c, and converts an instruction operand > into a string. Then, back in print_operands (aarch64-dis.c), the > operand string is printed. > > Unfortunately, the string returned by aarch64_print_operand can be > quite complex, it will include syntax elements, like '[' and ']', in > addition to register names and immediate values. In some cases, a > single operand will expand into what will appear (to the user) as > multiple operands separated with a ','. > > This makes the task of styling more complex, all these different > components need to by styled differently, so we need to get the > styling information out of aarch64_print_operand in some way. > > The solution that I propose here is similar to the solution that I > used for the i386 disassembler. > > Currently, aarch64_print_operand uses snprintf to write the operand > text into a buffer provided by the caller. > > What I propose is that we pass an extra argument to the > aarch64_print_operand function, this argument will be a structure, the > structure contains a callback function and some state. > > When aarch64_print_operand needs to format part of its output this can > be done by using the callback function within the new structure, this > callback returns a string with special embedded markers that indicate > which mode should be used for each piece of text. Back in > aarch64-dis.c we can spot these special style markers and use this to > split the disassembler output up and apply the correct style to each > piece. > > To make aarch64-opc.c clearer a series of new static functions have > been added, e.g. 'style_reg', 'style_imm', etc. Each of these > functions formats a piece of text in a different style, 'register' and > 'immediate' in this case. > > Here's an example taken from aarch64-opc.c of the new functions in > use: > > snprintf (buf, size, "[%s, %s]!", > style_reg (styler, base), > style_imm (styler, "#%d", opnd->addr.offset.imm)); > > The aarch64_print_operand function is also called from the assembler > to aid in printing diagnostic messages. Right now I have no plans to > add styling to the assembler output, and so, the callback function > used in the assembler ignores the styling information and just returns > an plain string. > > I've used the source files in gas/testsuite/gas/aarch64/ for testing, > and have manually gone through and checked that the styling looks > reasonable, however, I'm not an AArch64 expert, so it is possible that > the odd piece is styled incorrectly. Please point out any mistakes > I've made. > There are several places where you've omitted a format string and instead passed the output of style_...() directly as the format. For example: case AARCH64_OPND_BARRIER_DSB_NXS: - snprintf (buf, size, "%s", opnd->barrier->name); + { + if (opnd->barrier->name[0] == '#') + snprintf (buf, size, style_imm (styler, opnd->barrier->name)); + else + snprintf (buf, size, style_sub_mnem (styler, opnd->barrier->name)); + } break; There's a small risk of that containing a % character itself and thus causing UB, so I think it would be better to use an explicit "%s" argument in those cases. There's one possible typo that I spotted: s/slit/split/? Other than that, I think this is OK. It would be nice if the obstack could be allocated once and then reset after each use, that would avoid having to malloc a new stack each time aarch64_print_operand is called, but it's not a massive overhead. R. > +/* Return a short string to indicate a switch to STYLE. These strings > + will be embedded into the disassembled operand text (as produced by > + aarch64_print_operand), and then spotted in the print_operands function > + so that the disassembler output can be slit by style. */ s/slit/split/?