From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-sender-0.a4lg.com (mail-sender.a4lg.com [153.120.152.154]) by sourceware.org (Postfix) with ESMTPS id 3E46A3858D3C for ; Mon, 28 Nov 2022 04:44:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3E46A3858D3C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=irq.a4lg.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=irq.a4lg.com Received: from [127.0.0.1] (localhost [127.0.0.1]) by mail-sender-0.a4lg.com (Postfix) with ESMTPSA id D6C2D300089; Mon, 28 Nov 2022 04:44:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irq.a4lg.com; s=2017s01; t=1669610645; bh=gHXko+kS/fNZ3LY+uSFjeylS7BpsrOPl40VRmWta52w=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Mime-Version:Content-Transfer-Encoding; b=Y+KXSUw3WgruR0q7yvznai9LSS4+5A5EpEDyATfwyhLvakphVi1Uj682qbmYIyvWt ZLgcvIQP8F4ELYw7hCstOBD3H30KdYBdFmDNJ7ayMVjzhMIi0J/SyziCACgzUmyaG4 NF8qYWRVBgmYkVXfQzCaZf5JC1fZJSf86ceEjOgU= From: Tsukasa OI To: Tsukasa OI , Nelson Chu , Kito Cheng , Palmer Dabbelt Cc: binutils@sourceware.org Subject: [PATCH v2 00/11] RISC-V: Requirements for disassembler optimizations batch 1 Date: Mon, 28 Nov 2022 04:43:35 +0000 Message-Id: In-Reply-To: References: Mime-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,KAM_ASCII_DIVIDERS,KAM_NUMSUBJECT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hello, To improve the core disassembler (both for performance and feature), this patchset now prepares for further optimization. It will be a prerequisite of upcoming core disassembler changes including two major optimizations. Tracker on GitHub (with detailed benchmark results): This is the part 2 of 4-part project: [Changes: v1 -> v2] - Rebased against commit 97f006bc56af: "RISC-V: Better support for long instructions (disassembler)" Individual changes are detailed in each patch. In this cover letter, I will explain... - Aggregate Performance Improvements - Some Backgrounds behind some patches 1. Aggregate Performance Improvements 2. Split Match/Print Steps on the Disassembler (PATCH 8/11) 3. State Initialization (PATCH 9/11) 3.1. Disassembler Lifecycle 3.2. Moving the Initialization 3.3. Fixed Issues as a Side Effect 3.4. Fixed Issue 1: Unsetting Options on GDB (and new smaller issue) 3.5. Fixed Issue 2: Location of Error Message 4. Architecture Initialization and Management (PATCH 10/11) Aggregate Performance Improvements =================================== Since commit 40f1a1a4564b ("RISC-V: Output mapping symbols with ISA string." ), the RISC-V disassembler was slower than before. I'm not blaming this because it has a good reason to be changed (to support mapping symbols with ISA string). Despite that this patchset itself is not main optimization part, it roughly restores the disassembler performance before that commit (if the input ELF file contains mapping symbols with ISA string, this patchset makes disassembling such files significantly faster). Note that, this patchset contains two "optimization" commits that significantly (>30%) improves the performance under certain conditions: - When thousands of CSR instructions are being disassembled (PATCH 5/11) - When disassembling large chunk of code using GDB (not objdump) (PATCH 10/11) - When the input has mapping symbols with ISA string (both on GDB/objdump) (PATCH 10/11) Split Match/Print Steps on the Disassembler (PATCH 8/11) ========================================================= This is more like a future-proof commit. riscv_disassemble_insn function (before this patch) works like this: FOREACH (opcode) { if (!opcode.MATCH(insn)) continue; // // Print insn using matched opcode info. // return insnlen; } // // opcode not matched. // But if we split this to two steps: matching and printing like this: matched_opcode = NULL; // Step 1: opcode matching FOREACH (opcode) { if (!opcode.MATCH(insn)) continue; matched_opcode = opcode; break; } // Step 2: opcode printing if (matched_opcode != NULL) { // // Print insn using matched_opcode info. // return insnlen; } // // opcode not matched. // We can independently improve the opcode matching step and makes a room to implement more complex opcode matching. Despite that this patch alone imposes a small amount of performance penalty, it can be easily paid back by other optimizations. State Initialization (PATCH 9/11) ================================== Disassembler Lifecycle ----------------------- In arch-independent libopcodes, we use following functions to disassemble the object file/memory range: - disassembler To get real disassembler function for specific BFD and architecture. The result is print_insn_riscv on RISC-V. - (the result of "disassembler") as explained above. - disassemble_init_for_target This function initializes disassemble_info structure (e.g. allocates target-specific private data and sets certain fields). - disassemble_free_for_target This function frees private_data in the disassemble_info. It also allows the target to run custom logic. Ideally, all disassembler state should be contained within disassemble_info.private_data (like in PowerPC) but many (including Arm, MIPS and RISC-V) don't do that. It means that we should handle this abstract "disassembler" as a singleton for now. It's used from objdump (binutils) and GDB but note that they have a different execution order. This is shown below: objdump (completely target-independent): (Source) - binutils/objdump.c (Note) - This cycle happens per BFD object. This is usually once on objdump but may occur twice or more (or even 0 times) on archive files. (Initialization Step) a1. Call "disassembler" function (with input BFD, arch, endian and mach) a2. Set initial disassemble_info structure contents (including flavour, arch, mach and disassembler_options) a3. Call disassemble_init_for_target (For Every Instruction) b1. Call the result of "disassembler" (print_insn_riscv on RISC-V) (Finalization Step) c1. Call disassemble_free_for_target GDB (some may change through gdbarch but this is the general sequence): (Source) - gdb/disasm.c - gdb/disasm.h - gdb/gdbarch.c - gdb/arch-utils.c (Note) - This cycle happens per one disassembler request (e.g. per one "disas" command). - A disassembler lifecycle is represented as a gdb_disassemble_info class object. (Initialization Step - constructor) a1. Set initial disassemble_info structure contents (including flavour [unknown], arch, mach and disassembler_options) a2. Call disassemble_init_for_target (For Every Instruction - by default) Call trace: 1. gdb_disassembler::print_insn method (gdb_disassembler is a subclass of gdb_disassemble_info) 2. gdb_print_insn_1 function 3. gdbarch_print_insn 4. gdbarch->print_insn (default_print_insn by default but target-overridable) 5. default_print_insn (by default) b1. Call "disassembler" function (with input BFD, and arch, endian and mach as in disassemble_info) b2. Call the result of "disassembler" (print_insn_riscv on RISC-V) (Finalization Step - destructor) c1. Call disassemble_free_for_target In both usecases, disassembler_options, arch and mach are set only once (before disassemble_init_for_target) and never changes in multiple print_insn_riscv calls. That means, we are safe to parse some disassemble_info fields such as mach and disassembler_options in disassemble_init_for_target (it's even expected to do that, considering a side effect I will explain later). Note that we need to take care of two possible calling order: - disassembler -> disassemble_init_for_target (objdump) - disassemble_init_for_target -> disassembler (GDB) Moving the Initialization -------------------------- For instance, PowerPC and WebAssembly targets parse disassembler options in the disassembler_init_for_target call (PowerPC parses mach, too). Many architectures instead try to initialize its state on per-instruction call (in the "result of disassembler" call to be specific) but it's clearly a waste of time and only a tradition (a bad one, to be honest). As objdump works well in PowerPC and WebAssembly and GDB works well in PowerPC (note: GDB for WebAssembly is not implemented), moving RISC-V's disassembler state initialization (including parsing disassembler_options) is safe as PowerPC and makes state management simpler and faster. Some of my optimizations depend on this change otherwise will be too complex and/or doesn't speed up the disassembler as expected. But not everything can be moved. Because symbol table in the disassemble_info (symtab, symtab_size...) is not available on disassembler_init_for_target and disassemble_info object is not available on "diassembler", we have no stable way to initialize pd->gp prior to the first riscv_disassemble_insn call. So, info->private_data == NULL branch on riscv_disassemble_insn is preserved. Fixed Issues as a Side Effect ------------------------------ Also, changing RISC-V disassembler like this has two good side effects: 1. We can really "unset" the disassembler options in the GDB 2. Error message regarding the disassembler options is at the right place on objdump Fixed Issue 1: Unsetting Options on GDB (and new smaller issue) ---------------------------------------------------------------- First, GDB expects to allow "set disassembler-options XXX..." to set disassembler options "XXX..." **and** "set disassembler-options" to **unset** them, reverting to the default. It didn't work like this in RISC-V but my patchset fixes this issue as a side effect. Before: (gdb) set disassembler-options (gdb) disas _start Dump of assembler code for function _start: 0x0000000080000028 <+0>: ret End of assembler dump. (gdb) set disassembler-options no-aliases (gdb) disas _start Dump of assembler code for function _start: 0x0000000080000028 <+0>: c.jr ra End of assembler dump. (gdb) set disassembler-options (gdb) disas _start Dump of assembler code for function _start: 0x0000000080000028 <+0>: c.jr ra <-- should be "ret"! End of assembler dump. After (snip): (gdb) set disassembler-options <-- 1st/2nd time (gdb) disas _start Dump of assembler code for function _start: 0x0000000080000028 <+0>: ret <-- restored! End of assembler dump. This is because, after "set disassembler-options" to unset disassembler options, disassemble_info.disassembler_options is set to NULL, not an empty string. Because current print_insn_riscv function checks whether disassembler_options is NULL and parses the options when it's not, "set disassembler-options" works as "do nothing" (instead of "reset to default") in RISC-V. Beware that, PATCH 9/11 introduces a smaller bug in GDB which suppresses an error message when mismatched "priv-spec" option is specified once before the first disassembly. In the following example, we assume that we are debugging an ELF file with privileged specification 1.11 ELF attributes. It will not occur in objdump. Before (okay): (gdb) file test.elf (gdb) set disassembler-options priv-spec=1.12 (gdb) disas _start Dump of assembler code for function _start: BFD: mis-matched privilege spec set by priv-spec=1.12, the elf privilege attribute is 1.11 0x0000000080000028 <+0>: csrr a0,0x3d0 ^^^^^ "pmpaddr32" CSR but not defined in 1.11. 0x000000008000002c <+4>: ret End of assembler dump. (gdb) disas _start Dump of assembler code for function _start: BFD: mis-matched privilege spec set by priv-spec=1.12, the elf privilege attribute is 1.11 0x0000000080000028 <+0>: csrr a0,0x3d0 0x000000008000002c <+4>: ret End of assembler dump. After PATCH 9/11 (an error message is suppressed): (gdb) file test.elf (gdb) set disassembler-options priv-spec=1.12 (gdb) disas _start Dump of assembler code for function _start: 0x0000000080000028 <+0>: csrr a0,0x3d0 ^^^^^ "pmpaddr32" CSR but not defined in 1.11. An error message just before this line is suppressed but "priv-spec" override does not actually happen. 0x000000008000002c <+4>: ret End of assembler dump. (gdb) disas _start Dump of assembler code for function _start: BFD: mis-matched privilege spec set by priv-spec=1.12, the elf privilege attribute is 1.11 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ correctly printed at the second time (and later). 0x0000000080000028 <+0>: csrr a0,0x3d0 0x000000008000002c <+4>: ret End of assembler dump. This new smaller issue is taken care in my upcoming patchset. Fixed Issue 2: Location of Error Message ----------------------------------------- Second, because PATCH 9/11 changes where error message is generated, this patch makes error message location (when an invalid disassembler option is specified) more "right", more sensible. Before: test.elf: file format elf64-littleriscv Disassembly of section .text: 0000000080000000 <__start>: 80000000: riscv64-unknown-elf-objdump: mis-matched privilege spec set by priv-spec=1.12, the elf privilege attribute is 1.11 00002197 auipc gp,0x2 80000004: 82e18193 addi gp,gp,-2002 # 8000182e <__global_pointer$> ... After: test.elf: file format elf64-littleriscv riscv64-unknown-elf-objdump: mis-matched privilege spec set by priv-spec=1.12, the elf privilege attribute is 1.11 Disassembly of section .text: 0000000080000000 <__start>: 80000000: 00002197 auipc gp,0x2 80000004: 82e18193 addi gp,gp,-2002 # 8000182e <__global_pointer$> ... Architecture Initialization and Management (PATCH 10/11) ========================================================= This commit makes major refinements to the architecture initialization and management code. The purpose of this is to: - Provide easy interface to initialize when the architecture is changed - Enable switching between default architecture (from an ELF attribute or from default string "rv64gc") and the architecture override (from mapping symbol with ISA string). - Avoid repeated initialization cost. Repeated architecture (riscv_rps_dis reinitialization) can occur per instruction on following cases: - When disassembling with GDB (does not occur on objdump due to the execution order of libopcodes shown in the PATCH 9/11 section) - When an input ELF file has mapping symbols with ISA string. Like Nelson's proposal tried to enable switching using riscv_file_subsets, my PATCH 9/11 provides new type "context" to store riscv_subset_list_t, XLEN and management-related variables. The major differences between this change and Nelson's are: - PATCH 10/11 does not reinitialize the architecture (riscv_rps_dis) if the architecture string is not changed. This makes a huge difference in performance. - PATCH 10/11 stores "XLEN per context". The second property provides interesting "feature" although not realistic in the real world: multi-XLEN object disassembling. If an object has mapping symbols with ISA string, we can switch between architectures, EVEN IF XLEN HAS CHANGED (e.g. "$xrv32i" and "$xrv64i"). It's not the default behavior (by default, a XLEN-specific machine either "riscv:rv32" or "riscv:rv64" is set and *this* XLEN is preferred than the mapping symbols) and requires explicitly specifying the XLEN-neutral RISC-V machine (objdump: "-m riscv"). Although it's not realistic in the real world, it may help testing Binutils / GAS. Tsukasa OI (11): opcodes/riscv-dis.c: More tidying RISC-V: Add test for 'Zfinx' register switching RISC-V: Make mapping symbol checking consistent RISC-V: Split riscv_get_map_state into two steps RISC-V: One time CSR hash table initialization RISC-V: Use static xlen on ADDIW sequence opcodes/riscv-dis.c: Add form feed for separation RISC-V: Split match/print steps on disassembler RISC-V: Reorganize disassembler state initialization RISC-V: Reorganize arch-related initialization and management RISC-V: Move disassembler private data initialization gas/testsuite/gas/riscv/mapping-dis.d | 7 + gas/testsuite/gas/riscv/mapping-symbols.d | 4 + gas/testsuite/gas/riscv/mapping.s | 10 + include/dis-asm.h | 1 + opcodes/disassemble.c | 2 +- opcodes/riscv-dis.c | 485 +++++++++++++++------- 6 files changed, 366 insertions(+), 143 deletions(-) base-commit: c341f4676af4f9922ca61e1b093d103ed808ae6e -- 2.38.1