public inbox for binutils@sourceware.org
* [PATCH 0/3] RISC-V: Disassembler Core Optimization 1-1 (Hash table and Caching)
@ 2022-11-20  1:08 Tsukasa OI
  2022-11-20  1:08 ` [PATCH 1/3] RISC-V: Use faster hash table on disassembling Tsukasa OI
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Tsukasa OI @ 2022-11-20  1:08 UTC (permalink / raw)
  To: Tsukasa OI, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils

Hello,

This is Part 3 of a 4-part project to drastically improve disassembler
performance:
<https://github.com/a4lg/binutils-gdb/wiki/proj_dis_perf_improvements_1>

** This patchset does not apply directly to master. **

This patchset requires the following patchset(s) to be applied first:
<https://sourceware.org/pipermail/binutils/2022-November/124378.html>


PATCH 1/3 improves performance when disassembling RISC-V code (which may
also contain invalid data).  It replaces riscv_hash (in
opcodes/riscv-dis.c) with a much faster data structure: a sorted and
partitioned hash table.

This technique is already used for the SPARC architecture
(opcodes/sparc-dis.c), and the author simplified the algorithm even further.
Unlike on SPARC, RISC-V's hashed opcode table is not a table of linked
lists.  It is a plain table whose entries point to the "start" element in
the sorted opcode list for each hash code, plus a global tail.
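
As a rough illustration of the layout described above (not the code from the
patch), the sketch below sorts pointers to opcode entries by hash code and
records where each partition starts.  Every toy_* name, the 7-bit hash, and
the simplified entry struct are assumptions made for this example; the real
riscv-dis.c also has to deal with compressed instructions, which this sketch
ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified opcode entry (not the real struct riscv_opcode).  */
struct toy_opcode
{
  const char *name;
  uint32_t match;
  uint32_t mask;
};

/* Assumption: hash on the 7-bit major opcode field only.  */
#define TOY_HASH_LEN 128u
#define TOY_HASH_IDX(w) ((unsigned) ((w) & (TOY_HASH_LEN - 1u)))
#define TOY_MAX 64u

/* Opcode pointers sorted by hash code.  */
static const struct toy_opcode *toy_sorted[TOY_MAX];
/* toy_starts[h] is the first index with hash h;
   toy_starts[TOY_HASH_LEN] is the global tail.  */
static size_t toy_starts[TOY_HASH_LEN + 1];

static int
toy_compare (const void *a, const void *b)
{
  const struct toy_opcode *x = *(const struct toy_opcode *const *) a;
  const struct toy_opcode *y = *(const struct toy_opcode *const *) b;
  unsigned hx = TOY_HASH_IDX (x->match);
  unsigned hy = TOY_HASH_IDX (y->match);

  if (hx != hy)
    return hx < hy ? -1 : 1;
  /* Tie-break on table position so the original priority order is kept.  */
  return (x > y) - (x < y);
}

static void
toy_build (const struct toy_opcode *table, size_t n)
{
  size_t i, h = 0;

  assert (n <= TOY_MAX);
  for (i = 0; i < n; i++)
    toy_sorted[i] = &table[i];
  qsort (toy_sorted, n, sizeof toy_sorted[0], toy_compare);

  /* Record each partition start; an empty partition shares the start of
     the next non-empty one, so its range has length zero.  */
  for (i = 0; i < n; i++)
    {
      unsigned hi = TOY_HASH_IDX (toy_sorted[i]->match);
      while (h <= hi)
        toy_starts[h++] = i;
    }
  while (h <= TOY_HASH_LEN)
    toy_starts[h++] = n;
}

/* Scan only the partition selected by the hash of WORD.  */
static const struct toy_opcode *
toy_lookup (uint32_t word)
{
  unsigned h = TOY_HASH_IDX (word);
  size_t i;

  for (i = toy_starts[h]; i < toy_starts[h + 1]; i++)
    if ((word & toy_sorted[i]->mask) == toy_sorted[i]->match)
      return toy_sorted[i];
  return NULL;
}
```

Because the end of one partition is simply the start of the next, no
per-bucket list nodes or terminators are needed, and a word whose hash code
has no entries is rejected without visiting any opcode.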

PATCH 3/3 addresses the cost of probing instruction support for every
instruction.  By caching which instruction classes have already been
queried, we no longer have to call the riscv_multi_subset_supports function
for every instruction.  This speeds up disassembling even further.
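
The caching idea can be sketched as follows.  This is only an illustration
under assumptions: the toy_* names, the three-entry class enum, and
toy_probe_support (a stand-in for an expensive query such as
riscv_multi_subset_supports) are all hypothetical, and the point at which
the real code invalidates its cache is not shown in this cover letter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical instruction classes for the example.  */
enum toy_insn_class
{
  TOY_CLASS_I,
  TOY_CLASS_M,
  TOY_CLASS_V,
  TOY_CLASS_COUNT
};

/* Counts how often the expensive probe actually runs.  */
static unsigned toy_probe_calls;

/* Stand-in for an expensive per-class support query.  */
static bool
toy_probe_support (enum toy_insn_class c)
{
  toy_probe_calls++;
  return c != TOY_CLASS_V;	/* Pretend only V is unsupported.  */
}

static bool toy_cached[TOY_CLASS_COUNT];
static bool toy_supported[TOY_CLASS_COUNT];

/* Return whether class C is supported, probing at most once per class.  */
static bool
toy_class_supported (enum toy_insn_class c)
{
  assert (c < TOY_CLASS_COUNT);
  if (!toy_cached[c])
    {
      toy_supported[c] = toy_probe_support (c);
      toy_cached[c] = true;
    }
  return toy_supported[c];
}

/* Forget all cached answers, e.g. after the target ISA changes.  */
static void
toy_invalidate_cache (void)
{
  int i;

  for (i = 0; i < TOY_CLASS_COUNT; i++)
    toy_cached[i] = false;
}
```

With such a cache, the probe runs once per instruction class rather than
once per disassembled instruction, as long as the cache is cleared whenever
the set of enabled extensions changes.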

PATCH 2/3 is not part of the optimization but a safety net that complements
PATCH 1/3.  It allows custom instructions that span multiple major opcodes
(such as both CUSTOM_0 and CUSTOM_1 **in a single instruction**) to be
implemented without breaking the disassembler.  Note that it incurs a big
performance penalty if a vendor implements such an instruction, so if such
an instruction is implemented in the mainline, a separate solution will be
required.
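
One way to picture such a safety net is sketched below.  This is a guess at
the general shape, not the code from the patch: it assumes that an opcode
whose mask does not cover every hash bit is placed in an extra "fallback"
bucket, and that this bucket is scanned after the regular bucket misses.
All toy_* names and the "vendor.op" entry are hypothetical, and the buckets
are found by a plain scan here rather than by a sorted table, to keep the
example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_MASK 0x7fu			/* 7-bit major opcode field.  */
#define TOY_FALLBACK (TOY_HASH_MASK + 1u)	/* Extra bucket number.  */

/* Hypothetical, simplified opcode entry.  */
struct toy_opcode
{
  const char *name;
  uint32_t match;
  uint32_t mask;
};

/* An entry can be hashed only if its mask fixes every hash bit;
   otherwise it may match several hash codes and goes to the fallback.  */
static unsigned
toy_bucket_of (const struct toy_opcode *op)
{
  if ((op->mask & TOY_HASH_MASK) != TOY_HASH_MASK)
    return TOY_FALLBACK;
  return op->match & TOY_HASH_MASK;
}

/* Return the first entry in BUCKET that matches WORD, or NULL.  */
static const struct toy_opcode *
toy_scan (const struct toy_opcode *table, size_t n, unsigned bucket,
	  uint32_t word)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (toy_bucket_of (&table[i]) == bucket
	&& (word & table[i].mask) == table[i].match)
      return &table[i];
  return NULL;
}

/* Try the bucket for WORD's hash first, then the fallback bucket.  */
static const struct toy_opcode *
toy_lookup (const struct toy_opcode *table, size_t n, uint32_t word)
{
  const struct toy_opcode *op
    = toy_scan (table, n, word & TOY_HASH_MASK, word);

  if (op == NULL)
    op = toy_scan (table, n, TOY_FALLBACK, word);
  return op;
}
```

In this sketch, every word that misses its regular bucket (including every
invalid word) also pays for a scan of the fallback bucket, which is one way
to see why such instructions would carry a performance penalty.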


I benchmarked several programs and usually got 20-50% performance
improvements when disassembling the code sections of compiled RISC-V ELF
programs ("objdump -d $FILE").  That is significant for such a small
modification (with about 12KB of heap memory allocated in a 64-bit
environment).  On libraries and big programs with many debug symbols, the
improvements are not as high, but this will be addressed in the next
part (the mapping symbol optimization).

This is not the end.

This structure also significantly improves plain binary file handling (with
"objdump -b binary -m riscv:rv[32|64] -D $FILE").  I tested various binary
files, including a random one and big vmlinux images, and confirmed
significant performance improvements (over 70% in many cases).

This is partially because disassembling about one quarter of invalid
"instruction" words required iterating over one thousand opcode entries
(348 or more being vector instructions with OP-V, which can be easily
skipped with this new data structure).  Another reason for the large
improvement is that plain binary files do not incur the various ELF
overheads.

It also has great synergy with the commit "RISC-V: One time CSR hash table
initialization": disassembling many CSR instructions is now over 6 times
faster (in contrast to only about 30% faster as of patchset part 2).


Thanks,
Tsukasa




Tsukasa OI (3):
  RISC-V: Use faster hash table on disassembling
  RISC-V: Fallback on faster hash table
  RISC-V: Cache instruction support

 include/opcode/riscv.h |   2 +
 opcodes/riscv-dis.c    | 129 +++++++++++++++++++++++++++++++++++------
 2 files changed, 113 insertions(+), 18 deletions(-)


base-commit: f3fcf98b44621fb8768cf11121d3fd77089bca5b
-- 
2.38.1



Thread overview: 8+ messages
2022-11-20  1:08 [PATCH 0/3] RISC-V: Disassembler Core Optimization 1-1 (Hash table and Caching) Tsukasa OI
2022-11-20  1:08 ` [PATCH 1/3] RISC-V: Use faster hash table on disassembling Tsukasa OI
2022-11-20  1:08 ` [PATCH 2/3] RISC-V: Fallback on faster hash table Tsukasa OI
2022-11-20  1:08 ` [PATCH 3/3] RISC-V: Cache instruction support Tsukasa OI
2022-11-28  4:46 ` [PATCH v2 0/3] RISC-V: Disassembler Core Optimization 1-1 (Hash table and Caching) Tsukasa OI
2022-11-28  4:46   ` [PATCH v2 1/3] RISC-V: Use faster hash table on disassembling Tsukasa OI
2022-11-28  4:46   ` [PATCH v2 2/3] RISC-V: Fallback on faster hash table Tsukasa OI
2022-11-28  4:46   ` [PATCH v2 3/3] RISC-V: Cache instruction support Tsukasa OI
