public inbox for binutils@sourceware.org
From: Tsukasa OI <research_trasio@irq.a4lg.com>
To: Tsukasa OI <research_trasio@irq.a4lg.com>,
	Nelson Chu <nelson@rivosinc.com>,
	Kito Cheng <kito.cheng@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>
Cc: binutils@sourceware.org
Subject: [PATCH 0/3] RISC-V: Disassembler Core Optimization 1-2 (Mapping symbols)
Date: Sun, 20 Nov 2022 01:10:06 +0000
Message-ID: <cover.1668906599.git.research_trasio@irq.a4lg.com>

Hello,

This is Part 4 of a 4-part project to drastically improve disassembler
performance:
<https://github.com/a4lg/binutils-gdb/wiki/proj_dis_perf_improvements_1>

** This patchset does not apply to master directly. **

This patchset requires the following patchset(s) to be applied first:
<https://sourceware.org/pipermail/binutils/2022-November/124378.html>
<https://sourceware.org/pipermail/binutils/2022-November/124519.html>

The following is basically a copy of the PATCH 3/3 commit message.


For ELF files with many symbols and/or sections (static libraries, partially
linked files [e.g. vmlinux.o] or large object files), the disassembler is
drastically slowed down by looking up the suitable mapping symbol.

This is caused by the fact that:

-   It uses an inefficient linear search to find the suitable mapping symbol,
-   symtab_pos is not always a good hint for a forward linear search and
-   The symbol table accessible by the disassembler is sorted by address and
    then section (not section, then address).

Together, they sometimes force O(n^2) mapping symbol search time when
looking up the suitable mapping symbol for a given address.

This commit implements:

-   A binary search to look up the suitable mapping symbol (O(log(n)) time
    per lookup call, O(n) time on initialization),
-   A separate mapping symbol table, sorted by section and then address
    (unless the section to disassemble is NULL),
-   A very short linear search, even faster than binary search,
    when disassembling consecutive addresses (usually traverses only 1 or 2
    symbols, O(n) in the worst case but this is only expected on adversarial
    samples) and
-   Efficient tracking of mapping symbols with an ISA string
    (by propagating the arch field of "$x+(arch)" to succeeding "$x"
    symbols).
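
As a rough illustration of the first, second and fourth items, here is a
minimal standalone sketch in C.  It is not the code from PATCH 3/3: the
struct map_sym type, the integer section index and the function names are
all hypothetical, and it sorts with qsort (so its initialization is
O(n log n) rather than the O(n) stated above).  Resetting the propagated
ISA string at each section boundary is also an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mapping symbol kinds: data ("$d") or instructions ("$x").  */
enum map_kind { MAP_DATA, MAP_INSN };

/* Hypothetical table entry; not the structure used in riscv-dis.c.  */
struct map_sym
{
  int section;        /* Section index (stand-in for a section pointer).  */
  uint64_t address;   /* Address where this mapping state begins.  */
  enum map_kind kind;
  const char *arch;   /* ISA string of "$x+(arch)", NULL for plain "$x".  */
};

/* Order entries by section first, then by address.  */
static int
map_sym_compare (const void *pa, const void *pb)
{
  const struct map_sym *a = pa;
  const struct map_sym *b = pb;

  if (a->section != b->section)
    return a->section < b->section ? -1 : 1;
  if (a->address != b->address)
    return a->address < b->address ? -1 : 1;
  return 0;
}

/* Sort the table, then copy the most recent ISA string forward to the
   instruction symbols that follow it within the same section.  */
static void
map_table_init (struct map_sym *tab, size_t n)
{
  const char *arch = NULL;

  qsort (tab, n, sizeof (*tab), map_sym_compare);
  for (size_t i = 0; i < n; i++)
    {
      if (i > 0 && tab[i].section != tab[i - 1].section)
        arch = NULL;            /* Assumption: no carry-over across sections.  */
      if (tab[i].kind != MAP_INSN)
        continue;
      if (tab[i].arch != NULL)
        arch = tab[i].arch;     /* New ISA string takes effect here.  */
      else
        tab[i].arch = arch;     /* Plain "$x" inherits the previous one.  */
    }
}

/* Binary search: return the last entry in SECTION whose address is not
   greater than ADDRESS, or NULL if there is none.  */
static const struct map_sym *
map_table_lookup (const struct map_sym *tab, size_t n,
                  int section, uint64_t address)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      const struct map_sym *m = &tab[mid];

      if (m->section < section
          || (m->section == section && m->address <= address))
        lo = mid + 1;
      else
        hi = mid;
    }
  if (lo == 0 || tab[lo - 1].section != section)
    return NULL;
  return &tab[lo - 1];
}
```

Because every instruction entry already carries its effective ISA string
after initialization, a lookup in this sketch never has to walk backwards
to find the preceding "$x+(arch)" symbol.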

It also changes when the disassembler reuses the last mapping symbol.  This
commit only uses the last disassembled address to determine whether the last
mapping symbol should be reused.
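
One way to picture the short linear search combined with reuse keyed on the
last disassembled address is the sketch below.  It is only an illustration
under assumptions: struct map_cursor, its fields and both functions are
hypothetical names, and the table is reduced to a sorted array of start
addresses for a single section.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-section cursor; names are illustrative only.  */
struct map_cursor
{
  const uint64_t *starts;   /* Mapping symbol addresses, sorted ascending.  */
  size_t count;             /* Number of entries in STARTS.  */
  size_t last_index;        /* Index returned by the previous lookup.  */
  uint64_t last_address;    /* Address passed to the previous lookup.  */
  int valid;                /* Nonzero once a lookup has succeeded.  */
};

/* Binary search: index of the last start <= ADDRESS, or COUNT if none.  */
static size_t
cursor_bsearch (const struct map_cursor *c, uint64_t address)
{
  size_t lo = 0, hi = c->count;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (c->starts[mid] <= address)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo == 0 ? c->count : lo - 1;
}

/* Return the index of the mapping symbol covering ADDRESS, or COUNT if
   no symbol starts at or before it.  */
static size_t
cursor_lookup (struct map_cursor *c, uint64_t address)
{
  size_t i;

  if (c->valid && address >= c->last_address)
    {
      /* Moving forward from the last disassembled address: the previous
         symbol still starts at or before ADDRESS, so step forward from
         it.  This usually visits only one or two entries.  */
      i = c->last_index;
      while (i + 1 < c->count && c->starts[i + 1] <= address)
        i++;
    }
  else
    {
      /* First lookup or a backward jump: fall back to binary search.  */
      i = cursor_bsearch (c, address);
      if (i == c->count)
        {
          c->valid = 0;
          return i;
        }
    }
  c->last_index = i;
  c->last_address = address;
  c->valid = 1;
  return i;
}
```

The forward scan degrades to O(n) only when a single call jumps far ahead
over many mapping symbols, which matches the worst case mentioned above.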

This commit doesn't improve the disassembler performance much on regular
programs in general.  However, >50% disassembler performance improvements
are expected on some files for which "RISC-V: Use faster hash table on
disassembling" was not effective enough.

On bigger libraries, the following numbers were observed during the
benchmark:

-   x  2.13 -  2.22 : Static library : Newlib (libc.a)
-   x  5.67 -  6.09 : Static library : GNU libc (libc.a)
-   x 11.72 - 12.04 : Shared library : OpenSSL (libcrypto.so)
-   x 96.29         : Shared library : LLVM 14 (libLLVM-14.so)


Thanks,
Tsukasa




Tsukasa OI (3):
  RISC-V: Easy optimization on riscv_search_mapping_symbol
  RISC-V: Per-section private data initialization
  RISC-V: Optimized search on mapping symbols

 opcodes/disassemble.c |   1 +
 opcodes/disassemble.h |   2 +
 opcodes/riscv-dis.c   | 443 +++++++++++++++++++++++++++++-------------
 3 files changed, 311 insertions(+), 135 deletions(-)


base-commit: 844db363911065a3b5f0c5e4601f89ee1d7360c5
-- 
2.38.1


Thread overview: 8+ messages
2022-11-20  1:10 Tsukasa OI [this message]
2022-11-20  1:10 ` [PATCH 1/3] RISC-V: Easy optimization on riscv_search_mapping_symbol Tsukasa OI
2022-11-20  1:10 ` [PATCH 2/3] RISC-V: Per-section private data initialization Tsukasa OI
2022-11-20  1:10 ` [PATCH 3/3] RISC-V: Optimized search on mapping symbols Tsukasa OI
2022-11-28  4:47 ` [PATCH v2 0/3] RISC-V: Disassembler Core Optimization 1-2 (Mapping symbols) Tsukasa OI
2022-11-28  4:47   ` [PATCH v2 1/3] RISC-V: Easy optimization on riscv_search_mapping_symbol Tsukasa OI
2022-11-28  4:47   ` [PATCH v2 2/3] RISC-V: Per-section private data initialization Tsukasa OI
2022-11-28  4:47   ` [PATCH v2 3/3] RISC-V: Optimized search on mapping symbols Tsukasa OI
