From: Tsukasa OI
Date: Sat, 30 Jul 2022 13:22:13 +0900
Subject: Re: [PATCH 0/1] RISC-V: Use faster hash table on disassembling
To: Nelson Chu, Kito Cheng, Palmer Dabbelt
Cc: binutils@sourceware.org
Message-ID: <362b7c02-0802-4db8-3a44-7f43e807572b@irq.a4lg.com>

This patch is postponed because I will merge it into a larger patchset
with general core disassembler improvements.

Thanks,
Tsukasa

On 2022/07/09 12:49, Tsukasa OI wrote:
> Hello,
>
> This patchset aims to improve performance when disassembling RISC-V
> code (which may contain invalid data). It replaces riscv_hash
> (in opcodes/riscv-dis.c) with a much faster data structure: a sorted
> and partitioned hash table.
>
> Tracker on GitHub:
>
> Sidenote:
> I started listing my Binutils submissions on my GitHub Wiki:
> hoping that the current status and conflicting patches are clear.
>
> ***WARNING***
>
> This patchset conflicts with the following patchset(s):
> -
>   (Tracker: )
> If either of them is merged, I will submit a rebased patchset.
>
> This is a technique actually used on the SPARC architecture
> (opcodes/sparc-dis.c), and I simplified the algorithm even further.
> Unlike SPARC, the RISC-V hashed opcode table is not a table of linked
> lists; it is just a table pointing to the "start" elements of the
> sorted opcode list (sorted by hash code), plus a global tail.
>
> I benchmarked some programs and measured a performance increase of
> roughly 2% to 10% while disassembling the code section of RISC-V
> ELF files (objdump -d $FILE). That is not significant, but not bad for
> such a small modification (with ~11KB of heap memory allocated in a
> 64-bit environment).
>
> This is not the end. This structure significantly improves plain binary
> file handling (objdump -b binary -m riscv:rv[32|64] -D $FILE). I tested
> on a big vmlinux image with debug symbols and got a performance boost
> of over 50%. This is because disassembling about one quarter of the
> invalid "instruction" words required iterating over one thousand opcode
> entries (>= 348 of them being vector instructions with OP-V, which can
> be easily skipped with this new data structure).
>
> Thanks,
> Tsukasa
>
>
> Tsukasa OI (1):
>   RISC-V: Use faster hash table on disassembling
>
>  opcodes/riscv-dis.c | 214 ++++++++++++++++++++++++++++----------------
>  1 file changed, 136 insertions(+), 78 deletions(-)
>
>
> base-commit: d2acd4b0c5bab349aaa152d60268bc144634a844
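
For readers who want a concrete picture of the "sorted and partitioned
hash table" described in the quoted cover letter, here is a minimal
sketch in C. It is not the code from the patch. The names (demo_opcode,
demo_table, demo_hash, demo_build, demo_lookup), the fixed table sizes,
and the choice of the low 7 bits as the hash are assumptions made for
illustration, and compressed (16-bit) instructions are ignored. The
match/mask values are the standard encodings for these five
instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified opcode entry (not the real struct riscv_opcode). */
struct demo_opcode {
  const char *name;
  uint32_t match;   /* required bit pattern */
  uint32_t mask;    /* bits of the instruction that must equal `match` */
};

#define DEMO_HASH_BITS 7
#define DEMO_HASH_SIZE (1u << DEMO_HASH_BITS)
#define DEMO_MAX_OPCODES 64

/* Assumption: hash on the low 7 bits (the major opcode field). */
static unsigned demo_hash(uint32_t insn) { return insn & (DEMO_HASH_SIZE - 1); }

struct demo_table {
  /* Opcode pointers sorted by hash code; original order kept within a hash. */
  const struct demo_opcode *sorted[DEMO_MAX_OPCODES];
  /* start[h] is the first index with hash h; start[DEMO_HASH_SIZE] is the
     global tail, so partition h is [start[h], start[h + 1]).  */
  size_t start[DEMO_HASH_SIZE + 1];
};

/* Build the table with a stable counting sort keyed on the hash code. */
static void demo_build(struct demo_table *t,
                       const struct demo_opcode *ops, size_t n)
{
  size_t next[DEMO_HASH_SIZE];
  size_t i, h;

  assert(n <= DEMO_MAX_OPCODES);
  for (h = 0; h <= DEMO_HASH_SIZE; h++)
    t->start[h] = 0;
  for (i = 0; i < n; i++) {
    /* The hash of `match` is only meaningful if the mask covers those bits. */
    assert((ops[i].mask & (DEMO_HASH_SIZE - 1)) == DEMO_HASH_SIZE - 1);
    t->start[demo_hash(ops[i].match) + 1]++;
  }
  for (h = 1; h <= DEMO_HASH_SIZE; h++)   /* prefix sums -> partition starts */
    t->start[h] += t->start[h - 1];
  for (h = 0; h < DEMO_HASH_SIZE; h++)
    next[h] = t->start[h];
  for (i = 0; i < n; i++)                 /* stable placement */
    t->sorted[next[demo_hash(ops[i].match)]++] = &ops[i];
}

/* Scan only the partition for this instruction's hash.  `probes`, if not
   NULL, is incremented once per entry examined.  */
static const struct demo_opcode *demo_lookup(const struct demo_table *t,
                                             uint32_t insn, size_t *probes)
{
  unsigned h = demo_hash(insn);
  size_t i;

  for (i = t->start[h]; i < t->start[h + 1]; i++) {
    if (probes)
      (*probes)++;
    if ((insn & t->sorted[i]->mask) == t->sorted[i]->match)
      return t->sorted[i];
  }
  return NULL;
}

/* Tiny demonstration table; aliases come before the generic form. */
static const struct demo_opcode demo_ops[] = {
  { "nop",  0x00000013u, 0xffffffffu },
  { "addi", 0x00000013u, 0x0000707fu },
  { "add",  0x00000033u, 0xfe00707fu },
  { "sub",  0x40000033u, 0xfe00707fu },
  { "lw",   0x00002003u, 0x0000707fu },
};
```

In this sketch an invalid word such as 0xffffffff hashes to an empty
partition, so the lookup returns NULL without examining any entry. That
is the kind of skipping the cover letter credits for the speedup on
plain binary input. The stable sort matters because entries sharing a
hash are still tried in their original order, so an alias listed before
its generic form is matched first.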