public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/1] RISC-V: Use faster hash table on disassembling
@ 2022-07-09  3:49 Tsukasa OI
  2022-07-09  3:49 ` [PATCH 1/1] " Tsukasa OI
  2022-07-30  4:22 ` [PATCH 0/1] " Tsukasa OI
  0 siblings, 2 replies; 3+ messages in thread
From: Tsukasa OI @ 2022-07-09  3:49 UTC (permalink / raw)
  To: Tsukasa OI, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils

Hello,

This patchset intends to improve performance on disassembling RISC-V
code (which may possibly contain invalid data).  It replaces riscv_hash
(on opcodes/riscv-dis.c) with much faster data structure: sorted and
partitioned hash table.

Tracker on GitHub:
<https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_opts_hashtable>

    Sidenote:
    I started listing my Binutils submissions on my GitHub Wiki:
    <https://github.com/a4lg/binutils-gdb/wiki/Patch-Queue>
    hoping that current status and conflicting patches are clear.

***WARNING***

This patchset conflicts with following patchset(s):
-   <https://sourceware.org/pipermail/binutils/2022-June/121441.html>
    (Tracker: <https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_generics>)
If either of them is merged, I will submit rebased patchset.



This is a technique actually used on SPARC architecture
(opcodes/sparc-dis.c) and I simplified the algorithm even further.
Unlike SPARC, RISC-V hashed opcode table is not a table to linked lists,
it's just a table, pointing to "start" elements of the sorted opcode
list (sorted by hash code) plus global tail.

I benchmarked some of the programs and I measure somewhat between 2%
to 10% performance increase while disassembling code section of RISC-V
ELF files (objdump -d $FILE).  That is not significant but not bad for
such a small modification (with ~ 11KB heap memory allocation on 64-bit
environment).

This is not the end.  This structure significantly improves plain binary
file handling (on objdump, objdump -b binary -m riscv:rv[32|64] -D
$FILE).  I tested on a big vmlinux image with debug symbols and I got
over 50% performance boost.  This is due to the fact that, disassembling
about one quarter of invalid "instruction" words required iterating over
one thousand opcode entries (>= 348 being vector instructions with OP-V,
that can be easily skipped with this new data structure).

Thanks,
Tsukasa




Tsukasa OI (1):
  RISC-V: Use faster hash table on disassembling

 opcodes/riscv-dis.c | 214 ++++++++++++++++++++++++++++----------------
 1 file changed, 136 insertions(+), 78 deletions(-)


base-commit: d2acd4b0c5bab349aaa152d60268bc144634a844
-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/1] RISC-V: Use faster hash table on disassembling
  2022-07-09  3:49 [PATCH 0/1] RISC-V: Use faster hash table on disassembling Tsukasa OI
@ 2022-07-09  3:49 ` Tsukasa OI
  2022-07-30  4:22 ` [PATCH 0/1] " Tsukasa OI
  1 sibling, 0 replies; 3+ messages in thread
From: Tsukasa OI @ 2022-07-09  3:49 UTC (permalink / raw)
  To: Tsukasa OI, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils

This commit replaces riscv_hash with sorted and partitioned hash table
for faster disassembling (general idea is based on SPARC architecture
implementation).  It expects around 2%-10% performance improvement on
regular RISC-V code area and 50% or more on entire area of large binary
files (including but not limited to RISC-V code).

opcodes/ChangeLog:

	* riscv-dis.c (OP_HASH_IDX): Move out from
	riscv_disassemble_insn.
	(compare_opcodes, OP_HASH_LEN): New helpers.
	(riscv_hash, riscv_opcodes_sorted): New.
	(build_hash_table): New hash table building function.
	(riscv_disassemble_insn): Use faster hash table.  Split match
	and print steps.
---
 opcodes/riscv-dis.c | 214 ++++++++++++++++++++++++++++----------------
 1 file changed, 136 insertions(+), 78 deletions(-)

diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 164fd209dbd..d8c16583ba1 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -564,6 +564,63 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
     }
 }
 
+/* Build a hash table for disassembler to shorten the search time.
+   We sort pointers to riscv_opcodes entries to be even faster.  */
+
+#define OP_HASH_LEN (OP_MASK_OP + 1)
+#define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP))
+static const struct riscv_opcode **riscv_hash[OP_HASH_LEN + 1];
+static const struct riscv_opcode **riscv_opcodes_sorted;
+
+static int
+compare_opcodes (const void *ap, const void *bp)
+{
+  const struct riscv_opcode *a = *(const struct riscv_opcode **) ap;
+  const struct riscv_opcode *b = *(const struct riscv_opcode **) bp;
+  insn_t ai = OP_HASH_IDX (a->match);
+  insn_t bi = OP_HASH_IDX (b->match);
+  if (ai < bi)
+    return -1;
+  if (ai > bi)
+    return +1;
+  /* Stable sort (on riscv_opcodes entry order) is required.  */
+  if (a < b)
+    return -1;
+  if (a > b)
+    return +1;
+  return 0;
+}
+
+static void
+build_hash_table (void)
+{
+  const struct riscv_opcode *op;
+  const struct riscv_opcode **pop, **pop_end;
+  size_t len = 0, i;
+
+  /* Sort pointers to riscv_opcodes entries (except macros).  */
+  for (op = riscv_opcodes; op->name; op++)
+    if (op->pinfo != INSN_MACRO)
+      len++;
+  riscv_opcodes_sorted = xcalloc (len, sizeof (struct riscv_opcode *));
+  pop_end = riscv_opcodes_sorted;
+  for (op = riscv_opcodes; op->name; op++)
+    if (op->pinfo != INSN_MACRO)
+      *pop_end++ = op;
+  qsort (riscv_opcodes_sorted, len, sizeof (struct riscv_opcode *),
+	 compare_opcodes);
+
+  /* Initialize faster hash table.  */
+  pop = riscv_opcodes_sorted;
+  for (i = 0; i < OP_HASH_LEN; i++)
+    {
+      riscv_hash[i] = pop;
+      while (pop != pop_end && OP_HASH_IDX ((*pop)->match) == i)
+	pop++;
+    }
+  riscv_hash[OP_HASH_LEN] = pop;
+}
+
 /* Print the RISC-V instruction at address MEMADDR in debugged memory,
    on using INFO.  Returns length of the instruction, in bytes.
    BIGENDIAN must be 1 if this is big-endian code, 0 if
@@ -572,21 +629,15 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
 static int
 riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
 {
-  const struct riscv_opcode *op;
+  const struct riscv_opcode **pop, **pop_end;
+  const struct riscv_opcode *op, *matched_op;
   static bool init = 0;
-  static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
   struct riscv_private_data *pd;
   int insnlen;
 
-#define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP))
-
-  /* Build a hash table to shorten the search time.  */
-  if (! init)
+  if (!init)
     {
-      for (op = riscv_opcodes; op->name; op++)
-	if (!riscv_hash[OP_HASH_IDX (op->match)])
-	  riscv_hash[OP_HASH_IDX (op->match)] = op;
-
+      build_hash_table ();
       init = 1;
     }
 
@@ -623,82 +674,89 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
   info->target = 0;
   info->target2 = 0;
 
-  op = riscv_hash[OP_HASH_IDX (word)];
-  if (op != NULL)
+  /* If XLEN is not known, get its value from the ELF class.  */
+  if (info->mach == bfd_mach_riscv64)
+    xlen = 64;
+  else if (info->mach == bfd_mach_riscv32)
+    xlen = 32;
+  else if (info->section != NULL)
     {
-      /* If XLEN is not known, get its value from the ELF class.  */
-      if (info->mach == bfd_mach_riscv64)
-	xlen = 64;
-      else if (info->mach == bfd_mach_riscv32)
-	xlen = 32;
-      else if (info->section != NULL)
-	{
-	  Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
-	  xlen = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
-	}
+      Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
+      xlen = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
+    }
+
+  matched_op = NULL;
+  pop     = riscv_hash[OP_HASH_IDX (word)];
+  pop_end = riscv_hash[OP_HASH_IDX (word) + 1];
+  for (; pop != pop_end; pop++)
+    {
+      op = *pop;
+      /* Does the opcode match?  */
+      if (!(op->match_func) (op, word))
+	continue;
+      /* Is this a pseudo-instruction and may we print it as such?  */
+      if (no_aliases && (op->pinfo & INSN_ALIAS))
+	continue;
+      /* Is this instruction restricted to a certain value of XLEN?  */
+      if ((op->xlen_requirement != 0) && (op->xlen_requirement != xlen))
+	continue;
+
+      if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class))
+	continue;
+
+      matched_op = op;
+      break;
+    }
+
+  /* There is a match.  */
+  if (matched_op != NULL)
+    {
+      op = matched_op;
 
-      /* If arch has ZFINX flags, use gpr for disassemble.  */
-      if(riscv_subset_supports (&riscv_rps_dis, "zfinx"))
+      /* If arch has Zfinx extension, use GPR to disassemble.  */
+      if (riscv_subset_supports (&riscv_rps_dis, "zfinx"))
 	riscv_fpr_names = riscv_gpr_names;
 
-      for (; op->name; op++)
-	{
-	  /* Does the opcode match?  */
-	  if (! (op->match_func) (op, word))
-	    continue;
-	  /* Is this a pseudo-instruction and may we print it as such?  */
-	  if (no_aliases && (op->pinfo & INSN_ALIAS))
-	    continue;
-	  /* Is this instruction restricted to a certain value of XLEN?  */
-	  if ((op->xlen_requirement != 0) && (op->xlen_requirement != xlen))
-	    continue;
-
-	  if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class))
-	    continue;
-
-	  /* It's a match.  */
-	  (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic,
-					"%s", op->name);
-	  print_insn_args (op->args, word, memaddr, info);
-
-	  /* Try to disassemble multi-instruction addressing sequences.  */
-	  if (pd->print_addr != (bfd_vma)-1)
-	    {
-	      info->target = pd->print_addr;
-	      (*info->fprintf_styled_func)
-		(info->stream, dis_style_comment_start, " # ");
-	      (*info->print_address_func) (info->target, info);
-	      pd->print_addr = -1;
-	    }
+      (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic,
+				    "%s", op->name);
+      print_insn_args (op->args, word, memaddr, info);
 
-	  /* Finish filling out insn_info fields.  */
-	  switch (op->pinfo & INSN_TYPE)
-	    {
-	    case INSN_BRANCH:
-	      info->insn_type = dis_branch;
-	      break;
-	    case INSN_CONDBRANCH:
-	      info->insn_type = dis_condbranch;
-	      break;
-	    case INSN_JSR:
-	      info->insn_type = dis_jsr;
-	      break;
-	    case INSN_DREF:
-	      info->insn_type = dis_dref;
-	      break;
-	    default:
-	      break;
-	    }
+      /* Try to disassemble multi-instruction addressing sequences.  */
+      if (pd->print_addr != (bfd_vma) - 1)
+	{
+	  info->target = pd->print_addr;
+	  (*info->fprintf_styled_func)
+	    (info->stream, dis_style_comment_start, " # ");
+	  (*info->print_address_func) (info->target, info);
+	  pd->print_addr = -1;
+	}
 
-	  if (op->pinfo & INSN_DATA_SIZE)
-	    {
-	      int size = ((op->pinfo & INSN_DATA_SIZE)
-			  >> INSN_DATA_SIZE_SHIFT);
-	      info->data_size = 1 << (size - 1);
-	    }
+      /* Finish filling out insn_info fields.  */
+      switch (op->pinfo & INSN_TYPE)
+	{
+	case INSN_BRANCH:
+	  info->insn_type = dis_branch;
+	  break;
+	case INSN_CONDBRANCH:
+	  info->insn_type = dis_condbranch;
+	  break;
+	case INSN_JSR:
+	  info->insn_type = dis_jsr;
+	  break;
+	case INSN_DREF:
+	  info->insn_type = dis_dref;
+	  break;
+	default:
+	  break;
+	}
 
-	  return insnlen;
+      if (op->pinfo & INSN_DATA_SIZE)
+	{
+	  int size = ((op->pinfo & INSN_DATA_SIZE) >> INSN_DATA_SIZE_SHIFT);
+	  info->data_size = 1 << (size - 1);
 	}
+
+      return insnlen;
     }
 
   /* We did not find a match, so just print the instruction bits.  */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 0/1] RISC-V: Use faster hash table on disassembling
  2022-07-09  3:49 [PATCH 0/1] RISC-V: Use faster hash table on disassembling Tsukasa OI
  2022-07-09  3:49 ` [PATCH 1/1] " Tsukasa OI
@ 2022-07-30  4:22 ` Tsukasa OI
  1 sibling, 0 replies; 3+ messages in thread
From: Tsukasa OI @ 2022-07-30  4:22 UTC (permalink / raw)
  To: Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils

This patch is postponed because I will merge this into a larger patchset
with general core disassembler improvements.

Thanks,
Tsukasa

On 2022/07/09 12:49, Tsukasa OI wrote:
> Hello,
> 
> This patchset intends to improve performance on disassembling RISC-V
> code (which may possibly contain invalid data).  It replaces riscv_hash
> (on opcodes/riscv-dis.c) with much faster data structure: sorted and
> partitioned hash table.
> 
> Tracker on GitHub:
> <https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_opts_hashtable>
> 
>     Sidenote:
>     I started listing my Binutils submissions on my GitHub Wiki:
>     <https://github.com/a4lg/binutils-gdb/wiki/Patch-Queue>
>     hoping that current status and conflicting patches are clear.
> 
> ***WARNING***
> 
> This patchset conflicts with following patchset(s):
> -   <https://sourceware.org/pipermail/binutils/2022-June/121441.html>
>     (Tracker: <https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_generics>)
> If either of them is merged, I will submit rebased patchset.
> 
> 
> 
> This is a technique actually used on SPARC architecture
> (opcodes/sparc-dis.c) and I simplified the algorithm even further.
> Unlike SPARC, RISC-V hashed opcode table is not a table to linked lists,
> it's just a table, pointing to "start" elements of the sorted opcode
> list (sorted by hash code) plus global tail.
> 
> I benchmarked some of the programs and I measure somewhat between 2%
> to 10% performance increase while disassembling code section of RISC-V
> ELF files (objdump -d $FILE).  That is not significant but not bad for
> such a small modification (with ~ 11KB heap memory allocation on 64-bit
> environment).
> 
> This is not the end.  This structure significantly improves plain binary
> file handling (on objdump, objdump -b binary -m riscv:rv[32|64] -D
> $FILE).  I tested on a big vmlinux image with debug symbols and I got
> over 50% performance boost.  This is due to the fact that, disassembling
> about one quarter of invalid "instruction" words required iterating over
> one thousand opcode entries (>= 348 being vector instructions with OP-V,
> that can be easily skipped with this new data structure).
> 
> Thanks,
> Tsukasa
> 
> 
> 
> 
> Tsukasa OI (1):
>   RISC-V: Use faster hash table on disassembling
> 
>  opcodes/riscv-dis.c | 214 ++++++++++++++++++++++++++++----------------
>  1 file changed, 136 insertions(+), 78 deletions(-)
> 
> 
> base-commit: d2acd4b0c5bab349aaa152d60268bc144634a844

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-07-30  4:22 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-09  3:49 [PATCH 0/1] RISC-V: Use faster hash table on disassembling Tsukasa OI
2022-07-09  3:49 ` [PATCH 1/1] " Tsukasa OI
2022-07-30  4:22 ` [PATCH 0/1] " Tsukasa OI

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).