public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Tsukasa OI <research_trasio@irq.a4lg.com>
To: Tsukasa OI <research_trasio@irq.a4lg.com>,
	Nelson Chu <nelson@rivosinc.com>,
	Kito Cheng <kito.cheng@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>
Cc: binutils@sourceware.org
Subject: [PATCH 09/11] RISC-V: Reorganize disassembler state initialization
Date: Tue, 15 Nov 2022 04:52:52 +0000	[thread overview]
Message-ID: <12cd8820841f695708875206b6461b6322c74428.1668487922.git.research_trasio@irq.a4lg.com> (raw)
In-Reply-To: <cover.1668487922.git.research_trasio@irq.a4lg.com>

The current disassembler repeatedly checks current state per instruction:

-   Whether riscv_gpr_names is initialized to a non-NULL value
-   Whether the Zfinx extension is available
-   Whether the hash table is initialized

... but they are not frequently changed.

riscv_gpr_names is initialized to a non-NULL value when

-   The first disassembler option is specified, or
-   Not initialized with disassembler options
    (in the first print_insn_riscv function call).

We can safely initialize the default disassembler options prior to the
print_insn_riscv function call and this per-instruction checking of
riscv_gpr_names can be safely removed.

The opcode hash table initialization will be taken care with a later commit.

To group disassembler state initialization, this commit utilizes the
disassemble_init_for_target function.  As a side effect, this will also fix
a potential issue when disassembler options is set and then unset on GDB.
This idea is based on opcodes/ppc-dis.c and opcodes/wasm32-dis.c.

New callback function init_riscv_dis_state_for_arch_and_options is called
when either the architecture string or an option is possibly changed.

We can now group the disassembler state initialization together.
It makes state initialization clearer and makes further changes easier.

In performance perspective, this commit has various effects (-5% to 6%).
However, this commit makes implementing large optimizations easier
as well as "RISC-V: Split match/print steps on disassembler".

include/ChangeLog:

	* dis-asm.h (disassemble_init_riscv): Add declaration of
	disassemble_init_riscv.

opcodes/ChangeLog:

	* disassemble.c (disassemble_init_for_target): Call
	disassemble_init_riscv to group state initialization together.
	* riscv-dis.c
	(xlen_by_mach, xlen_by_elf): New variables to store
	environment-inferred XLEN.
	(is_numeric): New.  Instead of directly setting
	riscv_{gpr,fpr}_names, use this to store an option.
	(update_riscv_dis_xlen): New function to set actual XLEN from
	xlen_by_mach and xlen_by_elf variables.
	(init_riscv_dis_state_for_arch_and_options): New callback function
	called when either the architecture or an option is changed.  Set
	riscv_{gpr,fpr}_names here.
	(set_default_riscv_dis_options): Initialize is_numeric instead of
	riscv_gpr_names and riscv_fpr_names.
	(parse_riscv_dis_option_without_args): When the "numeric" option
	is specified, write to is_numeric instead of register names.
	(parse_riscv_dis_options): Suppress setting the default options
	here and let disassemble_init_riscv to initialize them.
	(riscv_disassemble_insn): Move probing Zfinx and setting XLEN
	portions to init_riscv_dis_state_for_arch_and_options and
	update_riscv_dis_xlen.
	(riscv_get_map_state): If a mapping symbol with ISA string is
	suitable, call init_riscv_dis_state_for_arch_and_options function
	to update disassembler state.
	(print_insn_riscv): Update XLEN only if we haven't guessed correct
	XLEN for the disassembler.  Stop checking disassembler options for
	every instruction and let disassemble_init_riscv to parse options.
	(riscv_get_disassembler): Call
	init_riscv_dis_state_for_arch_and_options because the architecture
	string is updated here.
	(disassemble_init_riscv): New function to initialize the
	structure, reset/guess correct XLEN and reset/parse disassembler
	options.
---
 include/dis-asm.h     |   1 +
 opcodes/disassemble.c |   2 +-
 opcodes/riscv-dis.c   | 112 +++++++++++++++++++++++++++++-------------
 3 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/include/dis-asm.h b/include/dis-asm.h
index 4921c040710..11537b432ff 100644
--- a/include/dis-asm.h
+++ b/include/dis-asm.h
@@ -393,6 +393,7 @@ extern bool arm_symbol_is_valid (asymbol *, struct disassemble_info *);
 extern bool csky_symbol_is_valid (asymbol *, struct disassemble_info *);
 extern bool riscv_symbol_is_valid (asymbol *, struct disassemble_info *);
 extern void disassemble_init_powerpc (struct disassemble_info *);
+extern void disassemble_init_riscv (struct disassemble_info *);
 extern void disassemble_init_s390 (struct disassemble_info *);
 extern void disassemble_init_wasm32 (struct disassemble_info *);
 extern void disassemble_init_nds32 (struct disassemble_info *);
diff --git a/opcodes/disassemble.c b/opcodes/disassemble.c
index 0a8f2da629f..704fb476ea9 100644
--- a/opcodes/disassemble.c
+++ b/opcodes/disassemble.c
@@ -717,7 +717,7 @@ disassemble_init_for_target (struct disassemble_info * info)
 #endif
 #ifdef ARCH_riscv
     case bfd_arch_riscv:
-      info->symbol_is_valid = riscv_symbol_is_valid;
+      disassemble_init_riscv (info);
       info->created_styled_output = true;
       break;
 #endif
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 316c6c97607..76387efbe4b 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -35,6 +35,12 @@
 /* Current XLEN for the disassembler.  */
 static unsigned xlen = 0;
 
+/* XLEN as inferred by the machine architecture.  */
+static unsigned xlen_by_mach = 0;
+
+/* XLEN as inferred by ELF header.  */
+static unsigned xlen_by_elf = 0;
+
 /* Default ISA specification version (constant as of now).  */
 static enum riscv_spec_class default_isa_spec = ISA_SPEC_CLASS_DRAFT - 1;
 
@@ -72,6 +78,48 @@ static const char * const *riscv_fpr_names;
 
 /* If set, disassemble as most general instruction.  */
 static bool no_aliases = false;
+
+/* If set, disassemble with numeric register names.  */
+static bool is_numeric = false;
+\f
+
+/* Guess and update current XLEN.  */
+
+static void
+update_riscv_dis_xlen (struct disassemble_info *info)
+{
+  /* Set XLEN with following precedence rules:
+     1. BFD machine architecture set by either:
+       a. -m riscv:rv[32|64] option (GDB: set arch riscv:rv[32|64])
+       b. ELF class in actual ELF header (only on RISC-V ELF)
+       This is only effective if XLEN-specific BFD machine architecture is
+       chosen.  If XLEN-neutral (like riscv), BFD machine architecture is
+       ignored on XLEN selection.
+     2. ELF class in dummy ELF header.  */
+  if (xlen_by_mach != 0)
+    xlen = xlen_by_mach;
+  else if (xlen_by_elf != 0)
+    xlen = xlen_by_elf;
+  else if (info != NULL && info->section != NULL)
+    {
+      Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
+      xlen = xlen_by_elf = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
+    }
+}
+
+/* Initialization (for arch and options).  */
+
+static void
+init_riscv_dis_state_for_arch_and_options (void)
+{
+  /* Set GPR register names to disassemble.  */
+  riscv_gpr_names = is_numeric ? riscv_gpr_names_numeric : riscv_gpr_names_abi;
+  /* Set FPR register names to disassemble.  */
+  riscv_fpr_names
+      = !riscv_subset_supports (&riscv_rps_dis, "zfinx")
+	    ? (is_numeric ? riscv_fpr_names_numeric : riscv_fpr_names_abi)
+	    : riscv_gpr_names;
+}
 \f
 
 /* Set default RISC-V disassembler options.  */
@@ -79,9 +127,8 @@ static bool no_aliases = false;
 static void
 set_default_riscv_dis_options (void)
 {
-  riscv_gpr_names = riscv_gpr_names_abi;
-  riscv_fpr_names = riscv_fpr_names_abi;
   no_aliases = false;
+  is_numeric = false;
 }
 
 /* Parse RISC-V disassembler option (without arguments).  */
@@ -92,10 +139,7 @@ parse_riscv_dis_option_without_args (const char *option)
   if (strcmp (option, "no-aliases") == 0)
     no_aliases = true;
   else if (strcmp (option, "numeric") == 0)
-    {
-      riscv_gpr_names = riscv_gpr_names_numeric;
-      riscv_fpr_names = riscv_fpr_names_numeric;
-    }
+    is_numeric = true;
   else
     return false;
   return true;
@@ -163,8 +207,6 @@ parse_riscv_dis_options (const char *opts_in)
 {
   char *opts = xstrdup (opts_in), *opt = opts, *opt_end = opts;
 
-  set_default_riscv_dis_options ();
-
   for ( ; opt_end != NULL; opt = opt_end + 1)
     {
       if ((opt_end = strchr (opt, ',')) != NULL)
@@ -705,25 +747,6 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
   matched_op = NULL;
   op = riscv_hash[OP_HASH_IDX (word)];
 
-  /* If XLEN is not known, get its value from the ELF class.  */
-  if (info->mach == bfd_mach_riscv64)
-    xlen = 64;
-  else if (info->mach == bfd_mach_riscv32)
-    xlen = 32;
-  else if (info->section != NULL)
-    {
-      Elf_Internal_Ehdr *ehdr = elf_elfheader (info->section->owner);
-      xlen = ehdr->e_ident[EI_CLASS] == ELFCLASS64 ? 64 : 32;
-    }
-
-  /* If arch has the Zfinx extension, replace FPR with GPR.  */
-  if (riscv_subset_supports (&riscv_rps_dis, "zfinx"))
-    riscv_fpr_names = riscv_gpr_names;
-  else
-    riscv_fpr_names = riscv_gpr_names == riscv_gpr_names_abi
-			  ? riscv_fpr_names_abi
-			  : riscv_fpr_names_numeric;
-
   for (; op && op->name; op++)
     {
       /* Does the opcode match?  */
@@ -867,6 +890,7 @@ riscv_get_map_state (int n,
     {
       riscv_release_subset_list (&riscv_subsets);
       riscv_parse_subset (&riscv_rps_dis, arch);
+      init_riscv_dis_state_for_arch_and_options ();
     }
   return true;
 }
@@ -1063,14 +1087,9 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
   enum riscv_seg_mstate mstate;
   int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
 
-  if (info->disassembler_options != NULL)
-    {
-      parse_riscv_dis_options (info->disassembler_options);
-      /* Avoid repeatedly parsing the options.  */
-      info->disassembler_options = NULL;
-    }
-  else if (riscv_gpr_names == NULL)
-    set_default_riscv_dis_options ();
+  /* Guess and update XLEN if we haven't determined it yet.  */
+  if (xlen == 0)
+    update_riscv_dis_xlen (info);
 
   mstate = riscv_search_mapping_symbol (memaddr, info);
 
@@ -1132,9 +1151,32 @@ riscv_get_disassembler (bfd *abfd)
 
   riscv_release_subset_list (&riscv_subsets);
   riscv_parse_subset (&riscv_rps_dis, default_arch);
+  init_riscv_dis_state_for_arch_and_options ();
   return print_insn_riscv;
 }
 
+/* Initialize disassemble_info and parse options.  */
+
+void
+disassemble_init_riscv (struct disassemble_info *info)
+{
+  info->symbol_is_valid = riscv_symbol_is_valid;
+  /* Clear previous XLEN and guess by mach.  */
+  xlen = 0;
+  xlen_by_mach = 0;
+  xlen_by_elf = 0;
+  if (info->mach == bfd_mach_riscv64)
+    xlen_by_mach = 64;
+  else if (info->mach == bfd_mach_riscv32)
+    xlen_by_mach = 32;
+  update_riscv_dis_xlen (info);
+  /* Parse disassembler options.  */
+  set_default_riscv_dis_options ();
+  if (info->disassembler_options != NULL)
+    parse_riscv_dis_options (info->disassembler_options);
+  init_riscv_dis_state_for_arch_and_options ();
+}
+
 /* Prevent use of the fake labels that are generated as part of the DWARF
    and for relaxable relocations in the assembler.  */
 
-- 
2.37.2


  parent reply	other threads:[~2022-11-15  4:59 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-15  4:52 [PATCH 00/11] RISC-V: Requirements for disassembler optimizations batch 1 Tsukasa OI
2022-11-15  4:52 ` [PATCH 01/11] opcodes/riscv-dis.c: More tidying Tsukasa OI
2022-11-15  4:52 ` [PATCH 02/11] RISC-V: Add test for 'Zfinx' register switching Tsukasa OI
2022-11-15  4:52 ` [PATCH 03/11] RISC-V: Make mapping symbol checking consistent Tsukasa OI
2022-11-15  4:52 ` [PATCH 04/11] RISC-V: Split riscv_get_map_state into two steps Tsukasa OI
2022-11-15  4:52 ` [PATCH 05/11] RISC-V: One time CSR hash table initialization Tsukasa OI
2022-11-15  4:52 ` [PATCH 06/11] RISC-V: Use static xlen on ADDIW sequence Tsukasa OI
2022-11-15  4:52 ` [PATCH 07/11] opcodes/riscv-dis.c: Add form feed for separation Tsukasa OI
2022-11-15  4:52 ` [PATCH 08/11] RISC-V: Split match/print steps on disassembler Tsukasa OI
2022-11-15  4:52 ` Tsukasa OI [this message]
2022-11-15  4:52 ` [PATCH 10/11] RISC-V: Reorganize arch-related initialization and management Tsukasa OI
2022-11-15  4:52 ` [PATCH 11/11] RISC-V: Move disassembler private data initialization Tsukasa OI
2022-11-28  4:43 ` [PATCH v2 00/11] RISC-V: Requirements for disassembler optimizations batch 1 Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 01/11] opcodes/riscv-dis.c: More tidying Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 02/11] RISC-V: Add test for 'Zfinx' register switching Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 03/11] RISC-V: Make mapping symbol checking consistent Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 04/11] RISC-V: Split riscv_get_map_state into two steps Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 05/11] RISC-V: One time CSR hash table initialization Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 06/11] RISC-V: Use static xlen on ADDIW sequence Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 07/11] opcodes/riscv-dis.c: Add form feed for separation Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 08/11] RISC-V: Split match/print steps on disassembler Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 09/11] RISC-V: Reorganize disassembler state initialization Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 10/11] RISC-V: Reorganize arch-related initialization and management Tsukasa OI
2022-11-28  4:43   ` [PATCH v2 11/11] RISC-V: Move disassembler private data initialization Tsukasa OI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12cd8820841f695708875206b6461b6322c74428.1668487922.git.research_trasio@irq.a4lg.com \
    --to=research_trasio@irq.a4lg.com \
    --cc=binutils@sourceware.org \
    --cc=kito.cheng@sifive.com \
    --cc=nelson@rivosinc.com \
    --cc=palmer@dabbelt.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).