public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
From: Tsukasa OI <research_trasio@irq.a4lg.com>
To: Nelson Chu <nelson.chu@sifive.com>
Cc: Binutils <binutils@sourceware.org>,
	jim.wilson.gcc@gmail.com, Palmer Dabbelt <palmer@rivosinc.com>,
	Kito Cheng <kito.cheng@sifive.com>,
	Jessica Clarke <jrtc27@jrtc27.com>,
	luismarques@lowrisc.org, maskray@google.com, asb@asbradbury.org
Subject: Re: [PATCH] RISC-V: Output mapping symbols with ISA string once .option arch is used.
Date: Fri, 5 Aug 2022 19:31:58 +0900	[thread overview]
Message-ID: <86f551b9-5096-7ec0-f27d-cab3ae198499@irq.a4lg.com> (raw)
In-Reply-To: <1659692183-5682-1-git-send-email-nelson.chu@sifive.com>

LGTM.

Also, it's very interesting because it enables the disassembler to
process multi-arch ELF files (e.g. with optimized code paths).  I have a
plan to make the disassembler much faster, flexible and maintainable and
I think my changes can be also built on your changes.

Part of my ongoing (not submitted) works:
<https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_opts_batch_1>
<https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_arch_priv_spec>
<https://github.com/a4lg/binutils-gdb/wiki/riscv_dis_opts_batch_2>
(I plan to submit them after Zfinx and some disassembler changes are
merged upstream.)

Two comments follow:

On 2022/08/05 18:36, Nelson Chu wrote:
> [1] Proposal: Extend .option directive for control enabled extensions on
> specific code region,
> https://github.com/riscv-non-isa/riscv-asm-manual/pull/67
> 
> [2] Proposal: Add mapping symbol,
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/196
> 
> Originally, the extended .option directive [1] was used to resolve the known
> ifunc issue, but it also allows users to enable and disable extensions of a
> certain code region at will.  Once the extensions have overlapped encodings,
> it is difficult to dump the instructions correctly for now.  Therefore, the
> mapping symbols with ISA string [2] for different code regions was proposed
> to resolve the dump issue, and this patch is for that purpose.
> 
> The riscv-gnu-toolchain regressions passed for this patch.
> 
> gas/
>     * config/tc-riscv.c (updated_riscv_subsets): New boolean.  Output the
>     instruction mapping symbols with ISA string if it is true, otherwise
>     set it to false.
>     (remove_mapping_symbol): New function.  Remove the previous mapping
>     symbol by checking the current one.
>     (make_mapping_symbol): Added the current ISA string to $x when
>     updated_riscv_subsets is true.  Besides, call remove_mapping_symbol
>     rather than jusr remove the previous symbol.

Typo: just?

>     (riscv_mapping_state): Handle new mapping type MAP_INSN_ARCH, and also
>     call remove_mapping_symbol before removing it.
>     (s_riscv_option): Set updated_riscv_subsets to true once we meet .option
>     rvc/norvc/arch/pop directives.
>     * testsuite/gas/riscv/mapping-01a.d: Updated.
>     * testsuite/gas/riscv/mapping-02a.d: Likewise.
>     * testsuite/gas/riscv/mapping-03a.d: Likewise.
>     * testsuite/gas/riscv/mapping-04a.d: Likewise.
>     * testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
>     * testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.
>     * testsuite/gas/riscv/option-arch-01a.d: Dump frcsr since f should be
>     enabled in that code region.
> include/
>     * opcode/riscv.h (enum riscv_seg_mstat): Added MAP_INSN_ARCH, means
>     instructions with specific extensions.
> opcodes/
>     * riscv-dis.c (riscv_disassemble_insn): Set riscv_fpr_names back to
>     riscv_fpr_names_abi or riscv_fpr_names_numeric when zfinx is disabled
>     for some specfic code region.
>     (riscv_get_map_state): Recognized mapping symbols with ISA, and then
>     reset the architecture string once the ISA is different.
> ---
>  gas/config/tc-riscv.c                         | 63 ++++++++++++++++++++++-----
>  gas/testsuite/gas/riscv/mapping-01a.d         |  2 +-
>  gas/testsuite/gas/riscv/mapping-02a.d         |  2 +-
>  gas/testsuite/gas/riscv/mapping-03a.d         |  2 +-
>  gas/testsuite/gas/riscv/mapping-04a.d         |  2 +-
>  gas/testsuite/gas/riscv/mapping-norelax-03a.d |  2 +-
>  gas/testsuite/gas/riscv/mapping-norelax-04a.d |  2 +-
>  gas/testsuite/gas/riscv/option-arch-01a.d     |  2 +-
>  include/opcode/riscv.h                        |  1 +
>  opcodes/riscv-dis.c                           | 12 ++++-
>  10 files changed, 71 insertions(+), 19 deletions(-)
> 
> diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
> index 34ce68e..f61dd86 100644
> --- a/gas/config/tc-riscv.c
> +++ b/gas/config/tc-riscv.c
> @@ -282,6 +282,9 @@ struct riscv_option_stack
>  
>  static struct riscv_option_stack *riscv_opts_stack = NULL;
>  
> +/* Output the instruction mapping symbols with ISA string if it is true.  */
> +static bool updated_riscv_subsets = false;
> +
>  /* Set which ISA and extensions are available.  */
>  
>  static void
> @@ -445,6 +448,22 @@ static char *expr_end;
>  #define OPCODE_MATCHES(OPCODE, OP) \
>    (((OPCODE) & MASK_##OP) == MATCH_##OP)
>  
> +/* Remove PREVIOUS mapping symbol by checking the CURRENT.  */
> +
> +static void
> +remove_mapping_symbol (symbolS *previous, symbolS *current)
> +{
> +  /* If the mapping symbol we add isn't MAP_INSN_ARCH, but
> +     the removed previous mapping symbol is MAP_INSN_ARCH,
> +     then the next MAP_INSN should be converted to MAP_INSN_ARCH.  */
> +  if (current != NULL
> +      && !strncmp (S_GET_NAME (current), "$xrv", 4) == 0
> +      && strncmp (S_GET_NAME (previous), "$xrv", 4) == 0)
> +    updated_riscv_subsets = true;
> +
> +  symbol_remove (previous, &symbol_rootP, &symbol_lastP);
> +}
> +
>  /* Create a new mapping symbol for the transition to STATE.  */
>  
>  static void
> @@ -452,14 +471,28 @@ make_mapping_symbol (enum riscv_seg_mstate state,
>  		     valueT value,
>  		     fragS *frag)
>  {
> -  const char *name;
> +  const char *name = NULL;
> +  char *buff = NULL;
> +
>    switch (state)
>      {
>      case MAP_DATA:
>        name = "$d";
>        break;
>      case MAP_INSN:
> -      name = "$x";
> +    case MAP_INSN_ARCH:
> +      if (updated_riscv_subsets)
> +	{
> +	  char *isa = riscv_arch_str (xlen, riscv_subsets);
> +	  size_t size = strlen (isa) + 3; /* "rv" + '\0'  */
> +	  buff = xmalloc (size);
> +	  snprintf (buff, size, "$x%s", isa);
> +	  xfree ((void *) isa);
> +	  name = buff;
> +	  updated_riscv_subsets = false;
> +	}
> +      else
> +	name = "$x";
>        break;
>      default:
>        abort ();
> @@ -467,19 +500,19 @@ make_mapping_symbol (enum riscv_seg_mstate state,
>  
>    symbolS *symbol = symbol_new (name, now_seg, frag, value);
>    symbol_get_bfdsym (symbol)->flags |= (BSF_NO_FLAGS | BSF_LOCAL);
> +  xfree ((void *) buff);
>  
>    /* If .fill or other data filling directive generates zero sized data,
>       or we are adding odd alignemnts, then the mapping symbol for the
>       following code will have the same value.  */
>    if (value == 0)
>      {
> -       if (frag->tc_frag_data.first_map_symbol != NULL)
> +      if (frag->tc_frag_data.first_map_symbol != NULL)
>  	{
>  	  know (S_GET_VALUE (frag->tc_frag_data.first_map_symbol)
>  		== S_GET_VALUE (symbol));
>  	  /* Remove the old one.  */
> -	  symbol_remove (frag->tc_frag_data.first_map_symbol,
> -			 &symbol_rootP, &symbol_lastP);
> +	  remove_mapping_symbol (frag->tc_frag_data.first_map_symbol, symbol);
>  	}
>        frag->tc_frag_data.first_map_symbol = symbol;
>      }
> @@ -491,8 +524,7 @@ make_mapping_symbol (enum riscv_seg_mstate state,
>        /* Remove the old one.  */
>        if (S_GET_VALUE (frag->tc_frag_data.last_map_symbol)
>  	  == S_GET_VALUE (symbol))
> -	symbol_remove (frag->tc_frag_data.last_map_symbol,
> -		       &symbol_rootP, &symbol_lastP);
> +	remove_mapping_symbol (frag->tc_frag_data.last_map_symbol, symbol);
>      }
>    frag->tc_frag_data.last_map_symbol = symbol;
>  }
> @@ -514,9 +546,14 @@ riscv_mapping_state (enum riscv_seg_mstate to_state,
>        || !subseg_text_p (now_seg))
>      return;
>  
> +  if (to_state == MAP_INSN && updated_riscv_subsets)
> +    to_state = MAP_INSN_ARCH;
> +
>    /* The mapping symbol should be emitted if not in the right
> -     mapping state  */
> -  if (from_state == to_state)
> +     mapping state.  */
> +  if ((from_state == MAP_DATA && to_state == MAP_DATA)
> +      || (from_state == MAP_INSN && to_state == MAP_INSN)
> +      || (from_state == MAP_INSN_ARCH && to_state == MAP_INSN))
>      return;
>  
>    valueT value = (valueT) (frag_now_fix () - max_chars);
> @@ -573,7 +610,7 @@ riscv_check_mapping_symbols (bfd *abfd ATTRIBUTE_UNUSED,
>  	    {
>  	      /* The last mapping symbol overlaps with another one
>  		 which at the start of the next frag.  */
> -	      symbol_remove (last, &symbol_rootP, &symbol_lastP);
> +	      remove_mapping_symbol (last, NULL);
>  	      break;
>  	    }
>  
> @@ -581,7 +618,7 @@ riscv_check_mapping_symbols (bfd *abfd ATTRIBUTE_UNUSED,
>  	    {
>  	      /* The last mapping symbol is at the end of the section.  */
>  	      know (next->fr_fix == 0 && next->fr_var == 0);
> -	      symbol_remove (last, &symbol_rootP, &symbol_lastP);
> +	      remove_mapping_symbol (last, NULL);
>  	      break;
>  	    }
>  
> @@ -3855,11 +3892,13 @@ s_riscv_option (int x ATTRIBUTE_UNUSED)
>    if (strcmp (name, "rvc") == 0)
>      {
>        riscv_update_subset (&riscv_rps_as, "+c");
> +      updated_riscv_subsets = true;
>        riscv_set_rvc (true);
>      }
>    else if (strcmp (name, "norvc") == 0)
>      {
>        riscv_update_subset (&riscv_rps_as, "-c");
> +      updated_riscv_subsets = true;
>        riscv_set_rvc (false);
>      }
>    else if (strcmp (name, "pic") == 0)
> @@ -3880,6 +3919,7 @@ s_riscv_option (int x ATTRIBUTE_UNUSED)
>        if (ISSPACE (*name) && *name != '\0')
>  	name++;
>        riscv_update_subset (&riscv_rps_as, name);
> +      updated_riscv_subsets = true;
>  
>        riscv_set_rvc (false);
>        if (riscv_subset_supports (&riscv_rps_as, "c"))
> @@ -3911,6 +3951,7 @@ s_riscv_option (int x ATTRIBUTE_UNUSED)
>  	  riscv_opts = s->options;
>  	  riscv_subsets = s->subset_list;
>  	  riscv_rps_as.subset_list = riscv_subsets;
> +	  updated_riscv_subsets = true;
>  	  riscv_release_subset_list (release_subsets);
>  	  free (s);
>  	}
> diff --git a/gas/testsuite/gas/riscv/mapping-01a.d b/gas/testsuite/gas/riscv/mapping-01a.d
> index 32e0027..2aefe5a 100644
> --- a/gas/testsuite/gas/riscv/mapping-01a.d
> +++ b/gas/testsuite/gas/riscv/mapping-01a.d
> @@ -8,7 +8,7 @@ SYMBOL TABLE:
>  0+00 l    d  .text	0+00 .text
>  0+00 l    d  .data	0+00 .data
>  0+00 l    d  .bss	0+00 .bss
> -0+00 l       .text	0+00 \$x
> +0+00 l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+00 l    d  .foo	0+00 .foo
>  0+00 l       .foo	0+00 foo
>  0+00 l       .foo	0+00 \$x
> diff --git a/gas/testsuite/gas/riscv/mapping-02a.d b/gas/testsuite/gas/riscv/mapping-02a.d
> index 333f12c..155ae02 100644
> --- a/gas/testsuite/gas/riscv/mapping-02a.d
> +++ b/gas/testsuite/gas/riscv/mapping-02a.d
> @@ -9,7 +9,7 @@ SYMBOL TABLE:
>  0+00 l    d  .data	0+00 .data
>  0+00 l    d  .bss	0+00 .bss
>  0+00 l       .text	0+00 \$d
> -0+04 l       .text	0+00 \$x
> +0+04 l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+0c l       .text	0+00 \$d
>  0+0e l       .text	0+00 \$x
>  0+00 l    d  .riscv.attributes	0+00 .riscv.attributes
> diff --git a/gas/testsuite/gas/riscv/mapping-03a.d b/gas/testsuite/gas/riscv/mapping-03a.d
> index d3663b6..36646bf 100644
> --- a/gas/testsuite/gas/riscv/mapping-03a.d
> +++ b/gas/testsuite/gas/riscv/mapping-03a.d
> @@ -8,7 +8,7 @@ SYMBOL TABLE:
>  0+00 l    d  .text	0+00 .text
>  0+00 l    d  .data	0+00 .data
>  0+00 l    d  .bss	0+00 .bss
> -0+00 l       .text	0+00 \$x
> +0+00 l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+04 l       .text	0+00 \$d
>  0+08 l       .text	0+00 \$x
>  0+14 l       .text	0+00 \$d
> diff --git a/gas/testsuite/gas/riscv/mapping-04a.d b/gas/testsuite/gas/riscv/mapping-04a.d
> index 1ae9653..13d17bc 100644
> --- a/gas/testsuite/gas/riscv/mapping-04a.d
> +++ b/gas/testsuite/gas/riscv/mapping-04a.d
> @@ -9,7 +9,7 @@ SYMBOL TABLE:
>  0+00 l    d  .data	0+00 .data
>  0+00 l    d  .bss	0+00 .bss
>  0+00 l       .text	0+00 \$d
> -0+0d l       .text	0+00 \$x
> +0+0d l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+15 l       .text	0+00 \$d
>  0+1f l       .text	0+00 \$x
>  0+00 l    d  .riscv.attributes	0+00 .riscv.attributes
> diff --git a/gas/testsuite/gas/riscv/mapping-norelax-03a.d b/gas/testsuite/gas/riscv/mapping-norelax-03a.d
> index 916f732..dd92854 100644
> --- a/gas/testsuite/gas/riscv/mapping-norelax-03a.d
> +++ b/gas/testsuite/gas/riscv/mapping-norelax-03a.d
> @@ -8,7 +8,7 @@ SYMBOL TABLE:
>  0+00 l    d  .text	0+00 .text
>  0+00 l    d  .data	0+00 .data
>  0+00 l    d  .bss	0+00 .bss
> -0+00 l       .text	0+00 \$x
> +0+00 l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+04 l       .text	0+00 \$d
>  0+08 l       .text	0+00 \$x
>  0+10 l       .text	0+00 \$d
> diff --git a/gas/testsuite/gas/riscv/mapping-norelax-04a.d b/gas/testsuite/gas/riscv/mapping-norelax-04a.d
> index d552a7f..a22da2c 100644
> --- a/gas/testsuite/gas/riscv/mapping-norelax-04a.d
> +++ b/gas/testsuite/gas/riscv/mapping-norelax-04a.d
> @@ -12,5 +12,5 @@ SYMBOL TABLE:
>  0+14 l       .text	0+00 \$d
>  0+1e l       .text	0+00 \$x
>  0+0d l       .text	0+00 \$d
> -0+0e l       .text	0+00 \$x
> +0+0e l       .text	0+00 \$xrv64i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
>  0+00 l    d  .riscv.attributes	0+00 .riscv.attributes
> diff --git a/gas/testsuite/gas/riscv/option-arch-01a.d b/gas/testsuite/gas/riscv/option-arch-01a.d
> index aed4ca8..1d14c60 100644
> --- a/gas/testsuite/gas/riscv/option-arch-01a.d
> +++ b/gas/testsuite/gas/riscv/option-arch-01a.d
> @@ -10,5 +10,5 @@ Disassembly of section .text:
>  0+000 <.text>:
>  [ 	]+[0-9a-f]+:[  	]+952e[    	]+add[        	]+a0,a0,a1
>  [ 	]+[0-9a-f]+:[  	]+00b50533[    	]+add[        	]+a0,a0,a1
> -[ 	]+[0-9a-f]+:[  	]+00302573[    	]+csrr[        	]+a0,fcsr
> +[ 	]+[0-9a-f]+:[  	]+00302573[    	]+frcsr[        	]+a0
>  #...
> diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h
> index b115e33..3721131 100644
> --- a/include/opcode/riscv.h
> +++ b/include/opcode/riscv.h
> @@ -515,6 +515,7 @@ enum riscv_seg_mstate
>    MAP_NONE = 0,		/* Must be zero, for seginfo in new sections.  */
>    MAP_DATA,		/* Data.  */
>    MAP_INSN,		/* Instructions.  */
> +  MAP_INSN_ARCH,	/* Instructions with specific extensions.  */
>  };
>  
>  extern const char * const riscv_gpr_names_numeric[NGPR];
> diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
> index 164fd20..e7eb641 100644
> --- a/opcodes/riscv-dis.c
> +++ b/opcodes/riscv-dis.c
> @@ -640,6 +640,9 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
>        /* If arch has ZFINX flags, use gpr for disassemble.  */
>        if(riscv_subset_supports (&riscv_rps_dis, "zfinx"))
>  	riscv_fpr_names = riscv_gpr_names;
> +      else
> +	riscv_fpr_names = riscv_gpr_names == riscv_gpr_names_abi ?
> +			  riscv_fpr_names_abi : riscv_fpr_names_numeric;
>  
>        for (; op->name; op++)
>  	{
> @@ -750,7 +753,14 @@ riscv_get_map_state (int n,
>      return false;
>  
>    name = bfd_asymbol_name(info->symtab[n]);
> -  if (strcmp (name, "$x") == 0)
> +  if (strncmp (name, "$xrv", 4) == 0)
> +    {
> +      *state = MAP_INSN_ARCH;
> +
> +      riscv_release_subset_list (&riscv_subsets);
> +      riscv_parse_subset (&riscv_rps_dis, name + 2);
> +    }
> +  else if (strcmp (name, "$x") == 0)
>      *state = MAP_INSN;

It seems, if we meet "$xrv...." and then "$x" (with no ISA), it does not
change the ISA (specified by "$xrv...") but it's against the current
proposal.

> an instruction mapping symobl
> without ISA string means using ISA configuration from ELF attribute.

I think we need to save the ISA string from the ELF attributes.
In my upcoming changes, it does save the ISA string from the ELF
attributes ("RISC-V: Minimize arch-related initialization") for a
different reason:

https://github.com/a4lg/binutils-gdb/pull/28/commits/3d553885ab46570c5fc4c810c6dc6c90b9bf3fc4


>    else if (strcmp (name, "$d") == 0)
>      *state = MAP_DATA;

Thanks,
Tsukasa

  reply	other threads:[~2022-08-05 10:32 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-05  9:36 Nelson Chu
2022-08-05 10:31 ` Tsukasa OI [this message]
2022-08-11  7:00 ` [RFC PATCH 0/5] RISC-V: Support mapping symbols with ISA string Tsukasa OI
2022-08-11  7:00   ` [RFC PATCH 1/5] RISC-V: Use bool on riscv_set_options members Tsukasa OI
2022-08-11  7:00   ` [RFC PATCH 2/5] gas: Copyediting on tc-riscv.c Tsukasa OI
2022-08-11  7:00   ` [RFC PATCH 3/5] RISC-V: Mapping symbols with ISA string on assembler Tsukasa OI
2022-08-11  7:31     ` Jan Beulich
2022-08-11 11:43       ` Tsukasa OI
2022-08-11 12:12         ` Jan Beulich
2022-08-11  7:00   ` [RFC PATCH 4/5] RISC-V: Mapping symbols with ISA string on disassembler Tsukasa OI
2022-08-11  7:00   ` [RFC PATCH 5/5] RISC-V: New testcases for mapping symbols w/ ISA Tsukasa OI

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86f551b9-5096-7ec0-f27d-cab3ae198499@irq.a4lg.com \
    --to=research_trasio@irq.a4lg.com \
    --cc=asb@asbradbury.org \
    --cc=binutils@sourceware.org \
    --cc=jim.wilson.gcc@gmail.com \
    --cc=jrtc27@jrtc27.com \
    --cc=kito.cheng@sifive.com \
    --cc=luismarques@lowrisc.org \
    --cc=maskray@google.com \
    --cc=nelson.chu@sifive.com \
    --cc=palmer@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).