From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-sender-0.a4lg.com (mail-sender.a4lg.com [153.120.152.154]) by sourceware.org (Postfix) with ESMTPS id A03813858C53 for ; Sat, 5 Nov 2022 10:47:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A03813858C53 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=irq.a4lg.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=irq.a4lg.com Received: from [127.0.0.1] (localhost [127.0.0.1]) by mail-sender-0.a4lg.com (Postfix) with ESMTPSA id 7B426300089; Sat, 5 Nov 2022 10:47:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irq.a4lg.com; s=2017s01; t=1667645226; bh=11qSSayt5XLQzcO4nbErEIcOHxfZaAwNNoiTc4NB2tA=; h=Message-ID:Date:Mime-Version:Subject:To:References:From: In-Reply-To:Content-Type:Content-Transfer-Encoding; b=a/W+SQQ5n4gF0yUlM2utA9mrwiDko/2bPceTeuLJCSCpZ0q9UDn7VKPz2iEKgYhJe B7tBYh0lB1LuObAuFlazgJtXsp5AETfQFfceK/W41IzsZIjyyKZUtMX4lDPR9Uj/MI 1h/F+LaN+IPkcCqMCMQ/JJYJBCoEbRvjpuZ8DL5c= Message-ID: <78c5a6db-9b85-e931-984b-e4c73362c8c5@irq.a4lg.com> Date: Sat, 5 Nov 2022 19:47:03 +0900 Mime-Version: 1.0 Subject: Re: [PATCH 2/2] RISC-V: Clarify the suppress rule of mapping symbol with architecture string. To: Nelson Chu , binutils@sourceware.org, Kito Cheng References: <20221104115038.8957-1-nelson@rivosinc.com> <20221104115038.8957-2-nelson@rivosinc.com> Content-Language: en-US From: Tsukasa OI In-Reply-To: <20221104115038.8957-2-nelson@rivosinc.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2022/11/04 20:50, Nelson Chu wrote: > From: Tsukasa OI > > Clarify the suppress rule - if the section (segment) only has a mapping symbol > with architecture marked at the first instruction, and the architecture is also > same as the file-level setting, then we can suppress the mapping symbol name to > $x. However, we also need to distinguish the meanings of the suppressed $x in > dis-assembler, since sometimes it means the architecture is same as the previous > one, or the same as the file-level one. For PATCH 1/2, LGTM. I truly appreciate that. For PATCH 2/2, as far as I can tell, it looks not bad to me but with some comments and to-dos (two of which related to Binutils itself). 1. Disassembler part Your implementation itself looks good. I was making multiple part 4/4 disassembler improvement patchsets depending on the possible interpretation of "$x" and this interpretation was my "likely winner". So, I might be forced to remove your disassembler contribution in the process if yours is merged first. I will need to apologize that first. 2. Relationship with Linker The rule: 'the first "$x" symbol in the section means instruction with ELF attributes' makes sense to me but may force the linker to do something more than the current implementation. This is because, the first "$x" symbol in an section *of an object file* might not be the first "$x" symbol in an section of the final (linked) object file. Both my RFC PATCH and your patchset (PATCH 2/2) might complicate the linker because of this property. They replaced the first "$x+arch" with "$x" in the section when the architecture matches ELF attributes in the assembler. However, not doing this like your commit 0ce50fc900a5 ("RISC-V: Always generate mapping symbols at the start of the sections.") might be an option (NOT as a workaround). If this option is acceptable to you, my new proposal on linker is either: (a) Do nothing (just like the old linker) It looks like a joke but I want to say that object files generated from your commit 0ce50fc900a5 is compatible with the old linker, assuming new "$x" interpretation rule. Also, multiple long "$x+arch" symbols do not increase the final file size that much (because the same symbol names are reused on disk). The description above is not strictly true when we consider about multi-toolchain object linking support but still applies on many cases. (b) Replace "$x+arch" with "$x" if the implied architecture depending on the context is either: (1) The same as the final ELF attributes or (2) compatible with that. and an optional handling: (c) (Virtually/actually) replace first "$x" in the section with "$x+arch" with input ELF file's attribute (not output) when preprocessing input object files. Following combinations are possible: 1. None (a) 2. (c) 3. (b.1) 4. (b.1) + (c) 5. (b.2) 6. (b.2) + (c) I'm sorry to say I haven't read GNU ld code much and I'm not sure whether the options (b.{1,2}) are feasible. Option (c) is for much complex situation: multi-toolchain object linking support (e.g. a static library is built with an old toolchain without "$x+arch" but with "$x", application code is built with a new toolchain with "$x+arch" and we link them together with a new [or even newer] one supporting "$x+arch"). 3. Outside Binutils Current mapping symbol proposal: requires some updates because, in that proposal, all "$x" means "revert to default (as in ELF attributes)". So we need to talk to Kito about changing this proposal. Thanks, Tsukasa > > gas/ > * config/tc-riscv.h (struct riscv_segment_info_type): Added new boolean > `arch_changed', to record if the segment has more than one mapping symbol > with architecture. All segments should have at least one at their first > instruction. > * config/tc-riscv.c (riscv_mapping_state): Set `arch_changed' to true if > the segment has more than one mapping symbol with architecture. > (riscv_check_mapping_symbols): Suppress the mapping symbol with architecture > to $x if the section/segment only has this setting and the architecture is > same as the file-level one. > * testsuite/gas/riscv/mapping.s: Updated, and added new testcases, > .text.suppress, .text.suppress.push.pop and .text.no.suppress. > * testsuite/gas/riscv/mapping-dis.d: Updated. > * testsuite/gas/riscv/mapping-symbols.d: Likewise. > * testsuite/gas/riscv/mapping-attr.d: New testcase to see if the file-level > architecure is polluted by other section-level .option settings. > opcodes/ > * riscv-dis.c (riscv_file_subsets): Added to record the file-level subsets. > (riscv_rps_dis): Delay to set subset_list in riscv_get_disassembler or > riscv_get_map_state. > (from_last_map_symbol): Moved from riscv_search_mapping_symbol and make it > global. Set to false means we are dumping a new section. > (riscv_get_map_state): Switch riscv_rps_dis.subset_list to riscv_file_subsets > if the mapping symbol is $x, and we are dumping a new section. Otherwise, > set it to riscv_subsets, and then parsing the architecture from $x+arch. > Besides, add new boolean parameter `update', since we don't need to update > the architecture when called from riscv_data_length. > (riscv_search_mapping_symbol): Updated. > (riscv_get_disassembler): Set the riscv_rps_dis.subset_list to riscv_file_subsets. > > Co-developed-by: Nelson Chu > --- > gas/config/tc-riscv.c | 13 ++++++++ > gas/config/tc-riscv.h | 1 + > gas/testsuite/gas/riscv/mapping-attr.d | 8 +++++ > gas/testsuite/gas/riscv/mapping-dis.d | 23 +++++++++++-- > gas/testsuite/gas/riscv/mapping-symbols.d | 11 ++++++- > gas/testsuite/gas/riscv/mapping.s | 35 +++++++++++++++++--- > opcodes/riscv-dis.c | 39 +++++++++++++++-------- > 7 files changed, 110 insertions(+), 20 deletions(-) > create mode 100644 gas/testsuite/gas/riscv/mapping-attr.d > > diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c > index dcb70f9d7ad..56ece8b2dc3 100644 > --- a/gas/config/tc-riscv.c > +++ b/gas/config/tc-riscv.c > @@ -598,6 +598,7 @@ riscv_mapping_state (enum riscv_seg_mstate to_state, > S_GET_NAME (seg_arch_symbol) + 2) != 0) > { > reset_seg_arch_str = true; > + seg_info (now_seg)->tc_segment_info_data.arch_changed = true; > } > else if (from_state == to_state) > return; > @@ -639,6 +640,18 @@ riscv_check_mapping_symbols (bfd *abfd ATTRIBUTE_UNUSED, > if (seginfo == NULL || seginfo->frchainP == NULL) > return; > > + /* If the section only has a mapping symbol at the first instruction, > + and it is the same as the file-level architecture, then no need to > + add $x+arch for it, just add $x is enough. > + > + Please see gas/testsuite/gas/riscv/mapping.s: + section name>. */ > + symbolS *arch_map_symbol = seginfo->tc_segment_info_data.arch_map_symbol; > + if (!seginfo->tc_segment_info_data.arch_changed > + && arch_map_symbol != 0 > + && strcmp (riscv_file_arch, S_GET_NAME (arch_map_symbol) + 2) == 0) > + S_SET_NAME (arch_map_symbol, "$x"); > + > for (fragp = seginfo->frchainP->frch_root; > fragp != NULL; > fragp = fragp->fr_next) > diff --git a/gas/config/tc-riscv.h b/gas/config/tc-riscv.h > index 19c45ba2d12..0dc567e1665 100644 > --- a/gas/config/tc-riscv.h > +++ b/gas/config/tc-riscv.h > @@ -140,6 +140,7 @@ struct riscv_segment_info_type > enum riscv_seg_mstate map_state; > /* The current mapping symbol with architecture string. */ > symbolS *arch_map_symbol; > + bool arch_changed; > }; > > /* Define target fragment type. */ > diff --git a/gas/testsuite/gas/riscv/mapping-attr.d b/gas/testsuite/gas/riscv/mapping-attr.d > new file mode 100644 > index 00000000000..71bcebf17d4 > --- /dev/null > +++ b/gas/testsuite/gas/riscv/mapping-attr.d > @@ -0,0 +1,8 @@ > +#as: -misa-spec=20191213 > +#source: mapping.s > +#readelf: -A > + > +Attribute Section: riscv > +File Attributes > + Tag_RISCV_arch: "rv32i2p1_f2p2_c2p0_zicsr2p0" > +#... > diff --git a/gas/testsuite/gas/riscv/mapping-dis.d b/gas/testsuite/gas/riscv/mapping-dis.d > index b1a26fbd151..fe581996e3c 100644 > --- a/gas/testsuite/gas/riscv/mapping-dis.d > +++ b/gas/testsuite/gas/riscv/mapping-dis.d > @@ -8,8 +8,8 @@ > Disassembly of section .text.cross.section.A: > > 0+000 : > -[ ]+[0-9a-f]+:[ ]+4505[ ]+li[ ]+a0,1 > -[ ]+[0-9a-f]+:[ ]+bffd[ ]+j[ ]+0 > +[ ]+[0-9a-f]+:[ ]+00100513[ ]+li[ ]+a0,1 > +[ ]+[0-9a-f]+:[ ]+bff5[ ]+j[ ]+0 > > Disassembly of section .text.corss.section.B: > > @@ -91,3 +91,22 @@ Disassembly of section .text.relax.align: > [ ]+[0-9a-f]+:[ ]+00000013[ ]+nop > [ ]+[0-9a-f]+:[ ]+00200513[ ]+li[ ]+a0,2 > [ ]+[0-9a-f]+:[ ]+00000013[ ]+nop > + > +Disassembly of section .text.suppress: > + > +0+000 <.text.suppress>: > +[ ]+[0-9a-f]+:[ ]+003022f3[ ]+frcsr[ ]+t0 > +[ ]+[0-9a-f]+:[ ]+4505[ ]+li[ ]+a0,1 > + > +Disassembly of section .text.suppress.push.pop: > + > +0+000 <.text.suppress.push.pop>: > +[ ]+[0-9a-f]+:[ ]+003022f3[ ]+frcsr[ ]+t0 > +[ ]+[0-9a-f]+:[ ]+4505[ ]+li[ ]+a0,1 > + > +Disassembly of section .text.no.suppress: > + > +0+000 <.text.no.suppress>: > +[ ]+[0-9a-f]+:[ ]+003022f3[ ]+frcsr[ ]+t0 > +[ ]+[0-9a-f]+:[ ]+003022f3[ ]+csrr[ ]+t0,fcsr > +[ ]+[0-9a-f]+:[ ]+003022f3[ ]+frcsr[ ]+t0 > diff --git a/gas/testsuite/gas/riscv/mapping-symbols.d b/gas/testsuite/gas/riscv/mapping-symbols.d > index 40df3409736..db69c05bfa4 100644 > --- a/gas/testsuite/gas/riscv/mapping-symbols.d > +++ b/gas/testsuite/gas/riscv/mapping-symbols.d > @@ -9,7 +9,8 @@ SYMBOL TABLE: > 0+00 l d .data 0+00 .data > 0+00 l d .bss 0+00 .bss > 0+00 l d .text.cross.section.A 0+00 .text.cross.section.A > -0+00 l .text.cross.section.A 0+00 \$xrv32i2p1_c2p0 > +0+00 l .text.cross.section.A 0+00 \$xrv32i2p1 > +0+04 l .text.cross.section.A 0+00 \$xrv32i2p1_c2p0 > 0+00 l d .text.corss.section.B 0+00 .text.corss.section.B > 0+00 l .text.corss.section.B 0+00 \$xrv32i2p1_c2p0 > 0+02 l .text.corss.section.B 0+00 \$xrv32i2p1 > @@ -42,6 +43,14 @@ SYMBOL TABLE: > 0+00 l d .text.relax.align 0+00 .text.relax.align > 0+00 l .text.relax.align 0+00 \$xrv32i2p1_c2p0 > 0+08 l .text.relax.align 0+00 \$xrv32i2p1 > +0+00 l d .text.suppress 0+00 .text.suppress > +0+00 l .text.suppress 0+00 \$x > +0+00 l d .text.suppress.push.pop 0+00 .text.suppress.push.pop > +0+00 l .text.suppress.push.pop 0+00 \$x > +0+00 l d .text.no.suppress 0+00 .text.no.suppress > +0+00 l .text.no.suppress 0+00 \$xrv32i2p1_f2p2_c2p0_zicsr2p0 > +0+04 l .text.no.suppress 0+00 \$xrv32i2p1_c2p0_zicsr2p0 > +0+08 l .text.no.suppress 0+00 \$xrv32i2p1_f2p2_c2p0_zicsr2p0 > 0+0a l .text.section.padding 0+00 \$x > 0+03 l .text.odd.align.start.insn 0+00 \$d > 0+04 l .text.odd.align.start.insn 0+00 \$x > diff --git a/gas/testsuite/gas/riscv/mapping.s b/gas/testsuite/gas/riscv/mapping.s > index 3014a69e792..5041d4d4cd4 100644 > --- a/gas/testsuite/gas/riscv/mapping.s > +++ b/gas/testsuite/gas/riscv/mapping.s > @@ -1,10 +1,11 @@ > -.attribute arch, "rv32ic" > +.attribute arch, "rv32ifc" > .option norelax # FIXME: assembler fill the paddings after parsing everything, > # so we probably won't fill anything for the norelax region when > # the riscv_opts.relax is enabled at somewhere. > > .section .text.cross.section.A, "ax" > .option push > +.option arch, rv32i > .global funcA > funcA: > addi a0, zero, 1 # rv32i > @@ -20,6 +21,7 @@ j funcB # rv32i > > .section .text.data, "ax" > .option push > +.option arch, rv32ic > .word 0 # $d > .long 1 > addi a0, zero, 1 # rv32ic > @@ -35,7 +37,7 @@ addi a0, zero, 2 # $x, but same as previous addi, so removed > .section .text.odd.align.start.insn, "ax" > .option push > .option norelax > -.option arch, +c > +.option arch, rv32ic > addi a0, zero, 1 # $xrv32ic > .byte 1 # $d > .option arch, -c > @@ -46,7 +48,7 @@ addi a0, zero, 2 # $xrv32i > .section .text.odd.align.start.data, "ax" > .option push > .option norelax > -.option arch, +c > +.option arch, rv32ic > .byte 1 # $d > .align 2 # odd alignment, $xrv32ic replaced by $d + $xrv32ic > addi a0, zero, 1 > @@ -55,6 +57,7 @@ addi a0, zero, 1 > .section .text.zero.fill.first, "ax" > .option push > .option norelax > +.option arch, rv32ic > .fill 1, 0, 0 # $d with zero size, removed in make_mapping_symbol > addi a0, zero, 1 # $xrv32ic > .option pop > @@ -62,6 +65,7 @@ addi a0, zero, 1 # $xrv32ic > .section .text.zero.fill.last, "ax" > .option push > .option norelax > +.option arch, rv32ic > addi a0, zero, 1 # $xrv32ic > .fill 1, 0, 0 # $d with zero size, removed in make_mapping_symbol > addi a0, zero, 2 # $x, FIXME: need find a way to remove? > @@ -71,6 +75,7 @@ addi a0, zero, 2 # $x, FIXME: need find a way to remove? > .section .text.zero.fill.align.A, "ax" > .option push > .option norelax > +.option arch, rv32ic > .align 2 # $xrv32ic, .align and .fill are in the different frag, so neither be removed > .fill 1, 0, 0 # $d with zero size, removed in make_mapping_symbol when adding $xrv32ic > addi a0, zero, 1 # $x, should be removed in riscv_check_mapping_symbols > @@ -81,6 +86,7 @@ addi a0, zero, 2 > .section .text.zero.fill.align.B, "ax" > .option push > .option norelax > +.option arch, rv32ic > .align 2 # $xrv32ic, .align and .fill are in the different frag, so neither be removed, > # but will be removed in riscv_check_mapping_symbols > .fill 1, 0, 0 # $d with zero size, removed in make_mapping_symbol when adding $xrv32ic > @@ -92,7 +98,7 @@ addi a0, zero, 2 > .section .text.last.section, "ax" > .option push > .option norelax > -.option arch, -c > +.option arch, rv32i > addi a0, zero, 1 # $xrv32i > .word 1 # $d > .align 2 # zero section padding, $x at the end of section, removed in riscv_check_mapping_symbols > @@ -101,6 +107,7 @@ addi a0, zero, 1 # $xrv32i > .section .text.section.padding, "ax" > .option push > .option norelax > +.option arch, rv32ic > .align 2 > addi a0, zero, 1 # $rv32ic > .option arch, +a > @@ -119,3 +126,23 @@ addi a0, zero, 1 # $x, won't added > .align 3 # $x, won't added > addi a0, zero, 2 # $xrv32i > .option pop > + > +.section .text.suppress, "ax" > +csrrs t0, fcsr, zero # suppress $xrv32ifc to $x > +addi a0, zero, 1 > + > +.section .text.suppress.push.pop, "ax" > +.option push > +.option arch, rv32i > +.option arch, +f, +c > +csrrs t0, fcsr, zero # suppress $xrv32ifc to $x > +addi a0, zero, 1 > +.option pop > + > +.section .text.no.suppress, "ax" > +csrrs t0, fcsr, zero # $xrv32ifc, since not only one $x+arch, so don't suppress > +.option push > +.option arch, -f > +csrrs t0, fcsr, zero # $xrv32ic > +.option pop > +csrrs t0, fcsr, zero # $xrv32ifc, likewise > diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c > index 3a31647a2f8..f935490c64f 100644 > --- a/opcodes/riscv-dis.c > +++ b/opcodes/riscv-dis.c > @@ -42,10 +42,11 @@ static enum riscv_spec_class default_isa_spec = ISA_SPEC_CLASS_DRAFT - 1; > (as specified by the ELF attributes or the `priv-spec' option). */ > static enum riscv_spec_class default_priv_spec = PRIV_SPEC_CLASS_NONE; > > +static riscv_subset_list_t riscv_file_subsets; > static riscv_subset_list_t riscv_subsets; > static riscv_parse_subset_t riscv_rps_dis = > { > - &riscv_subsets, /* subset_list. */ > + NULL, /* subset_list. */ > opcodes_error_handler,/* error_handler. */ > &xlen, /* xlen. */ > &default_isa_spec, /* isa_spec. */ > @@ -64,6 +65,7 @@ struct riscv_private_data > /* Used for mapping symbols. */ > static int last_map_symbol = -1; > static bfd_vma last_stop_offset = 0; > +static bool from_last_map_symbol = false; > > /* Register names as used by the disassembler. */ > static const char * const *riscv_gpr_names; > @@ -821,7 +823,8 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) > static bool > riscv_get_map_state (int n, > enum riscv_seg_mstate *state, > - struct disassemble_info *info) > + struct disassemble_info *info, > + bool update) > { > const char *name; > > @@ -831,15 +834,25 @@ riscv_get_map_state (int n, > return false; > > name = bfd_asymbol_name(info->symtab[n]); > - if (strcmp (name, "$x") == 0) > - *state = MAP_INSN; > - else if (strcmp (name, "$d") == 0) > + if (strcmp (name, "$d") == 0) > *state = MAP_DATA; > + else if (strcmp (name, "$x") == 0) > + { > + *state = MAP_INSN; > + /* Switch back to file-level architecture when starting from > + a new section. */ > + if (update && !from_last_map_symbol) > + riscv_rps_dis.subset_list = &riscv_file_subsets; > + } > else if (strncmp (name, "$xrv", 4) == 0) > { > *state = MAP_INSN; > - riscv_release_subset_list (&riscv_subsets); > - riscv_parse_subset (&riscv_rps_dis, name + 2); > + if (update) > + { > + riscv_rps_dis.subset_list = &riscv_subsets; > + riscv_release_subset_list (riscv_rps_dis.subset_list); > + riscv_parse_subset (&riscv_rps_dis, name + 2); > + } > } > else > return false; > @@ -855,7 +868,6 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > struct disassemble_info *info) > { > enum riscv_seg_mstate mstate; > - bool from_last_map_symbol; > bool found = false; > int symbol = -1; > int n; > @@ -872,7 +884,7 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour) > return mstate; > > - /* Reset the last_map_symbol if we start to dump a new section. */ > + /* Reset the last_map_symbol when pc <= 0. */ > if (memaddr <= 0) > last_map_symbol = -1; > > @@ -895,7 +907,7 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > /* We have searched all possible symbols in the range. */ > if (addr > memaddr) > break; > - if (riscv_get_map_state (n, &mstate, info)) > + if (riscv_get_map_state (n, &mstate, info, true/* update */)) > { > symbol = n; > found = true; > @@ -922,7 +934,7 @@ riscv_search_mapping_symbol (bfd_vma memaddr, > if (addr < (info->section ? info->section->vma : 0)) > break; > /* Stop searching once we find the closed mapping symbol. */ > - if (riscv_get_map_state (n, &mstate, info)) > + if (riscv_get_map_state (n, &mstate, info, true/* update */)) > { > symbol = n; > found = true; > @@ -958,7 +970,7 @@ riscv_data_length (bfd_vma memaddr, > { > bfd_vma addr = bfd_asymbol_value (info->symtab[n]); > if (addr > memaddr > - && riscv_get_map_state (n, &m, info)) > + && riscv_get_map_state (n, &m, info, false/* update */)) > { > if (addr - memaddr < length) > length = addr - memaddr; > @@ -1106,7 +1118,8 @@ riscv_get_disassembler (bfd *abfd) > } > } > > - riscv_release_subset_list (&riscv_subsets); > + riscv_rps_dis.subset_list = &riscv_file_subsets; > + riscv_release_subset_list (riscv_rps_dis.subset_list); > riscv_parse_subset (&riscv_rps_dis, default_arch); > return print_insn_riscv; > }