* [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler @ 2022-06-03 12:05 Tsukasa OI 2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI ` (3 more replies) 0 siblings, 4 replies; 5+ messages in thread From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw) To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils Hello, I noticed another disassembler issue and this is a fix. As I submitted in my previous patchset: <https://sourceware.org/pipermail/binutils/2022-June/121159.html>, there is a problem while disassembling zext.h instruction on RV{32,64}_Zbb_Zbkb configuration. On RV32, zext.h on Zbb is a specialized form of pack (Zbkb). On RV64, zext.h on Zbb is a specialized form of packw (Zbkb). Note that, if Zbkb extension is disabled, zext.h is a single instruction with no "generalized" forms. When **both** Zbb and Zbkb extensions are enabled and a zext.h instruction is disassembled with `-M no-aliases', it should print a non-alias instruction (that is, either pack or packw). # Input source file _start: zext.h a0, a1 # Specialized form on Zbb # Expected (RV32_Zbb_Zbkb, objdump -d -M no-aliases) 80000028 <_start>: 80000028: 0805c533 pack a0,a1,zero However, it prints an alias. # Actual (RV32_Zbb_Zbkb, objdump -d -M no-aliases) 80000028 <_start>: 80000028: 0805c533 zext.h a0,a1 Note that, depending on -march, an "alias" is no longer an alias (as I noted earlier). # Expected/Actual (RV32_Zbb, objdump -d -M no-aliases) 80000028 <_start>: 80000028: 0805c533 zext.h a0,a1 In general, this kind of issue occurs when: 1. There are two instructions (one of them is a specialized form of another). 2. Those requirements are different (separate) but can co-exist. Because of 2., both instructions cannot be simple aliases (INSN_ALIAS cannot be used). However on non-alias instructions, if a match is found, riscv_disassemble_insn thinks this is it and quits too early. Because zext.h is declared before pack and packw, generalized forms are not matched for zext.h. As a solution, I propose a new concept: generic subsets. For instructions with INSN_GENERICS, opcode matching cannot early-quit. Instead it searches an instruction with: - Longest mask (by default; when -M no-aliases is NOT specified) - Shortest mask (when -M no-aliases is specified) "Length" of the mask equals population count. More one bits on the mask means more specialized instruction form. It fixes disassembler on following instructions and configurations: - zext.h (Zbb) <--> pack (Zbkb; RV32) - zext.h (Zbb) <--> packw (Zbkb; RV64) Note that INSN_GENERICS (new flag in PATCH 2) must be set **both** on specialized and generic forms. In the example above, those instructions require INSN_GENERICS: - zext.h (two XLEN-specific forms) - pack - packw This concept can be used to following instruction pairs where the same issue can occur once non-ratified instruction is ratified. - orc.b (Zbb) <--> gorci (NOT RATIFIED) - brev8 (Zbkb) <--> grevi (NOT RATIFIED) - zip (Zbkb) <--> shfli (NOT RATIFIED) - unzip (Zbkb) <--> unshfli (NOT RATIFIED) - rev8 (Zbb/Zbkb) <--> grevi (NOT RATIFIED) PATCH 1: Reorganize riscv_disassemble_insn function riscv_disassemble_insn function (before PATCH 1) works like this: FOREACH (opcode) { if (!opcode.MATCH(insn)) continue; // // Print insn with matched opcode. // return insnlen; } // // opcode not matched. // Because instruction printing code is in the opcode loop, this is not good to fix the problem. This patch reorganizes the function like this: matched_opcode = NULL; // Step 1: opcode matching FOREACH (opcode) { if (!opcode.MATCH(insn)) continue; matched_opcode = opcode; break; } // Step 2: opcode printing if (matched_opcode != NULL) { // // Print insn with matched_opcode. // return insnlen; } // // opcode not matched. // PATCH 1 splits opcode loop to two separate steps: opcode matching and printing. With this, we can modify opcode matching step to implement "generic subsets". PATCH 2: Implement "generic subsets" With this commit, riscv_disassemble_insn function works like this: int masklen; matched_opcode = NULL; // Step 1: opcode matching FOREACH (opcode) { if (!opcode.MATCH(insn)) continue; if (!(opcode.pinfo & INSN_GENERICS)) { matched_opcode = opcode; break; // Early-quit as before (for non generic subsets) } // Handling of generic subsets if (matched_opcode == NULL) { matched_opcode = opcode; // Set corresponding masklen. } else if (no_aliases) { // Prefer opcode with shortest mask. // Update match_opcode and masklen where necessary. } else { // Prefer opcode with longest mask. // Update match_opcode and masklen where necessary. } } // Step 2: opcode printing (continues...) PATCH 2 also adds INSN_GENERICS flag to two zext.h forms and pack/packw instructions. PATCH 3 contains additional testcases. Thanks, Tsukasa Tsukasa OI (3): RISC-V: Split disasm opcode matching and printing RISC-V: Implement "generic subsets" for disasm RISC-V: Add testcases disassembling zext.h gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 + gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++ gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 + include/opcode/riscv.h | 4 + opcodes/riscv-dis.c | 93 ++++++++++++-------- opcodes/riscv-opc.c | 8 +- 9 files changed, 114 insertions(+), 41 deletions(-) create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s base-commit: 625b6eae091709b95471eae92d42dc6bc71e6553 -- 2.34.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/3] RISC-V: Split disasm opcode matching and printing 2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI @ 2022-06-03 12:05 ` Tsukasa OI 2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI ` (2 subsequent siblings) 3 siblings, 0 replies; 5+ messages in thread From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw) To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils This commit splits instruction handling on riscv_disassemble_insn function to two separate steps (opcode matching and printing). This is a preparation for much complex opcode matching. opcodes/ChangeLog: * riscv-dis.c (riscv_disassemble_insn): Split instruction handling to two separate steps: opcode matching and printing. --- opcodes/riscv-dis.c | 85 ++++++++++++++++++++++++--------------------- 1 file changed, 46 insertions(+), 39 deletions(-) diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c index 9ff31167775..48858e61c38 100644 --- a/opcodes/riscv-dis.c +++ b/opcodes/riscv-dis.c @@ -572,7 +572,7 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info static int riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) { - const struct riscv_opcode *op; + const struct riscv_opcode *op, *matched_op; static bool init = 0; static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1]; struct riscv_private_data *pd; @@ -624,6 +624,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) info->target2 = 0; op = riscv_hash[OP_HASH_IDX (word)]; + matched_op = NULL; if (op != NULL) { /* If XLEN is not known, get its value from the ELF class. */ @@ -656,49 +657,55 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class)) continue; - /* It's a match. */ - (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic, - "%s", op->name); - print_insn_args (op->args, word, memaddr, info); + matched_op = op; + break; + } + } - /* Try to disassemble multi-instruction addressing sequences. */ - if (pd->print_addr != (bfd_vma)-1) - { - info->target = pd->print_addr; - (*info->fprintf_styled_func) - (info->stream, dis_style_comment_start, " # "); - (*info->print_address_func) (info->target, info); - pd->print_addr = -1; - } + /* There is a match. */ + if (matched_op != NULL) + { + (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic, + "%s", matched_op->name); + print_insn_args (matched_op->args, word, memaddr, info); - /* Finish filling out insn_info fields. */ - switch (op->pinfo & INSN_TYPE) - { - case INSN_BRANCH: - info->insn_type = dis_branch; - break; - case INSN_CONDBRANCH: - info->insn_type = dis_condbranch; - break; - case INSN_JSR: - info->insn_type = dis_jsr; - break; - case INSN_DREF: - info->insn_type = dis_dref; - break; - default: - break; - } + /* Try to disassemble multi-instruction addressing sequences. */ + if (pd->print_addr != (bfd_vma)-1) + { + info->target = pd->print_addr; + (*info->fprintf_styled_func) + (info->stream, dis_style_comment_start, " # "); + (*info->print_address_func) (info->target, info); + pd->print_addr = -1; + } - if (op->pinfo & INSN_DATA_SIZE) - { - int size = ((op->pinfo & INSN_DATA_SIZE) - >> INSN_DATA_SIZE_SHIFT); - info->data_size = 1 << (size - 1); - } + /* Finish filling out insn_info fields. */ + switch (matched_op->pinfo & INSN_TYPE) + { + case INSN_BRANCH: + info->insn_type = dis_branch; + break; + case INSN_CONDBRANCH: + info->insn_type = dis_condbranch; + break; + case INSN_JSR: + info->insn_type = dis_jsr; + break; + case INSN_DREF: + info->insn_type = dis_dref; + break; + default: + break; + } - return insnlen; + if (matched_op->pinfo & INSN_DATA_SIZE) + { + int size = ((matched_op->pinfo & INSN_DATA_SIZE) + >> INSN_DATA_SIZE_SHIFT); + info->data_size = 1 << (size - 1); } + + return insnlen; } /* We did not find a match, so just print the instruction bits. */ -- 2.34.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm 2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI 2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI @ 2022-06-03 12:05 ` Tsukasa OI 2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI 2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI 3 siblings, 0 replies; 5+ messages in thread From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw) To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils There's a situation where an instruction is a specialized form of another but they have different requirements (that can co-exist). On such cases, no-aliases option on diassembler will not work as expected. This commit implements so called "generic subsets" to work with such situation. include/ChangeLog: * opcode/riscv.h (INSN_GENERICS): New. opcodes/ChangeLog: * riscv-dis.c (riscv_disassemble_insn): Implement so called "generic subsets" for disassembler. * riscv-opc.c (riscv_opcodes): Add INSN_GENERICS to zext.h, pack and packw instructions. --- include/opcode/riscv.h | 4 ++++ opcodes/riscv-dis.c | 18 +++++++++++++++--- opcodes/riscv-opc.c | 8 ++++---- 3 files changed, 23 insertions(+), 7 deletions(-) diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h index f0beceaec44..b526dd8fc0d 100644 --- a/include/opcode/riscv.h +++ b/include/opcode/riscv.h @@ -438,6 +438,10 @@ struct riscv_opcode /* Instruction is a simple alias (e.g. "mv" for "addi"). */ #define INSN_ALIAS 0x00000001 +/* Instruction can be a simple instruction or an alias + depending on enabled ISA extensions. */ +#define INSN_GENERICS 0x00000100 + /* These are for setting insn_info fields. Nonbranch is the default. Noninsn is used only if there is no match. diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c index 48858e61c38..74655a2acb5 100644 --- a/opcodes/riscv-dis.c +++ b/opcodes/riscv-dis.c @@ -576,7 +576,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) static bool init = 0; static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1]; struct riscv_private_data *pd; - int insnlen; + int insnlen, masklen = -1; #define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP)) @@ -657,8 +657,20 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info) if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class)) continue; - matched_op = op; - break; + if (! (op->pinfo & INSN_GENERICS)) + { + matched_op = op; + break; + } + + int curmasklen = __builtin_popcountll(op->mask); + if (matched_op == NULL || (no_aliases + ? (curmasklen < masklen) + : (curmasklen >= masklen))) + { + matched_op = op; + masklen = curmasklen; + } } } diff --git a/opcodes/riscv-opc.c b/opcodes/riscv-opc.c index 6355f8059f5..0eae9a788ed 100644 --- a/opcodes/riscv-opc.c +++ b/opcodes/riscv-opc.c @@ -933,8 +933,8 @@ const struct riscv_opcode riscv_opcodes[] = {"sext.b", 0, INSN_CLASS_I, "d,s", 0, (int) M_SEXTB, match_never, INSN_MACRO }, {"sext.h", 0, INSN_CLASS_ZBB, "d,s", MATCH_SEXT_H, MASK_SEXT_H, match_opcode, 0 }, {"sext.h", 0, INSN_CLASS_I, "d,s", 0, (int) M_SEXTH, match_never, INSN_MACRO }, -{"zext.h", 32, INSN_CLASS_ZBB, "d,s", MATCH_PACK, MASK_PACK | MASK_RS2, match_opcode, 0 }, -{"zext.h", 64, INSN_CLASS_ZBB, "d,s", MATCH_PACKW, MASK_PACKW | MASK_RS2, match_opcode, 0 }, +{"zext.h", 32, INSN_CLASS_ZBB, "d,s", MATCH_PACK, MASK_PACK | MASK_RS2, match_opcode, INSN_GENERICS }, +{"zext.h", 64, INSN_CLASS_ZBB, "d,s", MATCH_PACKW, MASK_PACKW | MASK_RS2, match_opcode, INSN_GENERICS }, {"zext.h", 0, INSN_CLASS_I, "d,s", 0, (int) M_ZEXTH, match_never, INSN_MACRO }, {"orc.b", 0, INSN_CLASS_ZBB, "d,s", MATCH_GORCI | MATCH_SHAMT_ORC_B, MASK_GORCI | MASK_SHAMT, match_opcode, 0 }, {"clzw", 64, INSN_CLASS_ZBB, "d,s", MATCH_CLZW, MASK_CLZW, match_opcode, 0 }, @@ -944,9 +944,9 @@ const struct riscv_opcode riscv_opcodes[] = {"brev8", 64, INSN_CLASS_ZBKB, "d,s", MATCH_GREVI | MATCH_SHAMT_BREV8, MASK_GREVI | MASK_SHAMT, match_opcode, 0 }, {"zip", 32, INSN_CLASS_ZBKB, "d,s", MATCH_SHFLI|MATCH_SHAMT_ZIP_32, MASK_SHFLI|MASK_SHAMT, match_opcode, INSN_ALIAS }, {"unzip", 32, INSN_CLASS_ZBKB, "d,s", MATCH_UNSHFLI|MATCH_SHAMT_ZIP_32, MASK_UNSHFLI|MASK_SHAMT, match_opcode, INSN_ALIAS }, -{"pack", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACK, MASK_PACK, match_opcode, 0 }, +{"pack", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACK, MASK_PACK, match_opcode, INSN_GENERICS }, {"packh", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKH, MASK_PACKH, match_opcode, 0 }, -{"packw", 64, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKW, MASK_PACKW, match_opcode, 0 }, +{"packw", 64, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKW, MASK_PACKW, match_opcode, INSN_GENERICS }, {"andn", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_ANDN, MASK_ANDN, match_opcode, 0 }, {"orn", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_ORN, MASK_ORN, match_opcode, 0 }, {"xnor", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_XNOR, MASK_XNOR, match_opcode, 0 }, -- 2.34.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 3/3] RISC-V: Add testcases disassembling zext.h 2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI 2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI 2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI @ 2022-06-03 12:05 ` Tsukasa OI 2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI 3 siblings, 0 replies; 5+ messages in thread From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw) To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils This commit adds testcases for zext.h instruction (Zbb extension) on diassembler. gas/ChangeLog: * testsuite/gas/riscv/zbb-zext.h-dis-1.s: New test. * testsuite/gas/riscv/zbb-zext.h-dis-2.s: Likewise. * testsuite/gas/riscv/zbb-zext.h-dis-1-1.d: Likewise. * testsuite/gas/riscv/zbb-zext.h-dis-1-2.d: Likewise. * testsuite/gas/riscv/zbb-zext.h-dis-1-3.d: Likewise. * testsuite/gas/riscv/zbb-zext.h-dis-2.d: Likewise. --- gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++++++++++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++++++++++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++++++++++ gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 ++ gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++++++++++ gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 ++++ 6 files changed, 50 insertions(+) create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d new file mode 100644 index 00000000000..13e3d610342 --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d @@ -0,0 +1,11 @@ +#as: -march=rv32i_zbb_zbkb +#source: zbb-zext.h-dis-1.s +#objdump: -d + +.*:[ ]+file format .* + + +Disassembly of section .text: + +0+000 <target>: +[ ]+[0-9a-f]+:[ ]+0805c533[ ]+zext.h[ ]+a0,a1 diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d new file mode 100644 index 00000000000..e9218a8e3d2 --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d @@ -0,0 +1,11 @@ +#as: -march=rv32i_zbb_zbkb +#source: zbb-zext.h-dis-1.s +#objdump: -d -M no-aliases + +.*:[ ]+file format .* + + +Disassembly of section .text: + +0+000 <target>: +[ ]+[0-9a-f]+:[ ]+0805c533[ ]+pack[ ]+a0,a1,zero diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d new file mode 100644 index 00000000000..629da4ff984 --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d @@ -0,0 +1,11 @@ +#as: -march=rv32i_zbb +#source: zbb-zext.h-dis-1.s +#objdump: -d -M no-aliases + +.*:[ ]+file format .* + + +Disassembly of section .text: + +0+000 <target>: +[ ]+[0-9a-f]+:[ ]+0805c533[ ]+zext.h[ ]+a0,a1 diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s new file mode 100644 index 00000000000..ab04765dd5d --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s @@ -0,0 +1,2 @@ +target: + zext.h a0, a1 diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d new file mode 100644 index 00000000000..3b1ee9de01e --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d @@ -0,0 +1,11 @@ +#as: -march=rv32i_zbkb +#source: zbb-zext.h-dis-2.s +#objdump: -d + +.*:[ ]+file format .* + + +Disassembly of section .text: + +0+000 <target>: +[ ]+[0-9a-f]+:[ ]+0805c533[ ]+pack[ ]+a0,a1,zero diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s new file mode 100644 index 00000000000..88067d3e87d --- /dev/null +++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s @@ -0,0 +1,4 @@ +target: + .option arch, +zbb + zext.h a0, a1 + .option arch, -zbb -- 2.34.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler 2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI ` (2 preceding siblings ...) 2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI @ 2022-07-30 4:20 ` Tsukasa OI 3 siblings, 0 replies; 5+ messages in thread From: Tsukasa OI @ 2022-07-30 4:20 UTC (permalink / raw) To: Kito Cheng, Palmer Dabbelt, Nelson Chu; +Cc: binutils This patch is postponed because I will merge this into a larger patchset with general core disassembler improvements. Thanks, Tsukasa On 2022/06/03 21:05, Tsukasa OI wrote: > Hello, > > I noticed another disassembler issue and this is a fix. > > > As I submitted in my previous patchset: > <https://sourceware.org/pipermail/binutils/2022-June/121159.html>, > there is a problem while disassembling zext.h instruction on > RV{32,64}_Zbb_Zbkb configuration. > > On RV32, zext.h on Zbb is a specialized form of pack (Zbkb). > On RV64, zext.h on Zbb is a specialized form of packw (Zbkb). > > Note that, if Zbkb extension is disabled, zext.h is a single > instruction with no "generalized" forms. > > > When **both** Zbb and Zbkb extensions are enabled and a zext.h > instruction is disassembled with `-M no-aliases', it should print > a non-alias instruction (that is, either pack or packw). > > # Input source file > _start: > zext.h a0, a1 # Specialized form on Zbb > > # Expected (RV32_Zbb_Zbkb, objdump -d -M no-aliases) > 80000028 <_start>: > 80000028: 0805c533 pack a0,a1,zero > > However, it prints an alias. > > # Actual (RV32_Zbb_Zbkb, objdump -d -M no-aliases) > 80000028 <_start>: > 80000028: 0805c533 zext.h a0,a1 > > Note that, depending on -march, an "alias" is no longer an alias > (as I noted earlier). > > # Expected/Actual (RV32_Zbb, objdump -d -M no-aliases) > 80000028 <_start>: > 80000028: 0805c533 zext.h a0,a1 > > > In general, this kind of issue occurs when: > > 1. There are two instructions > (one of them is a specialized form of another). > 2. Those requirements are different (separate) but can co-exist. > > Because of 2., both instructions cannot be simple aliases (INSN_ALIAS > cannot be used). However on non-alias instructions, if a match is > found, riscv_disassemble_insn thinks this is it and quits too early. > > Because zext.h is declared before pack and packw, generalized forms are > not matched for zext.h. > > > > > As a solution, I propose a new concept: generic subsets. > > For instructions with INSN_GENERICS, opcode matching cannot early-quit. > Instead it searches an instruction with: > > - Longest mask (by default; when -M no-aliases is NOT specified) > - Shortest mask (when -M no-aliases is specified) > > "Length" of the mask equals population count. More one bits on the mask > means more specialized instruction form. > > It fixes disassembler on following instructions and configurations: > > - zext.h (Zbb) <--> pack (Zbkb; RV32) > - zext.h (Zbb) <--> packw (Zbkb; RV64) > > Note that INSN_GENERICS (new flag in PATCH 2) must be set **both** on > specialized and generic forms. In the example above, those instructions > require INSN_GENERICS: > > - zext.h (two XLEN-specific forms) > - pack > - packw > > This concept can be used to following instruction pairs where the same > issue can occur once non-ratified instruction is ratified. > > - orc.b (Zbb) <--> gorci (NOT RATIFIED) > - brev8 (Zbkb) <--> grevi (NOT RATIFIED) > - zip (Zbkb) <--> shfli (NOT RATIFIED) > - unzip (Zbkb) <--> unshfli (NOT RATIFIED) > - rev8 (Zbb/Zbkb) <--> grevi (NOT RATIFIED) > > > > > PATCH 1: Reorganize riscv_disassemble_insn function > > riscv_disassemble_insn function (before PATCH 1) works like this: > > FOREACH (opcode) > { > if (!opcode.MATCH(insn)) > continue; > // > // Print insn with matched opcode. > // > return insnlen; > } > // > // opcode not matched. > // > > Because instruction printing code is in the opcode loop, this is not > good to fix the problem. This patch reorganizes the function like this: > > matched_opcode = NULL; > // Step 1: opcode matching > FOREACH (opcode) > { > if (!opcode.MATCH(insn)) > continue; > matched_opcode = opcode; > break; > } > // Step 2: opcode printing > if (matched_opcode != NULL) > { > // > // Print insn with matched_opcode. > // > return insnlen; > } > // > // opcode not matched. > // > > PATCH 1 splits opcode loop to two separate steps: opcode matching and > printing. With this, we can modify opcode matching step to implement > "generic subsets". > > > > > PATCH 2: Implement "generic subsets" > > With this commit, riscv_disassemble_insn function works like this: > > int masklen; > matched_opcode = NULL; > // Step 1: opcode matching > FOREACH (opcode) > { > if (!opcode.MATCH(insn)) > continue; > if (!(opcode.pinfo & INSN_GENERICS)) > { > matched_opcode = opcode; > break; // Early-quit as before (for non generic subsets) > } > // Handling of generic subsets > if (matched_opcode == NULL) > { > matched_opcode = opcode; > // Set corresponding masklen. > } > else if (no_aliases) > { > // Prefer opcode with shortest mask. > // Update match_opcode and masklen where necessary. > } > else > { > // Prefer opcode with longest mask. > // Update match_opcode and masklen where necessary. > } > } > // Step 2: opcode printing (continues...) > > PATCH 2 also adds INSN_GENERICS flag to two zext.h forms and pack/packw > instructions. > > > > > PATCH 3 contains additional testcases. > > > > > Thanks, > Tsukasa > > > > > Tsukasa OI (3): > RISC-V: Split disasm opcode matching and printing > RISC-V: Implement "generic subsets" for disasm > RISC-V: Add testcases disassembling zext.h > > gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++ > gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++ > gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++ > gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 + > gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++ > gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 + > include/opcode/riscv.h | 4 + > opcodes/riscv-dis.c | 93 ++++++++++++-------- > opcodes/riscv-opc.c | 8 +- > 9 files changed, 114 insertions(+), 41 deletions(-) > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d > create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s > > > base-commit: 625b6eae091709b95471eae92d42dc6bc71e6553 ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-07-30 4:20 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI 2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI 2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI 2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI 2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).