* [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler
@ 2022-06-03 12:05 Tsukasa OI
2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw)
To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils
Hello,
I noticed another disassembler issue and this is a fix.
As I submitted in my previous patchset:
<https://sourceware.org/pipermail/binutils/2022-June/121159.html>,
there is a problem while disassembling zext.h instruction on
RV{32,64}_Zbb_Zbkb configuration.
On RV32, zext.h on Zbb is a specialized form of pack (Zbkb).
On RV64, zext.h on Zbb is a specialized form of packw (Zbkb).
Note that, if Zbkb extension is disabled, zext.h is a single
instruction with no "generalized" forms.
When **both** Zbb and Zbkb extensions are enabled and a zext.h
instruction is disassembled with `-M no-aliases', it should print
a non-alias instruction (that is, either pack or packw).
# Input source file
_start:
zext.h a0, a1 # Specialized form on Zbb
# Expected (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
80000028 <_start>:
80000028: 0805c533 pack a0,a1,zero
However, it prints an alias.
# Actual (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
80000028 <_start>:
80000028: 0805c533 zext.h a0,a1
Note that, depending on -march, an "alias" is no longer an alias
(as I noted earlier).
# Expected/Actual (RV32_Zbb, objdump -d -M no-aliases)
80000028 <_start>:
80000028: 0805c533 zext.h a0,a1
In general, this kind of issue occurs when:
1. There are two instructions
(one of them is a specialized form of another).
2. Those requirements are different (separate) but can co-exist.
Because of 2., both instructions cannot be simple aliases (INSN_ALIAS
cannot be used). However on non-alias instructions, if a match is
found, riscv_disassemble_insn thinks this is it and quits too early.
Because zext.h is declared before pack and packw, generalized forms are
not matched for zext.h.
As a solution, I propose a new concept: generic subsets.
For instructions with INSN_GENERICS, opcode matching cannot early-quit.
Instead it searches an instruction with:
- Longest mask (by default; when -M no-aliases is NOT specified)
- Shortest mask (when -M no-aliases is specified)
"Length" of the mask equals population count. More one bits on the mask
means more specialized instruction form.
It fixes disassembler on following instructions and configurations:
- zext.h (Zbb) <--> pack (Zbkb; RV32)
- zext.h (Zbb) <--> packw (Zbkb; RV64)
Note that INSN_GENERICS (new flag in PATCH 2) must be set **both** on
specialized and generic forms. In the example above, those instructions
require INSN_GENERICS:
- zext.h (two XLEN-specific forms)
- pack
- packw
This concept can be used to following instruction pairs where the same
issue can occur once non-ratified instruction is ratified.
- orc.b (Zbb) <--> gorci (NOT RATIFIED)
- brev8 (Zbkb) <--> grevi (NOT RATIFIED)
- zip (Zbkb) <--> shfli (NOT RATIFIED)
- unzip (Zbkb) <--> unshfli (NOT RATIFIED)
- rev8 (Zbb/Zbkb) <--> grevi (NOT RATIFIED)
PATCH 1: Reorganize riscv_disassemble_insn function
riscv_disassemble_insn function (before PATCH 1) works like this:
FOREACH (opcode)
{
if (!opcode.MATCH(insn))
continue;
//
// Print insn with matched opcode.
//
return insnlen;
}
//
// opcode not matched.
//
Because instruction printing code is in the opcode loop, this is not
good to fix the problem. This patch reorganizes the function like this:
matched_opcode = NULL;
// Step 1: opcode matching
FOREACH (opcode)
{
if (!opcode.MATCH(insn))
continue;
matched_opcode = opcode;
break;
}
// Step 2: opcode printing
if (matched_opcode != NULL)
{
//
// Print insn with matched_opcode.
//
return insnlen;
}
//
// opcode not matched.
//
PATCH 1 splits opcode loop to two separate steps: opcode matching and
printing. With this, we can modify opcode matching step to implement
"generic subsets".
PATCH 2: Implement "generic subsets"
With this commit, riscv_disassemble_insn function works like this:
int masklen;
matched_opcode = NULL;
// Step 1: opcode matching
FOREACH (opcode)
{
if (!opcode.MATCH(insn))
continue;
if (!(opcode.pinfo & INSN_GENERICS))
{
matched_opcode = opcode;
break; // Early-quit as before (for non generic subsets)
}
// Handling of generic subsets
if (matched_opcode == NULL)
{
matched_opcode = opcode;
// Set corresponding masklen.
}
else if (no_aliases)
{
// Prefer opcode with shortest mask.
// Update match_opcode and masklen where necessary.
}
else
{
// Prefer opcode with longest mask.
// Update match_opcode and masklen where necessary.
}
}
// Step 2: opcode printing (continues...)
PATCH 2 also adds INSN_GENERICS flag to two zext.h forms and pack/packw
instructions.
PATCH 3 contains additional testcases.
Thanks,
Tsukasa
Tsukasa OI (3):
RISC-V: Split disasm opcode matching and printing
RISC-V: Implement "generic subsets" for disasm
RISC-V: Add testcases disassembling zext.h
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 +
gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++
gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 +
include/opcode/riscv.h | 4 +
opcodes/riscv-dis.c | 93 ++++++++++++--------
opcodes/riscv-opc.c | 8 +-
9 files changed, 114 insertions(+), 41 deletions(-)
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s
base-commit: 625b6eae091709b95471eae92d42dc6bc71e6553
--
2.34.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/3] RISC-V: Split disasm opcode matching and printing
2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
@ 2022-06-03 12:05 ` Tsukasa OI
2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw)
To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils
This commit splits instruction handling on riscv_disassemble_insn
function to two separate steps (opcode matching and printing).
This is a preparation for much complex opcode matching.
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Split instruction
handling to two separate steps: opcode matching and printing.
---
opcodes/riscv-dis.c | 85 ++++++++++++++++++++++++---------------------
1 file changed, 46 insertions(+), 39 deletions(-)
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 9ff31167775..48858e61c38 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -572,7 +572,7 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
static int
riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
{
- const struct riscv_opcode *op;
+ const struct riscv_opcode *op, *matched_op;
static bool init = 0;
static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
struct riscv_private_data *pd;
@@ -624,6 +624,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
info->target2 = 0;
op = riscv_hash[OP_HASH_IDX (word)];
+ matched_op = NULL;
if (op != NULL)
{
/* If XLEN is not known, get its value from the ELF class. */
@@ -656,49 +657,55 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class))
continue;
- /* It's a match. */
- (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic,
- "%s", op->name);
- print_insn_args (op->args, word, memaddr, info);
+ matched_op = op;
+ break;
+ }
+ }
- /* Try to disassemble multi-instruction addressing sequences. */
- if (pd->print_addr != (bfd_vma)-1)
- {
- info->target = pd->print_addr;
- (*info->fprintf_styled_func)
- (info->stream, dis_style_comment_start, " # ");
- (*info->print_address_func) (info->target, info);
- pd->print_addr = -1;
- }
+ /* There is a match. */
+ if (matched_op != NULL)
+ {
+ (*info->fprintf_styled_func) (info->stream, dis_style_mnemonic,
+ "%s", matched_op->name);
+ print_insn_args (matched_op->args, word, memaddr, info);
- /* Finish filling out insn_info fields. */
- switch (op->pinfo & INSN_TYPE)
- {
- case INSN_BRANCH:
- info->insn_type = dis_branch;
- break;
- case INSN_CONDBRANCH:
- info->insn_type = dis_condbranch;
- break;
- case INSN_JSR:
- info->insn_type = dis_jsr;
- break;
- case INSN_DREF:
- info->insn_type = dis_dref;
- break;
- default:
- break;
- }
+ /* Try to disassemble multi-instruction addressing sequences. */
+ if (pd->print_addr != (bfd_vma)-1)
+ {
+ info->target = pd->print_addr;
+ (*info->fprintf_styled_func)
+ (info->stream, dis_style_comment_start, " # ");
+ (*info->print_address_func) (info->target, info);
+ pd->print_addr = -1;
+ }
- if (op->pinfo & INSN_DATA_SIZE)
- {
- int size = ((op->pinfo & INSN_DATA_SIZE)
- >> INSN_DATA_SIZE_SHIFT);
- info->data_size = 1 << (size - 1);
- }
+ /* Finish filling out insn_info fields. */
+ switch (matched_op->pinfo & INSN_TYPE)
+ {
+ case INSN_BRANCH:
+ info->insn_type = dis_branch;
+ break;
+ case INSN_CONDBRANCH:
+ info->insn_type = dis_condbranch;
+ break;
+ case INSN_JSR:
+ info->insn_type = dis_jsr;
+ break;
+ case INSN_DREF:
+ info->insn_type = dis_dref;
+ break;
+ default:
+ break;
+ }
- return insnlen;
+ if (matched_op->pinfo & INSN_DATA_SIZE)
+ {
+ int size = ((matched_op->pinfo & INSN_DATA_SIZE)
+ >> INSN_DATA_SIZE_SHIFT);
+ info->data_size = 1 << (size - 1);
}
+
+ return insnlen;
}
/* We did not find a match, so just print the instruction bits. */
--
2.34.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm
2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI
@ 2022-06-03 12:05 ` Tsukasa OI
2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI
2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
3 siblings, 0 replies; 5+ messages in thread
From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw)
To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils
There's a situation where an instruction is a specialized form of
another but they have different requirements (that can co-exist). On
such cases, no-aliases option on diassembler will not work as expected.
This commit implements so called "generic subsets" to work with such
situation.
include/ChangeLog:
* opcode/riscv.h (INSN_GENERICS): New.
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Implement so called
"generic subsets" for disassembler.
* riscv-opc.c (riscv_opcodes): Add INSN_GENERICS to zext.h,
pack and packw instructions.
---
include/opcode/riscv.h | 4 ++++
opcodes/riscv-dis.c | 18 +++++++++++++++---
opcodes/riscv-opc.c | 8 ++++----
3 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/include/opcode/riscv.h b/include/opcode/riscv.h
index f0beceaec44..b526dd8fc0d 100644
--- a/include/opcode/riscv.h
+++ b/include/opcode/riscv.h
@@ -438,6 +438,10 @@ struct riscv_opcode
/* Instruction is a simple alias (e.g. "mv" for "addi"). */
#define INSN_ALIAS 0x00000001
+/* Instruction can be a simple instruction or an alias
+ depending on enabled ISA extensions. */
+#define INSN_GENERICS 0x00000100
+
/* These are for setting insn_info fields.
Nonbranch is the default. Noninsn is used only if there is no match.
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 48858e61c38..74655a2acb5 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -576,7 +576,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
static bool init = 0;
static const struct riscv_opcode *riscv_hash[OP_MASK_OP + 1];
struct riscv_private_data *pd;
- int insnlen;
+ int insnlen, masklen = -1;
#define OP_HASH_IDX(i) ((i) & (riscv_insn_length (i) == 2 ? 0x3 : OP_MASK_OP))
@@ -657,8 +657,20 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
if (!riscv_multi_subset_supports (&riscv_rps_dis, op->insn_class))
continue;
- matched_op = op;
- break;
+ if (! (op->pinfo & INSN_GENERICS))
+ {
+ matched_op = op;
+ break;
+ }
+
+ int curmasklen = __builtin_popcountll(op->mask);
+ if (matched_op == NULL || (no_aliases
+ ? (curmasklen < masklen)
+ : (curmasklen >= masklen)))
+ {
+ matched_op = op;
+ masklen = curmasklen;
+ }
}
}
diff --git a/opcodes/riscv-opc.c b/opcodes/riscv-opc.c
index 6355f8059f5..0eae9a788ed 100644
--- a/opcodes/riscv-opc.c
+++ b/opcodes/riscv-opc.c
@@ -933,8 +933,8 @@ const struct riscv_opcode riscv_opcodes[] =
{"sext.b", 0, INSN_CLASS_I, "d,s", 0, (int) M_SEXTB, match_never, INSN_MACRO },
{"sext.h", 0, INSN_CLASS_ZBB, "d,s", MATCH_SEXT_H, MASK_SEXT_H, match_opcode, 0 },
{"sext.h", 0, INSN_CLASS_I, "d,s", 0, (int) M_SEXTH, match_never, INSN_MACRO },
-{"zext.h", 32, INSN_CLASS_ZBB, "d,s", MATCH_PACK, MASK_PACK | MASK_RS2, match_opcode, 0 },
-{"zext.h", 64, INSN_CLASS_ZBB, "d,s", MATCH_PACKW, MASK_PACKW | MASK_RS2, match_opcode, 0 },
+{"zext.h", 32, INSN_CLASS_ZBB, "d,s", MATCH_PACK, MASK_PACK | MASK_RS2, match_opcode, INSN_GENERICS },
+{"zext.h", 64, INSN_CLASS_ZBB, "d,s", MATCH_PACKW, MASK_PACKW | MASK_RS2, match_opcode, INSN_GENERICS },
{"zext.h", 0, INSN_CLASS_I, "d,s", 0, (int) M_ZEXTH, match_never, INSN_MACRO },
{"orc.b", 0, INSN_CLASS_ZBB, "d,s", MATCH_GORCI | MATCH_SHAMT_ORC_B, MASK_GORCI | MASK_SHAMT, match_opcode, 0 },
{"clzw", 64, INSN_CLASS_ZBB, "d,s", MATCH_CLZW, MASK_CLZW, match_opcode, 0 },
@@ -944,9 +944,9 @@ const struct riscv_opcode riscv_opcodes[] =
{"brev8", 64, INSN_CLASS_ZBKB, "d,s", MATCH_GREVI | MATCH_SHAMT_BREV8, MASK_GREVI | MASK_SHAMT, match_opcode, 0 },
{"zip", 32, INSN_CLASS_ZBKB, "d,s", MATCH_SHFLI|MATCH_SHAMT_ZIP_32, MASK_SHFLI|MASK_SHAMT, match_opcode, INSN_ALIAS },
{"unzip", 32, INSN_CLASS_ZBKB, "d,s", MATCH_UNSHFLI|MATCH_SHAMT_ZIP_32, MASK_UNSHFLI|MASK_SHAMT, match_opcode, INSN_ALIAS },
-{"pack", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACK, MASK_PACK, match_opcode, 0 },
+{"pack", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACK, MASK_PACK, match_opcode, INSN_GENERICS },
{"packh", 0, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKH, MASK_PACKH, match_opcode, 0 },
-{"packw", 64, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKW, MASK_PACKW, match_opcode, 0 },
+{"packw", 64, INSN_CLASS_ZBKB, "d,s,t", MATCH_PACKW, MASK_PACKW, match_opcode, INSN_GENERICS },
{"andn", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_ANDN, MASK_ANDN, match_opcode, 0 },
{"orn", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_ORN, MASK_ORN, match_opcode, 0 },
{"xnor", 0, INSN_CLASS_ZBB_OR_ZBKB, "d,s,t", MATCH_XNOR, MASK_XNOR, match_opcode, 0 },
--
2.34.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 3/3] RISC-V: Add testcases disassembling zext.h
2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI
2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI
@ 2022-06-03 12:05 ` Tsukasa OI
2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
3 siblings, 0 replies; 5+ messages in thread
From: Tsukasa OI @ 2022-06-03 12:05 UTC (permalink / raw)
To: Tsukasa OI, Kito Cheng, Palmer Dabbelt; +Cc: binutils
This commit adds testcases for zext.h instruction (Zbb extension) on
diassembler.
gas/ChangeLog:
* testsuite/gas/riscv/zbb-zext.h-dis-1.s: New test.
* testsuite/gas/riscv/zbb-zext.h-dis-2.s: Likewise.
* testsuite/gas/riscv/zbb-zext.h-dis-1-1.d: Likewise.
* testsuite/gas/riscv/zbb-zext.h-dis-1-2.d: Likewise.
* testsuite/gas/riscv/zbb-zext.h-dis-1-3.d: Likewise.
* testsuite/gas/riscv/zbb-zext.h-dis-2.d: Likewise.
---
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++++++++++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++++++++++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++++++++++
gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 ++
gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++++++++++
gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 ++++
6 files changed, 50 insertions(+)
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d
create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d
new file mode 100644
index 00000000000..13e3d610342
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d
@@ -0,0 +1,11 @@
+#as: -march=rv32i_zbb_zbkb
+#source: zbb-zext.h-dis-1.s
+#objdump: -d
+
+.*:[ ]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <target>:
+[ ]+[0-9a-f]+:[ ]+0805c533[ ]+zext.h[ ]+a0,a1
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d
new file mode 100644
index 00000000000..e9218a8e3d2
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d
@@ -0,0 +1,11 @@
+#as: -march=rv32i_zbb_zbkb
+#source: zbb-zext.h-dis-1.s
+#objdump: -d -M no-aliases
+
+.*:[ ]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <target>:
+[ ]+[0-9a-f]+:[ ]+0805c533[ ]+pack[ ]+a0,a1,zero
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d
new file mode 100644
index 00000000000..629da4ff984
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d
@@ -0,0 +1,11 @@
+#as: -march=rv32i_zbb
+#source: zbb-zext.h-dis-1.s
+#objdump: -d -M no-aliases
+
+.*:[ ]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <target>:
+[ ]+[0-9a-f]+:[ ]+0805c533[ ]+zext.h[ ]+a0,a1
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s
new file mode 100644
index 00000000000..ab04765dd5d
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s
@@ -0,0 +1,2 @@
+target:
+ zext.h a0, a1
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d
new file mode 100644
index 00000000000..3b1ee9de01e
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d
@@ -0,0 +1,11 @@
+#as: -march=rv32i_zbkb
+#source: zbb-zext.h-dis-2.s
+#objdump: -d
+
+.*:[ ]+file format .*
+
+
+Disassembly of section .text:
+
+0+000 <target>:
+[ ]+[0-9a-f]+:[ ]+0805c533[ ]+pack[ ]+a0,a1,zero
diff --git a/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s
new file mode 100644
index 00000000000..88067d3e87d
--- /dev/null
+++ b/gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s
@@ -0,0 +1,4 @@
+target:
+ .option arch, +zbb
+ zext.h a0, a1
+ .option arch, -zbb
--
2.34.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler
2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
` (2 preceding siblings ...)
2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI
@ 2022-07-30 4:20 ` Tsukasa OI
3 siblings, 0 replies; 5+ messages in thread
From: Tsukasa OI @ 2022-07-30 4:20 UTC (permalink / raw)
To: Kito Cheng, Palmer Dabbelt, Nelson Chu; +Cc: binutils
This patch is postponed because I will merge this into a larger patchset
with general core disassembler improvements.
Thanks,
Tsukasa
On 2022/06/03 21:05, Tsukasa OI wrote:
> Hello,
>
> I noticed another disassembler issue and this is a fix.
>
>
> As I submitted in my previous patchset:
> <https://sourceware.org/pipermail/binutils/2022-June/121159.html>,
> there is a problem while disassembling zext.h instruction on
> RV{32,64}_Zbb_Zbkb configuration.
>
> On RV32, zext.h on Zbb is a specialized form of pack (Zbkb).
> On RV64, zext.h on Zbb is a specialized form of packw (Zbkb).
>
> Note that, if Zbkb extension is disabled, zext.h is a single
> instruction with no "generalized" forms.
>
>
> When **both** Zbb and Zbkb extensions are enabled and a zext.h
> instruction is disassembled with `-M no-aliases', it should print
> a non-alias instruction (that is, either pack or packw).
>
> # Input source file
> _start:
> zext.h a0, a1 # Specialized form on Zbb
>
> # Expected (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
> 80000028 <_start>:
> 80000028: 0805c533 pack a0,a1,zero
>
> However, it prints an alias.
>
> # Actual (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
> 80000028 <_start>:
> 80000028: 0805c533 zext.h a0,a1
>
> Note that, depending on -march, an "alias" is no longer an alias
> (as I noted earlier).
>
> # Expected/Actual (RV32_Zbb, objdump -d -M no-aliases)
> 80000028 <_start>:
> 80000028: 0805c533 zext.h a0,a1
>
>
> In general, this kind of issue occurs when:
>
> 1. There are two instructions
> (one of them is a specialized form of another).
> 2. Those requirements are different (separate) but can co-exist.
>
> Because of 2., both instructions cannot be simple aliases (INSN_ALIAS
> cannot be used). However on non-alias instructions, if a match is
> found, riscv_disassemble_insn thinks this is it and quits too early.
>
> Because zext.h is declared before pack and packw, generalized forms are
> not matched for zext.h.
>
>
>
>
> As a solution, I propose a new concept: generic subsets.
>
> For instructions with INSN_GENERICS, opcode matching cannot early-quit.
> Instead it searches an instruction with:
>
> - Longest mask (by default; when -M no-aliases is NOT specified)
> - Shortest mask (when -M no-aliases is specified)
>
> "Length" of the mask equals population count. More one bits on the mask
> means more specialized instruction form.
>
> It fixes disassembler on following instructions and configurations:
>
> - zext.h (Zbb) <--> pack (Zbkb; RV32)
> - zext.h (Zbb) <--> packw (Zbkb; RV64)
>
> Note that INSN_GENERICS (new flag in PATCH 2) must be set **both** on
> specialized and generic forms. In the example above, those instructions
> require INSN_GENERICS:
>
> - zext.h (two XLEN-specific forms)
> - pack
> - packw
>
> This concept can be used to following instruction pairs where the same
> issue can occur once non-ratified instruction is ratified.
>
> - orc.b (Zbb) <--> gorci (NOT RATIFIED)
> - brev8 (Zbkb) <--> grevi (NOT RATIFIED)
> - zip (Zbkb) <--> shfli (NOT RATIFIED)
> - unzip (Zbkb) <--> unshfli (NOT RATIFIED)
> - rev8 (Zbb/Zbkb) <--> grevi (NOT RATIFIED)
>
>
>
>
> PATCH 1: Reorganize riscv_disassemble_insn function
>
> riscv_disassemble_insn function (before PATCH 1) works like this:
>
> FOREACH (opcode)
> {
> if (!opcode.MATCH(insn))
> continue;
> //
> // Print insn with matched opcode.
> //
> return insnlen;
> }
> //
> // opcode not matched.
> //
>
> Because instruction printing code is in the opcode loop, this is not
> good to fix the problem. This patch reorganizes the function like this:
>
> matched_opcode = NULL;
> // Step 1: opcode matching
> FOREACH (opcode)
> {
> if (!opcode.MATCH(insn))
> continue;
> matched_opcode = opcode;
> break;
> }
> // Step 2: opcode printing
> if (matched_opcode != NULL)
> {
> //
> // Print insn with matched_opcode.
> //
> return insnlen;
> }
> //
> // opcode not matched.
> //
>
> PATCH 1 splits opcode loop to two separate steps: opcode matching and
> printing. With this, we can modify opcode matching step to implement
> "generic subsets".
>
>
>
>
> PATCH 2: Implement "generic subsets"
>
> With this commit, riscv_disassemble_insn function works like this:
>
> int masklen;
> matched_opcode = NULL;
> // Step 1: opcode matching
> FOREACH (opcode)
> {
> if (!opcode.MATCH(insn))
> continue;
> if (!(opcode.pinfo & INSN_GENERICS))
> {
> matched_opcode = opcode;
> break; // Early-quit as before (for non generic subsets)
> }
> // Handling of generic subsets
> if (matched_opcode == NULL)
> {
> matched_opcode = opcode;
> // Set corresponding masklen.
> }
> else if (no_aliases)
> {
> // Prefer opcode with shortest mask.
> // Update match_opcode and masklen where necessary.
> }
> else
> {
> // Prefer opcode with longest mask.
> // Update match_opcode and masklen where necessary.
> }
> }
> // Step 2: opcode printing (continues...)
>
> PATCH 2 also adds INSN_GENERICS flag to two zext.h forms and pack/packw
> instructions.
>
>
>
>
> PATCH 3 contains additional testcases.
>
>
>
>
> Thanks,
> Tsukasa
>
>
>
>
> Tsukasa OI (3):
> RISC-V: Split disasm opcode matching and printing
> RISC-V: Implement "generic subsets" for disasm
> RISC-V: Add testcases disassembling zext.h
>
> gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d | 11 +++
> gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d | 11 +++
> gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d | 11 +++
> gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s | 2 +
> gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d | 11 +++
> gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s | 4 +
> include/opcode/riscv.h | 4 +
> opcodes/riscv-dis.c | 93 ++++++++++++--------
> opcodes/riscv-opc.c | 8 +-
> 9 files changed, 114 insertions(+), 41 deletions(-)
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-1.d
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-2.d
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1-3.d
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-1.s
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.d
> create mode 100644 gas/testsuite/gas/riscv/zbb-zext.h-dis-2.s
>
>
> base-commit: 625b6eae091709b95471eae92d42dc6bc71e6553
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-07-30 4:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-03 12:05 [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
2022-06-03 12:05 ` [PATCH 1/3] RISC-V: Split disasm opcode matching and printing Tsukasa OI
2022-06-03 12:05 ` [PATCH 2/3] RISC-V: Implement "generic subsets" for disasm Tsukasa OI
2022-06-03 12:05 ` [PATCH 3/3] RISC-V: Add testcases disassembling zext.h Tsukasa OI
2022-07-30 4:20 ` [PATCH 0/3] RISC-V: Implement "generic subsets" for disassembler Tsukasa OI
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).