* [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
@ 2022-11-19 7:10 Tsukasa OI
2022-11-19 7:10 ` [PATCH 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
` (3 more replies)
0 siblings, 4 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-19 7:10 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils
Hello,
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") was suddenly merged into master by Jan Beulich.
Although this encoding is not considered frozen (see the section "Expanded
Instruction-Length Encoding" in the ISA Manual), such attempt makes sense.
However, it was insufficient in some ways. I quickly found a stack-based
buffer overflow and fixed that. Besides that, I found following issues
related to long (assembler: ".insn"-based, disassembler: unrecognized)
instructions:
[Assembler: False "value conflicts with instruction length" error]
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
44 hexadecimal digits (22 bytes)
GAS successfully compiles this (0x607f is one of the 2-byte prefixes of
22-byte instructions). However, it fails on following examples.
.insn 22, 0xfedcba98765432100123456789ab607f
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
32 hexadecimal digits (16 bytes)
with following error:
a.s: Assembler messages:
a.s:1: Error: value conflicts with instruction length `22,0xfedcba98765432100123456789ab607f'
This is not good. Yes, the value is 16 bytes but upper 6-bytes of a 22-byte
instruction could be just zeroed out. It should be valid just like:
.insn 22, 0x607f
This is a testcase from insn.s and current GAS can handle this. The only
reason which the example above fails is, the 16-byte constant is handled as
O_big (a big number). The root cause of this is:
if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
return _("value conflicts with instruction length");
where:
bytes : Expected instruction length
imm_expr->X_add_number : MINIMUM Number of entries (of generic_bignum)
required to represent a big number.
Non-zero if O_big.
CHARS_PER_LITTLENUM : A big number is represented as an array of
little numbers (current GAS: unsigned short).
This is the number of chars of a little number.
That means, if a big number can be represented with different bytes as
expected, ".insn" fails with "value conflicts with instruction length".
Replacing this with:
if (bytes < imm_expr->X_add_number * CHARS_PER_LITTLENUM) /* == -> < */
return _("value conflicts with instruction length");
would resolve this problem on the current GAS but this depends on how
the big numbers are represented. If big number changes its base type
(such as 2^32), it will stop working.
My solution (on PATCH 2/2) is much complex. It computes exact bytes
required to repsent a big number and fails only if number of represented
bytes exceeds expected number of bytes.
unsigned int llen = 0;
for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1]; lval != 0; llen++)
lval >>= BITS_PER_CHAR;
unsigned int repr_bytes = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
if (bytes < repr_bytes)
return _("value conflicts with instruction length");
Although it may not work on all possible bigint encoding configurations, my
implementation is flexible enough that any realistic bigint configurations
would be compatible.
This is the result after this fix (and the disassembler fix):
0: 607f 89ab 4567 0123 .byte 0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
8: 3210 7654 ba98 fedc
10: 0000 0000 0000
[Assembler: DWARF line number information is missing with big numbers]
We can compile this on the current GAS (note: line number is prefixed):
1: .insn 22, 0x607f
2: .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
44 hexadecimal digits (22 bytes)
with "as -gdwarf-2 -o a.o -c a.s"
or "as -g -o a.o -c a.s".
However, if we extract the debug information, the result will be rather
strange. Here's the result of "objdump -WL a.o":
a.o: file format elf64-littleriscv
Contents of the .debug_line section:
CU: a.s:
File name Line number Starting address View Stmt
a.s 1 0 x
a.s - 0x2c
A 22-byte instruction on the address 0 (line 1) is fine. But wait, where
the another 22-byte instruction on the address 0x16 (line 2) go? Before
describing, we'll see what's going on in the line 1.
1. s_riscv_insn (".insn" directive) is called
a. riscv_ip (".insn" with known instruction encoding) is called
but fails with "unrecognized opcode".
b. riscv_ip_hardcode (".insn" with raw values) is called
and succeeds.
c. append_insn is called **when imm_expr.X_op is not O_big**)
-> Go to 2
2. append_insn (output an instruction)
a. dwarf2_emit_insn is called (DWARF debug information is appended)
b. relocation / fixups
(if relaxed instructions are emitted, return early)
c. add_fixed_insn (add the instruction to the end of the output)
The reason why it fails on line 2 (O_big) is 1.c. Because DWARF debug
information is appended on 2.a., this part is not called because append_insn
is not called when imm_expr.X_op is O_big. Instead, it does completely
separate handling on the riscv_ip_hardcode function (1.b.).
My patchset (PATCH 2/2) unifies this separate handling with regular
instructions. That means, 1.c. is changed so that append_insn is called
even if imm_expr.X_op is O_big. To unify handling, it adds insn_long_opcode
(long opcode encoding up to 22-bytes) to struct riscv_cl_insn and the
install_insn function (called by add_fixed_insn function) is modified to
handle long instructions. riscv_ip_hardcode function is modified to store
long instruction encodings to new insn_long_opcode field.
As a result, ".insn" directive with a big number emits correct
debug information like this:
a.o: file format elf64-littleriscv
Contents of the .debug_line section:
CU: a.s:
File name Line number Starting address View Stmt
a.s 1 0 x
a.s 2 0x16 1 x
a.s - 0x2c
[Disassembler: Instruction is trimmed with 64-bits]
In this section, we reuse the object file generated by the section above
(two 22-byte instructions).
0000000000000000 <.text>:
0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
8: 0000 0000 0000 0000
10: 0000 0000 0000
16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26: 7654 ba98 fedc zeroed out after first 64-bits
See ".byte" at the address 0x16. It's trimmed at 64-bits.
The resolution is simple. If we simply add a char* argument (containing all
instruction bits), we can easily resolve this.
0000000000000000 <.text>:
0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
8: 0000 0000 0000 0000
10: 0000 0000 0000
16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26: 7654 ba98 fedc all instruction bits are correct
With this patchset, although we don't support any "valid" long instructions
yet, we can support long "invalid" instructions (only emitted through
".insn" and the encoded bits do not match to any valid instructions and
printed as ".byte").
This patchset is also important because it makes the assembler behavior
consistent (does not depend whether the raw value can be represented with
a small constant).
Thanks,
Tsukasa
Tsukasa OI (2):
RISC-V: Make .insn tests stricter
RISC-V: Better support for long instructions
gas/config/tc-riscv.c | 50 +++++++++++++++++++++++-----
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++-
gas/testsuite/gas/riscv/insn-na.d | 28 ++++++++++------
gas/testsuite/gas/riscv/insn.d | 32 +++++++++++++++---
gas/testsuite/gas/riscv/insn.s | 9 +++++
opcodes/riscv-dis.c | 13 +++++---
6 files changed, 113 insertions(+), 29 deletions(-)
base-commit: 15253318be0995200cc59929ca32eedbfd041e45
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 1/2] RISC-V: Make .insn tests stricter
2022-11-19 7:10 [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
@ 2022-11-19 7:10 ` Tsukasa OI
2022-11-21 7:32 ` Jan Beulich
2022-11-19 7:10 ` [PATCH 2/2] RISC-V: Better support for long instructions Tsukasa OI
` (2 subsequent siblings)
3 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-19 7:10 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
To make sure that all instruction bits are dumped through ".byte", this
commit makes matching patterns stricter (to cover all instruction bits).
gas/ChangeLog:
* testsuite/gas/riscv/insn.d: Make pattern stricter.
* testsuite/gas/riscv/insn-na.d: Likewise.
---
gas/testsuite/gas/riscv/insn-na.d | 20 ++++++++++----------
gas/testsuite/gas/riscv/insn.d | 10 +++++-----
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index 66dce71ebc21..be6c9f9dd66a 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -61,15 +61,15 @@ Disassembly of section .text:
[^:]+:[ ]+022180d7[ ]+vadd\.vv[ ]+v1,v2,v3
[^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
[^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
-[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
-[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
[^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
-[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
-[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index 2e5d35b39702..cf84f177af39 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -83,12 +83,12 @@ Disassembly of section .text:
[^:]+:[ ]+0000 0000 0000 ?
[^:]+:[ ]+0001[ ]+nop
[^:]+:[ ]+00000013[ ]+nop
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 ?
-[^:]+:[ ]+0000107f 00000000[ ]+[._a-z].*
+[^:]+:[ ]+0000107f 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+00000000 ?
-[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-19 7:10 [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-19 7:10 ` [PATCH 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
@ 2022-11-19 7:10 ` Tsukasa OI
2022-11-21 7:37 ` Jan Beulich
2022-11-22 0:43 ` [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Nelson Chu
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
3 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-19 7:10 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu, Kito Cheng, Palmer Dabbelt; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
1. It heavily depended on the bignum internals (radix of 2^16),
2. It generates "value conflicts with instruction length" even if a big
number instruction encoding does not exceed its expected length,
3. Because long opcode was handled separately (from struct riscv_cl_insn),
some information like DWARF line number correspondence was missing and
4. On the disassembler, disassembler dump was limited up to 64-bit.
For long (unknown) instructions, instruction bits are incorrectly
zeroed out.
To solve these problems, this commit:
1. Handles bignum (and its encodings) precisely,
2. Incorporates long opcode handling into regular struct
riscv_cl_insn-handling functions.
3. Adds packet argument to support dumping instructions
longer than 64-bits.
gas/ChangeLog:
* config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
(create_insn) Clear long opcode marker.
(install_insn) Install longer opcode as well.
(s_riscv_insn) Likewise.
(riscv_ip_hardcode): Make big number handling stricter. Length and
the value conflicts only if the bignum size exceeds the expected
length.
* testsuite/gas/riscv/insn.s: Add testcases such that big number
handling is required.
* testsuite/gas/riscv/insn.d: Likewise.
* testsuite/gas/riscv/insn-na.d: Likewise.
* testsuite/gas/riscv/insn-dwarf.d: Likewise.
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
---
gas/config/tc-riscv.c | 50 +++++++++++++++++++++++-----
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++-
gas/testsuite/gas/riscv/insn-na.d | 8 +++++
gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 +++++
opcodes/riscv-dis.c | 13 +++++---
6 files changed, 98 insertions(+), 14 deletions(-)
diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 019545171f5e..98e8ea0f5f6b 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -45,6 +45,9 @@ struct riscv_cl_insn
/* The encoded instruction bits. */
insn_t insn_opcode;
+ /* The long encoded instruction bits ([0] is non-zero if used). */
+ char insn_long_opcode[RISCV_MAX_INSN_LEN];
+
/* The frag that contains the instruction. */
struct frag *frag;
@@ -714,6 +717,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
{
insn->insn_mo = mo;
insn->insn_opcode = mo->match;
+ insn->insn_long_opcode[0] = 0; /* Long insn has non-zero value. */
insn->frag = NULL;
insn->where = 0;
insn->fixp = NULL;
@@ -725,7 +729,10 @@ static void
install_insn (const struct riscv_cl_insn *insn)
{
char *f = insn->frag->fr_literal + insn->where;
- number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
+ if (insn->insn_long_opcode[0] != 0)
+ memcpy (f, insn->insn_long_opcode, insn_length (insn));
+ else
+ number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
}
/* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
@@ -3481,7 +3488,21 @@ riscv_ip_hardcode (char *str,
values[num++] = (insn_t) imm_expr->X_add_number;
break;
case O_big:
- values[num++] = generic_bignum[0];
+ /* Extract lower 16-bits of a big number. */
+ if (CHARS_PER_LITTLENUM == 1)
+ {
+ offsetT n = imm_expr->X_add_number;
+ n = n > 2 ? 2 : n;
+ values[num] = 0;
+ for (offsetT i = 0; i < n; ++i)
+ {
+ values[num] *= (insn_t) LITTLENUM_RADIX;
+ values[num] |= (insn_t) generic_bignum[n - i - 1];
+ }
+ num++;
+ }
+ else
+ values[num++] = (insn_t) generic_bignum[0];
break;
default:
/* The first value isn't constant, so it should be
@@ -3508,12 +3529,25 @@ riscv_ip_hardcode (char *str,
if (imm_expr->X_op == O_big)
{
- if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
+ unsigned int llen = 0;
+ for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
+ lval != 0; llen++)
+ lval >>= BITS_PER_CHAR;
+ unsigned int repr_bytes
+ = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
+ if (bytes < repr_bytes)
return _("value conflicts with instruction length");
- char *f = frag_more (bytes);
- for (num = 0; num < imm_expr->X_add_number; ++num)
- number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
- generic_bignum[num], CHARS_PER_LITTLENUM);
+ for (num = 0; num < imm_expr->X_add_number - 1; ++num)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ CHARS_PER_LITTLENUM);
+ if (llen != 0)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ llen);
+ memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
return NULL;
}
@@ -4590,7 +4624,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
else
as_bad ("%s `%s'", error.msg, error.statement);
}
- else if (imm_expr.X_op != O_big)
+ else
{
gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
append_insn (&insn, &imm_expr, imm_reloc);
diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
index 89dc8d58ff09..b8bd42dff18c 100644
--- a/gas/testsuite/gas/riscv/insn-dwarf.d
+++ b/gas/testsuite/gas/riscv/insn-dwarf.d
@@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
insn.s +70 +0xfe.*
insn.s +71 +0x108.*
insn.s +72 +0x114.*
-insn.s +- +0x12a
+insn.s +74 +0x12a.*
+insn.s +75 +0x134.*
+insn.s +76 +0x13e.*
+insn.s +77 +0x154.*
+insn.s +78 +0x16a.*
+insn.s +79 +0x180.*
+insn.s +80 +0x196.*
+insn.s +81 +0x1ac.*
+insn.s +- +0x1c2
#pass
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index be6c9f9dd66a..60024555c71f 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -73,3 +73,11 @@ Disassembly of section .text:
[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index cf84f177af39..b02d67c5f120 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -92,3 +92,25 @@ Disassembly of section .text:
[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
diff --git a/gas/testsuite/gas/riscv/insn.s b/gas/testsuite/gas/riscv/insn.s
index 94f62fe5672e..48db59b14e88 100644
--- a/gas/testsuite/gas/riscv/insn.s
+++ b/gas/testsuite/gas/riscv/insn.s
@@ -70,3 +70,12 @@ target:
.insn 10, 0x007f
.insn 12, 0x107f
.insn 22, 0x607f
+
+ .insn 0x8000000000000000007f
+ .insn 10, 0x8000000000000000007f
+ .insn 0x000000000000fedcba98765432100123456789ab607f
+ .insn 22, 0x000000000000fedcba98765432100123456789ab607f
+ .insn 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 0xfedcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 3a31647a2f80..59ebbaf13417 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
this is little-endian code. */
static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+riscv_disassemble_insn (bfd_vma memaddr,
+ insn_t word,
+ const bfd_byte *packet,
+ disassemble_info *info)
{
const struct riscv_opcode *op;
static bool init = false;
@@ -806,8 +809,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
", ");
(*info->fprintf_styled_func) (info->stream, dis_style_immediate,
"0x%02x",
- (unsigned int) (word & 0xff));
- word >>= 8;
+ (unsigned int) (*packet++));
}
}
break;
@@ -983,6 +985,7 @@ riscv_data_length (bfd_vma memaddr,
static int
riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
insn_t data,
+ const bfd_byte *packet ATTRIBUTE_UNUSED,
disassemble_info *info)
{
info->display_endian = info->endian;
@@ -1037,7 +1040,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
bfd_vma dump_size;
int status;
enum riscv_seg_mstate mstate;
- int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+ int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte*, struct disassemble_info *);
if (info->disassembler_options != NULL)
{
@@ -1081,7 +1084,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
}
insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
- return (*riscv_disassembler) (memaddr, insn, info);
+ return (*riscv_disassembler) (memaddr, insn, packet, info);
}
disassembler_ftype
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 1/2] RISC-V: Make .insn tests stricter
2022-11-19 7:10 ` [PATCH 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
@ 2022-11-21 7:32 ` Jan Beulich
2022-11-23 8:20 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-21 7:32 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu, Kito Cheng, Palmer Dabbelt
On 19.11.2022 08:10, Tsukasa OI wrote:
> To make sure that all instruction bits are dumped through ".byte", this
> commit makes matching patterns stricter (to cover all instruction bits).
Hmm, it was deliberate to omit the .<n>byte part of the disassembly,
very specifically because of the inconsistency of using <n> only in
some cases. Personally I would agree with such tightening of the
expectations only if the disassembler was first improved. But of
course it's the maintainers judgement ...
Jan
> --- a/gas/testsuite/gas/riscv/insn-na.d
> +++ b/gas/testsuite/gas/riscv/insn-na.d
> @@ -61,15 +61,15 @@ Disassembly of section .text:
> [^:]+:[ ]+022180d7[ ]+vadd\.vv[ ]+v1,v2,v3
> [^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
> [^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
> -[^:]+:[ ]+001f 0000 0000[ ].*
> -[^:]+:[ ]+0000003f 00000000[ ].*
> -[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
> -[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
> -[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
> +[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> [^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
> [^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
> -[^:]+:[ ]+001f 0000 0000[ ].*
> -[^:]+:[ ]+0000003f 00000000[ ].*
> -[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
> -[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
> -[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
> +[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
> index 2e5d35b39702..cf84f177af39 100644
> --- a/gas/testsuite/gas/riscv/insn.d
> +++ b/gas/testsuite/gas/riscv/insn.d
> @@ -83,12 +83,12 @@ Disassembly of section .text:
> [^:]+:[ ]+0000 0000 0000 ?
> [^:]+:[ ]+0001[ ]+nop
> [^:]+:[ ]+00000013[ ]+nop
> -[^:]+:[ ]+001f 0000 0000[ ].*
> -[^:]+:[ ]+0000003f 00000000[ ].*
> -[^:]+:[ ]+007f 0000 0000 0000[ ]+[._a-z].*
> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
> +[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> [^:]+:[ ]+0000 ?
> -[^:]+:[ ]+0000107f 00000000[ ]+[._a-z].*
> +[^:]+:[ ]+0000107f 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> [^:]+:[ ]+00000000 ?
> -[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
> +[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> [^:]+:[ ]+0000 0000 0000 0000 ?
> [^:]+:[ ]+0000 0000 0000 ?
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-19 7:10 ` [PATCH 2/2] RISC-V: Better support for long instructions Tsukasa OI
@ 2022-11-21 7:37 ` Jan Beulich
2022-11-23 8:40 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-21 7:37 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu, Kito Cheng, Palmer Dabbelt
On 19.11.2022 08:10, Tsukasa OI wrote:
> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>
> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
> instructions with .insn") tried to start supporting long instructions but
> it was insufficient.
>
> 1. It heavily depended on the bignum internals (radix of 2^16),
> 2. It generates "value conflicts with instruction length" even if a big
> number instruction encoding does not exceed its expected length,
> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
> some information like DWARF line number correspondence was missing and
> 4. On the disassembler, disassembler dump was limited up to 64-bit.
> For long (unknown) instructions, instruction bits are incorrectly
> zeroed out.
Just FTR - of these 1 and 4 were deliberate (as in: deemed acceptable), the
former the keep the code reasonably simple and the latter because focus was
solely on the assembler.
I recall dealing with some instances of 2, but as you demonstrate I failed
to recognize and deal with further cases.
I will admit that I didn't even think of debug info generation.
> To solve these problems, this commit:
>
> 1. Handles bignum (and its encodings) precisely,
> 2. Incorporates long opcode handling into regular struct
> riscv_cl_insn-handling functions.
> 3. Adds packet argument to support dumping instructions
> longer than 64-bits.
Thanks for taking the time to make improvements there.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-19 7:10 [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-19 7:10 ` [PATCH 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
2022-11-19 7:10 ` [PATCH 2/2] RISC-V: Better support for long instructions Tsukasa OI
@ 2022-11-22 0:43 ` Nelson Chu
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
3 siblings, 0 replies; 38+ messages in thread
From: Nelson Chu @ 2022-11-22 0:43 UTC (permalink / raw)
To: Tsukasa OI; +Cc: Jan Beulich, Kito Cheng, Palmer Dabbelt, binutils
On Sat, Nov 19, 2022 at 3:11 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
>
> Hello,
>
> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
> instructions with .insn") was suddenly merged into master by Jan Beulich.
> Although this encoding is not considered frozen (see the section "Expanded
> Instruction-Length Encoding" in the ISA Manual), such attempt makes sense.
>
> However, it was insufficient in some ways. I quickly found a stack-based
> buffer overflow and fixed that. Besides that, I found following issues
> related to long (assembler: ".insn"-based, disassembler: unrecognized)
> instructions:
>
>
>
> [Assembler: False "value conflicts with instruction length" error]
>
> .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 44 hexadecimal digits (22 bytes)
>
> GAS successfully compiles this (0x607f is one of the 2-byte prefixes of
> 22-byte instructions). However, it fails on following examples.
>
> .insn 22, 0xfedcba98765432100123456789ab607f
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 32 hexadecimal digits (16 bytes)
>
> with following error:
>
> a.s: Assembler messages:
> a.s:1: Error: value conflicts with instruction length `22,0xfedcba98765432100123456789ab607f'
>
> This is not good. Yes, the value is 16 bytes but upper 6-bytes of a 22-byte
> instruction could be just zeroed out. It should be valid just like:
>
> .insn 22, 0x607f
>
> This is a testcase from insn.s and current GAS can handle this. The only
> reason which the example above fails is, the 16-byte constant is handled as
> O_big (a big number). The root cause of this is:
>
> if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
> return _("value conflicts with instruction length");
>
> where:
>
> bytes : Expected instruction length
> imm_expr->X_add_number : MINIMUM Number of entries (of generic_bignum)
> required to represent a big number.
> Non-zero if O_big.
> CHARS_PER_LITTLENUM : A big number is represented as an array of
> little numbers (current GAS: unsigned short).
> This is the number of chars of a little number.
>
> That means, if a big number can be represented with different bytes as
> expected, ".insn" fails with "value conflicts with instruction length".
>
> Replacing this with:
>
> if (bytes < imm_expr->X_add_number * CHARS_PER_LITTLENUM) /* == -> < */
> return _("value conflicts with instruction length");
>
> would resolve this problem on the current GAS but this depends on how
> the big numbers are represented. If big number changes its base type
> (such as 2^32), it will stop working.
>
> My solution (on PATCH 2/2) is much complex. It computes exact bytes
> required to repsent a big number and fails only if number of represented
> bytes exceeds expected number of bytes.
>
> unsigned int llen = 0;
> for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1]; lval != 0; llen++)
> lval >>= BITS_PER_CHAR;
> unsigned int repr_bytes = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
> if (bytes < repr_bytes)
> return _("value conflicts with instruction length");
>
> Although it may not work on all possible bigint encoding configurations, my
> implementation is flexible enough that any realistic bigint configurations
> would be compatible.
>
> This is the result after this fix (and the disassembler fix):
>
> 0: 607f 89ab 4567 0123 .byte 0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 8: 3210 7654 ba98 fedc
> 10: 0000 0000 0000
>
>
>
> [Assembler: DWARF line number information is missing with big numbers]
>
> We can compile this on the current GAS (note: line number is prefixed):
>
> 1: .insn 22, 0x607f
> 2: .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 44 hexadecimal digits (22 bytes)
>
> with "as -gdwarf-2 -o a.o -c a.s"
> or "as -g -o a.o -c a.s".
>
> However, if we extract the debug information, the result will be rather
> strange. Here's the result of "objdump -WL a.o":
>
> a.o: file format elf64-littleriscv
>
> Contents of the .debug_line section:
>
> CU: a.s:
> File name Line number Starting address View Stmt
> a.s 1 0 x
> a.s - 0x2c
>
> A 22-byte instruction on the address 0 (line 1) is fine. But wait, where
> the another 22-byte instruction on the address 0x16 (line 2) go? Before
> describing, we'll see what's going on in the line 1.
>
> 1. s_riscv_insn (".insn" directive) is called
> a. riscv_ip (".insn" with known instruction encoding) is called
> but fails with "unrecognized opcode".
> b. riscv_ip_hardcode (".insn" with raw values) is called
> and succeeds.
> c. append_insn is called **when imm_expr.X_op is not O_big**)
> -> Go to 2
> 2. append_insn (output an instruction)
> a. dwarf2_emit_insn is called (DWARF debug information is appended)
> b. relocation / fixups
> (if relaxed instructions are emitted, return early)
> c. add_fixed_insn (add the instruction to the end of the output)
>
> The reason why it fails on line 2 (O_big) is 1.c. Because DWARF debug
> information is appended on 2.a., this part is not called because append_insn
> is not called when imm_expr.X_op is O_big. Instead, it does completely
> separate handling on the riscv_ip_hardcode function (1.b.).
>
> My patchset (PATCH 2/2) unifies this separate handling with regular
> instructions. That means, 1.c. is changed so that append_insn is called
> even if imm_expr.X_op is O_big. To unify handling, it adds insn_long_opcode
> (long opcode encoding up to 22-bytes) to struct riscv_cl_insn and the
> install_insn function (called by add_fixed_insn function) is modified to
> handle long instructions. riscv_ip_hardcode function is modified to store
> long instruction encodings to new insn_long_opcode field.
>
> As a result, ".insn" directive with a big number emits correct
> debug information like this:
>
> a.o: file format elf64-littleriscv
>
> Contents of the .debug_line section:
>
> CU: a.s:
> File name Line number Starting address View Stmt
> a.s 1 0 x
> a.s 2 0x16 1 x
> a.s - 0x2c
>
>
>
> [Disassembler: Instruction is trimmed with 64-bits]
>
> In this section, we reuse the object file generated by the section above
> (two 22-byte instructions).
>
> 0000000000000000 <.text>:
> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 8: 0000 0000 0000 0000
> 10: 0000 0000 0000
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 7654 ba98 fedc zeroed out after first 64-bits
>
> See ".byte" at the address 0x16. It's trimmed at 64-bits.
> The resolution is simple. If we simply add a char* argument (containing all
> instruction bits), we can easily resolve this.
>
> 0000000000000000 <.text>:
> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 8: 0000 0000 0000 0000
> 10: 0000 0000 0000
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 7654 ba98 fedc all instruction bits are correct
>
>
>
> With this patchset, although we don't support any "valid" long instructions
> yet, we can support long "invalid" instructions (only emitted through
> ".insn" and the encoded bits do not match to any valid instructions and
> printed as ".byte").
>
> This patchset is also important because it makes the assembler behavior
> consistent (does not depend whether the raw value can be represented with
> a small constant).
>
>
>
> Thanks,
> Tsukasa
>
>
>
>
> Tsukasa OI (2):
> RISC-V: Make .insn tests stricter
> RISC-V: Better support for long instructions
Thanks for Jan's feedback and it makes sense. I have no further
opinion here, so I will totally respect and follow Jan judgment for
this series. Anyway, thanks for spending time to fix this.
Nelson
> gas/config/tc-riscv.c | 50 +++++++++++++++++++++++-----
> gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++-
> gas/testsuite/gas/riscv/insn-na.d | 28 ++++++++++------
> gas/testsuite/gas/riscv/insn.d | 32 +++++++++++++++---
> gas/testsuite/gas/riscv/insn.s | 9 +++++
> opcodes/riscv-dis.c | 13 +++++---
> 6 files changed, 113 insertions(+), 29 deletions(-)
>
>
> base-commit: 15253318be0995200cc59929ca32eedbfd041e45
> --
> 2.38.1
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 1/2] RISC-V: Make .insn tests stricter
2022-11-21 7:32 ` Jan Beulich
@ 2022-11-23 8:20 ` Tsukasa OI
2022-11-23 8:56 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:20 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils, Nelson Chu
On 2022/11/21 16:32, Jan Beulich wrote:
> On 19.11.2022 08:10, Tsukasa OI wrote:
>> To make sure that all instruction bits are dumped through ".byte", this
>> commit makes matching patterns stricter (to cover all instruction bits).
>
> Hmm, it was deliberate to omit the .<n>byte part of the disassembly,
> very specifically because of the inconsistency of using <n> only in
> some cases. Personally I would agree with such tightening of the
> expectations only if the disassembler was first improved. But of
> course it's the maintainers judgement ...
>
> Jan
Hi,
I can agree most of your opinion except... even if we fix this
inconsistency later (I won't, though), I think we can change the tests
once we have changed the disassembler. So I don't find any problems
making tests tighter.
Nelson assigned you as the person who makes the final judgement and I
want to hear your thoughts/decision about PATCH 1/2.
p.s.
I made a small change to PATCH 2/2 and I'll reply your feedback to PATCH
2/2 after PATCH v2 is submitted (PATCH v2 1/2 is unchanged from v1).
Thanks,
Tsukasa
>
>> --- a/gas/testsuite/gas/riscv/insn-na.d
>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>> @@ -61,15 +61,15 @@ Disassembly of section .text:
>> [^:]+:[ ]+022180d7[ ]+vadd\.vv[ ]+v1,v2,v3
>> [^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
>> [^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
>> -[^:]+:[ ]+001f 0000 0000[ ].*
>> -[^:]+:[ ]+0000003f 00000000[ ].*
>> -[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>> -[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>> -[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
>> +[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> [^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
>> [^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
>> -[^:]+:[ ]+001f 0000 0000[ ].*
>> -[^:]+:[ ]+0000003f 00000000[ ].*
>> -[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>> -[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>> -[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
>> +[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
>> index 2e5d35b39702..cf84f177af39 100644
>> --- a/gas/testsuite/gas/riscv/insn.d
>> +++ b/gas/testsuite/gas/riscv/insn.d
>> @@ -83,12 +83,12 @@ Disassembly of section .text:
>> [^:]+:[ ]+0000 0000 0000 ?
>> [^:]+:[ ]+0001[ ]+nop
>> [^:]+:[ ]+00000013[ ]+nop
>> -[^:]+:[ ]+001f 0000 0000[ ].*
>> -[^:]+:[ ]+0000003f 00000000[ ].*
>> -[^:]+:[ ]+007f 0000 0000 0000[ ]+[._a-z].*
>> +[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
>> +[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
>> +[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> [^:]+:[ ]+0000 ?
>> -[^:]+:[ ]+0000107f 00000000[ ]+[._a-z].*
>> +[^:]+:[ ]+0000107f 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> [^:]+:[ ]+00000000 ?
>> -[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
>> +[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> [^:]+:[ ]+0000 0000 0000 0000 ?
>> [^:]+:[ ]+0000 0000 0000 ?
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-19 7:10 [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
` (2 preceding siblings ...)
2022-11-22 0:43 ` [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Nelson Chu
@ 2022-11-23 8:30 ` Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
` (2 more replies)
3 siblings, 3 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:30 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
Hello,
c.f. PATCH v1:
<https://sourceware.org/pipermail/binutils/2022-November/124516.html>
[Changes: v1 -> v2]
1. Rebased (as usual)
2. PATCH 2/2: Simplified the logic to extract low instruction bits
(will describe later)
3. PATCH 2/2: Changed the commit message slightly
As a part of the process making the logic more flexible to bignum internal
changes, PATCH v1 used a complex logic. After some investigation, I found
the function: generic_bignum_to_int32. I decided to use this on PATCH v2.
Despite that it DOES NOT guarantee that it will extract lower 32-bits of a
big number even if the number overflows but it does (on the current design)
and flexible enough to bignum internal changes (I don't want to make
dependencies to something which can be easily changed as long as it is
reasonable enough). Just for sure, I added a comment for the assumption of
this part.
Thanks,
Tsukasa
Tsukasa OI (2):
RISC-V: Make .insn tests stricter
RISC-V: Better support for long instructions
gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
gas/testsuite/gas/riscv/insn-na.d | 28 ++++++++++++--------
gas/testsuite/gas/riscv/insn.d | 32 +++++++++++++++++++----
gas/testsuite/gas/riscv/insn.s | 9 +++++++
opcodes/riscv-dis.c | 13 ++++++----
6 files changed, 101 insertions(+), 29 deletions(-)
base-commit: 829b6b3736d972f5fbacda09c82b31802d3b594c
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 1/2] RISC-V: Make .insn tests stricter
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
@ 2022-11-23 8:30 ` Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 2/2] RISC-V: Better support for long instructions Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:30 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
To make sure that all instruction bits are dumped through ".byte", this
commit makes matching patterns stricter (to cover all instruction bits).
gas/ChangeLog:
* testsuite/gas/riscv/insn.d: Make pattern stricter.
* testsuite/gas/riscv/insn-na.d: Likewise.
---
gas/testsuite/gas/riscv/insn-na.d | 20 ++++++++++----------
gas/testsuite/gas/riscv/insn.d | 10 +++++-----
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index 66dce71ebc21..be6c9f9dd66a 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -61,15 +61,15 @@ Disassembly of section .text:
[^:]+:[ ]+022180d7[ ]+vadd\.vv[ ]+v1,v2,v3
[^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
[^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
-[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
-[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0001[ ]+c\.addi[ ]+zero,0
[^:]+:[ ]+00000013[ ]+addi[ ]+zero,zero,0
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
-[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
-[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index 2e5d35b39702..cf84f177af39 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -83,12 +83,12 @@ Disassembly of section .text:
[^:]+:[ ]+0000 0000 0000 ?
[^:]+:[ ]+0001[ ]+nop
[^:]+:[ ]+00000013[ ]+nop
-[^:]+:[ ]+001f 0000 0000[ ].*
-[^:]+:[ ]+0000003f 00000000[ ].*
-[^:]+:[ ]+007f 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+001f 0000 0000[ ]+\.byte[ ]+0x1f, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+0000003f 00000000[ ]+\.8byte[ ]+0x3f
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 ?
-[^:]+:[ ]+0000107f 00000000[ ]+[._a-z].*
+[^:]+:[ ]+0000107f 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+00000000 ?
-[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v2 2/2] RISC-V: Better support for long instructions
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
@ 2022-11-23 8:30 ` Tsukasa OI
2022-11-23 9:04 ` Jan Beulich
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:30 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
1. It heavily depended on the bignum internals (radix of 2^16),
2. It generates "value conflicts with instruction length" even if a big
number instruction encoding does not exceed its expected length,
3. Because long opcode was handled separately (from struct riscv_cl_insn),
some information like DWARF line number correspondence was missing and
4. On the disassembler, disassembler dump was limited up to 64-bit.
For long (unknown) instructions, instruction bits are incorrectly
zeroed out.
To solve these problems, this commit:
1. Handles bignum (and its encodings) precisely,
2. Incorporates long opcode handling into regular
struct riscv_cl_insn-handling functions and
3. Adds packet argument to support dumping instructions
longer than 64-bits.
gas/ChangeLog:
* config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
(create_insn) Clear long opcode marker.
(install_insn) Install longer opcode as well.
(s_riscv_insn) Likewise.
(riscv_ip_hardcode): Make big number handling stricter. Length and
the value conflicts only if the bignum size exceeds the expected
maximum length.
* testsuite/gas/riscv/insn.s: Add testcases such that big number
handling is required.
* testsuite/gas/riscv/insn.d: Likewise.
* testsuite/gas/riscv/insn-na.d: Likewise.
* testsuite/gas/riscv/insn-dwarf.d: Likewise.
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
---
gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 +++++++
opcodes/riscv-dis.c | 13 ++++++----
6 files changed, 86 insertions(+), 14 deletions(-)
diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 019545171f5e..2237062d8b45 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -45,6 +45,9 @@ struct riscv_cl_insn
/* The encoded instruction bits. */
insn_t insn_opcode;
+ /* The long encoded instruction bits ([0] is non-zero if used). */
+ char insn_long_opcode[RISCV_MAX_INSN_LEN];
+
/* The frag that contains the instruction. */
struct frag *frag;
@@ -714,6 +717,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
{
insn->insn_mo = mo;
insn->insn_opcode = mo->match;
+ insn->insn_long_opcode[0] = 0; /* Long insn has non-zero value. */
insn->frag = NULL;
insn->where = 0;
insn->fixp = NULL;
@@ -725,7 +729,10 @@ static void
install_insn (const struct riscv_cl_insn *insn)
{
char *f = insn->frag->fr_literal + insn->where;
- number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
+ if (insn->insn_long_opcode[0] != 0)
+ memcpy (f, insn->insn_long_opcode, insn_length (insn));
+ else
+ number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
}
/* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
@@ -3481,7 +3488,9 @@ riscv_ip_hardcode (char *str,
values[num++] = (insn_t) imm_expr->X_add_number;
break;
case O_big:
- values[num++] = generic_bignum[0];
+ /* Extract lower 32-bits of a big number.
+ Assume that generic_bignum_to_int32 work on such number. */
+ values[num++] = (insn_t) generic_bignum_to_int32 ();
break;
default:
/* The first value isn't constant, so it should be
@@ -3508,12 +3517,25 @@ riscv_ip_hardcode (char *str,
if (imm_expr->X_op == O_big)
{
- if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
+ unsigned int llen = 0;
+ for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
+ lval != 0; llen++)
+ lval >>= BITS_PER_CHAR;
+ unsigned int repr_bytes
+ = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
+ if (bytes < repr_bytes)
return _("value conflicts with instruction length");
- char *f = frag_more (bytes);
- for (num = 0; num < imm_expr->X_add_number; ++num)
- number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
- generic_bignum[num], CHARS_PER_LITTLENUM);
+ for (num = 0; num < imm_expr->X_add_number - 1; ++num)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ CHARS_PER_LITTLENUM);
+ if (llen != 0)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ llen);
+ memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
return NULL;
}
@@ -4590,7 +4612,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
else
as_bad ("%s `%s'", error.msg, error.statement);
}
- else if (imm_expr.X_op != O_big)
+ else
{
gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
append_insn (&insn, &imm_expr, imm_reloc);
diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
index 89dc8d58ff09..b8bd42dff18c 100644
--- a/gas/testsuite/gas/riscv/insn-dwarf.d
+++ b/gas/testsuite/gas/riscv/insn-dwarf.d
@@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
insn.s +70 +0xfe.*
insn.s +71 +0x108.*
insn.s +72 +0x114.*
-insn.s +- +0x12a
+insn.s +74 +0x12a.*
+insn.s +75 +0x134.*
+insn.s +76 +0x13e.*
+insn.s +77 +0x154.*
+insn.s +78 +0x16a.*
+insn.s +79 +0x180.*
+insn.s +80 +0x196.*
+insn.s +81 +0x1ac.*
+insn.s +- +0x1c2
#pass
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index be6c9f9dd66a..60024555c71f 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -73,3 +73,11 @@ Disassembly of section .text:
[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000107f 00000000 00000000[ ]+\.byte[ ]+0x7f, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index cf84f177af39..b02d67c5f120 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -92,3 +92,25 @@ Disassembly of section .text:
[^:]+:[ ]+607f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
diff --git a/gas/testsuite/gas/riscv/insn.s b/gas/testsuite/gas/riscv/insn.s
index 94f62fe5672e..48db59b14e88 100644
--- a/gas/testsuite/gas/riscv/insn.s
+++ b/gas/testsuite/gas/riscv/insn.s
@@ -70,3 +70,12 @@ target:
.insn 10, 0x007f
.insn 12, 0x107f
.insn 22, 0x607f
+
+ .insn 0x8000000000000000007f
+ .insn 10, 0x8000000000000000007f
+ .insn 0x000000000000fedcba98765432100123456789ab607f
+ .insn 22, 0x000000000000fedcba98765432100123456789ab607f
+ .insn 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 0xfedcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 3a31647a2f80..59ebbaf13417 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
this is little-endian code. */
static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+riscv_disassemble_insn (bfd_vma memaddr,
+ insn_t word,
+ const bfd_byte *packet,
+ disassemble_info *info)
{
const struct riscv_opcode *op;
static bool init = false;
@@ -806,8 +809,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
", ");
(*info->fprintf_styled_func) (info->stream, dis_style_immediate,
"0x%02x",
- (unsigned int) (word & 0xff));
- word >>= 8;
+ (unsigned int) (*packet++));
}
}
break;
@@ -983,6 +985,7 @@ riscv_data_length (bfd_vma memaddr,
static int
riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
insn_t data,
+ const bfd_byte *packet ATTRIBUTE_UNUSED,
disassemble_info *info)
{
info->display_endian = info->endian;
@@ -1037,7 +1040,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
bfd_vma dump_size;
int status;
enum riscv_seg_mstate mstate;
- int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+ int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte*, struct disassemble_info *);
if (info->disassembler_options != NULL)
{
@@ -1081,7 +1084,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
}
insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
- return (*riscv_disassembler) (memaddr, insn, info);
+ return (*riscv_disassembler) (memaddr, insn, packet, info);
}
disassembler_ftype
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-21 7:37 ` Jan Beulich
@ 2022-11-23 8:40 ` Tsukasa OI
2022-11-23 8:44 ` Jan Beulich
2022-11-25 1:38 ` Nelson Chu
0 siblings, 2 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:40 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils, Nelson Chu
c.f. PATCH v2 2/2
<https://sourceware.org/pipermail/binutils/2022-November/124598.html>
On 2022/11/21 16:37, Jan Beulich wrote:
> On 19.11.2022 08:10, Tsukasa OI wrote:
>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>
>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>> instructions with .insn") tried to start supporting long instructions but
>> it was insufficient.
>>
>> 1. It heavily depended on the bignum internals (radix of 2^16),
>> 2. It generates "value conflicts with instruction length" even if a big
>> number instruction encoding does not exceed its expected length,
>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>> some information like DWARF line number correspondence was missing and
>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>> For long (unknown) instructions, instruction bits are incorrectly
>> zeroed out.
>
> Just FTR - of these 1 and 4 were deliberate (as in: deemed acceptable), the
> former the keep the code reasonably simple and the latter because focus was
> solely on the assembler.
I thought it's possible that 1 was a deliberate choice. I don't want to
depend on some internal structure that could change easily (as long as
this is a reasonable choice). To be honest, I didn't like my
"extracting prefix of an instruction" logic in PATCH v1 but I found a
good function: generic_bignum_to_int32 and decided use it on PATCH v2
(as a result, PATCH v2 2/2 is a bit simpler than PATCH v1).
For 4, resolving from the start would be better but since my current
focus is the RISC-V disassembler, I'm happy to resolve it (fortunately,
it didn't require large changes).
Nelson assigned you as the person who makes the final judgement for this
series and I want to hear your thoughts/decision about PATCH v2 2/2.
Thanks,
Tsukasa
>
> I recall dealing with some instances of 2, but as you demonstrate I failed
> to recognize and deal with further cases.
>
> I will admit that I didn't even think of debug info generation.
>
>> To solve these problems, this commit:
>>
>> 1. Handles bignum (and its encodings) precisely,
>> 2. Incorporates long opcode handling into regular struct
>> riscv_cl_insn-handling functions.
>> 3. Adds packet argument to support dumping instructions
>> longer than 64-bits.
>
> Thanks for taking the time to make improvements there.
>
> Jan
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-23 8:40 ` Tsukasa OI
@ 2022-11-23 8:44 ` Jan Beulich
2022-11-23 8:51 ` Tsukasa OI
2022-11-25 1:38 ` Nelson Chu
1 sibling, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-23 8:44 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 23.11.2022 09:40, Tsukasa OI wrote:
> c.f. PATCH v2 2/2
> <https://sourceware.org/pipermail/binutils/2022-November/124598.html>
>
> On 2022/11/21 16:37, Jan Beulich wrote:
>> On 19.11.2022 08:10, Tsukasa OI wrote:
>>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>>
>>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>>> instructions with .insn") tried to start supporting long instructions but
>>> it was insufficient.
>>>
>>> 1. It heavily depended on the bignum internals (radix of 2^16),
>>> 2. It generates "value conflicts with instruction length" even if a big
>>> number instruction encoding does not exceed its expected length,
>>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>>> some information like DWARF line number correspondence was missing and
>>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>>> For long (unknown) instructions, instruction bits are incorrectly
>>> zeroed out.
>>
>> Just FTR - of these 1 and 4 were deliberate (as in: deemed acceptable), the
>> former the keep the code reasonably simple and the latter because focus was
>> solely on the assembler.
>
> I thought it's possible that 1 was a deliberate choice. I don't want to
> depend on some internal structure that could change easily (as long as
> this is a reasonable choice). To be honest, I didn't like my
> "extracting prefix of an instruction" logic in PATCH v1 but I found a
> good function: generic_bignum_to_int32 and decided use it on PATCH v2
> (as a result, PATCH v2 2/2 is a bit simpler than PATCH v1).
FAOD by saying "deliberate" I have by no means meant to say that I'm not
happy to see you improve the state of things. I was merely trying to give
some background.
Jan
> For 4, resolving from the start would be better but since my current
> focus is the RISC-V disassembler, I'm happy to resolve it (fortunately,
> it didn't require large changes).
>
> Nelson assigned you as the person who makes the final judgement for this
> series and I want to hear your thoughts/decision about PATCH v2 2/2.
>
> Thanks,
> Tsukasa
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-23 8:44 ` Jan Beulich
@ 2022-11-23 8:51 ` Tsukasa OI
0 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-23 8:51 UTC (permalink / raw)
To: Jan Beulich, Binutils
On 2022/11/23 17:44, Jan Beulich wrote:
> On 23.11.2022 09:40, Tsukasa OI wrote:
>> c.f. PATCH v2 2/2
>> <https://sourceware.org/pipermail/binutils/2022-November/124598.html>
>>
>> On 2022/11/21 16:37, Jan Beulich wrote:
>>> On 19.11.2022 08:10, Tsukasa OI wrote:
>>>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>>>
>>>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>>>> instructions with .insn") tried to start supporting long instructions but
>>>> it was insufficient.
>>>>
>>>> 1. It heavily depended on the bignum internals (radix of 2^16),
>>>> 2. It generates "value conflicts with instruction length" even if a big
>>>> number instruction encoding does not exceed its expected length,
>>>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>>>> some information like DWARF line number correspondence was missing and
>>>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>>>> For long (unknown) instructions, instruction bits are incorrectly
>>>> zeroed out.
>>>
>>> Just FTR - of these 1 and 4 were deliberate (as in: deemed acceptable), the
>>> former the keep the code reasonably simple and the latter because focus was
>>> solely on the assembler.
>>
>> I thought it's possible that 1 was a deliberate choice. I don't want to
>> depend on some internal structure that could change easily (as long as
>> this is a reasonable choice). To be honest, I didn't like my
>> "extracting prefix of an instruction" logic in PATCH v1 but I found a
>> good function: generic_bignum_to_int32 and decided use it on PATCH v2
>> (as a result, PATCH v2 2/2 is a bit simpler than PATCH v1).
>
> FAOD by saying "deliberate" I have by no means meant to say that I'm not
> happy to see you improve the state of things. I was merely trying to give
> some background.
>
> Jan
Sorry, I never meant to blame you.
Tsukasa
>
>> For 4, resolving from the start would be better but since my current
>> focus is the RISC-V disassembler, I'm happy to resolve it (fortunately,
>> it didn't require large changes).
>>
>> Nelson assigned you as the person who makes the final judgement for this
>> series and I want to hear your thoughts/decision about PATCH v2 2/2.
>>
>> Thanks,
>> Tsukasa
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 1/2] RISC-V: Make .insn tests stricter
2022-11-23 8:20 ` Tsukasa OI
@ 2022-11-23 8:56 ` Jan Beulich
0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2022-11-23 8:56 UTC (permalink / raw)
To: Tsukasa OI, Nelson Chu; +Cc: binutils
On 23.11.2022 09:20, Tsukasa OI wrote:
> On 2022/11/21 16:32, Jan Beulich wrote:
>> On 19.11.2022 08:10, Tsukasa OI wrote:
>>> To make sure that all instruction bits are dumped through ".byte", this
>>> commit makes matching patterns stricter (to cover all instruction bits).
>>
>> Hmm, it was deliberate to omit the .<n>byte part of the disassembly,
>> very specifically because of the inconsistency of using <n> only in
>> some cases. Personally I would agree with such tightening of the
>> expectations only if the disassembler was first improved. But of
>> course it's the maintainers judgement ...
>
> I can agree most of your opinion except... even if we fix this
> inconsistency later (I won't, though), I think we can change the tests
> once we have changed the disassembler. So I don't find any problems
> making tests tighter.
>
> Nelson assigned you as the person who makes the final judgement and I
> want to hear your thoughts/decision about PATCH 1/2.
Hmm, I have to admit that I found Nelson's reply a little odd. For
arch-specific code I think the arch-specific maintainers are in far
better position to approve of patches, potentially also knowing
what longer term plans there are, and potentially also knowing of
reasons for the present behavior which I may be unaware of.
Therefore I'd like to make clear that my response was an opinion,
but not really meant to be an objection. Nevertheless some further
background (largely from experience with working on other
architectures, mostly x86): Once putting in specific expectations
for a particular test, the barrier to change those expectations
should be higher. So generally I would always advocate for putting
in as relaxed expectations as possible (but of course as strict as
necessary to fulfill the purpose of a test) when a particular
aspect of behavior may not be meant to remain that way long term.
Hence in the case here my intention with the original expectations
in the test in question was for them to remain valid once the
.<n>byte behavior was sanitized (which sooner or later I may get
to). Therefore I'd prefer (but not insist) that this remains to be
the case even after your adjustments. And I'd also prefer (but not
insist) that you, Nelson, at least voice an opinion, if not take on
again the role of being the one to eventually approve of the patch.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 2/2] RISC-V: Better support for long instructions
2022-11-23 8:30 ` [PATCH v2 2/2] RISC-V: Better support for long instructions Tsukasa OI
@ 2022-11-23 9:04 ` Jan Beulich
2022-11-24 2:34 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-23 9:04 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 23.11.2022 09:30, Tsukasa OI wrote:
> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>
> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
> instructions with .insn") tried to start supporting long instructions but
> it was insufficient.
>
> 1. It heavily depended on the bignum internals (radix of 2^16),
> 2. It generates "value conflicts with instruction length" even if a big
> number instruction encoding does not exceed its expected length,
> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
> some information like DWARF line number correspondence was missing and
> 4. On the disassembler, disassembler dump was limited up to 64-bit.
> For long (unknown) instructions, instruction bits are incorrectly
> zeroed out.
>
> To solve these problems, this commit:
>
> 1. Handles bignum (and its encodings) precisely,
> 2. Incorporates long opcode handling into regular
> struct riscv_cl_insn-handling functions and
> 3. Adds packet argument to support dumping instructions
> longer than 64-bits.
>
> gas/ChangeLog:
>
> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
> (create_insn) Clear long opcode marker.
> (install_insn) Install longer opcode as well.
> (s_riscv_insn) Likewise.
> (riscv_ip_hardcode): Make big number handling stricter. Length and
> the value conflicts only if the bignum size exceeds the expected
> maximum length.
> * testsuite/gas/riscv/insn.s: Add testcases such that big number
> handling is required.
> * testsuite/gas/riscv/insn.d: Likewise.
> * testsuite/gas/riscv/insn-na.d: Likewise.
> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
>
> opcodes/ChangeLog:
>
> * riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
> using the new argument packet.
> (riscv_disassemble_data): Add unused argument packet.
> (print_insn_riscv): Pass packet to the disassemble function.
The code changes look okay to me. For the testsuite additions I have
voiced my reservations, and I've given further background in an earlier
reply still on the v1 sub-thread. Whatever the resolution there would
imo want to be applied here as well.
As to mixing assembler and disassembler changes in the same patch: Is
this strictly necessary here for some reason? Generally I would suggest
to split such, but once again I wouldn't insist on you doing so ...
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 2/2] RISC-V: Better support for long instructions
2022-11-23 9:04 ` Jan Beulich
@ 2022-11-24 2:34 ` Tsukasa OI
2022-11-24 7:31 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-24 2:34 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils, Nelson Chu
On 2022/11/23 18:04, Jan Beulich wrote:
> On 23.11.2022 09:30, Tsukasa OI wrote:
>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>
>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>> instructions with .insn") tried to start supporting long instructions but
>> it was insufficient.
>>
>> 1. It heavily depended on the bignum internals (radix of 2^16),
>> 2. It generates "value conflicts with instruction length" even if a big
>> number instruction encoding does not exceed its expected length,
>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>> some information like DWARF line number correspondence was missing and
>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>> For long (unknown) instructions, instruction bits are incorrectly
>> zeroed out.
>>
>> To solve these problems, this commit:
>>
>> 1. Handles bignum (and its encodings) precisely,
>> 2. Incorporates long opcode handling into regular
>> struct riscv_cl_insn-handling functions and
>> 3. Adds packet argument to support dumping instructions
>> longer than 64-bits.
>>
>> gas/ChangeLog:
>>
>> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
>> (create_insn) Clear long opcode marker.
>> (install_insn) Install longer opcode as well.
>> (s_riscv_insn) Likewise.
>> (riscv_ip_hardcode): Make big number handling stricter. Length and
>> the value conflicts only if the bignum size exceeds the expected
>> maximum length.
>> * testsuite/gas/riscv/insn.s: Add testcases such that big number
>> handling is required.
>> * testsuite/gas/riscv/insn.d: Likewise.
>> * testsuite/gas/riscv/insn-na.d: Likewise.
>> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
>>
>> opcodes/ChangeLog:
>>
>> * riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
>> using the new argument packet.
>> (riscv_disassemble_data): Add unused argument packet.
>> (print_insn_riscv): Pass packet to the disassemble function.
>
> The code changes look okay to me. For the testsuite additions I have
> voiced my reservations, and I've given further background in an earlier
> reply still on the v1 sub-thread. Whatever the resolution there would
> imo want to be applied here as well.
Understood. My (minimum) opinion is, I want to keep 22-bytes patterns
corresponding PATCH v2 2/2 because that's exactly changed by the
assembler / disassembler fixes.
>
> As to mixing assembler and disassembler changes in the same patch: Is
> this strictly necessary here for some reason? Generally I would suggest
> to split such, but once again I wouldn't insist on you doing so ...
>
> Jan
>
I'm okay to split:
- Assembler fix + Disassembler fix + Test
to:
1. Assembler fix
2. Disassembler fix + Test
but there are a good reason I did like this in this patch.
To test fixed assembler, we need to fix disassembler as well. Although
they are not exactly the same issue, they are corresponding enough so
that merging changes might be justified.
But since they are not the same issue (as you pointed out), I'm okay to
split to two (three might be too much) separate patches.
Thanks,
Tsukasa
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 2/2] RISC-V: Better support for long instructions
2022-11-24 2:34 ` Tsukasa OI
@ 2022-11-24 7:31 ` Jan Beulich
2022-11-24 7:35 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-24 7:31 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 24.11.2022 03:34, Tsukasa OI wrote:
> On 2022/11/23 18:04, Jan Beulich wrote:
>> On 23.11.2022 09:30, Tsukasa OI wrote:
>>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>>
>>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>>> instructions with .insn") tried to start supporting long instructions but
>>> it was insufficient.
>>>
>>> 1. It heavily depended on the bignum internals (radix of 2^16),
>>> 2. It generates "value conflicts with instruction length" even if a big
>>> number instruction encoding does not exceed its expected length,
>>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>>> some information like DWARF line number correspondence was missing and
>>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>>> For long (unknown) instructions, instruction bits are incorrectly
>>> zeroed out.
>>>
>>> To solve these problems, this commit:
>>>
>>> 1. Handles bignum (and its encodings) precisely,
>>> 2. Incorporates long opcode handling into regular
>>> struct riscv_cl_insn-handling functions and
>>> 3. Adds packet argument to support dumping instructions
>>> longer than 64-bits.
>>>
>>> gas/ChangeLog:
>>>
>>> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
>>> (create_insn) Clear long opcode marker.
>>> (install_insn) Install longer opcode as well.
>>> (s_riscv_insn) Likewise.
>>> (riscv_ip_hardcode): Make big number handling stricter. Length and
>>> the value conflicts only if the bignum size exceeds the expected
>>> maximum length.
>>> * testsuite/gas/riscv/insn.s: Add testcases such that big number
>>> handling is required.
>>> * testsuite/gas/riscv/insn.d: Likewise.
>>> * testsuite/gas/riscv/insn-na.d: Likewise.
>>> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
>>>
>>> opcodes/ChangeLog:
>>>
>>> * riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
>>> using the new argument packet.
>>> (riscv_disassemble_data): Add unused argument packet.
>>> (print_insn_riscv): Pass packet to the disassemble function.
>>
>> The code changes look okay to me. For the testsuite additions I have
>> voiced my reservations, and I've given further background in an earlier
>> reply still on the v1 sub-thread. Whatever the resolution there would
>> imo want to be applied here as well.
>
> Understood. My (minimum) opinion is, I want to keep 22-bytes patterns
> corresponding PATCH v2 2/2 because that's exactly changed by the
> assembler / disassembler fixes.
But the assembler was rejecting the input there originally, wasn't it?
At which point _extending_ the testcase is certainly wanted, but do you
really need to check the ".byte ..." output to achieve the goal of the
test?
>> As to mixing assembler and disassembler changes in the same patch: Is
>> this strictly necessary here for some reason? Generally I would suggest
>> to split such, but once again I wouldn't insist on you doing so ...
>>
>> Jan
>>
> I'm okay to split:
> - Assembler fix + Disassembler fix + Test
> to:
> 1. Assembler fix
> 2. Disassembler fix + Test
> but there are a good reason I did like this in this patch.
>
> To test fixed assembler, we need to fix disassembler as well. Although
> they are not exactly the same issue, they are corresponding enough so
> that merging changes might be justified.
>
> But since they are not the same issue (as you pointed out), I'm okay to
> split to two (three might be too much) separate patches.
I agree three would be too much; I wonder whether
1. Disassembler fix
2. Assembler fix + Test
wouldn't be the better way to split, if you are going to in the first
place. Aiui the disassembler fix doesn't need any adjustments to
testcases, whereas my view is that the testcase addition is primarily
about the previously wrongly rejected assembler input, and only
secondarily about disassembler output. Hence keeping the testcase
adjustments with the assembler fix would, to me, seem more "natural".
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v2 2/2] RISC-V: Better support for long instructions
2022-11-24 7:31 ` Jan Beulich
@ 2022-11-24 7:35 ` Tsukasa OI
0 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-24 7:35 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
On 2022/11/24 16:31, Jan Beulich wrote:
> On 24.11.2022 03:34, Tsukasa OI wrote:
>> On 2022/11/23 18:04, Jan Beulich wrote:
>>> On 23.11.2022 09:30, Tsukasa OI wrote:
>>>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>>>
>>>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>>>> instructions with .insn") tried to start supporting long instructions but
>>>> it was insufficient.
>>>>
>>>> 1. It heavily depended on the bignum internals (radix of 2^16),
>>>> 2. It generates "value conflicts with instruction length" even if a big
>>>> number instruction encoding does not exceed its expected length,
>>>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>>>> some information like DWARF line number correspondence was missing and
>>>> 4. On the disassembler, disassembler dump was limited up to 64-bit.
>>>> For long (unknown) instructions, instruction bits are incorrectly
>>>> zeroed out.
>>>>
>>>> To solve these problems, this commit:
>>>>
>>>> 1. Handles bignum (and its encodings) precisely,
>>>> 2. Incorporates long opcode handling into regular
>>>> struct riscv_cl_insn-handling functions and
>>>> 3. Adds packet argument to support dumping instructions
>>>> longer than 64-bits.
>>>>
>>>> gas/ChangeLog:
>>>>
>>>> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
>>>> (create_insn) Clear long opcode marker.
>>>> (install_insn) Install longer opcode as well.
>>>> (s_riscv_insn) Likewise.
>>>> (riscv_ip_hardcode): Make big number handling stricter. Length and
>>>> the value conflicts only if the bignum size exceeds the expected
>>>> maximum length.
>>>> * testsuite/gas/riscv/insn.s: Add testcases such that big number
>>>> handling is required.
>>>> * testsuite/gas/riscv/insn.d: Likewise.
>>>> * testsuite/gas/riscv/insn-na.d: Likewise.
>>>> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
>>>>
>>>> opcodes/ChangeLog:
>>>>
>>>> * riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
>>>> using the new argument packet.
>>>> (riscv_disassemble_data): Add unused argument packet.
>>>> (print_insn_riscv): Pass packet to the disassemble function.
>>>
>>> The code changes look okay to me. For the testsuite additions I have
>>> voiced my reservations, and I've given further background in an earlier
>>> reply still on the v1 sub-thread. Whatever the resolution there would
>>> imo want to be applied here as well.
>>
>> Understood. My (minimum) opinion is, I want to keep 22-bytes patterns
>> corresponding PATCH v2 2/2 because that's exactly changed by the
>> assembler / disassembler fixes.
>
> But the assembler was rejecting the input there originally, wasn't it?
> At which point _extending_ the testcase is certainly wanted, but do you
> really need to check the ".byte ..." output to achieve the goal of the
> test?
>
>>> As to mixing assembler and disassembler changes in the same patch: Is
>>> this strictly necessary here for some reason? Generally I would suggest
>>> to split such, but once again I wouldn't insist on you doing so ...
>>>
>>> Jan
>>>
>> I'm okay to split:
>> - Assembler fix + Disassembler fix + Test
>> to:
>> 1. Assembler fix
>> 2. Disassembler fix + Test
>> but there are a good reason I did like this in this patch.
>>
>> To test fixed assembler, we need to fix disassembler as well. Although
>> they are not exactly the same issue, they are corresponding enough so
>> that merging changes might be justified.
>>
>> But since they are not the same issue (as you pointed out), I'm okay to
>> split to two (three might be too much) separate patches.
>
> I agree three would be too much; I wonder whether
>
> 1. Disassembler fix
> 2. Assembler fix + Test
>
> wouldn't be the better way to split, if you are going to in the first
> place. Aiui the disassembler fix doesn't need any adjustments to
> testcases, whereas my view is that the testcase addition is primarily
> about the previously wrongly rejected assembler input, and only
> secondarily about disassembler output. Hence keeping the testcase
> adjustments with the assembler fix would, to me, seem more "natural".
>
> Jan
>
Ah, I can agree that as well and I feel your option more natural.
I somehow missed that (probably because of my health issues for a weeks).
Anyway, thanks for pointing this out!
Tsukasa
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-23 8:40 ` Tsukasa OI
2022-11-23 8:44 ` Jan Beulich
@ 2022-11-25 1:38 ` Nelson Chu
2022-11-25 2:33 ` Tsukasa OI
1 sibling, 1 reply; 38+ messages in thread
From: Nelson Chu @ 2022-11-25 1:38 UTC (permalink / raw)
To: Tsukasa OI; +Cc: Jan Beulich, binutils
On Wed, Nov 23, 2022 at 4:40 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
>
> Nelson assigned you as the person who makes the final judgement for this
> series and I want to hear your thoughts/decision about PATCH v2 2/2.
I'm not a natural language speaker, so I thought this should be a
little bit of a misunderstanding. "Assigned" isn't a suitable word
here, Jan is one of the global binutils maintainers, so he definitely
has the right to approve anything by himself, including the target
stuff. Not only for these two patches, I will definitely respect all
of his suggestions, just like he always respects the decision from
target maintainers. Therefore, I am glad he has time to help review
and do lots of risc-v stuff, and also thanks to everyone who helped
risc-v, of course including you :)
Thanks
Nelson
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 2/2] RISC-V: Better support for long instructions Tsukasa OI
@ 2022-11-25 2:17 ` Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
` (3 more replies)
2 siblings, 4 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 2:17 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
Hello,
c.f. PATCH v1:
<https://sourceware.org/pipermail/binutils/2022-November/124516.html>
c.f. PATCH v2:
<https://sourceware.org/pipermail/binutils/2022-November/124596.html>
[Changes: v2 -> v3]
1. PATCH v2 1/2 is removed
2. PATCH v2 2/2 is splitted to PATCH v3 {1,2}/2
based on the feedback of Jan
Strict ".byte" testcases are only preserved to test new behavior.
They are not 4-byte aligned (10 and 22-bytes) and unlikely to change
any time soon.
I hope this can be a good compromise.
[Changes: v1 -> v2]
1. Rebased (as usual)
2. PATCH 2/2: Simplified the logic to extract low instruction bits
(will describe later)
3. PATCH 2/2: Changed the commit message slightly
Thanks,
Tsukasa
Tsukasa OI (2):
RISC-V: Better support for long instructions (disassembler)
RISC-V: Better support for long instructions (assembler)
gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 +++++++
opcodes/riscv-dis.c | 13 ++++++----
6 files changed, 86 insertions(+), 14 deletions(-)
base-commit: 18a119b83d1f0f661532e5167af1c5549496759c
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler)
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
@ 2022-11-25 2:17 ` Tsukasa OI
2022-11-25 8:03 ` Jan Beulich
2022-11-25 2:17 ` [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler) Tsukasa OI
` (2 subsequent siblings)
3 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 2:17 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
1. On the disassembler, disassembler dump was limited up to 64-bit.
For long (unknown) instructions, instruction bits are incorrectly
zeroed out.
To solve these problems, this commit adds packet argument to support dumping
instructions longer than 64-bits. This commit will be tested on the next
commit "RISC-V: Better support for long instructions (assembler)".
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
---
opcodes/riscv-dis.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 3a31647a2f80..59ebbaf13417 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
this is little-endian code. */
static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+riscv_disassemble_insn (bfd_vma memaddr,
+ insn_t word,
+ const bfd_byte *packet,
+ disassemble_info *info)
{
const struct riscv_opcode *op;
static bool init = false;
@@ -806,8 +809,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
", ");
(*info->fprintf_styled_func) (info->stream, dis_style_immediate,
"0x%02x",
- (unsigned int) (word & 0xff));
- word >>= 8;
+ (unsigned int) (*packet++));
}
}
break;
@@ -983,6 +985,7 @@ riscv_data_length (bfd_vma memaddr,
static int
riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
insn_t data,
+ const bfd_byte *packet ATTRIBUTE_UNUSED,
disassemble_info *info)
{
info->display_endian = info->endian;
@@ -1037,7 +1040,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
bfd_vma dump_size;
int status;
enum riscv_seg_mstate mstate;
- int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+ int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte*, struct disassemble_info *);
if (info->disassembler_options != NULL)
{
@@ -1081,7 +1084,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
}
insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
- return (*riscv_disassembler) (memaddr, insn, info);
+ return (*riscv_disassembler) (memaddr, insn, packet, info);
}
disassembler_ftype
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
@ 2022-11-25 2:17 ` Tsukasa OI
2022-11-25 8:15 ` Jan Beulich
2022-11-25 11:41 ` [PATCH v3 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
3 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 2:17 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
1. It heavily depended on the bignum internals (radix of 2^16),
2. It generates "value conflicts with instruction length" even if a big
number instruction encoding does not exceed its expected length,
3. Because long opcode was handled separately (from struct riscv_cl_insn),
some information like DWARF line number correspondence was missing and
To solve these problems, this commit:
1. Handles bignum (and its encodings) precisely and
2. Incorporates long opcode handling into regular instruction handling.
gas/ChangeLog:
* config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
(create_insn) Clear long opcode marker.
(install_insn) Install longer opcode as well.
(s_riscv_insn) Likewise.
(riscv_ip_hardcode): Make big number handling stricter. Length and
the value conflicts only if the bignum size exceeds the expected
maximum length.
* testsuite/gas/riscv/insn.s: Add testcases such that big number
handling is required.
* testsuite/gas/riscv/insn.d: Likewise.
* testsuite/gas/riscv/insn-na.d: Likewise.
* testsuite/gas/riscv/insn-dwarf.d: Likewise.
---
gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 +++++++
5 files changed, 78 insertions(+), 9 deletions(-)
diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 019545171f5e..2237062d8b45 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -45,6 +45,9 @@ struct riscv_cl_insn
/* The encoded instruction bits. */
insn_t insn_opcode;
+ /* The long encoded instruction bits ([0] is non-zero if used). */
+ char insn_long_opcode[RISCV_MAX_INSN_LEN];
+
/* The frag that contains the instruction. */
struct frag *frag;
@@ -714,6 +717,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
{
insn->insn_mo = mo;
insn->insn_opcode = mo->match;
+ insn->insn_long_opcode[0] = 0; /* Long insn has non-zero value. */
insn->frag = NULL;
insn->where = 0;
insn->fixp = NULL;
@@ -725,7 +729,10 @@ static void
install_insn (const struct riscv_cl_insn *insn)
{
char *f = insn->frag->fr_literal + insn->where;
- number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
+ if (insn->insn_long_opcode[0] != 0)
+ memcpy (f, insn->insn_long_opcode, insn_length (insn));
+ else
+ number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
}
/* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
@@ -3481,7 +3488,9 @@ riscv_ip_hardcode (char *str,
values[num++] = (insn_t) imm_expr->X_add_number;
break;
case O_big:
- values[num++] = generic_bignum[0];
+ /* Extract lower 32-bits of a big number.
+ Assume that generic_bignum_to_int32 work on such number. */
+ values[num++] = (insn_t) generic_bignum_to_int32 ();
break;
default:
/* The first value isn't constant, so it should be
@@ -3508,12 +3517,25 @@ riscv_ip_hardcode (char *str,
if (imm_expr->X_op == O_big)
{
- if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
+ unsigned int llen = 0;
+ for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
+ lval != 0; llen++)
+ lval >>= BITS_PER_CHAR;
+ unsigned int repr_bytes
+ = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
+ if (bytes < repr_bytes)
return _("value conflicts with instruction length");
- char *f = frag_more (bytes);
- for (num = 0; num < imm_expr->X_add_number; ++num)
- number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
- generic_bignum[num], CHARS_PER_LITTLENUM);
+ for (num = 0; num < imm_expr->X_add_number - 1; ++num)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ CHARS_PER_LITTLENUM);
+ if (llen != 0)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ llen);
+ memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
return NULL;
}
@@ -4590,7 +4612,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
else
as_bad ("%s `%s'", error.msg, error.statement);
}
- else if (imm_expr.X_op != O_big)
+ else
{
gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
append_insn (&insn, &imm_expr, imm_reloc);
diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
index 89dc8d58ff09..b8bd42dff18c 100644
--- a/gas/testsuite/gas/riscv/insn-dwarf.d
+++ b/gas/testsuite/gas/riscv/insn-dwarf.d
@@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
insn.s +70 +0xfe.*
insn.s +71 +0x108.*
insn.s +72 +0x114.*
-insn.s +- +0x12a
+insn.s +74 +0x12a.*
+insn.s +75 +0x134.*
+insn.s +76 +0x13e.*
+insn.s +77 +0x154.*
+insn.s +78 +0x16a.*
+insn.s +79 +0x180.*
+insn.s +80 +0x196.*
+insn.s +81 +0x1ac.*
+insn.s +- +0x1c2
#pass
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index 66dce71ebc21..6928ba9ba0f2 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -73,3 +73,11 @@ Disassembly of section .text:
[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index 2e5d35b39702..b25bc35553fd 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -92,3 +92,25 @@ Disassembly of section .text:
[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
diff --git a/gas/testsuite/gas/riscv/insn.s b/gas/testsuite/gas/riscv/insn.s
index 94f62fe5672e..48db59b14e88 100644
--- a/gas/testsuite/gas/riscv/insn.s
+++ b/gas/testsuite/gas/riscv/insn.s
@@ -70,3 +70,12 @@ target:
.insn 10, 0x007f
.insn 12, 0x107f
.insn 22, 0x607f
+
+ .insn 0x8000000000000000007f
+ .insn 10, 0x8000000000000000007f
+ .insn 0x000000000000fedcba98765432100123456789ab607f
+ .insn 22, 0x000000000000fedcba98765432100123456789ab607f
+ .insn 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 0xfedcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 2/2] RISC-V: Better support for long instructions
2022-11-25 1:38 ` Nelson Chu
@ 2022-11-25 2:33 ` Tsukasa OI
0 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 2:33 UTC (permalink / raw)
To: Nelson Chu; +Cc: Jan Beulich, binutils
On 2022/11/25 10:38, Nelson Chu wrote:
> On Wed, Nov 23, 2022 at 4:40 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
>>
>> Nelson assigned you as the person who makes the final judgement for this
>> series and I want to hear your thoughts/decision about PATCH v2 2/2.
>
> I'm not a natural language speaker, so I thought this should be a
> little bit of a misunderstanding. "Assigned" isn't a suitable word
> here, Jan is one of the global binutils maintainers, so he definitely
Yes. After sending the message, I noticed that I used a "too strong"
word here. Nelson's understandings are correct and thanks for your support.
I just submitted the PATCH v3 and I hope that Jan likes it.
Thanks,
Tsukasa
> has the right to approve anything by himself, including the target
> stuff. Not only for these two patches, I will definitely respect all
> of his suggestions, just like he always respects the decision from
> target maintainers. Therefore, I am glad he has time to help review
> and do lots of risc-v stuff, and also thanks to everyone who helped
> risc-v, of course including you :)
>
> Thanks
> Nelson
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler)
2022-11-25 2:17 ` [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
@ 2022-11-25 8:03 ` Jan Beulich
0 siblings, 0 replies; 38+ messages in thread
From: Jan Beulich @ 2022-11-25 8:03 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 25.11.2022 03:17, Tsukasa OI wrote:
> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>
> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
> instructions with .insn") tried to start supporting long instructions but
> it was insufficient.
>
> 1. On the disassembler, disassembler dump was limited up to 64-bit.
> For long (unknown) instructions, instruction bits are incorrectly
> zeroed out.
>
> To solve these problems, this commit adds packet argument to support dumping
> instructions longer than 64-bits. This commit will be tested on the next
> commit "RISC-V: Better support for long instructions (assembler)".
>
> opcodes/ChangeLog:
>
> * riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
> using the new argument packet.
> (riscv_disassemble_data): Add unused argument packet.
> (print_insn_riscv): Pass packet to the disassemble function.
Looks okay to me; just one style nit:
> --- a/opcodes/riscv-dis.c
> +++ b/opcodes/riscv-dis.c
> @@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
> this is little-endian code. */
>
> static int
> -riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
> +riscv_disassemble_insn (bfd_vma memaddr,
> + insn_t word,
> + const bfd_byte *packet,
Just like you have it here, please add ...
> @@ -1037,7 +1040,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
> bfd_vma dump_size;
> int status;
> enum riscv_seg_mstate mstate;
> - int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
> + int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte*, struct disassemble_info *);
... a blank before * here. I guess you will also want to wrap this
(now long) line.
No need to submit a v4 just for that, i.e. feel free to simply commit
with the adjustment (after giving Nelson a little bit of time, just
in case).
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 2:17 ` [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler) Tsukasa OI
@ 2022-11-25 8:15 ` Jan Beulich
2022-11-25 8:39 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-25 8:15 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 25.11.2022 03:17, Tsukasa OI wrote:
> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>
> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
> instructions with .insn") tried to start supporting long instructions but
> it was insufficient.
>
> 1. It heavily depended on the bignum internals (radix of 2^16),
> 2. It generates "value conflicts with instruction length" even if a big
> number instruction encoding does not exceed its expected length,
> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
> some information like DWARF line number correspondence was missing and
>
> To solve these problems, this commit:
>
> 1. Handles bignum (and its encodings) precisely and
> 2. Incorporates long opcode handling into regular instruction handling.
>
> gas/ChangeLog:
>
> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
> (create_insn) Clear long opcode marker.
> (install_insn) Install longer opcode as well.
> (s_riscv_insn) Likewise.
> (riscv_ip_hardcode): Make big number handling stricter. Length and
> the value conflicts only if the bignum size exceeds the expected
> maximum length.
> * testsuite/gas/riscv/insn.s: Add testcases such that big number
> handling is required.
> * testsuite/gas/riscv/insn.d: Likewise.
> * testsuite/gas/riscv/insn-na.d: Likewise.
> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
> ---
> gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
> gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
> gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
> gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++
> gas/testsuite/gas/riscv/insn.s | 9 +++++++
> 5 files changed, 78 insertions(+), 9 deletions(-)
>
> diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
> index 019545171f5e..2237062d8b45 100644
> --- a/gas/config/tc-riscv.c
> +++ b/gas/config/tc-riscv.c
> @@ -45,6 +45,9 @@ struct riscv_cl_insn
> /* The encoded instruction bits. */
> insn_t insn_opcode;
>
> + /* The long encoded instruction bits ([0] is non-zero if used). */
> + char insn_long_opcode[RISCV_MAX_INSN_LEN];
> +
> /* The frag that contains the instruction. */
> struct frag *frag;
>
> @@ -714,6 +717,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
> {
> insn->insn_mo = mo;
> insn->insn_opcode = mo->match;
> + insn->insn_long_opcode[0] = 0; /* Long insn has non-zero value. */
> insn->frag = NULL;
> insn->where = 0;
> insn->fixp = NULL;
> @@ -725,7 +729,10 @@ static void
> install_insn (const struct riscv_cl_insn *insn)
> {
> char *f = insn->frag->fr_literal + insn->where;
> - number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
> + if (insn->insn_long_opcode[0] != 0)
> + memcpy (f, insn->insn_long_opcode, insn_length (insn));
> + else
> + number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
> }
>
> /* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
> @@ -3481,7 +3488,9 @@ riscv_ip_hardcode (char *str,
> values[num++] = (insn_t) imm_expr->X_add_number;
> break;
> case O_big:
> - values[num++] = generic_bignum[0];
> + /* Extract lower 32-bits of a big number.
> + Assume that generic_bignum_to_int32 work on such number. */
> + values[num++] = (insn_t) generic_bignum_to_int32 ();
> break;
> default:
> /* The first value isn't constant, so it should be
> @@ -3508,12 +3517,25 @@ riscv_ip_hardcode (char *str,
>
> if (imm_expr->X_op == O_big)
> {
> - if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
> + unsigned int llen = 0;
> + for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
> + lval != 0; llen++)
> + lval >>= BITS_PER_CHAR;
> + unsigned int repr_bytes
> + = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
> + if (bytes < repr_bytes)
> return _("value conflicts with instruction length");
> - char *f = frag_more (bytes);
> - for (num = 0; num < imm_expr->X_add_number; ++num)
> - number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
> - generic_bignum[num], CHARS_PER_LITTLENUM);
> + for (num = 0; num < imm_expr->X_add_number - 1; ++num)
> + number_to_chars_littleendian (
> + ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
> + generic_bignum[num],
> + CHARS_PER_LITTLENUM);
> + if (llen != 0)
> + number_to_chars_littleendian (
> + ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
> + generic_bignum[num],
> + llen);
> + memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
> return NULL;
> }
>
> @@ -4590,7 +4612,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
> else
> as_bad ("%s `%s'", error.msg, error.statement);
> }
> - else if (imm_expr.X_op != O_big)
> + else
> {
> gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
> append_insn (&insn, &imm_expr, imm_reloc);
> diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
> index 89dc8d58ff09..b8bd42dff18c 100644
> --- a/gas/testsuite/gas/riscv/insn-dwarf.d
> +++ b/gas/testsuite/gas/riscv/insn-dwarf.d
> @@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
> insn.s +70 +0xfe.*
> insn.s +71 +0x108.*
> insn.s +72 +0x114.*
> -insn.s +- +0x12a
> +insn.s +74 +0x12a.*
> +insn.s +75 +0x134.*
> +insn.s +76 +0x13e.*
> +insn.s +77 +0x154.*
> +insn.s +78 +0x16a.*
> +insn.s +79 +0x180.*
> +insn.s +80 +0x196.*
> +insn.s +81 +0x1ac.*
> +insn.s +- +0x1c2
> #pass
> diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
> index 66dce71ebc21..6928ba9ba0f2 100644
> --- a/gas/testsuite/gas/riscv/insn-na.d
> +++ b/gas/testsuite/gas/riscv/insn-na.d
> @@ -73,3 +73,11 @@ Disassembly of section .text:
> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
I have to admit that I still don't see what good the ".byte ..." part of
the expectations does for the purpose of the test. In the cover letter
you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
change any time soon." But changing that is exactly my plan (unless
objected to by the arch maintainers): Showing insn components as bytes
is imo reasonable for RISC-V at most when things aren't even 2-byte
aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
matching the "raw opcode" output left to .<N>byte. In fact when raw
opcodes are output I question the need for any .<N>byte - it's fully
redundant.
Bottom line: As before I'd prefer if these parts were dropped (to limit
the churn on the files when changing the .<N>byte granularity), but I'm
not going to insist. Apart from this the change looks good to me.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 8:15 ` Jan Beulich
@ 2022-11-25 8:39 ` Tsukasa OI
2022-11-25 9:04 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 8:39 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
On 2022/11/25 17:15, Jan Beulich wrote:
> On 25.11.2022 03:17, Tsukasa OI wrote:
>> From: Tsukasa OI <research_trasio@irq.a4lg.com>
>>
>> Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
>> instructions with .insn") tried to start supporting long instructions but
>> it was insufficient.
>>
>> 1. It heavily depended on the bignum internals (radix of 2^16),
>> 2. It generates "value conflicts with instruction length" even if a big
>> number instruction encoding does not exceed its expected length,
>> 3. Because long opcode was handled separately (from struct riscv_cl_insn),
>> some information like DWARF line number correspondence was missing and
>>
>> To solve these problems, this commit:
>>
>> 1. Handles bignum (and its encodings) precisely and
>> 2. Incorporates long opcode handling into regular instruction handling.
>>
>> gas/ChangeLog:
>>
>> * config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
>> (create_insn) Clear long opcode marker.
>> (install_insn) Install longer opcode as well.
>> (s_riscv_insn) Likewise.
>> (riscv_ip_hardcode): Make big number handling stricter. Length and
>> the value conflicts only if the bignum size exceeds the expected
>> maximum length.
>> * testsuite/gas/riscv/insn.s: Add testcases such that big number
>> handling is required.
>> * testsuite/gas/riscv/insn.d: Likewise.
>> * testsuite/gas/riscv/insn-na.d: Likewise.
>> * testsuite/gas/riscv/insn-dwarf.d: Likewise.
>> ---
>> gas/config/tc-riscv.c | 38 ++++++++++++++++++++++------
>> gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++-
>> gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
>> gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++
>> gas/testsuite/gas/riscv/insn.s | 9 +++++++
>> 5 files changed, 78 insertions(+), 9 deletions(-)
>>
>> diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
>> index 019545171f5e..2237062d8b45 100644
>> --- a/gas/config/tc-riscv.c
>> +++ b/gas/config/tc-riscv.c
>> @@ -45,6 +45,9 @@ struct riscv_cl_insn
>> /* The encoded instruction bits. */
>> insn_t insn_opcode;
>>
>> + /* The long encoded instruction bits ([0] is non-zero if used). */
>> + char insn_long_opcode[RISCV_MAX_INSN_LEN];
>> +
>> /* The frag that contains the instruction. */
>> struct frag *frag;
>>
>> @@ -714,6 +717,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
>> {
>> insn->insn_mo = mo;
>> insn->insn_opcode = mo->match;
>> + insn->insn_long_opcode[0] = 0; /* Long insn has non-zero value. */
>> insn->frag = NULL;
>> insn->where = 0;
>> insn->fixp = NULL;
>> @@ -725,7 +729,10 @@ static void
>> install_insn (const struct riscv_cl_insn *insn)
>> {
>> char *f = insn->frag->fr_literal + insn->where;
>> - number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
>> + if (insn->insn_long_opcode[0] != 0)
>> + memcpy (f, insn->insn_long_opcode, insn_length (insn));
>> + else
>> + number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
>> }
>>
>> /* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
>> @@ -3481,7 +3488,9 @@ riscv_ip_hardcode (char *str,
>> values[num++] = (insn_t) imm_expr->X_add_number;
>> break;
>> case O_big:
>> - values[num++] = generic_bignum[0];
>> + /* Extract lower 32-bits of a big number.
>> + Assume that generic_bignum_to_int32 work on such number. */
>> + values[num++] = (insn_t) generic_bignum_to_int32 ();
>> break;
>> default:
>> /* The first value isn't constant, so it should be
>> @@ -3508,12 +3517,25 @@ riscv_ip_hardcode (char *str,
>>
>> if (imm_expr->X_op == O_big)
>> {
>> - if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
>> + unsigned int llen = 0;
>> + for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
>> + lval != 0; llen++)
>> + lval >>= BITS_PER_CHAR;
>> + unsigned int repr_bytes
>> + = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
>> + if (bytes < repr_bytes)
>> return _("value conflicts with instruction length");
>> - char *f = frag_more (bytes);
>> - for (num = 0; num < imm_expr->X_add_number; ++num)
>> - number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
>> - generic_bignum[num], CHARS_PER_LITTLENUM);
>> + for (num = 0; num < imm_expr->X_add_number - 1; ++num)
>> + number_to_chars_littleendian (
>> + ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
>> + generic_bignum[num],
>> + CHARS_PER_LITTLENUM);
>> + if (llen != 0)
>> + number_to_chars_littleendian (
>> + ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
>> + generic_bignum[num],
>> + llen);
>> + memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
>> return NULL;
>> }
>>
>> @@ -4590,7 +4612,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
>> else
>> as_bad ("%s `%s'", error.msg, error.statement);
>> }
>> - else if (imm_expr.X_op != O_big)
>> + else
>> {
>> gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
>> append_insn (&insn, &imm_expr, imm_reloc);
>> diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
>> index 89dc8d58ff09..b8bd42dff18c 100644
>> --- a/gas/testsuite/gas/riscv/insn-dwarf.d
>> +++ b/gas/testsuite/gas/riscv/insn-dwarf.d
>> @@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
>> insn.s +70 +0xfe.*
>> insn.s +71 +0x108.*
>> insn.s +72 +0x114.*
>> -insn.s +- +0x12a
>> +insn.s +74 +0x12a.*
>> +insn.s +75 +0x134.*
>> +insn.s +76 +0x13e.*
>> +insn.s +77 +0x154.*
>> +insn.s +78 +0x16a.*
>> +insn.s +79 +0x180.*
>> +insn.s +80 +0x196.*
>> +insn.s +81 +0x1ac.*
>> +insn.s +- +0x1c2
>> #pass
>> diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
>> index 66dce71ebc21..6928ba9ba0f2 100644
>> --- a/gas/testsuite/gas/riscv/insn-na.d
>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>> @@ -73,3 +73,11 @@ Disassembly of section .text:
>> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>
> I have to admit that I still don't see what good the ".byte ..." part of
> the expectations does for the purpose of the test. In the cover letter
> you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
> change any time soon." But changing that is exactly my plan (unless
> objected to by the arch maintainers): Showing insn components as bytes
> is imo reasonable for RISC-V at most when things aren't even 2-byte
> aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
> matching the "raw opcode" output left to .<N>byte. In fact when raw
> opcodes are output I question the need for any .<N>byte - it's fully
> redundant>
> Bottom line: As before I'd prefer if these parts were dropped (to limit
> the churn on the files when changing the .<N>byte granularity), but I'm
> not going to insist. Apart from this the change looks good to me.
>
> Jan
>
Okay, I have to admit that I misunderstood your intent.
Quoting my PATCH v1 cover letter:
> [Disassembler: Instruction is trimmed with 64-bits]
>
> In this section, we reuse the object file generated by the section above
> (two 22-byte instructions).
>
> 0000000000000000 <.text>:
> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 8: 0000 0000 0000 0000
> 10: 0000 0000 0000
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 7654 ba98 fedc zeroed out after first 64-bits
>
> See ".byte" at the address 0x16. It's trimmed at 64-bits.
> The resolution is simple. If we simply add a char* argument (containing all
> instruction bits), we can easily resolve this.
>
> 0000000000000000 <.text>:
> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 8: 0000 0000 0000 0000
> 10: 0000 0000 0000
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 7654 ba98 fedc all instruction bits are correct
At least, I want to test whether disassembly of this part (after first
64-bits of an instruction) is fixed. That is exactly what's fixed in
PATCH 1/2 and I want to make sure that this part is changed correctly.
If you change the output, you can freely remove or replace the testcase
according to the new output format but until that happens, I want to
keep those. What am I missing?
Tsukasa
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 8:39 ` Tsukasa OI
@ 2022-11-25 9:04 ` Jan Beulich
2022-11-25 9:18 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-25 9:04 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils
On 25.11.2022 09:39, Tsukasa OI wrote:
> On 2022/11/25 17:15, Jan Beulich wrote:
>> On 25.11.2022 03:17, Tsukasa OI wrote:
>>> --- a/gas/testsuite/gas/riscv/insn-na.d
>>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>>> @@ -73,3 +73,11 @@ Disassembly of section .text:
>>> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>>> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>>> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>
>> I have to admit that I still don't see what good the ".byte ..." part of
>> the expectations does for the purpose of the test. In the cover letter
>> you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
>> change any time soon." But changing that is exactly my plan (unless
>> objected to by the arch maintainers): Showing insn components as bytes
>> is imo reasonable for RISC-V at most when things aren't even 2-byte
>> aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
>> matching the "raw opcode" output left to .<N>byte. In fact when raw
>> opcodes are output I question the need for any .<N>byte - it's fully
>> redundant>
>> Bottom line: As before I'd prefer if these parts were dropped (to limit
>> the churn on the files when changing the .<N>byte granularity), but I'm
>> not going to insist. Apart from this the change looks good to me.
>
> Okay, I have to admit that I misunderstood your intent.
>
> Quoting my PATCH v1 cover letter:
>
>> [Disassembler: Instruction is trimmed with 64-bits]
>>
>> In this section, we reuse the object file generated by the section above
>> (two 22-byte instructions).
>>
>> 0000000000000000 <.text>:
>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> 8: 0000 0000 0000 0000
>> 10: 0000 0000 0000
>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>>
>> See ".byte" at the address 0x16. It's trimmed at 64-bits.
>> The resolution is simple. If we simply add a char* argument (containing all
>> instruction bits), we can easily resolve this.
>>
>> 0000000000000000 <.text>:
>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> 8: 0000 0000 0000 0000
>> 10: 0000 0000 0000
>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> 26: 7654 ba98 fedc all instruction bits are correct
>
> At least, I want to test whether disassembly of this part (after first
> 64-bits of an instruction) is fixed. That is exactly what's fixed in
> PATCH 1/2 and I want to make sure that this part is changed correctly.
Which you already achieve by the raw opcode output
607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000
The subsequent
.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
does not check anything the first part didn't already check.
> If you change the output, you can freely remove or replace the testcase
> according to the new output format but until that happens, I want to
> keep those. What am I missing?
I can only restate what I've said earlier: Testcases would imo better be
supplying as strict as necessary but as relaxed as possible expectations.
If it was truly the _intention_ for .byte (and not e.g. .2byte) to be
used here, _then_ it would make sense to have this be part of the
expectations. And then, by recording it that way, you also raise the
barrier of someone changing the behavior - after all the testcase then
says it is intended to be that way (rather than, as I assume here, it
being merely "the way it is").
And yes, if you look at other testcase expectations you will frequently
find way too strict expectations (often enough people simply take the
output of the dumping tool, massage it suitably to be regex-es, and be
done - sometimes with the effect of event recording outright wrong
expectations, simply because huge output is cumbersome to check for
correctness). In many cases this has resulted in unnecessarily big code
churn when extending such testcases, or unhelpful mishmash because of
people then always adding to the end instead of at a more sensible
place (e.g. next to related stuff). Hence in particular for a still
relatively new and tidy port like RISC-V is, it would seem desirable to
me to learn from mistakes made elsewhere before.
Yet to reiterate - I'm not going to insist, first and foremost because
binutils, unlike other projects, has overall relatively relaxed
acceptance criteria for patches.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 9:04 ` Jan Beulich
@ 2022-11-25 9:18 ` Tsukasa OI
2022-11-25 9:56 ` Jan Beulich
0 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 9:18 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
On 2022/11/25 18:04, Jan Beulich wrote:
> On 25.11.2022 09:39, Tsukasa OI wrote:
>> On 2022/11/25 17:15, Jan Beulich wrote:
>>> On 25.11.2022 03:17, Tsukasa OI wrote:
>>>> --- a/gas/testsuite/gas/riscv/insn-na.d
>>>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>>>> @@ -73,3 +73,11 @@ Disassembly of section .text:
>>>> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>>>> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>>>> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>
>>> I have to admit that I still don't see what good the ".byte ..." part of
>>> the expectations does for the purpose of the test. In the cover letter
>>> you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
>>> change any time soon." But changing that is exactly my plan (unless
>>> objected to by the arch maintainers): Showing insn components as bytes
>>> is imo reasonable for RISC-V at most when things aren't even 2-byte
>>> aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
>>> matching the "raw opcode" output left to .<N>byte. In fact when raw
>>> opcodes are output I question the need for any .<N>byte - it's fully
>>> redundant>
>>> Bottom line: As before I'd prefer if these parts were dropped (to limit
>>> the churn on the files when changing the .<N>byte granularity), but I'm
>>> not going to insist. Apart from this the change looks good to me.
>>
>> Okay, I have to admit that I misunderstood your intent.
>>
>> Quoting my PATCH v1 cover letter:
>>
>>> [Disassembler: Instruction is trimmed with 64-bits]
>>>
>>> In this section, we reuse the object file generated by the section above
>>> (two 22-byte instructions).
>>>
>>> 0000000000000000 <.text>:
>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>> 8: 0000 0000 0000 0000
>>> 10: 0000 0000 0000
>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>>>
>>> See ".byte" at the address 0x16. It's trimmed at 64-bits.
>>> The resolution is simple. If we simply add a char* argument (containing all
>>> instruction bits), we can easily resolve this.
>>>
>>> 0000000000000000 <.text>:
>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>> 8: 0000 0000 0000 0000
>>> 10: 0000 0000 0000
>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> 26: 7654 ba98 fedc all instruction bits are correct
>>
>> At least, I want to test whether disassembly of this part (after first
>> 64-bits of an instruction) is fixed. That is exactly what's fixed in
>> PATCH 1/2 and I want to make sure that this part is changed correctly.
>
> Which you already achieve by the raw opcode output
>
> 607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000
>
> The subsequent
>
> .byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>
> does not check anything the first part didn't already check.
That's simply not true.
Again, quoting from PATCH v1 cover letter (BEFORE THE PATCH
[disassembler part]):
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 7654 ba98 fedc zeroed out after first 64-bits
Compare hexdump and printed bytes (in ".byte"). You can see that "1e:
89ab..." are different from corresponding ".byte" (0x00, 0x00...).
They are completely separate and PATCH 1/2 only changes ".byte" output.
If ".byte[ ]+0x7f, 0x60..." does not check anything the first part
didn't already check, the dump would look like this:
> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> 1e: 0000 0000 0000 0000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 26: 0000 0000 0000 zeroed out after first 64-bits (but matches to hexdump)
If the hexdump and ".byte" output always matches EVEN BEFORE THE FIX,
that's what I can call "redundant". But in reality, they do not.
That's why I want to leave a test to make sure that the issue is fixed
(now hexdump and ".byte" output always matches).
Tsukasa
>
>> If you change the output, you can freely remove or replace the testcase
>> according to the new output format but until that happens, I want to
>> keep those. What am I missing?
>
> I can only restate what I've said earlier: Testcases would imo better be
> supplying as strict as necessary but as relaxed as possible expectations.
> If it was truly the _intention_ for .byte (and not e.g. .2byte) to be
> used here, _then_ it would make sense to have this be part of the
> expectations. And then, by recording it that way, you also raise the
> barrier of someone changing the behavior - after all the testcase then
> says it is intended to be that way (rather than, as I assume here, it
> being merely "the way it is").
>
> And yes, if you look at other testcase expectations you will frequently
> find way too strict expectations (often enough people simply take the
> output of the dumping tool, massage it suitably to be regex-es, and be
> done - sometimes with the effect of event recording outright wrong
> expectations, simply because huge output is cumbersome to check for
> correctness). In many cases this has resulted in unnecessarily big code
> churn when extending such testcases, or unhelpful mishmash because of
> people then always adding to the end instead of at a more sensible
> place (e.g. next to related stuff). Hence in particular for a still
> relatively new and tidy port like RISC-V is, it would seem desirable to
> me to learn from mistakes made elsewhere before.
>
> Yet to reiterate - I'm not going to insist, first and foremost because
> binutils, unlike other projects, has overall relatively relaxed
> acceptance criteria for patches.
>
> Jan
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 9:18 ` Tsukasa OI
@ 2022-11-25 9:56 ` Jan Beulich
2022-11-25 11:07 ` Tsukasa OI
0 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-25 9:56 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils
On 25.11.2022 10:18, Tsukasa OI wrote:
> On 2022/11/25 18:04, Jan Beulich wrote:
>> On 25.11.2022 09:39, Tsukasa OI wrote:
>>> On 2022/11/25 17:15, Jan Beulich wrote:
>>>> On 25.11.2022 03:17, Tsukasa OI wrote:
>>>>> --- a/gas/testsuite/gas/riscv/insn-na.d
>>>>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>>>>> @@ -73,3 +73,11 @@ Disassembly of section .text:
>>>>> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>>>>> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>>>>> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>>> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>>
>>>> I have to admit that I still don't see what good the ".byte ..." part of
>>>> the expectations does for the purpose of the test. In the cover letter
>>>> you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
>>>> change any time soon." But changing that is exactly my plan (unless
>>>> objected to by the arch maintainers): Showing insn components as bytes
>>>> is imo reasonable for RISC-V at most when things aren't even 2-byte
>>>> aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
>>>> matching the "raw opcode" output left to .<N>byte. In fact when raw
>>>> opcodes are output I question the need for any .<N>byte - it's fully
>>>> redundant>
>>>> Bottom line: As before I'd prefer if these parts were dropped (to limit
>>>> the churn on the files when changing the .<N>byte granularity), but I'm
>>>> not going to insist. Apart from this the change looks good to me.
>>>
>>> Okay, I have to admit that I misunderstood your intent.
>>>
>>> Quoting my PATCH v1 cover letter:
>>>
>>>> [Disassembler: Instruction is trimmed with 64-bits]
>>>>
>>>> In this section, we reuse the object file generated by the section above
>>>> (two 22-byte instructions).
>>>>
>>>> 0000000000000000 <.text>:
>>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>> 8: 0000 0000 0000 0000
>>>> 10: 0000 0000 0000
>>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>>>>
>>>> See ".byte" at the address 0x16. It's trimmed at 64-bits.
>>>> The resolution is simple. If we simply add a char* argument (containing all
>>>> instruction bits), we can easily resolve this.
>>>>
>>>> 0000000000000000 <.text>:
>>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>> 8: 0000 0000 0000 0000
>>>> 10: 0000 0000 0000
>>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
>>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> 26: 7654 ba98 fedc all instruction bits are correct
>>>
>>> At least, I want to test whether disassembly of this part (after first
>>> 64-bits of an instruction) is fixed. That is exactly what's fixed in
>>> PATCH 1/2 and I want to make sure that this part is changed correctly.
>>
>> Which you already achieve by the raw opcode output
>>
>> 607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000
>>
>> The subsequent
>>
>> .byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>
>> does not check anything the first part didn't already check.
>
> That's simply not true.
>
> Again, quoting from PATCH v1 cover letter (BEFORE THE PATCH
> [disassembler part]):
>
>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>
> Compare hexdump and printed bytes (in ".byte"). You can see that "1e:
> 89ab..." are different from corresponding ".byte" (0x00, 0x00...).
> They are completely separate and PATCH 1/2 only changes ".byte" output.
>
> If ".byte[ ]+0x7f, 0x60..." does not check anything the first part
> didn't already check, the dump would look like this:
>
>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>> 1e: 0000 0000 0000 0000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> 26: 0000 0000 0000 zeroed out after first 64-bits (but matches to hexdump)
>
> If the hexdump and ".byte" output always matches EVEN BEFORE THE FIX,
> that's what I can call "redundant". But in reality, they do not.
> That's why I want to leave a test to make sure that the issue is fixed
> (now hexdump and ".byte" output always matches).
Oh, apologies: Disassembly then was wrong _only_ for the .byte part, but
_not_ for the raw encoding? While indeed (going back) you said so in the
original cover letter, the description of patch 1 doesn't make this
clear. In that case, yes, I agree that the .byte part wants to be part
of the expectations (but patch 1's description also wants adjusting).
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler)
2022-11-25 9:56 ` Jan Beulich
@ 2022-11-25 11:07 ` Tsukasa OI
0 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:07 UTC (permalink / raw)
To: Jan Beulich; +Cc: binutils
On 2022/11/25 18:56, Jan Beulich wrote:
> On 25.11.2022 10:18, Tsukasa OI wrote:
>> On 2022/11/25 18:04, Jan Beulich wrote:
>>> On 25.11.2022 09:39, Tsukasa OI wrote:
>>>> On 2022/11/25 17:15, Jan Beulich wrote:
>>>>> On 25.11.2022 03:17, Tsukasa OI wrote:
>>>>>> --- a/gas/testsuite/gas/riscv/insn-na.d
>>>>>> +++ b/gas/testsuite/gas/riscv/insn-na.d
>>>>>> @@ -73,3 +73,11 @@ Disassembly of section .text:
>>>>>> [^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
>>>>>> [^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
>>>>>> [^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
>>>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>>>> +[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
>>>>>> +[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>>>
>>>>> I have to admit that I still don't see what good the ".byte ..." part of
>>>>> the expectations does for the purpose of the test. In the cover letter
>>>>> you say "They are not 4-byte aligned (10 and 22-bytes) and unlikely to
>>>>> change any time soon." But changing that is exactly my plan (unless
>>>>> objected to by the arch maintainers): Showing insn components as bytes
>>>>> is imo reasonable for RISC-V at most when things aren't even 2-byte
>>>>> aligned. IOW I'd see these to be "disassembled" to ".2byte ...",
>>>>> matching the "raw opcode" output left to .<N>byte. In fact when raw
>>>>> opcodes are output I question the need for any .<N>byte - it's fully
>>>>> redundant>
>>>>> Bottom line: As before I'd prefer if these parts were dropped (to limit
>>>>> the churn on the files when changing the .<N>byte granularity), but I'm
>>>>> not going to insist. Apart from this the change looks good to me.
>>>>
>>>> Okay, I have to admit that I misunderstood your intent.
>>>>
>>>> Quoting my PATCH v1 cover letter:
>>>>
>>>>> [Disassembler: Instruction is trimmed with 64-bits]
>>>>>
>>>>> In this section, we reuse the object file generated by the section above
>>>>> (two 22-byte instructions).
>>>>>
>>>>> 0000000000000000 <.text>:
>>>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>>> 8: 0000 0000 0000 0000
>>>>> 10: 0000 0000 0000
>>>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>>>>>
>>>>> See ".byte" at the address 0x16. It's trimmed at 64-bits.
>>>>> The resolution is simple. If we simply add a char* argument (containing all
>>>>> instruction bits), we can easily resolve this.
>>>>>
>>>>> 0000000000000000 <.text>:
>>>>> 0: 607f 0000 0000 0000 .byte 0x7f, 0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>>> 8: 0000 0000 0000 0000
>>>>> 10: 0000 0000 0000
>>>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
>>>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> 26: 7654 ba98 fedc all instruction bits are correct
>>>>
>>>> At least, I want to test whether disassembly of this part (after first
>>>> 64-bits of an instruction) is fixed. That is exactly what's fixed in
>>>> PATCH 1/2 and I want to make sure that this part is changed correctly.
>>>
>>> Which you already achieve by the raw opcode output
>>>
>>> 607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000
>>>
>>> The subsequent
>>>
>>> .byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>>
>>> does not check anything the first part didn't already check.
>>
>> That's simply not true.
>>
>> Again, quoting from PATCH v1 cover letter (BEFORE THE PATCH
>> [disassembler part]):
>>
>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>> 1e: 89ab 4567 0123 3210 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> 26: 7654 ba98 fedc zeroed out after first 64-bits
>>
>> Compare hexdump and printed bytes (in ".byte"). You can see that "1e:
>> 89ab..." are different from corresponding ".byte" (0x00, 0x00...).
>> They are completely separate and PATCH 1/2 only changes ".byte" output.
>>
>> If ".byte[ ]+0x7f, 0x60..." does not check anything the first part
>> didn't already check, the dump would look like this:
>>
>>> 16: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
>>> 1e: 0000 0000 0000 0000 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> 26: 0000 0000 0000 zeroed out after first 64-bits (but matches to hexdump)
>>
>> If the hexdump and ".byte" output always matches EVEN BEFORE THE FIX,
>> that's what I can call "redundant". But in reality, they do not.
>> That's why I want to leave a test to make sure that the issue is fixed
>> (now hexdump and ".byte" output always matches).
>
> Oh, apologies: Disassembly then was wrong _only_ for the .byte part, but
> _not_ for the raw encoding? While indeed (going back) you said so in the
> original cover letter, the description of patch 1 doesn't make this
> clear. In that case, yes, I agree that the .byte part wants to be part
> of the expectations (but patch 1's description also wants adjusting).
>
> Jan
>
I'm so happy to resolve misunderstandings between us and I sincerely
apologize for misleading text. Hmm... splitting to three patches
(instead of two) might be better on PATCH v4.
1. Disassembler Fix
(clarify exactly what is going to be fixed: ".byte" output)
2. Assembler Fix
3. New testcases
(clarify that it tests both disassembler and assembler fixes)
PATCH v4 may not be submitted today but I will work on it.
Thanks,
Tsukasa
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler) Tsukasa OI
@ 2022-11-25 11:41 ` Tsukasa OI
2022-11-25 11:41 ` [PATCH v3 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
3 siblings, 1 reply; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:41 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
Hello,
c.f. PATCH v1 (with cover letter with some backgrounds):
<https://sourceware.org/pipermail/binutils/2022-November/124516.html>
c.f. PATCH v2:
<https://sourceware.org/pipermail/binutils/2022-November/124596.html>
c.f. PATCH v3:
<https://sourceware.org/pipermail/binutils/2022-November/124643.html>
[Changes: v3 -> v4]
1. Split to three separate patches
(disassembler fix, assembler fix and testcases that require both fixes)
2. PATCH 1/3: Further clarification of the intent of this commit
(with examples) on the commit message. No changes in the code.
3. PATCH 2/3: Minor clarification on comments and the commit message.
No code changes but some comment changes in tc-riscv.c.
4. PATCH 3/3: Clarify that it tests both fixes.
[Changes: v2 -> v3]
1. PATCH v2 1/2 is removed
2. PATCH v2 2/2 is splitted to PATCH v3 {1,2}/2
based on the feedback of Jan
Strict ".byte" testcases are only preserved to test new behavior.
They are not 4-byte aligned (10 and 22-bytes) and unlikely to change
any time soon.
[Changes: v1 -> v2]
1. Rebased (as usual)
2. PATCH 2/2: Simplified the logic to extract low instruction bits
(will describe later)
3. PATCH 2/2: Changed the commit message slightly
Thanks,
Tsukasa
Tsukasa OI (3):
RISC-V: Better support for long instructions (disassembler)
RISC-V: Better support for long instructions (assembler)
RISC-V: Better support for long instructions (tests)
gas/config/tc-riscv.c | 41 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 ++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
gas/testsuite/gas/riscv/insn.d | 22 +++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 ++++++
opcodes/riscv-dis.c | 14 ++++++----
6 files changed, 89 insertions(+), 15 deletions(-)
base-commit: ac8df5a1921904b3928429e696ad8b40c612f829
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v3 1/3] RISC-V: Better support for long instructions (disassembler)
2022-11-25 11:41 ` [PATCH v3 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
@ 2022-11-25 11:41 ` Tsukasa OI
0 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:41 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
On the disassembler, correct ".byte" output was limited to the first 64-bits
of an instruction. After that, zeroes are incorrectly printed.
Note that, it only happens on ".byte" output (instruction part) and not on
hexdump (data) part. For example, before this commit, hexdump and ".byte"
produces different values:
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (before the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
Note that, after 0xcd (after first 64-bits of the target instruction), all
".byte" values are incorrectly printed as zero while hexdump prints correct
instruction bits.
To resolve this, this commit adds "packet" argument to support dumping
instructions longer than 64-bits (to print correct instruction bits on
".byte"). This commit will be tested on the separate commit.
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (after the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
---
opcodes/riscv-dis.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 3a31647a2f80..0e1f3b4610aa 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
this is little-endian code. */
static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+riscv_disassemble_insn (bfd_vma memaddr,
+ insn_t word,
+ const bfd_byte *packet,
+ disassemble_info *info)
{
const struct riscv_opcode *op;
static bool init = false;
@@ -806,8 +809,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
", ");
(*info->fprintf_styled_func) (info->stream, dis_style_immediate,
"0x%02x",
- (unsigned int) (word & 0xff));
- word >>= 8;
+ (unsigned int) (*packet++));
}
}
break;
@@ -983,6 +985,7 @@ riscv_data_length (bfd_vma memaddr,
static int
riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
insn_t data,
+ const bfd_byte *packet ATTRIBUTE_UNUSED,
disassemble_info *info)
{
info->display_endian = info->endian;
@@ -1037,7 +1040,8 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
bfd_vma dump_size;
int status;
enum riscv_seg_mstate mstate;
- int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+ int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte *,
+ struct disassemble_info *);
if (info->disassembler_options != NULL)
{
@@ -1081,7 +1085,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
}
insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
- return (*riscv_disassembler) (memaddr, insn, info);
+ return (*riscv_disassembler) (memaddr, insn, packet, info);
}
disassembler_ftype
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
` (2 preceding siblings ...)
2022-11-25 11:41 ` [PATCH v3 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
@ 2022-11-25 11:42 ` Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
` (3 more replies)
3 siblings, 4 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:42 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
Hello,
c.f. PATCH v1 (with cover letter with some backgrounds):
<https://sourceware.org/pipermail/binutils/2022-November/124516.html>
c.f. PATCH v2:
<https://sourceware.org/pipermail/binutils/2022-November/124596.html>
c.f. PATCH v3:
<https://sourceware.org/pipermail/binutils/2022-November/124643.html>
[Changes: v3 -> v4]
1. Split to three separate patches
(disassembler fix, assembler fix and testcases that require both fixes)
2. PATCH 1/3: Further clarification of the intent of this commit
(with examples) on the commit message. No changes in the code.
3. PATCH 2/3: Minor clarification on comments and the commit message.
No code changes but some comment changes in tc-riscv.c.
4. PATCH 3/3: Clarify that it tests both fixes.
[Changes: v2 -> v3]
1. PATCH v2 1/2 is removed
2. PATCH v2 2/2 is splitted to PATCH v3 {1,2}/2
based on the feedback of Jan
Strict ".byte" testcases are only preserved to test new behavior.
They are not 4-byte aligned (10 and 22-bytes) and unlikely to change
any time soon.
[Changes: v1 -> v2]
1. Rebased (as usual)
2. PATCH 2/2: Simplified the logic to extract low instruction bits
(will describe later)
3. PATCH 2/2: Changed the commit message slightly
Thanks,
Tsukasa
Tsukasa OI (3):
RISC-V: Better support for long instructions (disassembler)
RISC-V: Better support for long instructions (assembler)
RISC-V: Better support for long instructions (tests)
gas/config/tc-riscv.c | 41 ++++++++++++++++++++++------
gas/testsuite/gas/riscv/insn-dwarf.d | 10 ++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++
gas/testsuite/gas/riscv/insn.d | 22 +++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 ++++++
opcodes/riscv-dis.c | 14 ++++++----
6 files changed, 89 insertions(+), 15 deletions(-)
base-commit: ac8df5a1921904b3928429e696ad8b40c612f829
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v4 1/3] RISC-V: Better support for long instructions (disassembler)
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
@ 2022-11-25 11:42 ` Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 2/3] RISC-V: Better support for long instructions (assembler) Tsukasa OI
` (2 subsequent siblings)
3 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:42 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
On the disassembler, correct ".byte" output was limited to the first 64-bits
of an instruction. After that, zeroes are incorrectly printed.
Note that, it only happens on ".byte" output (instruction part) and not on
hexdump (data) part. For example, before this commit, hexdump and ".byte"
produces different values:
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (before the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
Note that, after 0xcd (after first 64-bits of the target instruction), all
".byte" values are incorrectly printed as zero while hexdump prints correct
instruction bits.
To resolve this, this commit adds "packet" argument to support dumping
instructions longer than 64-bits (to print correct instruction bits on
".byte"). This commit will be tested on the separate commit.
Assembly:
.insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
objdump output example (after the fix):
10: 607f 33cc 55aa cdef .byte 0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
18: 89ab 4567 0123 3210
20: 7654 ba98 fedc
opcodes/ChangeLog:
* riscv-dis.c (riscv_disassemble_insn): Print unknown instruction
using the new argument packet.
(riscv_disassemble_data): Add unused argument packet.
(print_insn_riscv): Pass packet to the disassemble function.
---
opcodes/riscv-dis.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/opcodes/riscv-dis.c b/opcodes/riscv-dis.c
index 3a31647a2f80..0e1f3b4610aa 100644
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -641,7 +641,10 @@ print_insn_args (const char *oparg, insn_t l, bfd_vma pc, disassemble_info *info
this is little-endian code. */
static int
-riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
+riscv_disassemble_insn (bfd_vma memaddr,
+ insn_t word,
+ const bfd_byte *packet,
+ disassemble_info *info)
{
const struct riscv_opcode *op;
static bool init = false;
@@ -806,8 +809,7 @@ riscv_disassemble_insn (bfd_vma memaddr, insn_t word, disassemble_info *info)
", ");
(*info->fprintf_styled_func) (info->stream, dis_style_immediate,
"0x%02x",
- (unsigned int) (word & 0xff));
- word >>= 8;
+ (unsigned int) (*packet++));
}
}
break;
@@ -983,6 +985,7 @@ riscv_data_length (bfd_vma memaddr,
static int
riscv_disassemble_data (bfd_vma memaddr ATTRIBUTE_UNUSED,
insn_t data,
+ const bfd_byte *packet ATTRIBUTE_UNUSED,
disassemble_info *info)
{
info->display_endian = info->endian;
@@ -1037,7 +1040,8 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
bfd_vma dump_size;
int status;
enum riscv_seg_mstate mstate;
- int (*riscv_disassembler) (bfd_vma, insn_t, struct disassemble_info *);
+ int (*riscv_disassembler) (bfd_vma, insn_t, const bfd_byte *,
+ struct disassemble_info *);
if (info->disassembler_options != NULL)
{
@@ -1081,7 +1085,7 @@ print_insn_riscv (bfd_vma memaddr, struct disassemble_info *info)
}
insn = (insn_t) bfd_get_bits (packet, dump_size * 8, false);
- return (*riscv_disassembler) (memaddr, insn, info);
+ return (*riscv_disassembler) (memaddr, insn, packet, info);
}
disassembler_ftype
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v4 2/3] RISC-V: Better support for long instructions (assembler)
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
@ 2022-11-25 11:42 ` Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 3/3] RISC-V: Better support for long instructions (tests) Tsukasa OI
2022-11-25 13:08 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Jan Beulich
3 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:42 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
Commit bb996692bd96 ("RISC-V/gas: allow generating up to 176-bit
instructions with .insn") tried to start supporting long instructions but
it was insufficient.
1. It heavily depended on the bignum internals (radix of 2^16),
2. It generates "value conflicts with instruction length" even if a big
number instruction encoding does not exceed its expected length and
3. Because long opcode was handled separately (from struct riscv_cl_insn),
some information like DWARF line number correspondence was missing.
To resolve these problems, this commit:
1. Handles bignum (and its encodings) precisely and
2. Incorporates long opcode handling into regular instruction handling.
This commit will be tested on the separate commit.
gas/ChangeLog:
* config/tc-riscv.c (struct riscv_cl_insn): Add long opcode field.
(create_insn) Clear long opcode marker.
(install_insn) Install longer opcode as well.
(s_riscv_insn) Likewise.
(riscv_ip_hardcode): Make big number handling stricter. Length and
the value conflicts only if the bignum size exceeds the expected
maximum length.
---
gas/config/tc-riscv.c | 41 ++++++++++++++++++++++++++++++++---------
1 file changed, 32 insertions(+), 9 deletions(-)
diff --git a/gas/config/tc-riscv.c b/gas/config/tc-riscv.c
index 96d71dd1db60..0682eb355241 100644
--- a/gas/config/tc-riscv.c
+++ b/gas/config/tc-riscv.c
@@ -42,9 +42,13 @@ struct riscv_cl_insn
/* The opcode's entry in riscv_opcodes. */
const struct riscv_opcode *insn_mo;
- /* The encoded instruction bits. */
+ /* The encoded instruction bits
+ (first bits enough to extract instruction length on a long opcode). */
insn_t insn_opcode;
+ /* The long encoded instruction bits ([0] is non-zero on a long opcode). */
+ char insn_long_opcode[RISCV_MAX_INSN_LEN];
+
/* The frag that contains the instruction. */
struct frag *frag;
@@ -720,6 +724,7 @@ create_insn (struct riscv_cl_insn *insn, const struct riscv_opcode *mo)
{
insn->insn_mo = mo;
insn->insn_opcode = mo->match;
+ insn->insn_long_opcode[0] = 0;
insn->frag = NULL;
insn->where = 0;
insn->fixp = NULL;
@@ -731,7 +736,10 @@ static void
install_insn (const struct riscv_cl_insn *insn)
{
char *f = insn->frag->fr_literal + insn->where;
- number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
+ if (insn->insn_long_opcode[0] != 0)
+ memcpy (f, insn->insn_long_opcode, insn_length (insn));
+ else
+ number_to_chars_littleendian (f, insn->insn_opcode, insn_length (insn));
}
/* Move INSN to offset WHERE in FRAG. Adjust the fixups accordingly
@@ -3503,7 +3511,9 @@ riscv_ip_hardcode (char *str,
values[num++] = (insn_t) imm_expr->X_add_number;
break;
case O_big:
- values[num++] = generic_bignum[0];
+ /* Extract lower 32-bits of a big number.
+ Assume that generic_bignum_to_int32 work on such number. */
+ values[num++] = (insn_t) generic_bignum_to_int32 ();
break;
default:
/* The first value isn't constant, so it should be
@@ -3530,12 +3540,25 @@ riscv_ip_hardcode (char *str,
if (imm_expr->X_op == O_big)
{
- if (bytes != imm_expr->X_add_number * CHARS_PER_LITTLENUM)
+ unsigned int llen = 0;
+ for (LITTLENUM_TYPE lval = generic_bignum[imm_expr->X_add_number - 1];
+ lval != 0; llen++)
+ lval >>= BITS_PER_CHAR;
+ unsigned int repr_bytes
+ = (imm_expr->X_add_number - 1) * CHARS_PER_LITTLENUM + llen;
+ if (bytes < repr_bytes)
return _("value conflicts with instruction length");
- char *f = frag_more (bytes);
- for (num = 0; num < imm_expr->X_add_number; ++num)
- number_to_chars_littleendian (f + num * CHARS_PER_LITTLENUM,
- generic_bignum[num], CHARS_PER_LITTLENUM);
+ for (num = 0; num < imm_expr->X_add_number - 1; ++num)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ CHARS_PER_LITTLENUM);
+ if (llen != 0)
+ number_to_chars_littleendian (
+ ip->insn_long_opcode + num * CHARS_PER_LITTLENUM,
+ generic_bignum[num],
+ llen);
+ memset(ip->insn_long_opcode + repr_bytes, 0, bytes - repr_bytes);
return NULL;
}
@@ -4612,7 +4635,7 @@ s_riscv_insn (int x ATTRIBUTE_UNUSED)
else
as_bad ("%s `%s'", error.msg, error.statement);
}
- else if (imm_expr.X_op != O_big)
+ else
{
gas_assert (insn.insn_mo->pinfo != INSN_MACRO);
append_insn (&insn, &imm_expr, imm_reloc);
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH v4 3/3] RISC-V: Better support for long instructions (tests)
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 2/3] RISC-V: Better support for long instructions (assembler) Tsukasa OI
@ 2022-11-25 11:42 ` Tsukasa OI
2022-11-25 13:08 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Jan Beulich
3 siblings, 0 replies; 38+ messages in thread
From: Tsukasa OI @ 2022-11-25 11:42 UTC (permalink / raw)
To: Tsukasa OI, Jan Beulich, Nelson Chu; +Cc: binutils
From: Tsukasa OI <research_trasio@irq.a4lg.com>
This commit tests both (assembler and disassembler) fixes of "Better support
for long instructions".
gas/ChangeLog:
* testsuite/gas/riscv/insn.s: Add testcases such that big number
handling is required and should be disassembled as long ".byte"
sequence with correct instruction bits.
* testsuite/gas/riscv/insn.d: Likewise.
* testsuite/gas/riscv/insn-na.d: Likewise.
* testsuite/gas/riscv/insn-dwarf.d: Likewise.
---
gas/testsuite/gas/riscv/insn-dwarf.d | 10 +++++++++-
gas/testsuite/gas/riscv/insn-na.d | 8 ++++++++
gas/testsuite/gas/riscv/insn.d | 22 ++++++++++++++++++++++
gas/testsuite/gas/riscv/insn.s | 9 +++++++++
4 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/gas/testsuite/gas/riscv/insn-dwarf.d b/gas/testsuite/gas/riscv/insn-dwarf.d
index 89dc8d58ff09..b8bd42dff18c 100644
--- a/gas/testsuite/gas/riscv/insn-dwarf.d
+++ b/gas/testsuite/gas/riscv/insn-dwarf.d
@@ -74,5 +74,13 @@ insn.s +69 +0xf6.*
insn.s +70 +0xfe.*
insn.s +71 +0x108.*
insn.s +72 +0x114.*
-insn.s +- +0x12a
+insn.s +74 +0x12a.*
+insn.s +75 +0x134.*
+insn.s +76 +0x13e.*
+insn.s +77 +0x154.*
+insn.s +78 +0x16a.*
+insn.s +79 +0x180.*
+insn.s +80 +0x196.*
+insn.s +81 +0x1ac.*
+insn.s +- +0x1c2
#pass
diff --git a/gas/testsuite/gas/riscv/insn-na.d b/gas/testsuite/gas/riscv/insn-na.d
index 66dce71ebc21..6928ba9ba0f2 100644
--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -73,3 +73,11 @@ Disassembly of section .text:
[^:]+:[ ]+007f 0000 0000 0000 0000[ ]+[._a-z].*
[^:]+:[ ]+0000107f 00000000 00000000[ ]+[._a-z].*
[^:]+:[ ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ ]+[._a-z].*
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+007f 0000 0000 0000 8000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
diff --git a/gas/testsuite/gas/riscv/insn.d b/gas/testsuite/gas/riscv/insn.d
index 2e5d35b39702..b25bc35553fd 100644
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -92,3 +92,25 @@ Disassembly of section .text:
[^:]+:[ ]+607f 0000 0000 0000[ ]+[._a-z].*
[^:]+:[ ]+0000 0000 0000 0000 ?
[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+007f 0000 0000 0000[ ]+\.byte[ ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ ]+8000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 89ab 4567 0123[ ]+\.byte[ ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ ]+3210 7654 ba98 fedc ?
+[^:]+:[ ]+0000 0000 0000 ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 00dc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
+[^:]+:[ ]+607f 33cc 55aa cdef[ ]+\.byte[ ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ ]+89ab 4567 0123 3210 ?
+[^:]+:[ ]+7654 ba98 fedc ?
diff --git a/gas/testsuite/gas/riscv/insn.s b/gas/testsuite/gas/riscv/insn.s
index 94f62fe5672e..48db59b14e88 100644
--- a/gas/testsuite/gas/riscv/insn.s
+++ b/gas/testsuite/gas/riscv/insn.s
@@ -70,3 +70,12 @@ target:
.insn 10, 0x007f
.insn 12, 0x107f
.insn 22, 0x607f
+
+ .insn 0x8000000000000000007f
+ .insn 10, 0x8000000000000000007f
+ .insn 0x000000000000fedcba98765432100123456789ab607f
+ .insn 22, 0x000000000000fedcba98765432100123456789ab607f
+ .insn 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0x00dcba98765432100123456789abcdef55aa33cc607f
+ .insn 0xfedcba98765432100123456789abcdef55aa33cc607f
+ .insn 22, 0xfedcba98765432100123456789abcdef55aa33cc607f
--
2.38.1
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
` (2 preceding siblings ...)
2022-11-25 11:42 ` [PATCH v4 3/3] RISC-V: Better support for long instructions (tests) Tsukasa OI
@ 2022-11-25 13:08 ` Jan Beulich
2022-11-28 1:53 ` Nelson Chu
3 siblings, 1 reply; 38+ messages in thread
From: Jan Beulich @ 2022-11-25 13:08 UTC (permalink / raw)
To: Tsukasa OI; +Cc: binutils, Nelson Chu
On 25.11.2022 12:42, Tsukasa OI wrote:
> Tsukasa OI (3):
> RISC-V: Better support for long instructions (disassembler)
> RISC-V: Better support for long instructions (assembler)
> RISC-V: Better support for long instructions (tests)
LGTM, but again please wait a day or two with committing in case Nelson
wants to take another look. And thanks for being patient with me.
Jan
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits])
2022-11-25 13:08 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Jan Beulich
@ 2022-11-28 1:53 ` Nelson Chu
0 siblings, 0 replies; 38+ messages in thread
From: Nelson Chu @ 2022-11-28 1:53 UTC (permalink / raw)
To: Jan Beulich; +Cc: Tsukasa OI, binutils
On Fri, Nov 25, 2022 at 9:08 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 25.11.2022 12:42, Tsukasa OI wrote:
> > Tsukasa OI (3):
> > RISC-V: Better support for long instructions (disassembler)
> > RISC-V: Better support for long instructions (assembler)
> > RISC-V: Better support for long instructions (tests)
>
> LGTM, but again please wait a day or two with committing in case Nelson
> wants to take another look. And thanks for being patient with me.
Hey Jan, thanks for your reply to clarify the roles of us. FYI, the
.insn test cases were not so stricter because we haven't had any
dis-assembler improvements at that time, so not sure which the
dis-assembler results are better for those cases where the
instructions are not supported. Anyway, I have seen the details,
which look good to me, too, so please commit these three patches.
Thanks for both of your support, Jan and Tsukasa
Nelson
^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2022-11-28 1:54 UTC | newest]
Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-19 7:10 [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-19 7:10 ` [PATCH 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
2022-11-21 7:32 ` Jan Beulich
2022-11-23 8:20 ` Tsukasa OI
2022-11-23 8:56 ` Jan Beulich
2022-11-19 7:10 ` [PATCH 2/2] RISC-V: Better support for long instructions Tsukasa OI
2022-11-21 7:37 ` Jan Beulich
2022-11-23 8:40 ` Tsukasa OI
2022-11-23 8:44 ` Jan Beulich
2022-11-23 8:51 ` Tsukasa OI
2022-11-25 1:38 ` Nelson Chu
2022-11-25 2:33 ` Tsukasa OI
2022-11-22 0:43 ` [PATCH 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Nelson Chu
2022-11-23 8:30 ` [PATCH v2 " Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 1/2] RISC-V: Make .insn tests stricter Tsukasa OI
2022-11-23 8:30 ` [PATCH v2 2/2] RISC-V: Better support for long instructions Tsukasa OI
2022-11-23 9:04 ` Jan Beulich
2022-11-24 2:34 ` Tsukasa OI
2022-11-24 7:31 ` Jan Beulich
2022-11-24 7:35 ` Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 0/2] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 2:17 ` [PATCH v3 1/2] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 8:03 ` Jan Beulich
2022-11-25 2:17 ` [PATCH v3 2/2] RISC-V: Better support for long instructions (assembler) Tsukasa OI
2022-11-25 8:15 ` Jan Beulich
2022-11-25 8:39 ` Tsukasa OI
2022-11-25 9:04 ` Jan Beulich
2022-11-25 9:18 ` Tsukasa OI
2022-11-25 9:56 ` Jan Beulich
2022-11-25 11:07 ` Tsukasa OI
2022-11-25 11:41 ` [PATCH v3 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 11:41 ` [PATCH v3 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 1/3] RISC-V: Better support for long instructions (disassembler) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 2/3] RISC-V: Better support for long instructions (assembler) Tsukasa OI
2022-11-25 11:42 ` [PATCH v4 3/3] RISC-V: Better support for long instructions (tests) Tsukasa OI
2022-11-25 13:08 ` [PATCH v4 0/3] RISC-V: Better support for long instructions (64 < x <= 176 [bits]) Jan Beulich
2022-11-28 1:53 ` Nelson Chu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).