public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
@ 2023-01-27 13:14 Jan Beulich
  2023-01-30  1:40 ` Nelson Chu
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2023-01-27 13:14 UTC (permalink / raw)
  To: Binutils; +Cc: Palmer Dabbelt, Andrew Waterman, Jim Wilson, Nelson Chu

Insn width granularity being 16 bits, produce 16 bits of data per output
element for the non-special-cased insn widths. Besides being slightly
more natural (in particular having the first element carry everything
that's needed for determining insn length), this is also slightly more
compact.
---
I think output would be even more neat if .<N>byte were suppressed when
generic code already prints the raw insn (or maybe something like
<unknown> was printed instead).

As to the special case code for 2-, 4-, and 8-byte insns: Any thoughts
towards zero-padding the numbers there as well? That would make yet more
noticable what the insn width is; in particular - see e.g. ".8byte 0x3f"
in full disassembly of the testcases modified here - it would avoid the
value looking like it was for a more narrow one.

Taking it yet one step further - how about printing unknown insns as
".insn ...", matching what gas would consume?

--- a/gas/testsuite/gas/riscv/insn-na.d
+++ b/gas/testsuite/gas/riscv/insn-na.d
@@ -76,11 +76,11 @@ Disassembly of section .text:
 [^:]+:[ 	]+007f 0000 0000 0000 0000[ 	]+[._a-z].*
 [^:]+:[ 	]+0000107f 00000000 00000000[ 	]+[._a-z].*
 [^:]+:[ 	]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[ 	]+[._a-z].*
-[^:]+:[ 	]+007f 0000 0000 0000 8000[ 	]+\.byte[ 	]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
-[^:]+:[ 	]+007f 0000 0000 0000 8000[ 	]+\.byte[ 	]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
-[^:]+:[ 	]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ 	]+\.byte[ 	]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-[^:]+:[ 	]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ 	]+\.byte[ 	]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
-[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
-[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
-[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
-[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ 	]+007f 0000 0000 0000 8000[ 	]+\.2byte[ 	]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
+[^:]+:[ 	]+007f 0000 0000 0000 8000[ 	]+\.2byte[ 	]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
+[^:]+:[ 	]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ 	]+\.2byte[ 	]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
+[^:]+:[ 	]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[ 	]+\.2byte[ 	]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
+[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
+[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
+[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
+[^:]+:[ 	]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
--- a/gas/testsuite/gas/riscv/insn.d
+++ b/gas/testsuite/gas/riscv/insn.d
@@ -95,25 +95,25 @@ Disassembly of section .text:
 [^:]+:[ 	]+607f 0000 0000 0000[ 	]+[._a-z].*
 [^:]+:[ 	]+0000 0000 0000 0000 ?
 [^:]+:[ 	]+0000 0000 0000 ?
-[^:]+:[ 	]+007f 0000 0000 0000[ 	]+\.byte[ 	]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ 	]+007f 0000 0000 0000[ 	]+\.2byte[ 	]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
 [^:]+:[ 	]+8000 ?
-[^:]+:[ 	]+007f 0000 0000 0000[ 	]+\.byte[ 	]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
+[^:]+:[ 	]+007f 0000 0000 0000[ 	]+\.2byte[ 	]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
 [^:]+:[ 	]+8000 ?
-[^:]+:[ 	]+607f 89ab 4567 0123[ 	]+\.byte[ 	]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ 	]+607f 89ab 4567 0123[ 	]+\.2byte[ 	]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
 [^:]+:[ 	]+3210 7654 ba98 fedc ?
 [^:]+:[ 	]+0000 0000 0000 ?
-[^:]+:[ 	]+607f 89ab 4567 0123[ 	]+\.byte[ 	]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+[^:]+:[ 	]+607f 89ab 4567 0123[ 	]+\.2byte[ 	]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
 [^:]+:[ 	]+3210 7654 ba98 fedc ?
 [^:]+:[ 	]+0000 0000 0000 ?
-[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
 [^:]+:[ 	]+89ab 4567 0123 3210 ?
 [^:]+:[ 	]+7654 ba98 00dc ?
-[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
+[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
 [^:]+:[ 	]+89ab 4567 0123 3210 ?
 [^:]+:[ 	]+7654 ba98 00dc ?
-[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
 [^:]+:[ 	]+89ab 4567 0123 3210 ?
 [^:]+:[ 	]+7654 ba98 fedc ?
-[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.byte[ 	]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
+[^:]+:[ 	]+607f 33cc 55aa cdef[ 	]+\.2byte[ 	]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
 [^:]+:[ 	]+89ab 4567 0123 3210 ?
 [^:]+:[ 	]+7654 ba98 fedc ?
--- a/opcodes/riscv-dis.c
+++ b/opcodes/riscv-dis.c
@@ -800,16 +800,16 @@ riscv_disassemble_insn (bfd_vma memaddr,
       {
         int i;
 	(*info->fprintf_styled_func)
-	  (info->stream, dis_style_assembler_directive, ".byte");
+	  (info->stream, dis_style_assembler_directive, ".2byte");
 	(*info->fprintf_styled_func) (info->stream, dis_style_text, "\t");
-        for (i = 0; i < insnlen; ++i)
+        for (i = 0; i < insnlen; i += 2)
           {
             if (i > 0)
 	      (*info->fprintf_styled_func) (info->stream, dis_style_text,
 					    ", ");
+	    word = bfd_get_bits (packet + i, 16, false);
 	    (*info->fprintf_styled_func) (info->stream, dis_style_immediate,
-					  "0x%02x",
-					  (unsigned int) (*packet++));
+					  "0x%04x", (unsigned int) word);
           }
       }
       break;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-27 13:14 [PATCH] RISC-V: don't disassemble unrecognized insns as .byte Jan Beulich
@ 2023-01-30  1:40 ` Nelson Chu
  2023-01-30  1:51   ` Maciej W. Rozycki
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Nelson Chu @ 2023-01-30  1:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Binutils, Palmer Dabbelt, Andrew Waterman, Jim Wilson, Kito Cheng

On Fri, Jan 27, 2023 at 9:14 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> Insn width granularity being 16 bits, produce 16 bits of data per output
> element for the non-special-cased insn widths. Besides being slightly
> more natural (in particular having the first element carry everything
> that's needed for determining insn length), this is also slightly more
> compact.

Sounds reasonable and good, so please commit, thanks.

> ---
> I think output would be even more neat if .<N>byte were suppressed when
> generic code already prints the raw insn (or maybe something like
> <unknown> was printed instead).
>
> As to the special case code for 2-, 4-, and 8-byte insns: Any thoughts
> towards zero-padding the numbers there as well? That would make yet more
> noticable what the insn width is; in particular - see e.g. ".8byte 0x3f"
> in full disassembly of the testcases modified here - it would avoid the
> value looking like it was for a more narrow one.
>
> Taking it yet one step further - how about printing unknown insns as
> ".insn ...", matching what gas would consume?

I don't remember how long ago, Kito did discuss this idea with me, but
since we haven't supported mapping symbols at that time, it's been
forgotten until this patch.  Anyway, I vote to dump the unrecognized
instructions into .insn directives, so if there is no objection,
please go ahead.

Also cc Kito, he should have almost forgotten about this :-)

Thanks
Nelson

>
> --- a/gas/testsuite/gas/riscv/insn-na.d
> +++ b/gas/testsuite/gas/riscv/insn-na.d
> @@ -76,11 +76,11 @@ Disassembly of section .text:
>  [^:]+:[        ]+007f 0000 0000 0000 0000[     ]+[._a-z].*
>  [^:]+:[        ]+0000107f 00000000 00000000[   ]+[._a-z].*
>  [^:]+:[        ]+607f 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000[       ]+[._a-z].*
> -[^:]+:[        ]+007f 0000 0000 0000 8000[     ]+\.byte[       ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> -[^:]+:[        ]+007f 0000 0000 0000 8000[     ]+\.byte[       ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> -[^:]+:[        ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[       ]+\.byte[       ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> -[^:]+:[        ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[       ]+\.byte[       ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> -[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[       ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
> -[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[       ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
> -[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[       ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> -[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[       ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> +[^:]+:[        ]+007f 0000 0000 0000 8000[     ]+\.2byte[      ]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
> +[^:]+:[        ]+007f 0000 0000 0000 8000[     ]+\.2byte[      ]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
> +[^:]+:[        ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[       ]+\.2byte[      ]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
> +[^:]+:[        ]+607f 89ab 4567 0123 3210 7654 ba98 fedc 0000 0000 0000[       ]+\.2byte[      ]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
> +[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[       ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
> +[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 00dc[       ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
> +[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[       ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
> +[^:]+:[        ]+607f 33cc 55aa cdef 89ab 4567 0123 3210 7654 ba98 fedc[       ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
> --- a/gas/testsuite/gas/riscv/insn.d
> +++ b/gas/testsuite/gas/riscv/insn.d
> @@ -95,25 +95,25 @@ Disassembly of section .text:
>  [^:]+:[        ]+607f 0000 0000 0000[  ]+[._a-z].*
>  [^:]+:[        ]+0000 0000 0000 0000 ?
>  [^:]+:[        ]+0000 0000 0000 ?
> -[^:]+:[        ]+007f 0000 0000 0000[  ]+\.byte[       ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> +[^:]+:[        ]+007f 0000 0000 0000[  ]+\.2byte[      ]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
>  [^:]+:[        ]+8000 ?
> -[^:]+:[        ]+007f 0000 0000 0000[  ]+\.byte[       ]+0x7f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x80
> +[^:]+:[        ]+007f 0000 0000 0000[  ]+\.2byte[      ]+0x007f, 0x0000, 0x0000, 0x0000, 0x8000
>  [^:]+:[        ]+8000 ?
> -[^:]+:[        ]+607f 89ab 4567 0123[  ]+\.byte[       ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[        ]+607f 89ab 4567 0123[  ]+\.2byte[      ]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
>  [^:]+:[        ]+3210 7654 ba98 fedc ?
>  [^:]+:[        ]+0000 0000 0000 ?
> -[^:]+:[        ]+607f 89ab 4567 0123[  ]+\.byte[       ]+0x7f, 0x60, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
> +[^:]+:[        ]+607f 89ab 4567 0123[  ]+\.2byte[      ]+0x607f, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc, 0x0000, 0x0000, 0x0000
>  [^:]+:[        ]+3210 7654 ba98 fedc ?
>  [^:]+:[        ]+0000 0000 0000 ?
> -[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
> +[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
>  [^:]+:[        ]+89ab 4567 0123 3210 ?
>  [^:]+:[        ]+7654 ba98 00dc ?
> -[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0x00
> +[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0x00dc
>  [^:]+:[        ]+89ab 4567 0123 3210 ?
>  [^:]+:[        ]+7654 ba98 00dc ?
> -[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> +[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
>  [^:]+:[        ]+89ab 4567 0123 3210 ?
>  [^:]+:[        ]+7654 ba98 fedc ?
> -[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.byte[       ]+0x7f, 0x60, 0xcc, 0x33, 0xaa, 0x55, 0xef, 0xcd, 0xab, 0x89, 0x67, 0x45, 0x23, 0x01, 0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
> +[^:]+:[        ]+607f 33cc 55aa cdef[  ]+\.2byte[      ]+0x607f, 0x33cc, 0x55aa, 0xcdef, 0x89ab, 0x4567, 0x0123, 0x3210, 0x7654, 0xba98, 0xfedc
>  [^:]+:[        ]+89ab 4567 0123 3210 ?
>  [^:]+:[        ]+7654 ba98 fedc ?
> --- a/opcodes/riscv-dis.c
> +++ b/opcodes/riscv-dis.c
> @@ -800,16 +800,16 @@ riscv_disassemble_insn (bfd_vma memaddr,
>        {
>          int i;
>         (*info->fprintf_styled_func)
> -         (info->stream, dis_style_assembler_directive, ".byte");
> +         (info->stream, dis_style_assembler_directive, ".2byte");
>         (*info->fprintf_styled_func) (info->stream, dis_style_text, "\t");
> -        for (i = 0; i < insnlen; ++i)
> +        for (i = 0; i < insnlen; i += 2)
>            {
>              if (i > 0)
>               (*info->fprintf_styled_func) (info->stream, dis_style_text,
>                                             ", ");
> +           word = bfd_get_bits (packet + i, 16, false);
>             (*info->fprintf_styled_func) (info->stream, dis_style_immediate,
> -                                         "0x%02x",
> -                                         (unsigned int) (*packet++));
> +                                         "0x%04x", (unsigned int) word);
>            }
>        }
>        break;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  1:40 ` Nelson Chu
@ 2023-01-30  1:51   ` Maciej W. Rozycki
  2023-01-30  3:03     ` Nelson Chu
  2023-01-30  2:07   ` Kito Cheng
  2023-01-30  7:51   ` Jan Beulich
  2 siblings, 1 reply; 8+ messages in thread
From: Maciej W. Rozycki @ 2023-01-30  1:51 UTC (permalink / raw)
  To: Nelson Chu
  Cc: Jan Beulich, Binutils, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Kito Cheng

On Mon, 30 Jan 2023, Nelson Chu wrote:

> > Insn width granularity being 16 bits, produce 16 bits of data per output
> > element for the non-special-cased insn widths. Besides being slightly
> > more natural (in particular having the first element carry everything
> > that's needed for determining insn length), this is also slightly more
> > compact.
> 
> Sounds reasonable and good, so please commit, thanks.

 Why not `.short' though, the more usual pseudo-op (and documented as 
portable in the GAS manual)?

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  1:40 ` Nelson Chu
  2023-01-30  1:51   ` Maciej W. Rozycki
@ 2023-01-30  2:07   ` Kito Cheng
  2023-01-30  7:51   ` Jan Beulich
  2 siblings, 0 replies; 8+ messages in thread
From: Kito Cheng @ 2023-01-30  2:07 UTC (permalink / raw)
  To: Nelson Chu
  Cc: Jan Beulich, Binutils, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Kito Cheng

> > As to the special case code for 2-, 4-, and 8-byte insns: Any thoughts
> > towards zero-padding the numbers there as well? That would make yet more
> > noticable what the insn width is; in particular - see e.g. ".8byte 0x3f"
> > in full disassembly of the testcases modified here - it would avoid the
> > value looking like it was for a more narrow one.
> >
> > Taking it yet one step further - how about printing unknown insns as
> > ".insn ...", matching what gas would consume?
>
> I don't remember how long ago, Kito did discuss this idea with me, but
> since we haven't supported mapping symbols at that time, it's been
> forgotten until this patch.  Anyway, I vote to dump the unrecognized
> instructions into .insn directives, so if there is no objection,
> please go ahead.
>
> Also cc Kito, he should have almost forgotten about this :-)

I like printing .insn too, we have mapping symbols now, so it should
be able to distinguish between code and data :)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  1:51   ` Maciej W. Rozycki
@ 2023-01-30  3:03     ` Nelson Chu
  2023-01-30  3:33       ` Maciej W. Rozycki
  0 siblings, 1 reply; 8+ messages in thread
From: Nelson Chu @ 2023-01-30  3:03 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Jan Beulich, Binutils, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Kito Cheng

Hi Maciej,

On Mon, Jan 30, 2023 at 9:51 AM Maciej W. Rozycki <macro@orcam.me.uk> wrote:
>
> On Mon, 30 Jan 2023, Nelson Chu wrote:
>
> > > Insn width granularity being 16 bits, produce 16 bits of data per output
> > > element for the non-special-cased insn widths. Besides being slightly
> > > more natural (in particular having the first element carry everything
> > > that's needed for determining insn length), this is also slightly more
> > > compact.
> >
> > Sounds reasonable and good, so please commit, thanks.
>
>  Why not `.short' though, the more usual pseudo-op (and documented as
> portable in the GAS manual)?

Not sure if there are any documents mentioning this, but I remember
that there has always been an issue of hoping to distinguish whether
the encodings is an instruction or data  The .<N>byte may not be used
to do that at the beginning (maybe just let dis-assembler dump the
legal formats, https://github.com/bminor/binutils-gdb/commit/6a7f57668afcd841e8fc6b507a27bb20e9209fa9),
but sometimes it has the same effect - For example, .2byte is an
instruction but unrecognized, .short is a data in text.  However, it
isn't good enough, since,

% cat tmp.s
.text
.short 0x1
.insn 0x1
.2byte 0x1
% riscv64-unknown-elf-as tmp.s -o tmp.o
% riscv64-unknown-elf-objdump -d tmp.o
...
Disassembly of section .text:

0000000000000000 <.text>:
   0: 0001                .short 0x0001
   2: 0001                .2byte 0x1
   4: 0001                .short 0x0001
...

The .<N>byte in assembly will also be treated as data to dump, so
ideally, dump these unrecognized instructions as .insn is probably
more reasonable and won't be confused.

Thanks
Nelson

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  3:03     ` Nelson Chu
@ 2023-01-30  3:33       ` Maciej W. Rozycki
  2023-01-31  1:51         ` Nelson Chu
  0 siblings, 1 reply; 8+ messages in thread
From: Maciej W. Rozycki @ 2023-01-30  3:33 UTC (permalink / raw)
  To: Nelson Chu
  Cc: Jan Beulich, Binutils, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Kito Cheng

Hi Nelson,

> 0000000000000000 <.text>:
>    0: 0001                .short 0x0001
>    2: 0001                .2byte 0x1
>    4: 0001                .short 0x0001
> ...
> 
> The .<N>byte in assembly will also be treated as data to dump, so
> ideally, dump these unrecognized instructions as .insn is probably
> more reasonable and won't be confused.

 I wasn't aware about the RISC-V `.insn' pseudo-op.  I think it will make 
a lot of sense to use it here then, so I'd be happier with either:

   2: 0001                .insn 0x0001

or:

   2: 0001                .insn 0x2, 0x1

and analogously for unrecognised instructions comprised of a higher number 
of parcels according to the opcode allocation, e.g.:

Disassembly of section .text:
0000000000000000 <foo> .insn	0x00000000001f

or:

Disassembly of section .text:
0000000000000000 <foo> .insn	0x6, 0x1f

(here in the `--prefix-addresses' format).  The single-argument format may 
by clearer to the casual eye at the cost of extra leading zeros with some 
encodings.

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  1:40 ` Nelson Chu
  2023-01-30  1:51   ` Maciej W. Rozycki
  2023-01-30  2:07   ` Kito Cheng
@ 2023-01-30  7:51   ` Jan Beulich
  2 siblings, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2023-01-30  7:51 UTC (permalink / raw)
  To: Nelson Chu
  Cc: Binutils, Palmer Dabbelt, Andrew Waterman, Jim Wilson, Kito Cheng

On 30.01.2023 02:40, Nelson Chu wrote:
> On Fri, Jan 27, 2023 at 9:14 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Insn width granularity being 16 bits, produce 16 bits of data per output
>> element for the non-special-cased insn widths. Besides being slightly
>> more natural (in particular having the first element carry everything
>> that's needed for determining insn length), this is also slightly more
>> compact.
> 
> Sounds reasonable and good, so please commit, thanks.

Given ...

>> ---
>> I think output would be even more neat if .<N>byte were suppressed when
>> generic code already prints the raw insn (or maybe something like
>> <unknown> was printed instead).
>>
>> As to the special case code for 2-, 4-, and 8-byte insns: Any thoughts
>> towards zero-padding the numbers there as well? That would make yet more
>> noticable what the insn width is; in particular - see e.g. ".8byte 0x3f"
>> in full disassembly of the testcases modified here - it would avoid the
>> value looking like it was for a more narrow one.
>>
>> Taking it yet one step further - how about printing unknown insns as
>> ".insn ...", matching what gas would consume?
> 
> I don't remember how long ago, Kito did discuss this idea with me, but
> since we haven't supported mapping symbols at that time, it's been
> forgotten until this patch.  Anyway, I vote to dump the unrecognized
> instructions into .insn directives, so if there is no objection,
> please go ahead.
> 
> Also cc Kito, he should have almost forgotten about this :-)

both of you think this is a good direction to follow, I won't even bother
committing the patch here, but make one to switch to emitting .insn right
away. With some luck later this week ...

Jan


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] RISC-V: don't disassemble unrecognized insns as .byte
  2023-01-30  3:33       ` Maciej W. Rozycki
@ 2023-01-31  1:51         ` Nelson Chu
  0 siblings, 0 replies; 8+ messages in thread
From: Nelson Chu @ 2023-01-31  1:51 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Jan Beulich, Binutils, Palmer Dabbelt, Andrew Waterman,
	Jim Wilson, Kito Cheng

On Mon, Jan 30, 2023 at 11:33 AM Maciej W. Rozycki <macro@orcam.me.uk> wrote:
>
> Hi Nelson,
>
> > 0000000000000000 <.text>:
> >    0: 0001                .short 0x0001
> >    2: 0001                .2byte 0x1
> >    4: 0001                .short 0x0001
> > ...
> >
> > The .<N>byte in assembly will also be treated as data to dump, so
> > ideally, dump these unrecognized instructions as .insn is probably
> > more reasonable and won't be confused.
>
>  I wasn't aware about the RISC-V `.insn' pseudo-op.  I think it will make
> a lot of sense to use it here then, so I'd be happier with either:
>
>    2: 0001                .insn 0x0001
>
> or:
>
>    2: 0001                .insn 0x2, 0x1
>
> and analogously for unrecognised instructions comprised of a higher number
> of parcels according to the opcode allocation, e.g.:
>
> Disassembly of section .text:
> 0000000000000000 <foo> .insn    0x00000000001f
>
> or:
>
> Disassembly of section .text:
> 0000000000000000 <foo> .insn    0x6, 0x1f
>
> (here in the `--prefix-addresses' format).  The single-argument format may
> by clearer to the casual eye at the cost of extra leading zeros with some
> encodings.

Yeah, dump these unrecognized instructions into the .insn directives
with the <bytes> field looks better.  Thanks for pointing this out :-)

Nelson

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-01-31  1:51 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-27 13:14 [PATCH] RISC-V: don't disassemble unrecognized insns as .byte Jan Beulich
2023-01-30  1:40 ` Nelson Chu
2023-01-30  1:51   ` Maciej W. Rozycki
2023-01-30  3:03     ` Nelson Chu
2023-01-30  3:33       ` Maciej W. Rozycki
2023-01-31  1:51         ` Nelson Chu
2023-01-30  2:07   ` Kito Cheng
2023-01-30  7:51   ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).