public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/9] Disassembler opcode display and text alignment
@ 2022-06-23 16:05 Andrew Burgess
  2022-06-23 16:05 ` [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output Andrew Burgess
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This series makes two related changes to GDB's disassembler.  Both
changes flow naturally from having GDB make use of libopcodes
bytes_per_line and bytes_per_chunk presentation hints that are set on
a per-architecure basis with each call into the disassembler.

The end result of this change is that GDB will now display instruction
opcodes in the same way that objdump does.

For x86-64 there's no change in the way that opcode bytes are
presented.  Now due to some special case, it's just that how GDB lays
out instruction opcodes just happens to match how libopcodes requests
that the opcodes be laid out.

For other architectures, risc-v, powerpc, arm, aarch64, etc, this is
not the case, and I (personally) think objdump does a better job of
presenting the information that GDB does; the instruction opcodes will
be grouped together based on the instruction size, and potentially
byte-swapped so they appear in the instruction's natural order.

Making use of the display hints also allows for better alignment of
the disassembly text when opcodes are being printed.

I have proposed that this new behaviour become the default for
'disassemble /r', and I've added a new flag 'disassemble /b' which
allows the user to access the old behaviour.

But, if people feel strongly that the old 'disassemble /r' should not
be changed, I can place the new behaviour under a new flag.

Let me know your thoughts,

Thanks,
Andrew

---

Andrew Burgess (9):
  gdb/doc: improve description of --data-disassemble opcodes output
  gdb/testsuite: new test for -data-disassemble opcodes format
  gdb/disasm: read opcodes bytes with a single read_code call
  gdb: disassembler opcode display formatting
  gdb: make gdb_disassembly_flag unsigned
  gdb/doc: fix column widths in MI compatibility table
  gdb/doc: update syntax of -data-disassemble command arguments
  gdb/mi: some int to bool conversion
  gdb/mi: new options for -data-disassemble command

 gdb/NEWS                                |  12 ++
 gdb/cli/cli-cmds.c                      |   6 +
 gdb/disasm-flags.h                      |   3 +-
 gdb/disasm.c                            |  55 +++++++--
 gdb/disasm.h                            |   3 +
 gdb/doc/gdb.texinfo                     | 137 ++++++++++++++++-----
 gdb/mi/mi-cmd-disas.c                   | 105 ++++++++++++----
 gdb/record.c                            |   3 +
 gdb/testsuite/gdb.mi/mi-disassemble.exp | 153 ++++++++++++++++++++----
 gdb/testsuite/lib/mi-support.exp        |  27 +++++
 10 files changed, 417 insertions(+), 87 deletions(-)

-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:09   ` Eli Zaretskii
  2022-06-23 16:05 ` [PATCH 2/9] gdb/testsuite: new test for -data-disassemble opcodes format Andrew Burgess
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Extend the description of the MI command --data-disassemble.
Specifically, expand the description of the 'opcodes' field to explain
how the bytes are formatted.
---
 gdb/doc/gdb.texinfo | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 2178b476f53..d046ce5891e 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -34422,8 +34422,10 @@
 The text disassembly for this @samp{address}.
 
 @item opcodes
-This field is only present for modes 2, 3 and 5.  This contains the raw opcode
-bytes for the @samp{inst} field.
+This field is only present for modes 2, 3 and 5.  This contains the
+raw opcode bytes for the @samp{inst} field.  The bytes are formatted
+as single bytes, in hex, in ascending address order, with a single
+space between each byte.
 
 @end table
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 2/9] gdb/testsuite: new test for -data-disassemble opcodes format
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
  2022-06-23 16:05 ` [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:05 ` [PATCH 3/9] gdb/disasm: read opcodes bytes with a single read_code call Andrew Burgess
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Add another test for the output of MI command -data-disassemble.  The
new check validates the format of the 'opcodes' field, specifically,
this test checks that the field contains a series of bytes, separated
by a single space.

We also check that the bytes are in the correct order, that is, the
first byte is from the lowest address, and subsequent bytes are from
increasing addresses.

The motivation for this test (besides more tests being generally good)
is that I plan to make changes to how opcode bytes are displayed in
the disassembler output, and I want to ensure that I don't break any
existing MI behaviour.

There should be no user visible changes to GDB after this commit.
---
 gdb/testsuite/gdb.mi/mi-disassemble.exp | 83 +++++++++++++++++++++++++
 gdb/testsuite/lib/mi-support.exp        | 27 ++++++++
 2 files changed, 110 insertions(+)

diff --git a/gdb/testsuite/gdb.mi/mi-disassemble.exp b/gdb/testsuite/gdb.mi/mi-disassemble.exp
index 2d3e8e216b1..b7c52472c84 100644
--- a/gdb/testsuite/gdb.mi/mi-disassemble.exp
+++ b/gdb/testsuite/gdb.mi/mi-disassemble.exp
@@ -239,6 +239,88 @@ proc test_disassembly_bogus_args {} {
 
 }
 
+# Check the format of the opcode bytes.
+proc test_disassembly_opcode_format {} {
+    # First, we need to find a multi-byte instruction that we can
+    # then disassemble using the MI command.
+    set longest_insn_bytes ""
+    set longest_insn_addr ""
+    gdb_test_multiple "disassemble /r main" "" {
+	-re "^disassemble /r main\r\n" {
+	    exp_continue
+	}
+
+	-re "^&\"disassemble /r main.n\"\r\n" {
+	    exp_continue
+	}
+
+	-re "^~\"Dump of assembler code for function \[^\r\n\]+\r\n" {
+	    exp_continue
+	}
+
+	-re "^~\".. ($::hex) <\[^>\]+>:\\\\t(\[^\\\\\]+)\\\\t\[^\r\n\]+\r\n" {
+	    set addr $expect_out(1,string)
+	    set bytes $expect_out(2,string)
+	    if { [string length $bytes] > [string length $longest_insn_bytes] } {
+		set longest_insn_addr $addr
+		set longest_insn_bytes $bytes
+	    }
+	    exp_continue
+	}
+
+	-re "^~\"End of assembler dump\[^\r\n\]+\r\n" {
+	    exp_continue
+	}
+
+	-re "^\\^done\r\n$::mi_gdb_prompt$" {
+	    gdb_assert { ![string equal $longest_insn_bytes ""] } \
+		"found the bytes string for a longest instruction"
+	    gdb_assert { ![string equal $longest_insn_addr ""] } \
+		"found the address for a longest instruction"
+	}
+    }
+
+    verbose -log "Longest instruction at ${longest_insn_addr} with bytes '${longest_insn_bytes}'"
+
+    # Check that the instruction bytes that we found above consists of
+    # a series of individual bytes separated by a whitespace.  Also,
+    # we check that the bytes reported match what can be found in the
+    # inferior memory.
+    set split_bytes [split $longest_insn_bytes " "]
+    set is_bad false
+    set addr $longest_insn_addr
+    set idx 0
+    foreach b $split_bytes {
+	if { [string length $b] != 2 } {
+	    set is_bad true
+	}
+
+	# Load the actual byte value from memory, and check it matches
+	# the opcode byte reported in the disassembler output.
+	set addr 0x[format %x [expr $longest_insn_addr + $idx]]
+	set actual [format %02x [mi_get_valueof "/x" "*((unsigned char *) $addr)" "XX"]]
+	gdb_assert [string equal $actual "$b"] \
+	    "byte at $addr matches"
+
+	incr idx
+    }
+    gdb_assert { !$is_bad } "check length of each byte"
+    set check_bytes [join $split_bytes " "]
+    gdb_assert { [string equal $check_bytes $longest_insn_bytes] } \
+	"bytes are separated by a single space"
+
+    # Figure out an end address at which to stop the disassembly.
+    set byte_count [llength $split_bytes]
+    set end_addr 0x[format %x [expr $longest_insn_addr + $byte_count]]
+    set start_addr $longest_insn_addr
+
+    verbose -log "Instruction is ${byte_count} bytes, end address ${end_addr}"
+
+    mi_gdb_test "321-data-disassemble -s $start_addr -e $end_addr -- 2" \
+	"321\\^done,asm_insns=\\\[\{address=\"$start_addr\",func-name=\"main\",offset=\"$::decimal\",opcodes=\"$longest_insn_bytes\",inst=\".*\"\}\]" \
+	"data-disassemble checking the opcodes bytes format"
+}
+
 mi_clean_restart $binfile
 mi_runto_main
 test_disassembly_only
@@ -248,6 +330,7 @@ test_disassembly_mixed_with_opcodes
 test_disassembly_bogus_args
 test_disassembly_lines_limit
 test_disassembly_mixed_lines_limit
+test_disassembly_opcode_format
 
 mi_gdb_exit
 return 0
diff --git a/gdb/testsuite/lib/mi-support.exp b/gdb/testsuite/lib/mi-support.exp
index e578a7e6f9b..2af12461ec5 100644
--- a/gdb/testsuite/lib/mi-support.exp
+++ b/gdb/testsuite/lib/mi-support.exp
@@ -2783,3 +2783,30 @@ proc mi_is_target_remote {} {
 
     return [gdb_is_target_remote_prompt "$mi_gdb_prompt"]
 }
+
+# Retrieve the value of EXP in the inferior, represented in format
+# specified in FMT (using "printFMT").  DEFAULT is used as fallback if
+# print fails.  TEST is the test message to use.  It can be omitted,
+# in which case a test message is built from EXP.
+#
+# This is an MI version of gdb_valueof.
+
+proc mi_get_valueof { fmt exp default {test ""} } {
+    global mi_gdb_prompt
+
+    if {$test == "" } {
+	set test "get valueof \"${exp}\""
+    }
+
+    set val ${default}
+    gdb_test_multiple "print${fmt} ${exp}" "$test" {
+	-re "~\"\\$\[0-9\]* = (\[^\r\n\]*)\\\\n\"\r\n\\^done\r\n$mi_gdb_prompt$" {
+	    set val $expect_out(1,string)
+	    pass "$test"
+	}
+	timeout {
+	    fail "$test (timeout)"
+	}
+    }
+    return ${val}
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 3/9] gdb/disasm: read opcodes bytes with a single read_code call
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
  2022-06-23 16:05 ` [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output Andrew Burgess
  2022-06-23 16:05 ` [PATCH 2/9] gdb/testsuite: new test for -data-disassemble opcodes format Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:05 ` [PATCH 4/9] gdb: disassembler opcode display formatting Andrew Burgess
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit reduces the number of times we call read_code when
printing the instruction opcode bytes during disassembly.

I've added a new gdb::byte_vector within the
gdb_pretty_print_disassembler class, in line with all the other
buffers that gdb_pretty_print_disassembler needs.  This byte_vector is
then resized as needed, and filled with a single read_code call for
each instruction.

There should be no user visible changes after this commit.
---
 gdb/disasm.c | 16 +++++++---------
 gdb/disasm.h |  3 +++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/gdb/disasm.c b/gdb/disasm.c
index c6edc92930d..42351c735d3 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -323,21 +323,19 @@ gdb_pretty_print_disassembler::pretty_print_insn (const struct disasm_insn *insn
 
     if (flags & DISASSEMBLY_RAW_INSN)
       {
-	CORE_ADDR end_pc;
-	bfd_byte data;
-	const char *spacer = "";
-
 	/* Build the opcodes using a temporary stream so we can
 	   write them out in a single go for the MI.  */
 	m_opcode_stb.clear ();
 
-	end_pc = pc + size;
+	/* Read the instruction opcode data.  */
+	m_opcode_data.resize (size);
+	read_code (pc, m_opcode_data.data (), size);
 
-	for (;pc < end_pc; ++pc)
+	for (int i = 0; i < size; ++i)
 	  {
-	    read_code (pc, &data, 1);
-	    m_opcode_stb.printf ("%s%02x", spacer, (unsigned) data);
-	    spacer = " ";
+	    if (i > 0)
+	      m_opcode_stb.puts (" ");
+	    m_opcode_stb.printf ("%02x", (unsigned) m_opcode_data[i]);
 	  }
 
 	m_uiout->field_stream ("opcodes", m_opcode_stb);
diff --git a/gdb/disasm.h b/gdb/disasm.h
index da03e130526..13403042709 100644
--- a/gdb/disasm.h
+++ b/gdb/disasm.h
@@ -311,6 +311,9 @@ class gdb_pretty_print_disassembler
 
   /* The buffer used to build the raw opcodes string.  */
   string_file m_opcode_stb;
+
+  /* The buffer used to hold the opcode bytes (if required).  */
+  gdb::byte_vector m_opcode_data;
 };
 
 /* Return the length in bytes of the instruction at address MEMADDR in
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 4/9] gdb: disassembler opcode display formatting
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (2 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 3/9] gdb/disasm: read opcodes bytes with a single read_code call Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:14   ` Eli Zaretskii
  2022-06-23 16:05 ` [PATCH 5/9] gdb: make gdb_disassembly_flag unsigned Andrew Burgess
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

This commit changes the format of 'disassemble /r' to match GNU
objdump.  Specifically, GDB will now display the instruction bytes in
as 'objdump --wide --disassemble' does.

Here is an example for RISC-V before this patch:

  (gdb) disassemble /r 0x0001018e,0x0001019e
  Dump of assembler code from 0x1018e to 0x1019e:
     0x0001018e <call_me+66>:     03 26 84 fe     lw      a2,-24(s0)
     0x00010192 <call_me+70>:     83 25 c4 fe     lw      a1,-20(s0)
     0x00010196 <call_me+74>:     61 65   lui     a0,0x18
     0x00010198 <call_me+76>:     13 05 85 6a     addi    a0,a0,1704
     0x0001019c <call_me+80>:     f1 22   jal     0x10368 <printf>
  End of assembler dump.

And here's an example after this patch:

  (gdb) disassemble /r 0x0001018e,0x0001019e
  Dump of assembler code from 0x1018e to 0x1019e:
     0x0001018e <call_me+66>:     fe842603                lw      a2,-24(s0)
     0x00010192 <call_me+70>:     fec42583                lw      a1,-20(s0)
     0x00010196 <call_me+74>:     6561                    lui     a0,0x18
     0x00010198 <call_me+76>:     6a850513                addi    a0,a0,1704
     0x0001019c <call_me+80>:     22f1                    jal     0x10368 <printf>
  End of assembler dump.

There are two differences here.  First, the instruction bytes after
the patch are grouped based on the size of the instruction, and are
byte-swapped to little-endian order.

Second, after the patch, GDB now uses the bytes-per-line hint from
libopcodes to add whitespace padding after the opcode bytes, this
means that in most cases the instructions are nicely aligned.

It is still possible for a very long instruction to intrude into the
disassembled text space.  The next example is x86-64, before the
patch:

  (gdb) disassemble /r main
  Dump of assembler code for function main:
     0x0000000000401106 <+0>:     55      push   %rbp
     0x0000000000401107 <+1>:     48 89 e5        mov    %rsp,%rbp
     0x000000000040110a <+4>:     c7 87 d8 00 00 00 01 00 00 00   movl   $0x1,0xd8(%rdi)
     0x0000000000401114 <+14>:    b8 00 00 00 00  mov    $0x0,%eax
     0x0000000000401119 <+19>:    5d      pop    %rbp
     0x000000000040111a <+20>:    c3      ret
  End of assembler dump.

And after the patch:

  (gdb) disassemble /r main
  Dump of assembler code for function main:
     0x0000000000401106 <+0>:     55                      push   %rbp
     0x0000000000401107 <+1>:     48 89 e5                mov    %rsp,%rbp
     0x000000000040110a <+4>:     c7 87 d8 00 00 00 01 00 00 00   movl   $0x1,0xd8(%rdi)
     0x0000000000401114 <+14>:    b8 00 00 00 00          mov    $0x0,%eax
     0x0000000000401119 <+19>:    5d                      pop    %rbp
     0x000000000040111a <+20>:    c3                      ret
  End of assembler dump.

Most instructions are aligned, except for the very long instruction.
Notice too that for x86-64 libopcodes doesn't request that GDB group
the instruction bytes.  This matches the behaviour of objdump.

In case the user really wants the old behaviour, I have added a new
modifier 'disassemble /b', this displays the instruction byte at a
time.  For x86-64, which never groups instruction bytes, /b and /r are
equivalent, but for RISC-V, using /b gets the old layout back (except
that the whitespace for alignment is still present).  Consider our
original RISC-V example, this time using /b:

  (gdb) disassemble /b 0x0001018e,0x0001019e
  Dump of assembler code from 0x1018e to 0x1019e:
     0x0001018e <call_me+66>:     03 26 84 fe             lw      a2,-24(s0)
     0x00010192 <call_me+70>:     83 25 c4 fe             lw      a1,-20(s0)
     0x00010196 <call_me+74>:     61 65                   lui     a0,0x18
     0x00010198 <call_me+76>:     13 05 85 6a             addi    a0,a0,1704
     0x0001019c <call_me+80>:     f1 22                   jal     0x10368 <printf>
  End of assembler dump.

Obviously, this patch is a potentially significant change to the
behaviour or /r.  I could have added /b with the new behaviour and
left /r alone.  However, personally, I feel the new behaviour is
significantly better than the old, hence, I made /r be what I consider
the "better" behaviour.

The reason I prefer the new behaviour is that, when I use /r, I almost
always want to manually decode the instruction for some reason, and
having the bytes displayed in "instruction order" rather than memory
order, just makes this easier.

The 'record instruction-history' command also takes a /r modifier, and
has been modified in the same way as disassemble; /r gets the new
behaviour, and /b has been added to retain the old behaviour.

Finally, the MI command -data-disassemble, is unchanged in behaviour,
this command now requests the raw bytes of the instruction, which is
equivalent to the /b modifier.  This means that the MI output will
remain backward compatible.
---
 gdb/NEWS                                | 12 +++++++
 gdb/cli/cli-cmds.c                      |  6 ++++
 gdb/disasm-flags.h                      |  1 +
 gdb/disasm.c                            | 43 ++++++++++++++++++++--
 gdb/doc/gdb.texinfo                     | 48 +++++++++++++++++++++----
 gdb/mi/mi-cmd-disas.c                   |  6 ++--
 gdb/record.c                            |  3 ++
 gdb/testsuite/gdb.mi/mi-disassemble.exp |  6 ++--
 8 files changed, 109 insertions(+), 16 deletions(-)

diff --git a/gdb/NEWS b/gdb/NEWS
index ac9a1aacd34..3e8853505eb 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -27,6 +27,18 @@
   emit to indicate where a breakpoint should be placed to break in a function
   past its prologue.
 
+* The format of 'disassemble /r' and 'record instruction-history /r'
+  has changed.  The instruction bytes could now be grouped together,
+  and displayed in the endianness of the instruction.  This is the
+  same layout as used by GNU objdump when disassembling.
+
+  There is now 'disassemble /b' and 'record instruction-history /b'
+  which will always display the instructions bytes one at a time in
+  memory order, that is, the byte at the lowest address first.
+
+  For both /r and /b GDB is now better at using whitespace in order to
+  align the disassembled instruction text.
+
 * New commands
 
 maintenance set ignore-prologue-end-flag on|off
diff --git a/gdb/cli/cli-cmds.c b/gdb/cli/cli-cmds.c
index 31911ebe61f..4950ee376a1 100644
--- a/gdb/cli/cli-cmds.c
+++ b/gdb/cli/cli-cmds.c
@@ -1508,6 +1508,9 @@ disassemble_current_function (gdb_disassembly_flags flags)
 
    A /r modifier will include raw instructions in hex with the assembly.
 
+   A /b modifier is similar to /r except the instruction bytes are printed
+   as separate bytes with no grouping, or endian switching.
+
    A /s modifier will include source code with the assembly, like /m, with
    two important differences:
    1) The output is still in pc address order.
@@ -1546,6 +1549,9 @@ disassemble_command (const char *arg, int from_tty)
 	    case 'r':
 	      flags |= DISASSEMBLY_RAW_INSN;
 	      break;
+	    case 'b':
+	      flags |= DISASSEMBLY_RAW_BYTES;
+	      break;
 	    case 's':
 	      flags |= DISASSEMBLY_SOURCE;
 	      break;
diff --git a/gdb/disasm-flags.h b/gdb/disasm-flags.h
index 025b6893941..5a7371b0a39 100644
--- a/gdb/disasm-flags.h
+++ b/gdb/disasm-flags.h
@@ -33,6 +33,7 @@ enum gdb_disassembly_flag
     DISASSEMBLY_OMIT_PC = (0x1 << 4),
     DISASSEMBLY_SOURCE = (0x1 << 5),
     DISASSEMBLY_SPECULATIVE = (0x1 << 6),
+    DISASSEMBLY_RAW_BYTES = (0x1 << 7),
   };
 DEF_ENUM_FLAGS_TYPE (enum gdb_disassembly_flag, gdb_disassembly_flags);
 
diff --git a/gdb/disasm.c b/gdb/disasm.c
index 42351c735d3..946c235e7ef 100644
--- a/gdb/disasm.c
+++ b/gdb/disasm.c
@@ -321,7 +321,7 @@ gdb_pretty_print_disassembler::pretty_print_insn (const struct disasm_insn *insn
 	throw ex;
       }
 
-    if (flags & DISASSEMBLY_RAW_INSN)
+    if ((flags & (DISASSEMBLY_RAW_INSN | DISASSEMBLY_RAW_BYTES)) != 0)
       {
 	/* Build the opcodes using a temporary stream so we can
 	   write them out in a single go for the MI.  */
@@ -331,14 +331,51 @@ gdb_pretty_print_disassembler::pretty_print_insn (const struct disasm_insn *insn
 	m_opcode_data.resize (size);
 	read_code (pc, m_opcode_data.data (), size);
 
-	for (int i = 0; i < size; ++i)
+	/* The disassembler provides information about the best way to
+	   display the instruction bytes to the user.  We provide some sane
+	   defaults in case the disassembler gets it wrong.  */
+	const struct disassemble_info *di = m_di.disasm_info ();
+	int bytes_per_line = std::max (di->bytes_per_line, size);
+	int bytes_per_chunk = std::max (di->bytes_per_chunk, 1);
+
+	/* If the user has requested the instruction bytes be displayed
+	   byte at a time, then handle that here.  Also, if the instruction
+	   is not a multiple of the chunk size (which probably indicates a
+	   disassembler problem) then avoid that causing display problems
+	   by switching to byte at a time mode.  */
+	if ((flags & DISASSEMBLY_RAW_BYTES) != 0
+	    || (size % bytes_per_chunk) != 0)
+	  bytes_per_chunk = 1;
+
+	/* Print the instruction opcodes bytes, grouped into chunks.  */
+	for (int i = 0; i < size; i += bytes_per_chunk)
 	  {
 	    if (i > 0)
 	      m_opcode_stb.puts (" ");
-	    m_opcode_stb.printf ("%02x", (unsigned) m_opcode_data[i]);
+
+	    if (di->display_endian == BFD_ENDIAN_LITTLE)
+	      {
+		for (int k = bytes_per_chunk; k-- != 0; )
+		  m_opcode_stb.printf ("%02x", (unsigned) m_opcode_data[i + k]);
+	      }
+	    else
+	      {
+		for (int k = 0; k < bytes_per_chunk; k++)
+		  m_opcode_stb.printf ("%02x", (unsigned) m_opcode_data[i + k]);
+	      }
+	  }
+
+	/* Calculate required padding.  */
+	int nspaces = 0;
+	for (int i = size; i < bytes_per_line; i += bytes_per_chunk)
+	  {
+	    if (i > size)
+	      nspaces++;
+	    nspaces += bytes_per_chunk * 2;
 	  }
 
 	m_uiout->field_stream ("opcodes", m_opcode_stb);
+	m_uiout->spaces (nspaces);
 	m_uiout->text ("\t");
       }
 
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index d046ce5891e..af559370db0 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -7901,7 +7901,10 @@
 
 It can also print mixed source+disassembly if you specify the the
 @code{/m} or @code{/s} modifier, and print the raw instructions in hex
-as well as in symbolic form by specifying the @code{/r} modifier.
+as well as in symbolic form by specifying the @code{/r} or @code{/b}
+modifier.  The behaviour of the @code{/m}, @code{/s}, @code{/r}, and
+@code{/b} modifiers are the same as for the @kbd{disassemble} command
+(@pxref{disassemble,,@kbd{disassemble}}).
 
 The current position marker is printed for the instruction at the
 current program counter value.  This instruction can appear multiple
@@ -9815,6 +9818,7 @@
 specifying a location will display information about the next source
 line.
 
+@anchor{disassemble}
 @table @code
 @kindex disassemble
 @cindex assembly instructions
@@ -9825,16 +9829,17 @@
 @itemx disassemble /m
 @itemx disassemble /s
 @itemx disassemble /r
+@itemx disassemble /b
 This specialized command dumps a range of memory as machine
 instructions.  It can also print mixed source+disassembly by specifying
-the @code{/m} or @code{/s} modifier and print the raw instructions in hex
-as well as in symbolic form by specifying the @code{/r} modifier.
-The default memory range is the function surrounding the
+the @code{/m} or @code{/s} modifier and print the raw instructions in
+hex as well as in symbolic form by specifying the @code{/r} or @code{/b}
+modifier.  The default memory range is the function surrounding the
 program counter of the selected frame.  A single argument to this
 command is a program counter value; @value{GDBN} dumps the function
-surrounding this value.  When two arguments are given, they should
-be separated by a comma, possibly surrounded by whitespace.  The
-arguments specify a range of addresses to dump, in one of two forms:
+surrounding this value.  When two arguments are given, they should be
+separated by a comma, possibly surrounded by whitespace.  The arguments
+specify a range of addresses to dump, in one of two forms:
 
 @table @code
 @item @var{start},@var{end}
@@ -9872,6 +9877,35 @@
 End of assembler dump.
 @end smallexample
 
+The following two examples are for RISC-V, and demonstrates the
+difference between the @code{/r} and @code{/b} modifiers.  First with
+@code{/b}, the bytes of the instruction are printed, in hex, in memory
+order:
+
+@smallexample
+(@value{GDBP}) disassemble /b 0x00010150,0x0001015c
+Dump of assembler code from 0x10150 to 0x1015c:
+   0x00010150 <call_me+4>:      22 dc                 	sw	s0,56(sp)
+   0x00010152 <call_me+6>:      80 00                 	addi	s0,sp,64
+   0x00010154 <call_me+8>:      23 26 a4 fe           	sw	a0,-20(s0)
+   0x00010158 <call_me+12>:     23 24 b4 fe           	sw	a1,-24(s0)
+End of assembler dump.
+@end smallexample
+
+In contrast, with @code{/r} the bytes of the instruction are displayed
+in the instruction order, for RISC-V this means that the bytes have been
+swapped to little-endian order:
+
+@smallexample
+(@value{GDBP}) disassemble /r 0x00010150,0x0001015c
+Dump of assembler code from 0x10150 to 0x1015c:
+   0x00010150 <call_me+4>:      dc22              	sw	s0,56(sp)
+   0x00010152 <call_me+6>:      0080              	addi	s0,sp,64
+   0x00010154 <call_me+8>:      fea42623        	sw	a0,-20(s0)
+   0x00010158 <call_me+12>:     feb42423        	sw	a1,-24(s0)
+End of assembler dump.
+@end smallexample
+
 Here is an example showing mixed source+assembly for Intel x86
 with @code{/m} or @code{/s}, when the program is stopped just after
 function prologue in a non-optimized function with no inline code.
diff --git a/gdb/mi/mi-cmd-disas.c b/gdb/mi/mi-cmd-disas.c
index 387c4900150..c8e06cd940a 100644
--- a/gdb/mi/mi-cmd-disas.c
+++ b/gdb/mi/mi-cmd-disas.c
@@ -165,16 +165,16 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
       disasm_flags |= DISASSEMBLY_SOURCE_DEPRECATED;
       break;
     case 2:
-      disasm_flags |= DISASSEMBLY_RAW_INSN;
+      disasm_flags |= DISASSEMBLY_RAW_BYTES;
       break;
     case 3:
-      disasm_flags |= DISASSEMBLY_SOURCE_DEPRECATED | DISASSEMBLY_RAW_INSN;
+      disasm_flags |= DISASSEMBLY_SOURCE_DEPRECATED | DISASSEMBLY_RAW_BYTES;
       break;
     case 4:
       disasm_flags |= DISASSEMBLY_SOURCE;
       break;
     case 5:
-      disasm_flags |= DISASSEMBLY_SOURCE | DISASSEMBLY_RAW_INSN;
+      disasm_flags |= DISASSEMBLY_SOURCE | DISASSEMBLY_RAW_BYTES;
       break;
     default:
       gdb_assert_not_reached ("bad disassembly mode");
diff --git a/gdb/record.c b/gdb/record.c
index 17a5df262bd..2390a58f9c0 100644
--- a/gdb/record.c
+++ b/gdb/record.c
@@ -494,6 +494,9 @@ get_insn_history_modifiers (const char **arg)
 	    case 'r':
 	      modifiers |= DISASSEMBLY_RAW_INSN;
 	      break;
+	    case 'b':
+	      modifiers |= DISASSEMBLY_RAW_BYTES;
+	      break;
 	    case 'f':
 	      modifiers |= DISASSEMBLY_OMIT_FNAME;
 	      break;
diff --git a/gdb/testsuite/gdb.mi/mi-disassemble.exp b/gdb/testsuite/gdb.mi/mi-disassemble.exp
index b7c52472c84..ef3337d4062 100644
--- a/gdb/testsuite/gdb.mi/mi-disassemble.exp
+++ b/gdb/testsuite/gdb.mi/mi-disassemble.exp
@@ -245,12 +245,12 @@ proc test_disassembly_opcode_format {} {
     # then disassemble using the MI command.
     set longest_insn_bytes ""
     set longest_insn_addr ""
-    gdb_test_multiple "disassemble /r main" "" {
-	-re "^disassemble /r main\r\n" {
+    gdb_test_multiple "disassemble /b main" "" {
+	-re "^disassemble /b main\r\n" {
 	    exp_continue
 	}
 
-	-re "^&\"disassemble /r main.n\"\r\n" {
+	-re "^&\"disassemble /b main.n\"\r\n" {
 	    exp_continue
 	}
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 5/9] gdb: make gdb_disassembly_flag unsigned
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (3 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 4/9] gdb: disassembler opcode display formatting Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:05 ` [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table Andrew Burgess
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In a later commit I want to use operator~ on a gdb_disassembly_flag
flag value.  This is currently not possible as gdb_disassembly_flag
is, by default, signed.

This commit just makes this enum unsigned.

There should be no user visible changes after this commit.
---
 gdb/disasm-flags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/disasm-flags.h b/gdb/disasm-flags.h
index 5a7371b0a39..9420e7c478c 100644
--- a/gdb/disasm-flags.h
+++ b/gdb/disasm-flags.h
@@ -24,7 +24,7 @@
 
 /* Flags used to control how GDB's disassembler behaves.  */
 
-enum gdb_disassembly_flag
+enum gdb_disassembly_flag : unsigned
   {
     DISASSEMBLY_SOURCE_DEPRECATED = (0x1 << 0),
     DISASSEMBLY_RAW_INSN = (0x1 << 1),
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (4 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 5/9] gdb: make gdb_disassembly_flag unsigned Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:22   ` Eli Zaretskii
  2022-06-23 16:05 ` [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments Andrew Burgess
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In passing I noticed that the column headings for the table of MI
compatibility and breaking changes, were overlapping, at least when
the PDF is generated on my machine.

I propose giving slightly more space to the two version number
columns, this prevents the headers overlapping for me.
---
 gdb/doc/gdb.texinfo | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index af559370db0..b0624afce2f 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -30308,7 +30308,7 @@
 interface: the version number, the version of GDB in which it first appeared
 and the breaking changes compared to the previous version.
 
-@multitable @columnfractions .05 .05 .9
+@multitable @columnfractions .1 .1 .8
 @headitem MI version @tab GDB version @tab Breaking changes
 
 @item
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (5 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:21   ` Eli Zaretskii
  2022-06-23 16:05 ` [PATCH 8/9] gdb/mi: some int to bool conversion Andrew Burgess
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

The argument documentation looks like this:

   -data-disassemble
      [ -s @var{start-addr} -e @var{end-addr} ]
    | [ -a @var{addr} ]
    | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
    -- @var{mode}

However, I believe, according to the 'Notation and Terminology'
section, this means that the there are 3 optional location
specification argument groups for the command, followed by a
non-optional '-- mode'.

However, this is not true, one of the location specification must be
given, i.e. we can't choose to give NO location specification, which
is what the above implies.

I propose that we change this to instead be:

   -data-disassemble
    ( -s @var{start-addr} -e @var{end-addr}
    | -a @var{addr}
    | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
    -- @var{mode}

By placing all the location specifications within '( ... )' we
indication that these are a group, from which one of the options,
separated by '|', must be selected.

This change is important because, in a later commit, I want to add
additional optional arguments to the -data-disassemble command, and
things start to get confusing with the original syntax.
---
 gdb/doc/gdb.texinfo | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index b0624afce2f..ca9d29bd364 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -34384,9 +34384,9 @@
 
 @smallexample
  -data-disassemble
-    [ -s @var{start-addr} -e @var{end-addr} ]
-  | [ -a @var{addr} ]
-  | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
+  ( -s @var{start-addr} -e @var{end-addr}
+  | -a @var{addr}
+  | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
   -- @var{mode}
 @end smallexample
 
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 8/9] gdb/mi: some int to bool conversion
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (6 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:05 ` [PATCH 9/9] gdb/mi: new options for -data-disassemble command Andrew Burgess
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Just some simple int to bool conversion in mi_cmd_disassemble.  There
should be no user visible changes after this commit.
---
 gdb/mi/mi-cmd-disas.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gdb/mi/mi-cmd-disas.c b/gdb/mi/mi-cmd-disas.c
index c8e06cd940a..4f6e5613b9d 100644
--- a/gdb/mi/mi-cmd-disas.c
+++ b/gdb/mi/mi-cmd-disas.c
@@ -62,12 +62,12 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
   struct symtab *s;
 
   /* Which options have we processed ... */
-  int file_seen = 0;
-  int line_seen = 0;
-  int num_seen = 0;
-  int start_seen = 0;
-  int end_seen = 0;
-  int addr_seen = 0;
+  bool file_seen = false;
+  bool line_seen = false;
+  bool num_seen = false;
+  bool start_seen = false;
+  bool end_seen = false;
+  bool addr_seen = false;
 
   /* ... and their corresponding value. */
   char *file_string = NULL;
@@ -107,27 +107,27 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
 	{
 	case FILE_OPT:
 	  file_string = oarg;
-	  file_seen = 1;
+	  file_seen = true;
 	  break;
 	case LINE_OPT:
 	  line_num = atoi (oarg);
-	  line_seen = 1;
+	  line_seen = true;
 	  break;
 	case NUM_OPT:
 	  how_many = atoi (oarg);
-	  num_seen = 1;
+	  num_seen = true;
 	  break;
 	case START_OPT:
 	  low = parse_and_eval_address (oarg);
-	  start_seen = 1;
+	  start_seen = true;
 	  break;
 	case END_OPT:
 	  high = parse_and_eval_address (oarg);
-	  end_seen = 1;
+	  end_seen = true;
 	  break;
 	case ADDR_OPT:
 	  addr = parse_and_eval_address (oarg);
-	  addr_seen = 1;
+	  addr_seen = true;
 	  break;
 	}
     }
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 9/9] gdb/mi: new options for -data-disassemble command
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (7 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 8/9] gdb/mi: some int to bool conversion Andrew Burgess
@ 2022-06-23 16:05 ` Andrew Burgess
  2022-06-23 16:34   ` Eli Zaretskii
  2022-07-25 18:28 ` [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
  2022-09-21 16:41 ` Tom Tromey
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-23 16:05 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

Now that the disassembler has two different strategies for laying out
the opcode bytes of an instruction (see /r vs /b for the disassemble
command), I wanted to add support for this to the MI disassemble
command.

Currently the -data-disassemble command takes a single 'mode' value,
which currently has 6 different values (0 -> 5), 3 of these modes
relate to opcode display.

So, clearly I should just add an additional 3 modes to handle the new
opcode format, right?

No, I didn't think that was a great idea either.

So, I wonder, could I adjust the -data-disassemble command in a
backward compatible way, that would allow GDB to move away from using
the mode value altogether?

I think we can.

In this commit, I propose adding two new options to -data-disassemble,
these are:

  --opcodes none|bytes|display
  --source

Additionally, I will make the mode optional, and default to mode 0 if
no mode value is given.  Mode 0 is the simplest, no source code, no
opcodes disassembly mode.

The two new options are only valid for mode 0, if they are used with
any other mode then an error is thrown.

The --opcodes option can add opcodes to the result, with 'bytes' being
equivalent to 'disassemble /b' and 'display' being 'disassemble /r'.

The --source option will enable the /s style source code display, this
is equivalent to modes 4 and 5.  There is no way, using the new
command options to get the now deprecated /m style source code
display, that is mode 1 and 3.

My hope is that new users of the MI will not use the mode at all, and
instead will just use the new options to achieve the output they need.
Existing MI users can continue to use the mode, and will not need to
be updated to use the new options.
---
 gdb/doc/gdb.texinfo                     | 83 ++++++++++++++++++-------
 gdb/mi/mi-cmd-disas.c                   | 75 +++++++++++++++++++---
 gdb/testsuite/gdb.mi/mi-disassemble.exp | 66 +++++++++++++-------
 3 files changed, 174 insertions(+), 50 deletions(-)

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index ca9d29bd364..456d362a109 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -34387,7 +34387,9 @@
   ( -s @var{start-addr} -e @var{end-addr}
   | -a @var{addr}
   | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
-  -- @var{mode}
+  [ --opcodes @var{opcodes-mode} ]
+  [ --source ]
+  [ -- @var{mode} ]
 @end smallexample
 
 @noindent
@@ -34416,31 +34418,56 @@
 displayed; if @var{lines} is higher than the number of lines between
 @var{start-addr} and @var{end-addr}, only the lines up to @var{end-addr}
 are displayed.
+@item @var{opcodes-mode}
+is one of @samp{none}, @samp{bytes}, or @samp{display}.  This setting
+can only be used with @var{mode} 0.  Passing @samp{none} here will
+prevent opcodes being included in the result; passing @samp{bytes} here
+will include opcodes in the result, the opcodes will be formatted as for
+@kbd{disassemble /b}; and passing @samp{display} here will include the
+opcodes in the result, the opcodes will be formatted as for
+@kbd{disassemble /r}.
 @item @var{mode}
-is one of:
+the use of @var{mode} is deprecated in favour of using the
+@code{--opcodes} and @code{--source} options.  When no @var{mode} is
+given, @var{mode} 0 will be assumed.  However, the @var{mode} is still
+available for backward compatibility.  The @var{mode} should be one of:
 @itemize @bullet
-@item 0 disassembly only
-@item 1 mixed source and disassembly (deprecated)
-@item 2 disassembly with raw opcodes
-@item 3 mixed source and disassembly with raw opcodes (deprecated)
-@item 4 mixed source and disassembly
-@item 5 mixed source and disassembly with raw opcodes
+@item 0 disassembly only:
+This is the default mode if no mode is specified.
+@item 1 mixed source and disassembly (deprecated):
+It is not possible to recreate this mode using @code{--opcodes} and
+@code{--source} options.
+@item 2 disassembly with raw opcodes:
+This mode is equivalent to using @var{mode} 0 and passing
+@code{--opcodes bytes} to the command.
+@item 3 mixed source and disassembly with raw opcodes (deprecated):
+It is not possible to recreate this mode using @code{--opcodes} and
+@code{--source} options.
+@item 4 mixed source and disassembly:
+This mode is equivalent to using @var{mode} 0 and passing
+@code{--source} to the command.
+@item 5 mixed source and disassembly with raw opcodes:
+This mode is equivalent to using @var{mode} 0 and passing
+@code{--opcodes bytes} and @code{--source} to the command.
 @end itemize
-
 Modes 1 and 3 are deprecated.  The output is ``source centric''
 which hasn't proved useful in practice.
 @xref{Machine Code}, for a discussion of the difference between
 @code{/m} and @code{/s} output of the @code{disassemble} command.
 @end table
 
+The @code{--source} can only be used with @var{mode} 0.  Passing this
+option will include the source code in the disassembly result as if
+@var{mode} 4 or 5 had been used.
+
 @subsubheading Result
 
 The result of the @code{-data-disassemble} command will be a list named
-@samp{asm_insns}, the contents of this list depend on the @var{mode}
-used with the @code{-data-disassemble} command.
+@samp{asm_insns}, the contents of this list depend on the options used
+with the @code{-data-disassemble} command.
 
-For modes 0 and 2 the @samp{asm_insns} list contains tuples with the
-following fields:
+For modes 0 and 2, and when the @code{--source} option is not used the
+@samp{asm_insns} list contains tuples with the following fields:
 
 @table @code
 @item address
@@ -34456,15 +34483,27 @@
 The text disassembly for this @samp{address}.
 
 @item opcodes
-This field is only present for modes 2, 3 and 5.  This contains the
-raw opcode bytes for the @samp{inst} field.  The bytes are formatted
-as single bytes, in hex, in ascending address order, with a single
-space between each byte.
-
-@end table
-
-For modes 1, 3, 4 and 5 the @samp{asm_insns} list contains tuples named
-@samp{src_and_asm_line}, each of which has the following fields:
+This field is only present for modes 2, 3 and 5, or when the
+@code{--opcodes} option @samp{bytes} or @samp{display} is used.  This
+contains the raw opcode bytes for the @samp{inst} field.
+
+When the @samp{--opcodes} option is not passed to
+@code{-data-disassemble}, or the @samp{bytes} value is passed to
+@samp{--opcodes}, then the bytes are formatted as a series of single
+bytes, in hex, in ascending address order, with a single space between
+each byte.  This format is equivalent to the @samp{/b} option being used
+with the @kbd{disassemble} command.
+
+When @samp{--opcodes} is passed the value @samp{display} then the bytes
+are formatted in the natural instruction display order.  This means
+multiple bytes can be grouped together, and the bytes might be
+byte-swapped.  This format is equivalent to the @samp{/r} option being
+used with the @kbd{disassemble} command.
+@end table
+
+For modes 1, 3, 4 and 5, or when the @code{--source} option is used, the
+@samp{asm_insns} list contains tuples named @samp{src_and_asm_line},
+each of which has the following fields:
 
 @table @code
 @item line
diff --git a/gdb/mi/mi-cmd-disas.c b/gdb/mi/mi-cmd-disas.c
index 4f6e5613b9d..db5a78827fc 100644
--- a/gdb/mi/mi-cmd-disas.c
+++ b/gdb/mi/mi-cmd-disas.c
@@ -68,6 +68,8 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
   bool start_seen = false;
   bool end_seen = false;
   bool addr_seen = false;
+  bool opcodes_seen = false;
+  bool source_seen = false;
 
   /* ... and their corresponding value. */
   char *file_string = NULL;
@@ -77,12 +79,23 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
   CORE_ADDR high = 0;
   CORE_ADDR addr = 0;
 
+  /* Flags to handle the --opcodes option.  */
+  enum opcodes_mode
+  {
+    OPCODES_DEFAULT, OPCODES_NONE, OPCODES_DISPLAY, OPCODES_BYTES
+  };
+  enum opcodes_mode opcodes_mode = OPCODES_DEFAULT;
+
+  /* Handle the -source option.  */
+  bool show_source = false;
+
   /* Options processing stuff.  */
   int oind = 0;
   char *oarg;
   enum opt
   {
-    FILE_OPT, LINE_OPT, NUM_OPT, START_OPT, END_OPT, ADDR_OPT
+    FILE_OPT, LINE_OPT, NUM_OPT, START_OPT, END_OPT, ADDR_OPT, OPCODES_OPT,
+    SHOW_SRC_OPT
   };
   static const struct mi_opt opts[] =
     {
@@ -92,6 +105,8 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
       {"s", START_OPT, 1},
       {"e", END_OPT, 1},
       {"a", ADDR_OPT, 1},
+      {"-opcodes", OPCODES_OPT, 1},
+      {"-source", SHOW_SRC_OPT, 0},
       { 0, 0, 0 }
     };
 
@@ -129,6 +144,21 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
 	  addr = parse_and_eval_address (oarg);
 	  addr_seen = true;
 	  break;
+	case OPCODES_OPT:
+	  opcodes_seen = true;
+	  if (strcmp (oarg, "none") == 0)
+	    opcodes_mode = OPCODES_NONE;
+	  else if (strcmp (oarg, "display") == 0)
+	    opcodes_mode = OPCODES_DISPLAY;
+	  else if (strcmp (oarg, "bytes") == 0)
+	    opcodes_mode = OPCODES_BYTES;
+	  else
+	    error (_("-data-disassemble: unknown value for -opcodes argument"));
+	  break;
+	case SHOW_SRC_OPT:
+	  source_seen = true;
+	  show_source = true;
+	  break;
 	}
     }
   argv += oind;
@@ -146,13 +176,23 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
 
        || (!line_seen && !file_seen && !num_seen && !start_seen && !end_seen
 								&&  addr_seen))
-      || argc != 1)
-    error (_("-data-disassemble: Usage: ( [-f filename -l linenum "
-	     "[-n howmany]] | [-s startaddr -e endaddr] | [-a addr] ) [--] mode."));
+      || argc > 1)
+    error (_("-data-disassemble: Usage: "
+	     "( -f filename -l linenum [-n howmany] |"
+	     " -s startaddr -e endaddr | -a addr ) "
+	     "[ --opcodes mode ] [ --source ] [ [--] mode ]."));
+
+  if (argc == 1)
+    {
+      mode = atoi (argv[0]);
+      if (mode < 0 || mode > 5)
+	error (_("-data-disassemble: Mode argument must be in the range 0-5."));
+    }
+  else
+    mode = 0;
 
-  mode = atoi (argv[0]);
-  if (mode < 0 || mode > 5)
-    error (_("-data-disassemble: Mode argument must be in the range 0-5."));
+  if (mode != 0 && (source_seen || opcodes_seen))
+    error (_("-data-disassemble: --opcodes and --source can only be used with mode 0"));
 
   /* Convert the mode into a set of disassembly flags.  */
 
@@ -180,6 +220,27 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
       gdb_assert_not_reached ("bad disassembly mode");
     }
 
+  /* Now handle the (optional) --opcodes argument.  This partially
+     overrides the mode value.  */
+  if (opcodes_mode != OPCODES_DEFAULT)
+    {
+      /* Remove any existing flags related to opcodes display.  */
+      disasm_flags &= ~(DISASSEMBLY_RAW_BYTES | DISASSEMBLY_RAW_INSN);
+
+      /* Add back any required flags.  */
+      if (opcodes_mode == OPCODES_DISPLAY)
+	disasm_flags |= DISASSEMBLY_RAW_INSN;
+      else if (opcodes_mode == OPCODES_BYTES)
+	disasm_flags |= DISASSEMBLY_RAW_BYTES;
+    }
+
+  /* Handle the optional --source argument.  */
+  if (show_source)
+    {
+      disasm_flags &= ~DISASSEMBLY_SOURCE_DEPRECATED;
+      disasm_flags |= DISASSEMBLY_SOURCE;
+    }
+
   /* We must get the function beginning and end where line_num is
      contained.  */
 
diff --git a/gdb/testsuite/gdb.mi/mi-disassemble.exp b/gdb/testsuite/gdb.mi/mi-disassemble.exp
index ef3337d4062..4d7c2f2e6e7 100644
--- a/gdb/testsuite/gdb.mi/mi-disassemble.exp
+++ b/gdb/testsuite/gdb.mi/mi-disassemble.exp
@@ -47,21 +47,27 @@ proc test_disassembly_only {} {
 
 
     mi_gdb_test "print/x \$pc" "" ""
-    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" -- 0" \
-	    "111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble from pc to pc+12 assembly only"
 
-    mi_gdb_test "112-data-disassemble -a \$pc -- 0" \
-	    "112\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble function around pc assembly only"
+    foreach { test_name option_string } [list "mode 0" "-- 0" \
+					     "default mode" "" ] {
+	with_test_prefix $test_name {
+	    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" ${option_string}" \
+		"111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble from pc to pc+12 assembly only"
 
-    mi_gdb_test "113-data-disassemble -a callee4 -- 0" \
-	    "113\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble function callee4 assembly only"
+	    mi_gdb_test "112-data-disassemble -a \$pc ${option_string}" \
+		"112\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble function around pc assembly only"
 
-    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body -- 0" \
-	    "222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}\\\]" \
-              "data-disassemble file & line, assembly only"
+	    mi_gdb_test "113-data-disassemble -a callee4 ${option_string}" \
+		"113\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble function callee4 assembly only"
+
+	    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body ${option_string}" \
+		"222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}\\\]" \
+		"data-disassemble file & line, assembly only"
+	}
+    }
 }
 
 proc test_disassembly_with_opcodes {} {
@@ -78,13 +84,20 @@ proc test_disassembly_with_opcodes {} {
     # -data-disassembly -f basics.c -l $line_main_body -- 2
 
     mi_gdb_test "print/x \$pc" "" ""
-    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" -- 2" \
-           "111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}.*\]" \
-             "data-disassemble from pc to pc+12 assembly with opcodes"
 
-    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body -- 2" \
-           "222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",opcodes=\".*\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}\\\]" \
-              "data-disassemble file & line, assembly with opcodes"
+    foreach { test_name option_string} [list "mode 2" "-- 2" \
+					    "mode 0 and --opcodes bytes" "--opcodes bytes -- 0" \
+					    "default mode and --opcodes bytes" "--opcodes bytes"] {
+	with_test_prefix $test_name {
+	    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" ${option_string}" \
+		"111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}.*\]" \
+		"data-disassemble from pc to pc+12 assembly"
+
+	    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body ${option_string}" \
+		"222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",opcodes=\".*\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}\\\]" \
+		"data-disassemble file & line, assembly"
+	}
+    }
 }
 
 proc test_disassembly_lines_limit {} {
@@ -230,13 +243,24 @@ proc test_disassembly_bogus_args {} {
              "data-disassemble bogus address, -a"
 
     mi_gdb_test "456-data-disassemble -s \$pc -f basics.c -- 0" \
-             "456\\^error,msg=\"-data-disassemble: Usage: \\( .-f filename -l linenum .-n howmany.. \\| .-s startaddr -e endaddr. \\| .-a addr. \\) .--. mode.\"" \
-             "data-disassemble mix different args"
+	"456\\^error,msg=\"-data-disassemble: Usage: \\( -f filename -l linenum .-n howmany. \\| -s startaddr -e endaddr \\| -a addr \\) . --opcodes mode . . --source . . .--. mode .\\.\"" \
+	"data-disassemble mix different args"
 
     mi_gdb_test "789-data-disassemble -f basics.c -l $line_main_body -- 9" \
              "789\\^error,msg=\"-data-disassemble: Mode argument must be in the range 0-5.\"" \
-             "data-disassemble wrong mode arg"
+	"data-disassemble wrong mode arg"
+
+    foreach mode { 1 2 3 4 5 } {
+	foreach opcode_arg { none bytes display } {
+	    mi_gdb_test "801-data-disassemble -s \$pc -e \"\$pc + 12\" --opcodes ${opcode_arg} -- ${mode}" \
+		"801\\^error,msg=\"-data-disassemble: --opcodes and --source can only be used with mode 0\"" \
+		"data-disassemble use --opcode ${opcode_arg} with mode ${mode}"
+	}
 
+	mi_gdb_test "802-data-disassemble -s \$pc -e \"\$pc + 12\" --source -- ${mode}" \
+	    "802\\^error,msg=\"-data-disassemble: --opcodes and --source can only be used with mode 0\"" \
+	    "data-disassemble use --source with mode ${mode}"
+    }
 }
 
 # Check the format of the opcode bytes.
-- 
2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output
  2022-06-23 16:05 ` [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output Andrew Burgess
@ 2022-06-23 16:09   ` Eli Zaretskii
  2022-06-29 12:54     ` Andrew Burgess
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-23 16:09 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Thu, 23 Jun 2022 17:05:08 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <aburgess@redhat.com>
> 
> Extend the description of the MI command --data-disassemble.
> Specifically, expand the description of the 'opcodes' field to explain
> how the bytes are formatted.
> ---
>  gdb/doc/gdb.texinfo | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Thanks, this is OK.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/9] gdb: disassembler opcode display formatting
  2022-06-23 16:05 ` [PATCH 4/9] gdb: disassembler opcode display formatting Andrew Burgess
@ 2022-06-23 16:14   ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-23 16:14 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Thu, 23 Jun 2022 17:05:11 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <aburgess@redhat.com>
> 
>  gdb/NEWS                                | 12 +++++++
>  gdb/cli/cli-cmds.c                      |  6 ++++
>  gdb/disasm-flags.h                      |  1 +
>  gdb/disasm.c                            | 43 ++++++++++++++++++++--
>  gdb/doc/gdb.texinfo                     | 48 +++++++++++++++++++++----
>  gdb/mi/mi-cmd-disas.c                   |  6 ++--
>  gdb/record.c                            |  3 ++
>  gdb/testsuite/gdb.mi/mi-disassemble.exp |  6 ++--
>  8 files changed, 109 insertions(+), 16 deletions(-)

Thanks, the documentation parts are OK.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments
  2022-06-23 16:05 ` [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments Andrew Burgess
@ 2022-06-23 16:21   ` Eli Zaretskii
  2022-06-30 10:18     ` Andrew Burgess
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-23 16:21 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Thu, 23 Jun 2022 17:05:14 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <aburgess@redhat.com>
> 
> The argument documentation looks like this:
> 
>    -data-disassemble
>       [ -s @var{start-addr} -e @var{end-addr} ]
>     | [ -a @var{addr} ]
>     | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
>     -- @var{mode}
> 
> However, I believe, according to the 'Notation and Terminology'
> section, this means that the there are 3 optional location
> specification argument groups for the command, followed by a
> non-optional '-- mode'.
> 
> However, this is not true, one of the location specification must be
> given, i.e. we can't choose to give NO location specification, which
> is what the above implies.

I don't believe we ever used this convention this rigorously.  But I
agree that it is better to be as accurate as possible.

> I propose that we change this to instead be:
> 
>    -data-disassemble
>     ( -s @var{start-addr} -e @var{end-addr}
>     | -a @var{addr}
>     | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
>     -- @var{mode}
> 
> By placing all the location specifications within '( ... )' we
> indication that these are a group, from which one of the options,
> separated by '|', must be selected.

According to "Notation and Terminology", the (...) construct should be
followed by either * or +, so I think you should use + here.

Otherwise, this is fine, thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table
  2022-06-23 16:05 ` [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table Andrew Burgess
@ 2022-06-23 16:22   ` Eli Zaretskii
  2022-06-30  9:39     ` Andrew Burgess
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-23 16:22 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Thu, 23 Jun 2022 17:05:13 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <aburgess@redhat.com>
> 
> In passing I noticed that the column headings for the table of MI
> compatibility and breaking changes, were overlapping, at least when
> the PDF is generated on my machine.
> 
> I propose giving slightly more space to the two version number
> columns, this prevents the headers overlapping for me.

This should fall under the "obvious change" rule.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] gdb/mi: new options for -data-disassemble command
  2022-06-23 16:05 ` [PATCH 9/9] gdb/mi: new options for -data-disassemble command Andrew Burgess
@ 2022-06-23 16:34   ` Eli Zaretskii
  2022-06-30 11:22     ` Andrew Burgess
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-23 16:34 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Thu, 23 Jun 2022 17:05:16 +0100
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <aburgess@redhat.com>
> 
> @@ -34416,31 +34418,56 @@
>  displayed; if @var{lines} is higher than the number of lines between
>  @var{start-addr} and @var{end-addr}, only the lines up to @var{end-addr}
>  are displayed.
> +@item @var{opcodes-mode}
> +is one of @samp{none}, @samp{bytes}, or @samp{display}.  This setting
> +can only be used with @var{mode} 0.  Passing @samp{none} here will
> +prevent opcodes being included in the result; passing @samp{bytes} here
> +will include opcodes in the result, the opcodes will be formatted as for
> +@kbd{disassemble /b}; and passing @samp{display} here will include the
> +opcodes in the result, the opcodes will be formatted as for
> +@kbd{disassemble /r}.

This begs for an inner @table or @itemize, describing each of the 3
options separately.  It would be also more correct, English-wise,
since starting an @item with a lowercase letter and then writing
several sentences there is not really correct.

>  @item @var{mode}
> -is one of:
> +the use of @var{mode} is deprecated in favour of using the
> +@code{--opcodes} and @code{--source} options.  When no @var{mode} is
> +given, @var{mode} 0 will be assumed.  However, the @var{mode} is still
> +available for backward compatibility.  The @var{mode} should be one of:
>  @itemize @bullet
> -@item 0 disassembly only
> -@item 1 mixed source and disassembly (deprecated)
> -@item 2 disassembly with raw opcodes
> -@item 3 mixed source and disassembly with raw opcodes (deprecated)
> -@item 4 mixed source and disassembly
> -@item 5 mixed source and disassembly with raw opcodes
> +@item 0 disassembly only:
> +This is the default mode if no mode is specified.

This is in @itemize, so @items should appear alone on their lines, and
the text on the line below it.  But the style in which you wrote this
is better suited fore @table, so maybe just replace @itemize by @table.

> +@item 1 mixed source and disassembly (deprecated):
> +It is not possible to recreate this mode using @code{--opcodes} and
> +@code{--source} options.

Then why are we deprecating it (and the MODE argument itself)?

> -For modes 0 and 2 the @samp{asm_insns} list contains tuples with the
> -following fields:
> +For modes 0 and 2, and when the @code{--source} option is not used the
                                                                     ^^
Comma missing there.

> +When the @samp{--opcodes} option is not passed to
> +@code{-data-disassemble}, or the @samp{bytes} value is passed to
> +@samp{--opcodes}, then the bytes are formatted as a series of single
> +bytes, in hex, in ascending address order, with a single space between
> +each byte.  This format is equivalent to the @samp{/b} option being used
> +with the @kbd{disassemble} command.

A cross-reference to the description of 'disassemble' would be good
here.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output
  2022-06-23 16:09   ` Eli Zaretskii
@ 2022-06-29 12:54     ` Andrew Burgess
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-29 12:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> Date: Thu, 23 Jun 2022 17:05:08 +0100
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Andrew Burgess <aburgess@redhat.com>
>> 
>> Extend the description of the MI command --data-disassemble.
>> Specifically, expand the description of the 'opcodes' field to explain
>> how the bytes are formatted.
>> ---
>>  gdb/doc/gdb.texinfo | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> Thanks, this is OK.

Thanks, I pushed this patch.

Andrew


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table
  2022-06-23 16:22   ` Eli Zaretskii
@ 2022-06-30  9:39     ` Andrew Burgess
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-06-30  9:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> Date: Thu, 23 Jun 2022 17:05:13 +0100
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Andrew Burgess <aburgess@redhat.com>
>> 
>> In passing I noticed that the column headings for the table of MI
>> compatibility and breaking changes, were overlapping, at least when
>> the PDF is generated on my machine.
>> 
>> I propose giving slightly more space to the two version number
>> columns, this prevents the headers overlapping for me.
>
> This should fall under the "obvious change" rule.

Thanks, pushed.

Andrew


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments
  2022-06-23 16:21   ` Eli Zaretskii
@ 2022-06-30 10:18     ` Andrew Burgess
  2022-06-30 10:46       ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-30 10:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> Date: Thu, 23 Jun 2022 17:05:14 +0100
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Andrew Burgess <aburgess@redhat.com>
>> 
>> The argument documentation looks like this:
>> 
>>    -data-disassemble
>>       [ -s @var{start-addr} -e @var{end-addr} ]
>>     | [ -a @var{addr} ]
>>     | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
>>     -- @var{mode}
>> 
>> However, I believe, according to the 'Notation and Terminology'
>> section, this means that the there are 3 optional location
>> specification argument groups for the command, followed by a
>> non-optional '-- mode'.
>> 
>> However, this is not true, one of the location specification must be
>> given, i.e. we can't choose to give NO location specification, which
>> is what the above implies.
>
> I don't believe we ever used this convention this rigorously.  But I
> agree that it is better to be as accurate as possible.

I agree, and I don't plan to go out of my way to "fix" the syntax of
every command, it was just that with my next patch the syntax became
rather muddled unless we were a little more rigorous.

>
>> I propose that we change this to instead be:
>> 
>>    -data-disassemble
>>     ( -s @var{start-addr} -e @var{end-addr}
>>     | -a @var{addr}
>>     | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
>>     -- @var{mode}
>> 
>> By placing all the location specifications within '( ... )' we
>> indication that these are a group, from which one of the options,
>> separated by '|', must be selected.
>
> According to "Notation and Terminology", the (...) construct should be
> followed by either * or +, so I think you should use + here.

You are correct that (...) is not mentioned.  Unfortunately (...)+
clearly means one or more times, and this is not correct in this case.
I propose adding (...) to the "Notation and Terminology" section to mean
exactly once.

How's the patch below?

Thanks,
Andrew

---

commit e151b49f5adac4f42bc5df977c5ab646ef7fe83c
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Thu Jun 23 13:57:57 2022 +0100

    gdb/doc: update syntax of -data-disassemble command arguments
    
    The argument documentation for -data-disassemble looks like this:
    
       -data-disassemble
          [ -s @var{start-addr} -e @var{end-addr} ]
        | [ -a @var{addr} ]
        | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
        -- @var{mode}
    
    However, I believe, according to the 'Notation and Terminology'
    section, this means that the there are 3 optional location
    specification argument groups for the command, followed by a
    non-optional '-- mode'.
    
    However, this is not true, one of the location specifications must be
    given, i.e. we can't choose to give NO location specification, which
    is what the above implies.
    
    I propose that we change this to instead be:
    
       -data-disassemble
        ( -s @var{start-addr} -e @var{end-addr}
        | -a @var{addr}
        | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
        -- @var{mode}
    
    By placing all the location specifications within '( ... )' we
    indication that these are a group, from which one of the options,
    separated by '|', must be selected.
    
    However, the 'Notation and Terminology' section only describes two
    uses for parenthesis: '( GROUP )*' and '( GROUP )+', in the first case
    GROUP is repeated zero or more times, and in the second GROUP is
    repeated 1 or more times.
    
    Neither of those exactly describe what I want, which is GROUP must
    appear exactly once.  I propose to extend 'Notation and Terminology'
    to include '( GROUP )' which means that GROUP should appear exactly
    once.
    
    This change is important because, in a later commit, I want to add
    additional optional arguments to the -data-disassemble command, and
    things start to get confusing with the original syntax.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 9ae88ee03df..984ad9c1ed1 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -29766,6 +29766,10 @@
 @code{( @var{group} )+} means that @var{group} inside the parentheses
 may repeat one or more times.
 
+@item
+@code{( @var{group} )} means that @var{group} inside the parentheses
+occurs exactly once.
+
 @item
 @code{"@var{string}"} means a literal @var{string}.
 @end itemize
@@ -34416,9 +34420,9 @@
 
 @smallexample
  -data-disassemble
-    [ -s @var{start-addr} -e @var{end-addr} ]
-  | [ -a @var{addr} ]
-  | [ -f @var{filename} -l @var{linenum} [ -n @var{lines} ] ]
+  ( -s @var{start-addr} -e @var{end-addr}
+  | -a @var{addr}
+  | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
   -- @var{mode}
 @end smallexample
 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments
  2022-06-30 10:18     ` Andrew Burgess
@ 2022-06-30 10:46       ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-30 10:46 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Thu, 30 Jun 2022 11:18:35 +0100
> 
> > According to "Notation and Terminology", the (...) construct should be
> > followed by either * or +, so I think you should use + here.
> 
> You are correct that (...) is not mentioned.  Unfortunately (...)+
> clearly means one or more times, and this is not correct in this case.
> I propose adding (...) to the "Notation and Terminology" section to mean
> exactly once.
> 
> How's the patch below?

LGTM, thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] gdb/mi: new options for -data-disassemble command
  2022-06-23 16:34   ` Eli Zaretskii
@ 2022-06-30 11:22     ` Andrew Burgess
  2022-06-30 13:36       ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-06-30 11:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> Date: Thu, 23 Jun 2022 17:05:16 +0100
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Andrew Burgess <aburgess@redhat.com>
>> 
>> @@ -34416,31 +34418,56 @@
>>  displayed; if @var{lines} is higher than the number of lines between
>>  @var{start-addr} and @var{end-addr}, only the lines up to @var{end-addr}
>>  are displayed.
>> +@item @var{opcodes-mode}
>> +is one of @samp{none}, @samp{bytes}, or @samp{display}.  This setting
>> +can only be used with @var{mode} 0.  Passing @samp{none} here will
>> +prevent opcodes being included in the result; passing @samp{bytes} here
>> +will include opcodes in the result, the opcodes will be formatted as for
>> +@kbd{disassemble /b}; and passing @samp{display} here will include the
>> +opcodes in the result, the opcodes will be formatted as for
>> +@kbd{disassemble /r}.
>
> This begs for an inner @table or @itemize, describing each of the 3
> options separately.  It would be also more correct, English-wise,
> since starting an @item with a lowercase letter and then writing
> several sentences there is not really correct.
>
>>  @item @var{mode}
>> -is one of:
>> +the use of @var{mode} is deprecated in favour of using the
>> +@code{--opcodes} and @code{--source} options.  When no @var{mode} is
>> +given, @var{mode} 0 will be assumed.  However, the @var{mode} is still
>> +available for backward compatibility.  The @var{mode} should be one of:
>>  @itemize @bullet
>> -@item 0 disassembly only
>> -@item 1 mixed source and disassembly (deprecated)
>> -@item 2 disassembly with raw opcodes
>> -@item 3 mixed source and disassembly with raw opcodes (deprecated)
>> -@item 4 mixed source and disassembly
>> -@item 5 mixed source and disassembly with raw opcodes
>> +@item 0 disassembly only:
>> +This is the default mode if no mode is specified.
>
> This is in @itemize, so @items should appear alone on their lines, and
> the text on the line below it.  But the style in which you wrote this
> is better suited fore @table, so maybe just replace @itemize by @table.
>
>> +@item 1 mixed source and disassembly (deprecated):
>> +It is not possible to recreate this mode using @code{--opcodes} and
>> +@code{--source} options.
>
> Then why are we deprecating it (and the MODE argument itself)?

Mode 1 and 3 have been deprecated for a while now, and were marked as
suched before this patch.  The only change I'm making is to also
deprecate the mode value.

The problem with the mode is that is just doesn't scale very well, we
already have 6 modes, and if I were to add my new options this would
take us to 9 modes.

Instead, by adding some new (optional) command flags, the user can now
access all the old (non-deprecated) modes, without having to specify a
mode value.

Note that, as part of this patch, there is some new behaviour that is
only available when using the command flags (i.e. '--opcodes display'),
there's no mode value that gives access to this functionality.

My hope would be that, if we ever add more functionality to the
-data-disassemble command, this would be added as a command flag rather
than a mode value.  Over time MI users will be motivated to switch to
the command flag system rather than the mode value system in order to
access the new features.

Obviously, I've left the mode value system in place so all existing MI
users will continue to function just as they did before.

Hopefully that answers your question.

>
>> -For modes 0 and 2 the @samp{asm_insns} list contains tuples with the
>> -following fields:
>> +For modes 0 and 2, and when the @code{--source} option is not used the
>                                                                      ^^
> Comma missing there.
>
>> +When the @samp{--opcodes} option is not passed to
>> +@code{-data-disassemble}, or the @samp{bytes} value is passed to
>> +@samp{--opcodes}, then the bytes are formatted as a series of single
>> +bytes, in hex, in ascending address order, with a single space between
>> +each byte.  This format is equivalent to the @samp{/b} option being used
>> +with the @kbd{disassemble} command.
>
> A cross-reference to the description of 'disassemble' would be good
> here.

Thanks for your feedback.  I've updated the docs to (I hope) address all
the points you raised, please let me know what you think.

Thanks,
Andrew

---

commit f7175f62ae260b1b2e7b7fb3a89208582a4540b7
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Thu Jun 23 14:49:55 2022 +0100

    gdb/mi: new options for -data-disassemble command
    
    Now that the disassembler has two different strategies for laying out
    the opcode bytes of an instruction (see /r vs /b for the disassemble
    command), I wanted to add support for this to the MI disassemble
    command.
    
    Currently the -data-disassemble command takes a single 'mode' value,
    which currently has 6 different values (0 -> 5), 3 of these modes
    relate to opcode display.
    
    So, clearly I should just add an additional 3 modes to handle the new
    opcode format, right?
    
    No, I didn't think that was a great idea either.
    
    So, I wonder, could I adjust the -data-disassemble command in a
    backward compatible way, that would allow GDB to move away from using
    the mode value altogether?
    
    I think we can.
    
    In this commit, I propose adding two new options to -data-disassemble,
    these are:
    
      --opcodes none|bytes|display
      --source
    
    Additionally, I will make the mode optional, and default to mode 0 if
    no mode value is given.  Mode 0 is the simplest, no source code, no
    opcodes disassembly mode.
    
    The two new options are only valid for mode 0, if they are used with
    any other mode then an error is thrown.
    
    The --opcodes option can add opcodes to the result, with 'bytes' being
    equivalent to 'disassemble /b' and 'display' being 'disassemble /r'.
    
    The --source option will enable the /s style source code display, this
    is equivalent to modes 4 and 5.  There is no way, using the new
    command options to get the now deprecated /m style source code
    display, that is mode 1 and 3.
    
    My hope is that new users of the MI will not use the mode at all, and
    instead will just use the new options to achieve the output they need.
    Existing MI users can continue to use the mode, and will not need to
    be updated to use the new options.

diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index 984ad9c1ed1..9d2cedd8b8b 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -34423,7 +34423,9 @@
   ( -s @var{start-addr} -e @var{end-addr}
   | -a @var{addr}
   | -f @var{filename} -l @var{linenum} [ -n @var{lines} ] )
-  -- @var{mode}
+  [ --opcodes @var{opcodes-mode} ]
+  [ --source ]
+  [ -- @var{mode} ]
 @end smallexample
 
 @noindent
@@ -34452,31 +34454,71 @@
 displayed; if @var{lines} is higher than the number of lines between
 @var{start-addr} and @var{end-addr}, only the lines up to @var{end-addr}
 are displayed.
+@item @var{opcodes-mode}
+can only be used with @var{mode} 0, and should be one of the following:
+@table @samp
+@item none
+no opcode information will be included in the result.
+
+@item bytes
+opcodes will be included in the result, the opcodes will be formatted
+as for @kbd{disassemble /b}.
+
+@item display
+opcodes will be included in the result, the opcodes will be formatted
+as for @kbd{disassemble /r}.
+@end table
 @item @var{mode}
-is one of:
-@itemize @bullet
-@item 0 disassembly only
-@item 1 mixed source and disassembly (deprecated)
-@item 2 disassembly with raw opcodes
-@item 3 mixed source and disassembly with raw opcodes (deprecated)
-@item 4 mixed source and disassembly
-@item 5 mixed source and disassembly with raw opcodes
-@end itemize
+the use of @var{mode} is deprecated in favour of using the
+@code{--opcodes} and @code{--source} options.  When no @var{mode} is
+given, @var{mode} 0 will be assumed.  However, the @var{mode} is still
+available for backward compatibility.  The @var{mode} should be one of:
+@table @samp
+@item 0
+@emph{disassembly only}, this is the default mode if no mode is
+specified.
+
+@item 1
+@emph{mixed source and disassembly (deprecated)}, it is not possible
+to recreate this mode using @code{--opcodes} and @code{--source}
+options.
+
+@item 2
+@emph{disassembly with raw opcodes}, this mode is equivalent to using
+@var{mode} 0 and passing @code{--opcodes bytes} to the command.
+
+@item 3
+@emph{mixed source and disassembly with raw opcodes (deprecated)}, it
+is not possible to recreate this mode using @code{--opcodes} and
+@code{--source} options.
+
+@item 4
+@emph{mixed source and disassembly}, this mode is equivalent to using
+@var{mode} 0 and passing @code{--source} to the command.
 
+@item 5
+@emph{mixed source and disassembly with raw opcodes}, this mode is
+equivalent to using @var{mode} 0 and passing @code{--opcodes bytes}
+and @code{--source} to the command.
+@end table
 Modes 1 and 3 are deprecated.  The output is ``source centric''
 which hasn't proved useful in practice.
 @xref{Machine Code}, for a discussion of the difference between
 @code{/m} and @code{/s} output of the @code{disassemble} command.
 @end table
 
+The @code{--source} can only be used with @var{mode} 0.  Passing this
+option will include the source code in the disassembly result as if
+@var{mode} 4 or 5 had been used.
+
 @subsubheading Result
 
 The result of the @code{-data-disassemble} command will be a list named
-@samp{asm_insns}, the contents of this list depend on the @var{mode}
-used with the @code{-data-disassemble} command.
+@samp{asm_insns}, the contents of this list depend on the options used
+with the @code{-data-disassemble} command.
 
-For modes 0 and 2 the @samp{asm_insns} list contains tuples with the
-following fields:
+For modes 0 and 2, and when the @code{--source} option is not used, the
+@samp{asm_insns} list contains tuples with the following fields:
 
 @table @code
 @item address
@@ -34492,15 +34534,28 @@
 The text disassembly for this @samp{address}.
 
 @item opcodes
-This field is only present for modes 2, 3 and 5.  This contains the
-raw opcode bytes for the @samp{inst} field.  The bytes are formatted
-as single bytes, in hex, in ascending address order, with a single
-space between each byte.
+This field is only present for modes 2, 3 and 5, or when the
+@code{--opcodes} option @samp{bytes} or @samp{display} is used.  This
+contains the raw opcode bytes for the @samp{inst} field.
+
+When the @samp{--opcodes} option is not passed to
+@code{-data-disassemble}, or the @samp{bytes} value is passed to
+@samp{--opcodes}, then the bytes are formatted as a series of single
+bytes, in hex, in ascending address order, with a single space between
+each byte.  This format is equivalent to the @samp{/b} option being
+used with the @kbd{disassemble} command
+(@pxref{disassemble,,@kbd{disassemble}}).
 
+When @samp{--opcodes} is passed the value @samp{display} then the bytes
+are formatted in the natural instruction display order.  This means
+multiple bytes can be grouped together, and the bytes might be
+byte-swapped.  This format is equivalent to the @samp{/r} option being
+used with the @kbd{disassemble} command.
 @end table
 
-For modes 1, 3, 4 and 5 the @samp{asm_insns} list contains tuples named
-@samp{src_and_asm_line}, each of which has the following fields:
+For modes 1, 3, 4 and 5, or when the @code{--source} option is used, the
+@samp{asm_insns} list contains tuples named @samp{src_and_asm_line},
+each of which has the following fields:
 
 @table @code
 @item line
diff --git a/gdb/mi/mi-cmd-disas.c b/gdb/mi/mi-cmd-disas.c
index 4f6e5613b9d..db5a78827fc 100644
--- a/gdb/mi/mi-cmd-disas.c
+++ b/gdb/mi/mi-cmd-disas.c
@@ -68,6 +68,8 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
   bool start_seen = false;
   bool end_seen = false;
   bool addr_seen = false;
+  bool opcodes_seen = false;
+  bool source_seen = false;
 
   /* ... and their corresponding value. */
   char *file_string = NULL;
@@ -77,12 +79,23 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
   CORE_ADDR high = 0;
   CORE_ADDR addr = 0;
 
+  /* Flags to handle the --opcodes option.  */
+  enum opcodes_mode
+  {
+    OPCODES_DEFAULT, OPCODES_NONE, OPCODES_DISPLAY, OPCODES_BYTES
+  };
+  enum opcodes_mode opcodes_mode = OPCODES_DEFAULT;
+
+  /* Handle the -source option.  */
+  bool show_source = false;
+
   /* Options processing stuff.  */
   int oind = 0;
   char *oarg;
   enum opt
   {
-    FILE_OPT, LINE_OPT, NUM_OPT, START_OPT, END_OPT, ADDR_OPT
+    FILE_OPT, LINE_OPT, NUM_OPT, START_OPT, END_OPT, ADDR_OPT, OPCODES_OPT,
+    SHOW_SRC_OPT
   };
   static const struct mi_opt opts[] =
     {
@@ -92,6 +105,8 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
       {"s", START_OPT, 1},
       {"e", END_OPT, 1},
       {"a", ADDR_OPT, 1},
+      {"-opcodes", OPCODES_OPT, 1},
+      {"-source", SHOW_SRC_OPT, 0},
       { 0, 0, 0 }
     };
 
@@ -129,6 +144,21 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
 	  addr = parse_and_eval_address (oarg);
 	  addr_seen = true;
 	  break;
+	case OPCODES_OPT:
+	  opcodes_seen = true;
+	  if (strcmp (oarg, "none") == 0)
+	    opcodes_mode = OPCODES_NONE;
+	  else if (strcmp (oarg, "display") == 0)
+	    opcodes_mode = OPCODES_DISPLAY;
+	  else if (strcmp (oarg, "bytes") == 0)
+	    opcodes_mode = OPCODES_BYTES;
+	  else
+	    error (_("-data-disassemble: unknown value for -opcodes argument"));
+	  break;
+	case SHOW_SRC_OPT:
+	  source_seen = true;
+	  show_source = true;
+	  break;
 	}
     }
   argv += oind;
@@ -146,13 +176,23 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
 
        || (!line_seen && !file_seen && !num_seen && !start_seen && !end_seen
 								&&  addr_seen))
-      || argc != 1)
-    error (_("-data-disassemble: Usage: ( [-f filename -l linenum "
-	     "[-n howmany]] | [-s startaddr -e endaddr] | [-a addr] ) [--] mode."));
+      || argc > 1)
+    error (_("-data-disassemble: Usage: "
+	     "( -f filename -l linenum [-n howmany] |"
+	     " -s startaddr -e endaddr | -a addr ) "
+	     "[ --opcodes mode ] [ --source ] [ [--] mode ]."));
+
+  if (argc == 1)
+    {
+      mode = atoi (argv[0]);
+      if (mode < 0 || mode > 5)
+	error (_("-data-disassemble: Mode argument must be in the range 0-5."));
+    }
+  else
+    mode = 0;
 
-  mode = atoi (argv[0]);
-  if (mode < 0 || mode > 5)
-    error (_("-data-disassemble: Mode argument must be in the range 0-5."));
+  if (mode != 0 && (source_seen || opcodes_seen))
+    error (_("-data-disassemble: --opcodes and --source can only be used with mode 0"));
 
   /* Convert the mode into a set of disassembly flags.  */
 
@@ -180,6 +220,27 @@ mi_cmd_disassemble (const char *command, char **argv, int argc)
       gdb_assert_not_reached ("bad disassembly mode");
     }
 
+  /* Now handle the (optional) --opcodes argument.  This partially
+     overrides the mode value.  */
+  if (opcodes_mode != OPCODES_DEFAULT)
+    {
+      /* Remove any existing flags related to opcodes display.  */
+      disasm_flags &= ~(DISASSEMBLY_RAW_BYTES | DISASSEMBLY_RAW_INSN);
+
+      /* Add back any required flags.  */
+      if (opcodes_mode == OPCODES_DISPLAY)
+	disasm_flags |= DISASSEMBLY_RAW_INSN;
+      else if (opcodes_mode == OPCODES_BYTES)
+	disasm_flags |= DISASSEMBLY_RAW_BYTES;
+    }
+
+  /* Handle the optional --source argument.  */
+  if (show_source)
+    {
+      disasm_flags &= ~DISASSEMBLY_SOURCE_DEPRECATED;
+      disasm_flags |= DISASSEMBLY_SOURCE;
+    }
+
   /* We must get the function beginning and end where line_num is
      contained.  */
 
diff --git a/gdb/testsuite/gdb.mi/mi-disassemble.exp b/gdb/testsuite/gdb.mi/mi-disassemble.exp
index ef3337d4062..4d7c2f2e6e7 100644
--- a/gdb/testsuite/gdb.mi/mi-disassemble.exp
+++ b/gdb/testsuite/gdb.mi/mi-disassemble.exp
@@ -47,21 +47,27 @@ proc test_disassembly_only {} {
 
 
     mi_gdb_test "print/x \$pc" "" ""
-    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" -- 0" \
-	    "111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble from pc to pc+12 assembly only"
 
-    mi_gdb_test "112-data-disassemble -a \$pc -- 0" \
-	    "112\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble function around pc assembly only"
+    foreach { test_name option_string } [list "mode 0" "-- 0" \
+					     "default mode" "" ] {
+	with_test_prefix $test_name {
+	    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" ${option_string}" \
+		"111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble from pc to pc+12 assembly only"
 
-    mi_gdb_test "113-data-disassemble -a callee4 -- 0" \
-	    "113\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
-             "data-disassemble function callee4 assembly only"
+	    mi_gdb_test "112-data-disassemble -a \$pc ${option_string}" \
+		"112\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble function around pc assembly only"
 
-    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body -- 0" \
-	    "222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}\\\]" \
-              "data-disassemble file & line, assembly only"
+	    mi_gdb_test "113-data-disassemble -a callee4 ${option_string}" \
+		"113\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\},\{address=\"$hex\",func-name=\"callee4\",offset=\"$decimal\",inst=\".*\"\}.*\]" \
+		"data-disassemble function callee4 assembly only"
+
+	    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body ${option_string}" \
+		"222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",inst=\".*\"\}\\\]" \
+		"data-disassemble file & line, assembly only"
+	}
+    }
 }
 
 proc test_disassembly_with_opcodes {} {
@@ -78,13 +84,20 @@ proc test_disassembly_with_opcodes {} {
     # -data-disassembly -f basics.c -l $line_main_body -- 2
 
     mi_gdb_test "print/x \$pc" "" ""
-    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" -- 2" \
-           "111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}.*\]" \
-             "data-disassemble from pc to pc+12 assembly with opcodes"
 
-    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body -- 2" \
-           "222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",opcodes=\".*\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}\\\]" \
-              "data-disassemble file & line, assembly with opcodes"
+    foreach { test_name option_string} [list "mode 2" "-- 2" \
+					    "mode 0 and --opcodes bytes" "--opcodes bytes -- 0" \
+					    "default mode and --opcodes bytes" "--opcodes bytes"] {
+	with_test_prefix $test_name {
+	    mi_gdb_test "111-data-disassemble -s \$pc -e \"\$pc + 12\" ${option_string}" \
+		"111\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\},\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}.*\]" \
+		"data-disassemble from pc to pc+12 assembly"
+
+	    mi_gdb_test "222-data-disassemble  -f basics.c -l $line_main_body ${option_string}" \
+		"222\\^done,asm_insns=\\\[\{address=\"$hex\",func-name=\"main\",offset=\"0\",opcodes=\".*\",inst=\".*\"\},.*,\{address=\"$hex\",func-name=\"main\",offset=\"$decimal\",opcodes=\".*\",inst=\".*\"\}\\\]" \
+		"data-disassemble file & line, assembly"
+	}
+    }
 }
 
 proc test_disassembly_lines_limit {} {
@@ -230,13 +243,24 @@ proc test_disassembly_bogus_args {} {
              "data-disassemble bogus address, -a"
 
     mi_gdb_test "456-data-disassemble -s \$pc -f basics.c -- 0" \
-             "456\\^error,msg=\"-data-disassemble: Usage: \\( .-f filename -l linenum .-n howmany.. \\| .-s startaddr -e endaddr. \\| .-a addr. \\) .--. mode.\"" \
-             "data-disassemble mix different args"
+	"456\\^error,msg=\"-data-disassemble: Usage: \\( -f filename -l linenum .-n howmany. \\| -s startaddr -e endaddr \\| -a addr \\) . --opcodes mode . . --source . . .--. mode .\\.\"" \
+	"data-disassemble mix different args"
 
     mi_gdb_test "789-data-disassemble -f basics.c -l $line_main_body -- 9" \
              "789\\^error,msg=\"-data-disassemble: Mode argument must be in the range 0-5.\"" \
-             "data-disassemble wrong mode arg"
+	"data-disassemble wrong mode arg"
+
+    foreach mode { 1 2 3 4 5 } {
+	foreach opcode_arg { none bytes display } {
+	    mi_gdb_test "801-data-disassemble -s \$pc -e \"\$pc + 12\" --opcodes ${opcode_arg} -- ${mode}" \
+		"801\\^error,msg=\"-data-disassemble: --opcodes and --source can only be used with mode 0\"" \
+		"data-disassemble use --opcode ${opcode_arg} with mode ${mode}"
+	}
 
+	mi_gdb_test "802-data-disassemble -s \$pc -e \"\$pc + 12\" --source -- ${mode}" \
+	    "802\\^error,msg=\"-data-disassemble: --opcodes and --source can only be used with mode 0\"" \
+	    "data-disassemble use --source with mode ${mode}"
+    }
 }
 
 # Check the format of the opcode bytes.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] gdb/mi: new options for -data-disassemble command
  2022-06-30 11:22     ` Andrew Burgess
@ 2022-06-30 13:36       ` Eli Zaretskii
  0 siblings, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2022-06-30 13:36 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Thu, 30 Jun 2022 12:22:28 +0100
> 
> Thanks for your feedback.  I've updated the docs to (I hope) address all
> the points you raised, please let me know what you think.

Thanks, the documentation part looks good.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] Disassembler opcode display and text alignment
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (8 preceding siblings ...)
  2022-06-23 16:05 ` [PATCH 9/9] gdb/mi: new options for -data-disassemble command Andrew Burgess
@ 2022-07-25 18:28 ` Andrew Burgess
  2022-09-05 14:11   ` Andrew Burgess
  2022-09-21 16:41 ` Tom Tromey
  10 siblings, 1 reply; 25+ messages in thread
From: Andrew Burgess @ 2022-07-25 18:28 UTC (permalink / raw)
  To: gdb-patches


Ping!

Thanks,
Andrew

Andrew Burgess <aburgess@redhat.com> writes:

> This series makes two related changes to GDB's disassembler.  Both
> changes flow naturally from having GDB make use of libopcodes
> bytes_per_line and bytes_per_chunk presentation hints that are set on
> a per-architecure basis with each call into the disassembler.
>
> The end result of this change is that GDB will now display instruction
> opcodes in the same way that objdump does.
>
> For x86-64 there's no change in the way that opcode bytes are
> presented.  Now due to some special case, it's just that how GDB lays
> out instruction opcodes just happens to match how libopcodes requests
> that the opcodes be laid out.
>
> For other architectures, risc-v, powerpc, arm, aarch64, etc, this is
> not the case, and I (personally) think objdump does a better job of
> presenting the information that GDB does; the instruction opcodes will
> be grouped together based on the instruction size, and potentially
> byte-swapped so they appear in the instruction's natural order.
>
> Making use of the display hints also allows for better alignment of
> the disassembly text when opcodes are being printed.
>
> I have proposed that this new behaviour become the default for
> 'disassemble /r', and I've added a new flag 'disassemble /b' which
> allows the user to access the old behaviour.
>
> But, if people feel strongly that the old 'disassemble /r' should not
> be changed, I can place the new behaviour under a new flag.
>
> Let me know your thoughts,
>
> Thanks,
> Andrew
>
> ---
>
> Andrew Burgess (9):
>   gdb/doc: improve description of --data-disassemble opcodes output
>   gdb/testsuite: new test for -data-disassemble opcodes format
>   gdb/disasm: read opcodes bytes with a single read_code call
>   gdb: disassembler opcode display formatting
>   gdb: make gdb_disassembly_flag unsigned
>   gdb/doc: fix column widths in MI compatibility table
>   gdb/doc: update syntax of -data-disassemble command arguments
>   gdb/mi: some int to bool conversion
>   gdb/mi: new options for -data-disassemble command
>
>  gdb/NEWS                                |  12 ++
>  gdb/cli/cli-cmds.c                      |   6 +
>  gdb/disasm-flags.h                      |   3 +-
>  gdb/disasm.c                            |  55 +++++++--
>  gdb/disasm.h                            |   3 +
>  gdb/doc/gdb.texinfo                     | 137 ++++++++++++++++-----
>  gdb/mi/mi-cmd-disas.c                   | 105 ++++++++++++----
>  gdb/record.c                            |   3 +
>  gdb/testsuite/gdb.mi/mi-disassemble.exp | 153 ++++++++++++++++++++----
>  gdb/testsuite/lib/mi-support.exp        |  27 +++++
>  10 files changed, 417 insertions(+), 87 deletions(-)
>
> -- 
> 2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] Disassembler opcode display and text alignment
  2022-07-25 18:28 ` [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
@ 2022-09-05 14:11   ` Andrew Burgess
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-09-05 14:11 UTC (permalink / raw)
  To: gdb-patches


Ping!

Thanks,
Andrew

Andrew Burgess <aburgess@redhat.com> writes:

> Ping!
>
> Thanks,
> Andrew
>
> Andrew Burgess <aburgess@redhat.com> writes:
>
>> This series makes two related changes to GDB's disassembler.  Both
>> changes flow naturally from having GDB make use of libopcodes
>> bytes_per_line and bytes_per_chunk presentation hints that are set on
>> a per-architecure basis with each call into the disassembler.
>>
>> The end result of this change is that GDB will now display instruction
>> opcodes in the same way that objdump does.
>>
>> For x86-64 there's no change in the way that opcode bytes are
>> presented.  Now due to some special case, it's just that how GDB lays
>> out instruction opcodes just happens to match how libopcodes requests
>> that the opcodes be laid out.
>>
>> For other architectures, risc-v, powerpc, arm, aarch64, etc, this is
>> not the case, and I (personally) think objdump does a better job of
>> presenting the information that GDB does; the instruction opcodes will
>> be grouped together based on the instruction size, and potentially
>> byte-swapped so they appear in the instruction's natural order.
>>
>> Making use of the display hints also allows for better alignment of
>> the disassembly text when opcodes are being printed.
>>
>> I have proposed that this new behaviour become the default for
>> 'disassemble /r', and I've added a new flag 'disassemble /b' which
>> allows the user to access the old behaviour.
>>
>> But, if people feel strongly that the old 'disassemble /r' should not
>> be changed, I can place the new behaviour under a new flag.
>>
>> Let me know your thoughts,
>>
>> Thanks,
>> Andrew
>>
>> ---
>>
>> Andrew Burgess (9):
>>   gdb/doc: improve description of --data-disassemble opcodes output
>>   gdb/testsuite: new test for -data-disassemble opcodes format
>>   gdb/disasm: read opcodes bytes with a single read_code call
>>   gdb: disassembler opcode display formatting
>>   gdb: make gdb_disassembly_flag unsigned
>>   gdb/doc: fix column widths in MI compatibility table
>>   gdb/doc: update syntax of -data-disassemble command arguments
>>   gdb/mi: some int to bool conversion
>>   gdb/mi: new options for -data-disassemble command
>>
>>  gdb/NEWS                                |  12 ++
>>  gdb/cli/cli-cmds.c                      |   6 +
>>  gdb/disasm-flags.h                      |   3 +-
>>  gdb/disasm.c                            |  55 +++++++--
>>  gdb/disasm.h                            |   3 +
>>  gdb/doc/gdb.texinfo                     | 137 ++++++++++++++++-----
>>  gdb/mi/mi-cmd-disas.c                   | 105 ++++++++++++----
>>  gdb/record.c                            |   3 +
>>  gdb/testsuite/gdb.mi/mi-disassemble.exp | 153 ++++++++++++++++++++----
>>  gdb/testsuite/lib/mi-support.exp        |  27 +++++
>>  10 files changed, 417 insertions(+), 87 deletions(-)
>>
>> -- 
>> 2.25.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] Disassembler opcode display and text alignment
  2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
                   ` (9 preceding siblings ...)
  2022-07-25 18:28 ` [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
@ 2022-09-21 16:41 ` Tom Tromey
  2022-10-02 13:15   ` Andrew Burgess
  10 siblings, 1 reply; 25+ messages in thread
From: Tom Tromey @ 2022-09-21 16:41 UTC (permalink / raw)
  To: Andrew Burgess via Gdb-patches

>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

Andrew> This series makes two related changes to GDB's disassembler.  Both
Andrew> changes flow naturally from having GDB make use of libopcodes
Andrew> bytes_per_line and bytes_per_chunk presentation hints that are set on
Andrew> a per-architecure basis with each call into the disassembler.

I read through this series and it all looked fine to me.
Thank you for doing this.

Tom

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] Disassembler opcode display and text alignment
  2022-09-21 16:41 ` Tom Tromey
@ 2022-10-02 13:15   ` Andrew Burgess
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Burgess @ 2022-10-02 13:15 UTC (permalink / raw)
  To: Tom Tromey, Andrew Burgess via Gdb-patches

Tom Tromey <tom@tromey.com> writes:

>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> Andrew> This series makes two related changes to GDB's disassembler.  Both
> Andrew> changes flow naturally from having GDB make use of libopcodes
> Andrew> bytes_per_line and bytes_per_chunk presentation hints that are set on
> Andrew> a per-architecure basis with each call into the disassembler.
>
> I read through this series and it all looked fine to me.
> Thank you for doing this.

Thanks, pushed.

Andrew


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2022-10-02 13:15 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-23 16:05 [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
2022-06-23 16:05 ` [PATCH 1/9] gdb/doc: improve description of --data-disassemble opcodes output Andrew Burgess
2022-06-23 16:09   ` Eli Zaretskii
2022-06-29 12:54     ` Andrew Burgess
2022-06-23 16:05 ` [PATCH 2/9] gdb/testsuite: new test for -data-disassemble opcodes format Andrew Burgess
2022-06-23 16:05 ` [PATCH 3/9] gdb/disasm: read opcodes bytes with a single read_code call Andrew Burgess
2022-06-23 16:05 ` [PATCH 4/9] gdb: disassembler opcode display formatting Andrew Burgess
2022-06-23 16:14   ` Eli Zaretskii
2022-06-23 16:05 ` [PATCH 5/9] gdb: make gdb_disassembly_flag unsigned Andrew Burgess
2022-06-23 16:05 ` [PATCH 6/9] gdb/doc: fix column widths in MI compatibility table Andrew Burgess
2022-06-23 16:22   ` Eli Zaretskii
2022-06-30  9:39     ` Andrew Burgess
2022-06-23 16:05 ` [PATCH 7/9] gdb/doc: update syntax of -data-disassemble command arguments Andrew Burgess
2022-06-23 16:21   ` Eli Zaretskii
2022-06-30 10:18     ` Andrew Burgess
2022-06-30 10:46       ` Eli Zaretskii
2022-06-23 16:05 ` [PATCH 8/9] gdb/mi: some int to bool conversion Andrew Burgess
2022-06-23 16:05 ` [PATCH 9/9] gdb/mi: new options for -data-disassemble command Andrew Burgess
2022-06-23 16:34   ` Eli Zaretskii
2022-06-30 11:22     ` Andrew Burgess
2022-06-30 13:36       ` Eli Zaretskii
2022-07-25 18:28 ` [PATCH 0/9] Disassembler opcode display and text alignment Andrew Burgess
2022-09-05 14:11   ` Andrew Burgess
2022-09-21 16:41 ` Tom Tromey
2022-10-02 13:15   ` Andrew Burgess

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).