Re: [PATCH 1/1] Integrate GNU poke in GDB

public inbox for gdb-patches@sourceware.org

From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Andrew Burgess <andrew.burgess@embecosm.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/1] Integrate GNU poke in GDB
Date: Tue, 11 May 2021 15:07:42 +0200
Message-ID: <875yzps9k1.fsf@oracle.com>
In-Reply-To: <20210511073331.GU6612@embecosm.com> (Andrew Burgess's message of "Tue, 11 May 2021 08:33:31 +0100")

Hi Andrew.

Thanks for the feedback.

> I have no objections to merging this functionality, especially as it
> is so self-contained, however, I do have a question, or feedback.
>
> As someone who doesn't know poke, after reading the manual, it's still
> not clear to me what significant value poke adds to GDB.  A lot of the
> examples seem to cover things that GDB already does, mostly examining
> data and variables in target memory (though there were a couple of
> neat features which aren't so easy to achieve in GDB).
>
> I guess my question, or feedback, would be, what's the killer feature
> that adding poke brings to GDB?

The most obvious example that quickly comes to mind is to check/debug
the integrity of data in an application's memory.

Suppose you have a program or driver that processes USB packets, and
buffers them in some memory buffer.  The program is malfunctioning and
you suspect (but you don't know for sure) the culprit is some corrupted
USB packet.

So you want to examine the integrity of the USB packets that the program
has buffered at a certain point in the program's execution.  This is
where poke comes in handy: a pickle for USB packets (let's say usb.pk)
will provide types/methods/functions describing the format of USB
packets, including integrity constraints.  It will also likely provide
utility functions for diagnostics, and even for fixing invalid packets.
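
To give a rough idea of the kind of integrity check such a pickle could
encode, here is a minimal sketch in Python (not Poke, and not taken from
any real usb.pk).  It relies on one property of USB packets: the PID
byte carries a 4-bit packet identifier in its low nibble and the bitwise
complement of that identifier in its high nibble.  The function names
are hypothetical, and the buffer is assumed to hold one PID byte per
buffered packet.

```python
def usb_pid_is_valid(pid_byte: int) -> bool:
    """Return True if the high nibble is the complement of the low nibble."""
    pid = pid_byte & 0x0F            # 4-bit packet identifier
    check = (pid_byte >> 4) & 0x0F   # 4-bit check field
    return check == (~pid & 0x0F)


def find_corrupt_packets(pid_bytes: bytes) -> list:
    """Return the indices of the PID bytes that fail the complement check."""
    return [i for i, b in enumerate(pid_bytes) if not usb_pid_is_valid(b)]
```

For instance, find_corrupt_packets(bytes([0xE1, 0x69, 0xFF, 0xC3]))
flags only index 2.  In a Poke type the same condition would live in
the type definition itself as a constraint, so it gets checked every
time a packet is mapped.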

GDB gives you access to the C/whatever-language types, and that is very
good, but:

- C/whatever language types do not include integrity checks.  Poke types
  almost always do.

- Poke types are designed to be used/interpreted/read by persons,
  whereas C/whatever types are designed to be interpreted by programs.
  As such, Poke types tend to "adapt" their form to their contents in
  order to be intelligible and to increase the odds of detecting
  integrity problems.

- If the protocol/format/structure of the data being examined is
  bit-oriented (consider for example data compressed with deflate) the
  C/whatever types usually provide a byte-sized view of the structure
  instead of a meaningful structure that the person using GDB can easily
  understand/manipulate.  Instead, poke can _really_ operate at the bit
  level.  See the following section in the poke manual:
  http://www.jemarch.net/poke-1.2-manual/html_node/Weird-Integers.html#Weird-Integers
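
To make the bit-level point concrete, here is a small Python sketch of
what has to be done by hand when the data is not byte-aligned.  A
deflate block starts with a 1-bit BFINAL field followed by a 2-bit BTYPE
field, packed starting from the least significant bit of the first
byte.  The BitReader class and read_block_header function are
hypothetical helpers written for this example only.

```python
class BitReader:
    """Read unsigned fields from a byte string, least significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits: int) -> int:
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (self.pos % 8)) & 1
            value |= bit << i   # first bit read becomes the lowest bit
            self.pos += 1
        return value


def read_block_header(data: bytes) -> dict:
    """Decode the 3-bit header at the start of a deflate block."""
    reader = BitReader(data)
    return {"bfinal": reader.read(1), "btype": reader.read(2)}
```

With a first byte of 0xCB, read_block_header returns bfinal 1 and
btype 1 (a final block using fixed Huffman codes).  A C struct has no
natural way to name fields like these once they straddle byte
boundaries, whereas a Poke type can declare a uint<1> followed by a
uint<2> directly.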

For example, consider the following Poke type describing a type in BTF
(the BPF Type Format, the debugging format for BPF programs):

  type BTF_Type =
    struct
    {
      offset<uint<32>,B> name;

      struct uint<32>
      {
        uint<1> kind_flag;
        uint<3>;
        uint<4> kind;
        uint<8>;
        uint<16> vlen;

        method _print = void:
          {
            printf ("#<%s,kind_flag:%u32d,vlen:%v>",
                    btf_kind_names[kind], kind_flag, vlen);
          }
      } info;

      union
      {
        offset<uint<32>,B> size : (info.kind in [BTF_KIND_INT,
                                                 BTF_KIND_ENUM,
                                                 BTF_KIND_STRUCT,
                                                 BTF_KIND_UNION]);
        BTF_Type_Id type_id;
      } attrs;

      type BTF_Member =
        struct
        {
          offset<uint<32>,B> name;
          BTF_Type_Id type_id;
          union
          {
            offset<uint<32>,b> member_offset : !info.kind_flag;
            struct uint<32>
            {
              offset<uint<8>,b> bitfield_size;
              offset<uint<24>,b> bit_offset;
            } bitfield;
          } offset;
        };

      type BTF_Func_Proto =
        struct
        {
          BTF_Param[info.vlen] params;
        };

      union
      {
        BTF_Int integer                    : info.kind == BTF_KIND_INT;
        BTF_Array array                    : info.kind == BTF_KIND_ARRAY;
        BTF_Enum[info.vlen] enum           : info.kind == BTF_KIND_ENUM;
        BTF_Func_Proto func_proto          : info.kind == BTF_KIND_FUNC_PROTO;
        BTF_Variable variable              : info.kind == BTF_KIND_VAR;
        BTF_Member[info.vlen] members      : (info.kind == BTF_KIND_UNION
                                              || info.kind == BTF_KIND_STRUCT);
        BTF_Var_SecInfo[info.vlen] datasec : info.kind == BTF_KIND_DATASEC;

        struct {} nothing;
      } data;

      method vararg_p = int:
        {
          var last_param = data.func_proto.params[info.vlen - 1];
          return (last_param.name == 0#B && last_param.param_type == 0);
        }

      method get_kind_name = string:
        {
          return btf_kind_names[info.kind];
        }
    };

If you map a BTF_Type from GDB, poke will check its integrity (poke also
supports mapping in non-strict mode) and will build the structure based
on the actual contents.  GDB simply can't do the same thing based on
the "equivalent" collection of decoupled C types, which are really not
equivalent at all.

So, back to the debugging of the USB processing program, sure, you could
dump the buffer to some file using GDB commands and then use poke on the
side to poke at it.  But having poke in GDB allows you to inspect the
buffer and the integrity of its contents directly, and also provides
access to the rest of the program's memory, which could also be handy
for diagnosing the problem.

Now suppose you find something fishy in a USB packet, and what you want
to do is to _fix_ it (or modify it somehow) and then continue the
program to see if that fixes the crash.  Again sure, you could use GDB
commands to load the modified dump.  But having poke in GDB allows you
to just continue.
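
As a point of comparison only, here is a sketch of an in-place fix done
with GDB's existing Python API rather than with poke.  It assumes the
Inferior.read_memory and Inferior.write_memory methods of that API; the
patch_bytes helper and the FakeInferior stand-in (which lets the sketch
run outside GDB) are hypothetical and are not part of the proposed
patch.

```python
def patch_bytes(inferior, address: int, offset: int, new_bytes: bytes) -> bytes:
    """Overwrite bytes at address + offset and return what was there before."""
    target = address + offset
    old = bytes(inferior.read_memory(target, len(new_bytes)))
    inferior.write_memory(target, new_bytes)
    return old


class FakeInferior:
    """Stand-in exposing the two memory methods used above, for use outside GDB."""

    def __init__(self, base: int, data: bytes):
        self.base = base
        self.mem = bytearray(data)

    def read_memory(self, address: int, length: int):
        start = address - self.base
        return memoryview(bytes(self.mem[start:start + length]))

    def write_memory(self, address: int, buffer) -> None:
        start = address - self.base
        self.mem[start:start + len(buffer)] = bytes(buffer)
```

Inside a GDB session one would pass gdb.selected_inferior() instead of
the stand-in and then resume the program.  This only moves raw bytes,
though; knowing which bytes are wrong and what they should be is exactly
the part a pickle describing the packet format provides.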

So much for "what can poke do for GDB".  But there is another aspect to
consider as well: "what can GDB do for poke".

You see, mainly for the sake of completeness, we _do_ support a "proc" IO
device in GNU poke (the command-line editor) that provides access to the
memory of a running process.  However, this support is intended only for
the very basics, and:

- It is not portable (only works on GNU/Linux).
- It doesn't know how to recognize stack frames.
- It doesn't know how to unwind.
- It doesn't have a disassembler.
- etc

When people ask me to add features like that to poke (and they do ask) I
always think: why bother implementing them?  GDB does all these
things very well, and more, and it is portable, and it knows about a
shitload of architectures, and...

So, in my mind, the right platform to use for poking at the memory of
live (or even dead) processes is GDB.  That's why we took pains to make
it very easy to integrate poke (libpoke) in other applications, and
hence this proposed patch :)

Another example of integration (which is work in progress) is with the
assembler (GAS).

Instead of writing this:

 .section .text
 .set softfloat
 .ascii "PS-X EXE"
 .byte 0, 0, 0, 0, 0, 0, 0, 0
 .word main
 .word 0
 .word 0x80010000
 .word image_size
 .word 0,0,0,0
 .word 0x8001FFF0
 .word 0,0,0,0,0,0
 .ascii "Sony Computer Entertainment Inc. for zid"

You will be able to write something like:

 .poke load psxexe
 .poke var s = "Sony Computer"
 .poke PS_X_EXE { start = $main, size = $image_size, vendor = s }

(A nice side effect will be that .poke will be a _really portable_ data
 directive, unlike the regular ones, and therefore we will be able to
 write, say, the GDB and linker tests for CTF the right way,
 i.e. without relying on a compiler or on copy-pasted unintelligible
 .word/.byte blocks.)

I could go on and on, but I don't want to pester you people, so I'd
better stop :)
