public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom Tromey <tom@tromey.com>
To: "Jose E. Marchesi via Gdb-patches" <gdb-patches@sourceware.org>
Subject: Re: [PATCH 1/1] Integrate GNU poke in GDB
Date: Thu, 13 May 2021 10:59:30 -0600	[thread overview]
Message-ID: <871raa603x.fsf@tromey.com> (raw)
In-Reply-To: <20210510151044.20829-2-jose.marchesi@oracle.com> (Jose E. Marchesi via Gdb-patches's message of "Mon, 10 May 2021 17:10:44 +0200")

>>>>> "Jose" == Jose E Marchesi via Gdb-patches <gdb-patches@sourceware.org> writes:

Hi.  Thanks for the patch.  gdb has its own coding style, not to mention
API peculiarities and surprises, so I've gone through the patch and
mentioned the ones I could find.  I'm sorry it turned out rather long,
don't be discouraged, this is typical for a first gdb patch.

Before accepting this, I think test cases would also be needed; plus a
NEWS entry.

Probably not all the maintainers will have poke, so I'd guess any
changes needed will be done on a best-effort basis.  I see Fedora
doesn't package it, at least as of Fedora 32.

Jose> +# Integration with GNU poke is done through the libpoke library.
Jose> +AC_ARG_ENABLE([poke],
Jose> +              AS_HELP_STRING([--enable-poke],
Jose> +                             [Build GDB with poke support (default is NO)]),
Jose> +              [poke_enabled=$enableval], [poke_enabled=no])

It's normal to also support an 'auto' value, often the default; then
have this mode probe for the library.

Jose> +if test "x$poke_enabled" = "xyes"; then
Jose> +  # Note that we need a libpoke with support for registering foreign
Jose> +  # IO devices, hence the symbol pk_register_iod.
Jose> +  AC_CHECK_LIB(poke, pk_register_iod)
Jose> +  POKE_OBS="poke.o"

Probably the "yes" value should not probe, and just give a compilation
(or configure) error if the library isn't found.  The idea here is that
this is what the user asked for.

Jose> +++ b/gdb/doc/gdb.texinfo
Jose> @@ -10245,6 +10245,7 @@ being passed the type of @var{arg} as the argument.
Jose>  * Caching Target Data::         Data caching for targets
Jose>  * Searching Memory::            Searching memory for a sequence of bytes
Jose>  * Value Sizes::                 Managing memory allocated for values
Jose> +* Poke::                        Poking at data using GNU poke.

I think it would be better if this were conditional on Poke actually
being compiled in.

Jose> +static void
Jose> +poke_term_flush (void)

No need for (void) in C++.

Jose> +/* Terminal hook that prints a fixed string.  */
Jose> +
Jose> +static void
Jose> +poke_puts (const char *str)
Jose> +{
Jose> +  printf_filtered ("%s", str);
Jose> +}

printf_filtered can throw an exception if the user chooses 'q' at a
pagination prompt.  Is this ok for poke?

I don't think other exceptions can be thrown, but it's hard to know for
sure.

Jose> +/* Terminal hook that prints a formatted string.  */
Jose> +
Jose> +__attribute__ ((__format__ (__printf__, 1, 2)))

ATTRIBUTE_PRINTF

Jose> +static void
Jose> +poke_printf (const char *format, ...)
Jose> +{
Jose> +  va_list ap;
Jose> +  char *str;
Jose> +  int r;
Jose> +
Jose> +  va_start (ap, format);
Jose> +  r = vasprintf (&str, format, ap);
Jose> +  if (r == -1)
Jose> +    error (_("out of memory in vasprintf")); /* XXX fatal */
Jose> +  va_end (ap);
Jose> +
Jose> +  printf_filtered ("%s", str);

This can all be replaced with vprintf_filtered.

Jose> +/* Terminal hook that sets the terminal background color.  */
Jose> +
Jose> +static void
Jose> +poke_term_set_bgcolor (struct pk_color color)
Jose> +{
Jose> +  /* Do nothing.  */
Jose> +}

I don't know what this is for, but gdb has a styling system that can be
used.

Jose> +/* Implementation of the poke terminal interface, that uses the hooks
Jose> +   defined above.  */
Jose> +
Jose> +static struct pk_term_if poke_term_if =
Jose> +  {
Jose> +    .flush_fn = poke_term_flush,

I don't think this is C++ syntax.
Maybe it's a GNU extension, but gdb only uses those when it knows it is
being compiled with gcc.

Jose> +static char *
Jose> +iod_handler_normalize (const char *handler, uint64_t flags, int *error)
Jose> +{
Jose> +  char *new_handler = NULL;
Jose> +
Jose> +  if (strcmp (handler, "<gdb>") == 0)
Jose> +    new_handler = xstrdup (handler);
Jose> +  if (error)
Jose> +    *error = PK_IOD_OK;
Jose> +
Jose> +  return new_handler;

If poke owns this memory, then I suspect you'll have problems on
Windows, because IIRC different libraries may get different malloc/free
implementations, and so passing ownership like this isn't ok.

Not sure if that's correct, you may want to double-check.

Jose> +  return (gdbarch_addr_bit (get_current_arch ()) == 32
Jose> +	  ? 0xffffffff : 0xffffffffffffffff);

GNU style here is to add extra parens so that the second line lines up
with the expression on the first line.

Jose> +static struct pk_alien_token *
Jose> +poke_alien_token_handler (const char *id, char **errmsg)
Jose> +{
Jose> +  /* In GDB alien poke tokens with the form $addr::FOO provide the
Jose> +     address of the symbol `FOO' as an offset in bytes, i.e. it
Jose> +     resolves to the GDB value &foo as a Poke offset with unit bytes.
Jose> +
Jose> +     $FOO, on the other hand, provide the value of the symbol FOO
Jose> +     incarnated in a proper Poke value.  */
Jose> +
Jose> +  if (strncmp (id, "addr::", 6) == 0)

startswith

Jose> +    {
Jose> +      CORE_ADDR addr;
Jose> +
Jose> +      std::string expr = "&";
Jose> +      expr += id + 6;
Jose> +
Jose> +      try
Jose> +	{
Jose> +	  addr = parse_and_eval_address (expr.c_str ());
Jose> +	}

"&" is valid in C, but there's no guarantee it is correct for all the
languages gdb supports.  I think it would be better to parse-and-eval,
then use the value API to take the address, the convert using
value_as_address.

Jose> +      catch (const gdb_exception_error &except)
Jose> +	{
Jose> +	  goto error;
Jose> +	}

This will let through gdb_exception_quit, which can happen if the user
types C-c during the evaluation.  You probably want to catch
gdb_exception instead.

Jose> +      if (can_dereference (type))

I didn't know about this function.  It's only used by
annotations... hmm.

Classically gdb lets you dereference a plain int.  Dunno if you want to
support that.

Jose> +  *errmsg = NULL;
Jose> +  return &alien_token;

If this is the only use, the definition might as well be put inside this
function.

Jose> +
Jose> + error:
Jose> +  std::string emsg = "can't access GDB variable '";
Jose> +  emsg += id;
Jose> +  emsg += "'";
Jose> +  *errmsg = xstrdup (emsg.c_str ());

You can use xstrprintf or concat here, instead of a std::string.

Jose> +static std::vector<std::string> poke_keywords
Jose> +  {
Jose> +    "pinned", "struct", "union", "else", "while", "until",
Jose> +    "for", "in", "where", "if", "sizeof", "fun", "method",
Jose> +    "type", "var", "unit", "break", "continue", "return",
Jose> +    "string", "as", "try", "catch", "raise", "void", "any",
Jose> +    "print", "printf", "isa", "unmap", "big", "little",
Jose> +    "load", "lambda", "assert",
Jose> +  };
Jose> +
Jose> +static std::string
Jose> +normalize_poke_identifier (std::string prefix, std::string str)
Jose> +{
Jose> +  if (std::find (poke_keywords.begin (),
Jose> +		 poke_keywords.end (),
Jose> +		 str) != poke_keywords.end ())
Jose> +    str = prefix + str;
Jose> +
Jose> +  return str;
Jose> +}

I think it's probably better to use a plain C array of const char * for
poke_keywords.  Not sure if std::find will work, but an ordinary foreach
is also fine.

I think both arguments can be 'const std::string &' to avoid copying.

Jose> +static std::string
Jose> +gdb_type_name_to_poke (std::string str, struct type *type = NULL)

nullptr

Jose> +{
Jose> +  for (int i = 0; i < str.length (); ++i)
Jose> +    if (!(str.begin()[i] == '_'
Jose> +	  || (str.begin()[i] >= 'a' && str.begin()[i] <= 'z')
Jose> +	  || (str.begin()[i] >= '0' && str.begin()[i] <= '9')
Jose> +	  || (str.begin()[i] >= 'A' && str.begin()[i] <= 'Z')))
Jose> +      str.begin()[i] = '_';

You can just use str[i], no need for .begin().
gdb would use the ctype macros here, or maybe the ones in safe-ctype.h.

Jose> +  if (type)

type != nullptr

Jose> +    {
Jose> +      /* Prepend struct and union tags with suitable prefixes.  This
Jose> +	 is to avoid ending with recursive typedefs in C programs.  */
Jose> +      if (type->code () == TYPE_CODE_STRUCT)
Jose> +	str = "struct_" + str;
Jose> +      else if (type->code () == TYPE_CODE_UNION)
Jose> +	str = "union_" + str;

Before using a type in gdb, you normally have to call check_typedef.
However, I don't know if that's appropriate here or not, since I don't
really know what this is doing.  How does poke handle typedefs?

Jose> +#if 0
Jose> +/* Command to shut down the poke incremental compiler.  */

I guess this is the deletion thing.  Anyway, FAOD, we don't want new
dead code.

What does the incremental compiler do exactly?

Jose> +/* Command to feed the poke compiler with the definition of some given
Jose> +   GDB type.  */
Jose> +
Jose> +static void poke_command (const char *args, int from_tty);
Jose> +
Jose> +static std::string
Jose> +poke_add_type (struct type *type)
Jose> +{

Comment should be just before the function.

Jose> +  std::string str = "";

You can drop the "".

Jose> +      if (type->name ())

!= nullptr

Jose> +	case TYPE_CODE_TYPEDEF:
Jose> +	  {
Jose> +	    struct type *target_type = TYPE_TARGET_TYPE (type);
Jose> +	    std::string target_type_code = poke_add_type (target_type);
Jose> +
Jose> +	    if (target_type_code == "")

Normally you'd use 'target_type_code.empty ()' here.
I think I missed another spot like this.

Jose> +	case TYPE_CODE_INT:

You can probably handle enums like ints.

Jose> +	case TYPE_CODE_ARRAY:
Jose> +	  {
Jose> +	    struct type *target_type = TYPE_TARGET_TYPE (type);
Jose> +	    size_t target_type_length = TYPE_LENGTH (target_type);

You need a check_typedef in here.

Jose> +	    if (target_type->name ())

nullptr

Jose> +	    str += std::to_string (TYPE_LENGTH (type) / target_type_length);

I don't think this will work as you expect.  Normally you need to fetch
the bounds of the array.  This can't always be done, though; for example
in C, a VLA's bounds depend on some value, so the dynamic type can't be
resolved in isolation.

Not sure if this is an issue here or not.  You can probably just skip
dynamically-sized arrays.

Jose> +	case TYPE_CODE_STRUCT:

Don't know if it matters to poke, but unions can often be handled like
structs.

Jose> +	  {
Jose> +	    size_t natural_bitpos = 0;
Jose> +	    str += "struct {";
Jose> +
Jose> +	    for (int idx = 0; idx < type->num_fields (); idx++)
Jose> +	      {
Jose> +		std::string field_name
Jose> +		  = normalize_poke_identifier ("__f", TYPE_FIELD_NAME (type, idx));
Jose> +		struct type *field_type = type->field (idx).type ();
Jose> +		size_t field_bitpos = TYPE_FIELD_BITPOS (type, idx);

You probably want to skip static fields in this loop.

Jose> +		    if (poke_add_type (field_type) == "")
Jose> +		      goto skip;
Jose> +		    str += gdb_type_name_to_poke (field_type->name (), field_type);

I forgot to mention something else ... this seems to assume that a given
name corresponds to a single type.  Is that true?  I wasn't really sure
from the code.  Anyway -- this isn't always the case, it's relatively
normal in C programs to have an opaque type that has a different meaning
in different compilation units.  How will poke handle this?

You may also want this to skip virtual base classes, depending on
whether poke can handle those.

And you may want to skip all dynamic types generally, maybe somewhere a
bit higher up.  Not sure.  These are types whose layout depends on other
values -- not super common in C (just VLA), but relatively more so in
Ada and Rust and maybe Fortran.

Jose> +		natural_bitpos = field_bitpos + TYPE_LENGTH (field_type) * 8;

I didn't understand this.  For example, a bitfield will have an explicit
length that isn't the length of its underlying type.

It's probably better to just remove the "natural_bitpos" logic and defer
to what gdb already knows about the type.  However, I don't know if this
has some kind of bad effect on poke.

Also there may be something weird about bit positions on big-endian systems.
I'm never quite clear on that.

Jose> +	  type_poke_strings.push_back (deftype);
Jose> +	  poke_command (deftype.c_str(), 0 /* from_tty */);

I think it would be better to factor out a common helper instead.

Jose> +	  printf_filtered ("added type %s\n", poke_type_name.c_str ());

Was this some kind of debugging thing, or is it needed?

Jose> + skip:
Jose> +  return "";

I guess all those 'goto's can just be 'return {};'

Jose> +static void
Jose> +start_poke (void)

(void)

Jose> +  if (poke_compiler == NULL)
Jose> +    error (_("Couldn't start the poke incremental compiler."));
Jose> +  poke_compiler_lives = 1;

I was wondering if all the code after this point in this function
matters.  If so, then probably this flag shouldn't be set until then.

Also, it should be bool and use true/false.

Jose> +/* Commands to add GDB types to the running poke compiler.  */
Jose> +
Jose> +static void
Jose> +poke_add_type_command (const char *args, int from_tty)
Jose> +{

Jose> +  std::string type_name = skip_spaces (args);
Jose> +  type_name = gdb_type_name_to_poke (type_name);
Jose> +
Jose> +  expression_up expr = parse_expression (args);

This seems to use 'args' twice for two different things.

Jose> +
Jose> +static void
Jose> +poke_add_types (const char *args, int from_tty)

Intro comment.

Jose> +  std::string symbol_name_regexp = skip_spaces (args);

Better to use a const char * here -- avoids a copy.

Jose> +  int what; /* 0 -> declaration, 1 -> statement */

gdb prefers bool in new code.

Jose> +  const char *end;
Jose> +  std::string cmd;
Jose> +
Jose> +#define IS_COMMAND(input, cmd) \
Jose> +  (strncmp ((input), (cmd), sizeof (cmd) - 1) == 0 \
Jose> +   && ((input)[sizeof (cmd) - 1] == ' ' || (input)[sizeof (cmd) - 1] == '\t'))
Jose> +
Jose> +  args = skip_spaces (args);
Jose> +  if (IS_COMMAND (args, "fun"))
Jose> +  {
Jose> +    what = 0;
Jose> +    cmd = args;
Jose> +  }
Jose> +  else
Jose> +    {
Jose> +      if (IS_COMMAND (args, "var")
Jose> +	  || IS_COMMAND (args, "type")
Jose> +	  || IS_COMMAND (args, "unit"))

One idea here would be to register a command prefix and sub-commands
instead.  This will get completion, abbreviation, and better 'help'
handling.  I get that this is some kind of poke built-in thing, though,
so maybe you'd rather not.

Jose> +
Jose> +  if (what == 0)
Jose> +    {
Jose> +      /* Declaration.  */
Jose> +      if (pk_compile_buffer (poke_compiler, cmd.c_str (), &end) != PK_OK)
Jose> +	goto error;

Normally in gdb, a command will call error() on an error, to stop
execution of command sequences at that point.  Swallowing an error is
sometimes ok, but I wonder if that's the case here.

Does poke itself print something if this failed?
If not, the user should get some indication of the problem.

Jose> +/* Initialize the poke GDB subsystem.  */
Jose> +
Jose> +void _initialize_poke (void);

(void)

Jose> +  /* XXX commands to set poke settings, like the output base.  */

gdb has an output radix, not sure if that's useful.

Tom

  parent reply	other threads:[~2021-05-13 16:59 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-10 15:10 [PATCH 0/1] " Jose E. Marchesi
2021-05-10 15:10 ` [PATCH 1/1] " Jose E. Marchesi
2021-05-10 16:56   ` Eli Zaretskii
2021-05-10 18:49     ` Jose E. Marchesi
2021-05-10 18:52       ` Eli Zaretskii
2021-05-11  7:33   ` Andrew Burgess
2021-05-11 13:07     ` Jose E. Marchesi
2021-05-12  8:52       ` Andrew Burgess
2021-05-12 10:14         ` Jose E. Marchesi
2021-05-13 16:59   ` Tom Tromey [this message]
2021-05-10 18:39 ` [PATCH 0/1] " Simon Marchi
2021-05-10 20:07   ` Jose E. Marchesi
2021-05-11  6:25     ` Andrew Burgess
2021-05-13 17:04   ` Tom Tromey
2021-05-11 18:56 ` Tom Tromey
2021-05-12  8:06   ` Jose E. Marchesi
2021-05-13 15:52     ` Tom Tromey
2021-05-14 20:52       ` Jose E. Marchesi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871raa603x.fsf@tromey.com \
    --to=tom@tromey.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).