From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=qM2S=5D=simark.ca=simark@sourceware.org>
Received: from simark.ca (simark.ca [158.69.221.121])
	by sourceware.org (Postfix) with ESMTPS id 9C43D3858D28
	for <gdb-patches@sourceware.org>; Fri,  6 Jan 2023 19:43:35 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9C43D3858D28
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca
Received: from [172.16.0.64] (192-222-180-24.qc.cable.ebox.net [192.222.180.24])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPSA id 24BFD1E110;
	Fri,  6 Jan 2023 14:43:35 -0500 (EST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail;
	t=1673034215; bh=36CC3vesoOTYLm/5FmoXJ2uhngaNYWbVMNNKXY1uT6A=;
	h=Date:Subject:To:References:From:In-Reply-To:From;
	b=Np7LTQxuyUjgjTkxF2OMiomtpK4Y0mpT5zhqjahRf8Gehk6h+/aBtVJIwmHAr9lHC
	 ry32yNRTVNRaudrCKr97vgybU+KFKvGUzqvPDDDkKTEWmzml6pP5fyNDZ5xBGj9N75
	 5Y9F7t1U+Rh2L5P+lkf/n40NzsdmC1s0Y9a4GJME=
Message-ID: <c9d2dc49-45c7-d6b9-c567-4ec78dd870a0@simark.ca>
Date: Fri, 6 Jan 2023 14:43:34 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.6.1
Subject: Re: [PATCH 1/1] Add extra __repr__() implementations to the Python
 API
Content-Language: fr
To: dark.ryu.550@gmail.com, gdb-patches@sourceware.org
References: <5c5101d92160$a1bc4eb0$e534ec10$@gmail.com>
From: Simon Marchi <simark@simark.ca>
In-Reply-To: <5c5101d92160$a1bc4eb0$e534ec10$@gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gdb-patches.sourceware.org>

Hi,

I can't apply the patch, it looks like it was reformatted by your email
client.  Can you please send it using git-send-email?

On 1/5/23 18:51, dark.ryu.550--- via Gdb-patches wrote:
> Only a few types in the Python API currently have __repr__()
> implementations.
> This patch adds a few more of them. specifically: it adds __repr__() 
> implementations to gdb.Symbol, gdb.Architecture, gdb.Block, gdb.Breakpoint,
> and gdb.Type.
> 
> This makes it easier to play around the GDB Python API in the Python
> interpreter 
> session invoked with the 'pi' command in GDB, giving more easily accessible
> type 
> information to users.

I think it's a good idea.  Could you add some examples to the commit
message, to illustrate?

It would be nice to have a test with this.  Something simple that just
tests one of each would be enough.

> 
> One thing to note about this patch is that it makes use of u8 string
> literals,
> so as to make sure we meet python's expectations of strings passed to it
> using
> PyUnicode_FromFormat being encoded in utf8. This should remove the chance of
> odd compilation environments spitting out strings Python would consider
> invalid
> for the function we're calling.
> 
> ---
> diff --git a/gdb/python/py-symbol.c b/gdb/python/py-symbol.c
> index 93c86964f3e..0f7e5ae137c 100644
> --- a/gdb/python/py-symbol.c
> +++ b/gdb/python/py-symbol.c
> @@ -375,6 +375,16 @@ sympy_dealloc (PyObject *obj)
>    Py_TYPE (obj)->tp_free (obj);
>  }
>  
> +static PyObject *
> +sympy_repr (PyObject *self)

Add a (simple) comment above the function, something like:

/* __repr__ implementation for gdb.Symbol.  */

Same for other instances.

> +{
> +  struct symbol *symbol = NULL;
> +
> +  SYMPY_REQUIRE_VALID (self, symbol);
> +
> +  return PyUnicode_FromFormat("<gdb.Symbol print_name=%s>",
> symbol->print_name());

Space before parentheses (multiple instances throughout the patch).

> +}
> +
>  /* Implementation of
>     gdb.lookup_symbol (name [, block] [, domain]) -> (symbol,
> is_field_of_this)
>     A tuple with 2 elements is always returned.  The first is the symbol
> @@ -732,7 +742,7 @@ PyTypeObject symbol_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  sympy_repr,				  /*tp_repr*/

I don't know how it looks in practice, but in the email, the
/*tp_repr*/ comment is not aligned with the others anymore.  Just make
sure to keep it aligned.

>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index cf0978560f9..e607edb565a 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -319,6 +319,19 @@ archpy_integer_type (PyObject *self, PyObject *args,
> PyObject *kw)
>    return type_to_type_object (type);
>  }
>  
> +static PyObject *
> +archpy_repr (PyObject *self)
> +{
> +  struct gdbarch *gdbarch = NULL;
> +
> +  ARCHPY_REQUIRE_VALID (self, gdbarch);
> +
> +  return PyUnicode_FromFormat(
> +    "<gdb.Architecture arch_name=%s printable_name=%s>",
> +    (gdbarch_bfd_arch_info (gdbarch))->arch_name,
> +    (gdbarch_bfd_arch_info (gdbarch))->printable_name);

Our formatting style would have the parenthesis on the next line.
Also, you have unnecessary extra parentheses around the
gdbarch_bfd_arch_info calls.

  return PyUnicode_FromFormat
    ("<gdb.Arch...",
     gdbarch_bfd_arch_info (gdbarch)->arch_name,
     ...);


> +}
> +
>  /* Implementation of gdb.architecture_names().  Return a list of all the
>     BFD architecture names that GDB understands.  */
>  
> @@ -391,7 +404,7 @@ PyTypeObject arch_object_type = {
>    0,                                  /* tp_getattr */
>    0,                                  /* tp_setattr */
>    0,                                  /* tp_compare */
> -  0,                                  /* tp_repr */
> +  archpy_repr,                                  /* tp_repr */
>    0,                                  /* tp_as_number */
>    0,                                  /* tp_as_sequence */
>    0,                                  /* tp_as_mapping */
> diff --git a/gdb/python/py-block.c b/gdb/python/py-block.c
> index b9aea3aca69..abf2bae353d 100644
> --- a/gdb/python/py-block.c
> +++ b/gdb/python/py-block.c
> @@ -23,6 +23,7 @@
>  #include "symtab.h"
>  #include "python-internal.h"
>  #include "objfiles.h"
> +#include <sstream>
>  
>  struct block_object {
>    PyObject_HEAD
> @@ -424,6 +425,30 @@ blpy_iter_is_valid (PyObject *self, PyObject *args)
>    Py_RETURN_TRUE;
>  }
>  
> +static PyObject *
> +blpy_repr (PyObject *self)
> +{
> +  const struct block *block;
> +  BLPY_REQUIRE_VALID(self, block);
> +
> +  const auto name = block->function() ? block->function()->print_name() :
> "<anonymous>";
> +
> +  auto iter = block_iterator{};

We don't really use this style, we typically use the more classic:

  block_iterator iter;

I don't think there's a need to zero-initialize it before calling
block_iterator_first.

> +  block_iterator_first(block, &iter);
> +
> +  std::stringstream ss;
> +  const struct symbol *symbol;
> +  while ((symbol = block_iterator_next(&iter)) != NULL)

NULL -> nullptr

> +  {
> +    ss << std::endl;
> +    ss << symbol->print_name() << ",";
> +  }
> +  if (!ss.str().empty())
> +    ss << std::endl;
> +
> +  return PyUnicode_FromFormat("<gdb.Block %s {%s}>", name,
> ss.str().c_str());
> +}
> +
>  int
>  gdbpy_initialize_blocks (void)
>  {
> @@ -486,7 +511,7 @@ PyTypeObject block_object_type = {
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  blpy_repr,				  /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    &block_object_as_mapping,	  /*tp_as_mapping*/
> diff --git a/gdb/python/py-breakpoint.c b/gdb/python/py-breakpoint.c
> index de7b9f4266b..bc80b5eec0e 100644
> --- a/gdb/python/py-breakpoint.c
> +++ b/gdb/python/py-breakpoint.c
> @@ -33,6 +33,7 @@
>  #include "location.h"
>  #include "py-event.h"
>  #include "linespec.h"
> +#include <sstream>
>  
>  extern PyTypeObject breakpoint_location_object_type
>      CPYCHECKER_TYPE_OBJECT_FOR_TYPEDEF ("breakpoint_location_object");
> @@ -967,6 +968,20 @@ bppy_init (PyObject *self, PyObject *args, PyObject
> *kwargs)
>    return 0;
>  }
>  
> +static PyObject *
> +bppy_repr(PyObject *self)
> +{
> +  const struct gdbpy_breakpoint_object* bp = (struct
> gdbpy_breakpoint_object*) self;
> +  BPPY_REQUIRE_VALID(bp);
> +
> +  return PyUnicode_FromFormat(
> +    "<gdb.Breakpoint number=%d thread=%d hits=%d enable_count=%d>",
> +    bp->number,
> +    bp->bp->thread,
> +    bp->bp->hit_count,
> +    bp->bp->enable_count);
> +}
> +
>  /* Append to LIST the breakpoint Python object associated to B.
>  
>     Return true on success.  Return false on failure, with the Python error
> @@ -1389,7 +1404,7 @@ PyTypeObject breakpoint_object_type =
>    0,				  /*tp_getattr*/
>    0,				  /*tp_setattr*/
>    0,				  /*tp_compare*/
> -  0,				  /*tp_repr*/
> +  bppy_repr,				  /*tp_repr*/
>    0,				  /*tp_as_number*/
>    0,				  /*tp_as_sequence*/
>    0,				  /*tp_as_mapping*/
> @@ -1604,6 +1619,31 @@ bplocpy_dealloc (PyObject *py_self)
>    Py_TYPE (py_self)->tp_free (py_self);
>  }
>  
> +static PyObject *
> +bplocpy_repr (PyObject *py_self)
> +{
> +  auto *self = (gdbpy_breakpoint_location_object *) py_self;
> +  BPPY_REQUIRE_VALID (self->owner);
> +  BPLOCPY_REQUIRE_VALID (self->owner, self);
> +
> +  const auto enabled = self->bp_loc->enabled ? u8"enabled " : u8"disabled
> ";
> +
> +  std::stringstream ss;
> +  ss << std::endl << enabled << std::endl;
> +  ss << u8"requested_address=" << self->bp_loc->requested_address << u8" ";
> +  ss << u8"actual_address=" << self->bp_loc->address << u8" " << std::endl;

I don't really understand why the "u8"s are needed here.

> +  if (self->bp_loc->symtab)

Whenever checking pointers, we use "== nullptr" / "!= nullptr".

> +  {
> +    ss << u8"file=" << self->bp_loc->symtab->filename << u8":" <<
> self->bp_loc->line_number << u8"" << std::endl;
> +  }
> +
> +  const auto fn_name = self->bp_loc->function_name.get ();
> +  if (fn_name != nullptr)
> +    ss << "in "<< fn_name << " " << std::endl;
> + 
> +  return PyUnicode_FromFormat("<gdb.BreakpointLocation %s>",
> ss.str().c_str());
> +}
> +
>  /* Attribute get/set Python definitions. */
>  
>  static gdb_PyGetSetDef bp_location_object_getset[] = {
> @@ -1635,7 +1675,7 @@ PyTypeObject breakpoint_location_object_type =
>    0,					/*tp_getattr*/
>    0,					/*tp_setattr*/
>    0,					/*tp_compare*/
> -  0,					/*tp_repr*/
> +  bplocpy_repr,					/*tp_repr*/
>    0,					/*tp_as_number*/
>    0,					/*tp_as_sequence*/
>    0,					/*tp_as_mapping*/
> 
> diff --git a/gdb/python/py-type.c b/gdb/python/py-type.c
> index 928efacfe8a..01095c16eef 100644
> --- a/gdb/python/py-type.c
> +++ b/gdb/python/py-type.c
> @@ -442,6 +442,7 @@ typy_is_signed (PyObject *self, void *closure)
>      Py_RETURN_TRUE;
>  }
>  
> +
>  /* Return the type, stripped of typedefs. */
>  static PyObject *
>  typy_strip_typedefs (PyObject *self, PyObject *args)
> @@ -1026,6 +1027,62 @@ typy_template_argument (PyObject *self, PyObject
> *args)
>    return value_to_value_object (val);
>  }
>  
> +static PyObject *
> +typy_repr (PyObject *self)
> +{
> +  auto type = type_object_to_type (self);
> +	
> +	std::string kind;
> +	switch(type->code())
> +    {
> +    case TYPE_CODE_INT: kind = u8"integer"; break;
> +    case TYPE_CODE_PTR: kind = u8"pointer"; break;
> +    case TYPE_CODE_ARRAY: kind = u8"array"; break;
> +    case TYPE_CODE_STRUCT: kind = u8"struct"; break;
> +    case TYPE_CODE_UNION: kind = u8"union"; break;
> +    case TYPE_CODE_ENUM: kind = u8"enum"; break;
> +    case TYPE_CODE_FLAGS: kind = u8"bitflags"; break;
> +    case TYPE_CODE_FUNC: kind = u8"function"; break;
> +    case TYPE_CODE_FLT: kind = u8"float"; break;
> +    case TYPE_CODE_VOID: kind = u8"void"; break;
> +    case TYPE_CODE_SET: kind = u8"set"; break;
> +    case TYPE_CODE_RANGE: kind = u8"range"; break;
> +    case TYPE_CODE_STRING: kind = u8"string"; break;
> +    case TYPE_CODE_ERROR: kind = u8"<error>"; break;
> +    case TYPE_CODE_METHOD: kind = u8"method"; break;
> +    case TYPE_CODE_METHODPTR: kind = u8"method pointer"; break;
> +    case TYPE_CODE_MEMBERPTR: kind = u8"member value pointer"; break;
> +    case TYPE_CODE_RVALUE_REF: kind = u8"rvalue reference"; break;
> +    case TYPE_CODE_CHAR: kind = u8"character"; break;
> +    case TYPE_CODE_BOOL: kind = u8"boolean"; break;
> +    case TYPE_CODE_COMPLEX: kind = u8"complex"; break;
> +    case TYPE_CODE_TYPEDEF: kind = u8"typedef"; break;
> +    case TYPE_CODE_NAMESPACE: kind = u8"namespace"; break;
> +    case TYPE_CODE_DECFLOAT: kind = u8"decimal float"; break;
> +    case TYPE_CODE_MODULE: kind = u8"module"; break;
> +    case TYPE_CODE_INTERNAL_FUNCTION: kind = u8"internal function"; break;
> +    case TYPE_CODE_XMETHOD: kind = u8"cross method"; break;
> +    case TYPE_CODE_FIXED_POINT: kind = u8"fixed point"; break;
> +    case TYPE_CODE_NAMELIST: kind = u8"namelist"; break;
> +    }

I'm wondering if it would be better to use the "TYPE_CODE_FOO" form
instead.  This is the form that is exposed to the user after all, so it
would be nice if it matched:

  https://sourceware.org/gdb/onlinedocs/gdb/Types-In-Python.html

And those strings are already available in the pyty_codes array, so it
would spare us the need for this switch.

Simon