public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb/python: add gdb.Architecture.format_address
@ 2022-02-11 16:17 Andrew Burgess
  2022-02-11 18:54 ` Eli Zaretskii
  2022-03-04 10:50 ` [PATCHv2] " Andrew Burgess
  0 siblings, 2 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-02-11 16:17 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

From: Andrew Burgess <andrew.burgess@embecosm.com>

Add a new method gdb.Architecture.format_address, which is a wrapper
around GDB's print_address function.

This method takes an address, and returns a string that represents the
address within the currently selected inferior's address space,
formatted like this '0x.... <symbol+offset>'.  If there's no symbol to
associate with the address then the returned string just contains the
address (this is just print_address behaviour).

This is useful if a user wants to write a Python script that
pretty-print addresses, the user no longer needs to do manual symbol
lookup.
---
 gdb/NEWS                             |  5 +++++
 gdb/doc/python.texi                  | 19 +++++++++++++++++
 gdb/python/py-arch.c                 | 31 ++++++++++++++++++++++++++++
 gdb/testsuite/gdb.python/py-arch.exp | 15 ++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/gdb/NEWS b/gdb/NEWS
index e173d38c3a1..4f4f0c2af6d 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -187,6 +187,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
 
 GNU/Linux/OpenRISC		or1k*-*-linux*
 
+  ** New function gdb.Architecture.format_address(ADDRESS), which
+     takes an address in the currently selected inferior's address
+     space, and returns a string representing the address.  The format
+     of the returned string is '0x.... <symbol+offset>'.
+
 *** Changes in GDB 11
 
 * The 'set disassembler-options' command now supports specifying options
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index c1a3f5f2a7e..50443f7b704 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6016,6 +6016,25 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Returns @var{address}, an address within the currently selected
+inferior's address space, formatted as a string.  When a suitable
+symbol can be found to associate with @var{address} this will be
+included in the returned string, formatted like this:
+
+@smallexample
+0x00001042 <symbol+16>
+@end smallexample
+
+If there is no symbol that @value{GDBN} can find to associate with
+@var{address} then the returned string will just contain
+@var{address}.
+
+If @var{address} is not accessible within the current inferior's
+address space, this function will still return a string containing
+@var{address}.
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
  return list.release ();
 }
 
+/* Implement gdb.architecture.format_address(ADDR).  Provide access to
+   GDB's print_address function from Python.  The returned address will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string.  The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..c4854033d8c 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+    "Got: $main_addr <main>" \
+    "gdb.Architecture.format_address, result should have no offset"
+set next_addr [format 0x%x [expr $main_addr + 1]]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+    "Got: $next_addr <main\\+1>" \
+    "gdb.Architecture.format_address, result should have an offset"
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-11 16:17 [PATCH] gdb/python: add gdb.Architecture.format_address Andrew Burgess
@ 2022-02-11 18:54 ` Eli Zaretskii
  2022-02-21 17:27   ` Andrew Burgess
  2022-03-04 10:50 ` [PATCHv2] " Andrew Burgess
  1 sibling, 1 reply; 20+ messages in thread
From: Eli Zaretskii @ 2022-02-11 18:54 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> Date: Fri, 11 Feb 2022 16:17:21 +0000
> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Andrew Burgess <andrew.burgess@embecosm.com>
> 
> diff --git a/gdb/NEWS b/gdb/NEWS
> index e173d38c3a1..4f4f0c2af6d 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -187,6 +187,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
>  
>  GNU/Linux/OpenRISC		or1k*-*-linux*
>  
> +  ** New function gdb.Architecture.format_address(ADDRESS), which
> +     takes an address in the currently selected inferior's address
> +     space, and returns a string representing the address.  The format
> +     of the returned string is '0x.... <symbol+offset>'.
> +
>  *** Changes in GDB 11
>  
>  * The 'set disassembler-options' command now supports specifying options
> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
> index c1a3f5f2a7e..50443f7b704 100644
> --- a/gdb/doc/python.texi
> +++ b/gdb/doc/python.texi
> @@ -6016,6 +6016,25 @@
>  @code{gdb.Architecture}.
>  @end defun
>  
> +@defun Architecture.format_address (@var{address})
> +Returns @var{address}, an address within the currently selected

Our style is to say "Return", not "Returns".

Also, saying "return ADDRESS" basically misses the main rationale of
this function, I think; see below.

> +inferior's address space, formatted as a string.  When a suitable
> +symbol can be found to associate with @var{address} this will be
> +included in the returned string, formatted like this:
> +
> +@smallexample
> +0x00001042 <symbol+16>
> +@end smallexample
> +
> +If there is no symbol that @value{GDBN} can find to associate with
> +@var{address} then the returned string will just contain
> +@var{address}.
> +
> +If @var{address} is not accessible within the current inferior's
> +address space, this function will still return a string containing
> +@var{address}.
> +@end defun

More generally, I wonder whether the name "format_address" is the best
one we could come up with.  Isn't this the equivalent of "info symbol"
CLI command?  If so, why not call it "address_to_symbol" or somesuch?

This goes back to the documentation: saying that a method takes its
argument and returns it as a string makes the reader wonder why would
we need such a trivial method.  So the documentation should start by
saying that the method returns SYMBOL+OFFSET that corresponds to
ADDRESS, and only mention that it returns ADDRESS as a string as the
fallback, when SYMBOL cannot be found.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-11 18:54 ` Eli Zaretskii
@ 2022-02-21 17:27   ` Andrew Burgess
  2022-02-21 18:02     ` Eli Zaretskii
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Burgess @ 2022-02-21 17:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> Date: Fri, 11 Feb 2022 16:17:21 +0000
>> From: Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Andrew Burgess <andrew.burgess@embecosm.com>
>> 
>> diff --git a/gdb/NEWS b/gdb/NEWS
>> index e173d38c3a1..4f4f0c2af6d 100644
>> --- a/gdb/NEWS
>> +++ b/gdb/NEWS
>> @@ -187,6 +187,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
>>  
>>  GNU/Linux/OpenRISC		or1k*-*-linux*
>>  
>> +  ** New function gdb.Architecture.format_address(ADDRESS), which
>> +     takes an address in the currently selected inferior's address
>> +     space, and returns a string representing the address.  The format
>> +     of the returned string is '0x.... <symbol+offset>'.
>> +
>>  *** Changes in GDB 11
>>  
>>  * The 'set disassembler-options' command now supports specifying options
>> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
>> index c1a3f5f2a7e..50443f7b704 100644
>> --- a/gdb/doc/python.texi
>> +++ b/gdb/doc/python.texi
>> @@ -6016,6 +6016,25 @@
>>  @code{gdb.Architecture}.
>>  @end defun
>>  
>> +@defun Architecture.format_address (@var{address})
>> +Returns @var{address}, an address within the currently selected
>
> Our style is to say "Return", not "Returns".
>
> Also, saying "return ADDRESS" basically misses the main rationale of
> this function, I think; see below.

I've completely rewritten both the /doc/ entry, and the NEWS entry based
on your feedback.

>
>> +inferior's address space, formatted as a string.  When a suitable
>> +symbol can be found to associate with @var{address} this will be
>> +included in the returned string, formatted like this:
>> +
>> +@smallexample
>> +0x00001042 <symbol+16>
>> +@end smallexample
>> +
>> +If there is no symbol that @value{GDBN} can find to associate with
>> +@var{address} then the returned string will just contain
>> +@var{address}.
>> +
>> +If @var{address} is not accessible within the current inferior's
>> +address space, this function will still return a string containing
>> +@var{address}.
>> +@end defun
>
> More generally, I wonder whether the name "format_address" is the best
> one we could come up with.  Isn't this the equivalent of "info symbol"
> CLI command?  If so, why not call it "address_to_symbol" or somesuch?

Given we have an actual gdb.Symbol class, I'm reluctant to use
address_to_symbol because I think that might give the unrealistic
expectation that this function returns a gdb.Symbol object.

This function is really a wrapper around the internal function
'print_address', but, as the Python version doesn't print anything, I
ended up with format_address.

I'd also be reluctant to go with address_to_string as that might give
the impression that it "just" converts a number to an string, which
obviously would be a really weird thing to have as a separate function
in Python.

I'm certainly not against renaming, if we can come up with a better
name... maybe 'format_address_info'?  I don't know... I still kind of
like 'format_address'...

>
> This goes back to the documentation: saying that a method takes its
> argument and returns it as a string makes the reader wonder why would
> we need such a trivial method.  So the documentation should start by
> saying that the method returns SYMBOL+OFFSET that corresponds to
> ADDRESS, and only mention that it returns ADDRESS as a string as the
> fallback, when SYMBOL cannot be found.

Thanks, I took this advice.  I'm not sure if my use of @samp{} is
acceptable in the new docs - maybe there's better formatting constructs
I could/should use.

Thanks,
Andrew

---

commit eba54d7150c8d87b34db41594ab4f6aef95cd847
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Sat Oct 23 09:59:25 2021 +0100

    gdb/python: add gdb.Architecture.format_address
    
    Add a new method gdb.Architecture.format_address, which is a wrapper
    around GDB's print_address function.
    
    This method takes an address, and returns a string with the format:
    
      ADDRESS <SYMBOL+OFFSET>
    
    Where, ADDRESS is the original address, formatted as hexadecimal, and
    padded with zeros on the left up to the width of an address in the
    current architecture.
    
    SYMBOL is a symbol whose address range covers ADDRESS, and OFFSET is
    the offset from SYMBOL to ADDRESS in decimal.
    
    If there's no SYMBOL whose address range covers ADDRESS, then the
    <SYMBOL+OFFSET> part is not included.
    
    This is useful if a user wants to write a Python script that
    pretty-print addresses, the user no longer needs to do manual symbol
    lookup, and additionally, things like the zero padding on addresses
    will be consistent with the builtin GDB behaviour.

diff --git a/gdb/NEWS b/gdb/NEWS
index 9da74e71796..c4e31188fdb 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -185,6 +185,12 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
      set styling').  When false, which is the default if the argument
      is not given, then no styling is applied to the returned string.
 
+  ** New function gdb.Architecture.format_address(ADDRESS), that takes
+     an address, and returns a string formatted as:
+       ADDRESS <SYMBOL+OFFSET>
+     This is the same format that GDB uses when printing address,
+     symbol, and offset information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index c1a3f5f2a7e..a095d055807 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6016,6 +6016,30 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Return a string in the format @samp{ADDRESS <SYMBOL+OFFSET>}, where
+@samp{ADDRESS} is @var{address} formatted in hexadecimal,
+@samp{SYMBOL} is a symbol, the address range of which, covers
+@var{address}, and @samp{OFFSET} is the offset from @samp{SYMBOL} to
+@var{address} in decimal.  This is the same format that @value{GDBN}
+uses when printing address, symbol, and offset information, for
+example, within disassembler output.
+
+If no @samp{SYMBOL} has an address range that covers @var{address},
+then the @samp{<SYMBOL+OFFSET>} part is not included in the returned
+string, instead the returned string will just contain the
+@var{address} formatted as hexadecimal.
+
+In all cases, the @samp{ADDRESS} component will be padded with leading
+zeros based on the width of an address for the current architecture.
+
+An example of the returned string is:
+
+@smallexample
+0x00001042 <symbol+16>
+@end smallexample
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
  return list.release ();
 }
 
+/* Implement gdb.architecture.format_address(ADDR).  Provide access to
+   GDB's print_address function from Python.  The returned address will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string.  The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..c4854033d8c 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+    "Got: $main_addr <main>" \
+    "gdb.Architecture.format_address, result should have no offset"
+set next_addr [format 0x%x [expr $main_addr + 1]]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+    "Got: $next_addr <main\\+1>" \
+    "gdb.Architecture.format_address, result should have an offset"
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-21 17:27   ` Andrew Burgess
@ 2022-02-21 18:02     ` Eli Zaretskii
  2022-02-22 13:56       ` Andrew Burgess
  0 siblings, 1 reply; 20+ messages in thread
From: Eli Zaretskii @ 2022-02-21 18:02 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Mon, 21 Feb 2022 17:27:21 +0000
> 
> I'm certainly not against renaming, if we can come up with a better
> name... maybe 'format_address_info'?  I don't know... I still kind of
> like 'format_address'...

I hope someone will come up with a better name.

> +@defun Architecture.format_address (@var{address})
> +Return a string in the format @samp{ADDRESS <SYMBOL+OFFSET>}, where
> +@samp{ADDRESS} is @var{address} formatted in hexadecimal,
> +@samp{SYMBOL} is a symbol, the address range of which, covers
> +@var{address}, and @samp{OFFSET} is the offset from @samp{SYMBOL} to
> +@var{address} in decimal.  This is the same format that @value{GDBN}
> +uses when printing address, symbol, and offset information, for
> +example, within disassembler output.
> +
> +If no @samp{SYMBOL} has an address range that covers @var{address},
> +then the @samp{<SYMBOL+OFFSET>} part is not included in the returned
> +string, instead the returned string will just contain the
> +@var{address} formatted as hexadecimal.
> +
> +In all cases, the @samp{ADDRESS} component will be padded with leading
> +zeros based on the width of an address for the current architecture.

This is okay, but needs the markup fixed.  All the places where you
use SOMETHING ("ADDRESS", "SYMBOL", etc.) should be @var{something},
i.e. have the @var markup and be in lower-case.  (To prevent confusion
with the argument @var{address}, call it something else, like
@var{addr}.)  Also, remove @samp everywhere except this single
instance:

   @samp{@var{address} <@var{symbol}+@var{offset}>}

And finally, this text:

> @samp{SYMBOL} is a symbol, the address range of which, covers
> +@var{address}

is better worded as

  @var{symbol} is the symbol to which @var{addr} belongs

Btw, is the above accurate? does GDB really guarantee that ADDR is
p[art of SYMBOL's memory? or does it just find the closest symbol
whose address is smaller than ADDR?

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-21 18:02     ` Eli Zaretskii
@ 2022-02-22 13:56       ` Andrew Burgess
  2022-02-22 14:48         ` Eli Zaretskii
  2022-03-03 18:35         ` Craig Blackmore
  0 siblings, 2 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-02-22 13:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrew Burgess <aburgess@redhat.com>
>> Cc: gdb-patches@sourceware.org
>> Date: Mon, 21 Feb 2022 17:27:21 +0000
>> 
>> I'm certainly not against renaming, if we can come up with a better
>> name... maybe 'format_address_info'?  I don't know... I still kind of
>> like 'format_address'...
>
> I hope someone will come up with a better name.
>
>> +@defun Architecture.format_address (@var{address})
>> +Return a string in the format @samp{ADDRESS <SYMBOL+OFFSET>}, where
>> +@samp{ADDRESS} is @var{address} formatted in hexadecimal,
>> +@samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}, and @samp{OFFSET} is the offset from @samp{SYMBOL} to
>> +@var{address} in decimal.  This is the same format that @value{GDBN}
>> +uses when printing address, symbol, and offset information, for
>> +example, within disassembler output.
>> +
>> +If no @samp{SYMBOL} has an address range that covers @var{address},
>> +then the @samp{<SYMBOL+OFFSET>} part is not included in the returned
>> +string, instead the returned string will just contain the
>> +@var{address} formatted as hexadecimal.
>> +
>> +In all cases, the @samp{ADDRESS} component will be padded with leading
>> +zeros based on the width of an address for the current architecture.
>
> This is okay, but needs the markup fixed.  All the places where you
> use SOMETHING ("ADDRESS", "SYMBOL", etc.) should be @var{something},
> i.e. have the @var markup and be in lower-case.  (To prevent confusion
> with the argument @var{address}, call it something else, like
> @var{addr}.)  Also, remove @samp everywhere except this single
> instance:
>
>    @samp{@var{address} <@var{symbol}+@var{offset}>}
>
> And finally, this text:
>
>> @samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}
>
> is better worded as
>
>   @var{symbol} is the symbol to which @var{addr} belongs
>
> Btw, is the above accurate? does GDB really guarantee that ADDR is
> p[art of SYMBOL's memory? or does it just find the closest symbol
> whose address is smaller than ADDR?

You're absolutely correct.

While checking the code to see how this stuff actually works I
discovered a few more interesting settings that I felt were worth
mentioning in the docs for this function.

So, appologies, but this is another complete rewrite of the docs.  With
the exception of the name, for which I still have no better suggestions,
how's this?

Thanks,
Andrew

---

commit dc104a9f22e3d5baf1b87b344acbfae010bac168
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Sat Oct 23 09:59:25 2021 +0100

    gdb/python: add gdb.Architecture.format_address
    
    Add a new method gdb.Architecture.format_address, which is a wrapper
    around GDB's print_address function.
    
    This method takes an address, and returns a string with the format:
    
      ADDRESS <SYMBOL+OFFSET>
    
    Where, ADDRESS is the original address, formatted as hexadecimal, and
    padded with zeros on the left up to the width of an address in the
    current architecture.
    
    SYMBOL is a symbol whose address range covers ADDRESS, and OFFSET is
    the offset from SYMBOL to ADDRESS in decimal.
    
    If there's no SYMBOL whose address range covers ADDRESS, then the
    <SYMBOL+OFFSET> part is not included.
    
    This is useful if a user wants to write a Python script that
    pretty-print addresses, the user no longer needs to do manual symbol
    lookup, and additionally, things like the zero padding on addresses
    will be consistent with the builtin GDB behaviour.

diff --git a/gdb/NEWS b/gdb/NEWS
index 9da74e71796..b012e0c562b 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -185,6 +185,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
      set styling').  When false, which is the default if the argument
      is not given, then no styling is applied to the returned string.
 
+  ** New function gdb.Architecture.format_address(ADDRESS), that
+     formats ADDRESS as 'address <symbol+offset>', this is the same
+     format that GDB uses when printing address, symbol, and offset
+     information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index c1a3f5f2a7e..a786d52b092 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6016,6 +6016,42 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Return a string in the format @samp{@var{addr}
+<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
+formatted in hexadecimal, @var{symbol} is the closest earlier symbol
+to @var{address}, and @var{offset} is the offset from @var{symbol} to
+@var{address} in decimal.
+
+If no suitable @var{symbol} was found, then the
+<@var{symbol}+@var{offset}> part is not included in the returned
+string, instead the returned string will just contain the
+@var{address} formatted as hexadecimal.  How far @value{GDBN} looks
+back for a suitable symbol can be controlled with @kbd{set print
+max-symbolic-offset} (@pxref{Print Settings}).
+
+Additionally, the returned string can include file name and line
+number information when @kbd{set print symbol-filename on}
+(@pxref{Print Settings}), in this case the format of the returned
+string is @samp{@var{addr} <@var{symbol}+@var{offset}> at
+@var{filename}:@var{line-number}}.
+
+In all cases, the @var{addr} component will be padded with leading
+zeros based on the width of an address for the current architecture.
+
+This method uses the same mechanism for formatting address, symbol,
+and offset information as core @value{GDBN} does in commands such as
+@kbd{disassemble}.
+
+Here are some examples of the possible string formats:
+
+@smallexample
+0x00001042
+0x00001042 <symbol+16>
+0x00001042 <symbol+16 at file.c:123>
+@end smallexample
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
  return list.release ();
 }
 
+/* Implement gdb.architecture.format_address(ADDR).  Provide access to
+   GDB's print_address function from Python.  The returned address will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string.  The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..c4854033d8c 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+    "Got: $main_addr <main>" \
+    "gdb.Architecture.format_address, result should have no offset"
+set next_addr [format 0x%x [expr $main_addr + 1]]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+    "Got: $next_addr <main\\+1>" \
+    "gdb.Architecture.format_address, result should have an offset"
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-22 13:56       ` Andrew Burgess
@ 2022-02-22 14:48         ` Eli Zaretskii
  2022-02-23 14:20           ` Andrew Burgess
  2022-03-03 18:35         ` Craig Blackmore
  1 sibling, 1 reply; 20+ messages in thread
From: Eli Zaretskii @ 2022-02-22 14:48 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

> From: Andrew Burgess <aburgess@redhat.com>
> Cc: gdb-patches@sourceware.org
> Date: Tue, 22 Feb 2022 13:56:17 +0000
> 
> So, appologies, but this is another complete rewrite of the docs.  With
> the exception of the name, for which I still have no better suggestions,
> how's this?

Let's see...

> +@defun Architecture.format_address (@var{address})
> +Return a string in the format @samp{@var{addr}
> +<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
> +formatted in hexadecimal, @var{symbol} is the closest earlier symbol
> +to @var{address}, and @var{offset} is the offset from @var{symbol} to
> +@var{address} in decimal.

I think instead of "closest earlier" it would be best to say

  ... @var{symbol} is the symbol whose address is the nearest to
  @var{address} and below it in memory ...

Other than that, SGTM, thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-22 14:48         ` Eli Zaretskii
@ 2022-02-23 14:20           ` Andrew Burgess
  2022-03-03 16:49             ` Andrew Burgess
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Burgess @ 2022-02-23 14:20 UTC (permalink / raw)
  To: gdb-patches

Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:

>> From: Andrew Burgess <aburgess@redhat.com>
>> Cc: gdb-patches@sourceware.org
>> Date: Tue, 22 Feb 2022 13:56:17 +0000
>> 
>> So, appologies, but this is another complete rewrite of the docs.  With
>> the exception of the name, for which I still have no better suggestions,
>> how's this?
>
> Let's see...
>
>> +@defun Architecture.format_address (@var{address})
>> +Return a string in the format @samp{@var{addr}
>> +<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
>> +formatted in hexadecimal, @var{symbol} is the closest earlier symbol
>> +to @var{address}, and @var{offset} is the offset from @var{symbol} to
>> +@var{address} in decimal.
>
> I think instead of "closest earlier" it would be best to say
>
>   ... @var{symbol} is the symbol whose address is the nearest to
>   @var{address} and below it in memory ...
>
> Other than that, SGTM, thanks.

Thanks for all your feedback.  I've made this change locally, and it
will be included once someone has reviewed the rest of the code.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-23 14:20           ` Andrew Burgess
@ 2022-03-03 16:49             ` Andrew Burgess
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-03 16:49 UTC (permalink / raw)
  To: gdb-patches


Ping!

Thanks,
Andrew

Andrew Burgess <aburgess@redhat.com> writes:

> Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org> writes:
>
>>> From: Andrew Burgess <aburgess@redhat.com>
>>> Cc: gdb-patches@sourceware.org
>>> Date: Tue, 22 Feb 2022 13:56:17 +0000
>>> 
>>> So, appologies, but this is another complete rewrite of the docs.  With
>>> the exception of the name, for which I still have no better suggestions,
>>> how's this?
>>
>> Let's see...
>>
>>> +@defun Architecture.format_address (@var{address})
>>> +Return a string in the format @samp{@var{addr}
>>> +<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
>>> +formatted in hexadecimal, @var{symbol} is the closest earlier symbol
>>> +to @var{address}, and @var{offset} is the offset from @var{symbol} to
>>> +@var{address} in decimal.
>>
>> I think instead of "closest earlier" it would be best to say
>>
>>   ... @var{symbol} is the symbol whose address is the nearest to
>>   @var{address} and below it in memory ...
>>
>> Other than that, SGTM, thanks.
>
> Thanks for all your feedback.  I've made this change locally, and it
> will be included once someone has reviewed the rest of the code.
>
> Thanks,
> Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-02-22 13:56       ` Andrew Burgess
  2022-02-22 14:48         ` Eli Zaretskii
@ 2022-03-03 18:35         ` Craig Blackmore
  2022-03-04 10:51           ` Andrew Burgess
  1 sibling, 1 reply; 20+ messages in thread
From: Craig Blackmore @ 2022-03-03 18:35 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

Hi Andrew,

On 22/02/2022 13:56, Andrew Burgess via Gdb-patches wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Andrew Burgess <aburgess@redhat.com>
>>> Cc: gdb-patches@sourceware.org
>>> Date: Mon, 21 Feb 2022 17:27:21 +0000
>>>
>>> I'm certainly not against renaming, if we can come up with a better
>>> name... maybe 'format_address_info'?  I don't know... I still kind of
>>> like 'format_address'...
>> I hope someone will come up with a better name.
format_address seems ok to me and I couldn't come up with anything better.
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index 0f273b344e4..95ae931e73e 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
>    return list.release ();
>   }
>   
> +/* Implement gdb.architecture.format_address(ADDR).  Provide access to
> +   GDB's print_address function from Python.  The returned address will
> +   have the format '0x..... <symbol+offset>'.  */
> +
> +static PyObject *
> +archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *keywords[] = { "address", nullptr };
> +  PyObject *addr_obj;
> +  CORE_ADDR addr;
> +  struct gdbarch *gdbarch = nullptr;
> +
> +  ARCHPY_REQUIRE_VALID (self, gdbarch);
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
> +    return nullptr;
> +
> +  if (get_addr_from_python (addr_obj, &addr) < 0)
> +    return nullptr;
> +
> +  string_file buf;
> +  print_address (gdbarch, addr, &buf);
> +  return PyString_FromString (buf.c_str ());
> +}
> +
>   void _initialize_py_arch ();
>   void
>   _initialize_py_arch ()
> @@ -391,6 +416,12 @@ group GROUP-NAME." },
>       METH_NOARGS,
>       "register_groups () -> Iterator.\n\
>   Return an iterator over all of the register groups in this architecture." },
> +  { "format_address", (PyCFunction) archpy_format_address,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "format_address (ADDRESS) -> String.\n\
> +Format ADDRESS, an address within the currently selected inferior's\n\
> +address space, as a string.  The format of the returned string is\n\
> +'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
>     {NULL}  /* Sentinel */
>   };
>   
> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
> index b55778b0b72..c4854033d8c 100644
> --- a/gdb/testsuite/gdb.python/py-arch.exp
> +++ b/gdb/testsuite/gdb.python/py-arch.exp
> @@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
>       }
>   }
>   gdb_assert { $lists_match }
> +
> +# Check the gdb.Architecture.format_address method.
> +set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
> +gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
> +    "Got: $main_addr <main>" \
> +    "gdb.Architecture.format_address, result should have no offset"
> +set next_addr [format 0x%x [expr $main_addr + 1]]
> +gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
> +    "Got: $next_addr <main\\+1>" \
> +    "gdb.Architecture.format_address, result should have an offset"
> +if {![is_address_zero_readable]} {
> +    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
> +	"Got: 0x0" \
> +	"gdb.Architecture.format_address for address 0"
> +}
>
The code LGTM. Just one query, is it necessary to add a test case for 
printing the filename and line number when we have `set print 
symbol-filename on`?

Thanks,

Craig


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCHv2] gdb/python: add gdb.Architecture.format_address
  2022-02-11 16:17 [PATCH] gdb/python: add gdb.Architecture.format_address Andrew Burgess
  2022-02-11 18:54 ` Eli Zaretskii
@ 2022-03-04 10:50 ` Andrew Burgess
  2022-03-04 15:22   ` Simon Marchi
  2022-03-07 12:33   ` [PATCHv3] gdb/python: add gdb.format_address function Andrew Burgess
  1 sibling, 2 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-04 10:50 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

From: Andrew Burgess <andrew.burgess@embecosm.com>

Since v1:

 - All the doc changes have been addressed (these have already been
   approved),

 - Rebased to current master,

 - Added an additional test for when 'set print symbol-filename' is
   on.

Thanks,
Andrew


---

Add a new method gdb.Architecture.format_address, which is a wrapper
around GDB's print_address function.

This method takes an address, and returns a string with the format:

  ADDRESS <SYMBOL+OFFSET>

Where, ADDRESS is the original address, formatted as hexadecimal, and
padded with zeros on the left up to the width of an address in the
current architecture.

SYMBOL is a symbol whose address range covers ADDRESS, and OFFSET is
the offset from SYMBOL to ADDRESS in decimal.

If there's no SYMBOL whose address range covers ADDRESS, then the
<SYMBOL+OFFSET> part is not included.

This is useful if a user wants to write a Python script that
pretty-print addresses, the user no longer needs to do manual symbol
lookup, and additionally, things like the zero padding on addresses
will be consistent with the builtin GDB behaviour.
---
 gdb/NEWS                             |  5 ++++
 gdb/doc/python.texi                  | 36 ++++++++++++++++++++++++++++
 gdb/python/py-arch.c                 | 31 ++++++++++++++++++++++++
 gdb/testsuite/gdb.python/py-arch.exp | 28 ++++++++++++++++++++++
 4 files changed, 100 insertions(+)

diff --git a/gdb/NEWS b/gdb/NEWS
index dc2cac1871b..d3883febf82 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -206,6 +206,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
      state information, or None, if there is no such additional
      information.
 
+  ** New function gdb.Architecture.format_address(ADDRESS), that
+     formats ADDRESS as 'address <symbol+offset>', this is the same
+     format that GDB uses when printing address, symbol, and offset
+     information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 4d9e77bf12c..0ec08f4994e 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6033,6 +6033,42 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Return a string in the format @samp{@var{addr}
+<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
+formatted in hexadecimal, @var{symbol} is the symbol whose address is
+the nearest to @var{address} and below it in memory, and @var{offset}
+is the offset from @var{symbol} to @var{address} in decimal.
+
+If no suitable @var{symbol} was found, then the
+<@var{symbol}+@var{offset}> part is not included in the returned
+string, instead the returned string will just contain the
+@var{address} formatted as hexadecimal.  How far @value{GDBN} looks
+back for a suitable symbol can be controlled with @kbd{set print
+max-symbolic-offset} (@pxref{Print Settings}).
+
+Additionally, the returned string can include file name and line
+number information when @kbd{set print symbol-filename on}
+(@pxref{Print Settings}), in this case the format of the returned
+string is @samp{@var{addr} <@var{symbol}+@var{offset}> at
+@var{filename}:@var{line-number}}.
+
+In all cases, the @var{addr} component will be padded with leading
+zeros based on the width of an address for the current architecture.
+
+This method uses the same mechanism for formatting address, symbol,
+and offset information as core @value{GDBN} does in commands such as
+@kbd{disassemble}.
+
+Here are some examples of the possible string formats:
+
+@smallexample
+0x00001042
+0x00001042 <symbol+16>
+0x00001042 <symbol+16 at file.c:123>
+@end smallexample
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
  return list.release ();
 }
 
+/* Implement gdb.architecture.format_address(ADDR).  Provide access to
+   GDB's print_address function from Python.  The returned address will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string.  The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..5272d9c2173 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,31 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+set next_addr [format 0x%x [expr $main_addr + 1]]
+
+foreach_with_prefix symbol_filename { on off } {
+    gdb_test_no_output "set print symbol-filename ${symbol_filename}"
+
+    if { $symbol_filename == "on" } {
+	set filename_pattern " at \[^\r\n\]+/py-arch.c:$decimal"
+    } else {
+	set filename_pattern ""
+    }
+
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+	"Got: $main_addr <main${filename_pattern}>" \
+	"gdb.Architecture.format_address, result should have no offset"
+
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+	"Got: $next_addr <main\\+1${filename_pattern}>" \
+	"gdb.Architecture.format_address, result should have an offset"
+}
+
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] gdb/python: add gdb.Architecture.format_address
  2022-03-03 18:35         ` Craig Blackmore
@ 2022-03-04 10:51           ` Andrew Burgess
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-04 10:51 UTC (permalink / raw)
  To: Craig Blackmore; +Cc: gdb-patches

Craig Blackmore <craig.blackmore@embecosm.com> writes:

> Hi Andrew,
>
> On 22/02/2022 13:56, Andrew Burgess via Gdb-patches wrote:
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>>>> From: Andrew Burgess <aburgess@redhat.com>
>>>> Cc: gdb-patches@sourceware.org
>>>> Date: Mon, 21 Feb 2022 17:27:21 +0000
>>>>
>>>> I'm certainly not against renaming, if we can come up with a better
>>>> name... maybe 'format_address_info'?  I don't know... I still kind of
>>>> like 'format_address'...
>>> I hope someone will come up with a better name.
> format_address seems ok to me and I couldn't come up with anything better.
>> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
>> index 0f273b344e4..95ae931e73e 100644
>> --- a/gdb/python/py-arch.c
>> +++ b/gdb/python/py-arch.c
>> @@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
>>    return list.release ();
>>   }
>>   
>> +/* Implement gdb.architecture.format_address(ADDR).  Provide access to
>> +   GDB's print_address function from Python.  The returned address will
>> +   have the format '0x..... <symbol+offset>'.  */
>> +
>> +static PyObject *
>> +archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
>> +{
>> +  static const char *keywords[] = { "address", nullptr };
>> +  PyObject *addr_obj;
>> +  CORE_ADDR addr;
>> +  struct gdbarch *gdbarch = nullptr;
>> +
>> +  ARCHPY_REQUIRE_VALID (self, gdbarch);
>> +
>> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
>> +    return nullptr;
>> +
>> +  if (get_addr_from_python (addr_obj, &addr) < 0)
>> +    return nullptr;
>> +
>> +  string_file buf;
>> +  print_address (gdbarch, addr, &buf);
>> +  return PyString_FromString (buf.c_str ());
>> +}
>> +
>>   void _initialize_py_arch ();
>>   void
>>   _initialize_py_arch ()
>> @@ -391,6 +416,12 @@ group GROUP-NAME." },
>>       METH_NOARGS,
>>       "register_groups () -> Iterator.\n\
>>   Return an iterator over all of the register groups in this architecture." },
>> +  { "format_address", (PyCFunction) archpy_format_address,
>> +    METH_VARARGS | METH_KEYWORDS,
>> +    "format_address (ADDRESS) -> String.\n\
>> +Format ADDRESS, an address within the currently selected inferior's\n\
>> +address space, as a string.  The format of the returned string is\n\
>> +'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
>>     {NULL}  /* Sentinel */
>>   };
>>   
>> diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
>> index b55778b0b72..c4854033d8c 100644
>> --- a/gdb/testsuite/gdb.python/py-arch.exp
>> +++ b/gdb/testsuite/gdb.python/py-arch.exp
>> @@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
>>       }
>>   }
>>   gdb_assert { $lists_match }
>> +
>> +# Check the gdb.Architecture.format_address method.
>> +set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
>> +gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
>> +    "Got: $main_addr <main>" \
>> +    "gdb.Architecture.format_address, result should have no offset"
>> +set next_addr [format 0x%x [expr $main_addr + 1]]
>> +gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
>> +    "Got: $next_addr <main\\+1>" \
>> +    "gdb.Architecture.format_address, result should have an offset"
>> +if {![is_address_zero_readable]} {
>> +    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
>> +	"Got: 0x0" \
>> +	"gdb.Architecture.format_address for address 0"
>> +}
>>
> The code LGTM. Just one query, is it necessary to add a test case for 
> printing the filename and line number when we have `set print 
> symbol-filename on`?

Yes, I only learned about 'set print symbol-filename' during the doc
review, and neglected to add a new test.

I've posted a V2 patch which include a new test for this.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv2] gdb/python: add gdb.Architecture.format_address
  2022-03-04 10:50 ` [PATCHv2] " Andrew Burgess
@ 2022-03-04 15:22   ` Simon Marchi
  2022-03-07 12:33   ` [PATCHv3] gdb/python: add gdb.format_address function Andrew Burgess
  1 sibling, 0 replies; 20+ messages in thread
From: Simon Marchi @ 2022-03-04 15:22 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: Andrew Burgess

> @@ -391,6 +416,12 @@ group GROUP-NAME." },
>      METH_NOARGS,
>      "register_groups () -> Iterator.\n\
>  Return an iterator over all of the register groups in this architecture." },
> +  { "format_address", (PyCFunction) archpy_format_address,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "format_address (ADDRESS) -> String.\n\
> +Format ADDRESS, an address within the currently selected inferior's\n\
> +address space, as a string.  The format of the returned string is\n\

I didn't have much time to think about this, but I was just wondering if
there would be a way to avoid relying on the currently selected
inferior.

My understanding is that this function depends on both an architecture,
for formatting the number part of the address, and an inferior for the symbolic
part of the the address.  Or is it a program space instead of inferior?

So the possibilities I see are, to let the user specify both:

  arch_obj.format_address(addr, inf_obj)
  arch_obj.format_address(addr, pspace_obj)
  inf_obj.format_address(addr, arch_obj)
  pspace_obj.format_address(addr, arch_obj)
  gdb.format_address(addr, inf_obj, arch_obj)
  gdb.format_address(addr, pspace_obj, arch_obj)

Putting the format_address on one of the objects (arch, inf, pspace)
seems strange, because it's not a arch-specific operation more than it
is an inf-specific (or pspace-specific operation).  So I'm more leaning
towards gdb.format_address.

And even if the user can pass in objects, we can offer sensible defaults
that will make format_address use "the current things".  Let's say that
format_address is declared as:

  def format_address(addr, pspace_obj=None, arch_obj=None)

If pspace_obj is None, we can fetch it from the currently selected
inferior.  And same for arch_obj.

Also, maybe that doing:

  format_address(addr, inf_obj)

could be a shortcut for:

  format_address(addr, inf_obj.progspace, inf_obj.architecture())

Not sure about that one though.

Simon

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-04 10:50 ` [PATCHv2] " Andrew Burgess
  2022-03-04 15:22   ` Simon Marchi
@ 2022-03-07 12:33   ` Andrew Burgess
  2022-03-21 17:53     ` Andrew Burgess
  2022-03-21 18:23     ` Simon Marchi
  1 sibling, 2 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-07 12:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Andrew Burgess

From: Andrew Burgess <andrew.burgess@embecosm.com>

Since v2:

 - Almost complete rewrite, function is now gdb.format_address, and
   takes an address, program space, and architecture,

 - Updated docs to go with this,

 - Updated tests.

Since v1:

 - All the doc changes have been addressed (these have already been
   approved),

 - Rebased to current master,

 - Added an additional test for when 'set print symbol-filename' is
   on.

Thanks,
Andrew


---

Add a new function, gdb.format_address, which is a wrapper around
GDB's print_address function.

This method takes an address, and returns a string with the format:

  ADDRESS <SYMBOL+OFFSET>

Where, ADDRESS is the original address, formatted as hexadecimal,
SYMBOL is a symbol with an address lower than ADDRESS, and OFFSET is
the offset from SYMBOL to ADDRESS in decimal.

If there's no SYMBOL suitably close to ADDRESS then the
<SYMBOL+OFFSET> part is not included.

This is useful if a user wants to write a Python script that
pretty-prints addresses, the user no longer needs to do manual symbol
lookup, or worry about correctly formatting addresses.

Additionally, there are some settings that effect how GDB picks
SYMBOL, and whether the file name and line number should be included
with the SYMBOL name, the gdb.format_address function ensures that the
users Python script also benefits from these settings.

The gdb.format_address by default selects SYMBOL from the current
inferiors program space, and address is formatted using the
architecture for the current inferior.  However, a user can also
explicitly pass a program space and architecture like this:

  gdb.format_address(ADDRESS, PROGRAM_SPACE, ARCHITECTURE)

In order to format an address for a different inferior.

Notes on the implementation:

In py-arch.c I extended arch_object_to_gdbarch to add an assertion for
the type of the PyObject being worked on.  Prior to this commit all
uses of arch_object_to_gdbarch were guaranteed to pass this function a
gdb.Architecture object, but, with this commit, this might not be the
case.

So, with this commit I've made it a requirement that the PyObject be a
gdb.Architecture, and this is checked with the assert.  And in order
that callers from other files can check if they have a
gdb.Architecture object, I've added the new function
gdbpy_is_architecture.

In py-progspace.c I've added two new function, the first
progspace_object_to_program_space, converts a PyObject of type
gdb.Progspace to the associated program_space pointer, and
gdbpy_is_progspace checks if a PyObject is a gdb.Progspace or not.
---
 gdb/NEWS                                      |   6 +
 gdb/doc/python.texi                           |  54 ++++++
 gdb/python/py-arch.c                          |  13 +-
 gdb/python/py-progspace.c                     |  17 ++
 gdb/python/python-internal.h                  |  16 ++
 gdb/python/python.c                           | 108 +++++++++++
 gdb/testsuite/gdb.python/py-format-address.c  |  32 ++++
 .../gdb.python/py-format-address.exp          | 177 ++++++++++++++++++
 8 files changed, 421 insertions(+), 2 deletions(-)
 create mode 100644 gdb/testsuite/gdb.python/py-format-address.c
 create mode 100644 gdb/testsuite/gdb.python/py-format-address.exp

diff --git a/gdb/NEWS b/gdb/NEWS
index dc2cac1871b..ec6a0753024 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -206,6 +206,12 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
      state information, or None, if there is no such additional
      information.
 
+  ** New function gdb.format_address(ADDRESS, PROGSPACE, ARCHITECTURE),
+     that formats ADDRESS as 'address <symbol+offset>', where symbol is
+     looked up in PROGSPACE, and ARCHITECTURE is used to format address.
+     This is the same format that GDB uses when printing address, symbol,
+     and offset information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index 4d9e77bf12c..721c5d3176e 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -612,6 +612,60 @@
 connection objects are in no particular order in the returned list.
 @end defun
 
+@defun gdb.format_address (@var{address} @r{[}, @var{progspace}, @var{architecture}@r{]})
+Return a string in the format @samp{@var{addr}
+<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
+formatted in hexadecimal, @var{symbol} is the symbol whose address is
+the nearest to @var{address} and below it in memory, and @var{offset}
+is the offset from @var{symbol} to @var{address} in decimal.
+
+If no suitable @var{symbol} was found, then the
+<@var{symbol}+@var{offset}> part is not included in the returned
+string, instead the returned string will just contain the
+@var{address} formatted as hexadecimal.  How far @value{GDBN} looks
+back for a suitable symbol can be controlled with @kbd{set print
+max-symbolic-offset} (@pxref{Print Settings}).
+
+Additionally, the returned string can include file name and line
+number information when @kbd{set print symbol-filename on}
+(@pxref{Print Settings}), in this case the format of the returned
+string is @samp{@var{addr} <@var{symbol}+@var{offset}> at
+@var{filename}:@var{line-number}}.
+
+
+The @var{progspace} is the gdb.Progspace in which @var{symbol} is
+looked up, and @var{architecture} is used when formatting @var{addr},
+e.g.@: in order to determine the size of an address in bytes.
+
+If neither @var{progspace} or @var{architecture} are passed, then by
+default @value{GDBN} will use the program space and architecture of
+the currently selected inferior, thus, the following two calls are
+equivalent:
+
+@smallexample
+gdb.format_address(address)
+gdb.format_address(address,
+                   gdb.selected_inferior().progspace,
+                   gdb.selected_inferior().architecture())
+@end smallexample
+
+It is not valid to only pass one of @var{progspace} or
+@var{architecture}, either they must both be provided, or neither must
+be provided (and the defaults will be used).
+
+This method uses the same mechanism for formatting address, symbol,
+and offset information as core @value{GDBN} does in commands such as
+@kbd{disassemble}.
+
+Here are some examples of the possible string formats:
+
+@smallexample
+0x00001042
+0x00001042 <symbol+16>
+0x00001042 <symbol+16 at file.c:123>
+@end smallexample
+@end defun
+
 @node Exception Handling
 @subsubsection Exception Handling
 @cindex python exceptions
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..53906ce506e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
 }
 
 /* Returns the struct gdbarch value corresponding to the given Python
-   architecture object OBJ.  */
+   architecture object OBJ, which must be a gdb.Architecture object.  */
 
 struct gdbarch *
 arch_object_to_gdbarch (PyObject *obj)
 {
-  arch_object *py_arch = (arch_object *) obj;
+  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));
 
+  arch_object *py_arch = (arch_object *) obj;
   return py_arch->gdbarch;
 }
 
+/* See python-internal.h.  */
+
+bool
+gdbpy_is_architecture (PyObject *obj)
+{
+  return PyObject_TypeCheck (obj, &arch_object_type);
+}
+
 /* Returns the Python architecture object corresponding to GDBARCH.
    Returns a new reference to the arch_object associated as data with
    GDBARCH.  */
diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
index 1e01068c59b..f9f2a969e2b 100644
--- a/gdb/python/py-progspace.c
+++ b/gdb/python/py-progspace.c
@@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
   return gdbpy_ref<>::new_reference (result);
 }
 
+/* See python-internal.h.  */
+
+struct program_space *
+progspace_object_to_program_space (PyObject *obj)
+{
+  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));
+  return ((pspace_object *) obj)->pspace;
+}
+
+/* See python-internal.h.  */
+
+bool
+gdbpy_is_progspace (PyObject *obj)
+{
+  return PyObject_TypeCheck (obj, &pspace_object_type);
+}
+
 void _initialize_py_progspace ();
 void
 _initialize_py_progspace ()
diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
index 5e15f62f745..d2a8e4d8cf4 100644
--- a/gdb/python/python-internal.h
+++ b/gdb/python/python-internal.h
@@ -497,6 +497,13 @@ struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
 struct frame_info *frame_object_to_frame_info (PyObject *frame_obj);
 struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
 
+/* Convert Python object OBJ to a program_space pointer.  OBJ must be a
+   gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
+   valid (see gdb.Progspace.is_valid), otherwise return the program_space
+   pointer.  */
+
+extern struct program_space *progspace_object_to_program_space (PyObject *obj);
+
 void gdbpy_initialize_gdb_readline (void);
 int gdbpy_initialize_auto_load (void)
   CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
@@ -824,4 +831,13 @@ typedef std::unique_ptr<Py_buffer, Py_buffer_deleter> Py_buffer_up;
 extern bool gdbpy_parse_register_id (struct gdbarch *gdbarch,
 				     PyObject *pyo_reg_id, int *reg_num);
 
+/* Return true if OBJ is a gdb.Architecture object, otherwise, return
+   false.  */
+
+extern bool gdbpy_is_architecture (PyObject *obj);
+
+/* Return true if OBJ is a gdb.Progspace object, otherwise, return false.  */
+
+extern bool gdbpy_is_progspace (PyObject *obj);
+
 #endif /* PYTHON_PYTHON_INTERNAL_H */
diff --git a/gdb/python/python.c b/gdb/python/python.c
index 79f9826365a..8abce819001 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -1294,6 +1294,107 @@ gdbpy_colorize_disasm (const std::string &content, gdbarch *gdbarch)
 
 \f
 
+/* Implement gdb.format_address(ADDR,P_SPACE,ARCH).  Provide access to
+   GDB's print_address function from Python.  The returned address will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+gdbpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] =
+    {
+      "address", "progspace", "architecture", nullptr
+    };
+  PyObject *addr_obj = nullptr, *pspace_obj = nullptr, *arch_obj = nullptr;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+  struct program_space *pspace = nullptr;
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O|OO", keywords,
+					&addr_obj, &pspace_obj, &arch_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  /* If the user passed None for progspace or architecture, then we
+     consider this to mean "the default".  Here we replace references to
+     None with nullptr, this means that in the following code we only have
+     to handle the nullptr case.  These are only borrowed references, so
+     no decref is required here.  */
+  if (pspace_obj == Py_None)
+    pspace_obj = nullptr;
+  if (arch_obj == Py_None)
+    arch_obj = nullptr;
+
+  if (pspace_obj == nullptr && arch_obj == nullptr)
+    {
+      /* Grab both of these from the current inferior, and its associated
+	 default architecture.  */
+      pspace = current_inferior ()->pspace;
+      gdbarch = current_inferior ()->gdbarch;
+    }
+  else if (arch_obj == nullptr || pspace_obj == nullptr)
+    {
+      /* If the user has only given one of program space or architecture,
+	 then don't use the default for the other.  Sure we could use the
+	 default, but it feels like there's too much scope of mistakes in
+	 this case, so better to require the user to provide both
+	 arguments.  */
+      PyErr_SetString (PyExc_ValueError,
+		       _("The architecture and progspace arguments must both be supplied"));
+      return nullptr;
+    }
+  else
+    {
+      /* The user provided an address, program space, and architecture.
+	 Just check that these objects are valid.  */
+      if (!gdbpy_is_progspace (pspace_obj))
+	{
+	  PyErr_SetString (PyExc_TypeError,
+			   _("The progspace argument is not a gdb.Progspace object"));
+	  return nullptr;
+	}
+
+      pspace = progspace_object_to_program_space (pspace_obj);
+      if (pspace == nullptr)
+	{
+	  PyErr_SetString (PyExc_ValueError,
+			   _("The progspace argument is not valid"));
+	  return nullptr;
+	}
+
+      if (!gdbpy_is_architecture (arch_obj))
+	{
+	  PyErr_SetString (PyExc_TypeError,
+			   _("The architecture argument is not a gdb.Architecture object"));
+	  return nullptr;
+	}
+
+      /* Architectures are never deleted once created, so gdbarch should
+	 never come back as nullptr.  */
+      gdbarch = arch_object_to_gdbarch (arch_obj);
+      gdb_assert (gdbarch != nullptr);
+    }
+
+  /* By this point we should know the program space and architecture we are
+     going to use.  */
+  gdb_assert (pspace != nullptr);
+  gdb_assert (gdbarch != nullptr);
+
+  /* Unfortunately print_address relies on the current program space for
+     its symbol lookup.  Temporarily switch now.  */
+  scoped_restore_current_program_space restore_progspace;
+  set_current_program_space (pspace);
+
+  /* Format the address, and return it as a string.  */
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
+\f
+
 /* Printing.  */
 
 /* A python function to write a single string using gdb's filtered
@@ -2442,6 +2543,13 @@ Return a list of all the architecture names GDB understands." },
     "connections () -> List.\n\
 Return a list of gdb.TargetConnection objects." },
 
+  { "format_address", (PyCFunction) gdbpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS, PROG_SPACE, ARCH) -> String.\n\
+Format ADDRESS, an address within PROG_SPACE, a gdb.Progspace, using\n\
+ARCH, a gdb.Architecture to determine the address size.  The format of\n\
+the returned string is 'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
+
   {NULL, NULL, 0, NULL}
 };
 
diff --git a/gdb/testsuite/gdb.python/py-format-address.c b/gdb/testsuite/gdb.python/py-format-address.c
new file mode 100644
index 00000000000..6493fc4d579
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-format-address.c
@@ -0,0 +1,32 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see  <http://www.gnu.org/licenses/>.  */
+
+/* This test is compiled multiple times with FUNCTION_NAME defined to
+   different strings, this means we should (hopefully) get the same code
+   layout in memory, but with different strings for the function name.  */
+
+int
+FUNCTION_NAME (void)
+{
+  return 0;
+}
+
+int
+main (void)
+{
+  return FUNCTION_NAME ();
+}
diff --git a/gdb/testsuite/gdb.python/py-format-address.exp b/gdb/testsuite/gdb.python/py-format-address.exp
new file mode 100644
index 00000000000..5c808299d34
--- /dev/null
+++ b/gdb/testsuite/gdb.python/py-format-address.exp
@@ -0,0 +1,177 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+load_lib gdb-python.exp
+standard_testfile
+
+foreach func_name { foo bar } {
+    if {[build_executable "build binary with ${func_name} function" \
+	     "$testfile-${func_name}" $srcfile \
+	     [list debug \
+		  additional_flags=-DFUNCTION_NAME=${func_name}]] == -1} {
+	return -1
+    }
+}
+
+set binary_foo [standard_output_file "${testfile}-foo"]
+set binary_bar [standard_output_file "${testfile}-bar"]
+
+clean_restart $binary_foo
+
+# Skip all tests if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+if ![runto_main] {
+   return -1
+}
+
+# Check the gdb.format_address method when using the default values
+# for the program space and architecture (these will be selected based
+# on the current inferior).
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+set next_addr [format 0x%x [expr $main_addr + 1]]
+
+foreach_with_prefix symbol_filename { on off } {
+    gdb_test_no_output "set print symbol-filename ${symbol_filename}"
+
+    if { $symbol_filename == "on" } {
+	set filename_pattern " at \[^\r\n\]+/${srcfile}:$decimal"
+    } else {
+	set filename_pattern ""
+    }
+
+    gdb_test "python print(\"Got: \" + gdb.format_address($main_addr))" \
+	"Got: $main_addr <main${filename_pattern}>" \
+	"gdb.format_address, result should have no offset"
+
+    gdb_test "python print(\"Got: \" + gdb.format_address($next_addr))" \
+	"Got: $next_addr <main\\+1${filename_pattern}>" \
+	"gdb.format_address, result should have an offset"
+}
+
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.format_address(0))" \
+	"Got: 0x0" \
+	"gdb.format_address for address 0"
+}
+
+# Now check that gdb.format_address will accept the program space and
+# architecture arguments correctly.
+gdb_test_no_output "python inf = gdb.selected_inferior()"
+
+# First, pass both arguments, this should be fine.
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, inf.architecture()))" \
+    "Got: $main_addr <main>" \
+    "gdb.format_address passing program space and architecture"
+
+# Now pass the program space and architecture as None.
+# First, pass both arguments, this should be fine.
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, None))" \
+    "Got: $main_addr <main>" \
+    "gdb.format_address passing program space and architecture as None"
+
+# Now forget the architecture, this should fail.
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace))" \
+    [multi_line \
+	 "ValueError: The architecture and progspace arguments must both be supplied" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address passing program space only"
+
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, None))" \
+    [multi_line \
+	 "ValueError: The architecture and progspace arguments must both be supplied" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address passing real program space, but architecture is None"
+
+# Now skip the program space argument.
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, architecture=inf.architecture()))" \
+    [multi_line \
+	 "ValueError: The architecture and progspace arguments must both be supplied" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address passing architecture only"
+
+gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, inf.architecture()))" \
+    [multi_line \
+	 "ValueError: The architecture and progspace arguments must both be supplied" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address passing real architecture, but progspace is None"
+
+# Now, before we add a second inferior, lets just check we can format
+# the address of 'foo' correctly.
+set foo_addr [get_hexadecimal_valueof "&foo" "UNKNOWN"]
+
+gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
+    "Got: $foo_addr <foo>" \
+    "gdb.format_address for foo, with just one inferior"
+
+# Now lets add a second inferior, using a slightly different
+# executable, select that inferior, and capture a reference to the
+# inferior in a Python object.
+gdb_test "add-inferior -exec ${binary_bar}" ".*" \
+    "add a second inferior running the bar executable"
+gdb_test "inferior 2" ".*"
+gdb_test_no_output "python inf2 = gdb.selected_inferior()"
+
+# Now we can test formatting an address from inferior 1.
+gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
+    "Got: $foo_addr <foo>" \
+    "gdb.format_address for foo, while inferior 2 is selected"
+
+# Grab the address of 'bar'.  Hopefully this will be the same address
+# as 'foo', but if not, that's not the end of the world, the test just
+# wont be quite as tough.
+set bar_addr [get_hexadecimal_valueof "&bar" "UNKNOWN"]
+
+# Now format the address of bar using the default inferior and
+# architecture, this should display the 'bar' symbol rather than
+# 'foo'.
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr))" \
+    "Got: $foo_addr <bar>" \
+    "gdb.format_address for bar, while inferior 2 is selected"
+
+# And again, but this time, specificy the program space and
+# architecture.
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
+    "Got: $foo_addr <bar>" \
+    "gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture"
+
+# Reselect inferior 1, and then format an address from inferior 2.
+gdb_test "inferior 1" ".*"
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
+    "Got: $foo_addr <bar>" \
+    "gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture"
+
+# Try pasing incorrect object types for program space and architecture.
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.progspace))" \
+    [multi_line \
+	 "TypeError: The architecture argument is not a gdb.Architecture object" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address pass wrong object type for architecture"
+
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.architecture(), inf2.architecture()))" \
+    [multi_line \
+	 "TypeError: The progspace argument is not a gdb.Progspace object" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address pass wrong object type for progspace"
+
+# Now invalidate inferior 2's program space, and try using that.
+gdb_test "python pspace = inf2.progspace"
+gdb_test "python arch = inf2.architecture()"
+gdb_test "remove-inferior 2"
+gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, pspace, arch))" \
+    [multi_line \
+	 "ValueError: The progspace argument is not valid" \
+	 "Error while executing Python code\\."] \
+    "gdb.format_address called with an invalid program space"
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-07 12:33   ` [PATCHv3] gdb/python: add gdb.format_address function Andrew Burgess
@ 2022-03-21 17:53     ` Andrew Burgess
  2022-03-21 18:23     ` Simon Marchi
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-21 17:53 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess


Ping!

Thanks,
Andrew

Andrew Burgess <aburgess@redhat.com> writes:

> From: Andrew Burgess <andrew.burgess@embecosm.com>
>
> Since v2:
>
>  - Almost complete rewrite, function is now gdb.format_address, and
>    takes an address, program space, and architecture,
>
>  - Updated docs to go with this,
>
>  - Updated tests.
>
> Since v1:
>
>  - All the doc changes have been addressed (these have already been
>    approved),
>
>  - Rebased to current master,
>
>  - Added an additional test for when 'set print symbol-filename' is
>    on.
>
> Thanks,
> Andrew
>
>
> ---
>
> Add a new function, gdb.format_address, which is a wrapper around
> GDB's print_address function.
>
> This method takes an address, and returns a string with the format:
>
>   ADDRESS <SYMBOL+OFFSET>
>
> Where, ADDRESS is the original address, formatted as hexadecimal,
> SYMBOL is a symbol with an address lower than ADDRESS, and OFFSET is
> the offset from SYMBOL to ADDRESS in decimal.
>
> If there's no SYMBOL suitably close to ADDRESS then the
> <SYMBOL+OFFSET> part is not included.
>
> This is useful if a user wants to write a Python script that
> pretty-prints addresses, the user no longer needs to do manual symbol
> lookup, or worry about correctly formatting addresses.
>
> Additionally, there are some settings that effect how GDB picks
> SYMBOL, and whether the file name and line number should be included
> with the SYMBOL name, the gdb.format_address function ensures that the
> users Python script also benefits from these settings.
>
> The gdb.format_address by default selects SYMBOL from the current
> inferiors program space, and address is formatted using the
> architecture for the current inferior.  However, a user can also
> explicitly pass a program space and architecture like this:
>
>   gdb.format_address(ADDRESS, PROGRAM_SPACE, ARCHITECTURE)
>
> In order to format an address for a different inferior.
>
> Notes on the implementation:
>
> In py-arch.c I extended arch_object_to_gdbarch to add an assertion for
> the type of the PyObject being worked on.  Prior to this commit all
> uses of arch_object_to_gdbarch were guaranteed to pass this function a
> gdb.Architecture object, but, with this commit, this might not be the
> case.
>
> So, with this commit I've made it a requirement that the PyObject be a
> gdb.Architecture, and this is checked with the assert.  And in order
> that callers from other files can check if they have a
> gdb.Architecture object, I've added the new function
> gdbpy_is_architecture.
>
> In py-progspace.c I've added two new function, the first
> progspace_object_to_program_space, converts a PyObject of type
> gdb.Progspace to the associated program_space pointer, and
> gdbpy_is_progspace checks if a PyObject is a gdb.Progspace or not.
> ---
>  gdb/NEWS                                      |   6 +
>  gdb/doc/python.texi                           |  54 ++++++
>  gdb/python/py-arch.c                          |  13 +-
>  gdb/python/py-progspace.c                     |  17 ++
>  gdb/python/python-internal.h                  |  16 ++
>  gdb/python/python.c                           | 108 +++++++++++
>  gdb/testsuite/gdb.python/py-format-address.c  |  32 ++++
>  .../gdb.python/py-format-address.exp          | 177 ++++++++++++++++++
>  8 files changed, 421 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.python/py-format-address.c
>  create mode 100644 gdb/testsuite/gdb.python/py-format-address.exp
>
> diff --git a/gdb/NEWS b/gdb/NEWS
> index dc2cac1871b..ec6a0753024 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -206,6 +206,12 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
>       state information, or None, if there is no such additional
>       information.
>  
> +  ** New function gdb.format_address(ADDRESS, PROGSPACE, ARCHITECTURE),
> +     that formats ADDRESS as 'address <symbol+offset>', where symbol is
> +     looked up in PROGSPACE, and ARCHITECTURE is used to format address.
> +     This is the same format that GDB uses when printing address, symbol,
> +     and offset information from the disassembler.
> +
>  * New features in the GDB remote stub, GDBserver
>  
>    ** GDBserver is now supported on OpenRISC GNU/Linux.
> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
> index 4d9e77bf12c..721c5d3176e 100644
> --- a/gdb/doc/python.texi
> +++ b/gdb/doc/python.texi
> @@ -612,6 +612,60 @@
>  connection objects are in no particular order in the returned list.
>  @end defun
>  
> +@defun gdb.format_address (@var{address} @r{[}, @var{progspace}, @var{architecture}@r{]})
> +Return a string in the format @samp{@var{addr}
> +<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
> +formatted in hexadecimal, @var{symbol} is the symbol whose address is
> +the nearest to @var{address} and below it in memory, and @var{offset}
> +is the offset from @var{symbol} to @var{address} in decimal.
> +
> +If no suitable @var{symbol} was found, then the
> +<@var{symbol}+@var{offset}> part is not included in the returned
> +string, instead the returned string will just contain the
> +@var{address} formatted as hexadecimal.  How far @value{GDBN} looks
> +back for a suitable symbol can be controlled with @kbd{set print
> +max-symbolic-offset} (@pxref{Print Settings}).
> +
> +Additionally, the returned string can include file name and line
> +number information when @kbd{set print symbol-filename on}
> +(@pxref{Print Settings}), in this case the format of the returned
> +string is @samp{@var{addr} <@var{symbol}+@var{offset}> at
> +@var{filename}:@var{line-number}}.
> +
> +
> +The @var{progspace} is the gdb.Progspace in which @var{symbol} is
> +looked up, and @var{architecture} is used when formatting @var{addr},
> +e.g.@: in order to determine the size of an address in bytes.
> +
> +If neither @var{progspace} or @var{architecture} are passed, then by
> +default @value{GDBN} will use the program space and architecture of
> +the currently selected inferior, thus, the following two calls are
> +equivalent:
> +
> +@smallexample
> +gdb.format_address(address)
> +gdb.format_address(address,
> +                   gdb.selected_inferior().progspace,
> +                   gdb.selected_inferior().architecture())
> +@end smallexample
> +
> +It is not valid to only pass one of @var{progspace} or
> +@var{architecture}, either they must both be provided, or neither must
> +be provided (and the defaults will be used).
> +
> +This method uses the same mechanism for formatting address, symbol,
> +and offset information as core @value{GDBN} does in commands such as
> +@kbd{disassemble}.
> +
> +Here are some examples of the possible string formats:
> +
> +@smallexample
> +0x00001042
> +0x00001042 <symbol+16>
> +0x00001042 <symbol+16 at file.c:123>
> +@end smallexample
> +@end defun
> +
>  @node Exception Handling
>  @subsubsection Exception Handling
>  @cindex python exceptions
> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index 0f273b344e4..53906ce506e 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
>  }
>  
>  /* Returns the struct gdbarch value corresponding to the given Python
> -   architecture object OBJ.  */
> +   architecture object OBJ, which must be a gdb.Architecture object.  */
>  
>  struct gdbarch *
>  arch_object_to_gdbarch (PyObject *obj)
>  {
> -  arch_object *py_arch = (arch_object *) obj;
> +  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));
>  
> +  arch_object *py_arch = (arch_object *) obj;
>    return py_arch->gdbarch;
>  }
>  
> +/* See python-internal.h.  */
> +
> +bool
> +gdbpy_is_architecture (PyObject *obj)
> +{
> +  return PyObject_TypeCheck (obj, &arch_object_type);
> +}
> +
>  /* Returns the Python architecture object corresponding to GDBARCH.
>     Returns a new reference to the arch_object associated as data with
>     GDBARCH.  */
> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
> index 1e01068c59b..f9f2a969e2b 100644
> --- a/gdb/python/py-progspace.c
> +++ b/gdb/python/py-progspace.c
> @@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
>    return gdbpy_ref<>::new_reference (result);
>  }
>  
> +/* See python-internal.h.  */
> +
> +struct program_space *
> +progspace_object_to_program_space (PyObject *obj)
> +{
> +  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));
> +  return ((pspace_object *) obj)->pspace;
> +}
> +
> +/* See python-internal.h.  */
> +
> +bool
> +gdbpy_is_progspace (PyObject *obj)
> +{
> +  return PyObject_TypeCheck (obj, &pspace_object_type);
> +}
> +
>  void _initialize_py_progspace ();
>  void
>  _initialize_py_progspace ()
> diff --git a/gdb/python/python-internal.h b/gdb/python/python-internal.h
> index 5e15f62f745..d2a8e4d8cf4 100644
> --- a/gdb/python/python-internal.h
> +++ b/gdb/python/python-internal.h
> @@ -497,6 +497,13 @@ struct symtab_and_line *sal_object_to_symtab_and_line (PyObject *obj);
>  struct frame_info *frame_object_to_frame_info (PyObject *frame_obj);
>  struct gdbarch *arch_object_to_gdbarch (PyObject *obj);
>  
> +/* Convert Python object OBJ to a program_space pointer.  OBJ must be a
> +   gdb.Progspace reference.  Return nullptr if the gdb.Progspace is not
> +   valid (see gdb.Progspace.is_valid), otherwise return the program_space
> +   pointer.  */
> +
> +extern struct program_space *progspace_object_to_program_space (PyObject *obj);
> +
>  void gdbpy_initialize_gdb_readline (void);
>  int gdbpy_initialize_auto_load (void)
>    CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
> @@ -824,4 +831,13 @@ typedef std::unique_ptr<Py_buffer, Py_buffer_deleter> Py_buffer_up;
>  extern bool gdbpy_parse_register_id (struct gdbarch *gdbarch,
>  				     PyObject *pyo_reg_id, int *reg_num);
>  
> +/* Return true if OBJ is a gdb.Architecture object, otherwise, return
> +   false.  */
> +
> +extern bool gdbpy_is_architecture (PyObject *obj);
> +
> +/* Return true if OBJ is a gdb.Progspace object, otherwise, return false.  */
> +
> +extern bool gdbpy_is_progspace (PyObject *obj);
> +
>  #endif /* PYTHON_PYTHON_INTERNAL_H */
> diff --git a/gdb/python/python.c b/gdb/python/python.c
> index 79f9826365a..8abce819001 100644
> --- a/gdb/python/python.c
> +++ b/gdb/python/python.c
> @@ -1294,6 +1294,107 @@ gdbpy_colorize_disasm (const std::string &content, gdbarch *gdbarch)
>  
>  \f
>  
> +/* Implement gdb.format_address(ADDR,P_SPACE,ARCH).  Provide access to
> +   GDB's print_address function from Python.  The returned address will
> +   have the format '0x..... <symbol+offset>'.  */
> +
> +static PyObject *
> +gdbpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  static const char *keywords[] =
> +    {
> +      "address", "progspace", "architecture", nullptr
> +    };
> +  PyObject *addr_obj = nullptr, *pspace_obj = nullptr, *arch_obj = nullptr;
> +  CORE_ADDR addr;
> +  struct gdbarch *gdbarch = nullptr;
> +  struct program_space *pspace = nullptr;
> +
> +  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O|OO", keywords,
> +					&addr_obj, &pspace_obj, &arch_obj))
> +    return nullptr;
> +
> +  if (get_addr_from_python (addr_obj, &addr) < 0)
> +    return nullptr;
> +
> +  /* If the user passed None for progspace or architecture, then we
> +     consider this to mean "the default".  Here we replace references to
> +     None with nullptr, this means that in the following code we only have
> +     to handle the nullptr case.  These are only borrowed references, so
> +     no decref is required here.  */
> +  if (pspace_obj == Py_None)
> +    pspace_obj = nullptr;
> +  if (arch_obj == Py_None)
> +    arch_obj = nullptr;
> +
> +  if (pspace_obj == nullptr && arch_obj == nullptr)
> +    {
> +      /* Grab both of these from the current inferior, and its associated
> +	 default architecture.  */
> +      pspace = current_inferior ()->pspace;
> +      gdbarch = current_inferior ()->gdbarch;
> +    }
> +  else if (arch_obj == nullptr || pspace_obj == nullptr)
> +    {
> +      /* If the user has only given one of program space or architecture,
> +	 then don't use the default for the other.  Sure we could use the
> +	 default, but it feels like there's too much scope of mistakes in
> +	 this case, so better to require the user to provide both
> +	 arguments.  */
> +      PyErr_SetString (PyExc_ValueError,
> +		       _("The architecture and progspace arguments must both be supplied"));
> +      return nullptr;
> +    }
> +  else
> +    {
> +      /* The user provided an address, program space, and architecture.
> +	 Just check that these objects are valid.  */
> +      if (!gdbpy_is_progspace (pspace_obj))
> +	{
> +	  PyErr_SetString (PyExc_TypeError,
> +			   _("The progspace argument is not a gdb.Progspace object"));
> +	  return nullptr;
> +	}
> +
> +      pspace = progspace_object_to_program_space (pspace_obj);
> +      if (pspace == nullptr)
> +	{
> +	  PyErr_SetString (PyExc_ValueError,
> +			   _("The progspace argument is not valid"));
> +	  return nullptr;
> +	}
> +
> +      if (!gdbpy_is_architecture (arch_obj))
> +	{
> +	  PyErr_SetString (PyExc_TypeError,
> +			   _("The architecture argument is not a gdb.Architecture object"));
> +	  return nullptr;
> +	}
> +
> +      /* Architectures are never deleted once created, so gdbarch should
> +	 never come back as nullptr.  */
> +      gdbarch = arch_object_to_gdbarch (arch_obj);
> +      gdb_assert (gdbarch != nullptr);
> +    }
> +
> +  /* By this point we should know the program space and architecture we are
> +     going to use.  */
> +  gdb_assert (pspace != nullptr);
> +  gdb_assert (gdbarch != nullptr);
> +
> +  /* Unfortunately print_address relies on the current program space for
> +     its symbol lookup.  Temporarily switch now.  */
> +  scoped_restore_current_program_space restore_progspace;
> +  set_current_program_space (pspace);
> +
> +  /* Format the address, and return it as a string.  */
> +  string_file buf;
> +  print_address (gdbarch, addr, &buf);
> +  return PyString_FromString (buf.c_str ());
> +}
> +
> +\f
> +
>  /* Printing.  */
>  
>  /* A python function to write a single string using gdb's filtered
> @@ -2442,6 +2543,13 @@ Return a list of all the architecture names GDB understands." },
>      "connections () -> List.\n\
>  Return a list of gdb.TargetConnection objects." },
>  
> +  { "format_address", (PyCFunction) gdbpy_format_address,
> +    METH_VARARGS | METH_KEYWORDS,
> +    "format_address (ADDRESS, PROG_SPACE, ARCH) -> String.\n\
> +Format ADDRESS, an address within PROG_SPACE, a gdb.Progspace, using\n\
> +ARCH, a gdb.Architecture to determine the address size.  The format of\n\
> +the returned string is 'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
> +
>    {NULL, NULL, 0, NULL}
>  };
>  
> diff --git a/gdb/testsuite/gdb.python/py-format-address.c b/gdb/testsuite/gdb.python/py-format-address.c
> new file mode 100644
> index 00000000000..6493fc4d579
> --- /dev/null
> +++ b/gdb/testsuite/gdb.python/py-format-address.c
> @@ -0,0 +1,32 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2022 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see  <http://www.gnu.org/licenses/>.  */
> +
> +/* This test is compiled multiple times with FUNCTION_NAME defined to
> +   different strings, this means we should (hopefully) get the same code
> +   layout in memory, but with different strings for the function name.  */
> +
> +int
> +FUNCTION_NAME (void)
> +{
> +  return 0;
> +}
> +
> +int
> +main (void)
> +{
> +  return FUNCTION_NAME ();
> +}
> diff --git a/gdb/testsuite/gdb.python/py-format-address.exp b/gdb/testsuite/gdb.python/py-format-address.exp
> new file mode 100644
> index 00000000000..5c808299d34
> --- /dev/null
> +++ b/gdb/testsuite/gdb.python/py-format-address.exp
> @@ -0,0 +1,177 @@
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +load_lib gdb-python.exp
> +standard_testfile
> +
> +foreach func_name { foo bar } {
> +    if {[build_executable "build binary with ${func_name} function" \
> +	     "$testfile-${func_name}" $srcfile \
> +	     [list debug \
> +		  additional_flags=-DFUNCTION_NAME=${func_name}]] == -1} {
> +	return -1
> +    }
> +}
> +
> +set binary_foo [standard_output_file "${testfile}-foo"]
> +set binary_bar [standard_output_file "${testfile}-bar"]
> +
> +clean_restart $binary_foo
> +
> +# Skip all tests if Python scripting is not enabled.
> +if { [skip_python_tests] } { continue }
> +
> +if ![runto_main] {
> +   return -1
> +}
> +
> +# Check the gdb.format_address method when using the default values
> +# for the program space and architecture (these will be selected based
> +# on the current inferior).
> +set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
> +set next_addr [format 0x%x [expr $main_addr + 1]]
> +
> +foreach_with_prefix symbol_filename { on off } {
> +    gdb_test_no_output "set print symbol-filename ${symbol_filename}"
> +
> +    if { $symbol_filename == "on" } {
> +	set filename_pattern " at \[^\r\n\]+/${srcfile}:$decimal"
> +    } else {
> +	set filename_pattern ""
> +    }
> +
> +    gdb_test "python print(\"Got: \" + gdb.format_address($main_addr))" \
> +	"Got: $main_addr <main${filename_pattern}>" \
> +	"gdb.format_address, result should have no offset"
> +
> +    gdb_test "python print(\"Got: \" + gdb.format_address($next_addr))" \
> +	"Got: $next_addr <main\\+1${filename_pattern}>" \
> +	"gdb.format_address, result should have an offset"
> +}
> +
> +if {![is_address_zero_readable]} {
> +    gdb_test "python print(\"Got: \" + gdb.format_address(0))" \
> +	"Got: 0x0" \
> +	"gdb.format_address for address 0"
> +}
> +
> +# Now check that gdb.format_address will accept the program space and
> +# architecture arguments correctly.
> +gdb_test_no_output "python inf = gdb.selected_inferior()"
> +
> +# First, pass both arguments, this should be fine.
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, inf.architecture()))" \
> +    "Got: $main_addr <main>" \
> +    "gdb.format_address passing program space and architecture"
> +
> +# Now pass the program space and architecture as None.
> +# First, pass both arguments, this should be fine.
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, None))" \
> +    "Got: $main_addr <main>" \
> +    "gdb.format_address passing program space and architecture as None"
> +
> +# Now forget the architecture, this should fail.
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace))" \
> +    [multi_line \
> +	 "ValueError: The architecture and progspace arguments must both be supplied" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address passing program space only"
> +
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, inf.progspace, None))" \
> +    [multi_line \
> +	 "ValueError: The architecture and progspace arguments must both be supplied" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address passing real program space, but architecture is None"
> +
> +# Now skip the program space argument.
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, architecture=inf.architecture()))" \
> +    [multi_line \
> +	 "ValueError: The architecture and progspace arguments must both be supplied" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address passing architecture only"
> +
> +gdb_test "python print(\"Got: \" + gdb.format_address($main_addr, None, inf.architecture()))" \
> +    [multi_line \
> +	 "ValueError: The architecture and progspace arguments must both be supplied" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address passing real architecture, but progspace is None"
> +
> +# Now, before we add a second inferior, lets just check we can format
> +# the address of 'foo' correctly.
> +set foo_addr [get_hexadecimal_valueof "&foo" "UNKNOWN"]
> +
> +gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
> +    "Got: $foo_addr <foo>" \
> +    "gdb.format_address for foo, with just one inferior"
> +
> +# Now lets add a second inferior, using a slightly different
> +# executable, select that inferior, and capture a reference to the
> +# inferior in a Python object.
> +gdb_test "add-inferior -exec ${binary_bar}" ".*" \
> +    "add a second inferior running the bar executable"
> +gdb_test "inferior 2" ".*"
> +gdb_test_no_output "python inf2 = gdb.selected_inferior()"
> +
> +# Now we can test formatting an address from inferior 1.
> +gdb_test "python print(\"Got: \" + gdb.format_address($foo_addr, inf.progspace, inf.architecture()))" \
> +    "Got: $foo_addr <foo>" \
> +    "gdb.format_address for foo, while inferior 2 is selected"
> +
> +# Grab the address of 'bar'.  Hopefully this will be the same address
> +# as 'foo', but if not, that's not the end of the world, the test just
> +# wont be quite as tough.
> +set bar_addr [get_hexadecimal_valueof "&bar" "UNKNOWN"]
> +
> +# Now format the address of bar using the default inferior and
> +# architecture, this should display the 'bar' symbol rather than
> +# 'foo'.
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr))" \
> +    "Got: $foo_addr <bar>" \
> +    "gdb.format_address for bar, while inferior 2 is selected"
> +
> +# And again, but this time, specificy the program space and
> +# architecture.
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
> +    "Got: $foo_addr <bar>" \
> +    "gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture"
> +
> +# Reselect inferior 1, and then format an address from inferior 2.
> +gdb_test "inferior 1" ".*"
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
> +    "Got: $foo_addr <bar>" \
> +    "gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture"
> +
> +# Try pasing incorrect object types for program space and architecture.
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.progspace))" \
> +    [multi_line \
> +	 "TypeError: The architecture argument is not a gdb.Architecture object" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address pass wrong object type for architecture"
> +
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.architecture(), inf2.architecture()))" \
> +    [multi_line \
> +	 "TypeError: The progspace argument is not a gdb.Progspace object" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address pass wrong object type for progspace"
> +
> +# Now invalidate inferior 2's program space, and try using that.
> +gdb_test "python pspace = inf2.progspace"
> +gdb_test "python arch = inf2.architecture()"
> +gdb_test "remove-inferior 2"
> +gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, pspace, arch))" \
> +    [multi_line \
> +	 "ValueError: The progspace argument is not valid" \
> +	 "Error while executing Python code\\."] \
> +    "gdb.format_address called with an invalid program space"
> -- 
> 2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-07 12:33   ` [PATCHv3] gdb/python: add gdb.format_address function Andrew Burgess
  2022-03-21 17:53     ` Andrew Burgess
@ 2022-03-21 18:23     ` Simon Marchi
  2022-03-22 13:19       ` Andrew Burgess
  1 sibling, 1 reply; 20+ messages in thread
From: Simon Marchi @ 2022-03-21 18:23 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: Andrew Burgess

> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
> index 0f273b344e4..53906ce506e 100644
> --- a/gdb/python/py-arch.c
> +++ b/gdb/python/py-arch.c
> @@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
>  }
>
>  /* Returns the struct gdbarch value corresponding to the given Python
> -   architecture object OBJ.  */
> +   architecture object OBJ, which must be a gdb.Architecture object.  */
>
>  struct gdbarch *
>  arch_object_to_gdbarch (PyObject *obj)
>  {
> -  arch_object *py_arch = (arch_object *) obj;
> +  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));

Since we have a function for that, I'd do:

  gdb_assert (gdbpy_is_architecture (obj));

> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
> index 1e01068c59b..f9f2a969e2b 100644
> --- a/gdb/python/py-progspace.c
> +++ b/gdb/python/py-progspace.c
> @@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
>    return gdbpy_ref<>::new_reference (result);
>  }
>
> +/* See python-internal.h.  */
> +
> +struct program_space *
> +progspace_object_to_program_space (PyObject *obj)
> +{
> +  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));

Same here, use gdbpy_is_progspace.

Otherwise, LGTM.

Simon

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-21 18:23     ` Simon Marchi
@ 2022-03-22 13:19       ` Andrew Burgess
  2022-03-23 12:14         ` Simon Marchi
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Burgess @ 2022-03-22 13:19 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

>> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
>> index 0f273b344e4..53906ce506e 100644
>> --- a/gdb/python/py-arch.c
>> +++ b/gdb/python/py-arch.c
>> @@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
>>  }
>>
>>  /* Returns the struct gdbarch value corresponding to the given Python
>> -   architecture object OBJ.  */
>> +   architecture object OBJ, which must be a gdb.Architecture object.  */
>>
>>  struct gdbarch *
>>  arch_object_to_gdbarch (PyObject *obj)
>>  {
>> -  arch_object *py_arch = (arch_object *) obj;
>> +  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));
>
> Since we have a function for that, I'd do:
>
>   gdb_assert (gdbpy_is_architecture (obj));
>
>> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
>> index 1e01068c59b..f9f2a969e2b 100644
>> --- a/gdb/python/py-progspace.c
>> +++ b/gdb/python/py-progspace.c
>> @@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
>>    return gdbpy_ref<>::new_reference (result);
>>  }
>>
>> +/* See python-internal.h.  */
>> +
>> +struct program_space *
>> +progspace_object_to_program_space (PyObject *obj)
>> +{
>> +  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));
>
> Same here, use gdbpy_is_progspace.
>
> Otherwise, LGTM.

Thanks, I made those changes and pushed this patch.

Andrew


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-22 13:19       ` Andrew Burgess
@ 2022-03-23 12:14         ` Simon Marchi
  2022-03-23 15:30           ` Andrew Burgess
  0 siblings, 1 reply; 20+ messages in thread
From: Simon Marchi @ 2022-03-23 12:14 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: Andrew Burgess



On 2022-03-22 09:19, Andrew Burgess wrote:
> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
>>> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
>>> index 0f273b344e4..53906ce506e 100644
>>> --- a/gdb/python/py-arch.c
>>> +++ b/gdb/python/py-arch.c
>>> @@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
>>>  }
>>>
>>>  /* Returns the struct gdbarch value corresponding to the given Python
>>> -   architecture object OBJ.  */
>>> +   architecture object OBJ, which must be a gdb.Architecture object.  */
>>>
>>>  struct gdbarch *
>>>  arch_object_to_gdbarch (PyObject *obj)
>>>  {
>>> -  arch_object *py_arch = (arch_object *) obj;
>>> +  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));
>>
>> Since we have a function for that, I'd do:
>>
>>   gdb_assert (gdbpy_is_architecture (obj));
>>
>>> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
>>> index 1e01068c59b..f9f2a969e2b 100644
>>> --- a/gdb/python/py-progspace.c
>>> +++ b/gdb/python/py-progspace.c
>>> @@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
>>>    return gdbpy_ref<>::new_reference (result);
>>>  }
>>>
>>> +/* See python-internal.h.  */
>>> +
>>> +struct program_space *
>>> +progspace_object_to_program_space (PyObject *obj)
>>> +{
>>> +  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));
>>
>> Same here, use gdbpy_is_progspace.
>>
>> Otherwise, LGTM.
> 
> Thanks, I made those changes and pushed this patch.
> 
> Andrew
> 

Hmm, I see these failures:

145 python print("Got: " + gdb.format_address(0x1129))^M
146 Got: 0x1129 <bar>^M
147 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 2 is selected
148 python print("Got: " + gdb.format_address(0x1129, inf2.progspace, inf2.architecture()))^M
149 Got: 0x1129 <bar>^M
150 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture
151 inferior 1^M
152 [Switching to inferior 1 [process 1874961] (/home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.python/py-format-address/py-format-address-foo)]^M
153 [Switching to thread 1.1 (process 1874961)]^M
154 #0  main () at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.python/py-format-address.c:31^M
155 31        return FUNCTION_NAME ();^M
156 (gdb) PASS: gdb.python/py-format-address.exp: inferior 1
157 python print("Got: " + gdb.format_address(0x1129, inf2.progspace, inf2.architecture()))^M
158 Got: 0x1129 <bar>^M
159 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture

Simon

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-23 12:14         ` Simon Marchi
@ 2022-03-23 15:30           ` Andrew Burgess
  2022-03-28 21:59             ` Simon Marchi
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Burgess @ 2022-03-23 15:30 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess

Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:

> On 2022-03-22 09:19, Andrew Burgess wrote:
>> Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
>> 
>>>> diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
>>>> index 0f273b344e4..53906ce506e 100644
>>>> --- a/gdb/python/py-arch.c
>>>> +++ b/gdb/python/py-arch.c
>>>> @@ -62,16 +62,25 @@ arch_object_data_init (struct gdbarch *gdbarch)
>>>>  }
>>>>
>>>>  /* Returns the struct gdbarch value corresponding to the given Python
>>>> -   architecture object OBJ.  */
>>>> +   architecture object OBJ, which must be a gdb.Architecture object.  */
>>>>
>>>>  struct gdbarch *
>>>>  arch_object_to_gdbarch (PyObject *obj)
>>>>  {
>>>> -  arch_object *py_arch = (arch_object *) obj;
>>>> +  gdb_assert (PyObject_TypeCheck (obj, &arch_object_type));
>>>
>>> Since we have a function for that, I'd do:
>>>
>>>   gdb_assert (gdbpy_is_architecture (obj));
>>>
>>>> diff --git a/gdb/python/py-progspace.c b/gdb/python/py-progspace.c
>>>> index 1e01068c59b..f9f2a969e2b 100644
>>>> --- a/gdb/python/py-progspace.c
>>>> +++ b/gdb/python/py-progspace.c
>>>> @@ -504,6 +504,23 @@ pspace_to_pspace_object (struct program_space *pspace)
>>>>    return gdbpy_ref<>::new_reference (result);
>>>>  }
>>>>
>>>> +/* See python-internal.h.  */
>>>> +
>>>> +struct program_space *
>>>> +progspace_object_to_program_space (PyObject *obj)
>>>> +{
>>>> +  gdb_assert (PyObject_TypeCheck (obj, &pspace_object_type));
>>>
>>> Same here, use gdbpy_is_progspace.
>>>
>>> Otherwise, LGTM.
>> 
>> Thanks, I made those changes and pushed this patch.
>> 
>> Andrew
>> 
>
> Hmm, I see these failures:
>
> 145 python print("Got: " + gdb.format_address(0x1129))^M
> 146 Got: 0x1129 <bar>^M
> 147 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 2 is selected
> 148 python print("Got: " + gdb.format_address(0x1129, inf2.progspace, inf2.architecture()))^M
> 149 Got: 0x1129 <bar>^M
> 150 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture
> 151 inferior 1^M
> 152 [Switching to inferior 1 [process 1874961] (/home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.python/py-format-address/py-format-address-foo)]^M
> 153 [Switching to thread 1.1 (process 1874961)]^M
> 154 #0  main () at /home/smarchi/src/binutils-gdb/gdb/testsuite/gdb.python/py-format-address.c:31^M
> 155 31        return FUNCTION_NAME ();^M
> 156 (gdb) PASS: gdb.python/py-format-address.exp: inferior 1
> 157 python print("Got: " + gdb.format_address(0x1129, inf2.progspace, inf2.architecture()))^M
> 158 Got: 0x1129 <bar>^M
> 159 (gdb) FAIL: gdb.python/py-format-address.exp: gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture

Hi Simon,

Sorry for that.  I can't reproduce these failures, but I suspect I know
what's going on.  Could you try the patch below please and confirm that
this fixes the issues.

Many thanks,
Andrew

---

commit d6eb0c69d3919a1b3862d29c788415ca9f7bbeb4
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Mar 23 15:23:47 2022 +0000

    gdb/testsuite: fix copy & paste error in gdb.python/py-format-address.exp
    
    The test gdb.python/py-format-address.exp, added in commit:
    
      commit 25209e2c6979c3838e14e099f0333609810db280
      Date:   Sat Oct 23 09:59:25 2021 +0100
    
          gdb/python: add gdb.format_address function
    
    Included 3 copy & paste errors where the wrong address was used in the
    expected output patterns.
    
    The test compiles two almost identical test binaries (one function
    changes its name, that's the only difference), if the two binaries are
    laid out the same by the compiler, and loaded at the same locations in
    memory, then the two addresses would have been the same.  However,
    this is not the case for everyone, and so some folk were seeing test
    failures:
    
      https://sourceware.org/pipermail/gdb-patches/2022-March/186911.html
    
    This commit fixes the errors by using the correct addresses.

diff --git a/gdb/testsuite/gdb.python/py-format-address.exp b/gdb/testsuite/gdb.python/py-format-address.exp
index 5c808299d34..70ca5d5ae59 100644
--- a/gdb/testsuite/gdb.python/py-format-address.exp
+++ b/gdb/testsuite/gdb.python/py-format-address.exp
@@ -138,19 +138,19 @@ set bar_addr [get_hexadecimal_valueof "&bar" "UNKNOWN"]
 # architecture, this should display the 'bar' symbol rather than
 # 'foo'.
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 2 is selected"
 
 # And again, but this time, specificy the program space and
 # architecture.
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture"
 
 # Reselect inferior 1, and then format an address from inferior 2.
 gdb_test "inferior 1" ".*"
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture"
 
 # Try pasing incorrect object types for program space and architecture.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-23 15:30           ` Andrew Burgess
@ 2022-03-28 21:59             ` Simon Marchi
  2022-03-29 13:38               ` Andrew Burgess
  0 siblings, 1 reply; 20+ messages in thread
From: Simon Marchi @ 2022-03-28 21:59 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches; +Cc: Andrew Burgess

> Hi Simon,
>
> Sorry for that.  I can't reproduce these failures, but I suspect I know
> what's going on.  Could you try the patch below please and confirm that
> this fixes the issues.
>
> Many thanks,
> Andrew
>
> ---
>
> commit d6eb0c69d3919a1b3862d29c788415ca9f7bbeb4
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Wed Mar 23 15:23:47 2022 +0000
>
>     gdb/testsuite: fix copy & paste error in gdb.python/py-format-address.exp
>
>     The test gdb.python/py-format-address.exp, added in commit:
>
>       commit 25209e2c6979c3838e14e099f0333609810db280
>       Date:   Sat Oct 23 09:59:25 2021 +0100
>
>           gdb/python: add gdb.format_address function
>
>     Included 3 copy & paste errors where the wrong address was used in the
>     expected output patterns.
>
>     The test compiles two almost identical test binaries (one function
>     changes its name, that's the only difference), if the two binaries are
>     laid out the same by the compiler, and loaded at the same locations in
>     memory, then the two addresses would have been the same.  However,
>     this is not the case for everyone, and so some folk were seeing test
>     failures:
>
>       https://sourceware.org/pipermail/gdb-patches/2022-March/186911.html
>
>     This commit fixes the errors by using the correct addresses.

Hi Andrew,

I looked into this separately because I failed to see your message.  In
the end I came up with the same change, but the reason the addresses are
different is that the executables are PIE.  Inferior 1 is running, so
the function address is relocated:

 print /x &foo^M
 $2 = 0x555555555129^M

Inferior 2 is not running, so the function address is unrelocated:

 print /x &bar^M
 $3 = 0x1129^M

So, the patch LGTM, but you can clarify the commit message if you want.

Simon

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCHv3] gdb/python: add gdb.format_address function
  2022-03-28 21:59             ` Simon Marchi
@ 2022-03-29 13:38               ` Andrew Burgess
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Burgess @ 2022-03-29 13:38 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches; +Cc: Andrew Burgess

Simon Marchi <simon.marchi@polymtl.ca> writes:

>> Hi Simon,
>>
>> Sorry for that.  I can't reproduce these failures, but I suspect I know
>> what's going on.  Could you try the patch below please and confirm that
>> this fixes the issues.
>>
>> Many thanks,
>> Andrew
>>
>> ---
>>
>> commit d6eb0c69d3919a1b3862d29c788415ca9f7bbeb4
>> Author: Andrew Burgess <aburgess@redhat.com>
>> Date:   Wed Mar 23 15:23:47 2022 +0000
>>
>>     gdb/testsuite: fix copy & paste error in gdb.python/py-format-address.exp
>>
>>     The test gdb.python/py-format-address.exp, added in commit:
>>
>>       commit 25209e2c6979c3838e14e099f0333609810db280
>>       Date:   Sat Oct 23 09:59:25 2021 +0100
>>
>>           gdb/python: add gdb.format_address function
>>
>>     Included 3 copy & paste errors where the wrong address was used in the
>>     expected output patterns.
>>
>>     The test compiles two almost identical test binaries (one function
>>     changes its name, that's the only difference), if the two binaries are
>>     laid out the same by the compiler, and loaded at the same locations in
>>     memory, then the two addresses would have been the same.  However,
>>     this is not the case for everyone, and so some folk were seeing test
>>     failures:
>>
>>       https://sourceware.org/pipermail/gdb-patches/2022-March/186911.html
>>
>>     This commit fixes the errors by using the correct addresses.
>
> Hi Andrew,
>
> I looked into this separately because I failed to see your message.  In
> the end I came up with the same change, but the reason the addresses are
> different is that the executables are PIE.  Inferior 1 is running, so
> the function address is relocated:
>
>  print /x &foo^M
>  $2 = 0x555555555129^M
>
> Inferior 2 is not running, so the function address is unrelocated:
>
>  print /x &bar^M
>  $3 = 0x1129^M
>
> So, the patch LGTM, but you can clarify the commit message if you want.

I pushed the patch below.

This updates the commit message, but also, I pass the 'nopie' option
when compiling the test binaries, which should make the test more useful
(the addresses should now be the same).

Thanks,
Andrew


---

commit 4daa9f295d07917610f0972e0cd45df8c51e69a2
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Mar 23 15:23:47 2022 +0000

    gdb/testsuite: fix copy & paste error in gdb.python/py-format-address.exp
    
    The test gdb.python/py-format-address.exp, added in commit:
    
      commit 25209e2c6979c3838e14e099f0333609810db280
      Date:   Sat Oct 23 09:59:25 2021 +0100
    
          gdb/python: add gdb.format_address function
    
    included 3 copy & paste errors where the wrong address was used in the
    expected output patterns.
    
    The test compiles two almost identical test binaries (one function
    changes its name, that's the only difference), two inferiors are
    created, each inferior using one of the test binaries.
    
    We then take the address of the name changing function in both
    inferiors ('foo' in inferior 1 and 'bar' in inferior 2) and the tests
    are carried out using these addresses.
    
    What we're checking for is that symbols 'foo' and 'bar' show up in the
    correct inferior, and that (as this test is for a Python API feature),
    the user can have one inferior selected, but ask about the other
    inferior, and see the correct symbol in the result.
    
    The hope is that the two binaries will be laid out identically by the
    compiler, and that 'foo' and 'bar' will be at the same address.  This
    is fine, unless the executable is compiled as PIE (position
    independent executable), in which case there is a problem.
    
    The problem is that though inferior 1 is set running, the inferior 2
    never is.  If the executables are compiled as PIE, then the address in
    the inferior 2 will not have been resolved, while the address in the
    inferior 1 will have been, and so the two addresses we use in the
    tests will be different.
    
    This issue was reported here:
    
      https://sourceware.org/pipermail/gdb-patches/2022-March/186911.html
    
    The first part of the fix is to use the correct address variable in
    the expected output patterns, with this change the tests pass even
    when the executables are compiled as PIE.
    
    A second part of this fix is to pass the 'nopie' option when we
    compile the tests, this should ensure that the address obtained in
    inferior 2 is the same as the address from inferior 1, which makes the
    test more useful.

diff --git a/gdb/testsuite/gdb.python/py-format-address.exp b/gdb/testsuite/gdb.python/py-format-address.exp
index 5c808299d34..bbfe658c0bb 100644
--- a/gdb/testsuite/gdb.python/py-format-address.exp
+++ b/gdb/testsuite/gdb.python/py-format-address.exp
@@ -20,6 +20,7 @@ foreach func_name { foo bar } {
     if {[build_executable "build binary with ${func_name} function" \
 	     "$testfile-${func_name}" $srcfile \
 	     [list debug \
+		  nopie \
 		  additional_flags=-DFUNCTION_NAME=${func_name}]] == -1} {
 	return -1
     }
@@ -138,19 +139,19 @@ set bar_addr [get_hexadecimal_valueof "&bar" "UNKNOWN"]
 # architecture, this should display the 'bar' symbol rather than
 # 'foo'.
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 2 is selected"
 
 # And again, but this time, specificy the program space and
 # architecture.
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 2 is selected, pass progspace and architecture"
 
 # Reselect inferior 1, and then format an address from inferior 2.
 gdb_test "inferior 1" ".*"
 gdb_test "python print(\"Got: \" + gdb.format_address($bar_addr, inf2.progspace, inf2.architecture()))" \
-    "Got: $foo_addr <bar>" \
+    "Got: $bar_addr <bar>" \
     "gdb.format_address for bar, while inferior 1 is selected, pass progspace and architecture"
 
 # Try pasing incorrect object types for program space and architecture.


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-03-29 13:38 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-11 16:17 [PATCH] gdb/python: add gdb.Architecture.format_address Andrew Burgess
2022-02-11 18:54 ` Eli Zaretskii
2022-02-21 17:27   ` Andrew Burgess
2022-02-21 18:02     ` Eli Zaretskii
2022-02-22 13:56       ` Andrew Burgess
2022-02-22 14:48         ` Eli Zaretskii
2022-02-23 14:20           ` Andrew Burgess
2022-03-03 16:49             ` Andrew Burgess
2022-03-03 18:35         ` Craig Blackmore
2022-03-04 10:51           ` Andrew Burgess
2022-03-04 10:50 ` [PATCHv2] " Andrew Burgess
2022-03-04 15:22   ` Simon Marchi
2022-03-07 12:33   ` [PATCHv3] gdb/python: add gdb.format_address function Andrew Burgess
2022-03-21 17:53     ` Andrew Burgess
2022-03-21 18:23     ` Simon Marchi
2022-03-22 13:19       ` Andrew Burgess
2022-03-23 12:14         ` Simon Marchi
2022-03-23 15:30           ` Andrew Burgess
2022-03-28 21:59             ` Simon Marchi
2022-03-29 13:38               ` Andrew Burgess

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).