public inbox for gdb-patches@sourceware.org
From: Andrew Burgess <aburgess@redhat.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb/python: add gdb.Architecture.format_address
Date: Tue, 22 Feb 2022 13:56:17 +0000
Message-ID: <87sfsbattq.fsf@redhat.com>
In-Reply-To: <837d9oumhd.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrew Burgess <aburgess@redhat.com>
>> Cc: gdb-patches@sourceware.org
>> Date: Mon, 21 Feb 2022 17:27:21 +0000
>> 
>> I'm certainly not against renaming, if we can come up with a better
>> name... maybe 'format_address_info'?  I don't know... I still kind of
>> like 'format_address'...
>
> I hope someone will come up with a better name.
>
>> +@defun Architecture.format_address (@var{address})
>> +Return a string in the format @samp{ADDRESS <SYMBOL+OFFSET>}, where
>> +@samp{ADDRESS} is @var{address} formatted in hexadecimal,
>> +@samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}, and @samp{OFFSET} is the offset from @samp{SYMBOL} to
>> +@var{address} in decimal.  This is the same format that @value{GDBN}
>> +uses when printing address, symbol, and offset information, for
>> +example, within disassembler output.
>> +
>> +If no @samp{SYMBOL} has an address range that covers @var{address},
>> +then the @samp{<SYMBOL+OFFSET>} part is not included in the returned
>> +string, instead the returned string will just contain the
>> +@var{address} formatted as hexadecimal.
>> +
>> +In all cases, the @samp{ADDRESS} component will be padded with leading
>> +zeros based on the width of an address for the current architecture.
>
> This is okay, but needs the markup fixed.  All the places where you
> use SOMETHING ("ADDRESS", "SYMBOL", etc.) should be @var{something},
> i.e. have the @var markup and be in lower-case.  (To prevent confusion
> with the argument @var{address}, call it something else, like
> @var{addr}.)  Also, remove @samp everywhere except this single
> instance:
>
>    @samp{@var{address} <@var{symbol}+@var{offset}>}
>
> And finally, this text:
>
>> @samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}
>
> is better worded as
>
>   @var{symbol} is the symbol to which @var{addr} belongs
>
> Btw, is the above accurate?  Does GDB really guarantee that ADDR is
> part of SYMBOL's memory?  Or does it just find the closest symbol
> whose address is smaller than ADDR?

You're absolutely correct.

While checking the code to see how this stuff actually works I
discovered a few more interesting settings that I felt were worth
mentioning in the docs for this function.

So, apologies, but this is another complete rewrite of the docs.  With
the exception of the name, for which I still have no better suggestions,
how's this?

Thanks,
Andrew

---

commit dc104a9f22e3d5baf1b87b344acbfae010bac168
Author: Andrew Burgess <andrew.burgess@embecosm.com>
Date:   Sat Oct 23 09:59:25 2021 +0100

    gdb/python: add gdb.Architecture.format_address
    
    Add a new method gdb.Architecture.format_address, which is a wrapper
    around GDB's print_address function.
    
    This method takes an address, and returns a string with the format:
    
      ADDRESS <SYMBOL+OFFSET>
    
    Where, ADDRESS is the original address, formatted as hexadecimal, and
    padded with zeros on the left up to the width of an address in the
    current architecture.
    
    SYMBOL is the closest earlier symbol to ADDRESS, and OFFSET is the
    offset from SYMBOL to ADDRESS in decimal.
    
    If no suitable SYMBOL is found, then the <SYMBOL+OFFSET> part is not
    included.
    
    This is useful if a user wants to write a Python script that
    pretty-prints addresses: the user no longer needs to do manual
    symbol lookup and, additionally, things like the zero padding on
    addresses will be consistent with the builtin GDB behaviour.

diff --git a/gdb/NEWS b/gdb/NEWS
index 9da74e71796..b012e0c562b 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -185,6 +185,11 @@ GNU/Linux/LoongArch    loongarch*-*-linux*
      set styling').  When false, which is the default if the argument
      is not given, then no styling is applied to the returned string.
 
+  ** New function gdb.Architecture.format_address(ADDRESS), which
+     formats ADDRESS as 'address <symbol+offset>'.  This is the same
+     format that GDB uses when printing address, symbol, and offset
+     information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index c1a3f5f2a7e..a786d52b092 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6016,6 +6016,42 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Return a string in the format @samp{@var{addr}
+<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
+formatted in hexadecimal, @var{symbol} is the closest earlier symbol
+to @var{address}, and @var{offset} is the offset from @var{symbol} to
+@var{address} in decimal.
+
+If no suitable @var{symbol} was found, then the
+<@var{symbol}+@var{offset}> part is not included in the returned
+string; instead, the returned string will just contain the
+@var{address} formatted as hexadecimal.  How far @value{GDBN} looks
+back for a suitable symbol can be controlled with @kbd{set print
+max-symbolic-offset} (@pxref{Print Settings}).
+
+Additionally, the returned string can include file name and line
+number information when @kbd{set print symbol-filename on}
+(@pxref{Print Settings}); in this case the format of the returned
+string is @samp{@var{addr} <@var{symbol}+@var{offset} at
+@var{filename}:@var{line-number}>}.
+
+In all cases, the @var{addr} component will be padded with leading
+zeros based on the width of an address for the current architecture.
+
+This method uses the same mechanism for formatting address, symbol,
+and offset information as core @value{GDBN} does in commands such as
+@kbd{disassemble}.
+
+Here are some examples of the possible string formats:
+
+@smallexample
+0x00001042
+0x00001042 <symbol+16>
+0x00001042 <symbol+16 at file.c:123>
+@end smallexample
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
  return list.release ();
 }
 
+/* Implement gdb.Architecture.format_address(ADDR).  Provide access to
+   GDB's print_address function from Python.  The returned string will
+   have the format '0x..... <symbol+offset>'.  */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string.  The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL}  /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..c4854033d8c 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+    "Got: $main_addr <main>" \
+    "gdb.Architecture.format_address, result should have no offset"
+set next_addr [format 0x%x [expr $main_addr + 1]]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+    "Got: $next_addr <main\\+1>" \
+    "gdb.Architecture.format_address, result should have an offset"
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}
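
For illustration only, the call chain exercised by the tests above could be
wrapped in a small script like the sketch below.  It assumes the method
exists as proposed in this revision of the patch and that the inferior has
a symbol named main; describe_addresses is a hypothetical helper name, not
part of GDB.

```python
try:
    import gdb  # Only importable when the script runs inside GDB.
except ImportError:
    gdb = None


def describe_addresses(arch, addresses):
    """Return one formatted string per address.

    ARCH is any object providing format_address(address), such as the
    gdb.Architecture method proposed in this patch.
    """
    return [arch.format_address(addr) for addr in addresses]


if gdb is not None:
    # Same call chain as in the py-arch.exp tests in this patch.
    arch = gdb.selected_inferior().architecture()
    # Assumption: the program being debugged defines 'main'.
    main_addr = int(gdb.parse_and_eval("&main"))
    for text in describe_addresses(arch, [main_addr, main_addr + 1]):
        print(text)
```

Inside GDB the script could be loaded with the source command; outside GDB
the import fails and only the helper is defined.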

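As a rough model of the string format that the documentation in this patch
describes, the pure-Python sketch below picks the closest earlier symbol
and appends the decimal offset.  It is not GDB's implementation: the symbol
table is a hypothetical list of (start, name) pairs, max_offset merely
stands in for 'set print max-symbolic-offset', and pad_digits is an assumed
knob for the zero padding.  File name and line information is left out.

```python
import bisect


def format_address_model(address, symbols, pad_digits=0, max_offset=None):
    """Model the 'addr <symbol+offset>' format (illustrative only).

    symbols    -- hypothetical symbol table: list of (start_address, name)
    pad_digits -- assumed number of hex digits to zero-pad the address to
    max_offset -- stand-in for 'set print max-symbolic-offset';
                  None means no limit
    """
    addr_text = "0x" + format(address, "x").zfill(pad_digits)

    # Find the last symbol whose start address is <= ADDRESS.
    table = sorted(symbols)
    starts = [start for start, _ in table]
    idx = bisect.bisect_right(starts, address) - 1
    if idx < 0:
        return addr_text  # No earlier symbol at all.

    start, name = table[idx]
    offset = address - start
    if max_offset is not None and offset > max_offset:
        return addr_text  # Closest symbol is too far back.
    if offset == 0:
        return "%s <%s>" % (addr_text, name)
    return "%s <%s+%d>" % (addr_text, name, offset)


# Hypothetical symbol table used only for this example.
SYMBOLS = [(0x1000, "main"), (0x1032, "symbol")]
print(format_address_model(0x1042, SYMBOLS, pad_digits=8))
print(format_address_model(0x0800, SYMBOLS, pad_digits=8))
```

With the made-up table above, the two calls print the second and first
forms from the documentation's example list respectively.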


