From: Andrew Burgess
To: Eli Zaretskii
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb/python: add gdb.Architecture.format_address
In-Reply-To: <837d9oumhd.fsf@gnu.org>
References: <20220211161721.3252422-1-aburgess@redhat.com>
 <83leyhs07f.fsf@gnu.org> <878ru4cepy.fsf@redhat.com>
 <837d9oumhd.fsf@gnu.org>
Date: Tue, 22 Feb 2022 13:56:17 +0000
Message-ID: <87sfsbattq.fsf@redhat.com>

Eli Zaretskii writes:

>> From: Andrew Burgess
>> Cc: gdb-patches@sourceware.org
>> Date: Mon, 21 Feb 2022 17:27:21 +0000
>>
>> I'm certainly not against renaming, if we can come up with a better
>> name... maybe 'format_address_info'? I don't know... I still kind of
>> like 'format_address'...
>
> I hope someone will come up with a better name.
>
>> +@defun Architecture.format_address (@var{address})
>> +Return a string in the format @samp{ADDRESS <SYMBOL+OFFSET>}, where
>> +@samp{ADDRESS} is @var{address} formatted in hexadecimal,
>> +@samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}, and @samp{OFFSET} is the offset from @samp{SYMBOL} to
>> +@var{address} in decimal. This is the same format that @value{GDBN}
>> +uses when printing address, symbol, and offset information, for
>> +example, within disassembler output.
>> +
>> +If no @samp{SYMBOL} has an address range that covers @var{address},
>> +then the @samp{<SYMBOL+OFFSET>} part is not included in the returned
>> +string, instead the returned string will just contain the
>> +@var{address} formatted as hexadecimal.
>> +
>> +In all cases, the @samp{ADDRESS} component will be padded with leading
>> +zeros based on the width of an address for the current architecture.
>
> This is okay, but needs the markup fixed. All the places where you
> use SOMETHING ("ADDRESS", "SYMBOL", etc.) should be @var{something},
> i.e. have the @var markup and be in lower-case. (To prevent confusion
> with the argument @var{address}, call it something else, like
> @var{addr}.) Also, remove @samp everywhere except this single
> instance:
>
>   @samp{@var{address} <@var{symbol}+@var{offset}>}
>
> And finally, this text:
>
>> @samp{SYMBOL} is a symbol, the address range of which, covers
>> +@var{address}
>
> is better worded as
>
>   @var{symbol} is the symbol to which @var{addr} belongs
>
> Btw, is the above accurate? does GDB really guarantee that ADDR is
> part of SYMBOL's memory? or does it just find the closest symbol
> whose address is smaller than ADDR?

You're absolutely correct. While checking the code to see how this
stuff actually works I discovered a few more interesting settings that
I felt were worth mentioning in the docs for this function.

So, apologies, but this is another complete rewrite of the docs.
With the exception of the name, for which I still have no better
suggestions, how's this?

Thanks,
Andrew

---

commit dc104a9f22e3d5baf1b87b344acbfae010bac168
Author: Andrew Burgess
Date:   Sat Oct 23 09:59:25 2021 +0100

    gdb/python: add gdb.Architecture.format_address

    Add a new method gdb.Architecture.format_address, which is a wrapper
    around GDB's print_address function.

    This method takes an address, and returns a string with the format:

      ADDRESS <SYMBOL+OFFSET>

    Where, ADDRESS is the original address, formatted as hexadecimal,
    and padded with zeros on the left up to the width of an address in
    the current architecture. SYMBOL is a symbol whose address range
    covers ADDRESS, and OFFSET is the offset from SYMBOL to ADDRESS in
    decimal.

    If there's no SYMBOL whose address range covers ADDRESS, then the
    <SYMBOL+OFFSET> part is not included.

    This is useful if a user wants to write a Python script that
    pretty-prints addresses; the user no longer needs to do manual
    symbol lookup, and additionally, things like the zero padding on
    addresses will be consistent with the builtin GDB behaviour.

diff --git a/gdb/NEWS b/gdb/NEWS
index 9da74e71796..b012e0c562b 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -185,6 +185,11 @@ GNU/Linux/LoongArch loongarch*-*-linux*
      set styling'). When false, which is the default if the argument
      is not given, then no styling is applied to the returned string.
 
+  ** New function gdb.Architecture.format_address(ADDRESS), that
+     formats ADDRESS as 'address <symbol+offset>', this is the same
+     format that GDB uses when printing address, symbol, and offset
+     information from the disassembler.
+
 * New features in the GDB remote stub, GDBserver
 
   ** GDBserver is now supported on OpenRISC GNU/Linux.
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index c1a3f5f2a7e..a786d52b092 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -6016,6 +6016,42 @@
 @code{gdb.Architecture}.
 @end defun
 
+@defun Architecture.format_address (@var{address})
+Return a string in the format @samp{@var{addr}
+<@var{symbol}+@var{offset}>}, where @var{addr} is @var{address}
+formatted in hexadecimal, @var{symbol} is the closest earlier symbol
+to @var{address}, and @var{offset} is the offset from @var{symbol} to
+@var{address} in decimal.
+
+If no suitable @var{symbol} was found, then the
+<@var{symbol}+@var{offset}> part is not included in the returned
+string, instead the returned string will just contain the
+@var{address} formatted as hexadecimal. How far @value{GDBN} looks
+back for a suitable symbol can be controlled with @kbd{set print
+max-symbolic-offset} (@pxref{Print Settings}).
+
+Additionally, the returned string can include file name and line
+number information when @kbd{set print symbol-filename on}
+(@pxref{Print Settings}), in this case the format of the returned
+string is @samp{@var{addr} <@var{symbol}+@var{offset}> at
+@var{filename}:@var{line-number}}.
+
+In all cases, the @var{addr} component will be padded with leading
+zeros based on the width of an address for the current architecture.
+
+This method uses the same mechanism for formatting address, symbol,
+and offset information as core @value{GDBN} does in commands such as
+@kbd{disassemble}.
+
+Here are some examples of the possible string formats:
+
+@smallexample
+0x00001042
+0x00001042
+0x00001042
+@end smallexample
+@end defun
+
 @node Registers In Python
 @subsubsection Registers In Python
 @cindex Registers In Python
diff --git a/gdb/python/py-arch.c b/gdb/python/py-arch.c
index 0f273b344e4..95ae931e73e 100644
--- a/gdb/python/py-arch.c
+++ b/gdb/python/py-arch.c
@@ -348,6 +348,31 @@ gdbpy_all_architecture_names (PyObject *self, PyObject *args)
   return list.release ();
 }
 
+/* Implement gdb.Architecture.format_address(ADDR). Provide access to
+   GDB's print_address function from Python. The returned address will
+   have the format '0x..... <SYMBOL+OFFSET>'. */
+
+static PyObject *
+archpy_format_address (PyObject *self, PyObject *args, PyObject *kw)
+{
+  static const char *keywords[] = { "address", nullptr };
+  PyObject *addr_obj;
+  CORE_ADDR addr;
+  struct gdbarch *gdbarch = nullptr;
+
+  ARCHPY_REQUIRE_VALID (self, gdbarch);
+
+  if (!gdb_PyArg_ParseTupleAndKeywords (args, kw, "O", keywords, &addr_obj))
+    return nullptr;
+
+  if (get_addr_from_python (addr_obj, &addr) < 0)
+    return nullptr;
+
+  string_file buf;
+  print_address (gdbarch, addr, &buf);
+  return PyString_FromString (buf.c_str ());
+}
+
 void _initialize_py_arch ();
 void
 _initialize_py_arch ()
@@ -391,6 +416,12 @@ group GROUP-NAME." },
     METH_NOARGS,
     "register_groups () -> Iterator.\n\
 Return an iterator over all of the register groups in this architecture." },
+  { "format_address", (PyCFunction) archpy_format_address,
+    METH_VARARGS | METH_KEYWORDS,
+    "format_address (ADDRESS) -> String.\n\
+Format ADDRESS, an address within the currently selected inferior's\n\
+address space, as a string. The format of the returned string is\n\
+'ADDRESS <SYMBOL+OFFSET>' without the quotes." },
   {NULL} /* Sentinel */
 };
 
diff --git a/gdb/testsuite/gdb.python/py-arch.exp b/gdb/testsuite/gdb.python/py-arch.exp
index b55778b0b72..c4854033d8c 100644
--- a/gdb/testsuite/gdb.python/py-arch.exp
+++ b/gdb/testsuite/gdb.python/py-arch.exp
@@ -127,3 +127,18 @@ foreach a $arch_names b $py_arch_names {
     }
 }
 gdb_assert { $lists_match }
+
+# Check the gdb.Architecture.format_address method.
+set main_addr [get_hexadecimal_valueof "&main" "UNKNOWN"]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($main_addr))" \
+    "Got: $main_addr <main>" \
+    "gdb.Architecture.format_address, result should have no offset"
+set next_addr [format 0x%x [expr $main_addr + 1]]
+gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address($next_addr))" \
+    "Got: $next_addr <main\\+1>" \
+    "gdb.Architecture.format_address, result should have an offset"
+if {![is_address_zero_readable]} {
+    gdb_test "python print(\"Got: \" + gdb.selected_inferior().architecture().format_address(0))" \
+	"Got: 0x0" \
+	"gdb.Architecture.format_address for address 0"
+}