From: Mark Wielaard <mark@klomp.org>
To: Aaron Merey <amerey@redhat.com>, elfutils-devel@sourceware.org
Subject: Re: [PATCH] readelf: Support .gdb_index version 9
Date: Thu, 02 Nov 2023 13:53:48 +0100 [thread overview]
Message-ID: <9a13a2cee1bff48ae7121515ccef4bcd5b047601.camel@klomp.org> (raw)
In-Reply-To: <20231031203312.536219-1-amerey@redhat.com>
Hi Aaron,
On Tue, 2023-10-31 at 16:33 -0400, Aaron Merey wrote:
> I'd like to merge this patch before the next release. Unless anyone objects
> I'll merge it by Friday Nov 3.
Looks good to me. Some small comments below.
> Commit message:
BTW if you add comments like the above after the first "scissors" ---
line, they wont show up in the actually commit message when applying
the email message (that is why the stat output is put there).
> Version 9 adds a "shortcut table" to the index. The shortcut table contains
> the name and language of the main function, if it exists.
>
> A testcase added in this patch uses an executable written with Fortran.
> This is because gdb does not currently populate the shortcut table of
> C/C++ programs (see sourceware PR30996).
>
> Signed-off-by: Aaron Merey <amerey@redhat.com>
> ---
> src/readelf.c | 66 +++++++++++++++-
> tests/Makefile.am | 3 +-
> tests/run-readelf-gdb_index.sh | 95 +++++++++++++++++++++++-
> tests/testfilegdbindex9-no-maininfo.bz2 | Bin 0 -> 3502 bytes
> tests/testfilegdbindex9.bz2 | Bin 0 -> 4266 bytes
> 5 files changed, 159 insertions(+), 5 deletions(-)
> create mode 100755 tests/testfilegdbindex9-no-maininfo.bz2
> create mode 100755 tests/testfilegdbindex9.bz2
>
> diff --git a/src/readelf.c b/src/readelf.c
> index db31ad09..a28f6236 100644
> --- a/src/readelf.c
> +++ b/src/readelf.c
> @@ -11539,8 +11539,9 @@ print_gdb_index_section (Dwfl_Module *dwflmod, Ebl *ebl,
> // hash used for generating the table. Version 6 contains symbols
> // for inlined functions, older versions didn't. Version 7 adds
> // symbol kinds. Version 8 just indicates that it correctly includes
> - // TUs for symbols.
> - if (vers < 4 || vers > 8)
> + // TUs for symbols. Version 9 adds shortcut table for information
> + // regarding the main function.
> + if (vers < 4 || vers > 9)
> {
> printf (_(" unknown version, cannot parse section\n"));
> return;
OK. BTW the description of the gdb_index at the top
https://sourceware.org/gdb/current/onlinedocs/gdb/Index-Section-Format.html
doesn't resolve anymore. It is now
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Index-Section-Format.html
Maybe we should report that to the gdb project.
> @@ -11578,6 +11579,17 @@ print_gdb_index_section (Dwfl_Module *dwflmod, Ebl *ebl,
> if (unlikely (readp + 4 > dataend))
> goto invalid_data;
>
> + uint32_t shortcut_off = 0;
> + if (vers >= 9)
> + {
> + shortcut_off = read_4ubyte_unaligned (dbg, readp);
> + printf (_(" shortcut offset: %#" PRIx32 "\n"), shortcut_off);
> +
> + readp += 4;
> + if (unlikely (readp + 4 > dataend))
> + goto invalid_data;
> + }
> +
> uint32_t const_off = read_4ubyte_unaligned (dbg, readp);
> printf (_(" constant offset: %#" PRIx32 "\n"), const_off);
OK
> @@ -11675,8 +11687,19 @@ print_gdb_index_section (Dwfl_Module *dwflmod, Ebl *ebl,
> if (const_off >= data->d_size)
> goto invalid_data;
>
> + const unsigned char *shortcut_start = NULL;
> + if (vers >= 9)
> + {
> + if (shortcut_off >= data->d_size)
> + goto invalid_data;
> +
> + shortcut_start = data->d_buf + shortcut_off;
> + nextp = shortcut_start;
> + }
> + else
> + nextp = const_start;
> +
> readp = data->d_buf + sym_off;
> - nextp = const_start;
> size_t sym_nr = (nextp - readp) / 8;
>
> printf (_("\n Symbol table at offset %#" PRIx32
OK
> @@ -11750,6 +11773,43 @@ print_gdb_index_section (Dwfl_Module *dwflmod, Ebl *ebl,
> }
> n++;
> }
> +
> + if (vers < 9)
> + return;
> +
> + if (unlikely (shortcut_start == NULL))
> + goto invalid_data;
> +
> + readp = shortcut_start;
> + nextp = const_start;
> + size_t shortcut_nr = (nextp - readp) / 4;
> +
> + if (unlikely (shortcut_nr != 2))
> + goto invalid_data;
OK. So you just expect two entries.
> + printf (_("\nShortcut table at offset %#" PRIx32 " contains %zu slots:\n"),
> + shortcut_off, shortcut_nr);
That is a fancy way to print "contains 2 slots" :)
> + uint32_t lang = read_4ubyte_unaligned (dbg, readp);
> + readp += 4;
> +
> + printf (_("Language of main: %s\n"), dwarf_lang_name (lang));
Note that dwarf_lang_name calls string_or_unknown with false for
print_unknown_num so it might make sense to either flip that to true
(but there is probably a reason it doesn't print the hex num) or to
always add the hex num after the name so the user doesn't just get ???
on an unknown DWARF_LANG constant.
> + printf (_("Name of main: "));
> +
> + if (lang != 0)
> + {
> + uint32_t name = read_4ubyte_unaligned (dbg, readp);
> + readp += 4;
> + const unsigned char *sym = const_start + name;
> +
> + if (unlikely ((size_t) (dataend - const_start) < name
> + || memchr (sym, '\0', dataend - sym) == NULL))
> + goto invalid_data;
Good end of string check.
BTW. DW_LANG constants are going away with DWARF6:
https://dwarfstd.org/languages-v6.html
> + printf ("%s\n", sym);
> + }
> + else
> + printf ("<unknown>\n");
> }
>
> /* Returns true and sets split DWARF CU id if there is a split compile
OK
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index ef5b6bb5..e8bc9058 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -421,7 +421,8 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \
> run-readelf-Dd.sh \
> testfile-s390x-hash-both.bz2 \
> run-readelf-gdb_index.sh testfilegdbindex5.bz2 \
> - testfilegdbindex7.bz2 \
> + testfilegdbindex7.bz2 testfilegdbindex9.bz2 \
> + testfilegdbindex9-no-maininfo.bz2 \
> run-readelf-s.sh testfilebazdbg.bz2 testfilebazdyn.bz2 \
> testfilebazmin.bz2 testfilebazdbg.debug.bz2 testfilebazmdb.bz2 \
> testfilebaztab.bz2 testfilebasmin.bz2 testfilebaxmin.bz2 \
> diff --git a/tests/run-readelf-gdb_index.sh b/tests/run-readelf-gdb_index.sh
> index fcbc3c57..95367ef8 100755
> --- a/tests/run-readelf-gdb_index.sh
> +++ b/tests/run-readelf-gdb_index.sh
> @@ -63,7 +63,7 @@
> # (gdb) save gdb-index .
> # objcopy --add-section .gdb_index=testfilegdbindex7.gdb-index --set-section-flags .gdb_index=readonly testfilegdbindex7 testfilegdbindex7
>
> -testfiles testfilegdbindex5 testfilegdbindex7
> +testfiles testfilegdbindex5 testfilegdbindex7 testfilegdbindex9 testfilegdbindex9-no-maininfo
>
> testrun_compare ${abs_top_builddir}/src/readelf --debug-dump=gdb_index testfilegdbindex5 <<\EOF
>
> @@ -127,4 +127,97 @@ GDB section [33] '.gdb_index' at offset 0xe76 contains 8399 bytes :
> [ 754] symbol: int, CUs: 0 (type:S)
> EOF
>
> +# testfilegdbindex9-no-maininfo is built the same way as testfilegdbindex7.
> +testrun_compare ${abs_top_builddir}/src/readelf --debug-dump=gdb_index testfilegdbindex9-no-maininfo <<\EOF
> +
> +GDB section [33] '.gdb_index' at offset 0x38e1 contains 8415 bytes :
> + Version: 9
> + CU offset: 0x1c
> + TU offset: 0x3c
> + address offset: 0x54
> + symbol offset: 0x7c
> + shortcut offset: 0x207c
> + constant offset: 0x2084
> +
> + CU list at offset 0x1c contains 2 entries:
> + [ 0] start: 0x00004c, length: 220
> + [ 1] start: 0x000128, length: 214
> +
> + TU list at offset 0x3c contains 1 entries:
> + [ 0] CU offset: 0, type offset: 30, signature: 0x87e03f92cc37cdf0
> +
> + Address list at offset 0x54 contains 2 entries:
> + [ 0] 0x0000000000401106 <main>..0x000000000040113b <main+0x35>, CU index: 1
> + [ 1] 0x000000000040113c <hello>..0x0000000000401173 <say+0x1c>, CU index: 2
> +
> + Symbol table at offset 0x54 contains 1024 slots:
> + [ 123] symbol: global, CUs: 1 (var:G), 0T (var:G)
> + [ 489] symbol: main, CUs: 1 (func:G)
> + [ 518] symbol: char, CUs: 0 (type:S)
> + [ 661] symbol: foo, CUs: 0 (type:S)
> + [ 741] symbol: hello, CUs: 1 (var:S), 0T (func:S)
> + [ 746] symbol: say, CUs: 0T (func:G)
> + [ 754] symbol: int, CUs: 1 (type:S)
> +
> +Shortcut table at offset 0x207c contains 2 slots:
> +Language of main: ???
> +Name of main: <unknown>
> +EOF
This seems an unfortunate example. Why is the language unknown?
But maybe it is a nice example to show why you should at least print
the hex num of the language?
> +# testfilegdbindex9.f90
> +#
> +# program repro
> +# type small_stride
> +# character*40 long_string
> +# integer small_pad
> +# end type small_stride
> +# type(small_stride), dimension (20), target :: unpleasant
> +# character*40, pointer, dimension(:):: c40pt
> +# integer i
> +# do i = 0,19
> +# unpleasant(i+1)%small_pad = i+1
> +# unpleasant(i+1)%long_string = char (ichar('0') + i)
> +# end do
> +# c40pt => unpleasant%long_string
> +# print *, c40pt
> +#end program repro
> +
> +# gfortran -g -o testfilegdbindex9 testfilegdbindex9.f90
> +# gdb-add-index testfilegdbindex9
> +
> +testrun_compare ${abs_top_builddir}/src/readelf --debug-dump=gdb_index testfilegdbindex9 <<\EOF
> +
> +GDB section [35] '.gdb_index' at offset 0x37d9 contains 8395 bytes :
> + Version: 9
> + CU offset: 0x1c
> + TU offset: 0x2c
> + address offset: 0x2c
> + symbol offset: 0x40
> + shortcut offset: 0x2040
> + constant offset: 0x2048
> +
> + CU list at offset 0x1c contains 1 entries:
> + [ 0] start: 00000000, length: 307
> +
> + TU list at offset 0x2c contains 0 entries:
> +
> + Address list at offset 0x2c contains 1 entries:
> + [ 0] 0x0000000000401166 <MAIN__>..0x00000000004013f0 <main+0x3a>, CU index: 0
> +
> + Symbol table at offset 0x2c contains 1024 slots:
> + [ 61] symbol: small_stride, CUs: 0 (type:S)
> + [ 71] symbol: integer(kind=8), CUs: 0 (type:S)
> + [ 161] symbol: character(kind=1), CUs: 0 (type:S)
> + [ 397] symbol: unpleasant, CUs: 0 (var:S)
> + [ 489] symbol: main, CUs: 0 (func:G)
> + [ 827] symbol: integer(kind=4), CUs: 0 (type:S)
> + [ 858] symbol: c40pt, CUs: 0 (var:S)
> + [ 965] symbol: repro, CUs: 0 (func:S)
> + [1016] symbol: i, CUs: 0 (var:S)
> +
> +Shortcut table at offset 0x2040 contains 2 slots:
> +Language of main: Fortran08
> +Name of main: repro
> +EOF
Nice.
next prev parent reply other threads:[~2023-11-02 12:53 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-31 20:33 Aaron Merey
2023-11-02 12:53 ` Mark Wielaard [this message]
2023-11-02 14:20 ` Frank Ch. Eigler
2023-11-02 16:30 ` Aaron Merey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9a13a2cee1bff48ae7121515ccef4bcd5b047601.camel@klomp.org \
--to=mark@klomp.org \
--cc=amerey@redhat.com \
--cc=elfutils-devel@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).