public inbox for libabigail@sourceware.org
From: Dodji Seketeli <dodji@seketeli.org>
To: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
Cc: "Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org>
Subject: Re: [PATCH] CTF as a fallback when no DWARF debug info is present
Date: Thu, 06 Oct 2022 09:42:13 +0200
Message-ID: <86mta9bdpm.fsf@seketeli.org>
In-Reply-To: <568fe730-3bb9-0267-00bc-2873e94e502f@oracle.com> (Guillermo E. Martinez's message of "Tue, 4 Oct 2022 18:13:53 -0500")

Hello Guillermo,

"Guillermo E. Martinez" <guillermo.e.martinez@oracle.com> writes:

[...]

>> I have also introduced a new function called
>> tools_utils::dir_contains_ctf_archive to look for a file that ends with
>> ".ctfa".  This abstracts away the search for "vmlinux.ctfa" as I wasn't
>> sure if those archives could exist for normal (non-kernel) binaries as
>> well:
>
> Ohh, perfect! I'll use it in the CTF reader to locate the Linux archive file.

ACK.

[...]

>>      @@ -2525,8 +2542,12 @@ get_binary_paths_from_kernel_dist(const string&	dist_root,
>>       /// @param t time to trace time spent in each step.
>>       ///
>>       /// @param env the environment to create the corpus_group in.
>>      -static void
>>      -maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>>      +///
>>      +/// @return the status of the loading.  If it's
>>      +/// abigail::elf_reader::STATUS_UNKNOWN, then it means nothing was
>>      +/// done, meaning the function got out early.
>>      +static abigail::elf_reader::status
>>      +maybe_load_vmlinux_dwarf_corpus(corpus::origin&     origin,
>>                                       corpus_group_sptr&  group,
>>                                       const string&       vmlinux,
>>                                       vector<string>&     modules,
>>      @@ -2539,10 +2560,11 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>>                                       timer&              t,
>>                                       environment_sptr&   env)
>>       {
>>      +  abigail::elf_reader::status status = abigail::elf_reader::STATUS_UNKNOWN;
>>      +
>>         if (!(origin & corpus::DWARF_ORIGIN))
>>      -    return;
>>      +    return status;
>>
>>      -  abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
>>         dwarf_reader::read_context_sptr ctxt;
>>         ctxt =
>>          dwarf_reader::create_read_context(vmlinux, di_roots, env.get(),
>>      @@ -2569,6 +2591,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>>            << vmlinux << "' ...\n" << std::flush;
>>
>>         // Read the vmlinux corpus and add it to the group.
>>      +  status = abigail::elf_reader::STATUS_OK;
>>         t.start();
>>         read_and_add_corpus_to_group_from_elf(*ctxt, *group, status);
>>         t.stop();
>>      @@ -2579,7 +2602,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>>            << t << "\n";
>>
>
> At this point if `vmlinux' file doesn't have DWARF information, the `status'
> returned by `maybe_load_vmlinux_dwarf_corpus' will set the bit field
> `STATUS_DEBUG_INFO_NOT_FOUND', but it is not verified here, and since vmlinux
> corpus was already added into the group in `read_debug_info_into_corpus'
> function, it continues processing modules without the main corpus information,

I see.  You are right.  Yes, the debug info is not found in vmlinux and yet the
whole thing continues, collecting just information from the ELF symbol
table, basically, and from the modules.  Pretty useless, I guess.

> Is this the expected behaviour?

Hehe, no :-)

I guess maybe the caller should look for the .debug_info section in the
vmlinux binary (or for split debug info), prior to even calling
maybe_load_vmlinux_dwarf_corpus.  If there is no debug info, then the
function should proceed directly to calling
maybe_load_vmlinux_ctf_corpus?  What do you think?
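
To make the idea concrete, here is a small sketch of a caller that picks
the front-end before any corpus is built.  Everything in it
(probe_result, front_end, choose_front_end) is hypothetical and only
models the decision; how the probe itself detects DWARF or CTF is left
out, and returning "none" when CTF was requested but is absent is an
assumption of this example.

```cpp
namespace sketch
{
// What a cheap, up-front probe of the input might report (hypothetical).
struct probe_result
{
  bool has_dwarf; // .debug_info in the binary, or split debug info found
  bool has_ctf;   // a .ctf section, or a .ctfa archive next to vmlinux
};

enum class front_end { dwarf, ctf, none };

// Decide which reader to run, so that a DWARF-less vmlinux never gets
// added to the corpus group by the DWARF path in the first place.
inline front_end
choose_front_end(const probe_result& probe, bool ctf_requested)
{
  if (ctf_requested)
    return probe.has_ctf ? front_end::ctf : front_end::none;
  if (probe.has_dwarf)
    return front_end::dwarf;
  if (probe.has_ctf)
    return front_end::ctf; // fallback: no DWARF, but CTF is available
  return front_end::none;
}
} // namespace sketch
```

With a decision like this made first, the caller would only invoke
maybe_load_vmlinux_dwarf_corpus when DWARF is known to be present, and
go straight to maybe_load_vmlinux_ctf_corpus otherwise.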

[...]

>> I have also introduced a new function called
>> tools_utils::dir_contains_ctf_archive to look for a file that ends with
>> ".ctfa".  This abstracts away the search for "vmlinux.ctfa" as I wasn't
>> sure if those archives could exist for normal (non-kernel) binaries as
>> well:
>
> Ohh, perfect! I'll use it in the CTF reader to locate the Linux archive file.
> No, there is no `.ctfa' file for non-kernel binaries; instead they have a `.ctf'
> section. I could implement a similar function to look for the `.ctf' section
> using the elf helpers

Right, abg-elf-helpers.h does have find_section_by_name.  That can be
used to look for the debug info, I guess.  However, we also need to
support finding the debug info when it's split out into a different
place, like when it's packaged in a separate debug-info package.  Today,
abg-dwarf-reader.cc uses dwfl (dwarf front-end library, I believe) to do
this, as dwfl knows how to find the DWARF debug info, wherever it is.
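
For readers unfamiliar with what a by-name section lookup involves, here
is a self-contained sketch that walks the section header table of an
in-memory ELF image.  It is not find_section_by_name and does not use
libelf; has_section and read_le are hypothetical names.  It assumes a
64-bit little-endian ELF, ignores extended section numbering, and does
nothing about split debug info, which is the part dwfl handles.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

namespace sketch
{
typedef std::vector<unsigned char> image;

// Read a little-endian unsigned integer at OFF; false if out of bounds.
template <typename T>
bool
read_le(const image& img, std::uint64_t off, T& out)
{
  if (off > img.size() || img.size() - off < sizeof(T))
    return false;
  std::uint64_t v = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i)
    v |= static_cast<std::uint64_t>(img[static_cast<std::size_t>(off) + i])
         << (8 * i);
  out = static_cast<T>(v);
  return true;
}

// True if the ELF64/LE image has a section named WANTED (e.g. ".ctf").
bool
has_section(const image& img, const std::string& wanted)
{
  // ELF magic, then EI_CLASS = ELFCLASS64 (2), EI_DATA = ELFDATA2LSB (1).
  static const unsigned char ident[6] = {0x7f, 'E', 'L', 'F', 2, 1};
  if (img.size() < 64 || std::memcmp(img.data(), ident, sizeof(ident)) != 0)
    return false;

  std::uint64_t shoff = 0;
  std::uint16_t shentsize = 0, shnum = 0, shstrndx = 0;
  if (!read_le(img, 0x28, shoff) || !read_le(img, 0x3a, shentsize)
      || !read_le(img, 0x3c, shnum) || !read_le(img, 0x3e, shstrndx))
    return false;
  if (shentsize < 64 || shstrndx >= shnum)
    return false;

  // Offset of the I-th section header.
  auto hdr = [&](std::uint16_t i)
  { return shoff + std::uint64_t(i) * shentsize; };

  // Locate the section-name string table (sh_offset, sh_size).
  std::uint64_t str_off = 0, str_size = 0;
  if (!read_le(img, hdr(shstrndx) + 0x18, str_off)
      || !read_le(img, hdr(shstrndx) + 0x20, str_size))
    return false;
  if (str_off > img.size() || str_size > img.size() - str_off)
    return false;

  for (std::uint16_t i = 0; i < shnum; ++i)
    {
      std::uint32_t name = 0; // sh_name: index into the string table
      if (!read_le(img, hdr(i), name))
        return false;
      if (name >= str_size)
        continue;
      const char* s =
        reinterpret_cast<const char*>(img.data()) + str_off + name;
      const std::size_t max = static_cast<std::size_t>(str_size - name);
      const void* nul = std::memchr(s, 0, max);
      const std::size_t len =
        nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - s) : max;
      if (std::string(s, len) == wanted)
        return true;
    }
  return false;
}
} // namespace sketch
```

A check of this kind only answers "is the section in this file?".  It
says nothing about debug info shipped in a separate package.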

You can see how this is done in read_context::load_debug_info(), in
abg-dwarf-reader.cc, around line 2654.  Look for the comment "Look for
split debuginfo files".  Basically, dwfl_module_getdwarf returns a
pointer to the debug info it's found, if it has found one.  I think we
should split this logic out to make it re-usable somehow.

If you think this is worthwhile, I can look into splitting it out and
sticking it into elf-helpers, maybe?


> and it can be used in `load_corpus_and_write_abixml',
> implementing an algorithm similar to the one used when processing the kernel:
> look for DWARF information and, if it is not present, test whether a
> `.ctf' section is in the ELF file, then extract it using the CTF reader.
> This avoids duplicating the use of:
>
> abigail::ctf_reader::read_context_sptr ctxt
> 		= abigail::ctf_reader::create_read_context(opts.in_file_path,
> 							   opts.prepared_di_root_paths,
> 							   env.get());
>
> One for `opts.use_ctf' and the other for when `STATUS_DEBUG_INFO_NOT_FOUND' is returned.
> WDYT?

Yes, along with the testing for the presence of DWARF debug info, that
might be useful, indeed.
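
A minimal sketch of that fallback, with the CTF read factored into a
single callable so it is written once and reached from both paths.  The
status bits mirror the names used in this thread but their values here
are illustrative, and read_with_fallback, reader and the has_ctf
predicate are hypothetical names, not libabigail API.

```cpp
#include <functional>
#include <string>

namespace sketch
{
// Illustrative bit values; only the names come from the thread.
enum status_bits : unsigned
{
  STATUS_UNKNOWN = 0,
  STATUS_OK = 1u << 0,
  STATUS_DEBUG_INFO_NOT_FOUND = 1u << 1,
  STATUS_NO_SYMBOLS_FOUND = 1u << 2
};

// A reader takes the input path and returns a status bitmask.
typedef std::function<unsigned(const std::string&)> reader;
typedef std::function<bool(const std::string&)> predicate;

inline unsigned
read_with_fallback(const std::string& path,
                   bool use_ctf,
                   const reader& read_dwarf,
                   const reader& read_ctf,
                   const predicate& has_ctf)
{
  // Explicit request: go straight to the CTF reader.
  if (use_ctf)
    return read_ctf(path);

  unsigned st = read_dwarf(path);

  // No debug info, but ELF symbols were found and the input carries
  // CTF: retry with the same CTF reader instead of duplicating it.
  if ((st & STATUS_DEBUG_INFO_NOT_FOUND)
      && !(st & STATUS_NO_SYMBOLS_FOUND)
      && has_ctf(path))
    st = read_ctf(path);

  return st;
}
} // namespace sketch
```

Because this function is only one level deep, there is no need to test
!use_ctf again inside the DWARF branch.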

[...]

>> But then, it's here that we are going to inspect c1_status to see if
>> loading DWARF failed.  If it failed, then we'll try to load CTF.  So,
>> here is the change I am adding to the process of loading the corpus c1:
>>
>>
>> @@ -1205,6 +1208,36 @@ main(int argc, char* argv[])
>>                   set_suppressions(*ctxt, opts);
>>                   abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
>>                   c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
>> +
>> +#ifdef WITH_CTF
>> +		if (// We were not instructed to use CTF ...
>> +		    !opts.use_ctf
>
> This is always true, because we are in the else block of `opts.use_ctf'.

Right.

>
>> +		    // ... and yet, no debug info was found ...
>> +		    && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)
>> +		    // ... but we found ELF symbols ...
>> +		    && !(c1_status & STATUS_NO_SYMBOLS_FOUND))
>> +		  {
>> +		    string basenam, dirnam;
>> +		    base_name(opts.file1, basenam);
>> +		    dir_name(opts.file1, dirnam);
>> +		    // ... the input file seems to contain CTF
>> +		    // archive, so let's try to see if the file
>> +		    // contains a CTF archive, who knows ...
>> +		    if (dir_contains_ctf_archive(dirnam, basenam))
>
> Non-kernel binaries contain a `.ctf' section instead of a `.ctfa' file,
> so I can implement a `file_contains_ctf_section' function to test if
> it is a valid input file for the CTF reader.

Great, thanks.

OK, I'll look into trying to put together some facility to look for the
presence of DWARF debug info, so tools can decide ahead of time what
front-end to use.

[...]

Cheers,

-- 
		Dodji
