public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
To: Dodji Seketeli <dodji@seketeli.org>,
	"Guillermo E. Martinez via Libabigail"
	<libabigail@sourceware.org>
Subject: Re: [PATCH] CTF as a fallback when no DWARF debug info is present
Date: Tue, 4 Oct 2022 18:13:53 -0500	[thread overview]
Message-ID: <568fe730-3bb9-0267-00bc-2873e94e502f@oracle.com> (raw)
In-Reply-To: <877d1g2c52.fsf@seketeli.org>



On 10/4/22 04:04, Dodji Seketeli wrote:
> Hello Guillermo,
>

HI Dodji,
  
> Thank you for this patch.
> 
Your welcome!.

> Please find my comments below.
> 
> "Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
> écrit:
> 
> 
>> By default, `abidw', `abidiff', `abipkgdiff' and `kmidiff' tools use
>> debug information in `DWARF` format, if present, otherwise now
>> automatically the tools try to extract and build the binaries IR using
>> debug information in `CTF` format without use of `--ctf' option, if
>> present, finally, if neither is found, they use only `ELF` symbols to
>> extract, build, compare and report which of them were added or removed.
>>
>> In case of neither DWARF nor CTF debug information is present
>> (`STATUS_DEBUG_INFO_NOT_FOUND') and `--ctf' option was not passed in
>> the command line, those tools use the symbol object build by DWARF
>> reader (default) to completed  theirs task.
>>
>> Tools don't allow comparing corpora built with between different debug
>> information i.e DWARF vs CTF, the first corpus built dictate the
>> expected debug information in the second one.
>>
>> This work for libraries and Linux kernel.  The `--ctf' option is
>> preserved to explicitly indicate to those tools that we want to use
>> CTF support.
>>
>> 	* doc/manuals/abidiff.rst: Adjust usage tool information
>> 	to indicates fallback for CTF debug info when DWARF info
>> 	is not present.
>> 	* doc/manuals/abidw.rst: Likewise.
>> 	* doc/manuals/abipkgdiff.rst: Likewise.
>> 	* doc/manuals/kmidiff.rst: Likewise.
> 
> I have made some slight fixes to changes to these doc files.  Nothing
> major you'll see the changes in the updated version of the patch that I
> am posting at the end of this message.
> 

OK.

> [...]
> 
>> 	* src/abg-ctf-reader.cc (slurp_elf_info): Report status
>> 	when debug information is not present.
>> 	(read_corpus): Add code to locate `vmlinux.ctfa' just when
>> 	`main corpus' is being processed.
>> 	* src/abg-tools-utils.cc (maybe_load_vmlinux_dwarf_corpus):
>> 	Replace by a reference `origin' argument to notify whether
>> 	CTF reader needs to be executed.
>> 	* tests/data/test-diff-pkg-ctf/dirpkg-3-report-2.txt: Adjust
>> 	test case.
>> 	* tests/test-diff-pkg.cc (in_out_specs): Add `--ctf' option
>> 	in test case.
>> 	* tools/abidiff.cc (main): Add `origin' depending of command
>> 	line argument by default it is `corpus::DWARF_ORIGIN'. Add
>> 	`dwarf_corpus' to be used as default corpus (containing ELF
>> 	symbols), when DWARF nor CTF debug information was found.
>> 	* tools/abidw.cc: Likewise.
>> 	* tools/abipkgdiff.cc: Likewise.
>>
>> Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
>> ---
>>   doc/manuals/abidiff.rst                       |  15 +--
>>   doc/manuals/abidw.rst                         |  15 ++-
>>   doc/manuals/abipkgdiff.rst                    |  13 +-
>>   doc/manuals/kmidiff.rst                       |   9 +-
>>   src/abg-ctf-reader.cc                         |  33 +++--
>>   src/abg-tools-utils.cc                        |  22 +++-
>>   .../test-diff-pkg-ctf/dirpkg-3-report-2.txt   |  16 +++
>>   tests/test-diff-pkg.cc                        |   2 +-
>>   tools/abidiff.cc                              |  82 ++++++++-----
>>   tools/abidw.cc                                |  54 +++++---
>>   tools/abipkgdiff.cc                           | 116 +++++++++++-------
>>   11 files changed, 257 insertions(+), 120 deletions(-)
>>
>> diff --git a/doc/manuals/abidiff.rst b/doc/manuals/abidiff.rst
>> index 0c711d9e..caeceded 100644
>> --- a/doc/manuals/abidiff.rst
>> +++ b/doc/manuals/abidiff.rst
>> @@ -12,11 +12,12 @@ This tool can also compare the textual representations of the ABI of
>>   two ELF binaries (as emitted by ``abidw``) or an ELF binary against a
>>   textual representation of another ELF binary.
>>   
>> -For a comprehensive ABI change report that includes changes about
>> -function and variable sub-types, the two input shared libraries must
>> -be accompanied with their debug information in `DWARF`_ format.
>> -Otherwise, only `ELF`_ symbols that were added or removed are
>> -reported.
>> +For a comprehensive ABI change report between two input shared
>> +libraries that includes changes about function and variable sub-types,
>> +``abidiff`` uses by default, debug information in `DWARF`_ format, if
>> +present, otherwise it compares interfaces using debug information in
>> +`CTF`_ format, if present, finally, if neither is found, it uses only
>> +`ELF`_ symbols to report which of them were added or removed.
>>   
>>   .. include:: tools-use-libabigail.txt
>>   
>> @@ -558,7 +559,7 @@ Options
>>   
>>     * ``--ctf``
>>   
>> -    When comparing binaries, extract ABI information from CTF debug
>> +    When comparing binaries, extract ABI information from `CTF`_ debug
>>       information, if present.
>>   
>>     * ``--stats``
>> @@ -785,4 +786,4 @@ Usage examples
>>   
>>   .. _ELF: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>>   .. _DWARF: http://www.dwarfstd.org
>> -
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/doc/manuals/abidw.rst b/doc/manuals/abidw.rst
>> index a3055c7e..20948805 100644
>> --- a/doc/manuals/abidw.rst
>> +++ b/doc/manuals/abidw.rst
>> @@ -8,8 +8,7 @@ representation of its ABI to standard output.  The emitted
>>   representation format, named ``ABIXML``, includes all the globally
>>   defined functions and variables, along with a complete representation
>>   of their types.  It also includes a representation of the globally
>> -defined ELF symbols of the file.  The input shared library must
>> -contain associated debug information in `DWARF`_ format.
>> +defined ELF symbols of the file.
>>   
>>   When given the ``--linux-tree`` option, this program can also handle a
>>   `Linux kernel`_ tree.  That is, a directory tree that contains both
>> @@ -19,8 +18,13 @@ interface between the kernel and its module, to standard output.  In
>>   this case, we don't call it an ABI, but a KMI (Kernel Module
>>   Interface).  The emitted KMI includes all the globally defined
>>   functions and variables, along with a complete representation of their
>> -types.  The input binaries must contain associated debug information
>> -in `DWARF`_ format.
>> +types.
>> +
>> +To generate either ABI or KMI representation, by default ``abidw``
>> +uses debug information in `DWARF`_ format, if present, otherwise it
>> +looks for debug information in `CTF`_ format, if present, finally, if
>> +neither is found, it uses only `ELF`_ symbols to report which of them
>> +were added or removed.
>>   
>>   .. include:: tools-use-libabigail.txt
>>   
>> @@ -326,7 +330,7 @@ Options
>>   
>>     * ``--ctf``
>>   
>> -    Extract ABI information from CTF debug information, if present in
>> +    Extract ABI information from `CTF`_ debug information, if present in
>>       the given object.
>>   
>>     *  ``--annotate``
>> @@ -365,3 +369,4 @@ standard `here
>>   .. _DWARF: http://www.dwarfstd.org
>>   .. _GNU: http://www.gnu.org
>>   .. _Linux Kernel: https://kernel.org/
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/doc/manuals/abipkgdiff.rst b/doc/manuals/abipkgdiff.rst
>> index 9114775a..771bb034 100644
>> --- a/doc/manuals/abipkgdiff.rst
>> +++ b/doc/manuals/abipkgdiff.rst
>> @@ -13,12 +13,18 @@ binaries.
>>   For a comprehensive ABI change report that includes changes about
>>   function and variable sub-types, the two input packages must be
>>   accompanied with their debug information packages that contain debug
>> -information either in `DWARF`_ or in `CTF` formats.  Please note
>> +information either in `DWARF`_ or in `CTF`_ formats.  Please note
>>   however that some packages contain binaries that embed the debug
>>   information directly in a section of said binaries.  In those cases,
>>   obviously, no separate debug information package is needed as the tool
>>   will find the debug information inside the binaries.
>>   
>> +By default, ``abipkgdiff`` uses debug information in `DWARF`_ format,
>> +if present, otherwise it compares binaries interfaces using debug
>> +information in `CTF`_ format, if present, finally, if neither is
>> +found, it uses only `ELF`_ symbols to report which of them were added
>> +or removed.
>> +
>>   .. include:: tools-use-libabigail.txt
>>   
>>   .. _abipkgdiff_invocation_label:
>> @@ -525,8 +531,8 @@ Options
>>   
>>     * ``--ctf``
>>   
>> -     This is used to compare packages with CTF debug information, if
>> -     present.
>> +     This is used to compare packages with `CTF`_ debug information,
>> +     if present.
>>   
>>   .. _abipkgdiff_return_value_label:
>>   
>> @@ -546,4 +552,5 @@ In the later case, the value of the exit code is the same as for the
>>   .. _Deb: https://en.wikipedia.org/wiki/Deb_%28file_format%29
>>   .. _tar: https://en.wikipedia.org/wiki/Tar_%28computing%29
>>   .. _DWARF: http://www.dwarfstd.org
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>>   .. _Development Package: https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Devel_Packages
>> diff --git a/doc/manuals/kmidiff.rst b/doc/manuals/kmidiff.rst
>> index 53010189..a27d2456 100644
>> --- a/doc/manuals/kmidiff.rst
>> +++ b/doc/manuals/kmidiff.rst
>> @@ -74,6 +74,11 @@ functions and variables) between the Kernel and its modules.  In
>>   practice, though, some users might want to compare a subset of the
>>   those interfaces.
>>   
>> +By default, ``kmidiff`` uses debug information in `DWARF`_ format,
>> +if present, otherwise it compares interfaces using debug information
>> +in `CTF`_ format, if present, finally, if neither is found, it uses
>> +only `ELF`_ symbols to report which were added or removed.
>> +
>>   Users can then define a "white list" of the interfaces to compare.
>>   Such a white list is a just a file in the "INI" format that looks
>>   like: ::
>> @@ -174,7 +179,7 @@ Options
>>   
>>     * ``--ctf``
>>   
>> -    Extract ABI information from CTF debug information, if present in
>> +    Extract ABI information from `CTF`_ debug information, if present in
>>       the Kernel and Modules.
>>   
>>     * ``--impacted-interfaces | -i``
>> @@ -242,3 +247,5 @@ Options
>>   .. _ELF: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>>   .. _ksymtab: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>>   .. _Linux Kernel: https://kernel.org
>> +.. _DWARF: http://www.dwarfstd.org
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
>> index 9148a646..b3cde365 100644
>> --- a/src/abg-ctf-reader.cc
>> +++ b/src/abg-ctf-reader.cc
>> @@ -45,6 +45,7 @@ namespace ctf_reader
>>   using std::dynamic_pointer_cast;
>>   using abigail::tools_utils::dir_name;
>>   using abigail::tools_utils::file_exists;
>> +using abigail::tools_utils::base_name;
>>   
>>   class read_context
>>   {
>> @@ -1545,7 +1546,12 @@ slurp_elf_info(read_context *ctxt,
>>     corp->set_architecture_name(elf_helpers::e_machine_to_string(ehdr->e_machine));
>>   
>>     find_alt_debuginfo(ctxt, &ctf_dbg_scn);
>> -  ABG_ASSERT(ctxt->symtab);
>> +  if (!ctxt->symtab)
>> +    {
>> +      status |= elf_reader::STATUS_NO_SYMBOLS_FOUND;
>> +      return;
>> +    }
>> +
>>     corp->set_symtab(ctxt->symtab);
>>   
>>     if ((corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
>> @@ -1564,7 +1570,8 @@ slurp_elf_info(read_context *ctxt,
>>           ctf_scn = ctf_dbg_scn;
>>         else
>>           {
>> -          status |= elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
>> +          status |= (elf_reader::STATUS_OK |
>> +                     elf_reader::STATUS_DEBUG_INFO_NOT_FOUND);
>>             return;
>>           }
>>       }
>> @@ -1676,15 +1683,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
>>       origin |= corpus::LINUX_KERNEL_BINARY_ORIGIN;
>>     corp->set_origin(origin);
>>   
>> -  if (ctxt->cur_corpus_group_)
>> -    ctxt->cur_corpus_group_->add_corpus(ctxt->cur_corpus_);
>> -
>>     slurp_elf_info(ctxt, corp, status);
>> -  if (!is_linux_kernel
>> -      && ((status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND) |
>> -          (status & elf_reader::STATUS_NO_SYMBOLS_FOUND)))
>> +  if ((status & elf_reader::STATUS_NO_SYMBOLS_FOUND) ||
>> +      (!is_linux_kernel &&
>> +       (status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)))
>>         return corp;
>>   
>> +  if (ctxt->cur_corpus_group_)
>> +    ctxt->cur_corpus_group_->add_corpus(ctxt->cur_corpus_);
>> +
>>     // Set the set of exported declaration that are defined.
>>     ctxt->exported_decls_builder
>>      (ctxt->cur_corpus_->get_exported_decls_builder().get());
>> @@ -1695,9 +1702,13 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
>>       {
>>         if (ctxt->ctfa == NULL)
>>           {
>> -          std::string ctfa_filename;
>> -          if (find_ctfa_file(ctxt, ctfa_filename))
>> -            ctxt->ctfa = ctf_arc_open(ctfa_filename.c_str(), &errp);
>> +          std::string filename;
>> +          base_name(ctxt->filename, filename);
>> +
>> +          // locate vmlinux.ctfa only when reader is processing
>> +          // vmlinux file, i.e the main corpus in the group.
>> +          if (filename == "vmlinux" && find_ctfa_file(ctxt, filename))
>> +            ctxt->ctfa = ctf_arc_open(filename.c_str(), &errp);
>>           }
>>       }
>>     else
>> diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
>> index fe9ebc72..7d00f726 100644
>> --- a/src/abg-tools-utils.cc
>> +++ b/src/abg-tools-utils.cc
>> @@ -2493,6 +2493,11 @@ get_binary_paths_from_kernel_dist(const string&	dist_root,
>>   /// made of vmlinux kernel file and the linux kernel modules found
>>   /// under @p root directory and under its sub-directories, recursively.
>>   ///
>> +/// If the vmlinux file doens't have DWARF info, it looks for
>> +/// vmlinux.ctfa, if it's present, it assumes that kernel was build
>> +/// with CTF support, then it updates @ref origin, given chance to
>> +/// CTF reader to build the IR for kernel build directory.
>> +///
>>   /// @param origin the debug type information in vmlinux kernel and
>>   /// the linux kernel modules to be used to build the corpora @p group.
>>   ///
>> @@ -2526,7 +2531,7 @@ get_binary_paths_from_kernel_dist(const string&	dist_root,
>>   ///
>>   /// @param env the environment to create the corpus_group in.
>>   static void
>> -maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>> +maybe_load_vmlinux_dwarf_corpus(corpus::origin      &origin,
>>                                   corpus_group_sptr&  group,
>>                                   const string&       vmlinux,
>>                                   vector<string>&     modules,
>> @@ -2578,6 +2583,21 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>>        << " reading DONE:"
>>        << t << "\n";
>>   
>> +  if (status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)
>> +    {
>> +      // vmlinux doesn't have DWARF debug info, so it might contain
>> +      // CTF debug info, this means vmlinux.ctfa is under kernel
>> +      // build directory.
>> +      string ctfa_file = root + "/vmlinux.ctfa";
>> +      if (file_exists(ctfa_file))
>> +        {
>> +          // OK. Likely CTF could build a better IR, then let's
>> +          // notify the caller to try with CTF reader.
>> +          origin = corpus::CTF_ORIGIN;
>> +          return;
>> +        }
>> +    }
>> +
> 
> Hmmh, I think this change is done at logical level that is too "low",
> compared to the level at which I believe this decision should be made.
> Namely, maybe_load_vmlinux_dwarf_corpus is meant to try to load DWARF
> debug information.  It should not deal with CTF debug information
> (looking for CTF archive files etc).  Rather, I think that if the
> loading of DWARF failed, that function should tell its caller about it.
> It would thus be up to the caller to decide what to do with that
> information. For instance, the caller would then be able to try and load
> CTF debug information instead.
> 

Ohh. I see, much better.

> So I would rather modify maybe_load_vmlinux_dwarf_corpus to make it
> return the status of the loading, so that the caller can know if the
> loading succeeded or not.
> 

Yes. totally agree.

> So here is the change I would propose for the
> maybe_load_vmlinux_dwarf_corpus and maybe_load_vmlinux_ctf_corpus
> functions instead (it's part of the whole patch that I have posted at
> the end of this message):
> 
>      @@ -2525,8 +2542,12 @@ get_binary_paths_from_kernel_dist(const string&	dist_root,
>       /// @param t time to trace time spent in each step.
>       ///
>       /// @param env the environment to create the corpus_group in.
>      -static void
>      -maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>      +///
>      +/// @return the status of the loading.  If it's
>      +/// abigail::elf_reader::STATUS_UNKNOWN, then it means nothing was
>      +/// done, meaning the function got out early.
>      +static abigail::elf_reader::status
>      +maybe_load_vmlinux_dwarf_corpus(corpus::origin&     origin,
>                                       corpus_group_sptr&  group,
>                                       const string&       vmlinux,
>                                       vector<string>&     modules,
>      @@ -2539,10 +2560,11 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>                                       timer&              t,
>                                       environment_sptr&   env)
>       {
>      +  abigail::elf_reader::status status = abigail::elf_reader::STATUS_UNKNOWN;
>      +
>         if (!(origin & corpus::DWARF_ORIGIN))
>      -    return;
>      +    return status;
> 
>      -  abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
>         dwarf_reader::read_context_sptr ctxt;
>         ctxt =
>          dwarf_reader::create_read_context(vmlinux, di_roots, env.get(),
>      @@ -2569,6 +2591,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>            << vmlinux << "' ...\n" << std::flush;
> 
>         // Read the vmlinux corpus and add it to the group.
>      +  status = abigail::elf_reader::STATUS_OK;
>         t.start();
>         read_and_add_corpus_to_group_from_elf(*ctxt, *group, status);
>         t.stop();
>      @@ -2579,7 +2602,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>            << t << "\n";
> 

At this point if `vmlinux' file doesn't have DWARF information, the `status'
returned by `maybe_load_vmlinux_dwarf_corpus' will set the bit field
`STATUS_DEBUG_INFO_NOT_FOUND', but it is not verified here, and since vmlinux
corpus was already added into the group in `read_debug_info_into_corpus'
function, it continues processing modules without the main corpus information,
Is this the expected behaviour?
  
>         if (group->is_empty())
>      -    return;
>      +    return status;
> 
>         // Now add the corpora of the modules to the corpus group.
>         int total_nb_modules = modules.size();
>      @@ -2614,6 +2637,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>                << "' reading DONE: "
>                << t << "\n";
>           }
>      +  return status;
>       }
> 
>       /// If the @ref origin is CTF_ORIGIN it build a @ref corpus_group
>      @@ -2642,8 +2666,12 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
>       /// @param t time to trace time spent in each step.
>       ///
>       /// @param env the environment to create the corpus_group in.
>      +///
>      +/// @return the status of the loading.  If it's
>      +/// abigail::elf_reader::STATUS_UNKNOWN, then it means nothing was
>      +/// done, meaning the function got out early.
>       #ifdef WITH_CTF
>      -static void
>      +static abigail::elf_reader::status
>       maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
>                                     corpus_group_sptr&  group,
>                                     const string&       vmlinux,
>      @@ -2654,10 +2682,11 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
>                                     timer&              t,
>                                     environment_sptr&   env)
>       {
>      +  abigail::elf_reader::status status = abigail::elf_reader::STATUS_UNKNOWN;
>      +
>         if (!(origin & corpus::CTF_ORIGIN))
>      -    return;
>      +    return status;
> 
>      -  abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
>         ctf_reader::read_context_sptr ctxt;
>         ctxt = ctf_reader::create_read_context(vmlinux, di_roots, env.get());
>         group.reset(new corpus_group(env.get(), root));
>      @@ -2668,6 +2697,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
>            << vmlinux << "' ...\n" << std::flush;
> 
>         // Read the vmlinux corpus and add it to the group.
>      +  status = abigail::elf_reader::STATUS_OK;
>         t.start();
>         read_and_add_corpus_to_group_from_elf(ctxt.get(), *group, status);
>         t.stop();
>      @@ -2678,7 +2708,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
>            << t << "\n";
> 
>         if (group->is_empty())
>      -    return;
>      +    return status;
> 
>         // Now add the corpora of the modules to the corpus group.
>         int total_nb_modules = modules.size();
>      @@ -2707,6 +2737,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
>                << "' reading DONE: "
>                << t << "\n";
>           }
>      +  return status;
>       }
>       #endif
> 
> I have also introduced a new function called
> tools_utils::dir_contains_ctf_archive to look for a file that ends with
> ".ctfa".  This abstracts away the search for "vmlinux.ctfa" as I wasn't
> sure if those archives could exist for normal (non-kernel) binaries as
> well:

Ohh, perfect!, I'll use it in CTF reader to located the Linux archive file.
No. there is no `.ctfa' file for non-kernel binaries intead they have `.ctf'
section, I could implement a similary function to looks for `.ctf' section
using elf helpers and it can be used in `load_corpus_and_write_abixml'
implementing a similar algorithm as with when we are processing the Kernel,
looking for DWARF information, and if it is not present then, test if
`.ctf' section is in ELF file then extract it using CTF reader,
to avoid duplication use of:

abigail::ctf_reader::read_context_sptr ctxt
		= abigail::ctf_reader::create_read_context(opts.in_file_path,
							   opts.prepared_di_root_paths,
							   env.get());

One for `opts.use_ctf' and other one when `STATUS_DEBUG_INFO_NOT_FOUND' is returned.
WDYT?

> 
>      diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
>      index fe9ebc72..a23f08ea 100644
>      --- a/src/abg-tools-utils.cc
>      +++ b/src/abg-tools-utils.cc
>      @@ -1694,6 +1694,23 @@ file_is_kernel_debuginfo_package(const string& file_name, file_type file_type)
>         return result;
>       }
> 
>      +/// Test if a directory contains a CTF archive.
>      +///
>      +/// @param directory the directory to consider.
>      +///
>      +/// @param archive_prefix the prefix of the archive file.
>      +///
>      +/// @return true iff @p directory contains a CTF archive file.
>      +bool
>      +dir_contains_ctf_archive(const string& directory,
>      +			 const string& archive_prefix)
>      +{
>      +  string ctf_archive = directory + "/" + archive_prefix + ".ctfa";
>      +  if (file_exists(ctf_archive))
>      +    return true;
>      +  return false;
>      +}
>      +
> 
> Then, now that the caller of these functions can know if loading the
> vmlinux binary succeeded or not, we can update that caller, which is
> build_corpus_group_from_kernel_dist_under:
> 
>      @@ -2786,14 +2817,23 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
>             vector<char**> di_roots;
>             di_roots.push_back(&di_root_ptr);
> 
>      -      maybe_load_vmlinux_dwarf_corpus(origin, group, vmlinux,
>      -                                      modules, root, di_roots,
>      -                                      suppr_paths, kabi_wl_paths,
>      -                                      supprs, verbose, t, env);
>      +      abigail::elf_reader::status status =
>      +	maybe_load_vmlinux_dwarf_corpus(origin, group, vmlinux,
>      +					modules, root, di_roots,
>      +					suppr_paths, kabi_wl_paths,
>      +					supprs, verbose, t, env);
>       #ifdef WITH_CTF
>      +      string vmlinux_basename;
>      +      base_name(vmlinux, vmlinux_basename);
>      +      if (origin == corpus::DWARF_ORIGIN
>      +	  && (status & abigail::elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)
>      +	  && dir_contains_ctf_archive(root, vmlinux_basename))
>      +	  // Let's try to load the binary using the CTF archive!
>      +	origin |= corpus::CTF_ORIGIN;
>      +
>             maybe_load_vmlinux_ctf_corpus(origin, group, vmlinux,
>      -                                    modules, root, di_roots,
>      -                                    verbose, t, env);
>      +				    modules, root, di_roots,
>      +				    verbose, t, env);
>       #endif
>           }
> 
> If you feel like we shouldn't test if the directory 'root' contains a
> ctf archive before trying to load the CTF archive, then please remove
> the call to dir_contains_ctf_archive above.  I just meant to show the
> general idea of where the decision should be made to try to load the ctf
> debug information, in my opinion.
>

Actually the `dir_contains_ctf_archive' prevents to rebuild/reset a group
in the CTF reader and discard the work done by DWARF reader if CTF information
is neither present in binary file.
  
> The general idea, however, is to make the reader of
> build_corpus_group_from_kernel_dist_under understand what is going on
> (that is, we try loading vmlinux with DWARF, and if it fails, then we
> try CTF) without having to go dig deep into how
> maybe_load_vmlinux_ctf_corpus works.
> 
> Similarly, I amended the changes to the tools (abi{diff,dw,pkgdiff}) to
> make them follow the same path.
> 
> [...]
> 
>> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
>> index e0bb35ac..815c68df 100644
>> --- a/tools/abidiff.cc
>> +++ b/tools/abidiff.cc
>> @@ -1118,6 +1118,13 @@ main(int argc, char* argv[])
>>         return 0;
>>       }
>>   
>> +  corpus::origin first_input_origin;
>> +  corpus::origin origin =
>> +#ifdef WITH_CTF
>> +    opts.use_ctf ? corpus::CTF_ORIGIN :
>> +#endif
>> +    corpus::DWARF_ORIGIN;
>> +
> 
> I removed these changes above.  This is because we don't need to perform
> so many changes.  We can just let the code as it was and add new code to
> just one place to test if loading DWARF failed.  If it failed then we
> try to load CTF.
> 
> See below.
> 
>>     prepare_di_root_paths(opts);
>>   
>>     if (!maybe_check_suppression_files(opts))
>> @@ -1150,7 +1157,7 @@ main(int argc, char* argv[])
>>         abigail::elf_reader::status c1_status =
>>   	abigail::elf_reader::STATUS_OK,
>>   	c2_status = abigail::elf_reader::STATUS_OK;
>> -      corpus_sptr c1, c2;
>> +      corpus_sptr c1, c2, dwarf_corpus;
> 
> Likewise, removed.
> 
>>         corpus_group_sptr g1, g2;
>>         bool files_suppressed = false;
>>   
>> @@ -1180,20 +1187,9 @@ main(int argc, char* argv[])
>>   	case abigail::tools_utils::FILE_TYPE_ELF: // fall through
>>   	case abigail::tools_utils::FILE_TYPE_AR:
>>   	  {
>> -#ifdef WITH_CTF
>> -            if (opts.use_ctf)
>> -              {
>> -                abigail::ctf_reader::read_context_sptr ctxt
>> -                  = abigail::ctf_reader::create_read_context(opts.file1,
>> -                                                             opts.prepared_di_root_paths1,
>> -                                                             env.get());
>> -                ABG_ASSERT(ctxt);
>> -                c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
>> -                                                      c1_status);
>> -              }
>> -            else
>> -#endif
> 
> I left this code in.  So no change here either.
> 
>> +            if (origin & corpus::DWARF_ORIGIN)
>>                 {
>> +                first_input_origin = corpus::DWARF_ORIGIN;
> 
> I remove this change as well.  So, still no change.
> 
>>                   abigail::dwarf_reader::read_context_sptr ctxt =
>>                     abigail::dwarf_reader::create_read_context
>>                     (opts.file1, opts.prepared_di_root_paths1,
>> @@ -1205,6 +1201,7 @@ main(int argc, char* argv[])
>>                   set_suppressions(*ctxt, opts);
>>                   abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
>>                   c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
>> +                dwarf_corpus = c1;
> 
> Likewise.
> 
> But then, it's here that we are going to inspect c1_status to see if
> loading DWARF failed.  If it failed, then we'll try to load CTF.  So,
> here is the change I am adding to the process of loading the corpus c1:
> 
> 
> @@ -1205,6 +1208,36 @@ main(int argc, char* argv[])
>                   set_suppressions(*ctxt, opts);
>                   abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
>                   c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
> +
> +#ifdef WITH_CTF
> +		if (// We were not instructed to use CTF ...
> +		    !opts.use_ctf

This is always true, because we are in the else block of `opts.use_ctf'.

> +		    // ... and yet, no debug info was found ...
> +		    && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)
> +		    // ... but we found ELF symbols ...
> +		    && !(c1_status & STATUS_NO_SYMBOLS_FOUND))
> +		  {
> +		    string basenam, dirnam;
> +		    base_name(opts.file1, basenam);
> +		    dir_name(opts.file1, dirnam);
> +		    // ... the input file seems to contain CTF
> +		    // archive, so let's try to see if the file
> +		    // contains a CTF archive, who knows ...
> +		    if (dir_contains_ctf_archive(dirnam, basenam))

Non-kernel binaries contains `.ctf' section instead of `ctfa' file,
so I can implement a `file_contains_ctf_section' function to test if
it is a valid input file for CTF reader.

> +		      {
> +			// The file does contain CTF debug information finally!
> +			abigail::ctf_reader::read_context_sptr ctxt =
> +			  abigail::ctf_reader::create_read_context
> +			  (opts.file1,
> +			   opts.prepared_di_root_paths1,
> +			   env.get());
> +			ABG_ASSERT(ctxt);
> +
> +			c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
> +							      c1_status);
> +		      }
> +		  }
> +#endif
> 
> OK, maybe the statement
> 
>      "if (dir_contains_ctf_archive(dirnam, basenam))"
> 
> above is not necessary as I am not sure if the CTF archive file is
> supposed to be present for normal binaries or not.  If it's not
> necessary, then please remove that line as well as the use of the
> basenam/dirnam variables.
> 
> But you get the general idea.  We just test if loading DWARF failed
> (even if we were not instructed to use CTF) and then we try to load CTF.
> The change is at one place right after trying to load the DWARF,
> logically.
> 
> I do something similar to the loading of the second corpus c2.
> 
> 
> Here is the entire change to the abidiff file:
> 
>      diff --git a/tools/abidiff.cc b/tools/abidiff.cc
>      index e0bb35ac..0d9e59c2 100644
>      --- a/tools/abidiff.cc
>      +++ b/tools/abidiff.cc
>      @@ -47,6 +47,9 @@ using abigail::suppr::suppressions_type;
>       using abigail::suppr::read_suppressions;
>       using namespace abigail::dwarf_reader;
>       using namespace abigail::elf_reader;
>      +using abigail::tools_utils::base_name;
>      +using abigail::tools_utils::dir_name;
>      +using abigail::tools_utils::dir_contains_ctf_archive;
>       using abigail::tools_utils::emit_prefix;
>       using abigail::tools_utils::check_file;
>       using abigail::tools_utils::guess_file_type;
>      @@ -1205,6 +1208,36 @@ main(int argc, char* argv[])
>                       set_suppressions(*ctxt, opts);
>                       abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
>                       c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
>      +
>      +#ifdef WITH_CTF
>      +		if (// We were not instructed to use CTF ...
>      +		    !opts.use_ctf
>      +		    // ... and yet, no debug info was found ...
>      +		    && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)
>      +		    // ... but we found ELF symbols ...
>      +		    && !(c1_status & STATUS_NO_SYMBOLS_FOUND))
>      +		  {
>      +		    string basenam, dirnam;
>      +		    base_name(opts.file1, basenam);
>      +		    dir_name(opts.file1, dirnam);
>      +		    // ... the input file seems to contain CTF
>      +		    // archive, so let's try to see if the file
>      +		    // contains a CTF archive, who knows ...
>      +		    if (dir_contains_ctf_archive(dirnam, basenam))
>      +		      {
>      +			// The file does contain CTF debug information finally!
>      +			abigail::ctf_reader::read_context_sptr ctxt =
>      +			  abigail::ctf_reader::create_read_context
>      +			  (opts.file1,
>      +			   opts.prepared_di_root_paths1,
>      +			   env.get());
>      +			ABG_ASSERT(ctxt);
>      +
>      +			c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
>      +							      c1_status);
>      +		      }
>      +		  }
>      +#endif
>                       if (!c1
>                           || (opts.fail_no_debug_info
>                               && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
>      @@ -1212,7 +1245,7 @@ main(int argc, char* argv[])
>                         return handle_error(c1_status, ctxt.get(),
>                                             argv[0], opts);
>                     }
>      -	  }
>      +          }
>                break;
>              case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
>                {
>      @@ -1289,13 +1322,44 @@ main(int argc, char* argv[])
>                       set_suppressions(*ctxt, opts);
> 
>                       c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
>      +
>      +#ifdef WITH_CTF
>      +		if (// We were not instructed to use CTF ...
>      +		    !opts.use_ctf
>      +		    // ... and yet, no debug info was found ...
>      +		    && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)
>      +		    // ... but we found ELF symbols ...
>      +		    && !(c2_status & STATUS_NO_SYMBOLS_FOUND))
>      +		  {
>      +		    string basenam, dirnam;
>      +		    base_name(opts.file2, basenam);
>      +		    dir_name(opts.file2, dirnam);
>      +		    // ... the input file seems to contain CTF
>      +		    // archive, so let's try to see if the file
>      +		    // contains a CTF archive, who knows ...
>      +		      if (dir_contains_ctf_archive(dirnam, basenam))
>      +		      {
>      +			// The file does contain CTF debug information finally!
>      +			abigail::ctf_reader::read_context_sptr ctxt =
>      +			  abigail::ctf_reader::create_read_context
>      +			  (opts.file2,
>      +			   opts.prepared_di_root_paths2,
>      +			   env.get());
>      +			ABG_ASSERT(ctxt);
>      +
>      +			c2 = abigail::ctf_reader::read_corpus(ctxt.get(),
>      +							      c2_status);
>      +		      }
>      +		  }
>      +#endif
>      +
>                       if (!c2
>                           || (opts.fail_no_debug_info
>                               && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
>                               && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
>                         return handle_error(c2_status, ctxt.get(), argv[0], opts);
>                     }
>      -	  }
>      +          }
>                break;
>              case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
>                {
> 
> 
> By the way, I think there should be at least one test that exercises the
> workflow "try to load DWARF from the binary, fail, try to load CTF,
> succeed", without having the --ctf provided.  And we should have that
> test for all the tools that were modified.  We don't need to have that
> for a Linux Kernel, obviously, as that would be too big to have in the
> tarball.
> 

OK, I'll add test cases.

> Below is the entire amended patch that I came up with.  I suspect the
> use of the dir_contains_ctf_archive function everywhere is wrong, but
> I'll let you be the judge of that.  I think the documentation of when
> that archive is present/necessary would be good to improve/add.
> 
> Thanks again for the patch!

Thanks so much for comments!.

> 
> 
>  From 481b547d7871a544b690943b16e6a173a729932f Mon Sep 17 00:00:00 2001
> [...]

  reply	other threads:[~2022-10-04 23:14 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-01  0:15 Guillermo E. Martinez
2022-10-04  9:04 ` Dodji Seketeli
2022-10-04 23:13   ` Guillermo E. Martinez [this message]
2022-10-06  7:42     ` Dodji Seketeli
2022-10-06 14:12       ` Dodji Seketeli
2022-10-07 14:13         ` Guillermo E. Martinez
2022-10-06 19:53       ` Guillermo Martinez
2022-10-06 19:50         ` Guillermo E. Martinez
2022-10-07 13:38         ` Dodji Seketeli
2022-10-07 16:04           ` Ben Woodard
2022-11-15 20:13 ` [PATCHv2] ELF based front-end readers fallback feature Guillermo E. Martinez
2022-11-21 18:51   ` [PATCHv3] " Guillermo E. Martinez
2022-11-22 14:19     ` Dodji Seketeli
2022-11-22 16:02       ` Guillermo E. Martinez
2022-11-22 16:00     ` [PATCH v4] " Guillermo E. Martinez
2022-11-28 15:56       ` Dodji Seketeli
2022-11-28 21:59         ` Guillermo E. Martinez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=568fe730-3bb9-0267-00bc-2873e94e502f@oracle.com \
    --to=guillermo.e.martinez@oracle.com \
    --cc=dodji@seketeli.org \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).