From: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
To: Dodji Seketeli <dodji@seketeli.org>,
"Guillermo E. Martinez via Libabigail"
<libabigail@sourceware.org>
Subject: Re: [PATCH] CTF as a fallback when no DWARF debug info is present
Date: Tue, 4 Oct 2022 18:13:53 -0500 [thread overview]
Message-ID: <568fe730-3bb9-0267-00bc-2873e94e502f@oracle.com> (raw)
In-Reply-To: <877d1g2c52.fsf@seketeli.org>
On 10/4/22 04:04, Dodji Seketeli wrote:
> Hello Guillermo,
>
HI Dodji,
> Thank you for this patch.
>
Your welcome!.
> Please find my comments below.
>
> "Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
> écrit:
>
>
>> By default, `abidw', `abidiff', `abipkgdiff' and `kmidiff' tools use
>> debug information in `DWARF` format, if present, otherwise now
>> automatically the tools try to extract and build the binaries IR using
>> debug information in `CTF` format without use of `--ctf' option, if
>> present, finally, if neither is found, they use only `ELF` symbols to
>> extract, build, compare and report which of them were added or removed.
>>
>> In case of neither DWARF nor CTF debug information is present
>> (`STATUS_DEBUG_INFO_NOT_FOUND') and `--ctf' option was not passed in
>> the command line, those tools use the symbol object build by DWARF
>> reader (default) to completed theirs task.
>>
>> Tools don't allow comparing corpora built with between different debug
>> information i.e DWARF vs CTF, the first corpus built dictate the
>> expected debug information in the second one.
>>
>> This work for libraries and Linux kernel. The `--ctf' option is
>> preserved to explicitly indicate to those tools that we want to use
>> CTF support.
>>
>> * doc/manuals/abidiff.rst: Adjust usage tool information
>> to indicates fallback for CTF debug info when DWARF info
>> is not present.
>> * doc/manuals/abidw.rst: Likewise.
>> * doc/manuals/abipkgdiff.rst: Likewise.
>> * doc/manuals/kmidiff.rst: Likewise.
>
> I have made some slight fixes to changes to these doc files. Nothing
> major you'll see the changes in the updated version of the patch that I
> am posting at the end of this message.
>
OK.
> [...]
>
>> * src/abg-ctf-reader.cc (slurp_elf_info): Report status
>> when debug information is not present.
>> (read_corpus): Add code to locate `vmlinux.ctfa' just when
>> `main corpus' is being processed.
>> * src/abg-tools-utils.cc (maybe_load_vmlinux_dwarf_corpus):
>> Replace by a reference `origin' argument to notify whether
>> CTF reader needs to be executed.
>> * tests/data/test-diff-pkg-ctf/dirpkg-3-report-2.txt: Adjust
>> test case.
>> * tests/test-diff-pkg.cc (in_out_specs): Add `--ctf' option
>> in test case.
>> * tools/abidiff.cc (main): Add `origin' depending of command
>> line argument by default it is `corpus::DWARF_ORIGIN'. Add
>> `dwarf_corpus' to be used as default corpus (containing ELF
>> symbols), when DWARF nor CTF debug information was found.
>> * tools/abidw.cc: Likewise.
>> * tools/abipkgdiff.cc: Likewise.
>>
>> Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
>> ---
>> doc/manuals/abidiff.rst | 15 +--
>> doc/manuals/abidw.rst | 15 ++-
>> doc/manuals/abipkgdiff.rst | 13 +-
>> doc/manuals/kmidiff.rst | 9 +-
>> src/abg-ctf-reader.cc | 33 +++--
>> src/abg-tools-utils.cc | 22 +++-
>> .../test-diff-pkg-ctf/dirpkg-3-report-2.txt | 16 +++
>> tests/test-diff-pkg.cc | 2 +-
>> tools/abidiff.cc | 82 ++++++++-----
>> tools/abidw.cc | 54 +++++---
>> tools/abipkgdiff.cc | 116 +++++++++++-------
>> 11 files changed, 257 insertions(+), 120 deletions(-)
>>
>> diff --git a/doc/manuals/abidiff.rst b/doc/manuals/abidiff.rst
>> index 0c711d9e..caeceded 100644
>> --- a/doc/manuals/abidiff.rst
>> +++ b/doc/manuals/abidiff.rst
>> @@ -12,11 +12,12 @@ This tool can also compare the textual representations of the ABI of
>> two ELF binaries (as emitted by ``abidw``) or an ELF binary against a
>> textual representation of another ELF binary.
>>
>> -For a comprehensive ABI change report that includes changes about
>> -function and variable sub-types, the two input shared libraries must
>> -be accompanied with their debug information in `DWARF`_ format.
>> -Otherwise, only `ELF`_ symbols that were added or removed are
>> -reported.
>> +For a comprehensive ABI change report between two input shared
>> +libraries that includes changes about function and variable sub-types,
>> +``abidiff`` uses by default, debug information in `DWARF`_ format, if
>> +present, otherwise it compares interfaces using debug information in
>> +`CTF`_ format, if present, finally, if neither is found, it uses only
>> +`ELF`_ symbols to report which of them were added or removed.
>>
>> .. include:: tools-use-libabigail.txt
>>
>> @@ -558,7 +559,7 @@ Options
>>
>> * ``--ctf``
>>
>> - When comparing binaries, extract ABI information from CTF debug
>> + When comparing binaries, extract ABI information from `CTF`_ debug
>> information, if present.
>>
>> * ``--stats``
>> @@ -785,4 +786,4 @@ Usage examples
>>
>> .. _ELF: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>> .. _DWARF: http://www.dwarfstd.org
>> -
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/doc/manuals/abidw.rst b/doc/manuals/abidw.rst
>> index a3055c7e..20948805 100644
>> --- a/doc/manuals/abidw.rst
>> +++ b/doc/manuals/abidw.rst
>> @@ -8,8 +8,7 @@ representation of its ABI to standard output. The emitted
>> representation format, named ``ABIXML``, includes all the globally
>> defined functions and variables, along with a complete representation
>> of their types. It also includes a representation of the globally
>> -defined ELF symbols of the file. The input shared library must
>> -contain associated debug information in `DWARF`_ format.
>> +defined ELF symbols of the file.
>>
>> When given the ``--linux-tree`` option, this program can also handle a
>> `Linux kernel`_ tree. That is, a directory tree that contains both
>> @@ -19,8 +18,13 @@ interface between the kernel and its module, to standard output. In
>> this case, we don't call it an ABI, but a KMI (Kernel Module
>> Interface). The emitted KMI includes all the globally defined
>> functions and variables, along with a complete representation of their
>> -types. The input binaries must contain associated debug information
>> -in `DWARF`_ format.
>> +types.
>> +
>> +To generate either ABI or KMI representation, by default ``abidw``
>> +uses debug information in `DWARF`_ format, if present, otherwise it
>> +looks for debug information in `CTF`_ format, if present, finally, if
>> +neither is found, it uses only `ELF`_ symbols to report which of them
>> +were added or removed.
>>
>> .. include:: tools-use-libabigail.txt
>>
>> @@ -326,7 +330,7 @@ Options
>>
>> * ``--ctf``
>>
>> - Extract ABI information from CTF debug information, if present in
>> + Extract ABI information from `CTF`_ debug information, if present in
>> the given object.
>>
>> * ``--annotate``
>> @@ -365,3 +369,4 @@ standard `here
>> .. _DWARF: http://www.dwarfstd.org
>> .. _GNU: http://www.gnu.org
>> .. _Linux Kernel: https://kernel.org/
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/doc/manuals/abipkgdiff.rst b/doc/manuals/abipkgdiff.rst
>> index 9114775a..771bb034 100644
>> --- a/doc/manuals/abipkgdiff.rst
>> +++ b/doc/manuals/abipkgdiff.rst
>> @@ -13,12 +13,18 @@ binaries.
>> For a comprehensive ABI change report that includes changes about
>> function and variable sub-types, the two input packages must be
>> accompanied with their debug information packages that contain debug
>> -information either in `DWARF`_ or in `CTF` formats. Please note
>> +information either in `DWARF`_ or in `CTF`_ formats. Please note
>> however that some packages contain binaries that embed the debug
>> information directly in a section of said binaries. In those cases,
>> obviously, no separate debug information package is needed as the tool
>> will find the debug information inside the binaries.
>>
>> +By default, ``abipkgdiff`` uses debug information in `DWARF`_ format,
>> +if present, otherwise it compares binaries interfaces using debug
>> +information in `CTF`_ format, if present, finally, if neither is
>> +found, it uses only `ELF`_ symbols to report which of them were added
>> +or removed.
>> +
>> .. include:: tools-use-libabigail.txt
>>
>> .. _abipkgdiff_invocation_label:
>> @@ -525,8 +531,8 @@ Options
>>
>> * ``--ctf``
>>
>> - This is used to compare packages with CTF debug information, if
>> - present.
>> + This is used to compare packages with `CTF`_ debug information,
>> + if present.
>>
>> .. _abipkgdiff_return_value_label:
>>
>> @@ -546,4 +552,5 @@ In the later case, the value of the exit code is the same as for the
>> .. _Deb: https://en.wikipedia.org/wiki/Deb_%28file_format%29
>> .. _tar: https://en.wikipedia.org/wiki/Tar_%28computing%29
>> .. _DWARF: http://www.dwarfstd.org
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> .. _Development Package: https://fedoraproject.org/wiki/Packaging:Guidelines?rd=Packaging/Guidelines#Devel_Packages
>> diff --git a/doc/manuals/kmidiff.rst b/doc/manuals/kmidiff.rst
>> index 53010189..a27d2456 100644
>> --- a/doc/manuals/kmidiff.rst
>> +++ b/doc/manuals/kmidiff.rst
>> @@ -74,6 +74,11 @@ functions and variables) between the Kernel and its modules. In
>> practice, though, some users might want to compare a subset of the
>> those interfaces.
>>
>> +By default, ``kmidiff`` uses debug information in `DWARF`_ format,
>> +if present, otherwise it compares interfaces using debug information
>> +in `CTF`_ format, if present, finally, if neither is found, it uses
>> +only `ELF`_ symbols to report which were added or removed.
>> +
>> Users can then define a "white list" of the interfaces to compare.
>> Such a white list is a just a file in the "INI" format that looks
>> like: ::
>> @@ -174,7 +179,7 @@ Options
>>
>> * ``--ctf``
>>
>> - Extract ABI information from CTF debug information, if present in
>> + Extract ABI information from `CTF`_ debug information, if present in
>> the Kernel and Modules.
>>
>> * ``--impacted-interfaces | -i``
>> @@ -242,3 +247,5 @@ Options
>> .. _ELF: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>> .. _ksymtab: http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
>> .. _Linux Kernel: https://kernel.org
>> +.. _DWARF: http://www.dwarfstd.org
>> +.. _CTF: https://raw.githubusercontent.com/wiki/oracle/binutils-gdb/files/ctf-spec.pdf
>> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
>> index 9148a646..b3cde365 100644
>> --- a/src/abg-ctf-reader.cc
>> +++ b/src/abg-ctf-reader.cc
>> @@ -45,6 +45,7 @@ namespace ctf_reader
>> using std::dynamic_pointer_cast;
>> using abigail::tools_utils::dir_name;
>> using abigail::tools_utils::file_exists;
>> +using abigail::tools_utils::base_name;
>>
>> class read_context
>> {
>> @@ -1545,7 +1546,12 @@ slurp_elf_info(read_context *ctxt,
>> corp->set_architecture_name(elf_helpers::e_machine_to_string(ehdr->e_machine));
>>
>> find_alt_debuginfo(ctxt, &ctf_dbg_scn);
>> - ABG_ASSERT(ctxt->symtab);
>> + if (!ctxt->symtab)
>> + {
>> + status |= elf_reader::STATUS_NO_SYMBOLS_FOUND;
>> + return;
>> + }
>> +
>> corp->set_symtab(ctxt->symtab);
>>
>> if ((corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
>> @@ -1564,7 +1570,8 @@ slurp_elf_info(read_context *ctxt,
>> ctf_scn = ctf_dbg_scn;
>> else
>> {
>> - status |= elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
>> + status |= (elf_reader::STATUS_OK |
>> + elf_reader::STATUS_DEBUG_INFO_NOT_FOUND);
>> return;
>> }
>> }
>> @@ -1676,15 +1683,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
>> origin |= corpus::LINUX_KERNEL_BINARY_ORIGIN;
>> corp->set_origin(origin);
>>
>> - if (ctxt->cur_corpus_group_)
>> - ctxt->cur_corpus_group_->add_corpus(ctxt->cur_corpus_);
>> -
>> slurp_elf_info(ctxt, corp, status);
>> - if (!is_linux_kernel
>> - && ((status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND) |
>> - (status & elf_reader::STATUS_NO_SYMBOLS_FOUND)))
>> + if ((status & elf_reader::STATUS_NO_SYMBOLS_FOUND) ||
>> + (!is_linux_kernel &&
>> + (status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)))
>> return corp;
>>
>> + if (ctxt->cur_corpus_group_)
>> + ctxt->cur_corpus_group_->add_corpus(ctxt->cur_corpus_);
>> +
>> // Set the set of exported declaration that are defined.
>> ctxt->exported_decls_builder
>> (ctxt->cur_corpus_->get_exported_decls_builder().get());
>> @@ -1695,9 +1702,13 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
>> {
>> if (ctxt->ctfa == NULL)
>> {
>> - std::string ctfa_filename;
>> - if (find_ctfa_file(ctxt, ctfa_filename))
>> - ctxt->ctfa = ctf_arc_open(ctfa_filename.c_str(), &errp);
>> + std::string filename;
>> + base_name(ctxt->filename, filename);
>> +
>> + // locate vmlinux.ctfa only when reader is processing
>> + // vmlinux file, i.e the main corpus in the group.
>> + if (filename == "vmlinux" && find_ctfa_file(ctxt, filename))
>> + ctxt->ctfa = ctf_arc_open(filename.c_str(), &errp);
>> }
>> }
>> else
>> diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
>> index fe9ebc72..7d00f726 100644
>> --- a/src/abg-tools-utils.cc
>> +++ b/src/abg-tools-utils.cc
>> @@ -2493,6 +2493,11 @@ get_binary_paths_from_kernel_dist(const string& dist_root,
>> /// made of vmlinux kernel file and the linux kernel modules found
>> /// under @p root directory and under its sub-directories, recursively.
>> ///
>> +/// If the vmlinux file doens't have DWARF info, it looks for
>> +/// vmlinux.ctfa, if it's present, it assumes that kernel was build
>> +/// with CTF support, then it updates @ref origin, given chance to
>> +/// CTF reader to build the IR for kernel build directory.
>> +///
>> /// @param origin the debug type information in vmlinux kernel and
>> /// the linux kernel modules to be used to build the corpora @p group.
>> ///
>> @@ -2526,7 +2531,7 @@ get_binary_paths_from_kernel_dist(const string& dist_root,
>> ///
>> /// @param env the environment to create the corpus_group in.
>> static void
>> -maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
>> +maybe_load_vmlinux_dwarf_corpus(corpus::origin &origin,
>> corpus_group_sptr& group,
>> const string& vmlinux,
>> vector<string>& modules,
>> @@ -2578,6 +2583,21 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
>> << " reading DONE:"
>> << t << "\n";
>>
>> + if (status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)
>> + {
>> + // vmlinux doesn't have DWARF debug info, so it might contain
>> + // CTF debug info, this means vmlinux.ctfa is under kernel
>> + // build directory.
>> + string ctfa_file = root + "/vmlinux.ctfa";
>> + if (file_exists(ctfa_file))
>> + {
>> + // OK. Likely CTF could build a better IR, then let's
>> + // notify the caller to try with CTF reader.
>> + origin = corpus::CTF_ORIGIN;
>> + return;
>> + }
>> + }
>> +
>
> Hmmh, I think this change is done at logical level that is too "low",
> compared to the level at which I believe this decision should be made.
> Namely, maybe_load_vmlinux_dwarf_corpus is meant to try to load DWARF
> debug information. It should not deal with CTF debug information
> (looking for CTF archive files etc). Rather, I think that if the
> loading of DWARF failed, that function should tell its caller about it.
> It would thus be up to the caller to decide what to do with that
> information. For instance, the caller would then be able to try and load
> CTF debug information instead.
>
Ohh. I see, much better.
> So I would rather modify maybe_load_vmlinux_dwarf_corpus to make it
> return the status of the loading, so that the caller can know if the
> loading succeeded or not.
>
Yes. totally agree.
> So here is the change I would propose for the
> maybe_load_vmlinux_dwarf_corpus and maybe_load_vmlinux_ctf_corpus
> functions instead (it's part of the whole patch that I have posted at
> the end of this message):
>
> @@ -2525,8 +2542,12 @@ get_binary_paths_from_kernel_dist(const string& dist_root,
> /// @param t time to trace time spent in each step.
> ///
> /// @param env the environment to create the corpus_group in.
> -static void
> -maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> +///
> +/// @return the status of the loading. If it's
> +/// abigail::elf_reader::STATUS_UNKNOWN, then it means nothing was
> +/// done, meaning the function got out early.
> +static abigail::elf_reader::status
> +maybe_load_vmlinux_dwarf_corpus(corpus::origin& origin,
> corpus_group_sptr& group,
> const string& vmlinux,
> vector<string>& modules,
> @@ -2539,10 +2560,11 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> timer& t,
> environment_sptr& env)
> {
> + abigail::elf_reader::status status = abigail::elf_reader::STATUS_UNKNOWN;
> +
> if (!(origin & corpus::DWARF_ORIGIN))
> - return;
> + return status;
>
> - abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
> dwarf_reader::read_context_sptr ctxt;
> ctxt =
> dwarf_reader::create_read_context(vmlinux, di_roots, env.get(),
> @@ -2569,6 +2591,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> << vmlinux << "' ...\n" << std::flush;
>
> // Read the vmlinux corpus and add it to the group.
> + status = abigail::elf_reader::STATUS_OK;
> t.start();
> read_and_add_corpus_to_group_from_elf(*ctxt, *group, status);
> t.stop();
> @@ -2579,7 +2602,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> << t << "\n";
>
At this point if `vmlinux' file doesn't have DWARF information, the `status'
returned by `maybe_load_vmlinux_dwarf_corpus' will set the bit field
`STATUS_DEBUG_INFO_NOT_FOUND', but it is not verified here, and since vmlinux
corpus was already added into the group in `read_debug_info_into_corpus'
function, it continues processing modules without the main corpus information,
Is this the expected behaviour?
> if (group->is_empty())
> - return;
> + return status;
>
> // Now add the corpora of the modules to the corpus group.
> int total_nb_modules = modules.size();
> @@ -2614,6 +2637,7 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> << "' reading DONE: "
> << t << "\n";
> }
> + return status;
> }
>
> /// If the @ref origin is CTF_ORIGIN it build a @ref corpus_group
> @@ -2642,8 +2666,12 @@ maybe_load_vmlinux_dwarf_corpus(corpus::origin origin,
> /// @param t time to trace time spent in each step.
> ///
> /// @param env the environment to create the corpus_group in.
> +///
> +/// @return the status of the loading. If it's
> +/// abigail::elf_reader::STATUS_UNKNOWN, then it means nothing was
> +/// done, meaning the function got out early.
> #ifdef WITH_CTF
> -static void
> +static abigail::elf_reader::status
> maybe_load_vmlinux_ctf_corpus(corpus::origin origin,
> corpus_group_sptr& group,
> const string& vmlinux,
> @@ -2654,10 +2682,11 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin origin,
> timer& t,
> environment_sptr& env)
> {
> + abigail::elf_reader::status status = abigail::elf_reader::STATUS_UNKNOWN;
> +
> if (!(origin & corpus::CTF_ORIGIN))
> - return;
> + return status;
>
> - abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
> ctf_reader::read_context_sptr ctxt;
> ctxt = ctf_reader::create_read_context(vmlinux, di_roots, env.get());
> group.reset(new corpus_group(env.get(), root));
> @@ -2668,6 +2697,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin origin,
> << vmlinux << "' ...\n" << std::flush;
>
> // Read the vmlinux corpus and add it to the group.
> + status = abigail::elf_reader::STATUS_OK;
> t.start();
> read_and_add_corpus_to_group_from_elf(ctxt.get(), *group, status);
> t.stop();
> @@ -2678,7 +2708,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin origin,
> << t << "\n";
>
> if (group->is_empty())
> - return;
> + return status;
>
> // Now add the corpora of the modules to the corpus group.
> int total_nb_modules = modules.size();
> @@ -2707,6 +2737,7 @@ maybe_load_vmlinux_ctf_corpus(corpus::origin origin,
> << "' reading DONE: "
> << t << "\n";
> }
> + return status;
> }
> #endif
>
> I have also introduced a new function called
> tools_utils::dir_contains_ctf_archive to look for a file that ends with
> ".ctfa". This abstracts away the search for "vmlinux.ctfa" as I wasn't
> sure if those archives could exist for normal (non-kernel) binaries as
> well:
Ohh, perfect!, I'll use it in CTF reader to located the Linux archive file.
No. there is no `.ctfa' file for non-kernel binaries intead they have `.ctf'
section, I could implement a similary function to looks for `.ctf' section
using elf helpers and it can be used in `load_corpus_and_write_abixml'
implementing a similar algorithm as with when we are processing the Kernel,
looking for DWARF information, and if it is not present then, test if
`.ctf' section is in ELF file then extract it using CTF reader,
to avoid duplication use of:
abigail::ctf_reader::read_context_sptr ctxt
= abigail::ctf_reader::create_read_context(opts.in_file_path,
opts.prepared_di_root_paths,
env.get());
One for `opts.use_ctf' and other one when `STATUS_DEBUG_INFO_NOT_FOUND' is returned.
WDYT?
>
> diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
> index fe9ebc72..a23f08ea 100644
> --- a/src/abg-tools-utils.cc
> +++ b/src/abg-tools-utils.cc
> @@ -1694,6 +1694,23 @@ file_is_kernel_debuginfo_package(const string& file_name, file_type file_type)
> return result;
> }
>
> +/// Test if a directory contains a CTF archive.
> +///
> +/// @param directory the directory to consider.
> +///
> +/// @param archive_prefix the prefix of the archive file.
> +///
> +/// @return true iff @p directory contains a CTF archive file.
> +bool
> +dir_contains_ctf_archive(const string& directory,
> + const string& archive_prefix)
> +{
> + string ctf_archive = directory + "/" + archive_prefix + ".ctfa";
> + if (file_exists(ctf_archive))
> + return true;
> + return false;
> +}
> +
>
> Then, now that the caller of these functions can know if loading the
> vmlinux binary succeeded or not, we can update that caller, which is
> build_corpus_group_from_kernel_dist_under:
>
> @@ -2786,14 +2817,23 @@ build_corpus_group_from_kernel_dist_under(const string& root,
> vector<char**> di_roots;
> di_roots.push_back(&di_root_ptr);
>
> - maybe_load_vmlinux_dwarf_corpus(origin, group, vmlinux,
> - modules, root, di_roots,
> - suppr_paths, kabi_wl_paths,
> - supprs, verbose, t, env);
> + abigail::elf_reader::status status =
> + maybe_load_vmlinux_dwarf_corpus(origin, group, vmlinux,
> + modules, root, di_roots,
> + suppr_paths, kabi_wl_paths,
> + supprs, verbose, t, env);
> #ifdef WITH_CTF
> + string vmlinux_basename;
> + base_name(vmlinux, vmlinux_basename);
> + if (origin == corpus::DWARF_ORIGIN
> + && (status & abigail::elf_reader::STATUS_DEBUG_INFO_NOT_FOUND)
> + && dir_contains_ctf_archive(root, vmlinux_basename))
> + // Let's try to load the binary using the CTF archive!
> + origin |= corpus::CTF_ORIGIN;
> +
> maybe_load_vmlinux_ctf_corpus(origin, group, vmlinux,
> - modules, root, di_roots,
> - verbose, t, env);
> + modules, root, di_roots,
> + verbose, t, env);
> #endif
> }
>
> If you feel like we shouldn't test if the directory 'root' contains a
> ctf archive before trying to load the CTF archive, then please remove
> the call to dir_contains_ctf_archive above. I just meant to show the
> general idea of where the decision should be made to try to load the ctf
> debug information, in my opinion.
>
Actually the `dir_contains_ctf_archive' prevents to rebuild/reset a group
in the CTF reader and discard the work done by DWARF reader if CTF information
is neither present in binary file.
> The general idea, however, is to make the reader of
> build_corpus_group_from_kernel_dist_under understand what is going on
> (that is, we try loading vmlinux with DWARF, and if it fails, then we
> try CTF) without having to go dig deep into how
> maybe_load_vmlinux_ctf_corpus works.
>
> Similarly, I amended the changes to the tools (abi{diff,dw,pkgdiff}) to
> make them follow the same path.
>
> [...]
>
>> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
>> index e0bb35ac..815c68df 100644
>> --- a/tools/abidiff.cc
>> +++ b/tools/abidiff.cc
>> @@ -1118,6 +1118,13 @@ main(int argc, char* argv[])
>> return 0;
>> }
>>
>> + corpus::origin first_input_origin;
>> + corpus::origin origin =
>> +#ifdef WITH_CTF
>> + opts.use_ctf ? corpus::CTF_ORIGIN :
>> +#endif
>> + corpus::DWARF_ORIGIN;
>> +
>
> I removed these changes above. This is because we don't need to perform
> so many changes. We can just let the code as it was and add new code to
> just one place to test if loading DWARF failed. If it failed then we
> try to load CTF.
>
> See below.
>
>> prepare_di_root_paths(opts);
>>
>> if (!maybe_check_suppression_files(opts))
>> @@ -1150,7 +1157,7 @@ main(int argc, char* argv[])
>> abigail::elf_reader::status c1_status =
>> abigail::elf_reader::STATUS_OK,
>> c2_status = abigail::elf_reader::STATUS_OK;
>> - corpus_sptr c1, c2;
>> + corpus_sptr c1, c2, dwarf_corpus;
>
> Likewise, removed.
>
>> corpus_group_sptr g1, g2;
>> bool files_suppressed = false;
>>
>> @@ -1180,20 +1187,9 @@ main(int argc, char* argv[])
>> case abigail::tools_utils::FILE_TYPE_ELF: // fall through
>> case abigail::tools_utils::FILE_TYPE_AR:
>> {
>> -#ifdef WITH_CTF
>> - if (opts.use_ctf)
>> - {
>> - abigail::ctf_reader::read_context_sptr ctxt
>> - = abigail::ctf_reader::create_read_context(opts.file1,
>> - opts.prepared_di_root_paths1,
>> - env.get());
>> - ABG_ASSERT(ctxt);
>> - c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
>> - c1_status);
>> - }
>> - else
>> -#endif
>
> I left this code in. So no change here either.
>
>> + if (origin & corpus::DWARF_ORIGIN)
>> {
>> + first_input_origin = corpus::DWARF_ORIGIN;
>
> I remove this change as well. So, still no change.
>
>> abigail::dwarf_reader::read_context_sptr ctxt =
>> abigail::dwarf_reader::create_read_context
>> (opts.file1, opts.prepared_di_root_paths1,
>> @@ -1205,6 +1201,7 @@ main(int argc, char* argv[])
>> set_suppressions(*ctxt, opts);
>> abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
>> c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
>> + dwarf_corpus = c1;
>
> Likewise.
>
> But then, it's here that we are going to inspect c1_status to see if
> loading DWARF failed. If it failed, then we'll try to load CTF. So,
> here is the change I am adding to the process of loading the corpus c1:
>
>
> @@ -1205,6 +1208,36 @@ main(int argc, char* argv[])
> set_suppressions(*ctxt, opts);
> abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
> +
> +#ifdef WITH_CTF
> + if (// We were not instructed to use CTF ...
> + !opts.use_ctf
This is always true, because we are in the else block of `opts.use_ctf'.
> + // ... and yet, no debug info was found ...
> + && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)
> + // ... but we found ELF symbols ...
> + && !(c1_status & STATUS_NO_SYMBOLS_FOUND))
> + {
> + string basenam, dirnam;
> + base_name(opts.file1, basenam);
> + dir_name(opts.file1, dirnam);
> + // ... the input file seems to contain CTF
> + // archive, so let's try to see if the file
> + // contains a CTF archive, who knows ...
> + if (dir_contains_ctf_archive(dirnam, basenam))
Non-kernel binaries contains `.ctf' section instead of `ctfa' file,
so I can implement a `file_contains_ctf_section' function to test if
it is a valid input file for CTF reader.
> + {
> + // The file does contain CTF debug information finally!
> + abigail::ctf_reader::read_context_sptr ctxt =
> + abigail::ctf_reader::create_read_context
> + (opts.file1,
> + opts.prepared_di_root_paths1,
> + env.get());
> + ABG_ASSERT(ctxt);
> +
> + c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
> + c1_status);
> + }
> + }
> +#endif
>
> OK, maybe the statement
>
> "if (dir_contains_ctf_archive(dirnam, basenam))"
>
> above is not necessary as I am not sure if the CTF archive file is
> supposed to be present for normal binaries or not. If it's not
> necessary, then please remove that line as well as the use of the
> basenam/dirnam variables.
>
> But you get the general idea. We just test if loading DWARF failed
> (even if we were not instructed to use CTF) and then we try to load CTF.
> The change is at one place right after trying to load the DWARF,
> logically.
>
> I do something similar to the loading of the second corpus c2.
>
>
> Here is the entire change to the abidiff file:
>
> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
> index e0bb35ac..0d9e59c2 100644
> --- a/tools/abidiff.cc
> +++ b/tools/abidiff.cc
> @@ -47,6 +47,9 @@ using abigail::suppr::suppressions_type;
> using abigail::suppr::read_suppressions;
> using namespace abigail::dwarf_reader;
> using namespace abigail::elf_reader;
> +using abigail::tools_utils::base_name;
> +using abigail::tools_utils::dir_name;
> +using abigail::tools_utils::dir_contains_ctf_archive;
> using abigail::tools_utils::emit_prefix;
> using abigail::tools_utils::check_file;
> using abigail::tools_utils::guess_file_type;
> @@ -1205,6 +1208,36 @@ main(int argc, char* argv[])
> set_suppressions(*ctxt, opts);
> abigail::dwarf_reader::set_do_log(*ctxt, opts.do_log);
> c1 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c1_status);
> +
> +#ifdef WITH_CTF
> + if (// We were not instructed to use CTF ...
> + !opts.use_ctf
> + // ... and yet, no debug info was found ...
> + && (c1_status & STATUS_DEBUG_INFO_NOT_FOUND)
> + // ... but we found ELF symbols ...
> + && !(c1_status & STATUS_NO_SYMBOLS_FOUND))
> + {
> + string basenam, dirnam;
> + base_name(opts.file1, basenam);
> + dir_name(opts.file1, dirnam);
> + // ... the input file seems to contain CTF
> + // archive, so let's try to see if the file
> + // contains a CTF archive, who knows ...
> + if (dir_contains_ctf_archive(dirnam, basenam))
> + {
> + // The file does contain CTF debug information finally!
> + abigail::ctf_reader::read_context_sptr ctxt =
> + abigail::ctf_reader::create_read_context
> + (opts.file1,
> + opts.prepared_di_root_paths1,
> + env.get());
> + ABG_ASSERT(ctxt);
> +
> + c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
> + c1_status);
> + }
> + }
> +#endif
> if (!c1
> || (opts.fail_no_debug_info
> && (c1_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> @@ -1212,7 +1245,7 @@ main(int argc, char* argv[])
> return handle_error(c1_status, ctxt.get(),
> argv[0], opts);
> }
> - }
> + }
> break;
> case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
> {
> @@ -1289,13 +1322,44 @@ main(int argc, char* argv[])
> set_suppressions(*ctxt, opts);
>
> c2 = abigail::dwarf_reader::read_corpus_from_elf(*ctxt, c2_status);
> +
> +#ifdef WITH_CTF
> + if (// We were not instructed to use CTF ...
> + !opts.use_ctf
> + // ... and yet, no debug info was found ...
> + && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)
> + // ... but we found ELF symbols ...
> + && !(c2_status & STATUS_NO_SYMBOLS_FOUND))
> + {
> + string basenam, dirnam;
> + base_name(opts.file2, basenam);
> + dir_name(opts.file2, dirnam);
> + // ... the input file seems to contain CTF
> + // archive, so let's try to see if the file
> + // contains a CTF archive, who knows ...
> + if (dir_contains_ctf_archive(dirnam, basenam))
> + {
> + // The file does contain CTF debug information finally!
> + abigail::ctf_reader::read_context_sptr ctxt =
> + abigail::ctf_reader::create_read_context
> + (opts.file2,
> + opts.prepared_di_root_paths2,
> + env.get());
> + ABG_ASSERT(ctxt);
> +
> + c2 = abigail::ctf_reader::read_corpus(ctxt.get(),
> + c2_status);
> + }
> + }
> +#endif
> +
> if (!c2
> || (opts.fail_no_debug_info
> && (c2_status & STATUS_ALT_DEBUG_INFO_NOT_FOUND)
> && (c2_status & STATUS_DEBUG_INFO_NOT_FOUND)))
> return handle_error(c2_status, ctxt.get(), argv[0], opts);
> }
> - }
> + }
> break;
> case abigail::tools_utils::FILE_TYPE_XML_CORPUS:
> {
>
>
> By the way, I think there should be at least one test that exercises the
> workflow "try to load DWARF from the binary, fail, try to load CTF,
> succeed", without having the --ctf provided. And we should have that
> test for all the tools that were modified. We don't need to have that
> for a Linux Kernel, obviously, as that would be too big to have in the
> tarball.
>
OK, I'll add test cases.
> Below is the entire amended patch that I came up with. I suspect the
> use of the dir_contains_ctf_archive function everywhere is wrong, but
> I'll let you be the judge of that. I think the documentation of when
> that archive is present/necessary would be good to improve/add.
>
> Thanks again for the patch!
Thanks so much for comments!.
>
>
> From 481b547d7871a544b690943b16e6a173a729932f Mon Sep 17 00:00:00 2001
> [...]
next prev parent reply other threads:[~2022-10-04 23:14 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-01 0:15 Guillermo E. Martinez
2022-10-04 9:04 ` Dodji Seketeli
2022-10-04 23:13 ` Guillermo E. Martinez [this message]
2022-10-06 7:42 ` Dodji Seketeli
2022-10-06 14:12 ` Dodji Seketeli
2022-10-07 14:13 ` Guillermo E. Martinez
2022-10-06 19:53 ` Guillermo Martinez
2022-10-06 19:50 ` Guillermo E. Martinez
2022-10-07 13:38 ` Dodji Seketeli
2022-10-07 16:04 ` Ben Woodard
2022-11-15 20:13 ` [PATCHv2] ELF based front-end readers fallback feature Guillermo E. Martinez
2022-11-21 18:51 ` [PATCHv3] " Guillermo E. Martinez
2022-11-22 14:19 ` Dodji Seketeli
2022-11-22 16:02 ` Guillermo E. Martinez
2022-11-22 16:00 ` [PATCH v4] " Guillermo E. Martinez
2022-11-28 15:56 ` Dodji Seketeli
2022-11-28 21:59 ` Guillermo E. Martinez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=568fe730-3bb9-0267-00bc-2873e94e502f@oracle.com \
--to=guillermo.e.martinez@oracle.com \
--cc=dodji@seketeli.org \
--cc=libabigail@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).