public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
To: libabigail@sourceware.org
Subject: [PATCH v2] ctf-reader: Improves performance to read CTF information from Linux kernel
Date: Wed, 27 Apr 2022 17:16:56 -0500	[thread overview]
Message-ID: <20220427221656.2216932-1-guillermo.e.martinez@oracle.com> (raw)
In-Reply-To: <20220405160833.1179590-1-guillermo.e.martinez@oracle.com>

Hello libabigail team,

This is the v2 patch to improves the performance to extract  CTF
information from Linux kernel, major changes respect to v1 are:

   * Initialize `read_context::ctf_ctfa' per group, since it contains
   the debug information for vmlinux and kernel modules.
   * Get rid of utilities functions to locate, set and get the vmlinux
   archive pointer while corpus group is processed.

Feedback will be welcomed and appreciated as ever!,

Kind Regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), keep open per _group_
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group),
since this ctf archive file contains the information for kernel modules
as well, then it's reused by each module to be processed in the group,
invoking to `reset_read_context' ctf reader function.  Doing so,
`read_context::ctfa` member should be updated iff corpus origin is
`CTF_ORIGIN', otherwise it must contain valid information from the main
element in the corpus group, `ctf_close' must be called after processing
all group's members, that is in `reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::set_vmlinux_ctfa_path):
	Remove declaration.
	* src/abg-ctf-reader.cc (read_context::read_context): Use member
	initialization for `ctfa' member. Remove `vmlinux_ctfa_path_'.
	(read_context::~read_context): Add destructor and delete `ctfa'
	member.
	(ctf_reader::set_vmlinux_ctfa_path): Remove definition.
	(ctf_reader::read_corpus): Adjust `read_context::ctfa' if corpus
	origin is `LINUX_KERNEL_BINARY_ORIGIN', looking for
	'vmlinux.ctfa'. Invoke `process_ctf_archive', `sort_{functions,
	variables}' if ctf debug info was found.
	* src/abg-tools-utils.cc: (get_vmlinux_ctfa_path_from_kernel_dist):
	Remove definition.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h |  3 ---
 src/abg-ctf-reader.cc    | 48 +++++++++++++++++++---------------------
 src/abg-tools-utils.cc   | 40 ++-------------------------------
 3 files changed, 25 insertions(+), 66 deletions(-)

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..ba7289aa 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -48,9 +48,6 @@ void
 reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 2944be07..cc79520d 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@ public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -199,7 +195,9 @@ public:
   /// Constructor.
   ///
   /// @param elf_path the path to the ELF file.
-  read_context(const string& elf_path, ir::environment *env)
+  read_context(const string& elf_path, ir::environment *env) :
+    // ctfa data member can be used per courpus group
+    ctfa(NULL)
   {
     initialize(elf_path, env);
   }
@@ -212,11 +210,14 @@ public:
     elf_handler = NULL;
     elf_fd = -1;
     is_elf_exec = false;
-    ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1333,7 +1334,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
 
   int errp;
   if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
+    {
+      std::string filename;
+      if (tools_utils::base_name(ctxt->filename, filename)
+          && filename == "vmlinux")
+        {
+          std::string vmlinux_ctfa_path = ctxt->filename + ".ctfa";
+          ctxt->ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(), &errp);
+        }
+    }
   else
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
@@ -1346,14 +1355,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
   if (ctxt->ctfa == NULL)
     status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
   else
-    process_ctf_archive(ctxt, corp);
+    {
+      process_ctf_archive(ctxt, corp);
+      ctxt->cur_corpus_->sort_functions();
+      ctxt->cur_corpus_->sort_variables();
+    }
 
   ctxt->ir_env->canonicalization_is_done(true);
-  ctxt->cur_corpus_->sort_functions();
-  ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1383,7 +1393,7 @@ set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1436,18 +1446,6 @@ reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
-///
-/// @param ctxt the read_context to consider.
-///
-/// @param filename the @ref vmlinux CTFA filename to set.
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
-{
-  ctxt.vmlinux_ctfa_path_ = filename;
-}
-
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..47a33022 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2470,31 +2470,6 @@ get_vmlinux_path_from_kernel_dist(const string&	from,
   return found;
 }
 
-/// Get the paths of the CTF vmlinux archive under given directory.
-///
-/// @param from the directory under which to look for.
-///
-/// @param vmlinux_path output parameter.  The path of the CTF vmlinux
-/// binary that was found.
-///
-/// @return true if at least the path to the vmlinux.ctfa binary was found.
-#ifdef WITH_CTF
-bool
-get_vmlinux_ctfa_path_from_kernel_dist(const string& from,
-                                       string& vmlinux_ctfa_path)
-{
-  if (!dir_exists(from))
-    return false;
-
-  string dist_root = from;
-  bool found = false;
-  if (find_vmlinux_path(dist_root, vmlinux_ctfa_path, "vmlinux.ctfa"))
-    found = true;
-
-  return found;
-}
-#endif
-
 /// Get the paths of the vmlinux and kernel module binaries under
 /// given directory.
 ///
@@ -2576,15 +2551,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
   t.start();
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
-#ifdef WITH_CTF
-  string vmlinux_ctfa;
-  if (got_binary_paths &&
-      env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
-    {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
-    }
-#endif
 
   t.stop();
 
@@ -2684,12 +2650,11 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+
           if (!vmlinux.empty())
             {
               ctxt =
-               ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
-
+                ctf_reader::create_read_context(vmlinux, env.get());
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
 
@@ -2725,7 +2690,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();
-- 
2.35.1


  parent reply	other threads:[~2022-04-27 22:17 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-05 16:08 [PATCH] " Guillermo E. Martinez
2022-04-25 18:33 ` Guillermo E. Martinez
2022-04-27 22:16 ` Guillermo E. Martinez [this message]
2022-04-29 11:11 ` Dodji Seketeli
2022-04-29 13:58   ` Guillermo E. Martinez
2022-04-29 14:28   ` [PATCH v3] " Guillermo E. Martinez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220427221656.2216932-1-guillermo.e.martinez@oracle.com \
    --to=guillermo.e.martinez@oracle.com \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).