public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel
@ 2022-04-05 16:08 Guillermo E. Martinez
  2022-04-25 18:33 ` Guillermo E. Martinez
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Guillermo E. Martinez @ 2022-04-05 16:08 UTC (permalink / raw)
  To: libabigail

Hello libabigail team,

This patch is meant to improves the performance to extract  CTF
information from Linux kernel, it depends of:

https://sourceware.org/pipermail/libabigail/2022q1/004262.html

Feedback will be welcomed and appreciated as ever!,

Kind regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), storing a cached
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group)
using the new `open_vmlinux_ctf_archive' function, and since this ctf
archive file contains the information for kernel modules as well, then
it's reused by each module to be processed in the group by ctf reader
using `set_vmlinux_ctf_archive' function.  Doing so,
`read_context::ctfa` member should be updated if corpus origin is
`CTF_ORIGIN', otherwise it must contain a valid address, and `ctf_close'
must be called after processing all group's members, that is in
`reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::{open_vmlinux_ctf_archive,
	vmlinux_ctf_archive,set_vmlinux_ctf_archive}): Add new functions.
	* src/abg-ctf-reader.cc (read_context::~read_context): Add
	destructor.
	(ctf_reader::read_corpus): Adjust `ctf archive' if corpus
	origin is `CTF_ORIGIN'.
	* src/abg-tools-utils.cc (get_binary_paths_from_kernel_dist):
	Use new `open_vmlinux_ctf_archive' and `set_vmlinux_ctf_archive'
	to store a cached `ctf_archive_t' type gather from vmlinux
	and be reused by the rest of kernel modules.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h | 11 ++++++--
 src/abg-ctf-reader.cc    | 56 +++++++++++++++++++++++++++++-----------
 src/abg-tools-utils.cc   | 14 ++++++----
 3 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..09bf4020 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -49,8 +49,15 @@ reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
 void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
+open_vmlinux_ctf_archive(read_context& ctxt,
+                         const string& vmlinux_ctfa_path);
+ctf_archive_t *
+vmlinux_ctf_archive(read_context& ctxt);
+
+void
+set_vmlinux_ctf_archive(read_context& ctxt,
+                        void *ctfa);
+
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index b9b3d939..5efa6fc7 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@ public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -213,10 +209,14 @@ public:
     elf_fd = -1;
     is_elf_exec = false;
     ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1332,9 +1332,7 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
     }
 
   int errp;
-  if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
-  else
+  if (corp->get_origin() == corpus::CTF_ORIGIN)
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
        Information about the types, variables, functions, etc contained
@@ -1353,7 +1351,6 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
   ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1383,7 +1380,7 @@ set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1436,18 +1433,47 @@ reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
+/// Open CTF archive in a read context for a given path.
 ///
 /// @param ctxt the read_context to consider.
 ///
-/// @param filename the @ref vmlinux CTFA filename to set.
+/// @param vmlinux_ctfa_path the vmlinux CTFA filename
+/// to be opened.
 void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
+open_vmlinux_ctf_archive(read_context& ctxt,
+                         const string& vmlinux_ctfa_path)
 {
-  ctxt.vmlinux_ctfa_path_ = filename;
+  int errp;
+  if ((ctxt.ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(),
+                                &errp)) == NULL )
+    {
+      std::cerr << "cannot open: "
+       << ctf_errmsg (errp)
+       << "\n";
+      std::abort();
+    }
 }
 
+
+/// Set CTF archive member for a given read context.
+///
+/// @param ctxt the read_context to consider.
+///
+/// @param ctfa reference to be set.
+void
+set_vmlinux_ctf_archive(read_context& ctxt,
+                        void *ctfa)
+{
+  ctxt.ctfa = static_cast<ctf_archive_t *>(ctfa);
+}
+
+/// Return CTF archive member for a given read context.
+///
+/// @param ctxt the read_context to consider.
+ctf_archive_t *
+vmlinux_ctf_archive(read_context& ctxt)
+{ return ctxt.ctfa; }
+
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..5995a4ac 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2577,12 +2577,13 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
 #ifdef WITH_CTF
-  string vmlinux_ctfa;
+  string vmlinux_ctfa_path;
   if (got_binary_paths &&
       env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
     {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
+      got_binary_paths =
+       get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa_path);
+      ABG_ASSERT(!vmlinux_ctfa_path.empty());
     }
 #endif
 
@@ -2684,11 +2685,14 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+          void *vmlinux_ctfa = nullptr;
+
           if (!vmlinux.empty())
             {
               ctxt =
                ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
+              open_vmlinux_ctf_archive(*ctxt, vmlinux_ctfa_path);
+              vmlinux_ctfa = vmlinux_ctf_archive(*ctxt);
 
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
@@ -2725,7 +2729,7 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
+                   set_vmlinux_ctf_archive(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();
-- 
2.35.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel
  2022-04-05 16:08 [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel Guillermo E. Martinez
@ 2022-04-25 18:33 ` Guillermo E. Martinez
  2022-04-27 22:16 ` [PATCH v2] " Guillermo E. Martinez
  2022-04-29 11:11 ` [PATCH] " Dodji Seketeli
  2 siblings, 0 replies; 6+ messages in thread
From: Guillermo E. Martinez @ 2022-04-25 18:33 UTC (permalink / raw)
  To: libabigail

On Tuesday, April 5, 2022 11:08:33 AM CDT Guillermo E. Martinez wrote:
Hello libabigail team,

Any comment about this patch?
 
> This patch is meant to improves the performance to extract  CTF
> information from Linux kernel, it depends of:
> 
> https://sourceware.org/pipermail/libabigail/2022q1/004262.html
> 
> Feedback will be welcomed and appreciated as ever!,
> [...]
Kind regards,
Guillermo




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] ctf-reader: Improves performance to read CTF information from Linux kernel
  2022-04-05 16:08 [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel Guillermo E. Martinez
  2022-04-25 18:33 ` Guillermo E. Martinez
@ 2022-04-27 22:16 ` Guillermo E. Martinez
  2022-04-29 11:11 ` [PATCH] " Dodji Seketeli
  2 siblings, 0 replies; 6+ messages in thread
From: Guillermo E. Martinez @ 2022-04-27 22:16 UTC (permalink / raw)
  To: libabigail

Hello libabigail team,

This is the v2 patch to improves the performance to extract  CTF
information from Linux kernel, major changes respect to v1 are:

   * Initialize `read_context::ctf_ctfa' per group, since it contains
   the debug information for vmlinux and kernel modules.
   * Get rid of utilities functions to locate, set and get the vmlinux
   archive pointer while corpus group is processed.

Feedback will be welcomed and appreciated as ever!,

Kind Regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), keep open per _group_
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group),
since this ctf archive file contains the information for kernel modules
as well, then it's reused by each module to be processed in the group,
invoking to `reset_read_context' ctf reader function.  Doing so,
`read_context::ctfa` member should be updated iff corpus origin is
`CTF_ORIGIN', otherwise it must contain valid information from the main
element in the corpus group, `ctf_close' must be called after processing
all group's members, that is in `reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::set_vmlinux_ctfa_path):
	Remove declaration.
	* src/abg-ctf-reader.cc (read_context::read_context): Use member
	initialization for `ctfa' member. Remove `vmlinux_ctfa_path_'.
	(read_context::~read_context): Add destructor and delete `ctfa'
	member.
	(ctf_reader::set_vmlinux_ctfa_path): Remove definition.
	(ctf_reader::read_corpus): Adjust `read_context::ctfa' if corpus
	origin is `LINUX_KERNEL_BINARY_ORIGIN', looking for
	'vmlinux.ctfa'. Invoke `process_ctf_archive', `sort_{functions,
	variables}' if ctf debug info was found.
	* src/abg-tools-utils.cc: (get_vmlinux_ctfa_path_from_kernel_dist):
	Remove definition.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h |  3 ---
 src/abg-ctf-reader.cc    | 48 +++++++++++++++++++---------------------
 src/abg-tools-utils.cc   | 40 ++-------------------------------
 3 files changed, 25 insertions(+), 66 deletions(-)

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..ba7289aa 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -48,9 +48,6 @@ void
 reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 2944be07..cc79520d 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@ public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -199,7 +195,9 @@ public:
   /// Constructor.
   ///
   /// @param elf_path the path to the ELF file.
-  read_context(const string& elf_path, ir::environment *env)
+  read_context(const string& elf_path, ir::environment *env) :
+    // ctfa data member can be used per courpus group
+    ctfa(NULL)
   {
     initialize(elf_path, env);
   }
@@ -212,11 +210,14 @@ public:
     elf_handler = NULL;
     elf_fd = -1;
     is_elf_exec = false;
-    ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1333,7 +1334,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
 
   int errp;
   if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
+    {
+      std::string filename;
+      if (tools_utils::base_name(ctxt->filename, filename)
+          && filename == "vmlinux")
+        {
+          std::string vmlinux_ctfa_path = ctxt->filename + ".ctfa";
+          ctxt->ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(), &errp);
+        }
+    }
   else
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
@@ -1346,14 +1355,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
   if (ctxt->ctfa == NULL)
     status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
   else
-    process_ctf_archive(ctxt, corp);
+    {
+      process_ctf_archive(ctxt, corp);
+      ctxt->cur_corpus_->sort_functions();
+      ctxt->cur_corpus_->sort_variables();
+    }
 
   ctxt->ir_env->canonicalization_is_done(true);
-  ctxt->cur_corpus_->sort_functions();
-  ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1383,7 +1393,7 @@ set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1436,18 +1446,6 @@ reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
-///
-/// @param ctxt the read_context to consider.
-///
-/// @param filename the @ref vmlinux CTFA filename to set.
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
-{
-  ctxt.vmlinux_ctfa_path_ = filename;
-}
-
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..47a33022 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2470,31 +2470,6 @@ get_vmlinux_path_from_kernel_dist(const string&	from,
   return found;
 }
 
-/// Get the paths of the CTF vmlinux archive under given directory.
-///
-/// @param from the directory under which to look for.
-///
-/// @param vmlinux_path output parameter.  The path of the CTF vmlinux
-/// binary that was found.
-///
-/// @return true if at least the path to the vmlinux.ctfa binary was found.
-#ifdef WITH_CTF
-bool
-get_vmlinux_ctfa_path_from_kernel_dist(const string& from,
-                                       string& vmlinux_ctfa_path)
-{
-  if (!dir_exists(from))
-    return false;
-
-  string dist_root = from;
-  bool found = false;
-  if (find_vmlinux_path(dist_root, vmlinux_ctfa_path, "vmlinux.ctfa"))
-    found = true;
-
-  return found;
-}
-#endif
-
 /// Get the paths of the vmlinux and kernel module binaries under
 /// given directory.
 ///
@@ -2576,15 +2551,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
   t.start();
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
-#ifdef WITH_CTF
-  string vmlinux_ctfa;
-  if (got_binary_paths &&
-      env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
-    {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
-    }
-#endif
 
   t.stop();
 
@@ -2684,12 +2650,11 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+
           if (!vmlinux.empty())
             {
               ctxt =
-               ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
-
+                ctf_reader::create_read_context(vmlinux, env.get());
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
 
@@ -2725,7 +2690,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();
-- 
2.35.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel
  2022-04-05 16:08 [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel Guillermo E. Martinez
  2022-04-25 18:33 ` Guillermo E. Martinez
  2022-04-27 22:16 ` [PATCH v2] " Guillermo E. Martinez
@ 2022-04-29 11:11 ` Dodji Seketeli
  2022-04-29 13:58   ` Guillermo E. Martinez
  2022-04-29 14:28   ` [PATCH v3] " Guillermo E. Martinez
  2 siblings, 2 replies; 6+ messages in thread
From: Dodji Seketeli @ 2022-04-29 11:11 UTC (permalink / raw)
  To: Guillermo E. Martinez via Libabigail

Hello,

"Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
écrit:

> Hello libabigail team,
>
> This patch is meant to improves the performance to extract  CTF
> information from Linux kernel, it depends of:
>
> https://sourceware.org/pipermail/libabigail/2022q1/004262.html
>
> Feedback will be welcomed and appreciated as ever!,

I am sorry, but the patch doesn't apply to the current master branch
anymore :-(

Would you please refresh it on your tree and re-send it?

Thanks.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel
  2022-04-29 11:11 ` [PATCH] " Dodji Seketeli
@ 2022-04-29 13:58   ` Guillermo E. Martinez
  2022-04-29 14:28   ` [PATCH v3] " Guillermo E. Martinez
  1 sibling, 0 replies; 6+ messages in thread
From: Guillermo E. Martinez @ 2022-04-29 13:58 UTC (permalink / raw)
  To: Guillermo E. Martinez via Libabigail, Dodji Seketeli

On Friday, April 29, 2022 6:11:29 AM CDT Dodji Seketeli wrote:
> Hello,
Hello Dodji,

> "Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
> écrit:
> 
> > Hello libabigail team,
> >
> > This patch is meant to improves the performance to extract  CTF
> > information from Linux kernel, it depends of:
> >
> > https://sourceware.org/pipermail/libabigail/2022q1/004262.html
> >
> > Feedback will be welcomed and appreciated as ever!,
> 
> I am sorry, but the patch doesn't apply to the current master branch
> anymore :-(
Sorry, hmmm my fault, I forget mention that patch:
    https://sourceware.org/pipermail/libabigail/2022q1/004262.html
depends on this:
   https://sourceware.org/pipermail/libabigail/2022q2/004279.html 
> Would you ple|ase refresh it on your tree and re-send it?
Sure, i will refresh this and its dependencies from master :-)

> Thanks.
Thanks!, 
guillermo





^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3] ctf-reader: Improves performance to read CTF information from Linux kernel
  2022-04-29 11:11 ` [PATCH] " Dodji Seketeli
  2022-04-29 13:58   ` Guillermo E. Martinez
@ 2022-04-29 14:28   ` Guillermo E. Martinez
  1 sibling, 0 replies; 6+ messages in thread
From: Guillermo E. Martinez @ 2022-04-29 14:28 UTC (permalink / raw)
  To: libabigail

Hello libabigail team,

This is the v3 patch to improves the performance to extract  CTF
information from Linux kernel, major changes respect to v2 are:

   * rebase from its dependency:
     https://sourceware.org/pipermail/libabigail/2022q2/004309.html

Please take a look and let me know your comments, I'll really appreciate
them.

Kind Regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), keep open per _group_
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group),
since this ctf archive file contains the information for kernel modules
as well, then it's reused by each module to be processed in the group,
invoking to `reset_read_context' ctf reader function.  Doing so,
`read_context::ctfa` member should be updated iff corpus origin is
`CTF_ORIGIN', otherwise it must contain valid information from the main
element in the corpus group, `ctf_close' must be called after processing
all group's members, that is in `reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::set_vmlinux_ctfa_path):
	Remove declaration.
	* src/abg-ctf-reader.cc (read_context::read_context): Use member
	initialization for `ctfa' member. Remove `vmlinux_ctfa_path_'.
	(read_context::~read_context): Add destructor and delete `ctfa'
	member.
	(ctf_reader::set_vmlinux_ctfa_path): Remove definition.
	(ctf_reader::read_corpus): Adjust `read_context::ctfa' if corpus
	origin is `LINUX_KERNEL_BINARY_ORIGIN', looking for
	'vmlinux.ctfa'. Invoke `process_ctf_archive', `sort_{functions,
	variables}' if ctf debug info was found.
	* src/abg-tools-utils.cc: (get_vmlinux_ctfa_path_from_kernel_dist):
	Remove definition.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h |  3 ---
 src/abg-ctf-reader.cc    | 48 +++++++++++++++++++---------------------
 src/abg-tools-utils.cc   | 40 ++-------------------------------
 3 files changed, 25 insertions(+), 66 deletions(-)

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..ba7289aa 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -48,9 +48,6 @@ void
 reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index dcc65d4e..8cb65b65 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@ public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -199,7 +195,9 @@ public:
   /// Constructor.
   ///
   /// @param elf_path the path to the ELF file.
-  read_context(const string& elf_path, ir::environment *env)
+  read_context(const string& elf_path, ir::environment *env) :
+    // ctfa data member can be used per courpus group
+    ctfa(NULL)
   {
     initialize(elf_path, env);
   }
@@ -212,11 +210,14 @@ public:
     elf_handler = NULL;
     elf_fd = -1;
     is_elf_exec = false;
-    ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1342,7 +1343,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
 
   int errp;
   if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
+    {
+      std::string filename;
+      if (tools_utils::base_name(ctxt->filename, filename)
+          && filename == "vmlinux")
+        {
+          std::string vmlinux_ctfa_path = ctxt->filename + ".ctfa";
+          ctxt->ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(), &errp);
+        }
+    }
   else
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
@@ -1355,14 +1364,15 @@ read_corpus(read_context *ctxt, elf_reader::status &status)
   if (ctxt->ctfa == NULL)
     status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
   else
-    process_ctf_archive(ctxt, corp);
+    {
+      process_ctf_archive(ctxt, corp);
+      ctxt->cur_corpus_->sort_functions();
+      ctxt->cur_corpus_->sort_variables();
+    }
 
   ctxt->ir_env->canonicalization_is_done(true);
-  ctxt->cur_corpus_->sort_functions();
-  ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1392,7 +1402,7 @@ set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1445,18 +1455,6 @@ reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
-///
-/// @param ctxt the read_context to consider.
-///
-/// @param filename the @ref vmlinux CTFA filename to set.
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
-{
-  ctxt.vmlinux_ctfa_path_ = filename;
-}
-
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..47a33022 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2470,31 +2470,6 @@ get_vmlinux_path_from_kernel_dist(const string&	from,
   return found;
 }
 
-/// Get the paths of the CTF vmlinux archive under given directory.
-///
-/// @param from the directory under which to look for.
-///
-/// @param vmlinux_path output parameter.  The path of the CTF vmlinux
-/// binary that was found.
-///
-/// @return true if at least the path to the vmlinux.ctfa binary was found.
-#ifdef WITH_CTF
-bool
-get_vmlinux_ctfa_path_from_kernel_dist(const string& from,
-                                       string& vmlinux_ctfa_path)
-{
-  if (!dir_exists(from))
-    return false;
-
-  string dist_root = from;
-  bool found = false;
-  if (find_vmlinux_path(dist_root, vmlinux_ctfa_path, "vmlinux.ctfa"))
-    found = true;
-
-  return found;
-}
-#endif
-
 /// Get the paths of the vmlinux and kernel module binaries under
 /// given directory.
 ///
@@ -2576,15 +2551,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
   t.start();
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
-#ifdef WITH_CTF
-  string vmlinux_ctfa;
-  if (got_binary_paths &&
-      env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
-    {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
-    }
-#endif
 
   t.stop();
 
@@ -2684,12 +2650,11 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+
           if (!vmlinux.empty())
             {
               ctxt =
-               ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
-
+                ctf_reader::create_read_context(vmlinux, env.get());
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
 
@@ -2725,7 +2690,6 @@ build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();

base-commit: 6716aa042bc84f79ae5292fc24d435048e26e078
prerequisite-patch-id: a85e80196ae4ae473f15c55245c386d2bf9b9a7c
prerequisite-patch-id: ef0f517f89af7ae3cc1b573532a4805493058fd4
-- 
2.35.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-04-29 14:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-05 16:08 [PATCH] ctf-reader: Improves performance to read CTF information from Linux kernel Guillermo E. Martinez
2022-04-25 18:33 ` Guillermo E. Martinez
2022-04-27 22:16 ` [PATCH v2] " Guillermo E. Martinez
2022-04-29 11:11 ` [PATCH] " Dodji Seketeli
2022-04-29 13:58   ` Guillermo E. Martinez
2022-04-29 14:28   ` [PATCH v3] " Guillermo E. Martinez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).