From: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
To: libabigail@sourceware.org
Cc: "Guillermo E. Martinez" <guillermo.e.martinez@oracle.com>
Subject: [PATCH] ctf-reader: Lookup debug info for symbols in a non default archive member
Date: Wed, 31 Aug 2022 10:16:03 -0500
Message-ID: <20220831151603.915945-1-guillermo.e.martinez@oracle.com>
Hello,
This patch improves the ABI XML file generated by the ctf reader: some
Linux symbols (EXPORT_SYMBOL*) were missing from it.
Comments are welcome and appreciated!
Thanks in advance,
guillermo
--
The mechanism the ctf reader currently uses to look for the debug
information of a given Linux symbol is to open the (default)
dictionary whose name matches the binary being processed in the
current corpus, e.g. `vmlinux' or `module-name.ko'. However, the
information for some symbols is not located in the default dictionary.
This becomes evident when comparing the symbols in the `Module.symvers'
file with the ABI XML file. For example, the ctf reader expects to
find the information for the `LZ4_decompress_fast' symbol in the CTF
`vmlinux' archive member, because this symbol is defined in the
`vmlinux' binary:
0x4c416eb9 LZ4_decompress_fast vmlinux EXPORT_SYMBOL
But it turns out that it is missing there. The correct location is
the `vmlinux#0' dictionary.
CTF archive member: vmlinux:
...
Function objects:
...
CTF archive member: vmlinux#0:
Function objects:
...
LZ4_decompress_fast -> 0x80037400: (kind 5) int (*) (const char *, char *, int) (aligned at 0x8)
...
Therefore, when a symbol is not found in the default dictionary, the
ctf reader now looks for its debug information in the whole archive.
Fortunately, `libctf' provides a fast lookup mechanism using caches,
dictionary references, etc., so the performance penalty is ~10%.
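The lookup order described above can be illustrated with a small,
self-contained model. This is only a sketch under stated assumptions:
the `dictionary' and `archive' types and the `lookup_symbol_in_archive'
helper below are hypothetical stand-ins built on standard containers,
not libctf types or functions, and the sketch ignores dictionary
ownership (opening and closing members) entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the CTF concepts; none of these are libctf types.
using type_id = unsigned long;

struct dictionary                          // models one CTF archive member
{
  std::string name;                        // e.g. "vmlinux" or "vmlinux#0"
  std::map<std::string, type_id> symbols;  // symbol name -> type id
};

using archive = std::vector<dictionary>;   // models the whole CTF archive

// Look for SYM_NAME in the default member first; if it is not there,
// fall back to scanning every member of the archive.  Returns the
// index of the member that holds the symbol together with its type
// id, or an empty optional if no member knows the symbol.
std::optional<std::pair<std::size_t, type_id>>
lookup_symbol_in_archive(const archive& ar, std::size_t default_member,
                         const std::string& sym_name)
{
  assert(default_member < ar.size());

  // Fast path: the default dictionary.
  const auto& def = ar[default_member].symbols;
  auto it = def.find(sym_name);
  if (it != def.end())
    return std::make_pair(default_member, it->second);

  // Slow path: every member of the archive, in order.
  for (std::size_t i = 0; i < ar.size(); ++i)
    {
      auto hit = ar[i].symbols.find(sym_name);
      if (hit != ar[i].symbols.end())
        return std::make_pair(i, hit->second);
    }

  return std::nullopt;
}
```

With an archive modelled as a `vmlinux' member followed by a
`vmlinux#0' member that holds `LZ4_decompress_fast', a lookup that
starts at member 0 reports member 1 as the location, which mirrors the
situation shown in the dump above.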
* src/abg-ctf-reader.cc (lookup_symbol_in_ctf_archive): New function.
(process_ctf_archive): Use `lookup_symbol_in_ctf_archive'.
Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
src/abg-ctf-reader.cc | 72 ++++++++++++++++++++++++++++++++++++++-----
1 file changed, 64 insertions(+), 8 deletions(-)
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 71808f9a..8fa98a94 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -1204,6 +1204,62 @@ lookup_type(read_context *ctxt, corpus_sptr corp,
return result;
}
+/// Given a symbol name, look up the corresponding CTF information in
+/// the default dictionary (the CTF archive member provided by the
+/// caller).  If the search is not successful, look for the symbol
+/// name in _all_ archive members.
+///
+/// @param ctfa the CTF archive.
+/// @param ctf_dict the default dictionary to look in.
+/// @param sym_name the symbol name.
+/// @param corp the IR corpus.
+///
+/// Note that if @ref sym_name is found in a dictionary other than the
+/// default one, @ref ctf_dict will be updated and it must be explicitly
+/// closed by its caller.
+///
+/// @return a valid CTF type id if @ref sym_name was found, -1 otherwise.
+
+static ctf_id_t
+lookup_symbol_in_ctf_archive(ctf_archive_t *ctfa, ctf_dict_t **ctf_dict,
+ const char *sym_name, corpus_sptr corp)
+{
+ int ctf_err;
+ ctf_dict_t *dict = *ctf_dict;
+ ctf_id_t ctf_type = ctf_lookup_variable(dict, sym_name);
+
+ /* lookup CTF type for a given symbol in its default
+ dictionary */
+ if (ctf_type == (ctf_id_t) -1
+ && !(corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN))
+ ctf_type = ctf_lookup_by_symbol_name(dict, sym_name);
+
+ /* No luck: search the whole archive.  */
+ if (ctf_type == (ctf_id_t) -1)
+ {
+ ctf_dict_t *fp;
+ ctf_next_t *i = NULL;
+ const char *arcname;
+
+ while ((fp = ctf_archive_next(ctfa, &i, &arcname, 1, &ctf_err)) != NULL)
+ {
+ ctf_type = ctf_lookup_variable (fp, sym_name);
+ if (ctf_type == (ctf_id_t) -1
+ && !(corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN))
+ ctf_type = ctf_lookup_by_symbol_name(fp, sym_name);
+
+ if (ctf_type != (ctf_id_t) -1)
+ {
+ *ctf_dict = fp;
+ break;
+ }
+ ctf_dict_close(fp);
+ }
+ }
+
+ return ctf_type;
+}
+
/// Process a CTF archive and create libabigail IR for the types,
/// variables and function declarations found in the archive, iterating
/// over public symbols. The IR is added to the given corpus.
@@ -1222,7 +1278,7 @@ process_ctf_archive(read_context *ctxt, corpus_sptr corp)
corp->add(ir_translation_unit);
int ctf_err;
- ctf_dict_t *ctf_dict;
+ ctf_dict_t *ctf_dict, *dict_tmp;
const auto symtab = ctxt->symtab;
symtab_reader::symtab_filter filter = symtab->make_filter();
filter.set_public_symbols();
@@ -1248,19 +1304,17 @@ process_ctf_archive(read_context *ctxt, corpus_sptr corp)
abort();
}
+ dict_tmp = ctf_dict;
+
for (const auto& symbol : symtab_reader::filtered_symtab(*symtab, filter))
{
std::string sym_name = symbol->get_name();
ctf_id_t ctf_sym_type;
- ctf_sym_type = ctf_lookup_variable(ctf_dict, sym_name.c_str());
- if (ctf_sym_type == (ctf_id_t) -1
- && !(corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN))
- // lookup in function objects
- ctf_sym_type = ctf_lookup_by_symbol_name(ctf_dict, sym_name.c_str());
-
+ ctf_sym_type = lookup_symbol_in_ctf_archive(ctxt->ctfa, &ctf_dict,
+ sym_name.c_str(), corp);
if (ctf_sym_type == (ctf_id_t) -1)
- continue;
+ continue;
if (ctf_type_kind(ctf_dict, ctf_sym_type) != CTF_K_FUNCTION)
{
@@ -1305,6 +1359,8 @@ process_ctf_archive(read_context *ctxt, corpus_sptr corp)
func_declaration->set_is_in_public_symbol_table(true);
ctxt->maybe_add_fn_to_exported_decls(func_declaration.get());
}
+
+ ctf_dict = dict_tmp;
}
ctf_dict_close(ctf_dict);
--
2.35.1
Thread overview: 6+ messages
2022-08-31 15:16 Guillermo E. Martinez [this message]
2022-08-31 15:16 ` Guillermo E. Martinez
2022-09-06 12:49 ` Dodji Seketeli
2022-09-07 18:40 ` Guillermo E. Martinez
2022-09-07 23:40 ` [PATCHv v2] " Guillermo E. Martinez
2022-09-13 9:26 ` Dodji Seketeli