From: Matthias Maennich <maennich@google.com>
To: Giuliano Procida <gprocida@google.com>
Cc: libabigail@sourceware.org, Dodji Seketeli <dodji@seketeli.org>,
kernel-team@android.com
Subject: Re: [RESEND PATCH] reader context: do not reuse current corpus and corpus_group
Date: Mon, 4 May 2020 15:21:54 +0200 [thread overview]
Message-ID: <20200504132154.GA80772@google.com> (raw)
In-Reply-To: <CAGvU0HkH5a2FwDYSSfQ9qMYUztEBgYe-4BguyKb=e76+zY7m2A@mail.gmail.com>
On Fri, May 01, 2020 at 03:23:05PM +0100, Giuliano Procida wrote:
>Hi.
>
>Does this change have any impact on incomplete (forward-declared) type
>differences?
>Does it significantly impact performance on large inputs?
This was a prerequisite change for sorting the corpus vector. But we
replace that by sorting the corpora much earlier. Still this is a useful
change for others coming here and hitting an issue with the reuse (as I
did).
>
>Otherwise, this change makes sense and
>Reviewed-by: Giuliano Procida <gprocida@google.com>
>
>More generally, do you have a good understanding of what in the
>context is generic to the tool run and what is specific to a file
>being read and can they be separated cleanly?
There is more opportunity to separate things a bit (also for potential
multithreaded execution). As it stands, the read_context is a very
central object in the dwarf reader that contains likely too much data
and carries too much responsibility. I am working on splitting out the
symtab reader part to address such a concern. (Also to make it testable
individually).
Cheers,
Matthias
>
>Giuliano.
>
>On Thu, 30 Apr 2020 at 22:53, Matthias Maennich <maennich@google.com> wrote:
>>
>> libabigail's readers (abg-reader and abg-dwarf-reader) currently spare
>> some allocations by reusing the reader context's existing current corpus
>> and current corpus group. When building a corpus_group's vector of
>> corpora, reusing the shared_ptr referring to a corpus means we are
>> modifying the corpus data of a previously read corpus. As a user of the
>> read*corpus functions, that isn't entirely transparent and when storing
>> corpare like in the vector above, we might introduce subtle bugs.
>>
>> Fix this by explicitly creating new corpus / corpus group instances when
>> reading from elf/dwarf or xml.
>>
>> * src/abg-dwarf-reader.cc (read_debug_info_into_corpus): always
>> instantiate a new corpus instance.
>> * src/abg-reader.cc (read_corpus_from_input): Likewise.
>> (read_corpus_group_from_input): always instantiate a new corpus
>> group instance.
>>
>> Signed-off-by: Matthias Maennich <maennich@google.com>
>> ---
>> src/abg-dwarf-reader.cc | 11 ++++-----
>> src/abg-reader.cc | 49 +++++++++++++++++------------------------
>> 2 files changed, 24 insertions(+), 36 deletions(-)
>>
>> diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
>> index 850281ad1ce5..d99d25bfb3b9 100644
>> --- a/src/abg-dwarf-reader.cc
>> +++ b/src/abg-dwarf-reader.cc
>> @@ -16068,13 +16068,10 @@ read_debug_info_into_corpus(read_context& ctxt)
>> {
>> ctxt.clear_per_corpus_data();
>>
>> - if (!ctxt.current_corpus())
>> - {
>> - corpus_sptr corp (new corpus(ctxt.env(), ctxt.elf_path()));
>> - ctxt.current_corpus(corp);
>> - if (!ctxt.env())
>> - ctxt.env(corp->get_environment());
>> - }
>> + corpus_sptr corp(new corpus(ctxt.env(), ctxt.elf_path()));
>> + ctxt.current_corpus(corp);
>> + if (!ctxt.env())
>> + ctxt.env(corp->get_environment());
>>
>> // First set some mundane properties of the corpus gathered from
>> // ELF.
>> diff --git a/src/abg-reader.cc b/src/abg-reader.cc
>> index 255a200f2a25..0be45ec59a39 100644
>> --- a/src/abg-reader.cc
>> +++ b/src/abg-reader.cc
>> @@ -25,6 +25,7 @@
>> /// ABI Instrumentation file in libabigail native XML format. This
>> /// native XML format is named "abixml".
>>
>> +#include "abg-fwd.h"
>> #include "config.h"
>> #include <cstring>
>> #include <cstdlib>
>> @@ -1899,17 +1900,14 @@ read_corpus_from_input(read_context& ctxt)
>> BAD_CAST("abi-corpus")))
>> return nil;
>>
>> - if (!ctxt.get_corpus())
>> - {
>> - corpus_sptr c(new corpus(ctxt.get_environment(), ""));
>> - ctxt.set_corpus(c);
>> - }
>> + corpus_sptr corp(new corpus(ctxt.get_environment(), ""));
>> + ctxt.set_corpus(corp);
>>
>> if (!ctxt.get_corpus_group())
>> ctxt.clear_per_corpus_data();
>>
>> - corpus& corp = *ctxt.get_corpus();
>> - ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get());
>> + ctxt.set_exported_decls_builder(
>> + corp->get_exported_decls_builder().get());
>>
>> xml::xml_char_sptr path_str = XML_READER_GET_ATTRIBUTE(reader, "path");
>> string path;
>> @@ -1917,13 +1915,13 @@ read_corpus_from_input(read_context& ctxt)
>> if (path_str)
>> {
>> path = reinterpret_cast<char*>(path_str.get());
>> - corp.set_path(path);
>> + corp->set_path(path);
>> }
>>
>> xml::xml_char_sptr architecture_str =
>> XML_READER_GET_ATTRIBUTE(reader, "architecture");
>> if (architecture_str)
>> - corp.set_architecture_name
>> + corp->set_architecture_name
>> (reinterpret_cast<char*>(architecture_str.get()));
>>
>> xml::xml_char_sptr soname_str =
>> @@ -1933,7 +1931,7 @@ read_corpus_from_input(read_context& ctxt)
>> if (soname_str)
>> {
>> soname = reinterpret_cast<char*>(soname_str.get());
>> - corp.set_soname(soname);
>> + corp->set_soname(soname);
>> }
>>
>> // Apply suppression specifications here to honour:
>> @@ -1955,32 +1953,29 @@ read_corpus_from_input(read_context& ctxt)
>> }
>> else
>> {
>> - if (!ctxt.get_corpus())
>> - {
>> - corpus_sptr c(new corpus(ctxt.get_environment(), ""));
>> - ctxt.set_corpus(c);
>> - }
>> + corpus_sptr corp(new corpus(ctxt.get_environment(), ""));
>> + ctxt.set_corpus(corp);
>>
>> if (!ctxt.get_corpus_group())
>> ctxt.clear_per_corpus_data();
>>
>> - corpus& corp = *ctxt.get_corpus();
>> - ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get());
>> + ctxt.set_exported_decls_builder(
>> + corp->get_exported_decls_builder().get());
>>
>> xml::xml_char_sptr path_str = XML_NODE_GET_ATTRIBUTE(node, "path");
>> if (path_str)
>> - corp.set_path(reinterpret_cast<char*>(path_str.get()));
>> + corp->set_path(reinterpret_cast<char*>(path_str.get()));
>>
>> xml::xml_char_sptr architecture_str =
>> XML_NODE_GET_ATTRIBUTE(node, "architecture");
>> if (architecture_str)
>> - corp.set_architecture_name
>> + corp->set_architecture_name
>> (reinterpret_cast<char*>(architecture_str.get()));
>>
>> xml::xml_char_sptr soname_str =
>> XML_NODE_GET_ATTRIBUTE(node, "soname");
>> if (soname_str)
>> - corp.set_soname(reinterpret_cast<char*>(soname_str.get()));
>> + corp->set_soname(reinterpret_cast<char*>(soname_str.get()));
>> }
>>
>> if (!node->children)
>> @@ -2093,14 +2088,10 @@ read_corpus_group_from_input(read_context& ctxt)
>> BAD_CAST("abi-corpus-group")))
>> return nil;
>>
>> - if (!ctxt.get_corpus_group())
>> - {
>> - corpus_group_sptr g(new corpus_group(ctxt.get_environment(),
>> - ctxt.get_path()));
>> - ctxt.set_corpus_group(g);
>> - }
>> + corpus_group_sptr group(
>> + new corpus_group(ctxt.get_environment(), ctxt.get_path()));
>> + ctxt.set_corpus_group(group);
>>
>> - corpus_group_sptr group = ctxt.get_corpus_group();
>> xml::xml_char_sptr path_str = XML_READER_GET_ATTRIBUTE(reader, "path");
>> if (path_str)
>> group->set_path(reinterpret_cast<char*>(path_str.get()));
>> @@ -2115,11 +2106,11 @@ read_corpus_group_from_input(read_context& ctxt)
>>
>> corpus_sptr corp;
>> while ((corp = read_corpus_from_input(ctxt)))
>> - ctxt.get_corpus_group()->add_corpus(corp);
>> + group->add_corpus(corp);
>>
>> xmlTextReaderNext(reader.get());
>>
>> - return ctxt.get_corpus_group();
>> + return group;
>> }
>>
>> /// De-serialize an ABI corpus group from an input XML document which
>> --
>> 2.26.2.526.g744177e7f7-goog
>>
next prev parent reply other threads:[~2020-05-04 13:21 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-30 21:52 Matthias Maennich
2020-05-01 14:23 ` Giuliano Procida
2020-05-04 13:21 ` Matthias Maennich [this message]
2020-05-04 14:42 ` Dodji Seketeli
2020-05-04 18:03 ` Matthias Maennich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200504132154.GA80772@google.com \
--to=maennich@google.com \
--cc=dodji@seketeli.org \
--cc=gprocida@google.com \
--cc=kernel-team@android.com \
--cc=libabigail@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).