From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by sourceware.org (Postfix) with ESMTPS id 5A34338708E6 for ; Mon, 4 May 2020 13:21:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5A34338708E6 Received: by mail-wr1-x443.google.com with SMTP id g13so20877647wrb.8 for ; Mon, 04 May 2020 06:21:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=ImQGzxRpIvuiGVeljsaRiAvMsMRuAwal1BOVcJVFpGw=; b=FO30GWUBU1PQfZMyfhxwAt9ta9nTtFzc3jHyEUwprsHbK6+fE/QOYnX7M2nZo98TyI LoXSaaqwcFcFbmWmQTgKUVeXoPGCD83Zp1urIajFis2mZ2S4BpePZF7y2nuh5Qc++R36 3xpb94Xsa/9V6IU4y6EQxd7wQzfocTUnkUlS6k7EWEuo2qTNGiwvVAksbPnhqEu6rm6j CATRb93zJsNR8TZ0alDTEWlaEFTtPKLByAHP2BQ6VAyygxNuC88De6dLFc/QzUFPCbLg 2rD8hhQ5/lSf4K4n2sYJ9ccvhD+V4aylhoV8yweP/Gx2Ho6UgyV5KrwQuriQmaOAVZRC hzXg== X-Gm-Message-State: AGi0PubbGvPxWcs64Vjq80HYO1a0WiaZWBShfjMgMnOO1GQuBu+ktaVI ZrjS+k1eI6TB9tGBlwEK+6af9A== X-Google-Smtp-Source: APiQypJ3X+/sj7lQIjYSoWSAkPZQG6flKVEt56/fUlFxf3yM2B8t9us0zEgESmIY4vWvZpg5tcnNkg== X-Received: by 2002:a05:6000:12c7:: with SMTP id l7mr20056648wrx.239.1588598515887; Mon, 04 May 2020 06:21:55 -0700 (PDT) Received: from google.com ([2a00:79e0:d:210:e8f7:125b:61e9:733d]) by smtp.gmail.com with ESMTPSA id n2sm18416917wrt.33.2020.05.04.06.21.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 May 2020 06:21:55 -0700 (PDT) Date: Mon, 4 May 2020 15:21:54 +0200 From: Matthias Maennich To: Giuliano Procida Cc: libabigail@sourceware.org, Dodji Seketeli , kernel-team@android.com Subject: Re: [RESEND PATCH] reader context: do not reuse current corpus and corpus_group Message-ID: <20200504132154.GA80772@google.com> References: <20200430215256.19135-1-maennich@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-41.3 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libabigail mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2020 13:22:07 -0000 On Fri, May 01, 2020 at 03:23:05PM +0100, Giuliano Procida wrote: >Hi. > >Does this change have any impact on incomplete (forward-declared) type >differences? >Does it significantly impact performance on large inputs? This was a prerequisite change for sorting the corpus vector. But we replace that by sorting the corpora much earlier. Still this is a useful change for others coming here and hitting an issue with the reuse (as I did). > >Otherwise, this change makes sense and >Reviewed-by: Giuliano Procida > >More generally, do you have a good understanding of what in the >context is generic to the tool run and what is specific to a file >being read and can they be separated cleanly? There is more opportunity to separate things a bit (also for potential multithreaded execution). As it stands, the read_context is a very central object in the dwarf reader that contains likely too much data and carries too much responsibility. I am working on splitting out the symtab reader part to address such a concern. (Also to make it testable individually). Cheers, Matthias > >Giuliano. > >On Thu, 30 Apr 2020 at 22:53, Matthias Maennich wrote: >> >> libabigail's readers (abg-reader and abg-dwarf-reader) currently spare >> some allocations by reusing the reader context's existing current corpus >> and current corpus group. When building a corpus_group's vector of >> corpora, reusing the shared_ptr referring to a corpus means we are >> modifying the corpus data of a previously read corpus. As a user of the >> read*corpus functions, that isn't entirely transparent and when storing >> corpare like in the vector above, we might introduce subtle bugs. >> >> Fix this by explicitly creating new corpus / corpus group instances when >> reading from elf/dwarf or xml. >> >> * src/abg-dwarf-reader.cc (read_debug_info_into_corpus): always >> instantiate a new corpus instance. >> * src/abg-reader.cc (read_corpus_from_input): Likewise. >> (read_corpus_group_from_input): always instantiate a new corpus >> group instance. >> >> Signed-off-by: Matthias Maennich >> --- >> src/abg-dwarf-reader.cc | 11 ++++----- >> src/abg-reader.cc | 49 +++++++++++++++++------------------------ >> 2 files changed, 24 insertions(+), 36 deletions(-) >> >> diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc >> index 850281ad1ce5..d99d25bfb3b9 100644 >> --- a/src/abg-dwarf-reader.cc >> +++ b/src/abg-dwarf-reader.cc >> @@ -16068,13 +16068,10 @@ read_debug_info_into_corpus(read_context& ctxt) >> { >> ctxt.clear_per_corpus_data(); >> >> - if (!ctxt.current_corpus()) >> - { >> - corpus_sptr corp (new corpus(ctxt.env(), ctxt.elf_path())); >> - ctxt.current_corpus(corp); >> - if (!ctxt.env()) >> - ctxt.env(corp->get_environment()); >> - } >> + corpus_sptr corp(new corpus(ctxt.env(), ctxt.elf_path())); >> + ctxt.current_corpus(corp); >> + if (!ctxt.env()) >> + ctxt.env(corp->get_environment()); >> >> // First set some mundane properties of the corpus gathered from >> // ELF. >> diff --git a/src/abg-reader.cc b/src/abg-reader.cc >> index 255a200f2a25..0be45ec59a39 100644 >> --- a/src/abg-reader.cc >> +++ b/src/abg-reader.cc >> @@ -25,6 +25,7 @@ >> /// ABI Instrumentation file in libabigail native XML format. This >> /// native XML format is named "abixml". >> >> +#include "abg-fwd.h" >> #include "config.h" >> #include >> #include >> @@ -1899,17 +1900,14 @@ read_corpus_from_input(read_context& ctxt) >> BAD_CAST("abi-corpus"))) >> return nil; >> >> - if (!ctxt.get_corpus()) >> - { >> - corpus_sptr c(new corpus(ctxt.get_environment(), "")); >> - ctxt.set_corpus(c); >> - } >> + corpus_sptr corp(new corpus(ctxt.get_environment(), "")); >> + ctxt.set_corpus(corp); >> >> if (!ctxt.get_corpus_group()) >> ctxt.clear_per_corpus_data(); >> >> - corpus& corp = *ctxt.get_corpus(); >> - ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get()); >> + ctxt.set_exported_decls_builder( >> + corp->get_exported_decls_builder().get()); >> >> xml::xml_char_sptr path_str = XML_READER_GET_ATTRIBUTE(reader, "path"); >> string path; >> @@ -1917,13 +1915,13 @@ read_corpus_from_input(read_context& ctxt) >> if (path_str) >> { >> path = reinterpret_cast(path_str.get()); >> - corp.set_path(path); >> + corp->set_path(path); >> } >> >> xml::xml_char_sptr architecture_str = >> XML_READER_GET_ATTRIBUTE(reader, "architecture"); >> if (architecture_str) >> - corp.set_architecture_name >> + corp->set_architecture_name >> (reinterpret_cast(architecture_str.get())); >> >> xml::xml_char_sptr soname_str = >> @@ -1933,7 +1931,7 @@ read_corpus_from_input(read_context& ctxt) >> if (soname_str) >> { >> soname = reinterpret_cast(soname_str.get()); >> - corp.set_soname(soname); >> + corp->set_soname(soname); >> } >> >> // Apply suppression specifications here to honour: >> @@ -1955,32 +1953,29 @@ read_corpus_from_input(read_context& ctxt) >> } >> else >> { >> - if (!ctxt.get_corpus()) >> - { >> - corpus_sptr c(new corpus(ctxt.get_environment(), "")); >> - ctxt.set_corpus(c); >> - } >> + corpus_sptr corp(new corpus(ctxt.get_environment(), "")); >> + ctxt.set_corpus(corp); >> >> if (!ctxt.get_corpus_group()) >> ctxt.clear_per_corpus_data(); >> >> - corpus& corp = *ctxt.get_corpus(); >> - ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get()); >> + ctxt.set_exported_decls_builder( >> + corp->get_exported_decls_builder().get()); >> >> xml::xml_char_sptr path_str = XML_NODE_GET_ATTRIBUTE(node, "path"); >> if (path_str) >> - corp.set_path(reinterpret_cast(path_str.get())); >> + corp->set_path(reinterpret_cast(path_str.get())); >> >> xml::xml_char_sptr architecture_str = >> XML_NODE_GET_ATTRIBUTE(node, "architecture"); >> if (architecture_str) >> - corp.set_architecture_name >> + corp->set_architecture_name >> (reinterpret_cast(architecture_str.get())); >> >> xml::xml_char_sptr soname_str = >> XML_NODE_GET_ATTRIBUTE(node, "soname"); >> if (soname_str) >> - corp.set_soname(reinterpret_cast(soname_str.get())); >> + corp->set_soname(reinterpret_cast(soname_str.get())); >> } >> >> if (!node->children) >> @@ -2093,14 +2088,10 @@ read_corpus_group_from_input(read_context& ctxt) >> BAD_CAST("abi-corpus-group"))) >> return nil; >> >> - if (!ctxt.get_corpus_group()) >> - { >> - corpus_group_sptr g(new corpus_group(ctxt.get_environment(), >> - ctxt.get_path())); >> - ctxt.set_corpus_group(g); >> - } >> + corpus_group_sptr group( >> + new corpus_group(ctxt.get_environment(), ctxt.get_path())); >> + ctxt.set_corpus_group(group); >> >> - corpus_group_sptr group = ctxt.get_corpus_group(); >> xml::xml_char_sptr path_str = XML_READER_GET_ATTRIBUTE(reader, "path"); >> if (path_str) >> group->set_path(reinterpret_cast(path_str.get())); >> @@ -2115,11 +2106,11 @@ read_corpus_group_from_input(read_context& ctxt) >> >> corpus_sptr corp; >> while ((corp = read_corpus_from_input(ctxt))) >> - ctxt.get_corpus_group()->add_corpus(corp); >> + group->add_corpus(corp); >> >> xmlTextReaderNext(reader.get()); >> >> - return ctxt.get_corpus_group(); >> + return group; >> } >> >> /// De-serialize an ABI corpus group from an input XML document which >> -- >> 2.26.2.526.g744177e7f7-goog >>