From: Dodji Seketeli
To: libabigail@sourceware.org
Cc: gprocida@google.com, Dodji Seketeli
Subject: [PATCH 4/4] {dwarf,elf_based}-reader,writer: Avoid duplicating corpora in corpus_group
Organization: Red Hat / France
References: <878rhh8xwb.fsf@redhat.com> <87mt5vt3op.fsf@seketeli.org>
Date: Fri, 03 Feb 2023 12:49:10 +0100
In-Reply-To: <87mt5vt3op.fsf@seketeli.org> (Dodji Seketeli's message of "Fri, 03 Feb 2023 11:59:34 +0100")
Message-ID: <87357nt1e1.fsf_-_@redhat.com>

Hello,

It's been brought to my attention on IRC that running abidw
--linux-tree results in a corpus group that duplicates every single
corpus in the resulting abixml.  Oops.

This is because both dwarf::reader::read_corpus() and
elf_based_reader::read_and_add_corpus_to_group() add the corpus to the
corpus_group, and yet the latter function calls the former.  So the
corpus is added to the corpus_group twice.

This patch ensures that
elf_based_reader::read_and_add_corpus_to_group() is the only function
that adds the corpus to the group.  It also ensures that this happens
before the corpus is constructed from the debug info, because that is
useful for sharing types among the various corpora.  Otherwise, those
types are potentially duplicated in the IR of each corpus.

The patch also makes the abixml writer enforce that each corpus is
emitted only once.

	* src/abg-dwarf-reader.cc (reader::read_debug_info_into_corpus):
	Do not add the corpus to the group here ...
	* src/abg-elf-based-reader.cc
	(elf_based_reader::read_and_add_corpus_to_group): ... because it's
	already added here.  But then, let's add it here /before/ reading
	type & symbols information into the corpus.
	* src/abg-writer.cc (write_context::m_emitted_corpora_set): Add
	new data member.
	(write_context::{corpus_is_emitted, record_corpus_as_emitted}):
	Define new member functions.
	(write_corpus): Invoke the new
	write_context::record_corpus_as_emitted here.
	(write_corpus_group): Ensure that each corpus is emitted only
	once.

Signed-off-by: Dodji Seketeli
---
 src/abg-dwarf-reader.cc     |  3 ---
 src/abg-elf-based-reader.cc |  4 +---
 src/abg-writer.cc           | 45 +++++++++++++++++++++++++++++++++++--
 3 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 566c9db1..92ce6c6a 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -2121,9 +2121,6 @@ public:
       corpus()->set_soname(dt_soname());
       corpus()->set_needed(dt_needed());
       corpus()->set_architecture_name(elf_architecture());
-      if (corpus_group_sptr group = corpus_group())
-        group->add_corpus(corpus());
-
       // Set symbols information to the corpus.
       corpus()->set_symtab(symtab());
 
diff --git a/src/abg-elf-based-reader.cc b/src/abg-elf-based-reader.cc
index cd7b59b6..d1d9a2df 100644
--- a/src/abg-elf-based-reader.cc
+++ b/src/abg-elf-based-reader.cc
@@ -92,10 +92,8 @@ ir::corpus_sptr
 elf_based_reader::read_and_add_corpus_to_group(ir::corpus_group& group,
                                                fe_iface::status& status)
 {
+  group.add_corpus(corpus());
   ir::corpus_sptr corp = read_corpus(status);
-
-  if (status & fe_iface::STATUS_OK)
-    group.add_corpus(corp);
   return corp;
 }
 
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index 1fb067b8..9fe3dec7 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -230,7 +230,8 @@ class write_context
   class_tmpl_shared_ptr_map m_class_tmpl_id_map;
   string_elf_symbol_sptr_map_type m_fun_symbol_map;
   string_elf_symbol_sptr_map_type m_var_symbol_map;
-  unordered_set m_emitted_decls_set;
+  unordered_set m_emitted_decls_set;
+  unordered_set m_emitted_corpora_set;
 
   write_context();
 
@@ -818,6 +819,42 @@ public:
     m_emitted_decls_set.insert(irepr);
   }
 
+  /// Test if a corpus has already been emitted.
+  ///
+  /// A corpus is emitted if it's been recorded as having been emitted
+  /// by the function record_corpus_as_emitted().
+  ///
+  /// @param corp the corpus to consider.
+  ///
+  /// @return true iff the corpus @p corp has been emitted.
+  bool
+  corpus_is_emitted(const corpus_sptr& corp)
+  {
+    if (!corp)
+      return false;
+
+    if (m_emitted_corpora_set.find(corp->get_path())
+        == m_emitted_corpora_set.end())
+      return false;
+
+    return true;
+  }
+
+  /// Record the corpus as having been emitted.
+  ///
+  /// @param corp the corpus to consider.
+  void
+  record_corpus_as_emitted(const corpus_sptr& corp)
+  {
+    if (!corp)
+      return;
+
+    const string& path = corp->get_path();
+    ABG_ASSERT(!path.empty());
+
+    m_emitted_corpora_set.insert(path);
+  }
+
   /// Get the set of types that have been emitted.
   ///
   /// @return the set of types that have been emitted.
@@ -4588,6 +4625,7 @@ write_corpus(write_context& ctxt,
   out << "\n";
 
   ctxt.clear_referenced_types();
+  ctxt.record_corpus_as_emitted(corpus);
 
   return true;
 }
@@ -4639,7 +4677,10 @@ std::ostream& out = ctxt.get_ostream();
          group->get_corpora().begin();
        c != group->get_corpora().end();
        ++c)
-    write_corpus(ctxt, *c, get_indent_to_level(ctxt, indent, 1), true);
+    {
+      ABG_ASSERT(!ctxt.corpus_is_emitted(*c));
+      write_corpus(ctxt, *c, get_indent_to_level(ctxt, indent, 1), true);
+    }
 
   do_indent_to_level(ctxt, indent, 0);
   out << "\n";
-- 
2.31.1

-- 
		Dodji
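
For readers who want to see the idea outside of the libabigail sources, here is a minimal, standalone sketch of the two invariants the patch relies on: a single function adds the corpus to the group (and does so before reading), and the writer keeps a set of already-emitted corpus paths.  Every name below (Corpus, CorpusGroup, Reader, Writer, write_group, write_to_string) is a hypothetical stand-in, not the libabigail API.  Note also one deliberate difference: the patch asserts that a corpus has not been emitted yet, whereas this sketch simply skips a duplicate so that the behaviour is easy to observe.

```cpp
#include <cstddef>
#include <iostream>
#include <memory>
#include <sstream>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the IR types; not libabigail's.
struct Corpus
{
  std::string path;
};
using CorpusPtr = std::shared_ptr<Corpus>;

struct CorpusGroup
{
  std::vector<CorpusPtr> corpora;
  void add_corpus(const CorpusPtr& c) { corpora.push_back(c); }
};

// Hypothetical reader: only read_and_add_corpus_to_group() touches the
// group, and it does so before the corpus content is read.
struct Reader
{
  CorpusPtr corpus;

  explicit Reader(std::string path)
    : corpus(std::make_shared<Corpus>(Corpus{std::move(path)}))
  {}

  // Reads the corpus content; must not add anything to a group.
  CorpusPtr read_corpus() { return corpus; }

  CorpusPtr read_and_add_corpus_to_group(CorpusGroup& group)
  {
    group.add_corpus(corpus);
    return read_corpus();
  }
};

// Hypothetical writer that remembers the paths of emitted corpora.
class Writer
{
  std::unordered_set<std::string> emitted_paths_;

public:
  bool corpus_is_emitted(const CorpusPtr& c) const
  { return c && emitted_paths_.count(c->path) != 0; }

  void record_corpus_as_emitted(const CorpusPtr& c)
  {
    if (c)
      emitted_paths_.insert(c->path);
  }

  // Writes each corpus of the group at most once and returns the number
  // of corpora written.  Duplicates are skipped here, not asserted on.
  std::size_t write_group(const CorpusGroup& group, std::ostream& out)
  {
    std::size_t written = 0;
    for (const CorpusPtr& c : group.corpora)
      {
        if (!c || corpus_is_emitted(c))
          continue;
        out << "corpus " << c->path << "\n";
        record_corpus_as_emitted(c);
        ++written;
      }
    return written;
  }
};

// Convenience helper: serialize a group with a fresh writer.
inline std::string write_to_string(const CorpusGroup& group)
{
  std::ostringstream out;
  Writer writer;
  writer.write_group(group, out);
  return out.str();
}
```

With two readers each calling read_and_add_corpus_to_group() on the same group, the group holds two corpora; even if one of them were added a second time by mistake, write_group() would emit its path only once.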