From: Dodji Seketeli
To: libabigail@sourceware.org
Cc: dodji@redhat.com
Subject: [PATCH 0/4, applied] Speed up type DIEs canonicalization
Organization: Me, myself and I
Date: Tue, 20 Sep 2022 12:50:38 +0200
Message-ID: <87tu525ngh.fsf@seketeli.org>

Hello,

This patch set is a response to the problem entitled "abidw
performance regression on vmlinux", reported at
https://sourceware.org/bugzilla/show_bug.cgi?id=29464.

A first version of this patch set was sent to
https://inbox.sourceware.org/libabigail/87o7vsizdl.fsf@seketeli.org/T/#t.
This version addresses the comments that were generously submitted in
the problem report.

In that problem report, "abidw --noout vmlinux" would take more than
an hour.

Before that performance regression, type DIEs canonicalization by the
DWARF reader was fast, because it was wrong, basically. During
canonicalization, the reader was using a speed optimization called
"canonical type propagation". I argue that the optimization was
being done too eagerly, meaning it was being done even in cases where
it should not have been, often leading to wrong (but very fast)
outcomes. The patch that changed this was:

    commit 7ecef6361799326b99129a479b43b138f0b237ae
    Author: Dodji Seketeli
    Date:   Mon Apr 4 15:35:48 2022 +0200

        Canonicalize DIEs w/o assuming ODR & handle typedefs transparently

That patch was mostly meant to affect only C++, Ada and Java
binaries. But it also stopped the reader from doing canonical type
propagation on types that are not aggregates (pointers, typedefs,
etc).

Doing type propagation on types that are not aggregates is wrong
because it can lead to cases where two "pointers to aggregate" are
considered equal even though, in the end, the aggregates they point
to are different. This can happen, for instance, on aggregates that
are detected as redundant. Please note that these terms (redundant,
canonical type propagation, etc) are defined in the introductory
comments of the first patches below.

Anyway, that patch restricted canonicalization and canonical type
propagation to aggregate types. It also restricted the amount of
canonical type propagation performed, because there are cases where
it should not be performed.
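
To make the "pointers to aggregate" hazard concrete, here is a toy
model in Python. It is not libabigail code: the type table, the
compare() helper and the reading of "redundant" as "a pair of
aggregates met again while its own comparison is still in progress"
are all assumptions made for illustration only.

```python
# Toy type universe (hypothetical): name -> (kind, payload).
TYPES = {
    "int": ("base", None),
    "char": ("base", None),
    "A": ("struct", ["A*", "int"]),    # struct A { A *next; int x; }
    "B": ("struct", ["B*", "char"]),   # struct B { B *next; char x; }
    "A*": ("ptr", "A"),
    "B*": ("ptr", "B"),
}


def compare(x, y, in_progress, equal_pointers):
    """Structurally compare toy types x and y.

    Pointer pairs that compared equal along the way are appended to
    equal_pointers, whether or not the enclosing comparison succeeds.
    """
    if x == y:
        return True
    (kind_x, payload_x), (kind_y, payload_y) = TYPES[x], TYPES[y]
    if kind_x != kind_y or kind_x == "base":
        return False
    if kind_x == "ptr":
        same = compare(payload_x, payload_y, in_progress, equal_pointers)
        if same:
            equal_pointers.append((x, y))
        return same
    # Aggregates: a pair met again while it is still being compared is
    # provisionally assumed equal (the assumed meaning of "redundant").
    if (x, y) in in_progress:
        return True
    in_progress.add((x, y))
    try:
        return len(payload_x) == len(payload_y) and all(
            compare(a, b, in_progress, equal_pointers)
            for a, b in zip(payload_x, payload_y)
        )
    finally:
        in_progress.discard((x, y))


equal_pointers = []
aggregates_equal = compare("A", "B", set(), equal_pointers)
# aggregates_equal is False, yet equal_pointers holds ("A*", "B*").
```

In this sketch the pointer pair looks equal only because the pointed-to
aggregates were provisionally assumed equal. Recording that result
for the pointers would outlive the assumption it rests on.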

I believe that these changes led to many more type DIE comparisons
being performed, even for C programs. On libraries, where type trees
tend to be less deep, comparison time didn't explode. But on some
applications like the Linux Kernel, with deep type trees containing
lots of redundant types, the number of DIE comparisons exploded.

The first patch of the series below tracks all the types that are
subject to the canonical type propagation optimization and tries to
apply that optimization to as many types as possible. That makes
comparisons as fast as possible, as many types now have a canonical
type. But whenever it detects that the optimization should not have
taken place, it cancels it on all the types where it happened. This
is like what we already do when canonicalizing IR types. This
brought comparison times of vmlinux below 5 minutes (from more than
an hour).

The second patch restricts the analysis of DIEs to the interfaces
that have exported symbols. Otherwise, several other kinds of
interfaces were being analyzed at the DIE level. It was only later
that just the IR of the exported interfaces was "chosen" to be put
into the final ABI Corpus. That led to potentially "too many" DIEs
being analyzed in the first place, leading to many comparisons in the
case of huge applications like vmlinux. I believe this brings the
time to under a minute or so when analyzing vmlinux.

That second patch thus introduces a new "--exported-interfaces-only"
option supported by the tools abidw, abidiff, abipkgdiff, kmidiff,
etc. That option triggers the new mode where only exported
interfaces are analyzed at the DIE level. Note that when looking at
the Linux Kernel, that option is enabled by default. The patch also
introduces a new "--allow-non-exported-interfaces" option for those
tools. Note that this option is enabled by default when looking at
any binary that is not the Linux Kernel.

The last two patches are additions to the initial series.
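
As a rough picture of the track-then-cancel idea behind the first
patch, here is a toy model in Python. It is only a sketch: the
PropagationTracker class, its method names and the "T1"/"C1" labels
are hypothetical and do not come from the libabigail sources.

```python
class PropagationTracker:
    """Toy model of speculative canonical type propagation."""

    def __init__(self):
        self.canonical = {}   # type id -> canonical type id
        self.pending = []     # type ids propagated but not yet confirmed

    def propagate(self, type_id, canonical_id):
        """Speculatively give type_id a canonical type and remember it."""
        if type_id in self.canonical:
            return
        self.canonical[type_id] = canonical_id
        self.pending.append(type_id)

    def confirm(self):
        """The enclosing comparison succeeded: keep what was propagated."""
        self.pending.clear()

    def cancel(self):
        """The propagation should not have happened: undo all of it."""
        for type_id in self.pending:
            del self.canonical[type_id]
        self.pending.clear()


tracker = PropagationTracker()
tracker.propagate("T1", "C1")
tracker.confirm()                 # T1 keeps its canonical type
tracker.propagate("T2", "C2")
tracker.propagate("T3", "C3")
tracker.cancel()                  # T2 and T3 lose theirs again
remaining = dict(tracker.canonical)
```

The point of the sketch is only that every speculative assignment is
recorded somewhere, so that it can be undone in one sweep on all the
types where it happened.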

The third one fixes issues that were latent in the way we were
caching the result of some comparisons at the IR level. These issues
are uncovered by the first patch.

The last patch is a return to canonicalizing typedefs in the IR. I
tried to stop canonicalizing typedefs before, to give the comparison
engine a chance to avoid detecting harmless type changes due to
typedef name changes. It turned out that not canonicalizing typedefs
causes more harm than good, as it makes comparing and addressing them
(in the abixml writer for instance) much harder and slower. So I am
going back to square one and I am relying on subsequent passes on the
final diff graph to mark diff nodes as being harmless, leaving leeway
to reporters to handle that information at reporting time.

I am applying the patch set to master as the discussion on the
problem was really fruitful. Many thanks to Giuliano Procida for the
feedback.

Dodji Seketeli (4):
  dwarf-reader: Revamp the canonical type DIE propagation algorithm
  Allow restricting analyzed decls to exported symbols
  Fix IR comparison result caching and canonical type propagation tracking
  ir, writer: Go back to canonicalizing typedefs in the IR

 doc/manuals/Makefile.am | 3 +-
 doc/manuals/abidiff.rst | 52 +
 doc/manuals/abidw.rst | 82 +-
 doc/manuals/abipkgdiff.rst | 51 +
 doc/manuals/kmidiff.rst | 52 +-
 doc/manuals/tools-use-libabigail.txt | 16 +
 include/abg-fwd.h | 5 +
 include/abg-ir.h | 67 +-
 src/abg-dwarf-reader.cc | 1514 +-
 src/abg-ir-priv.h | 64 +-
 src/abg-ir.cc | 552 +-
 src/abg-reader.cc | 3 +
 src/abg-writer.cc | 166 +-
 .../test-abidiff/test-PR18791-report0.txt | 238 +-
 tests/data/test-annotate/libtest23.so.abi | 724 +-
 .../test-annotate/libtest24-drop-fns-2.so.abi | 804 +-
 .../test-annotate/libtest24-drop-fns.so.abi | 804 +-
 .../test-anonymous-members-0.o.abi | 96 +-
 tests/data/test-annotate/test0.abi | 4 +-
 tests/data/test-annotate/test1.abi | 32 +-
 .../data/test-annotate/test13-pr18894.so.abi | 2322 +-
 .../data/test-annotate/test14-pr18893.so.abi | 18390 ++---
 .../data/test-annotate/test15-pr18892.so.abi | 62714 ++++++++--------
 .../data/test-annotate/test17-pr19027.so.abi | 46135 ++++++------
 ...st18-pr19037-libvtkRenderingLIC-6.1.so.abi | 18052 +++--
 ...19-pr19023-libtcmalloc_and_profiler.so.abi | 49442 ++++++------
 tests/data/test-annotate/test2.so.abi | 38 +-
 ...st20-pr19025-libvtkParallelCore-6.1.so.abi | 30100 ++++----
 .../data/test-annotate/test21-pr19092.so.abi | 7608 +-
 .../PR25058-liblttng-ctl-report-1.txt | 7 +-
 .../test42-PR21296-clanggcc-report0.txt | 6 +
 .../test31-pr18535-libstdc++-report-0.txt | 31 +-
 .../test31-pr18535-libstdc++-report-1.txt | 31 +-
 .../data/test-diff-filter/test41-report-0.txt | 30 +-
 ...x86_64--2.24.2-30.fc30.x86_64-report-0.txt | 2 +-
 .../PR24690/PR24690-report-0.txt | 2 +-
 .../nss-3.23.0-1.0.fc23.x86_64-report-0.txt | 6 +-
 ...l7.x86_64-0.12.8-1.el7.x86_64-report-2.txt | 113 +-
 ...bb-4.3-3.20141204.fc23.x86_64-report-0.txt | 47 +-
 ...bb-4.3-3.20141204.fc23.x86_64-report-1.txt | 13 +-
 .../PR22015-libboost_iostreams.so.abi | 2858 +-
 .../test-read-dwarf/PR22122-libftdc.so.abi | 16923 ++++-
 .../data/test-read-dwarf/PR25007-sdhci.ko.abi | 14620 ++--
 .../PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi | 642 +-
 .../test-read-dwarf/PR26261/PR26261-exe.abi | 10 +-
 .../test-read-dwarf/PR27700/test-PR27700.abi | 2 +-
 tests/data/test-read-dwarf/libtest23.so.abi | 626 +-
 .../libtest24-drop-fns-2.so.abi | 702 +-
 .../test-read-dwarf/libtest24-drop-fns.so.abi | 702 +-
 .../data/test-read-dwarf/test-PR26568-1.o.abi | 16 +-
 .../data/test-read-dwarf/test-PR26568-2.o.abi | 22 +-
 .../test-read-dwarf/test-libaaudio.so.abi | 238 +-
 .../test-read-dwarf/test-libandroid.so.abi | 56274 +++++++-------
 tests/data/test-read-dwarf/test0.abi | 2 +-
 tests/data/test-read-dwarf/test0.hash.abi | 2 +-
 tests/data/test-read-dwarf/test1.abi | 26 +-
 tests/data/test-read-dwarf/test1.hash.abi | 6 +-
 .../test-read-dwarf/test10-pr18818-gcc.so.abi | 7990 +-
 .../test-read-dwarf/test11-pr18828.so.abi | 23566 +++---
 .../test-read-dwarf/test12-pr18844.so.abi | 33361 ++++----
 .../test-read-dwarf/test13-pr18894.so.abi | 2090 +-
 .../test-read-dwarf/test14-pr18893.so.abi | 15146 ++--
 .../test-read-dwarf/test15-pr18892.so.abi | 27826 +++----
 .../test-read-dwarf/test16-pr18904.so.abi | 41341 +++++-----
 .../test-read-dwarf/test17-pr19027.so.abi | 24602 +++---
 ...st18-pr19037-libvtkRenderingLIC-6.1.so.abi | 14733 ++--
 ...19-pr19023-libtcmalloc_and_profiler.so.abi | 22933 +++---
 tests/data/test-read-dwarf/test2.so.abi | 34 +-
 tests/data/test-read-dwarf/test2.so.hash.abi | 4 +-
 ...st20-pr19025-libvtkParallelCore-6.1.so.abi | 21807 +++---
 .../test-read-dwarf/test21-pr19092.so.abi | 6494 +-
 .../test22-pr19097-libstdc++.so.6.0.17.so.abi | 60456 +++++++--------
 .../test9-pr18818-clang.so.abi | 5013 +-
 tests/data/test-read-write/test18.xml | 2 +-
 .../test28-without-std-fns-ref.xml | 860 +-
 .../test28-without-std-vars-ref.xml | 780 +-
 tools/abidiff.cc | 12 +
 tools/abidw.cc | 15 +
 tools/abipkgdiff.cc | 21 +
 tools/kmidiff.cc | 12 +
 80 files changed, 326407 insertions(+), 316780 deletions(-)
 create mode 100644 doc/manuals/tools-use-libabigail.txt

-- 
2.37.2

-- 
		Dodji