From: Dodji Seketeli <>
Subject: [PATCH 0/2, RFC] Speed up type DIEs canonicalization
Date: Tue, 06 Sep 2022 12:07:53 +0200
Message-ID: <>


This patch set is a response to the problem entitled "abidw
performance regression on vmlinux", reported at

In that problem report, "abidw --noout vmlinux" would take more than

Before that performance regression, type DIEs canonicalization by the
DWARF reader was fast, because it was wrong, basically.  During
canonicalization, the reader was using a speed optimization called
"canonical type propagation".  I argue that the optimization was
being done too eagerly, meaning it was being done even in cases
where it should not have been, often leading to wrong (but very fast)

This patch was:

    commit 7ecef6361799326b99129a479b43b138f0b237ae
    Author: Dodji Seketeli <>
    Date:   Mon Apr 4 15:35:48 2022 +0200

	Canonicalize DIEs w/o assuming ODR & handle typedefs transparently

That patch was mostly meant to affect only C++, Ada and Java binaries.
But then, the patch stopped the reader from doing canonical type
propagation on types that are not aggregates (pointers, typedefs,
etc).  Doing type propagation on types that are not aggregates is wrong
because that can lead to cases where two "pointers to aggregate" could
be considered equal when, in the end, the aggregates are different.
This can happen for instance on aggregates that are detected as redundant.

Please note that these terms (redundant, canonical type propagation,
etc) are defined in the introductory comments of the first patches

Anyway, that patch restricted canonicalization and canonical type
propagation to aggregate types.  It also restricted the amount of
canonical type propagation performed because there are cases where
it should not be performed.

I believe that these changes led to many more type DIE comparisons
being performed, even for C programs.  On libraries, where type trees
tend to be shallower, comparison time didn't explode.  But on some
applications like the Linux Kernel, with deep type trees containing
lots of redundant types, the number of DIE comparisons

The first patch of the series below tracks all the types that are
subject to the canonical type propagation optimization and tries to
apply that optimization to as many types as possible.  That makes
comparisons as fast as possible, as many types now have a canonical
type.  But whenever it detects that the optimization should not have
taken place, it cancels it on all the types where it happened.  This
is like what we already do when canonicalizing IR types.
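
To make the "propagate, track, cancel" idea concrete, here is a
minimal sketch in Python.  It is a toy model written for this message
only, not the code of the patch: TypeNode, PropagationTracker, compare
and canonicalize are hypothetical names, types are compared by name and
members only, and the handling of redundant (recursive) types that the
real DWARF reader needs is left out.

```python
class TypeNode:
    """A toy type: a name plus member types (empty for leaf types)."""

    def __init__(self, name, members=()):
        self.name = name
        self.members = list(members)
        self.canonical = None  # set once a canonical type is known


class PropagationTracker:
    """Remembers every type that received a propagated canonical type."""

    def __init__(self):
        self.pending = []

    def propagate(self, node, canonical):
        # Only track types that did not already have a canonical type.
        if node.canonical is None:
            node.canonical = canonical
            self.pending.append(node)

    def confirm(self):
        # The enclosing comparison succeeded: keep the propagated types.
        self.pending.clear()

    def cancel(self):
        # The enclosing comparison failed: undo every propagation.
        for node in self.pending:
            node.canonical = None
        self.pending.clear()


def compare(left, right, tracker):
    """Structural comparison that propagates canonical types as it goes."""
    # Fast path: two types sharing a canonical type are equal.
    if left.canonical is not None and left.canonical is right.canonical:
        return True
    if left.name != right.name or len(left.members) != len(right.members):
        return False
    for l_member, r_member in zip(left.members, right.members):
        if not compare(l_member, r_member, tracker):
            return False
    # Equal so far: speculatively give 'left' the canonical type of 'right'.
    target = right.canonical if right.canonical is not None else right
    tracker.propagate(left, target)
    return True


def canonicalize(node, canonical_types, tracker):
    """Return the canonical type of 'node', registering it if it is new."""
    for candidate in canonical_types:
        if compare(node, candidate, tracker):
            tracker.confirm()
            return node.canonical
        # Sub-types may have been propagated before the mismatch was seen.
        tracker.cancel()
    node.canonical = node
    canonical_types.append(node)
    return node
```

In this sketch, a sub-type that compares equal gets its canonical type
right away, so later comparisons can take the fast path.  If the
enclosing comparison then fails, cancel() resets every type touched
since the last confirm(), which is the part the old eager propagation
was missing.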

This brought comparison times of vmlinux below 5 minutes (from more
than an hour).

The second patch restricts the analysis of DIEs to the interfaces that
have exported symbols.  Otherwise, several other kinds of interfaces
were being analyzed at the DIE level.  It was only later that the IR
of the exported interfaces was "chosen" to be put into the final ABI
Corpus.  That led to potentially "too many" DIEs being analyzed in
the first place, leading to many comparisons in the case of huge
applications like vmlinux.  I believe this brings the time to under a
minute or so when analyzing vmlinux.
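
The effect of filtering early rather than late can be sketched as
follows.  This is an illustration under stated assumptions, not the
reader's code: Decl and select_decls_to_analyze are hypothetical names,
and a declaration is modelled as just a name plus an optional ELF
symbol name.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Decl:
    """A toy function or variable declaration found in the debug info."""
    name: str
    symbol: Optional[str]  # linkage symbol, or None if there is none


def select_decls_to_analyze(decls: Iterable[Decl],
                            exported_symbols: Set[str],
                            exported_only: bool) -> List[Decl]:
    """Pick the declarations whose DIEs are worth analyzing."""
    if not exported_only:
        # Old behaviour: analyze everything, filter the IR afterwards.
        return list(decls)
    # New mode: skip declarations without an exported symbol up front,
    # so their types never reach the comparison machinery.
    return [d for d in decls
            if d.symbol is not None and d.symbol in exported_symbols]
```

With exported_only set, the types reachable only from non-exported
declarations are never compared at all, which is where the saving on
huge binaries would come from.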

That second patch thus introduces a new "--exported-interfaces-only"
option supported by the tools abidw, abidiff, abipkgdiff, kmidiff,
etc.  That option triggers the new mode where only exported interfaces
are analyzed at the DIE level.  Note that when looking at the Linux
Kernel, that option is enabled by default.

It also introduces a new "--allow-non-exported-interfaces" option for
those tools.  Note that this option is enabled by default when looking
at any binary that is not the Linux Kernel.
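
The defaults described above can be summarized with a small sketch.
The function name and its parameters are hypothetical and only mirror
the two option names.  Rejecting the case where both options are given
is an assumption made for this illustration; the actual tools may
resolve that case differently.

```python
def use_exported_interfaces_only(is_linux_kernel,
                                 exported_interfaces_only=False,
                                 allow_non_exported_interfaces=False):
    """Decide whether only exported interfaces are analyzed."""
    if exported_interfaces_only and allow_non_exported_interfaces:
        # Assumption of this sketch: conflicting requests are an error.
        raise ValueError("the two options are mutually exclusive")
    if exported_interfaces_only:
        return True
    if allow_non_exported_interfaces:
        return False
    # No explicit request: the restricted mode is the default for the
    # Linux Kernel only.
    return is_linux_kernel
```

So a kernel binary gets the restricted mode unless
"--allow-non-exported-interfaces" is passed, and any other binary gets
the full analysis unless "--exported-interfaces-only" is passed.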

Dodji Seketeli (2):
  dwarf-reader: Revamp the canonical type DIE propagation algorithm
  Allow restricting analyzed decls to exported symbols

 doc/manuals/                       |     3 +-
 doc/manuals/abidiff.rst                       |    52 +
 doc/manuals/abidw.rst                         |    82 +-
 doc/manuals/abipkgdiff.rst                    |    51 +
 doc/manuals/kmidiff.rst                       |    52 +-
 doc/manuals/tools-use-libabigail.txt          |    16 +
 include/abg-ir.h                              |     9 +
 src/                       |  1524 ++-
 src/abg-ir-priv.h                             |    11 +
 src/                                 |    36 +
 .../data/test-annotate/  |    52 +-
 .../data/test-annotate/  | 11386 ++++++++--------
 .../data/test-annotate/  |  1224 +-
                          | 11323 +++++++--------
                          |  7194 +++++-----
 ...x86_64--2.24.2-30.fc30.x86_64-report-0.txt |     2 +-
 .../test-read-dwarf/    |   893 +-
 .../test-read-dwarf/    |     8 +
 .../test-read-dwarf/     |    38 +-
 .../test-read-dwarf/     | 10901 ++++++++-------
 .../test-read-dwarf/     |   676 +-
 .../test-read-dwarf/     |  1216 +-
                          | 11153 +++++++--------
                          |  7164 +++++-----
 .../ |    32 +-
 .../                |   127 +-
 tools/                              |    12 +
 tools/                                |    15 +
 tools/                           |    21 +
 tools/                              |    12 +
 30 files changed, 33868 insertions(+), 31417 deletions(-)
 create mode 100644 doc/manuals/tools-use-libabigail.txt


