public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: "gprocida at google dot com" <sourceware-bugzilla@sourceware.org>
To: libabigail@sourceware.org
Subject: [Bug default/28005] New: abidiff: compares the same types and declarations many times
Date: Tue, 22 Jun 2021 08:17:35 +0000	[thread overview]
Message-ID: <bug-28005-9487@http.sourceware.org/bugzilla/> (raw)

https://sourceware.org/bugzilla/show_bug.cgi?id=28005

            Bug ID: 28005
           Summary: abidiff: compares the same types and declarations many
                    times
           Product: libabigail
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: default
          Assignee: dodji at redhat dot com
          Reporter: gprocida at google dot com
                CC: libabigail at sourceware dot org
  Target Milestone: ---

Created attachment 13510
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13510&action=edit
archive containing two XML files for comparison

The test case here was comparing two mostly vanilla Linux kernel ABI XML files.
The libabigail version was 1656f9dd7b09cfc5013b891b93fa6b4dc41a0b6b (predating
a known performance regression).

The ABI files are attached.

Profiling with callgrind and instrumenting the code reveals that the two
functions:

ir::types_have_similar_structure
ir::equals (particularly the var_decl overload)

are called thousands more times than there are pairs of types or declarations
to be compared. Around 30% of runtime is estimated to be down these function
calls.

ir::types_have_similar_structure was called over 1 million times each for pairs
such as int, int and void, void.

ir::equals was called more than 50 thousand times each for around 20 (member)
variables and more 30 thousand times each for at least 1000 different ones.

There is a big caveat: I didn't check to see whether each call had the same
context (state affecting the outcome of function call) at each call. In
particular, these could be indirectly recursive calls where the same variable
pair appears earlier in the call stack. From my experimentation with diff
algorithms, I know it's not safe to always cache such pending comparisons.

On the other hand, types_have_similar_structure cannot be recursing for
arguments like int, int, so it's an obvious candidate for memoisation. However,
I think the repeated calls are a symptom of a deeper problem. My vague
intuition is that the structural similarity check could be done with a single
pass or could be eliminated altogether by being more careful elsewhere.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

             reply	other threads:[~2021-06-22  8:17 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-22  8:17 gprocida at google dot com [this message]
2021-06-22  8:26 ` [Bug default/28005] " gprocida at google dot com
2021-06-22  8:51 ` gprocida at google dot com
2021-06-22  9:03 ` gprocida at google dot com
2022-01-21 14:08 ` gprocida at google dot com
2022-01-21 14:09 ` gprocida at google dot com

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-28005-9487@http.sourceware.org/bugzilla/ \
    --to=sourceware-bugzilla@sourceware.org \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).