public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Dodji Seketeli <dodji@seketeli.org>
To: Giuliano Procida <gprocida@google.com>
Cc: libabigail@sourceware.org,  kernel-team@android.com,
	 maennich@google.com
Subject: Re: [PATCH 2/2] Use pointers not strings in type graph comparison.
Date: Thu, 09 Jul 2020 11:19:26 +0200	[thread overview]
Message-ID: <87pn95qc4h.fsf@seketeli.org> (raw)
In-Reply-To: <20200619163924.207852-3-gprocida@google.com> (Giuliano Procida's message of "Fri, 19 Jun 2020 17:39:24 +0100")

Hello,

Giuliano Procida <gprocida@google.com> a écrit:

> During structural comparison of types there is the possibilitiy of
> infinite recursion as types can have self-references and there can
> be more elaborate mutual references between them.
>
> The current comparison algorithm keeps track of currently seen (struct
> and function) types by name. This causes earlier caching of names than
> is needed and, more significantly, may result in types comparing equal
> unexpectedly. This commit switches to storing their addresses instead.

I think it might be interesting to understand why the tracking was done
by type names, rather than by address.

First, the tracking was initially done on classes only.  This was at a
time when libabigail didn't have the concept of function types.

In a DWARF representation, a given struct S can appear several times in
a binary because that very same S is defined in several different
translation units.  Thus, before type canonicalization (which happens
at the level of the libabigail IR) there could be several different
representations of the same S.  These representations of S are different
in the sense that they have different addresses in memory.  But they
represent the same S.

So when, inside the same binary (or ABI corpus), we compare two S that
end-up being the same but have different addresses, we can get into
infinite recursion if S has some elaborate enough self-references, as
you pointed out in your preamble.  In that case, using the address of
the representation of S keep track of the currently seen S won't work
because there can be several different addresses representing the same
S.  That is why I used the name of the type to keep that track, rather
than addresses.  I thought the potential drawbacks (in terms of
precision of the compare) was better than risking an infinite loop.

However, I have later introduced type de-duplication at the level of
DWARF directly, before the construction of the IR, at least for DWARF
originating from C compilers.  With that in place, inside a given ABI
corpus, there is now only one internal representation for S despite the
potential multiple copies of S present at the DWARF level.  Now we can
use addresses to keep track of currently seen structs.  Of course, I
haven't re-visited this whole thing since then, so the type name based
tracking stayed.

I introduced function types, I believe, before type de-duplication (I am
not sure on that one, I'll prolly have to double check).  At the time, I
guess I used type name based tracking there two, because I assumed I
could have the same kind of issue.  But now I am not so sure.  Function
types are a different kind of beast and they might not suffer the same
kind of actual duplication as the other real types expressed in DWARF.

Anyway, I think this is a good move.  Thank you for looking into that.

> This change affects some tests which show more diffs than previously.
>
> 	src/abg-ir.cc: (environment::priv): Change types of
> 	classes_being_compared_ and fn_types_being_compared_ to be
> 	simple sets of pointers.
> 	(function_type::priv::mark_as_being_compared): Just add
> 	address to set.
> 	(function_type::priv::unmark_as_being_compared): Just remove
> 	address from set.
> 	(function_type::priv::comparison_started): Just look up
> 	address in set.
> 	(class_or_union::priv::mark_as_being_compared): Just add
> 	address to set.
> 	(class_or_union::priv::unmark_as_being_compared): Just remove
> 	address from set.
> 	(class_or_union::priv::comparison_started): Just look up
> 	address in set.
> 	* tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt:
> 	Update.
> 	* tests/data/test-diff-pkg/elfutils-libs-0.170-4.el7.x86_64-multiple-sym-vers-report-0.txt:
> 	Update.
> 	* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt:
> 	Update.
> 	* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
> 	Update.
>
> Signed-off-by: Giuliano Procida <gprocida@google.com>

Applied to the master branch, thanks!

Cheers,

-- 
		Dodji

  reply	other threads:[~2020-07-09  9:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-19 16:39 [PATCH 0/2] Type identity clean-ups Giuliano Procida
2020-06-19 16:39 ` [PATCH 1/2] abg-ir.cc: Remove unused re_canonicalize function Giuliano Procida
2020-07-07 14:48   ` Dodji Seketeli
2020-06-19 16:39 ` [PATCH 2/2] Use pointers not strings in type graph comparison Giuliano Procida
2020-07-09  9:19   ` Dodji Seketeli [this message]
2020-06-22 20:11 ` [PATCH 0/2] Type identity clean-ups Matthias Maennich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pn95qc4h.fsf@seketeli.org \
    --to=dodji@seketeli.org \
    --cc=gprocida@google.com \
    --cc=kernel-team@android.com \
    --cc=libabigail@sourceware.org \
    --cc=maennich@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).