public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Dodji Seketeli <dodji@redhat.com>
To: Giuliano Procida <gprocida@google.com>
Cc: "Giuliano Procida via Libabigail" <libabigail@sourceware.org>,
	kernel-team@android.com, "Matthias Männich" <maennich@google.com>
Subject: Re: [PATCH] abidiff: improve treatment of array types in leaf changes mode
Date: Mon, 07 Dec 2020 14:24:46 +0100	[thread overview]
Message-ID: <86o8j5oijl.fsf@seketeli.org> (raw)
In-Reply-To: <CAGvU0H=3GzDcK4OsPNTe65b_TpnbiMNCtKLA19==yTHhmVFLuw@mail.gmail.com> (Giuliano Procida's message of "Fri, 4 Dec 2020 14:41:46 +0000")

Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

[...]

> I've found this aspect (local changes and types with similar
> structure) of abidiff really hard to reason about. I'm likely looking
> at things the wrong way, but here's roughly how I see things.

No, I don't think you are seeing things the "wrong way".  I think there
are several ways of seeing things.

There are several goals libabigail had to fulfill.  To accommodate them
all we adopted some point of views that might make things look less
obvious that they could have been should we have fewer goals to fulfill.

So, here are some of the concepts we use in Libabigail to analyze
changes.  I'll only talk about those related to your discussion here.

An ABI artifact is either a type or a declaration.  An ABI artifact is a
composite entity, meaning that it has sub-parts that are ABI artifacts
as well.  So, a declaration has a type.  A type might have have
sub-types.

In that context, we define the concept of a local change.

A local change to an ABI artifact is a change that does not involve any
of the sub-parts of the artifact.  For instance, a local change for a
typedef would be a change to its name.  But a change to the underlying
type of the typedef is not considered local, as the underlying type is
the sub-type of the typedef.

Said otherwise, a local change to an ABI artifact is a change to the
artifact itself, not a change to any of its sub-parts.

A leaf change is then a local change "that makes sense" from the
end-user point of view, in the context of the leaf reporter.

More precisely, a leaf change:

   * must impact an exported interface of the binary we are
     analyzing.  In other words, it must be a change to a sub-part of
     an ABI artifact (a declaration) that is exported by the binary we
     are looking at.

   * doesn't contain a name change

   * is about a function, variable, class, union or enum type.

There are other restrictions, but there are technical details that are
not useful to mention at this point, think.

The reason why we introduced the concept of the "local change" is that
it's useful to fulfill goals in libabigail, independently from the leaf
reporter requirements.  It's useful to be able to perform things like
redundancy detection and proper categorization of diff nodes.  But I
think this is a topic for a separate discussion.

> We want to identify "leaf" changes. I think of them more as "root" (cause)
> changes, but that's just inverting the edges in some graph. These should be
> (user-defined) types that have not changed name but have changed in some
> way "directly" (indirectly would be if there were a more remote cause of
> the same sort).

I find your way of seeing things fairly accurate.  It's doesn't seem to
go against the view that I exposed above.


> There are also direct changes to symbols' types.

I tend to think more in terms of 'exported interfaces'.  I.e, a binary
exports functions or global variables.  Those constitute the interfaces
of the application binary.  Those interfaces have several properties.
Their type is one of their properties.  In the particular case of ELF
binaries, those exported interfaces must have ELF symbols, which another
property of those exported interfaces.

So, yes, there are also "local changes to the binary interfaces".  This
is a particular case of the definition I laid down earlier.

So I agree with you.

> User-defined types with local changes have paths to the symbols whose types
> they affect - they are the impacted interfaces.

Yes.

> Now with libabigail, there is the notion of local vs non-local change which
> might not correspond exactly with the notion sketched above, but it's
> certainly related.

I hope it's clearer now.

> And then there are at least two more
> pieces: types_have_similar_structure which is used in decisions about
> locality.

Right.  The concept of "type structure similarity" is a detail of the
concept of "local change".

> and lastly the logic used to filter changes for consideration in
> the leaf report - they have to be local plus a lot of other conditions.

Right.

> The decision-making that goes into leaf reporting seems to be spread across
> several bits of code

The pass that walks through diff nodes to detect which ones are leaf
nodes is the 'struct leaf_diff_node_marker_visitor' defined in
abg-comparison.cc.

Now, there are a *lot* of concepts involved here, I agree.  But
ultimately, I think it's related to the complexity of what we are trying
to model here.

> and I speculate that it was an inconsistency between two of these that
> resulted in diff nodes being selected for inclusion but actually have
> no local diff to report (by the represent helper function).

Stricto sensu, I think the problem was us not respecting the definition
of the "type structure similarity" relation.  As that relation is at the
*bottom* of a stack of concepts, it's not surprising that a bug in there
has impacts possibly further down the road.

At the end of the day, this is a complex network of concepts we are
dealing with.

> If I knew what types_have_similar_structure had to be consistent with I
> could check all the cases or, more interestingly, see if the decisions
> being made in two different places could be made in just one place.

Well.  In this particular case, I'd argue that
types_have_similar_structure had to be consistent with itself to begin
with :-)

So I wouldn't beat myself over this too much, seriously.  To me, what is
paramount is that we keep the whole thing as "debuggable" as possible.
Because *that* is how we progressively get better.  At least in my
experience.

Thank you for initiating this discussion.  I hope I haven't made things
even more confusing than they are at the moment.

Cheers,

-- 
		Dodji


      reply	other threads:[~2020-12-07 13:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 15:09 Giuliano Procida
2020-12-04 11:02 ` Dodji Seketeli
2020-12-04 11:05   ` [PATCH 1/2] ir: Add better comments to types_have_similar_structure Dodji Seketeli
2020-12-04 11:07   ` [PATCH 2/2] ir: Arrays are indirect types for type similarity purposes Dodji Seketeli
2020-12-04 14:41   ` [PATCH] abidiff: improve treatment of array types in leaf changes mode Giuliano Procida
2020-12-07 13:24     ` Dodji Seketeli [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86o8j5oijl.fsf@seketeli.org \
    --to=dodji@redhat.com \
    --cc=gprocida@google.com \
    --cc=kernel-team@android.com \
    --cc=libabigail@sourceware.org \
    --cc=maennich@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).