From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dodji@seketeli.org>
Received: from relay1-d.mail.gandi.net (relay1-d.mail.gandi.net
 [217.70.183.193])
 by sourceware.org (Postfix) with ESMTPS id 661D43858D39;
 Wed,  2 Mar 2022 13:34:11 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 661D43858D39
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=seketeli.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=seketeli.org
Received: (Authenticated sender: dodji@seketeli.org)
 by mail.gandi.net (Postfix) with ESMTPSA id E684C24000D;
 Wed,  2 Mar 2022 13:34:09 +0000 (UTC)
Received: by localhost (Postfix, from userid 1000)
 id 0E4FE581C3A; Wed,  2 Mar 2022 14:34:08 +0100 (CET)
From: Dodji Seketeli <dodji@seketeli.org>
To: gprocida at google dot com via Libabigail <libabigail@sourceware.org>
Cc: gprocida at google dot com <sourceware-bugzilla@sourceware.org>
Subject: Re: [Bug default/26646] unexpected declaration-only types
Organization: Me, myself and I
References: <bug-26646-9487@http.sourceware.org/bugzilla/>
 <bug-26646-9487-zhGqGtUWy6@http.sourceware.org/bugzilla/>
X-Operating-System: Fedora 36
X-URL: http://www.seketeli.net/~dodji
Date: Wed, 02 Mar 2022 14:34:08 +0100
In-Reply-To: <bug-26646-9487-zhGqGtUWy6@http.sourceware.org/bugzilla/>
 (gprocida at google dot com via Libabigail's message of "Tue, 01 Mar
 2022 14:40:28 +0000")
Message-ID: <87sfs0iilr.fsf@seketeli.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL,
 KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,
 SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: libabigail@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Mailing list of the Libabigail project <libabigail.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libabigail>,
 <mailto:libabigail-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libabigail/>
List-Help: <mailto:libabigail-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libabigail>,
 <mailto:libabigail-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Mar 2022 13:34:13 -0000

Dodji Seketeli via Libabigail <libabigail@sourceware.org> a =C3=A9crit:

[...]


> This addresses https://sourceware.org/bugzilla/show_bug.cgi?id=3D26646#c2=
1.
>
> 	* src/abg-dwarf-reader.cc (compare_dies): Do not propagate
> 	canonical type when aggregate redundancy is detected.
>

[...]

gprocida at google dot com via Libabigail <libabigail@sourceware.org> a
=C3=A9crit:

[...]

> I've thought a little bit more about this one.
>
> The current check is "local", recursion prevention at *this* DIE
> prevents it from being canonicalised, but it could still depend on
> child DIEs that are actually also parent DIEs and whose processing has
> not yet been completed.

You are right.  The current state is an improvement, but it's not
"complete".  Some DIEs might still wrongly considered as being
equivalent just because they depend on a "recursive DIE" that was
detected as such.  The canonical DIE propagation might have happened
during the window of time where the recursive DIE was comparing equal to
its counterpart.

This is addressed in the IR type canonicalization algorithm in
abg-ir.cc.
To learn more about this, look for
/// @defgroup OnTheFlyCanonicalization On-the-fly Canonicalization

and that comment.

The implementation is scattered in the functions
return_comparison_result, maybe_propagate_canonical_type and in the
macro RETURN_TRUE_IF_COMPARISON_CYCLE_DETECTED.

More on this below ...

> Would it be safer (more precise) to inhibit on-the-fly
> canonicalisation whenever the set (stack) of aggregates being compared
> is non-empty?

Right, that is the obvious thing to do, as I thought.  But then the
problem I encountered, when doing this for IR types, was that things
were too slow.  Really really slow.  In the time window where the
canonical type DIE is inhibited, the amount of (quadratic) structural
DIE comparisons goes through the roof.  That is why I came up with the
canonical type propagation in the first place.

So, the other (harder) approach I've taken is to not disable the
on-the-fly canonicalization when the stack of aggregates being compared
is non-empty, but rather, keep track of the types which are resolved as
being equivalent, but that depend on recursive aggregate that were
detected as such.

I call these types "dependent types".  Let's consider one such dependent
type D.  D's canonical type is the result of canonical propagation that
happened as the recursive type (that D depends on) was on the stack of
aggregates being compared.  D is labelled as having a "non-confirmed"
canonical type.  If the recursive type it depends on is later considered
not being equivalent to its counterpart, then the non-confirmed
canonical of D is going to be "cancelled" and then D is going to be
considered as being non canonicalized.

This is done for D and all the types that depends on it.

By doing this, the number of non canonicalized types is reduced to its
absolute minimum and so comparisons are reasonably fast, even for an
applications like the Kernel or Gimp.  Libraries usually have smaller
type graphs so we don't usually see the speed issue there.  Unless it's
llvm.so or libwebkit.so ;-)

So, I wasn't going to get into doing this for DIEs right away because it
takes a lot of time doing careful coding/debugging/profiling cycles.

But I definitely think we'll need to do this to have perfectly precise
canonicalizer.  My point was to get this in as it's an improvement
already and then work on the subsequent patch with a cooler head.

Does that make any sense?

Thanks.

--=20
		Dodji