From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from relay2-d.mail.gandi.net (relay2-d.mail.gandi.net [217.70.183.194]) by sourceware.org (Postfix) with ESMTPS id 43BD73861962 for ; Thu, 9 Jul 2020 09:19:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 43BD73861962 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=seketeli.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=dodji@seketeli.org X-Originating-IP: 91.166.131.130 Received: from localhost (91-166-131-130.subs.proxad.net [91.166.131.130]) (Authenticated sender: dodji@seketeli.org) by relay2-d.mail.gandi.net (Postfix) with ESMTPSA id 3E3BE40004; Thu, 9 Jul 2020 09:19:28 +0000 (UTC) Received: by localhost (Postfix, from userid 1000) id E7A6A180092E; Thu, 9 Jul 2020 11:19:26 +0200 (CEST) From: Dodji Seketeli To: Giuliano Procida Cc: libabigail@sourceware.org, kernel-team@android.com, maennich@google.com Subject: Re: [PATCH 2/2] Use pointers not strings in type graph comparison. Organization: Me, myself and I References: <20200619163924.207852-1-gprocida@google.com> <20200619163924.207852-3-gprocida@google.com> X-Operating-System: Red Hat Enterprise Linux Workstation 7.8 Beta X-URL: http://www.seketeli.net/~dodji Date: Thu, 09 Jul 2020 11:19:26 +0200 In-Reply-To: <20200619163924.207852-3-gprocida@google.com> (Giuliano Procida's message of "Fri, 19 Jun 2020 17:39:24 +0100") Message-ID: <87pn95qc4h.fsf@seketeli.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Jul 2020 09:19:32 -0000 Hello, Giuliano Procida a =C3=A9crit: > During structural comparison of types there is the possibilitiy of > infinite recursion as types can have self-references and there can > be more elaborate mutual references between them. > > The current comparison algorithm keeps track of currently seen (struct > and function) types by name. This causes earlier caching of names than > is needed and, more significantly, may result in types comparing equal > unexpectedly. This commit switches to storing their addresses instead. I think it might be interesting to understand why the tracking was done by type names, rather than by address. First, the tracking was initially done on classes only. This was at a time when libabigail didn't have the concept of function types. In a DWARF representation, a given struct S can appear several times in a binary because that very same S is defined in several different translation units. Thus, before type canonicalization (which happens at the level of the libabigail IR) there could be several different representations of the same S. These representations of S are different in the sense that they have different addresses in memory. But they represent the same S. So when, inside the same binary (or ABI corpus), we compare two S that end-up being the same but have different addresses, we can get into infinite recursion if S has some elaborate enough self-references, as you pointed out in your preamble. In that case, using the address of the representation of S keep track of the currently seen S won't work because there can be several different addresses representing the same S. That is why I used the name of the type to keep that track, rather than addresses. I thought the potential drawbacks (in terms of precision of the compare) was better than risking an infinite loop. However, I have later introduced type de-duplication at the level of DWARF directly, before the construction of the IR, at least for DWARF originating from C compilers. With that in place, inside a given ABI corpus, there is now only one internal representation for S despite the potential multiple copies of S present at the DWARF level. Now we can use addresses to keep track of currently seen structs. Of course, I haven't re-visited this whole thing since then, so the type name based tracking stayed. I introduced function types, I believe, before type de-duplication (I am not sure on that one, I'll prolly have to double check). At the time, I guess I used type name based tracking there two, because I assumed I could have the same kind of issue. But now I am not so sure. Function types are a different kind of beast and they might not suffer the same kind of actual duplication as the other real types expressed in DWARF. Anyway, I think this is a good move. Thank you for looking into that. > This change affects some tests which show more diffs than previously. > > src/abg-ir.cc: (environment::priv): Change types of > classes_being_compared_ and fn_types_being_compared_ to be > simple sets of pointers. > (function_type::priv::mark_as_being_compared): Just add > address to set. > (function_type::priv::unmark_as_being_compared): Just remove > address from set. > (function_type::priv::comparison_started): Just look up > address in set. > (class_or_union::priv::mark_as_being_compared): Just add > address to set. > (class_or_union::priv::unmark_as_being_compared): Just remove > address from set. > (class_or_union::priv::comparison_started): Just look up > address in set. > * tests/data/test-diff-dwarf/PR25058-liblttng-ctl-report-1.txt: > Update. > * tests/data/test-diff-pkg/elfutils-libs-0.170-4.el7.x86_64-multiple-sym= -vers-report-0.txt: > Update. > * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el= 7.x86_64-report-2.txt: > Update. > * tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el= 7.x86_64-report-3.txt: > Update. > > Signed-off-by: Giuliano Procida Applied to the master branch, thanks! Cheers, --=20 Dodji