From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id BFA153857806 for ; Mon, 7 Dec 2020 13:24:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org BFA153857806 Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-283-hqzzGRKuPRKWYUg2yJ3xYA-1; Mon, 07 Dec 2020 08:24:50 -0500 X-MC-Unique: hqzzGRKuPRKWYUg2yJ3xYA-1 Received: by mail-wr1-f72.google.com with SMTP id d2so465798wrr.5 for ; Mon, 07 Dec 2020 05:24:50 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:organization:references:date :in-reply-to:message-id:user-agent:mime-version :content-transfer-encoding; bh=DNfoaNaxYPAuo02Btyp8fBxIGnoIKXYH7f7PUhU3teE=; b=IkRaKyWUI0DFLYXEMu2r4Q9jJ7xOgadySq0d+S6CoTX21tfXS+Wyb3gByYx0A/P4wD VrEGYE4s5c83ERIVd2FxWULI5EVvxIwpsldeotkg3bVxUOos2wYEqCbVVALri+EXwb8G /HgDi2X6RhyID+j4hZMvoI9EYIEA9ZDepyNzRw6BVrSvb07/rj+ySVaAjv/GDUCiRqVU 9c0obnFdrjbozNrt3f5Ox8JHrR0lmuSo6HamfmyU2dQwteGQpDG4pHa9Q/SfYSEr5TSY yRmqSMAoO4OdLnEEhr5SC7DMV3w1r14uWY2IeVWTO5ybM/QuIVvC530hizYEiqOvyDpj Jb1g== X-Gm-Message-State: AOAM531QauvcVotmfr7aSL4DxDkUCZUCKjoSqgl6OjBbk5Z4EnyRWUbW gX27D6XWVk0zpNvHxVdoU/rEy8j3SakcF3QmXQ4rissTKLZum3ooiYA04QbnpCm3lIb4zZ+Wj8M bN/7lClZPLi5f28MAerX7 X-Received: by 2002:adf:fdce:: with SMTP id i14mr12265033wrs.58.1607347489099; Mon, 07 Dec 2020 05:24:49 -0800 (PST) X-Google-Smtp-Source: ABdhPJxFs4pElUTt+p19i1/KaRArjT4zrB+Ql/u1MNjIcCb2ZyyQWXetaC9B8XvqLXrG4DzYhqrdIg== X-Received: by 2002:adf:fdce:: with SMTP id i14mr12265013wrs.58.1607347488816; Mon, 07 Dec 2020 05:24:48 -0800 (PST) Received: from localhost (91-166-131-65.subs.proxad.net. [91.166.131.65]) by smtp.gmail.com with ESMTPSA id u12sm14919013wmu.28.2020.12.07.05.24.47 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 07 Dec 2020 05:24:48 -0800 (PST) Received: by localhost (Postfix, from userid 1001) id D0B6E1A262B; Mon, 7 Dec 2020 14:24:46 +0100 (CET) From: Dodji Seketeli To: Giuliano Procida Cc: Giuliano Procida via Libabigail , kernel-team@android.com, Matthias =?utf-8?Q?M=C3=A4nnich?= Subject: Re: [PATCH] abidiff: improve treatment of array types in leaf changes mode Organization: Me, myself and I References: <20201203150916.3540551-1-gprocida@google.com> <865z5hq1eo.fsf@seketeli.org> X-Operating-System: Red Hat Enterprise Linux Server 7.8 Date: Mon, 07 Dec 2020 14:24:46 +0100 In-Reply-To: (Giuliano Procida's message of "Fri, 4 Dec 2020 14:41:46 +0000") Message-ID: <86o8j5oijl.fsf@seketeli.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Dec 2020 13:24:54 -0000 Hello Giuliano, Giuliano Procida a =C3=A9crit: [...] > I've found this aspect (local changes and types with similar > structure) of abidiff really hard to reason about. I'm likely looking > at things the wrong way, but here's roughly how I see things. No, I don't think you are seeing things the "wrong way". I think there are several ways of seeing things. There are several goals libabigail had to fulfill. To accommodate them all we adopted some point of views that might make things look less obvious that they could have been should we have fewer goals to fulfill. So, here are some of the concepts we use in Libabigail to analyze changes. I'll only talk about those related to your discussion here. An ABI artifact is either a type or a declaration. An ABI artifact is a composite entity, meaning that it has sub-parts that are ABI artifacts as well. So, a declaration has a type. A type might have have sub-types. In that context, we define the concept of a local change. A local change to an ABI artifact is a change that does not involve any of the sub-parts of the artifact. For instance, a local change for a typedef would be a change to its name. But a change to the underlying type of the typedef is not considered local, as the underlying type is the sub-type of the typedef. Said otherwise, a local change to an ABI artifact is a change to the artifact itself, not a change to any of its sub-parts. A leaf change is then a local change "that makes sense" from the end-user point of view, in the context of the leaf reporter. More precisely, a leaf change: * must impact an exported interface of the binary we are analyzing. In other words, it must be a change to a sub-part of an ABI artifact (a declaration) that is exported by the binary we are looking at. * doesn't contain a name change * is about a function, variable, class, union or enum type. There are other restrictions, but there are technical details that are not useful to mention at this point, think. The reason why we introduced the concept of the "local change" is that it's useful to fulfill goals in libabigail, independently from the leaf reporter requirements. It's useful to be able to perform things like redundancy detection and proper categorization of diff nodes. But I think this is a topic for a separate discussion. > We want to identify "leaf" changes. I think of them more as "root" (cause= ) > changes, but that's just inverting the edges in some graph. These should = be > (user-defined) types that have not changed name but have changed in some > way "directly" (indirectly would be if there were a more remote cause of > the same sort). I find your way of seeing things fairly accurate. It's doesn't seem to go against the view that I exposed above. > There are also direct changes to symbols' types. I tend to think more in terms of 'exported interfaces'. I.e, a binary exports functions or global variables. Those constitute the interfaces of the application binary. Those interfaces have several properties. Their type is one of their properties. In the particular case of ELF binaries, those exported interfaces must have ELF symbols, which another property of those exported interfaces. So, yes, there are also "local changes to the binary interfaces". This is a particular case of the definition I laid down earlier. So I agree with you. > User-defined types with local changes have paths to the symbols whose typ= es > they affect - they are the impacted interfaces. Yes. > Now with libabigail, there is the notion of local vs non-local change whi= ch > might not correspond exactly with the notion sketched above, but it's > certainly related. I hope it's clearer now. > And then there are at least two more > pieces: types_have_similar_structure which is used in decisions about > locality. Right. The concept of "type structure similarity" is a detail of the concept of "local change". > and lastly the logic used to filter changes for consideration in > the leaf report - they have to be local plus a lot of other conditions. Right. > The decision-making that goes into leaf reporting seems to be spread acro= ss > several bits of code The pass that walks through diff nodes to detect which ones are leaf nodes is the 'struct leaf_diff_node_marker_visitor' defined in abg-comparison.cc. Now, there are a *lot* of concepts involved here, I agree. But ultimately, I think it's related to the complexity of what we are trying to model here. > and I speculate that it was an inconsistency between two of these that > resulted in diff nodes being selected for inclusion but actually have > no local diff to report (by the represent helper function). Stricto sensu, I think the problem was us not respecting the definition of the "type structure similarity" relation. As that relation is at the *bottom* of a stack of concepts, it's not surprising that a bug in there has impacts possibly further down the road. At the end of the day, this is a complex network of concepts we are dealing with. > If I knew what types_have_similar_structure had to be consistent with I > could check all the cases or, more interestingly, see if the decisions > being made in two different places could be made in just one place. Well. In this particular case, I'd argue that types_have_similar_structure had to be consistent with itself to begin with :-) So I wouldn't beat myself over this too much, seriously. To me, what is paramount is that we keep the whole thing as "debuggable" as possible. Because *that* is how we progressively get better. At least in my experience. Thank you for initiating this discussion. I hope I haven't made things even more confusing than they are at the moment. Cheers, --=20 =09=09Dodji