From: Dodji Seketeli
To: Ben Woodard
Cc: "Guillermo E. Martinez via Libabigail"
Subject: Re: [PATCH v2] Add regression tests for ctf reading
Date: Fri, 26 Nov 2021 11:27:48 +0100
Message-ID: <87y25brz2z.fsf@seketeli.org>

Hello,

[...]

On Nov 24, 2021, at 8:36 AM, Dodji Seketeli wrote:

>> In those cases where the abixml file generated from CTF is different
>> from the one generated from DWARF, I think we should have two
>> different reference abixml files to diff against. No CTF test should
>> be disabled, I think.

Ben Woodard wrote:

> Here is a somewhat deeper question that I think needs to be
> considered. I’ve generally referred to it as “DWARF Idioms” but this
> email makes me think that it is even larger than that.
>
> For my work, I need libabigail to generate an abstract notion of the
> ABI corpus. How it constructs that abstract notion of the ABI needs to
> be independent of the producer. Think of it this way, say we take the
> same compiler and have it compile the same library producing both CTF
> and DWARF, the ABI of the library doesn’t change. Since it is
> literally the same object, the program text is the same. Therefore the
> ABI is unquestionably the same. Any difference reported by libabigail
> therefore is a problem with libabigail.

In theory, I think what you say makes total sense.

In practice however, there seem to be some annoying limits to what we
can do right now.

For instance, right now, CTF doesn't include source location (file
name, line numbers) information. So by default, the output details of
abidiff will be different at least because in one case it shows line
information and in the other it doesn't.

Of course we can work harder to avoid showing line information in all
abidiff tests, but then it means that the particular feature of
handling line information coming from DWARF won't be tested.

So yeah, there are details like that. I am not sure how many of these
we have, but if I've found one, I can't say there aren't others.

Of course, the "general representation" (whatever that means) of the
ABI represented by DWARF or CTF should not be different.
But the exact textual output of abidiff or abidw is not necessarily
going to be the same, "a priori".

> It is not taking the source material and abstracting it enough into
> the ABI artifacts to separate the artifacts from the implementation.

Right. I'd argue that the problem wasn't "apparent" when we had only
DWARF to care about. Now it is. So, I think now we are starting to
have the means to see the issue, rather than just speculate in the
abstract.

So for cases where it makes sense, I think the "abstraction level" of
the IR can be improved.

> So I kind of believe that we need to look more deeply into WHY the CTF
> and DWARF are not comparing as equivalent and begin the process of
> filing the compiler bugs when we need to, and doing what is necessary
> to abstract the ABI from the source material that libabigail used to
> construct its IR of the ABI corpus from.

Agreed.

> So, I must say that I disagree with both dodji’s approach here and to
> a lesser extent Guillermo’s approach of disabling the tests. I think
> that the tests where the CTF doesn’t match the DWARF should be
> investigated and when necessary marked “xfail” with a note citing
> their individual cause.

Well, lemme quote what I said:

On Nov 24, 2021, at 8:36 AM, Dodji Seketeli wrote:

>> In those cases where the abixml file generated from CTF is different
>> from the one generated from DWARF, I think we should have two
>> different reference abixml files to diff against. No CTF test should
>> be disabled, I think.

I said explicitly that no test should be disabled.

Incidentally, by providing the two abixml files (DWARF and CTF) at
least we can see what the changes are and later work on reducing those
changes if it makes sense.

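To make the "two reference files" idea a bit more concrete, here is a
minimal sketch in Python. It is not libabigail's actual test harness:
the AbiTest record, the reference_for helper and the file names are all
hypothetical, made up for illustration. It also assumes that source
locations appear in abixml as filepath, line and column attributes, so
that dropping them lets us check whether a DWARF-derived and a
CTF-derived abixml differ only by location information.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Assumption: abixml records source locations in these attributes.
# CTF carries no such information, so they are dropped before any
# cross-format comparison.
LOCATION_ATTRS = ("filepath", "line", "column")


@dataclass
class AbiTest:
    """Hypothetical test entry: one binary, one reference per front-end."""
    binary: str
    dwarf_ref: str                      # reference abixml generated from DWARF
    ctf_ref: str                        # reference abixml generated from CTF
    xfail_reason: Optional[str] = None  # note on why CTF and DWARF differ


def reference_for(test: AbiTest, frontend: str) -> str:
    """Pick the reference file for the front-end ("dwarf" or "ctf")."""
    if frontend not in ("dwarf", "ctf"):
        raise ValueError(f"unknown front-end: {frontend}")
    return test.dwarf_ref if frontend == "dwarf" else test.ctf_ref


def strip_locations(abixml_text: str) -> str:
    """Return the abixml text with source-location attributes removed."""
    root = ET.fromstring(abixml_text)
    for elem in root.iter():
        for attr in LOCATION_ATTRS:
            elem.attrib.pop(attr, None)
    return ET.tostring(root, encoding="unicode")


def same_modulo_locations(dwarf_xml: str, ctf_xml: str) -> bool:
    """True if the two documents differ only by source locations."""
    return strip_locations(dwarf_xml) == strip_locations(ctf_xml)
```

With something like this, each front-end is still tested against its
own reference file (so DWARF line information stays covered), while
same_modulo_locations tells us which remaining DWARF/CTF differences
deserve a closer look and a note in xfail_reason.
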
> I think that what we need to work towards is: abidw produces the same
> output (or more precisely libabigail produces the same IR) whether you
> compile with -gdwarf-3, -gdwarf-4, -gdwarf-5, -gsplit-dwarf or -gctf.
> Also, for the most part the ABI should not change between compiler
> versions. There may be a few cases that we need to look into where the
> compiler actually breaks ABI and of course libabigail should flag
> those but I would assert that libabigail needs to abstract its IR of
> the ABI enough that compiler version changes that don’t actually
> change the ABI of the ELF object are not reported as ABI breaks. It
> currently does pretty well at this at the moment. Then once that
> foundation is built, being able to abstract the ABI IR enough that
> differences in toolchains e.g. LLVM vs GCC are not flagged as changes
> in the object’s ABI. This is important to provide people with a tool
> that will allow them to mix toolchains within a project to achieve
> optimal code.

In an ideal world, of course. I am not opposed to trying our best ;-)

Cheers,

-- 
Dodji