From: Dodji Seketeli
To: "guillermo.e.martinez at oracle dot com via Libabigail"
Cc: "guillermo.e.martinez at oracle dot com"
Subject: Re: [Bug default/29811] extern array declaration following by its definition reports different sizes in DWARF vs CTF
Date: Fri, 23 Dec 2022 10:19:31 +0100
In-Reply-To: (guillermo e. martinez at oracle dot com via Libabigail's message of "Thu, 22 Dec 2022 15:27:05 +0000")
Message-ID: <87o7rumq8s.fsf@seketeli.org>

"guillermo.e.martinez at oracle dot com via Libabigail" writes:

> Thanks for this great explanation!

You are welcome!

[...]

>> As you can see there, that DIE has no "size" attribute.  That is in line
>> with the type of is_basic_table, as declared in the C source code, which
>> is "Array of unknown size".
>>
>
> So, Is it a limitation of DWARF info?

I wouldn't say it's a limitation.  Rather, I'd say that it's a feature.
In that case DWARF keeps the information about the exact type of the
variable as written in the source code.  That type really is "array of
unknown size".  The actual size taken by the variable in memory can be
derived from the initializer, not from the type definition.  So I think
that DWARF is correct here.

[...]

>
> Right, just that it is limited to _symbol_information not saying much about of
> type symbol, it is most evident when we use abidiff changing the array size,
> using CTF and DWARF front-end (e.g I changed the array's elements to two):

If I understand correctly, you changed the size of the array in the
initializer, right?
>
> $ abidiff test-01.o test-02.o
>
> Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> Variables changes summary: 0 Removed, 1 Changed, 0 Added variable
>
> 1 Changed variable:
>
>   [C] 'unsigned int is_basic_table[]' was changed at test03.c:1:1:
>     size of symbol changed from 4 to 8

I would guess so.  So both variables have the same type (array of
unknown size), but their initializers have different sizes.  See, that
information is kept all along.

If, on the other hand, you say that the type size should be set to 1,
then we wouldn't know the difference between:

    unsigned is_basic_table[] = {0};

and

    unsigned is_basic_table[1];

Even if, in the end, the ABI ends up being the same, there is a subtle
difference there that would be lost.  And I think that it's important to
keep that information for users at this point.

Is it possible to have that information from CTF?  I mean, is it
possible to have CTF tell us that is_basic_table is an array of unknown
size?
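
To make the distinction above concrete, here is a small sketch in C.
The name sized_table and the two helper functions are made up for
illustration; only is_basic_table comes from the test case.  It shows
that the declared type can be "array of unknown size" while the storage
size still comes out of the initializer:

```c
#include <assert.h>
#include <stddef.h>

/* Declaration with an incomplete type: "array of unknown size". */
extern unsigned is_basic_table[];

/* The definition takes its element count (one) from the initializer. */
unsigned is_basic_table[] = {0};

/* Hypothetical counterpart whose size is spelled out in the type. */
unsigned sized_table[1];

/* Both objects occupy the same number of bytes in memory. */
static_assert(sizeof(is_basic_table) == sizeof(sized_table),
              "same storage size, different declared types");

/* Illustrative helpers returning the storage size of each object. */
size_t basic_table_bytes(void) { return sizeof(is_basic_table); }
size_t sized_table_bytes(void) { return sizeof(sized_table); }
```

The two objects are indistinguishable by size alone, which is why the
difference can only be preserved if the debug information keeps the
type as written in the source.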
>
> $ abidiff --ctf test-01.o test-02.o
>
> Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> Variables changes summary: 0 Removed, 1 Changed, 0 Added variable
>
> 1 Changed variable:
>
>   [C] 'unsigned int is_basic_table[1]' was changed to 'unsigned int is_basic_table[2]':
>     size of symbol changed from 4 to 8
>     type of variable changed:
>       type name changed from 'unsigned int[1]' to 'unsigned int[2]'
>       array type size changed from 32 to 64
>       array type subrange 1 changed length from 1 to 2

Here, it looks like abidiff is reporting the difference between:

    unsigned is_basic_table[1];

and

    unsigned is_basic_table[2];

The fact that we are actually looking at unsigned is_basic_table[],
initialized with two different initializers, is lost.

So, it seems to me that DWARF is more fine-grained here.  It would be
nice to have that same level of finesse from CTF, if possible.  Is that
possible?

[...]

>> Do you know what happens if you set the alignment to zero in the CTF
>> front-end?
>>
>
> Yes, I changed the alignment value in CTF front-end in commit: 8b832a9edfa,
> and I tested with latest commit until now: 4cf2ef8f9.
>
> $ abidiff abi-ctf.xml abi-dwarf.xml
>
> Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> Variables changes summary: 0 Removed, 1 Changed, 0 Added variable
>
> 1 Changed variable:
>
>   [C] 'unsigned int is_basic_table[1]' was changed to 'unsigned int is_basic_table[]' at test03.c:1:1:
>     type of variable changed:
>       type name changed from 'unsigned int[1]' to 'unsigned int[]'
>       array type size changed from 32 to infinity
>       array type subrange 1 changed length from 1 to infinity

OK, fair enough.  I think that here, we should teach libabigail's
comparison engine to take into account the fact that although the array
type size is unknown, its symbol size is known and hasn't changed.
Thus, this is not an ABI change.

This is on me.

-- 
Dodji
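
For illustration only, the comparison rule proposed above could be
sketched roughly like this in C.  This is not libabigail's actual code
(libabigail is written in C++); the struct, its fields and the function
name are all hypothetical, chosen just to show the idea that a known,
unchanged symbol size takes precedence when one side's array type size
is unknown:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of an array variable as seen by one front-end. */
struct array_var_info {
    bool   type_size_known;  /* false for "array of unknown size" */
    size_t type_size_bits;   /* meaningful only if type_size_known */
    size_t symbol_size;      /* size of the ELF symbol, in bytes */
};

/* Returns true if going from a to b should count as an ABI change. */
bool array_var_abi_changed(const struct array_var_info *a,
                           const struct array_var_info *b)
{
    assert(a != NULL && b != NULL);

    /* A different symbol size is always a change. */
    if (a->symbol_size != b->symbol_size)
        return true;

    /* Unknown type size on either side: trust the unchanged symbol size. */
    if (!a->type_size_known || !b->type_size_known)
        return false;

    /* Both type sizes are known: compare them directly. */
    return a->type_size_bits != b->type_size_bits;
}
```

With the CTF view ('unsigned int[1]', 32 bits, symbol size 4) on one
side and the DWARF view ('unsigned int[]', unknown size, symbol size 4)
on the other, a rule of this shape would report no ABI change.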