public inbox for libabigail@sourceware.org
From: Dodji Seketeli <dodji@seketeli.org>
To: "guillermo.e.martinez at oracle dot com via Libabigail"
	<libabigail@sourceware.org>
Cc: "guillermo.e.martinez at oracle dot com"
	<sourceware-bugzilla@sourceware.org>
Subject: Re: [Bug default/29811] New: extern array declaration following by its definition reports different sizes in DWARF vs CTF
Date: Wed, 21 Dec 2022 17:02:42 +0100
Message-ID: <87k02kd9st.fsf@seketeli.org>
In-Reply-To: <bug-29811-9487@http.sourceware.org/bugzilla/> (guillermo e. martinez at oracle dot com via Libabigail's message of "Mon, 21 Nov 2022 04:09:59 +0000")

Hello,

> The following C code generates a infinite subrange length using DWARF
> front-end.
>
> extern unsigned int is_basic_table[];
>
> unsigned int is_basic_table [] =
>   {
>     0,
>   };

So, the declaration unsigned int is_basic_table[] declares an array of
"unknown" size: the type of is_basic_table is "array of unsigned int, of
unknown size".  In this particular case, the definition that follows
initializes is_basic_table with an array of one element of type unsigned int.

So, let's look at what the DWARF description of this looks like:

$ cat test.c
extern unsigned int is_basic_table[];

unsigned int is_basic_table [] =
  {
    0,
  };
$ 
$ gcc -g -c test.c
$
$ eu-readelf --debug-dump=info test.o | cat -n
     1	
     2	DWARF section [ 4] '.debug_info' at offset 0x40:
     3	 [Offset]
     4	 Compilation unit at offset 0:
     5	 Version: 5, Abbreviation section offset: 0, Address size: 8, Offset size: 4
     6	 Unit type: compile (1)
     7	 [     c]  compile_unit         abbrev: 3
     8	           producer             (strp) "GNU C17 11.3.1 20221121 (Red Hat 11.3.1-4) -mtune=generic -march=x86-64-v2 -g"
     9	           language             (data1) C11 (29)
    10	           name                 (line_strp) "test.c"
    11	           comp_dir             (line_strp) "/home/dodji/git/libabigail/fixes/prtests/PR29811"
    12	           stmt_list            (sec_offset) 0
    13	 [    1e]    array_type           abbrev: 1
    14	             type                 (ref4) [    29]
    15	             sibling              (ref4) [    29]
    16	 [    27]      subrange_type        abbrev: 4
    17	 [    29]    base_type            abbrev: 2
    18	             byte_size            (data1) 4
    19	             encoding             (implicit_const) unsigned (7)
    20	             name                 (strp) "unsigned int"
    21	 [    2f]    variable             abbrev: 5
    22	             name                 (strp) "is_basic_table"
    23	             decl_file            (data1) test.c (1)
    24	             decl_line            (data1) 1
    25	             decl_column          (data1) 21
    26	             type                 (ref4) [    1e]
    27	             external             (flag_present) yes
    28	             declaration          (flag_present) yes
    29	 [    3b]    array_type           abbrev: 1
    30	             type                 (ref4) [    29]
    31	             sibling              (ref4) [    4b]
    32	 [    44]      subrange_type        abbrev: 6
    33	               type                 (ref4) [    4b]
    34	               upper_bound          (data1) 0
    35	 [    4b]    base_type            abbrev: 2
    36	             byte_size            (data1) 8
    37	             encoding             (implicit_const) unsigned (7)
    38	             name                 (strp) "long unsigned int"
    39	 [    51]    variable             abbrev: 7
    40	             specification        (ref4) [    2f]
    41	             decl_line            (data1) 3
    42	             decl_column          (data1) 14
    43	             type                 (ref4) [    3b]
    44	             location             (exprloc) 
    45	              [ 0] addr .bss+0 <is_basic_table>
$ 

From line 21 to line 26, we see that the Debug Information Entry (a.k.a. DIE)
that describes the variable is_basic_table is the following:

    21	 [    2f]    variable             abbrev: 5
    22	             name                 (strp) "is_basic_table"
    23	             decl_file            (data1) test.c (1)
    24	             decl_line            (data1) 1
    25	             decl_column          (data1) 21
    26	             type                 (ref4) [    1e]

At line 26, we see that the value of the "type" attribute of the
variable DIE is 0x1e.  This means that the DIE describing the type of
the is_basic_table variable is the DIE whose offset is 0x1e.

That DIE, whose offset is 0x1e, is at line 13:

    13	 [    1e]    array_type           abbrev: 1
    14	             type                 (ref4) [    29]
    15	             sibling              (ref4) [    29]

As you can see there, that DIE has no "size" attribute, and its
subrange_type child (line 16 of the dump) carries no upper_bound.  That
is in line with the type of is_basic_table, as declared in the C source
code, which is "Array of unknown size".

That is what libabigail expresses by the keyword "infinite".  I probably
could have chosen a better keyword here ;-)  Anyway, "infinite" really means
"unknown" here.

But then, you are right in saying that:

> The symbol size showed by objdump:
>   0000000000000000 g     O .bss   0000000000000004 is_basic_table

That is the symbol view of things, not the type description as
written in the source code.

In other words, the size "4" there is the size of the ELF symbol
is_basic_table, which is the ELF symbol for the is_basic_table C
variable.

The ELF symbol named "is_basic_table" is described by libabigail, as can
be seen by doing:

    $ /home/dodji/git/libabigail/fixes/build/tools/abidw test.o
         1	<abi-corpus version='2.1' path='test.o' architecture='elf-amd-x86_64'>
         2	  <elf-variable-symbols>
         3	    <elf-symbol name='is_basic_table' size='4' type='object-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>
         4	  </elf-variable-symbols>
         5	  <abi-instr address-size='64' path='test.c' comp-dir-path='/home/dodji/git/libabigail/fixes/prtests/PR29811' language='LANG_C11'>
         6	    <type-decl name='unsigned int' size-in-bits='32' id='type-id-1'/>
         7	    <array-type-def dimensions='1' type-id='type-id-1' size-in-bits='32' id='type-id-2'>
         8	      <subrange length='1' type-id='type-id-3' id='type-id-4'/>
         9	    </array-type-def>
        10	    <array-type-def dimensions='1' type-id='type-id-1' size-in-bits='infinite' id='type-id-5'>
        11	      <subrange length='infinite' id='type-id-6'/>
        12	    </array-type-def>
        13	    <type-decl name='unsigned long int' size-in-bits='64' id='type-id-3'/>
        14	    <var-decl name='is_basic_table' type-id='type-id-5' mangled-name='is_basic_table' visibility='default' filepath='/home/dodji/git/libabigail/fixes/prtests/PR29811/test.c' line='1' column='1' elf-symbol-id='is_basic_table'/>
        15	  </abi-instr>
        16	</abi-corpus>
    $

At line 3, we can see:

<elf-symbol name='is_basic_table' size='4' type='object-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>

So we see that the symbol is_basic_table has a size of 4.

So I think that today, all the information is kept in the model: what
the source code said, and what the actual symbol looks like.

Let's look at what the comparison & reporting engines are saying when
comparing the IRs of DWARF against CTF:

> Thus, abidiff abi-ctf.xml abi-dwarf.xml, reports:
>
> 1 Changed variable:                                                             
>
>   [C] 'unsigned int is_basic_table[]' was changed at test-extern-array.c:3:1:   
>     type of variable changed:                                                   
>       array element type 'unsigned int' changed:                                
>         type size hasn't changed
>         type alignment changed from 32 to 0                                     
>       type size hasn't changed

You can see the report says that the type size hasn't changed.

>       type alignment changed from 32 to 0

What has changed is the alignment.  The DWARF front-end considers the
alignment to be zero for arrays.  This is because there is no specific
alignment information for that type so we set it to zero.

Do you know what happens if you set the alignment to zero in the CTF
front-end?

Cheers,

-- 
		Dodji
