public inbox for gcc-patches@gcc.gnu.org
* Type representation in CTF and DWARF
@ 2019-10-04 19:12 Indu Bhagat
From: Indu Bhagat @ 2019-10-04 19:12 UTC (permalink / raw)
  To: gcc-patches, palves, mark

Hello,

At GNU Tools Cauldron this year, some folks were curious to know more about how
the "type representation" in CTF compares with that in DWARF.

I use the small testcase below to gather some numbers to help drive this discussion.

[ibhagat@ibhagatpc ctf-size]$ cat ctf_sizeme.c
#define MAX_NUM_MSGS 5
  
enum node_type
{
   INIT_TYPE = 0,
   COMM_TYPE = 1,
   COMP_TYPE = 2,
   MSG_TYPE = 3,
   RELEASE_TYPE = 4,
   MAX_NODE_TYPE
};
  
typedef struct node_payload
{
   unsigned short npay_offset;
   const char * npay_msg;
   unsigned int npay_nelems;
   struct node_payload * npay_next;
} node_payload;
  
typedef struct node_property
{
   int timestamp;
   char category;
   long initvalue;
} node_property_t;
  
typedef struct node
{
   enum node_type ntype;
   int nmask:5;
   union
     {
       struct node_payload * npayload;
       void * nbase;
     } nu;
   unsigned int msgs[MAX_NUM_MSGS];
   node_property_t node_prop;
} Node;
  
Node s;
  
int main (void)
{
   return 0;
}

Note that in this case, there is nothing that the de-duplicator has to do
(neither for the TYPE comdat sections nor CTF types). I chose such an example
because de-duplication of types is orthogonal to the concept of representation
of types.

So, for this small C testcase with a union, enum, array, struct, typedef, etc., I
see the following sizes:

Compile with -fdebug-types-section -gdwarf-4 (size -A <binary> excerpt):
     .debug_aranges     48         0
     .debug_info       150         0
     .debug_abbrev     314         0
     .debug_line        73         0
     .debug_str        455         0
     .debug_ranges      32         0
     .debug_types      578         0

Compile with -fdebug-types-section -gdwarf-5 (size -A <binary> excerpt):
     .debug_aranges      48         0
     .debug_info        732         0
     .debug_abbrev      309         0
     .debug_line         73         0
     .debug_str         455         0
     .debug_rnglists     23         0

Compile with -gt (size -A <binary> excerpt):
     .ctf      966     0
     CTF strings sub-section size (ctf_strlen in disassembly) = 374
     ==> CTF section just for representing types = 966 - 374 = 592 bytes
     (The 592 bytes include the CTF header and other indexes etc.)
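
The arithmetic behind these numbers can be written out as a small C sketch.
The constants are copied from the size -A excerpts above; the function names
are made up for this illustration. Note that the DWARF sums are only upper
bounds for type information, since .debug_info and .debug_abbrev also hold
non-type debug information.

```c
/* Section sizes in bytes, copied from the size -A excerpts above.  */
enum
{
  CTF_SECTION = 966, CTF_STRINGS = 374,
  DW4_INFO = 150, DW4_ABBREV = 314, DW4_TYPES = 578,
  DW5_INFO = 732, DW5_ABBREV = 309
};

/* Bytes of .ctf left once the CTF string table is excluded.  */
int
ctf_type_bytes (void)
{
  return CTF_SECTION - CTF_STRINGS;
}

/* DWARF 4: sections that hold DIEs and abbreviations, including the
   separate .debug_types section.  The .debug_str section is left out,
   just as the CTF string table is.  */
int
dwarf4_type_section_bytes (void)
{
  return DW4_INFO + DW4_ABBREV + DW4_TYPES;
}

/* DWARF 5: the type units are in .debug_info, so there is no
   .debug_types term.  */
int
dwarf5_type_section_bytes (void)
{
  return DW5_INFO + DW5_ABBREV;
}
```

Calling the three functions from a main () gives 592 bytes for CTF, 1042
bytes for the DWARF 4 sections and 1041 bytes for the DWARF 5 sections.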

The following are the points I would highlight. Hopefully they help you see
that CTF has promise for the task of representing type debug info.

1. Type Information layout in sections:
    A .ctf section is self-sufficient to represent types in a program. All
    references within the CTF section are via either indexes or offsets into the
    CTF section. No relocations are necessary in CTF at this time. In contrast,
    DWARF type information is organized in multiple sections - .debug_info,
    .debug_abbrev and .debug_str sections in DWARF5; plus .debug_types in DWARF4.

2. Type Information encoding / compactness matters:
    Because the type information is organized across sections in DWARF (and
    those sections also contain other debug information, such as locations),
    it is not feasible to put a precise number on the size in bytes needed to
    represent type information in DWARF. But the section sizes shown above
    should help show that CTF has promise in representing types compactly.

    Let's look at some size data. The CTF string table (= 374 bytes) is left out
    of the discussion at hand because it would not be fair to compare it with the
    .debug_str section, which contains more than just the names of types.

    The .ctf section needs 592 bytes to represent the types in CTF format.
    With DWARF5, the sections that carry the type information take 732 bytes
    (.debug_info) and 309 bytes (.debug_abbrev), 1041 bytes in total.

    In DWARF (when using -fdebug-types-section), the base types are duplicated
    across type units. So for the above example, the DWARF DIE representing
    'unsigned int' will appear in both DWARF type trees, the one for node and
    the one for node_payload. In CTF, there is a single 'unsigned int' type.

3. Type Information retrieval and handling:
    CTF type information is organized as a linear array of CTF types. CTF types
    have references to other CTF types. libctf facilitates name lookups, i.e.
    given the name of the type, get the type information.
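
    To make the "linear array of types with references" idea concrete, here
    is a toy sketch in C. It is NOT the actual CTF encoding and NOT the libctf
    API: every toy_* name, the record layout and the kind values are
    assumptions made up for this illustration. It only shows types referring
    to each other by array index, names stored as offsets into a string table,
    and a lookup by name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical kinds; these are not the real CTF kind values.  */
enum toy_kind { TOY_INTEGER, TOY_POINTER, TOY_ARRAY, TOY_STRUCT };

/* One record of the toy type array.  NAME is an offset into toy_strtab
   (0 means anonymous) and REF is the index of the referenced type, or -1.  */
struct toy_type { enum toy_kind kind; size_t name; int ref; };

/* String table: offset 0 is the empty name.  */
static const char toy_strtab[] = "\0unsigned int\0node_payload\0node";

static const struct toy_type toy_types[] =
{
  { TOY_INTEGER,  1, -1 },  /* 0: unsigned int */
  { TOY_STRUCT,  14, -1 },  /* 1: struct node_payload */
  { TOY_POINTER,  0,  1 },  /* 2: pointer to type 1 */
  { TOY_ARRAY,    0,  0 },  /* 3: array of type 0 */
  { TOY_STRUCT,  27, -1 },  /* 4: struct node */
};

#define TOY_NTYPES ((int) (sizeof toy_types / sizeof toy_types[0]))

/* Struct members refer to their type by index as well.  */
struct toy_member { int parent; int type; };

static const struct toy_member toy_members[] =
{
  { 1, 0 },  /* node_payload.npay_nelems : unsigned int */
  { 4, 3 },  /* node.msgs : array of unsigned int */
};

/* Name lookup by linear scan: return the type index, or -1.  */
int
toy_lookup_by_name (const char *name)
{
  for (int i = 0; i < TOY_NTYPES; i++)
    if (toy_types[i].name != 0
        && strcmp (toy_strtab + toy_types[i].name, name) == 0)
      return i;
  return -1;
}

/* Follow REF links until a type that references nothing is reached.  */
int
toy_resolve (int idx)
{
  while (toy_types[idx].ref != -1)
    idx = toy_types[idx].ref;
  return idx;
}

/* Index of the underlying type of member M.  */
int
toy_member_base (int m)
{
  return toy_resolve (toy_members[m].type);
}
```

    In this toy table, toy_member_base (0) and toy_member_base (1) both end up
    at index 0, so the members of node_payload and node share one
    'unsigned int' record, and no relocations are needed because every
    reference is an index or an offset inside the table itself.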

    DWARF type information is organized in a tree of DIEs. The information at
    the leaf DIEs (base types) across DWARF type units is often duplicated.
    DWARF type units do have references to other type units for larger types
    though. In the example, the DWARF type unit for node has a reference to the
    DWARF type unit for node_payload.

    I state the above only as an observation. I don't know for certain whether
    one format is necessarily better or worse for consumers of type debug
    information at this time with respect to runtime access patterns.

    On a related note, it's not clear to me how .debug_types integration
    with split-dwarf works out. If the linker does not see the
    non-relocation-necessary part of the DWARF, I am not sure how .debug_types
    type units are de-duplicated when using split-dwarf.

Thanks
Indu

