From: Richard Biener <richard.guenther@gmail.com>
To: Nick Alcock <nick.alcock@oracle.com>
Cc: Indu Bhagat <indu.bhagat@oracle.com>,
Jakub Jelinek <jakub@redhat.com>,
Pedro Alves <palves@redhat.com>,
gcc-patches <gcc-patches@gcc.gnu.org>,
mark@klomp.org
Subject: Re: Type representation in CTF and DWARF
Date: Thu, 17 Oct 2019 18:09:00 -0000
Message-ID: <CAFiYyc3ehvZzX6+ooDoiPtLTUHScjZrhkaDQ+giY4cvYyBO+_w@mail.gmail.com>
In-Reply-To: <87r23b8eav.fsf@esperi.org.uk>
On Thu, Oct 17, 2019 at 7:36 PM Nick Alcock <nick.alcock@oracle.com> wrote:
>
> On 11 Oct 2019, Indu Bhagat stated:
> > Compile with -g -gdwarf-like-ctf and use dwz -o <binary_dwz> <binary> (using
> > dwz compiled from the master branch) on the generated binaries:
> >
> > (coreutils-0.22)
> > binary | .debug_info (D1) | .debug_abbrev (D2) | .debug_str (D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> > ls     |            30616 |               1136 |           21098 |               26240 | 0.62
> > pwd    |            10734 |                788 |           10433 |               13929 | 0.83
> > groups |            10706 |                811 |           10249 |               13378 | 0.80
> >
> > (emacs-26.3)
> > binary       | .debug_info (D1) | .debug_abbrev (D2) | .debug_str (D4) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+0.5*D4))
> > emacs-26.3.1 |           674657 |               6402 |          273963 |              273910 | 0.33
Btw, for a fair comparison you have to remove all DW_TAG_subprogram
children as well, since CTF doesn't represent scopes or local variables
at all (nor types only used by locals). It seems CTF only represents
function entry points.
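
For anyone who wants to redo the comparison after stripping more DWARF,
the ratio column in the quoted tables can be recomputed with a few lines
of Python. This is only a sketch: the function name ctf_ratio is made up
for illustration, the sizes are copied from the tables above, and the 0.5
weight on .debug_str is taken from the column header as given.

```python
def ctf_ratio(debug_info, debug_abbrev, debug_str, ctf):
    """Ratio from the table header: .ctf / (D1 + D2 + 0.5 * D4)."""
    return ctf / (debug_info + debug_abbrev + 0.5 * debug_str)


# (D1, D2, D4, uncompressed .ctf) in bytes, as quoted above.
rows = {
    "ls": (30616, 1136, 21098, 26240),
    "pwd": (10734, 788, 10433, 13929),
    "groups": (10706, 811, 10249, 13378),
    "emacs-26.3.1": (674657, 6402, 273963, 273910),
}

for name, sizes in rows.items():
    print(f"{name}: {ctf_ratio(*sizes):.2f}")
```

Substituting new section sizes into rows gives the adjusted ratios.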
> A side note here: the sizes given above are uncompressed sizes, but in
> the real world CTF is almost always compressed: the threshold for
> compression is in theory customizable but at the moment is hardwired at
> 4KiB-uncompressed in the linker. I usually see compression ratios of
> roughly 3 or 4 to 1: e.g. I just tried it with a randomly chosen binary,
> /usr/lib/libgtk-3.so.0.2404.3, and got these sizes:
>
> .text: 3317489
> DWARF: 8589254
> Uncompressed CTF (*no* ELF strtab sharing, so a bit bigger than usual): 713264
> .ctf section size: 213839
>
> Note that this is not only in the absence of CTF strtab sharing with the
> ELF dynstrtab, but also using a less effective compressor: currently we
> use gzip, but I expect to transition to lzma iff available at binutils
> build time (which it usually is), perhaps as an option (on by default)
> to allow interoperability with binutils that don't have lzma available.
> Obviously better compressors will save even more space.
>
> It may help that CTF is designed for good compressibility: we try to
> minimize the number of unique symbols if we can do so without impairing
> other properties, e.g. by avoiding encoding IDs of objects when we can
> instead rely on the consumer to compute them at read time by walking
> through the relevant data structures and counting.
>
> A few benchmarks indicate that compression by default also saves time
> both at compression and decompression time.
>
> (Within a week I should be able to repeat this with an ld capable of CTF
> deduplication rather than kludging it with a deduplicator meant for a
> quite different job. I expect the sizes above to improve. In fact if
> they *don't* improve I will take this as strong evidence that my
> deduplicator is buggy.)
>
>
> FWIW, here's my Emacs (26.1.50) sizes, again with no strtab sharing, but
> with deduplication: it's bigger than I'd like at around 10% of .text
> size, but still much less than 1% of binary size (my goal is 1--2% of
> .text, but Emacs is a nice tricky case, like Gtk, with lots of big types
> and structures with long member names):
>
> section size addr
> .interp 28 4194872
> .note.ABI-tag 32 4194900
> .note.gnu.build-id 36 4194932
> .gnu.hash 628 4194968
> .dynsym 24432 4195600
> .dynstr 16934 4220032
> .gnu.version 2036 4236966
> .gnu.version_r 704 4239008
> .rela.data.rel.ro 72 4239712
> .rela.data 168 4239784
> .rela.got 48 4239952
> .rela,bss 336 4240000
> .rela.plt 23448 4240336
> .init 23 4263784
> .plt 15648 4263808
> .text 1912622 4279456
> .fini 9 6192080
> .rodata 165416 6192096
> .eh_frame_hdr 36196 6357512
> .eh_frame 210976 6393712
> .init_array 8 6609328
> .fini_array 8 6609336
> .data.rel.ro 4569 6609344
> .dynamic 1104 6613920
> .got 16 6615024
> .got.plt 7840 6615040
> .data 3276077 6622880
> ,bss 34153472 9899008
> .comment 26 0
> .gnu_debuglink 24 0
> .comment 26 0
> .debug_aranges 1536 0
> .debug_info 3912261 0
> .debug_abbrev 38821 0
> .debug_line 408063 0
> .debug_str 117631 0
> .debug_loc 954538 0
> .debug_ranges 149590 0
> .ctf 213839 0
> .ctf (uncompressed) 713264 0
>
> (obviously, manually edited a bit, size -A doesn't produce the last line
> on its own!)
>
> (I'm not sure what the hell is going on with the weirdly-named ,bss
> section. Probably something to do with unexec().)
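
The "roughly 3 or 4 to 1" and "around 10% of .text" figures can be
checked directly against the quoted section listing. A small sketch,
using only the three sizes from that listing (the variable names are
mine):

```python
# Sizes in bytes, copied from the quoted listing.
text_size = 1912622          # .text
ctf_compressed = 213839      # .ctf
ctf_uncompressed = 713264    # .ctf (uncompressed)

# How much the compressor shrinks the CTF data.
compression_ratio = ctf_uncompressed / ctf_compressed

# Compressed CTF relative to the code it describes.
ctf_vs_text = ctf_compressed / text_size

print(f"compression ratio: {compression_ratio:.2f} to 1")
print(f".ctf as a share of .text: {ctf_vs_text:.1%}")
```

That comes out a bit above 3 to 1 for compression and a bit above 11%
of .text for the compressed section.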