From: Paul Pluzhnikov <ppluzhnikov@google.com>
To: Daniel Jacobowitz <drow@false.org>
Cc: Tom Tromey <tromey@redhat.com>,
Cary Coutant <ccoutant@google.com>,
Dodji Seketeli <dodji@redhat.com>,
"GDB/Archer list" <archer@sourceware.org>
Subject: Re: [RFC] Proposal for a new DWARF name index section
Date: Thu, 03 Dec 2009 01:46:00 -0000
Message-ID: <8ac60eac0912021746g3cc9b543j1b175cf80b433705@mail.gmail.com>
In-Reply-To: <20091202193852.GA23631@caradoc.them.org>
On Wed, Dec 2, 2009 at 11:38 AM, Daniel Jacobowitz <drow@false.org> wrote:
> Well, inherent in the cache approach (IMO) is a system-provided cache;
> for installed libraries, the cache data could be added to a debuginfo
> file. Of course, that assumes GDB's format stays "relatively stable"
> across GDB updates.
FWIW, I've used the following approach on a previous product X:
- When a new binary is detected, a copy of X is invoked to parse all
the needed debug info into internal form and write it to a cache file.
- Once the copy exits, the cache file is directly mmap()ed by X.
- Cache files older than 1 week, and cache files prepared from
binaries which no longer exist in their original location, are
pruned to keep cache size down.
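A minimal sketch of the pruning rule above, in Python. The `entries` mapping (cache file path to original binary path) and the name `prune_cache` are assumptions made for illustration; this is not how X actually stored its bookkeeping.

```python
import os
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def prune_cache(entries, now=None, max_age=WEEK_SECONDS):
    """Delete cache files that are too old or whose binary has disappeared.

    `entries` is a hypothetical index mapping cache file path -> path of the
    binary the cache was built from. It is updated in place.
    Returns the sorted list of cache files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for cache_path, binary_path in list(entries.items()):
        try:
            age = now - os.stat(cache_path).st_mtime
        except FileNotFoundError:
            # Cache file already gone: just forget about it.
            del entries[cache_path]
            continue
        # Prune on either condition: stale cache, or binary no longer
        # present at its original location.
        if age > max_age or not os.path.exists(binary_path):
            os.remove(cache_path)
            del entries[cache_path]
            removed.append(cache_path)
    return sorted(removed)
```

Running something like this once at startup (or from a periodic job) would be enough to keep the cache directory bounded.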
The cache file contains the version of X, so when a new version of X
is shipped, the cache is automatically rebuilt.
It also contains path/timestamp/inode/size for the target binary,
so if e.g. one of the shared libs has been rebuilt since the last run,
only that one shared library must be re-processed.
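The validity check described above could look roughly like the following Python sketch. `CacheHeader`, `header_for`, `cache_is_valid`, `stale_binaries` and `TOOL_VERSION` are hypothetical names, and the real cache was a binary mmap()ed file rather than Python objects; only the comparison logic is illustrated.

```python
import os
from dataclasses import dataclass

TOOL_VERSION = "1.0"  # hypothetical version string of the tool ("X")


@dataclass(frozen=True)
class CacheHeader:
    """Identity of the tool and binary a cache file was built from."""
    tool_version: str
    path: str
    mtime_ns: int
    inode: int
    size: int


def header_for(path, tool_version=TOOL_VERSION):
    """Build the header that a cache for `path` would carry right now."""
    st = os.stat(path)
    return CacheHeader(tool_version, os.path.abspath(path),
                       st.st_mtime_ns, st.st_ino, st.st_size)


def cache_is_valid(header, tool_version=TOOL_VERSION):
    """A cache is usable only if tool version and binary identity all match."""
    try:
        return header_for(header.path, tool_version) == header
    except FileNotFoundError:
        return False


def stale_binaries(headers, tool_version=TOOL_VERSION):
    """Paths whose caches must be rebuilt; all other caches are reused as-is."""
    return [h.path for h in headers if not cache_is_valid(h, tool_version)]
```

With 1000+ shared libraries, `stale_binaries` would typically return only the handful that were rebuilt since the last run.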
This trades startup speed against disk space, and disk is usually
very cheap now.
One of our typical usage scenarios is a tiny executable linked with
1000+ C++ shared libraries. Simply re-running the test a second time
in a row in GDB takes 1+ minutes, as GDB discards and re-reads the
debug info for each solib (it used to take 6+ minutes before my dwarf
mmap changes).
The major CPU consumers in my tests are now:
CPU: AMD64 processors, speed 2200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a
unit mask of 0x00 (No unit mask) count 100000
samples  %       symbol name
43092    8.2847  read_partial_die
38243    7.3525  strcmp_iw_ordered
36744    7.0643  read_attribute_value
28887    5.5537  cpname_parse
28849    5.5464  d_print_comp
27731    5.3315  htab_hash_string
21975    4.2248  cp_canonicalize_string
20736    3.9866  load_partial_dies
18098    3.4795  cpname_lex
15649    3.0086  lookup_minimal_symbol
15156    2.9138  msymbol_hash_iw
14185    2.7272  htab_find_slot_with_hash
I am guessing that a GDB cache of pre-canonicalized strings would
save a *lot* of CPU under this scenario, and there is no reason
you can't put any other indices into the cache. Nor is there any
need for a stable cache file format -- a newer version of GDB will
simply rebuild what it needs on demand.
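To illustrate the idea of a persistent cache of pre-canonicalized strings, here is a rough Python sketch. Everything in it is an assumption: `slow_canonicalize` is only a stand-in (it collapses whitespace and is not GDB's canonicalization), and the JSON file format and the `CanonicalNameCache` class are made up for the example.

```python
import json
import os


def slow_canonicalize(name):
    """Stand-in for an expensive canonicalizer (NOT GDB's algorithm).

    It only collapses runs of whitespace, which is enough to show the caching.
    """
    return " ".join(name.split())


class CanonicalNameCache:
    """Maps raw symbol names to canonical forms and persists the table."""

    def __init__(self, path, canonicalize=slow_canonicalize):
        self.path = path
        self.canonicalize = canonicalize
        self.misses = 0  # number of names that had to be canonicalized
        self.table = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.table = json.load(f)

    def lookup(self, name):
        # Only pay the canonicalization cost for names never seen before.
        if name not in self.table:
            self.misses += 1
            self.table[name] = self.canonicalize(name)
        return self.table[name]

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.table, f)
```

On a second run against the same file, every lookup for a previously seen name is a plain table hit, so the canonicalizer is not called for it at all.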
--
Paul Pluzhnikov