Initial psymtab replacement results


* Initial psymtab replacement results
@ 2009-12-10 21:54 Tom Tromey
  2009-12-11 23:10 ` Tom Tromey
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-10 21:54 UTC (permalink / raw)
  To: Project Archer

This week I added a new command to gdb to let me dump the
.debug_gnu_index section from psymtabs.  This way I was able to
instrument the OO.o .debug files without recompiling the world.  (If you
want to try this out, let me know.  It required some obscure and
probably bogus BFD hacks to make objcopy not barf when splicing the new
section into the .debug files.)

The summary is, this approach seems solid.

Here are timings for a simple "attach" to a running OO.o writer, with
full debuginfo installed.  These are representative figures, but there
is some noise.

GDB        Time
===        ===================================
CVS        26.05user 2.53system 1:08.86elapsed
F11         5.29user 1.09system 0:22.01elapsed
Branch     10.64user 0.81system 0:22.63elapsed
Threaded   11.26user 0.87system 0:19.16elapsed

(The F11 gdb includes the delayed-symfile branch.)

I can shave another 5 seconds off the new branch by assuming that the
.debug_gnu_index section has canonicalized C++ names.  This is
significant, but I think it isn't a deal-breaker.

I don't have an explanation for why reading the index in a background
thread isn't faster than the above.  I didn't look into it yet.

The other test is "thread apply all bt full".  This requires a lot of
symbol reading and is the bane of the delayed-symfile branch.  These
numbers come from "maint time 1" in gdb.

GDB        Time
===        =========
CVS         1.824723
F11        27.235860
Branch      1.695742

I didn't time this with the threaded branch but I would expect numbers
similar to the branch.  The threading only helps hide reading the index,
nothing else.

Future work I'm planning on this branch:

* Test out using pubnames + pubtypes instead of the new section.
  I remain skeptical about this but it seems important to be sure.
* Fix the remaining bugs.  I still haven't implemented the needed Ada
  hook.
* Change gdb to an "expand first" model.  Right now it will search
  symtabs and only then expand psymtabs looking for a match.  This
  yields inconsistent behavior, depending on what was already read.
  This task would change gdb to expand matching psymtabs (or CUs) first,
  then do symtab searching.  I may do this as a completely separate
  change (I can't remember offhand why I thought I needed this change
  for this particular project).
* Investigate the threading stuff some more.  This is the lowest
  priority.
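
To make the ordering problem in the "expand first" item concrete, here
is a minimal Python sketch.  It is not gdb code: the CU and Objfile
classes and both lookup functions are hypothetical stand-ins, and the
"index" set merely plays the role of a psymtab.

```python
from dataclasses import dataclass, field


@dataclass
class CU:
    name: str
    symbols: dict          # symbol name -> definition, usable once expanded
    index: set             # cheap name index (the psymtab stand-in)
    expanded: bool = False


@dataclass
class Objfile:
    cus: list = field(default_factory=list)

    def expand(self, cu):
        cu.expanded = True

    def lookup_search_first(self, name):
        # Current model: consult already-expanded symtabs first...
        for cu in self.cus:
            if cu.expanded and name in cu.symbols:
                return cu.symbols[name]
        # ...and only then expand a CU whose index mentions the name.
        for cu in self.cus:
            if not cu.expanded and name in cu.index:
                self.expand(cu)
                return cu.symbols[name]
        return None

    def lookup_expand_first(self, name):
        # Proposed model: expand every matching CU, then search in a
        # fixed order, so the answer does not depend on history.
        for cu in self.cus:
            if name in cu.index:
                self.expand(cu)
        for cu in self.cus:
            if cu.expanded and name in cu.symbols:
                return cu.symbols[name]
        return None
```

With two CUs that both define "foo", the search-first lookup returns
whichever definition happens to be expanded already, while the
expand-first lookup returns the same one every time.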

Your comments are welcome.

Tom


* Re: Initial psymtab replacement results
  2009-12-10 21:54 Initial psymtab replacement results Tom Tromey
@ 2009-12-11 23:10 ` Tom Tromey
  2009-12-11 23:59   ` Daniel Jacobowitz
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-11 23:10 UTC (permalink / raw)
  To: Project Archer

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> * Test out using pubnames + pubtypes instead of the new section.
Tom>   I remain skeptical about this but it seems important to be sure.

I thought of a way to approximate the performance of this without
actually implementing it.

According to my theory, the most important thing is to avoid reading
.debug_info (et al) during startup.  And, we already know that using
.debug_pub* will require reading the CU DIE.  So, I simply modified gdb
to always read the CU DIE.

GDB        Cold Cache        Warm Cache
=======================================
baseline   1:24              0:56
branch     0:45              0:19
DIE        0:54              0:21

So... the dedicated index is worth about a 15% speedup over always
reading the CU DIE in the cold cache case (0:45 versus 0:54).  Still, it
is hard to know whether this is justified in terms of the overall
effort.  OO.o is a pretty big program, and saving roughly 10 seconds
doesn't exactly give the user that "wow" factor.

I'm leaning toward ditching the .debug_gnu_index idea and simply going
with the standard sections instead.  Let me know what you think of this.

As far as I know the .debug_pub* approach is ok assuming (1) we take
Cary's suggestion of changing gcc to put everything in it, and (2) parse
the producer to see when the sections are usable.

Jan mentioned on irc yesterday that we could probably do better by
avoiding reading any debuginfo at all during startup.  For the "attach"
case we could additionally use info from the inferior to determine which
objfile the PC is in -- so we would not have to read anything from most
objfiles.  (Perhaps in conjunction with lazily reading minimal symbols.)

The possible issue with this is latency.  This is what bit us with the
delayed-symfile branch: we could hide the initial work easily, but it
was (very) noticeable later, because gdb would pause for a long time in
response to some user command.

To look at this, I instrumented gdb to print the amount of time taken to
read the index for each objfile.  There were 170 objfiles with debuginfo
in OO.o, with a median reading time of 7999 microseconds, a mean of
30277 microseconds, and a maximum of 641903 microseconds.

This indicates to me that the latency problem would not be severe for
the PC-lookup case.
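
A rough Python sketch of the PC-lookup idea, under stated assumptions:
the address mappings are taken to be non-overlapping and already known
(for instance from the inferior), and read_index is a hypothetical
per-objfile index reader.  None of these names come from gdb.

```python
import bisect


class LazyObjfiles:
    """Map a PC to its objfile and load that objfile's index on demand."""

    def __init__(self, mappings, read_index):
        # mappings: iterable of (start, end, objfile_name), end exclusive,
        # assumed non-overlapping.
        self._maps = sorted(mappings)
        self._starts = [m[0] for m in self._maps]
        self._read_index = read_index   # hypothetical index reader
        self._loaded = {}

    def objfile_for_pc(self, pc):
        # Find the last mapping that starts at or before pc.
        i = bisect.bisect_right(self._starts, pc) - 1
        if i < 0:
            return None
        start, end, name = self._maps[i]
        return name if pc < end else None

    def index_for_pc(self, pc):
        name = self.objfile_for_pc(pc)
        if name is None:
            return None
        if name not in self._loaded:
            # Pay the read cost once, on first use of this objfile.
            self._loaded[name] = self._read_index(name)
        return self._loaded[name]
```

Only the objfile that contains the PC is ever read, so the cost of a
lookup is bounded by the reading time of a single index.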

However, I think it would be just as bad as delayed-symfile for the
name-lookup case, because name-lookup doesn't let us pick the needed
objfile a priori.  Perhaps this is an area where the threaded branch can
help.

Tom


* Re: Initial psymtab replacement results
  2009-12-11 23:10 ` Tom Tromey
@ 2009-12-11 23:59   ` Daniel Jacobowitz
  2009-12-14 22:40     ` Tom Tromey
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-11 23:59 UTC (permalink / raw)
  To: archer

On Fri, Dec 11, 2009 at 04:09:49PM -0700, Tom Tromey wrote:
> I'm leaning toward ditching the .debug_gnu_index idea and simply going
> with the standard sections instead.  Let me know what you think of this.
> 
> As far as I know the .debug_pub* approach is ok assuming (1) we take
> Cary's suggestion of changing gcc to put everything in it, and (2) parse
> the producer to see when the sections are usable.

I want to get my comments on this into email, since IRC isn't a
productive tool for this sort of discussion.  This is pretty much
the same as what I said on IRC yesterday.

I still strongly support the idea of a debugger-specific cache,
for the reasons below.  There's prior art (according to Paul P.); we
know this approach is practical.

* It is simple to version and extend.  If the cache is too old, either
use what is there or ignore the cache.  I believe that DWARF producer
checks, while pragmatically necessary to work around some bugs, should
be avoided whenever possible.  And who knows what compiler versions
end up with this backported to them.

* It is still simple to preseed at distribution time.  For instance,
merge the cache generation process with whatever you use to generate
separate debug info files, and put a cache there - with associated
version, so that GDB knows what is and isn't in that cache.  No need
to maintain a local cache in $HOME if there's already one in
/usr/lib/debug.

* You've already had to make changes to GCC for this project.  If you
have to change GCC again, there are more users out there with even more
backported and hacked-up compilers, and you have no idea whether
pubtypes is complete enough in them.

* Rolling out new compilers to people who want a new debugger may be
easy for Fedora, but it just ain't so in the rest of the world.  We
routinely get requests from customers who want to upgrade their
debugger and continue to use the compiler they've validated for the
past year.  Changing the debugger is cheaper than changing the
compiler, even if the work is already "out there" in trunk.

* We also have customers using non-GCC compilers.  You have users
doing this too: icc, for instance.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-11 23:59   ` Daniel Jacobowitz
@ 2009-12-14 22:40     ` Tom Tromey
  2009-12-14 23:09       ` Daniel Jacobowitz
  2009-12-15  1:04       ` Roland McGrath
  0 siblings, 2 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-14 22:40 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> I still strongly support the idea of a debugger-specific cache,
Daniel> for the reasons below.  There's prior art (according to Paul P.); we
Daniel> know this approach is practical.

Thanks for replying.

If we look at this as a gdb-specific format, then I suspect that is
actually a harder sell for Fedora.  It means more space and another
post-processing pass when making the debuginfo RPMs.

I'd hope this is surmountable, but it would be preferable to hide the
metadata processing somehow, because then we can involve fewer parties.
Having gcc generate it in a .debug section achieves this pretty nicely,
especially as gcc already generates these sections (more or less).

The other big drawback I see with the caching approach is that it means
that the first time will still be slow -- even with a system-wide cache
this will be true for the objfiles that a user changes during
development.

I understand the compiler problem.  If we had a program to rewrite the
appropriate DWARF sections, would that address the problems you have?
It seems to me that it would.

FWIW if we were going to do our own cache, I wouldn't put it in a form
like .debug_gnu_index or .debug_pub*.  I'd just have gdb write out a
mappable data structure.

One definite positive about the branch is that these changes are a lot
simpler now.  The psymtab stuff is mostly isolated, and writing a new
"back end" is reasonably self-contained.

Tom


* Re: Initial psymtab replacement results
  2009-12-14 22:40     ` Tom Tromey
@ 2009-12-14 23:09       ` Daniel Jacobowitz
  2009-12-15 23:39         ` Tom Tromey
  2009-12-17 16:39         ` Paul Pluzhnikov
  2009-12-15  1:04       ` Roland McGrath
  1 sibling, 2 replies; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-14 23:09 UTC (permalink / raw)
  To: Tom Tromey; +Cc: archer

On Mon, Dec 14, 2009 at 03:39:50PM -0700, Tom Tromey wrote:
> I understand the compiler problem.  If we had a program to rewrite the
> appropriate DWARF sections, would that address the problems you have?
> It seems to me that it would.

I guess so; if you allow GDB to automatically invoke said program
(there is prior art for that, too) then it's pretty much identical.  I
still think that you will have long term maintenance problems with
this approach and it will cramp future desire to extend it or change
GDB.  But that's not a provable position.

> FWIW if we were going to do our own cache, I wouldn't put it in a form
> like .debug_gnu_index or .debug_pub*.  I'd just have gdb write out a
> mappable data structure.

Or you could drag another bit of GDB into this century, and use SQLite
or some other in-process database.  Mappable data structures are
tricky; one thing I'd definitely insist on is host neutrality.  IMO
that is not optional.

> One definite positive about the branch is that these changes are a lot
> simpler now.  The psymtab stuff is mostly isolated, and writing a new
> "back end" is reasonably self-contained.

This makes me very happy.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-14 23:09       ` Daniel Jacobowitz
@ 2009-12-15 23:39         ` Tom Tromey
  2009-12-16  3:01           ` Roland McGrath
                             ` (2 more replies)
  2009-12-17 16:39         ` Paul Pluzhnikov
  1 sibling, 3 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-15 23:39 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> Or you could drag another bit of GDB into this century, and use
Daniel> SQLite or some other in-process database.

I played with this a bit today.  In particular I changed my gdb to dump
an SQLite database from the psymtabs.  I used this schema:

CREATE TABLE IF NOT EXISTS index_version (version INTEGER);
CREATE TABLE IF NOT EXISTS objfile (name, mtime INTEGER, size INTEGER,
                inode INTEGER);
CREATE TABLE IF NOT EXISTS cus (offset INTEGER UNIQUE);
CREATE TABLE IF NOT EXISTS filenames (cu INTEGER REFERENCES cus (offset),
                name, is_primary INTEGER);
CREATE TABLE IF NOT EXISTS symbols (cu INTEGER REFERENCES cus (offset),
                name NOT NULL, psym_domain INTEGER, psym_class INTEGER,
                is_public INTEGER);
CREATE INDEX IF NOT EXISTS byname ON symbols (name);
CREATE TABLE IF NOT EXISTS addresses (cu INTEGER REFERENCES cus (offset),
                low INTEGER, high INTEGER);

The 'byname' index really slows down populating the database.  It took
more than a minute to write out the database for gdb.  However, without
the index, name lookups are extremely slow (as in, I can count to 2
seconds when running a select command in sqlite3).

Maybe I'm misusing SQLite somehow, I'll try to look at this a little
more.  I'm not an SQL wizard, so if I've done something weird, please
let me know, as I'm sure there was no good reason for it.
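
For anyone who wants to reproduce the experiment outside gdb, here is a
minimal Python sketch using the standard sqlite3 module and a trimmed
copy of the schema above.  It inserts all rows inside one transaction
and creates the 'byname' index afterwards.  Whether gdb's writer batched
its inserts this way is not stated in this thread, so treat that as
something to measure, not as a diagnosis.  The build_index and
cus_defining helpers and any sample rows are made up for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS cus (offset INTEGER UNIQUE);
CREATE TABLE IF NOT EXISTS symbols (cu INTEGER REFERENCES cus (offset),
                name NOT NULL, psym_domain INTEGER, psym_class INTEGER,
                is_public INTEGER);
"""


def build_index(path, cus, symbols):
    """Write CU offsets and symbol rows, then index symbols by name."""
    con = sqlite3.connect(path)
    con.executescript(SCHEMA)
    with con:  # one transaction for all inserts, committed on exit
        con.executemany("INSERT INTO cus VALUES (?)",
                        [(off,) for off in cus])
        con.executemany("INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
                        symbols)
    # Building the index after the bulk load is one ordering to try.
    con.execute("CREATE INDEX IF NOT EXISTS byname ON symbols (name)")
    con.commit()
    return con


def cus_defining(con, name):
    """Return the CU offsets whose symbols include `name`."""
    rows = con.execute(
        "SELECT DISTINCT cu FROM symbols WHERE name = ? ORDER BY cu",
        (name,))
    return [r[0] for r in rows]
```

Timing build_index with and without the CREATE INDEX statement, and
with the inserts split across many transactions, would show which part
of the cost comes from the index itself.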

The other issue is that the resulting database is very big.  For
example, the database for gdb is 72M, but the gdb executable itself is
119M.

I didn't write the reader side of this yet, but that won't be too bad.

I guess one idea would be to write the database in a background thread.
Or just not write it at all by default.

Daniel> Mappable data structures are tricky; one thing I'd definitely
Daniel> insist on is host neutrality.  IMO that is not optional.

Yeah, that does make it trickier.

I'm starting to get a bit discouraged by this project.  I think at this
point we've got strikes against all the ideas:

* .debug_pub*.  These require DWARF extensions and GCC bug fixes, don't
  work nicely with comdat or the other (planned) post-processors.

* .debug_gnu_index.  Pretty much the same problems except that we're
  also inventing it ourselves.

* Mappable data structure.  A pain to make host-independent; and I
  suspect that would kill performance.

* SQLite.  Too big and too slow to create.

I suppose we could write out the equivalent of .debug_gnu_index, only
not as an ELF section, and not as a mappable data structure.  We already
know that will perform adequately.  This won't meet all of my goals but
it would definitely help some use cases.

Maybe I can add code to do a psymtab-like scan of .debug_info in a
background thread.  That might make "gdb gdb" feel faster.

I think we ought to change GCC to drop the .debug_pub* sections (and
maybe .debug_aranges), at least on Linux.  AFAIK they aren't used by
anything, and indeed are barely usable due to historic bugs -- so they
are just wasting time and space.

Let me know what you think,
Tom


* Re: Initial psymtab replacement results
  2009-12-15 23:39         ` Tom Tromey
@ 2009-12-16  3:01           ` Roland McGrath
  2009-12-16 18:20             ` Tom Tromey
  2009-12-16 18:11           ` Daniel Jacobowitz
  2009-12-23 18:29           ` Tom Tromey
  2 siblings, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2009-12-16  3:01 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, archer

> I think we ought to change GCC to drop the .debug_pub* sections (and
> maybe .debug_aranges), at least on Linux.  AFAIK they aren't used by
> anything, and indeed are barely usable due to historic bugs -- so they
> are just wasting time and space.

A correct .debug_aranges is trivially deduced from the .debug_info data.
So any DWARF rewriter can (and will) easily emit this from scratch, and
emit it correctly, perhaps consolidating the uselessly many entries for
the same CU with adjacent address ranges that just bend around a zillion
2-byte section alignment holes.  (Perhaps there is something easy and kosher
to do that GDB could use as an indicator of a .debug_aranges worth using.)

With .debug_pub* it has always been my impression that the extraction
from the DIE tree is not fully generic.  That is, it at least assumes
some language-specific knowledge to construct foo::bar names and the
like.  I've never been entirely clear on if it's even entirely knowable
just from very simple language knowledge as opposed to encoding some
choices or extra knowledge the compiler had.  As a hacker of supposedly
fully-generic DWARF tools, I am much happier with the idea of index
sections that are described purely in terms of a generic extraction
from the DIE trees without regard to any knowledge outside the DWARF
spec proper.

Thanks,
Roland


* Re: Initial psymtab replacement results
  2009-12-16  3:01           ` Roland McGrath
@ 2009-12-16 18:20             ` Tom Tromey
  2009-12-16 18:57               ` Roland McGrath
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-16 18:20 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Daniel Jacobowitz, archer

>>>>> "Roland" == Roland McGrath <roland@redhat.com> writes:

Roland> (Perhaps there is something easy and kosher to do that GDB could
Roland> use as an indicator of a .debug_aranges worth using.)

I don't think so, or at least, I couldn't think of anything.  Our
current branches, including F12, check for a known GCC bug, but
otherwise assume that if aranges exists, it is correct.

There's also an issue with knowing whether it is actually complete; I
didn't think of this until relatively recently:
    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42288

Roland> With .debug_pub* it has always been my impression that the extraction
Roland> from the DIE tree is not fully generic.  That is, it at least assumes
Roland> some language-specific knowledge to construct foo::bar names and the
Roland> like.  I've never been entirely clear on if it's even entirely knowable
Roland> just from very simple language knowledge as opposed to encoding some
Roland> choices or extra knowledge the compiler had.

I think this question is related to Keith's work on avoiding
DW_AT_MIPS_linkage_name.  I think the current answer is that there are
some known missing DWARF features relating to C++.  However, Keith and
Dodji have worked on GNU extensions for at least some of these.

Tom


* Re: Initial psymtab replacement results
  2009-12-16 18:20             ` Tom Tromey
@ 2009-12-16 18:57               ` Roland McGrath
  2009-12-16 19:46                 ` Tom Tromey
  0 siblings, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2009-12-16 18:57 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, archer

> I don't think so, or at least, I couldn't think of anything.  Our
> current branches, including F12, check for a known GCC bug, but
> otherwise assume that if aranges exists, it is correct.

FWIW, elfutils (libdw) does use it and rely on its correctness for
address->CU lookups.  But that direction of lookup is only used in
eu-addr2line and in systemtap's rarely-used numeric address probe syntax.

> There's also an issue with knowing whether it is actually complete; I
> didn't think of this until relatively recently:
>     http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42288

I think we discussed this before and I've forgotten again why that issue
matters.  In the status quo there is a .debug_aranges hunk for each CU from
the beginning of the existence of the corresponding .debug_info hunk at
compile time, and that can never be "stripped" except by strip et al
that remove .debug_info along with it.

> I think this question is related to Keith's work on avoiding
> DW_AT_MIPS_linkage_name.  I think the current answer is that there are
> some known missing DWARF features relating to C++.  However, Keith and
> Dodji have worked on GNU extensions for at least some of these.

I take this to mean a verification of my supposition that .debug_pub*
indeed are not a generic extraction from the DIE tree as things stand.

Thanks,
Roland


* Re: Initial psymtab replacement results
  2009-12-16 18:57               ` Roland McGrath
@ 2009-12-16 19:46                 ` Tom Tromey
  2009-12-16 19:52                   ` Roland McGrath
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-16 19:46 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Daniel Jacobowitz, archer

>>>>> "Roland" == Roland McGrath <roland@redhat.com> writes:

Tom> There's also an issue with knowing whether it is actually complete; I
Tom> didn't think of this until relatively recently:
Tom> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42288

Roland> I think we discussed this before and I've forgotten again why
Roland> that issue matters.  In the status quo there is a .debug_aranges
Roland> hunk for each CU from the beginning of the existence of the
Roland> corresponding .debug_info hunk at compile time, and that can
Roland> never be "stripped" except by strip et al that remove
Roland> .debug_info along with it.

Nothing requires a compiler to emit .debug_aranges for a CU.  It is an
optional index, at least by my reading:

    [6.1 Accelerated Access]
    ... a producer of DWARF information may provide...

So, if there is no aranges entry for a given CU, there is no way to tell
whether the CU has no addressable content, or whether the entry was
simply never created.

This is not an issue if you are willing to assume that the user is using
GCC, because AFAIK (modulo the bug we found) GCC always emits this with
-g.

GDB aspires to be more defensive, though.  So, on my current branch, if
GDB notices a missing aranges entry, it reads full symbols for the CU
just in case.  This triggers a number of times in OO.o.
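
The defensive rule can be sketched in a few lines of Python.  This is an
illustration only: plan_cu_reads and its argument shapes are
hypothetical, not what the branch actually does internally.

```python
def plan_cu_reads(all_cus, aranges):
    """Split CUs into those covered by aranges and those needing a full read.

    all_cus: iterable of CU offsets found in .debug_info.
    aranges: dict mapping CU offset -> list of (low, high) address ranges.
    """
    indexed, read_fully = {}, []
    for cu in all_cus:
        if cu in aranges:
            indexed[cu] = aranges[cu]
        else:
            # No entry: either the CU has no addressable content or the
            # producer never emitted one.  We cannot tell which, so the
            # CU is read in full.
            read_fully.append(cu)
    return indexed, read_fully
```

Every CU that lands in the second list costs a full symbol read, which
is why a producer that reliably emits aranges matters for startup time.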

The PR in question is suspended because I told rth that it wasn't clear
we would even be using aranges.

Overall this is fairly minor.  It isn't likely to affect Fedora.  I have
no idea what, if anything, other gdb maintainers might say about it.
Maybe we can just ignore it.

This issue affects .debug_pub* as well.

Tom


* Re: Initial psymtab replacement results
  2009-12-16 19:46                 ` Tom Tromey
@ 2009-12-16 19:52                   ` Roland McGrath
  0 siblings, 0 replies; 26+ messages in thread
From: Roland McGrath @ 2009-12-16 19:52 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, archer

> Nothing requires a compiler to emit .debug_aranges for a CU.  It is an
> optional index, at least by my reading:

Right.  I guess I didn't consider the issue of linking together things
built with different compilers.

> This issue affects .debug_pub* as well.

Right.  For all such things this suggests some utility in having headers or
something to identify "I was emitted at/after link time" so that any
post-processor (or equivalent DWARF-savvy linker) would indicate when it
had wiped away such concerns.

Thanks,
Roland


* Re: Initial psymtab replacement results
  2009-12-15 23:39         ` Tom Tromey
  2009-12-16  3:01           ` Roland McGrath
@ 2009-12-16 18:11           ` Daniel Jacobowitz
  2009-12-16 19:31             ` Tom Tromey
  2009-12-23 18:29           ` Tom Tromey
  2 siblings, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-16 18:11 UTC (permalink / raw)
  To: Tom Tromey; +Cc: archer

I've been thinking about this.  You've done some nice work already, so
even though I have expressed concern about it I really want it to end
up merged and useful.

On Tue, Dec 15, 2009 at 04:38:58PM -0700, Tom Tromey wrote:
> The 'byname' index really slows down populating the database.  It took
> more than a minute to write out the database for gdb.  However, without
> the index, name lookups are extremely slow (as in, I can count to 2
> seconds when running a select command in sqlite3).

Did you create the index, then populate the table?  Is it quicker to
populate the table, and then create the index?

> Daniel> Mappable data structures are tricky; one thing I'd definitely
> Daniel> insist on is host neutrality.  IMO that is not optional.
> 
> Yeah, that does make it trickier.

Frank made a good point about putting host characteristics in the
cache key.  By careful choice of the types stored, we should be able
to create a mapped data structure that is in practice dependent only
on endianness and maybe pointer size.  WDYT?

If it's a pointer-swizzled index (i.e. update pointer offsets when
writing and reading), then we'd have to read the whole index into
memory instead of mmapping it; but in exchange we'd get to use zlib
when streaming the index to disk, and the data is probably highly
compressible.  I don't know which of these turns out to be faster
in real-world scenarios (cold and hot cache).
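
As an illustration of the stream-and-rebuild side of that trade-off,
here is a small Python sketch that writes a name-to-CU table in a fixed
little-endian layout, compresses it with zlib, and rebuilds it in
memory on load.  The layout and the dump_index/load_index names are
invented for this example and are not a proposed gdb format.

```python
import struct
import zlib


def dump_index(index):
    """Serialize {symbol name: [CU offsets]} and compress it with zlib."""
    out = bytearray()
    out += struct.pack("<I", len(index))
    for name, cus in sorted(index.items()):
        raw = name.encode("utf-8")
        out += struct.pack("<II", len(raw), len(cus))
        out += raw
        out += struct.pack("<%dI" % len(cus), *cus)
    return zlib.compress(bytes(out))


def load_index(blob):
    """Decompress and rebuild the dict; the whole index ends up in memory."""
    data = zlib.decompress(blob)
    (count,) = struct.unpack_from("<I", data, 0)
    pos = 4
    index = {}
    for _ in range(count):
        name_len, n_cus = struct.unpack_from("<II", data, pos)
        pos += 8
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        cus = list(struct.unpack_from("<%dI" % n_cus, data, pos))
        pos += 4 * n_cus
        index[name] = cus
    return index
```

The reader has to touch every byte before the first lookup, which is
the cost being weighed against the smaller file on disk.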

I know you've done a lot of work to kill psymtabs.  Do we populate
psymtabs from the index, or are they pretty much optional now?  In
other words, can we reclaim and reuse the memory formerly spent on
psymtabs?

> I think we ought to change GCC to drop the .debug_pub* sections (and
> maybe .debug_aranges), at least on Linux.  AFAIK they aren't used by
> anything, and indeed are barely usable due to historic bugs -- so they
> are just wasting time and space.

I don't know whether third party debuggers use the existing
.debug_pubnames.  I wouldn't be surprised either way.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-16 18:11           ` Daniel Jacobowitz
@ 2009-12-16 19:31             ` Tom Tromey
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-16 19:31 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> Did you create the index, then populate the table?  Is it quicker to
Daniel> populate the table, and then create the index?

I did both, they are both slow.

Originally I didn't have an index and when I was playing with the SQLite
shell I noticed searches were slow.  So, I made an index -- which was
very slow to create in the shell.

Then I thought that maybe making the index before populating the table
would be faster.  I made that change to gdb, but it was still quite
slow.

Another idea I have is to make a new column holding a hash code, and not
use an index; or maybe use that for the index (indexing on an integer
column may be faster).
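
A minimal sketch of the hash-column variant, using Python's sqlite3
module.  The name_hash column, the choice of crc32, and the make_db and
lookup helpers are all assumptions made for illustration; nothing here
has been measured against the 'byname' index.

```python
import sqlite3
import zlib


def name_hash(name):
    # crc32 is an arbitrary stand-in; any stable integer hash would do.
    return zlib.crc32(name.encode("utf-8"))


def make_db(symbols):
    """symbols: iterable of (cu_offset, name) pairs."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE symbols "
                "(cu INTEGER, name NOT NULL, name_hash INTEGER)")
    with con:  # single transaction for the bulk insert
        con.executemany("INSERT INTO symbols VALUES (?, ?, ?)",
                        [(cu, n, name_hash(n)) for cu, n in symbols])
    con.execute("CREATE INDEX byhash ON symbols (name_hash)")
    return con


def lookup(con, name):
    # Filter on the integer column first, then confirm the name to
    # reject hash collisions.
    rows = con.execute(
        "SELECT cu FROM symbols WHERE name_hash = ? AND name = ? "
        "ORDER BY cu",
        (name_hash(name), name))
    return [r[0] for r in rows]
```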

I was experimenting just now, and removing the "CREATE INDEX" and
changing the schema to mark symbols.name as "PRIMARY KEY" made database
creation much faster -- for gdb, down from 60 seconds to 19.  I still
think that is too slow though.

Daniel> Frank made a good point about putting host characteristics in the
Daniel> cache key.  By careful choice of the types stored, we should be able
Daniel> to create a mapped data structure that is in practice dependent only
Daniel> on endianness and maybe pointer size.  WDYT?

Yeah, I may give that a try.

Daniel> I know you've done a lot of work to kill psymtabs.  Do we populate
Daniel> psymtabs from the index, or are they pretty much optional now?  In
Daniel> other words, can we reclaim and reuse the memory formerly spent on
Daniel> psymtabs?

What I did was introduce a new struct of function pointers, alongside
struct sym_fns.  This provides an abstraction that replaces direct uses
of partial symbols.  The API "design" is completely ad hoc, based on
what previously existed.  So, it is rather weird and large; e.g., it has
a special function just for Ada, because ada-lang.c directly examines
psymtabs.

Then I moved all the uses of partial symbols into a new file, psymtab.c,
and made a new rule: only psymtab.c and the debuginfo readers are
allowed to directly manipulate these data structures.

Finally, I changed dwarf2read.c to have a separate implementation of
these functions, and to use its own indexing data structures.
dwarf2read now decides per-objfile whether to use partial symbols or the
new code.
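
The shape of that abstraction, rendered as a Python sketch.  gdb itself
uses a C struct of function pointers; the class and method names below
are hypothetical and cover only two operations, far fewer than the real
interface.

```python
from abc import ABC, abstractmethod


class SymbolLookupBackend(ABC):
    """Operations callers use instead of touching psymtabs directly."""

    @abstractmethod
    def cus_for_name(self, name): ...

    @abstractmethod
    def cu_for_pc(self, pc): ...


class PsymtabBackend(SymbolLookupBackend):
    def __init__(self, psymtabs):
        # psymtabs: list of (cu, low, high, names) from scanning debuginfo
        self._psymtabs = psymtabs

    def cus_for_name(self, name):
        return [cu for cu, _, _, names in self._psymtabs if name in names]

    def cu_for_pc(self, pc):
        for cu, low, high, _ in self._psymtabs:
            if low <= pc < high:
                return cu
        return None


class IndexBackend(SymbolLookupBackend):
    def __init__(self, by_name, ranges):
        self._by_name = by_name     # name -> list of CUs
        self._ranges = ranges       # list of (low, high, cu)

    def cus_for_name(self, name):
        return list(self._by_name.get(name, []))

    def cu_for_pc(self, pc):
        for low, high, cu in self._ranges:
            if low <= pc < high:
                return cu
        return None


def backend_for(objfile):
    """Pick a backend per objfile: the index if present, else psymtabs."""
    if objfile.get("index") is not None:
        idx = objfile["index"]
        return IndexBackend(idx["by_name"], idx["ranges"])
    return PsymtabBackend(objfile["psymtabs"])
```

Callers only ever see the common interface, so a new "back end" can be
added without touching them.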

I did all this because I did not think it was possible to really create
psymbols from the DWARF indices.

This approach saves a bit of memory when using the index.  I don't have
numbers handy but my recollection is that the savings isn't very
dramatic.

I have considered modifying dwarf2read to create "new-style" data
structures when the indices are not available.  I haven't implemented
this yet, though, because it is more work and the payoff doesn't seem to
be huge.

The new code could free some memory whenever it reads full symbols for a
CU.  I haven't implemented that yet.

Finally, with "-readnow", dwarf2read no longer reads partial symbols or
the indices; it skips directly to just reading everything.  I only did
this because it was easy to implement; I actually consider -readnow to
be fairly useless.

Another idea I have is to change the threaded-dwarf branch to read
psymtabs in the background thread.  This isn't too terribly hard, now
that psymtabs are fully segregated.

Tom


* Re: Initial psymtab replacement results
  2009-12-15 23:39         ` Tom Tromey
  2009-12-16  3:01           ` Roland McGrath
  2009-12-16 18:11           ` Daniel Jacobowitz
@ 2009-12-23 18:29           ` Tom Tromey
  2009-12-23 18:35             ` Daniel Jacobowitz
  2009-12-24 17:07             ` Tom Tromey
  2 siblings, 2 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-23 18:29 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> The 'byname' index really slows down populating the database.  It took
Tom> more than a minute to write out the database for gdb.
[ ... later reduced to 20 seconds ... ]

Tom> The other issue is that the resulting database is very big.  For
Tom> example, the database for gdb is 72M, but the gdb executable itself is
Tom> 119M.

This week I changed gdb to write out a directly-mmappable index.
Writing the index for gdb takes 2.5 seconds, and the resulting file is
19M.

I think that is more reasonable -- still not super, but livable.

I haven't written the reader side of this yet.  I probably won't finish
that until sometime in January.

Tom


* Re: Initial psymtab replacement results
  2009-12-23 18:29           ` Tom Tromey
@ 2009-12-23 18:35             ` Daniel Jacobowitz
  2009-12-24 17:07             ` Tom Tromey
  1 sibling, 0 replies; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-23 18:35 UTC (permalink / raw)
  To: Tom Tromey; +Cc: archer

On Wed, Dec 23, 2009 at 11:29:10AM -0700, Tom Tromey wrote:
> This week I changed gdb to write out a directly-mmappable index.
> Writing the index for gdb takes 2.5 seconds, and the resulting file is
> 19M.
> 
> I think that is more reasonable -- still not super, but livable.

Actually, I think it's pretty good!  It's about 3x the size of
.debug_info, which uses abbrevs and ulebs for compression.

Some of the 2.5s will be leveraged in the same session since we can
then use the index.  It's still a big hit, but maybe we can background
it or something...

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-23 18:29           ` Tom Tromey
  2009-12-23 18:35             ` Daniel Jacobowitz
@ 2009-12-24 17:07             ` Tom Tromey
  2010-01-06 23:05               ` Tom Tromey
  1 sibling, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-24 17:07 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> I haven't written the reader side of this yet.  I probably won't finish
Tom> that until sometime in January.

Actually, I finished it yesterday and got preliminary results:

With index:

opsy. /usr/bin/time ./gdb/gdb -batch ./gdb/gdb
0.21user 0.03system 0:00.30elapsed 82%CPU (0avgtext+0avgdata 0maxresident)k
8inputs+32outputs (0major+3760minor)pagefaults 0swaps

Without:

opsy. /usr/bin/time ./gdb/gdb -batch ./gdb/gdb
3.16user 0.09system 0:03.82elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
8inputs+32outputs (0major+10562minor)pagefaults 0swaps

This is with a warm cache; I haven't found the time to try it with a
cold cache yet.

I just picked a size for all the offsets (32 bits) and an endianness
(little) for the index.  I figure big-endian hosts can byteswap when
using the index, that might not be too bad in practice.
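A minimal sketch of byte-order-neutral reading, assuming the index is a mapped byte buffer holding 32-bit little-endian offsets as described above. The names read_le32 and index_offset_at are hypothetical and are not taken from the branch. Assembling the value byte by byte avoids an explicit byteswap step and gives the same result on either kind of host:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from a byte buffer.
   The value is assembled byte by byte, so the result is the same
   on little-endian and big-endian hosts.  */
static uint32_t
read_le32 (const unsigned char *p)
{
  assert (p != NULL);
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* Fetch the N-th 32-bit offset from a table inside the mapped index.
   TABLE_START and N are hypothetical names used only in this sketch.  */
static uint32_t
index_offset_at (const unsigned char *table_start, size_t n)
{
  return read_le32 (table_start + 4 * n);
}
```

Reading through unsigned char also sidesteps alignment problems if a table does not start on a 4-byte boundary in the file.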

Tom


* Re: Initial psymtab replacement results
  2009-12-24 17:07             ` Tom Tromey
@ 2010-01-06 23:05               ` Tom Tromey
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Tromey @ 2010-01-06 23:05 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: archer

>>>>> "Tom" == Tom Tromey <tromey@redhat.com> writes:

Tom> Actually, I finished it yesterday and got preliminary results:
[...]

This week I fixed a few bugs and tried this out on OO.o.

The attach results are competitive with the .debug_gnu_index approach --
actually a bit faster, as one would expect, because less work is done at
startup.

"thread apply all bt full" is a little slower (than .debug_gnu_index), I
think because this code uses a "pre-expand" model, and is fairly
indiscriminate.  That is, if a given symbol (say a type name) is looked
for, all symtabs holding that name are expanded.  The slowdown isn't
severe, though, and it could be fixed by some additional smarts in the
index writer.
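To illustrate the "pre-expand" behaviour described above, here is a rough sketch. The struct layout and the function name pre_expand are assumptions made for illustration, not the data structures used on the branch, and setting a flag stands in for the real work of reading full symbols:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical index entry: a symbol name and the CUs that mention it.  */
struct index_entry
{
  const char *name;
  const int *cu_list;   /* indices into the CU table */
  size_t cu_count;
};

/* Expand every CU that holds NAME, marking it in EXPANDED.
   Returns the number of CUs newly expanded by this lookup.  */
static size_t
pre_expand (const struct index_entry *entries, size_t n_entries,
            const char *name, unsigned char *expanded)
{
  size_t newly = 0;

  assert (name != NULL);
  for (size_t i = 0; i < n_entries; i++)
    {
      if (strcmp (entries[i].name, name) != 0)
        continue;
      for (size_t j = 0; j < entries[i].cu_count; j++)
        {
          int cu = entries[i].cu_list[j];
          if (!expanded[cu])
            {
              expanded[cu] = 1;   /* stands in for reading full symbols */
              newly++;
            }
        }
    }
  return newly;
}
```

A name that appears in many CUs (a common type name, say) triggers expansion of all of them in this model, which is the indiscriminate cost mentioned above.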

This version of the index code is faster than CVS gdb on both operations
(when the index exists).

Tom


* Re: Initial psymtab replacement results
  2009-12-14 23:09       ` Daniel Jacobowitz
  2009-12-15 23:39         ` Tom Tromey
@ 2009-12-17 16:39         ` Paul Pluzhnikov
  2009-12-17 16:53           ` Daniel Jacobowitz
  1 sibling, 1 reply; 26+ messages in thread
From: Paul Pluzhnikov @ 2009-12-17 16:39 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Tom Tromey, archer

On Mon, Dec 14, 2009 at 3:09 PM, Daniel Jacobowitz <drow@false.org> wrote:

> Mappable data structures are
> tricky; one thing I'd definitely insist on is host neutrality.  IMO
> that is not optional.

I am sorry for being slow, but I thought about it for a while and I
still can't come up with a realistic scenario where the host
non-neutrality of the cache matters.

Are you worried about hosts A and B (with different architecture) both
NFS-mounting server:/usr/lib/debug ?  That already wouldn't work,
since e.g. libc-X.Y.Z.so.debug is not host-neutral.

Besides, it's trivial to have the cache live in architecture-specific
subdirectory, e.g. /usr/lib/debug/gdb-cache/{x86_64,i386,ppc}/

So what am I missing?

Thanks,
--
Paul Pluzhnikov


* Re: Initial psymtab replacement results
  2009-12-17 16:39         ` Paul Pluzhnikov
@ 2009-12-17 16:53           ` Daniel Jacobowitz
  2009-12-17 17:20             ` Paul Pluzhnikov
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-17 16:53 UTC (permalink / raw)
  To: Paul Pluzhnikov; +Cc: Tom Tromey, archer

On Thu, Dec 17, 2009 at 08:38:52AM -0800, Paul Pluzhnikov wrote:
> On Mon, Dec 14, 2009 at 3:09 PM, Daniel Jacobowitz <drow@false.org> wrote:
> 
> > Mappable data structures are
> > tricky; one thing I'd definitely insist on is host neutrality.  IMO
> > that is not optional.
> 
> I am sorry for being slow, but I thought about it for a while and I
> still can't come up with a realistic scenario where the host
> non-neutrality of the cache matters.
> 
> Are you worried about hosts A and B (with different architecture) both
> NFS-mounting server:/usr/lib/debug ?  That already wouldn't work,
> since e.g. libc-X.Y.Z.so.debug is not host-neutral.

You obviously use a lot of native toolchains :-)  In GDB terms,
libc-X.Y.Z.so is host neutral.  It's not target neutral.  The terms
shift if you're talking about building glibc; build/host aren't a
great match for this scenario.

If I'm going to ship pre-cached ARM Linux files, I need them to work
on x86-linux and x86-mingw32 at a minimum.  Sometimes I need them to
work on sparc-solaris2.10 or powerpc-linux or arm-linux.  That's where
I was coming from.

If we can limit it to endianness, and maybe pointer size, then we can
put that in the cache key as Frank suggested.  Then this goes away, or
at least doesn't bother me so much.
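A sketch of what folding endianness and pointer size into a cache key might look like. The key layout, the function names, and the use of a build-id-like string are all assumptions for illustration; the thread does not specify the key format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return 1 on a little-endian host, 0 on a big-endian one.  */
static int
host_is_little_endian (void)
{
  const uint16_t probe = 1;
  return *(const unsigned char *) &probe == 1;
}

/* Format a cache key into BUF.  BUILD_ID is a hypothetical identifier
   for the objfile.  Returns the value returned by snprintf.  */
static int
make_cache_key (char *buf, size_t size, const char *build_id,
                int little_endian, unsigned pointer_bytes)
{
  assert (buf != NULL && build_id != NULL);
  return snprintf (buf, size, "%s-%s-p%u", build_id,
                   little_endian ? "le" : "be", pointer_bytes);
}
```

With a key like this, a host whose layout differs simply misses in the cache rather than misreading an incompatible file.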

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-17 16:53           ` Daniel Jacobowitz
@ 2009-12-17 17:20             ` Paul Pluzhnikov
  2009-12-17 18:16               ` Daniel Jacobowitz
  0 siblings, 1 reply; 26+ messages in thread
From: Paul Pluzhnikov @ 2009-12-17 17:20 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Tom Tromey, archer

On Thu, Dec 17, 2009 at 8:53 AM, Daniel Jacobowitz <drow@false.org> wrote:

> If I'm going to ship pre-cached ARM Linux files, I need them to work
> on x86-linux and x86-mingw32 at a minimum.  Sometimes I need them to
> work on sparc-solaris2.10 or powerpc-linux or arm-linux.  That's where
> I was coming from.

Thanks, I understand now.

And building the cache "on demand" or at GDB install time is not a viable
option?

We did it "on demand" -- there are usually a lot of shared libraries in
the distribution which the customer doesn't care about debugging.  E.g. if
the customer is developing some crypto app, he hardly ever debugs anything
that links against libX11, so there is no point in having libX11 take space
in the cache. Conversely, customers who debug GUIs rarely care about libssl.

Building "on demand" also handles updates nicely: when e.g. libc.so gets
updated, next time you run GDB, it notices and rebuilds the cache (just
for that image).
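The message does not say how the change is noticed. One common approach, assumed here purely for illustration, is to record the image's modification time and size when the cache entry is written and compare them on the next run. The struct and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stamp recorded when a cache entry is written.  */
struct file_stamp
{
  int64_t mtime;   /* modification time, in seconds */
  int64_t size;    /* file size, in bytes */
};

/* Return 1 if the cached entry still matches the image on disk,
   0 if the entry for that image should be rebuilt.  */
static int
cache_entry_valid (const struct file_stamp *cached,
                   const struct file_stamp *current)
{
  assert (cached != NULL && current != NULL);
  return cached->mtime == current->mtime && cached->size == current->size;
}
```

In practice the current stamp would be filled in from a stat call on the image; a content hash or build ID would be a stricter check.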

-- 
Paul Pluzhnikov


* Re: Initial psymtab replacement results
  2009-12-17 17:20             ` Paul Pluzhnikov
@ 2009-12-17 18:16               ` Daniel Jacobowitz
  2009-12-18 23:58                 ` Tom Tromey
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-17 18:16 UTC (permalink / raw)
  To: Paul Pluzhnikov; +Cc: Tom Tromey, archer

On Thu, Dec 17, 2009 at 09:20:44AM -0800, Paul Pluzhnikov wrote:
> And building the cache "on demand" or at GDB install time is not a viable
> option?

I think it's viable, but it'd be nice to pregenerate - that way you
don't need per-user copies of the possibly large cache.

I guess I should wait and see what sort of times Tom comes up with.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-17 18:16               ` Daniel Jacobowitz
@ 2009-12-18 23:58                 ` Tom Tromey
  2009-12-21 13:54                   ` Daniel Jacobowitz
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Tromey @ 2009-12-18 23:58 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Paul Pluzhnikov, archer

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Daniel> On Thu, Dec 17, 2009 at 09:20:44AM -0800, Paul Pluzhnikov wrote:
>> And building the cache "on demand" or at GDB install time is not a viable
>> option?

Daniel> I think it's viable, but it'd be nice to pregenerate - that way you
Daniel> don't need per-user copies of the possibly large cache.

Daniel> I guess I should wait and see what sort of times Tom comes up with.

The problem I have with it is that it doesn't help the first time you
start gdb.  And because the cache is invalidated whenever an objfile
changes, the part of your program you are hacking on will always look
like the first time.

I am still looking for a viable solution to this problem.  Maybe we can
resurrect the index sections in some form.

I guess we could try resurrecting the sqlite stuff but do the writing in
a background thread.  (The problem is what happens if the user quits gdb
while this is working.)

I spent the rest of this week changing gdb to read psymtabs in the
background.  This works, though there is still some bug because the
result is slower than I think it ought to be... maybe I just need to
defer starting these threads until after the prompt shows up.

I'll go back to the index and cache stuff next week, and probably on
into next year.

I also plan to look into Jan's idea of using the shared library info to
limit searches by PC.  This seems like it could help a case like ABRT,
which really just wants a stack trace and nothing else; in this case, we
could defer reading any debuginfo until it is requested, and then use
this change to pick exactly the objfile that matters.  This seems like
an independent patch.

Tom


* Re: Initial psymtab replacement results
  2009-12-18 23:58                 ` Tom Tromey
@ 2009-12-21 13:54                   ` Daniel Jacobowitz
  2009-12-21 21:29                     ` Tom Tromey
  0 siblings, 1 reply; 26+ messages in thread
From: Daniel Jacobowitz @ 2009-12-21 13:54 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Paul Pluzhnikov, archer

On Fri, Dec 18, 2009 at 04:58:23PM -0700, Tom Tromey wrote:
> The problem I have with it is that it doesn't help the first time you
> start gdb.  And because the cache is invalidated whenever an objfile
> changes, the part of your program you are hacking on will always look
> like the first time.

Are monolithic single objects still the norm?  In e.g. the OpenOffice
case, this is fine; most of the guts are in shared libraries and
hopefully someone hacking on OO.o can rebuild just one library at a
time.

> I also plan to look into Jan's idea of using the shared library info to
> limit searches by PC.  This seems like it could help a case like ABRT,
> which really just wants a stack trace and nothing else; in this case, we
> could defer reading any debuginfo until it is requested, and then use
> this change to pick exactly the objfile that matters.  This seems like
> an independent patch.

Yes, it does, although it probably does best with the psymtab cleanups
you've already got.  Any interest in submitting those independently?
It sounds like a nice framework for investigating alternate
approaches.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: Initial psymtab replacement results
  2009-12-21 13:54                   ` Daniel Jacobowitz
@ 2009-12-21 21:29                     ` Tom Tromey
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-21 21:29 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: Paul Pluzhnikov, archer

>>>>> "Daniel" == Daniel Jacobowitz <drow@false.org> writes:

Tom> The problem I have with it is that it doesn't help the first time you
Tom> start gdb.  And because the cache is invalidated whenever an objfile
Tom> changes, the part of your program you are hacking on will always look
Tom> like the first time.

Daniel> Are monolithic single objects still the norm?  In e.g. the OpenOffice
Daniel> case, this is fine; most of the guts are in shared libraries and
Daniel> hopefully someone hacking on OO.o can rebuild just one library at a
Daniel> time.

Yeah, but the vitally important "gdb gdb" is still slow.
Maybe I'll just buy a faster machine :-)

Daniel> Yes, it does, although it probably does best with the psymtab cleanups
Daniel> you've already got.  Any interest in submitting those independently?
Daniel> It sounds like a nice framework for investigating alternate
Daniel> approaches.

Sure, I will do it if you think it is worthwhile.  The reason I was
putting it off is that it seemed strange to refactor all this code
behind a bunch of function pointers, and then only have a single
implementation.

The other infrastructure is the threading changes, which are a bit
uglier.  For one thing, I didn't see an easy way to make the BFD cache
code truly thread-safe; so for now I wrap it in a lock and just pretend
that nothing bad can happen.  Also, I'm not 100% sure I got all the
places that need a __thread modifier.
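For readers unfamiliar with the two techniques mentioned here, a generic sketch follows. It does not use the real BFD API: unsafe_cache_lookup is a made-up stand-in for code that is not thread-safe, and the other names are equally hypothetical. It shows a single lock serializing calls, plus a per-thread variable declared with the GCC __thread modifier:

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for a cache that is not thread-safe (hypothetical).  */
static int unsafe_cache_hits;

static int
unsafe_cache_lookup (int key)
{
  unsafe_cache_hits++;
  return key * 2;
}

/* One global lock serializes every call into the unsafe code.  */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread state: each thread gets its own copy of this counter.  */
static __thread int lookups_in_this_thread;

static int
locked_cache_lookup (int key)
{
  int result;
  int rc = pthread_mutex_lock (&cache_lock);

  assert (rc == 0);
  result = unsafe_cache_lookup (key);
  pthread_mutex_unlock (&cache_lock);

  lookups_in_this_thread++;   /* no lock needed, the variable is thread-local */
  return result;
}
```

The wrapper only protects callers that go through it; any code path that reaches the unsafe function directly bypasses the lock, which is presumably the "pretend that nothing bad can happen" caveat.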

Tom


* Re: Initial psymtab replacement results
  2009-12-14 22:40     ` Tom Tromey
  2009-12-14 23:09       ` Daniel Jacobowitz
@ 2009-12-15  1:04       ` Roland McGrath
  2009-12-15 18:36         ` Tom Tromey
  1 sibling, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2009-12-15  1:04 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, archer

> If we look at this as a gdb-specific format, then I suspect that is
> actually a harder sell for Fedora.  It means more space and another
> post-processing pass when making the debuginfo RPMs.

I'm quite sure we can roll it into the one post-processing pass when we do
the expected change to make that rewrite the DWARF data more than it does
already anyway.  In fact, for Fedora it's probably easier to do this than
anything else (i.e. nobody else but me will ever try to understand it
anyway!).  But, really, in the Fedora-centric context we have numerous
options and really not all that many constraints on approaches.  So this is
mostly neither here nor there as to the general subject of wise GDB features.

> I'd hope this is surmountable, but it would be preferable to hide the
> metadata processing somehow, because then we can involve fewer parties.
> Having gcc generate it in a .debug section achieves this pretty nicely,
> especially as gcc already generates these sections (more or less).

OTOH this approach inherently involves some suboptimality and probably at
least a little duplication.  (Also, try to consider COMDAT and be afraid.)
A post-link procedure gets the benefit of knowing how all the little CUs
got linked together, as well as not having to try to produce a
linkable/relocatable result.

> The other big drawback I see with the caching approach is that it means
> that the first time will still be slow -- even with a system-wide cache
> this will be true for the objfiles that a user changes during
> development.

Either the compilation is slower or the first debugging is slower.  In
theory it's about a wash, but of course that theory does not account for
the residency of all the relevant bits at compilation time nor for any
inherent benefit the compiler gets as emitter using its own internal
representations vs something that reads the compiler's vanilla DWARF to
build its index.

> FWIW if we were going to do our own cache, I wouldn't put it in a form
> like .debug_gnu_index or .debug_pub*.  I'd just have gdb write out a
> mappable data structure.

IMHO the big benefit of it being done on-demand by GDB itself (or perhaps
equivalently by an entirely separate post-processing phase not rolled into
other build-time DWARF fiddling) is that GDB has a free hand and no real
version skew issues in evolving this format.  Even if it's purely a
post-processing step rather than a compiler feature, if that is done as
part of package building, the index format immediately becomes a fairly
real and stable binary file format with all the attendant compatibility
issues and constraints on the ongoing development.

> One definite positive about the branch is that these changes are a lot
> simpler now.  The psymtab stuff is mostly isolated, and writing a new
> "back end" is reasonably self-contained.

Well done!

Thanks,
Roland


* Re: Initial psymtab replacement results
  2009-12-15  1:04       ` Roland McGrath
@ 2009-12-15 18:36         ` Tom Tromey
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Tromey @ 2009-12-15 18:36 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Daniel Jacobowitz, archer

>>>>> "Roland" == Roland McGrath <roland@redhat.com> writes:

>> If we look at this as a gdb-specific format, then I suspect that is
>> actually a harder sell for Fedora.  It means more space and another
>> post-processing pass when making the debuginfo RPMs.

Roland> I'm quite sure we can roll it into the one post-processing pass
Roland> when we do the expected change to make that rewrite the DWARF
Roland> data more than it does already anyway.  In fact, for Fedora it's
Roland> probably easier to do this than anything else (i.e. nobody else
Roland> but me will ever try to understand it anyway!).  But, really, in
Roland> the Fedora-centric context we have numerous options and really
Roland> not all that many constraints on approaches.

Ok, sounds great.  I will try the cache approach.

Tom


Thread overview: 26+ messages
2009-12-10 21:54 Initial psymtab replacement results Tom Tromey
2009-12-11 23:10 ` Tom Tromey
2009-12-11 23:59   ` Daniel Jacobowitz
2009-12-14 22:40     ` Tom Tromey
2009-12-14 23:09       ` Daniel Jacobowitz
2009-12-15 23:39         ` Tom Tromey
2009-12-16  3:01           ` Roland McGrath
2009-12-16 18:20             ` Tom Tromey
2009-12-16 18:57               ` Roland McGrath
2009-12-16 19:46                 ` Tom Tromey
2009-12-16 19:52                   ` Roland McGrath
2009-12-16 18:11           ` Daniel Jacobowitz
2009-12-16 19:31             ` Tom Tromey
2009-12-23 18:29           ` Tom Tromey
2009-12-23 18:35             ` Daniel Jacobowitz
2009-12-24 17:07             ` Tom Tromey
2010-01-06 23:05               ` Tom Tromey
2009-12-17 16:39         ` Paul Pluzhnikov
2009-12-17 16:53           ` Daniel Jacobowitz
2009-12-17 17:20             ` Paul Pluzhnikov
2009-12-17 18:16               ` Daniel Jacobowitz
2009-12-18 23:58                 ` Tom Tromey
2009-12-21 13:54                   ` Daniel Jacobowitz
2009-12-21 21:29                     ` Tom Tromey
2009-12-15  1:04       ` Roland McGrath
2009-12-15 18:36         ` Tom Tromey
