public inbox for gdb-patches@sourceware.org
* [RFC][gdb/symtab] Lazy expansion of full symbol table
@ 2021-06-14  9:39 Tom de Vries
From: Tom de Vries @ 2021-06-14  9:39 UTC
  To: gdb-patches; +Cc: Simon Marchi, Tom Tromey

Hi,

[ I'm not posting the experimental patch series as such for now.  Available
here ( https://github.com/vries/gdb/commits/lazy-full-symtab ). ]

In PR23710, the stated problem is that gdb is slow and memory hungry when
consuming debug information generated by GCC with LTO.

I.

Taking the range of final releases 8.1.1 to 10.2, as well as a recent trunk
commit (3633d4fb446), and an experiment using cc1:
...
$ gdb -q -batch cc1 -ex "b do_rpo_vn"
...
we get:
...
+---------+----------------+
| Version | real (seconds) |
+---------+----------------+
| 8.1.1   | 9.42           |
| 8.2.1   | - (PR23712)    |
| 8.3.1   | 9.31           |
| 9.2     | 8.50           |
| 10.2    | 5.86           |
| trunk   | 6.36           |
+---------+----------------+
...
which is nice progress in the releases.  The regression on trunk since 10.2
has been filed as PR27937.

[ The 10.2 score can be further improved to 5.23, by setting dwarf
max-cache-age to 1000.  Defaults to 5, see PR25703. ]

However, the best score is still more than a factor of 3 slower than lldb:
...
+-------------+----------------+
| Version     | real (seconds) |
+-------------+----------------+
| gdb 10.2    | 5.86           |
| lldb 10.0.1 | 1.74           |
+-------------+----------------+
...

II.

Breaking down the 10.2 time of 5.86, we have:
...
+-----------------+----------------+
|                 | real (seconds) |
+-----------------+----------------+
| Minimal symbols | 0.18           |
| Partial symbols | 2.34           |
| Full symbols    | 3.34           |
+-----------------+----------------+
...

So:
- the minimal symbols and partial symbols are processed for all CUs, while
  the full symbols are processed only for the necessary CUs
- still, the majority of the time is spent on the full symbols

This is due to the combination of:
- the one-CU-at-a-time strategy of gdb, and
- the code generation for LTO which combines several CUs into an
  artificial CU.
In other words, LTO increases the scope of processing from individual CUs to
larger artificial CUs, and consequently things take much longer.

III.

A way to fix this is to do processing of the full symbols in a lazy fashion.

This patch series implements a first attempt at this, for now intended not to
be functionally correct, but to assess the kind of performance improvement we
get from this.

With current trunk (commit 987610f2d68), we get 3.44 seconds, instead of the
6.44 seconds without this patch series.

IV.

The patch series consists of:

[gdb/symtab] Cover letter -- Lazy expansion of full symbol table
This.

[gdb/symtab] Add lazy_expand_symtab_p, default to false
Add variable to enable lazy expansion.  Keep false for now to enable gradual
introduction of implementation.

[gdb/symtab] Add sect_off field to partial_symbol
Keep track of section offset in partial symbol.

[gdb/symtab] Keep track of interesting_symbols in partial_symtab
When searching for a symbol in a partial symbol table:
- keep going after finding a match
- store the matching partial symbols in a vector interesting_symbols
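
A minimal sketch of the collect-all-matches lookup described above.  The
types and the function name are hypothetical stand-ins, not gdb's actual
partial_symbol / partial_symtab code; only the names sect_off and
interesting_symbols come from the series description.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a partial symbol that remembers where its
// DIE lives in the debug info section.
struct partial_symbol_sketch
{
  std::string name;
  std::uint64_t sect_off;
};

// Hypothetical stand-in for a partial symbol table.
struct partial_symtab_sketch
{
  std::vector<partial_symbol_sketch> symbols;
  std::vector<const partial_symbol_sketch *> interesting_symbols;
};

// Return true if any partial symbol matches NAME.  Instead of stopping
// at the first match, keep scanning and record every match.
bool
lookup_and_record (partial_symtab_sketch &pst, const std::string &name)
{
  bool found = false;
  for (const partial_symbol_sketch &ps : pst.symbols)
    if (ps.name == name)
      {
        pst.interesting_symbols.push_back (&ps);
        found = true;
      }
  return found;
}
```

The recorded pointers stay valid only as long as the symbols vector is not
resized, which is an assumption of this sketch.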

[gdb/symtab] Add interesting_symbols to dwarf2_per_cu_data
Make the vector interesting_symbols available during full symbols expansion.

[gdb/symtab] Handle interesting_symbols in read_file_scope
Use the interesting_symbols vector to filter the DIEs that we process when
doing read_file_scope.
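
A rough sketch of the filtering idea, assuming the interesting symbols have
been reduced to a set of section offsets.  die_sketch and
filter_interesting_dies are hypothetical names for illustration; the real
read_file_scope walks gdb's die_info tree and is not shown here.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a child DIE of the CU DIE.
struct die_sketch
{
  std::uint64_t sect_off;
};

// Keep only the child DIEs whose section offset was recorded as
// interesting during the partial symbol lookup.  All other DIEs are
// skipped, so no full symbols are built for them.
std::vector<die_sketch>
filter_interesting_dies (const std::vector<die_sketch> &children,
                         const std::unordered_set<std::uint64_t> &interesting)
{
  std::vector<die_sketch> result;
  for (const die_sketch &die : children)
    if (interesting.count (die.sect_off) != 0)
      result.push_back (die);
  return result;
}
```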

[gdb/symtab] Add reset_compunit_symtab
Add a new function reset_compunit_symtab.

[gdb/symtab] Reset compunit_symtab in psymtab_to_symtab
Use the new function to reset the full symbols table before expanding.
[ Without this patch, after finding symbols for the first time, we are not able
to find any others.  With this patch, we are able to find other symbols, but
only after forgetting the first ones.  This obviously needs proper fixing. ]

[gdb/symtab] Set lazy_expand_symtab_p to true
Enable.

V.

The current state of trunk is that expanding full symbols is a two-part
process:
- a builder is created during expansion
- after expansion, the builder delivers the end result (a symbol table)
  and is then destroyed

The problem is that we need a way to do this gradually instead:
- expand a few symbols
- get the corresponding symbol table
- expand a few more symbols
- get the updated symbol table containing all expanded symbols

I'm not sure what the smartest way to do that is.

My current idea is to try to keep the builder around rather than destroy it,
and have it generate an updated symbol table each time.
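
To make the idea concrete, here is a toy sketch of a builder that stays
alive and hands out an updated table after each round of expansion.
lazy_builder_sketch and symtab_sketch are hypothetical; this is not gdb's
buildsym code, and it ignores blocks, line tables and ownership of the
resulting symtab.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the resulting symbol table.
struct symtab_sketch
{
  std::vector<std::string> symbols;
};

// Hypothetical builder that is kept around between expansions instead
// of being destroyed after the first one.
class lazy_builder_sketch
{
public:
  // Expand a few more symbols, adding them to what was built so far.
  void expand (const std::vector<std::string> &names)
  {
    m_pending.insert (m_pending.end (), names.begin (), names.end ());
  }

  // Generate a table containing everything expanded so far.
  symtab_sketch snapshot () const
  {
    return symtab_sketch {m_pending};
  }

private:
  std::vector<std::string> m_pending;
};
```

Each snapshot here copies all symbols, which is the simplest choice but
probably too costly for real use; updating the existing table in place
would be the alternative.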

Is this a good idea?

Any other comments?

Thanks,
- Tom

[gdb/symtab] Cover letter -- Lazy expansion of full symbol table

---
 COVER-LETTER | 1 +
 1 file changed, 1 insertion(+)

diff --git a/COVER-LETTER b/COVER-LETTER
new file mode 100644
index 00000000000..d273939703b
--- /dev/null
+++ b/COVER-LETTER
@@ -0,0 +1 @@
+Lazy expansion of full symbol table
