public inbox for gdb-prs@sourceware.org
* [Bug symtab/17547] New: over-eager debuginfo reading
@ 2014-11-04 15:59 tromey at sourceware dot org
  2021-05-25 19:15 ` [Bug symtab/17547] " fche at redhat dot com
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: tromey at sourceware dot org @ 2014-11-04 15:59 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

            Bug ID: 17547
           Summary: over-eager debuginfo reading
           Product: gdb
           Version: 7.7
            Status: NEW
          Severity: normal
          Priority: P2
         Component: symtab
          Assignee: unassigned at sourceware dot org
          Reporter: tromey at sourceware dot org

I'm doing multi-inferior debugging of firefox.

I started gdb and did "catch exec".  Then I "ran".

Here is some output from just before the catchpoint
triggers:

[New inferior 19343]
[New process 19343]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from
/home/tromey/firefox-git/gecko-dev/obj-x86_64-unknown-linux-gnu/dist/bin/libmozsandbox.so...done.
Reading symbols from
/home/tromey/firefox-git/gecko-dev/obj-x86_64-unknown-linux-gnu/dist/bin/libnspr4.so...done.
Reading symbols from
/home/tromey/firefox-git/gecko-dev/obj-x86_64-unknown-linux-gnu/dist/bin/libplc4.so...done.
[...]
Catchpoint 1 (exec'd
/home/tromey/firefox-git/gecko-dev/obj-x86_64-unknown-linux-gnu/dist/bin/plugin-container),
0x00007ffff7dde1f0 in _start () from /lib64/ld-linux-x86-64.so.2


I think what is happening here is that gdb is reading the debuginfo
for the newly fork()d inferior.  Then it proceeds to discard it all
again when the inferior exec()s.  (And in my case, since the subprocess
execs firefox with different arguments, it actually re-reads it all
once again...)

This is pretty inefficient, especially with a large program like firefox.
Reading all the debuginfo is on the order of 10 seconds -- a noticeable
hiccup in the debug session.

While the long term fix is address-space-independence, perhaps a
short-term solution can be found as well.  It seems to me that in
the fork case, the shared libraries cannot be further relocated, so
perhaps some way of sharing debuginfo across inferiors could be found.
For example, an objfile could appear in multiple address spaces (or program
spaces?  I never keep those straight).
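
To illustrate the sharing idea, here is a minimal sketch. It assumes the parsed
debuginfo can be stored independently of any load address, which gdb's objfile
did not support at the time of this report. All names below (ParsedDebugInfo,
DebugInfoCache, InferiorObjfile) and the use of a build-id as the key are
hypothetical and are not gdb code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the address-independent part of an objfile's
// parsed debuginfo.  Not a gdb type.
struct ParsedDebugInfo
{
  std::string build_id;
};

// Hypothetical cache shared by every inferior, keyed by build-id.
class DebugInfoCache
{
public:
  // Return the parsed data for BUILD_ID.  The expensive parse (only counted
  // here) runs just when no inferior currently holds a copy.
  std::shared_ptr<ParsedDebugInfo> get (const std::string &build_id)
  {
    assert (!build_id.empty ());
    std::weak_ptr<ParsedDebugInfo> &slot = m_entries[build_id];
    if (std::shared_ptr<ParsedDebugInfo> live = slot.lock ())
      return live;

    ++m_parses;
    auto fresh = std::make_shared<ParsedDebugInfo> ();
    fresh->build_id = build_id;
    slot = fresh;
    return fresh;
  }

  // Number of times the (simulated) parse has run.
  int parses () const { return m_parses; }

private:
  std::map<std::string, std::weak_ptr<ParsedDebugInfo>> m_entries;
  int m_parses = 0;
};

// Hypothetical per-inferior view: the shared data plus a load address.
struct InferiorObjfile
{
  std::shared_ptr<ParsedDebugInfo> info;
  std::uint64_t load_address;
};
```

With this shape, a forked inferior that maps the same library would call get()
with the same key and receive the parent's data rather than re-reading it.  The
weak_ptr entries let the data be freed once no inferior refers to it.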

-- 
You are receiving this mail because:
You are on the CC list for the bug.



* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
@ 2021-05-25 19:15 ` fche at redhat dot com
  2021-05-25 19:18 ` cbiesinger at google dot com
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: fche at redhat dot com @ 2021-05-25 19:15 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Frank Ch. Eigler <fche at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fche at redhat dot com

--- Comment #1 from Frank Ch. Eigler <fche at redhat dot com> ---
The addition of debuginfod capability has made this problem more acute, in the
sense that now that gdb can download dwarf on the fly, it will do so, and take
its sweet time!

Consider a GUI program with hundreds of shared libraries.  As soon as it
starts, gdb will proactively download all that dwarf for all shared libraries,
even before the user does anything that would strictly require it.  I see that
many gdb commands may require searching for symbols program-wide, so they
probably do need all that data, but not every command does (a backtrace, for
example?).


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
  2021-05-25 19:15 ` [Bug symtab/17547] " fche at redhat dot com
@ 2021-05-25 19:18 ` cbiesinger at google dot com
  2021-05-25 19:19 ` fche at redhat dot com
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: cbiesinger at google dot com @ 2021-05-25 19:18 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Christian Biesinger <cbiesinger at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cbiesinger at google dot com

--- Comment #2 from Christian Biesinger <cbiesinger at google dot com> ---
backtrace needs debug data for stack unwinding & showing function names (and
arguments)


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
  2021-05-25 19:15 ` [Bug symtab/17547] " fche at redhat dot com
  2021-05-25 19:18 ` cbiesinger at google dot com
@ 2021-05-25 19:19 ` fche at redhat dot com
  2021-05-25 19:44 ` simark at simark dot ca
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: fche at redhat dot com @ 2021-05-25 19:19 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #3 from Frank Ch. Eigler <fche at redhat dot com> ---
(In reply to Christian Biesinger from comment #2)
> backtrace needs debug data for stack unwinding & showing function names (and
> arguments)

Certainly, but not for ALL solibs, just those implicated in an actual stack
frame.


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (2 preceding siblings ...)
  2021-05-25 19:19 ` fche at redhat dot com
@ 2021-05-25 19:44 ` simark at simark dot ca
  2021-05-25 19:50 ` fche at redhat dot com
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: simark at simark dot ca @ 2021-05-25 19:44 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Simon Marchi <simark at simark dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |simark at simark dot ca

--- Comment #4 from Simon Marchi <simark at simark dot ca> ---
GDB goes through the DWARF info of all code objects to build some name index
(partial symtabs), such that if you type "break foo", it can quickly know which
compilation units contain a function or method called foo.  The solution to
avoid that is to use an index (DWARF4 .gdb_index, or DWARF5 .debug_names). 
Then, if debuginfod offered that, I could imagine GDB just downloading the
index section, which is small compared to the rest of the debug info.  And
then, when it needs to expand a compilation unit's debug info, it could
download just the portion of the .debug_* sections that it needs (again, if
debuginfod offered that).  I think it would be neat.
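
The role of such an index can be sketched roughly as follows.  This is only a
toy model under stated assumptions: NameIndex and its members are hypothetical
names, compilation units are reduced to integer ids, and "expanding" a unit is
modelled by recording its id.  It does not reflect the .gdb_index or
.debug_names on-disk formats.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical model of a name index; none of these are gdb types.
class NameIndex
{
public:
  // Record that compilation unit CU defines a symbol called NAME.
  void add (const std::string &name, std::size_t cu)
  {
    m_by_name[name].push_back (cu);
  }

  // Return the compilation units that define NAME and mark only those as
  // expanded.  Units that never match a lookup are never touched.
  std::vector<std::size_t> lookup_and_expand (const std::string &name)
  {
    auto it = m_by_name.find (name);
    if (it == m_by_name.end ())
      return {};
    for (std::size_t cu : it->second)
      m_expanded.insert (cu);
    return it->second;
  }

  // Number of compilation units expanded so far.
  std::size_t expanded_count () const { return m_expanded.size (); }

private:
  std::unordered_map<std::string, std::vector<std::size_t>> m_by_name;
  std::set<std::size_t> m_expanded;
};
```

In this model "break foo" would cost one hash lookup plus the expansion of the
matching units, rather than a scan over every unit's DWARF.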


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (3 preceding siblings ...)
  2021-05-25 19:44 ` simark at simark dot ca
@ 2021-05-25 19:50 ` fche at redhat dot com
  2021-05-25 20:21 ` keiths at redhat dot com
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: fche at redhat dot com @ 2021-05-25 19:50 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #5 from Frank Ch. Eigler <fche at redhat dot com> ---
(In reply to Simon Marchi from comment #4)
> GDB goes through the DWARF info of all code objects to build some name index
> (partial symtabs), such that if you type "break foo", it can quickly know
> which compilation units contain a function or method called foo.

Understood, that's the sort of thing that appears to require program-wide
debuginfo data of some sort.  But something like 

% gdb /bin/gnome-control-centre
(gdb) run
^C
(gdb) bt

doesn't seem to.


> The solution to avoid that is to use an index (DWARF4 .gdb_index, or DWARF5
> .debug_names).  Then, if debuginfod offered that, I could imagine GDB just
> downloading the index section [...]

Yes, that seems plausible, assuming it's a savings, perhaps latency being most
important.  Depending on one's luck, it could be that transferring the whole
file would not take much longer than extracting the index only and transferring
that.


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (4 preceding siblings ...)
  2021-05-25 19:50 ` fche at redhat dot com
@ 2021-05-25 20:21 ` keiths at redhat dot com
  2021-05-25 21:24 ` tromey at sourceware dot org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: keiths at redhat dot com @ 2021-05-25 20:21 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Keith Seitz <keiths at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |keiths at redhat dot com

--- Comment #6 from Keith Seitz <keiths at redhat dot com> ---
Of course, we could just do some of the obvious things, like allow/deny lists
by regexp, "interactive" (a la "rm -i"), etc... Those would be pretty easy to
do, and we already support, e.g., skip lists.

I imagine users might find this useful anyway. I sure would.
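
As a rough sketch of what a regexp-based allow/deny list might look like, under
the assumption that a deny match always wins and that an empty allow list means
"allow everything".  DebugInfoFilter is a hypothetical name and does not
correspond to an existing gdb command or setting.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical filter deciding whether debuginfo for a library is read.
class DebugInfoFilter
{
public:
  // Add a regexp (ECMAScript syntax) to the allow or deny list.
  void allow (const std::string &pattern) { m_allow.emplace_back (pattern); }
  void deny (const std::string &pattern) { m_deny.emplace_back (pattern); }

  // Return true if debuginfo for PATH should be read.  A deny match always
  // wins; an empty allow list allows everything not denied.
  bool should_read (const std::string &path) const
  {
    for (const std::regex &re : m_deny)
      if (std::regex_search (path, re))
        return false;
    if (m_allow.empty ())
      return true;
    for (const std::regex &re : m_allow)
      if (std::regex_search (path, re))
        return true;
    return false;
  }

private:
  std::vector<std::regex> m_allow;
  std::vector<std::regex> m_deny;
};
```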


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (5 preceding siblings ...)
  2021-05-25 20:21 ` keiths at redhat dot com
@ 2021-05-25 21:24 ` tromey at sourceware dot org
  2021-05-26  1:49 ` rfhn.fhbrrjnzeneqpf at noclue dot notk.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tromey at sourceware dot org @ 2021-05-25 21:24 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #7 from Tom Tromey <tromey at sourceware dot org> ---
Once upon a time, I added lazy reading to gdb.  The vestiges
are still in the tree; search for OBJF_PSYMTABS_READ and SYMFILE_NO_READ.
The use case was precisely what Frank is talking about - "attach"
was very slow and read a bunch of irrelevant stuff.

However, this was mostly disabled.  While it works great for some
scenarios, it also causes long pauses at "unexpected" times in other
scenarios.  For example, it makes "backtrace" randomly slow, depending
on whether the debuginfo for a frame has been read.

(The "backtrace" problem can be solved as well, by lazier reading of
the DWARF.  This is a reasonably-sized project though.)

Another problem is that setting a breakpoint becomes very expensive,
because most breakpoints require a full DWARF scan.  This too could be
avoided, at least partly, either by a way to restrict a breakpoint to a
given .so, or by the use of an index.

When disabling this feature, I think our rationale was that it's better
for users to get the delays up front, when they expect them, rather than
randomly in response to some command.


What would be truly great for something like debuginfod is to have it
index all the libraries, and then provide a fine-grained API to gdb.
Then with some work, gdb could avoid ever downloading all the DWARF
or doing a massive scan.


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (6 preceding siblings ...)
  2021-05-25 21:24 ` tromey at sourceware dot org
@ 2021-05-26  1:49 ` rfhn.fhbrrjnzeneqpf at noclue dot notk.org
  2023-01-01 22:24 ` tromey at sourceware dot org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rfhn.fhbrrjnzeneqpf at noclue dot notk.org @ 2021-05-26  1:49 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Dominique Martinet <rfhn.fhbrrjnzeneqpf at noclue dot notk.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rfhn.fhbrrjnzeneqpf@noclue.
                   |                            |notk.org


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (7 preceding siblings ...)
  2021-05-26  1:49 ` rfhn.fhbrrjnzeneqpf at noclue dot notk.org
@ 2023-01-01 22:24 ` tromey at sourceware dot org
  2023-01-03 23:02 ` amerey at redhat dot com
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tromey at sourceware dot org @ 2023-01-01 22:24 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #8 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Frank Ch. Eigler from comment #3)
> (In reply to Christian Biesinger from comment #2)
> > backtrace needs debug data for stack unwinding & showing function names (and
> > arguments)
> 
> Certainly, but not for ALL solibs, just those implicated in an actual stack
> frame.

gdb can do some weird stuff for type lookups, so this would need
a bit of careful attention.

Anyway I was wondering recently why gdb isn't using the shared library
memory map to decide which objfiles to search.  That is, surely if
a PC is in some mapped range, then the debuginfo for that function
will come from the corresponding .so.

There's a bunch of loops like:

  for (objfile *objfile : current_program_space->objfiles ())
    {
      struct compunit_symtab *cust
        = objfile->find_pc_sect_compunit_symtab (msymbol, pc, section, 0);
      if (cust)
        return;
    }

... that could then be simplified.  On the branch I'm working on
(see bug #29942), this wouldn't improve downloads of debuginfo,
but would improve parsing wait times.
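
A minimal sketch of the memory-map lookup idea, assuming non-overlapping
mappings of the form [start, end).  SolibMap is a hypothetical name, and the
library is identified by a plain string rather than an objfile; this is not how
gdb stores its shared library list.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical map from mapped address ranges to the library backing them.
class SolibMap
{
public:
  // Record that NAME is mapped at [START, END).  Ranges must not overlap.
  void add (std::uint64_t start, std::uint64_t end, std::string name)
  {
    assert (start < end);
    m_ranges[start] = Range {end, std::move (name)};
  }

  // Return the name of the library whose mapping contains PC, or nullptr
  // if PC is not inside any recorded mapping.
  const std::string *find (std::uint64_t pc) const
  {
    auto it = m_ranges.upper_bound (pc);  // First range starting after PC.
    if (it == m_ranges.begin ())
      return nullptr;
    --it;                                 // Last range starting at or before PC.
    if (pc < it->second.end)
      return &it->second.name;
    return nullptr;
  }

private:
  struct Range
  {
    std::uint64_t end;
    std::string name;
  };

  std::map<std::uint64_t, Range> m_ranges;
};
```

The lookup is logarithmic in the number of mappings and identifies a single
candidate, so only that one objfile would need to be asked for the compunit
symtab instead of looping over all of them.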


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (8 preceding siblings ...)
  2023-01-01 22:24 ` tromey at sourceware dot org
@ 2023-01-03 23:02 ` amerey at redhat dot com
  2023-01-04  2:03 ` tromey at sourceware dot org
  2023-01-05  1:16 ` amerey at redhat dot com
  11 siblings, 0 replies; 13+ messages in thread
From: amerey at redhat dot com @ 2023-01-03 23:02 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

Aaron Merey <amerey at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amerey at redhat dot com

--- Comment #9 from Aaron Merey <amerey at redhat dot com> ---
Related to this, I've been working on a patch to help address gdb's over-eager
debuginfo downloading. The idea is to initially download a solib's .gdb_index
section from debuginfod instead of its full debuginfo. Downloading full
debuginfo is deferred until required.

https://sourceware.org/pipermail/gdb-patches/2022-November/193416.html


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (9 preceding siblings ...)
  2023-01-03 23:02 ` amerey at redhat dot com
@ 2023-01-04  2:03 ` tromey at sourceware dot org
  2023-01-05  1:16 ` amerey at redhat dot com
  11 siblings, 0 replies; 13+ messages in thread
From: tromey at sourceware dot org @ 2023-01-04  2:03 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #10 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Aaron Merey from comment #9)
> Related to this, I've been working on a patch to help address gdb's
> over-eager debuginfo downloading. The idea is to initially download a
> solib's .gdb_index section from debuginfod instead of its full debuginfo.
> Downloading full debuginfo is deferred until required.
> 
> https://sourceware.org/pipermail/gdb-patches/2022-November/193416.html

Sorry I haven't reviewed this yet.  I like this idea.

My current work for bug #29942 involves trying to move some gdb work
into the background, so that when many libraries must be read at
once, gdb isn't blocking while processing each one.

The corresponding idea with debuginfod would be to read the .gdb_index
remotely -- but not even wait for the download to complete before
setting up the "quick_symbol_functions" object.  Any code trying to
actually use the index would have to wait for the download to complete,
but this seems fine to me (assuming the user opts in, like in your patch).

Perhaps another thing that could be done here is to also trigger a
background download of the corresponding full files, so they are
more likely to be available when needed.
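
The "start the download, block only on first use" pattern can be sketched with
the standard library alone.  This is an illustration under assumptions:
BackgroundIndex is a hypothetical name, the download is any callable that
returns the index bytes as a string, and nothing here uses gdb's
quick_symbol_functions or the debuginfod client API.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <string>
#include <utility>

// Hypothetical index whose contents arrive from a background download.
class BackgroundIndex
{
public:
  // Start FETCH on another thread and return immediately.
  explicit BackgroundIndex (std::function<std::string ()> fetch)
    : m_data (std::async (std::launch::async, std::move (fetch)).share ())
  {
  }

  // True once the download has finished; never blocks.
  bool ready () const
  {
    return (m_data.wait_for (std::chrono::seconds (0))
            == std::future_status::ready);
  }

  // Block until the download has finished, then return the index bytes.
  const std::string &contents () const { return m_data.get (); }

private:
  std::shared_future<std::string> m_data;
};
```

Constructing the object costs almost nothing, so many libraries could be set up
without waiting; only code that calls contents() pays for a download that is
still in flight.  Prefetching the full debuginfo files could reuse the same
shape with a second future.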

One funny thing here is that while I've been wanting gdb to move
towards the DWARF 5 index (see bug #24820), .gdb_index is
actually more self-contained -- the DWARF 5 index refers to
.debug_str, and gdb also needs .debug_aranges to work.  So, .gdb_index
is a little easier for this use.

However, .gdb_index also has issues, see bug #27930 (and I think there
may be others).


* [Bug symtab/17547] over-eager debuginfo reading
  2014-11-04 15:59 [Bug symtab/17547] New: over-eager debuginfo reading tromey at sourceware dot org
                   ` (10 preceding siblings ...)
  2023-01-04  2:03 ` tromey at sourceware dot org
@ 2023-01-05  1:16 ` amerey at redhat dot com
  11 siblings, 0 replies; 13+ messages in thread
From: amerey at redhat dot com @ 2023-01-05  1:16 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=17547

--- Comment #11 from Aaron Merey <amerey at redhat dot com> ---
(In reply to Tom Tromey from comment #10)
> My current work for bug #29942 involves trying to move some gdb work
> into the background, so that when many libraries must be read at
> once, gdb isn't blocking while processing each one.
> 
> The corresponding idea with debuginfod would be to read the .gdb_index
> remotely -- but not even wait for the download to complete before
> setting up the "quick_symbol_functions" object.  Any code trying to
> actually use the index would have to wait for the download to complete,
> but this seems fine to me (assuming the user opts in, like in your patch).
> 
> Perhaps another thing that could be done here is to also trigger a
> background download of the corresponding full files, so they are
> more likely to be available when needed.

These are nice enhancements. I'll try prototyping some basic support
for background downloading and keep an eye out for your PR29942 patch. 

> One funny thing here is that while I've been wanting gdb to move
> towards the DWARF 5 index (see bug #24820), .gdb_index is
> actually more self-contained -- the DWARF 5 index refers to
> .debug_str, and gdb also needs .debug_aranges to work.  So, .gdb_index
> is a little easier for this use.

debuginfod supports the downloading of any ELF sections from a
given file, so we could implement deferred downloads using 
.debug_names and related sections in addition to .gdb_index or
instead of it.

Even if we keep using .gdb_index for this we still may end up
downloading other sections. One issue with deferred downloads that
I'd like to address: commands that require searching debuginfos
for a source file name trigger a mass download of all deferred
debuginfo, since .gdb_index doesn't contain file name information.
Maybe gdb could get this information from downloading
.debug_line/.debug_line_str in addition to .gdb_index.

