public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Aaron Merey <amerey@redhat.com>, gdb-patches@sourceware.org
Cc: Aaron Merey <amerey@redhat.com>
Subject: Re: [PATCH 2/3 v6] gdb/debuginfod: Support on-demand debuginfo downloading
Date: Mon, 17 Jun 2024 11:45:10 +0100	[thread overview]
Message-ID: <8734pb7sqx.fsf@redhat.com> (raw)
In-Reply-To: <20240503225902.2866347-3-amerey@redhat.com>


Aaron,

Thanks for continuing to work on this, and sorry it's taken me so long
to get back to looking at this patch.


Aaron Merey <amerey@redhat.com> writes:

> v6 fixes a comment for dwarf2_has_separate_index and merges
> lookup_gdb_index_debuginfod with lookup_gdb_index.
>
> At the beginning of a session, gdb may attempt to download debuginfo
> for all shared libraries associated with the process or core file
> being debugged.  This can be a waste of time and storage space when much
> of the debuginfo ends up not being used during the session.
>
> To reduce the gdb's startup latency and to download only the debuginfo
> that is really needed, this patch adds on-demand downloading of debuginfo.
>
> 'set debuginfo enabled on' now causes gdb to attempt to download a .gdb_index
> for each shared library instead of its full debuginfo.  Each corresponding
> separate debuginfo will be deferred until gdb needs to expand symtabs
> associated with the debuginfo's index.
>
> Because these indices are significantly smaller than their corresponding
> debuginfo, this generally reduces the total amount of data gdb downloads.
> Reductions of 80%-95% have been observed when debugging large GUI programs.
>
>     (gdb) set debuginfod enabled on
>     (gdb) start
>     Downloading section .gdb_index for /lib64/libcurl.so.4
>     [...]
>     1826        client->server_mhandle = curl_multi_init ();
>     (gdb) step
>     Downloading separate debug info for /lib64/libcurl.so.4
>     Downloading separate debug info for [libcurl dwz]
>     Downloading source file /usr/src/debug/curl-7.85.0-6.fc37.x86_64/build-full/lib/../../lib/multi.c
>     curl_multi_init () at ../../lib/multi.c:457
>     457     {
>     (gdb)
>
> Some of the key functions below include dwarf2_has_separate_index which
> downloads the separate .gdb_index.  If successful, the shared library
> objfile owns the index until the separate debug objfile is downloaded
> or confirmed to not be available.
>
> read_full_dwarf_from_debuginfod downloads the full debuginfo and
> initializes the separate debug objfile.  It is called by functions
> such as dwarf2_gdb_index::expand_symtabs_matching and
> dwarf2_base_index_functions::find_pc_sect_compunit_symtab when symtab
> expansion is required.
> ---

> diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
> index e743ce13978..97b8137aabf 100644
> --- a/gdb/dwarf2/read-gdb-index.c
> +++ b/gdb/dwarf2/read-gdb-index.c
> @@ -139,6 +139,8 @@ struct dwarf2_gdb_index : public dwarf2_base_index_functions
>       gdb.dwarf2/gdb-index.exp testcase.  */
>    void dump (struct objfile *objfile) override;
>  
> +  /* Calls do_expand_symtabs_matching and triggers debuginfo downloading
> +     if necessary.  */
>    bool expand_symtabs_matching
>      (struct objfile *objfile,
>       gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> @@ -147,8 +149,59 @@ struct dwarf2_gdb_index : public dwarf2_base_index_functions
>       gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
>       block_search_flags search_flags,
>       domain_search_flags domain) override;
> +
> +  /* Expand symtabs matching a given symbol or file.  */
> +  bool do_expand_symtabs_matching
> +    (struct objfile *objfile,
> +     gdb::function_view<expand_symtabs_file_matcher_ftype> file_matcher,
> +     const lookup_name_info *lookup_name,
> +     gdb::function_view<expand_symtabs_symbol_matcher_ftype> symbol_matcher,
> +     gdb::function_view<expand_symtabs_exp_notify_ftype> expansion_notify,
> +     block_search_flags search_flags,
> +     domain_search_flags domain);
> +
> +  /* Calls dwarf2_base_index_functions::expand_all_symtabs and downloads
> +     debuginfo if necessary.  */
> +  void expand_all_symtabs (struct objfile *objfile) override;
> +
> +  /* Calls dwarf2_base_index_functions::find_last_source_symtab and downloads
> +     debuginfo if necessary.  */
> +  struct symtab *find_last_source_symtab (struct objfile *objfile) override;
>  };
>  
> +void
> +dwarf2_gdb_index::expand_all_symtabs (struct objfile *objfile)
> +{
> +  try
> +    {
> +      dwarf2_base_index_functions::expand_all_symtabs (objfile);
> +    }
> +  catch (const gdb_exception &e)
> +    {
> +      if ((objfile->flags & OBJF_DOWNLOAD_DEFERRED) == 0)
> +	exception_print (gdb_stderr, e);
> +      else
> +	read_full_dwarf_from_debuginfod (objfile, this);

I know this isn't what you want to hear, but I really don't think we
should move forward with this approach.

Every piece of C++ advice I've ever read always says the same thing: use
exceptions for error handling, not as an alternative control flow.

Now only is it confusing to folk who will assume that try/catch is all
about handling errors, but in this exact case the objfile might well
have deferred debug information, but some other exception might have
been thrown, in which case we're going to swallow the real error and
just read the deferred debug.  If we did want this to work reliably we'd
need a unique exception type which could be thrown and caught that is
_only_ thrown when we want deferred debug to be loaded...

...and by that point, you might has well move the deferred debug loading
to the point where the exception is being thrown.

Which is exactly the route I think we need to go down.

If I understand what's going on correctly you're hooking around the code
which searches the index.  Normally (in the non debuginfod case) we
parse the index, then when a request arrives for some debug information
we'll scan the index, and if we get a hit we expand the relevant CU's
debug information.

This would seem to imply there's a two stage process, (1) a search of
the index, and (2) an expansion phase if we get a hit.

In my head, an ideal solution would parameterise this process, stage
(1), the search remains common, but what we do when we get a hit
changes.  In the "normal" case we expand the CU's we already have on
hand, and in the "debuginfod" case we'll download (and expand/parse) the
full debug info.

Did you consider an approach like this at all?  If so what issues did
you encounter?  If not, would you be willing to consider this approach?

For the wider GDB maintainer / contributor community, am I being too
harsh here?  Is my reluctance to use exceptions for (what I see as)
control flow unreasonable?  If I'm in the minority here then I'm happy
to stand aside.

Thanks,
Andrew


  parent reply	other threads:[~2024-06-17 10:45 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-03 22:58 [PATCH 0/3] On-demand " Aaron Merey
2024-05-03 22:59 ` [PATCH 1/3 v3] gdb/progspace: Add reverse safe iterator Aaron Merey
2024-05-17 14:09   ` [PING][PATCH " Aaron Merey
2024-05-31 20:58     ` [PING*2][PATCH " Aaron Merey
2024-06-07 20:48       ` [PING*3][PATCH " Aaron Merey
2024-06-14 18:23       ` [PING*2][PATCH " Aaron Merey
2024-05-03 22:59 ` [PATCH 2/3 v6] gdb/debuginfod: Support on-demand debuginfo downloading Aaron Merey
2024-05-17 14:10   ` [PING][PATCH " Aaron Merey
2024-05-31 20:58     ` [PING*2][PATCH " Aaron Merey
2024-06-07 20:48       ` [PING*3][PATCH " Aaron Merey
2024-06-14 18:24       ` Aaron Merey
2024-06-17 10:45   ` Andrew Burgess [this message]
2024-06-17 19:57     ` [PATCH " Aaron Merey
2024-05-03 22:59 ` [PATCH 3/3 v6] gdb/debuginfod: Add .debug_line downloading Aaron Merey
2024-05-17 14:10   ` [PING][PATCH " Aaron Merey
2024-05-31 20:58     ` [PING*2][PATCH " Aaron Merey
2024-06-07 20:49       ` [PING*3][PATCH " Aaron Merey
2024-06-14 18:24       ` Aaron Merey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8734pb7sqx.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=amerey@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).