public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Lancelot SIX <lsix@lancelotsix.com>
To: Andrew Burgess <aburgess@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c
Date: Thu, 26 Oct 2023 10:06:01 +0100	[thread overview]
Message-ID: <20231026090601.awjsf5mrc2beyatn@octopus> (raw)
In-Reply-To: <b8b8e8ea0ce3909f6a824587cf98f4b6902dec71.1698246342.git.aburgess@redhat.com>

On Wed, Oct 25, 2023 at 04:08:17PM +0100, Andrew Burgess wrote:
> In the following commit I ran into a problem.  The next commit aims to
> improve GDB's handling of the main executable being a file on a remote
> target (i.e. one with a 'target:' prefix).
> 
> To do this I have replace a system 'stat' call with a bfd_stat call.
> 
> However, doing this caused a regression in gdb.base/attach.exp.
> 
> The problem is that the bfd library caches open FILE* handles for bfd
> objects that it has accessed, which is great for short-lived, non
> interactive programs (e.g. the assembler, or objcopy, etc), however,
> for GDB this caching causes us a problem.
> 
> If we open the main executable as a bfd then the bfd library will
> cache the open FILE*.  If some time passes, maybe just sat at the GDB
> prompt, or with the inferior running, and then later we use bfd_stat
> to check if the underlying, on-disk file has changed, then the bfd
> library will actually use fstat on the underlying file descriptor.
> This is of course slightly different than using system stat on with
> the on-disk file name.
> 
> If the on-disk file has changed then system stat will give results for
> the current on-disk file.  But, if the bfd cache is still holding open
> the file descriptor for the original on-disk file (from before the
> change) then fstat will return a result based on the original file,
> and so show no change as having happened.
> 
> This is a known problem in GDB, and so far this has been solved by
> scattering bfd_cache_close_all() calls throughout GDB.  But, as I
> said, in the next commit I've made a change and run into a
> problem (gdb.base/attach.exp) where we are apparently missing a
> bfd_cache_close_all() call.
> 
> Now I could solve this problem by adding a bfd_cache_close_all() call
> before the bfd_stat call that I plan to add in the next commit, that
> would for sure solve the problem, but feels a little crude.
> 
> Better I think would be to track down where the bfd is being opened
> and add a corresponding bfd_cache_close_all() call elsewhere in GDB
> once we've finished doing whatever it is that caused us to open the
> bfd in the first place.
> 
> This second solution felt like the better choice, so I tracked the
> problem down to elf_locate_base and fixed that.  But that just exposed
> another problem in gdb_bfd_map_section which was also re-opening the
> bfd, so I fixed this (with another bfd_cache_close_all() call), and
> that exposed another issue in gdbarch_lookup_osabi... and at this
> point I wondered if I was approaching this problem the wrong way...
> 
> .... And so, I wonder, is there a _better_ way to handle these
> bfd_cache_close_all() calls?
> 
> I see two problems with the current approach:
> 
>   1. It's fragile.  Folk aren't always aware that they need to clear
>   the bfd cache, and this feels like something that is easy to
>   overlook in review.  So adding new code to GDB can innocently touch
>   a bfd, which populates the cache, which will then be a bug that can
>   lie hidden until an on-disk file just happens to change at the wrong
>   time ... and GDB fails to spot the change.  Additionally,
> 
>   2. It's in efficient.  The caching is intended to stop the bfd
>   library from continually having to re-open the on-disk file.  If we
>   have a function that touches a bfd then often that function is the
>   obvious place to call bfd_cache_close_all.  But if a single GDB
>   command calls multiple functions, each of which touch the bfd, then
>   we will end up opening and closing the same on-disk file multiple
>   times.  It feels like we would be better postponing the
>   bfd_cache_close_all call until some later point, then we can benefit
>   from the bfd cache.
> 
> So, in this commit I propose a new approach.  We now clear the bfd
> cache in two places:
> 
>   (a) Just before we display a GDB prompt.  We display a prompt after
>   completing a command, and GDB is about to enter an idle state
>   waiting for further input from the user (or in async mode, for an
>   inferior event).  If while we are in this idle state the user
>   changes the on-disk file(s) then we would like GDB to notice this
>   the next time it leaves its idle state, e.g. the next time the user
>   executes a command, or when an inferior event arrives,
> 
>   (b) When we resume the inferior.  In synchronous mode, resuming the
>   inferior is another time when GDB is blocked and sitting idle, but
>   in this case we don't display a prompt.  As with (a) above, when an
>   inferior event arrives we want GDB to notice any changes to on-disk
>   files.
> 
> Nicely, there are existing observers for both of these
> cases (before_prompt and target_resumed respectively), so, in
> gdb_bfd.c I've hooked into both of these and arranged to clear the bfd
> cache.
> 
> With this commit in place the next commit no longer shows any issues
> with gdb.base/attach.exp.
> 
> One possible problem that I see with this commit is: what if some
> other user of one of the observers I've hooked into causes a bfd to be
> opened after my new observers have run?  If this happened then we
> would proceed with a populated bfd cache, and this might causes
> problems.
> 
> Right now, the only other users of the observers that I'm connecting
> too are the extension languages, specifically, Python.  I suspect it
> must be possible for Python to carry out actions that can cause the
> bfd cache to be populated, so maybe there is a risk here.
> 
> To counter this risk, we could move the bfd_cache_close_all() calls
> out of observers, and move them into GDB core close too, but after the
> two observers in questions are actually called, so into
> top_level_prompt for the before_prompt observer and into
> notify_target_resumed for the target_resumed observer.
> 
> Another choice would be to add a bfd_cache_close_all() into
> gdbpy_enter::~gdbpy_enter, so whenever we call into a Python
> extension, we always clear the bfd cache once we're done.
> 
> For now I haven't made either of these changes, maybe I'm worrying
> about nothing?  I'd appreciate peoples thoughts.

Hi Andrew,

I think there are a couple of places in GDB where observers are not
notified directly, but via a notify_ helper function.  Those functions
ensure that some step are take before/after the observers listeners
execute.

Instances of this are notify_about_to_proceed, notify_about_to_proceed,
notify_normal_stop or notify_user_selected_context_changed (in
infrunc.c).

For this problem, could you replace
"gdb::observers::target_resumed.notify (ptid);" and
"gdb::observers::before_prompt.notify (get_prompt ().c_str ())" calls
with such helper function?  This way, you could ensure that listeners
are all executed before the "bfd_cache_close_all ()" call.

This approach does require that future change do not insert new direct
notify calls to the observers, but at least should solve the problem of
observers being notified in an arbitrary order.

Best,
Lancelot.

> 
> This commit also removes all of the other bfd_cache_close_all() calls
> from GDB.  My claim is that these are no longer needed.
> ---
>  gdb/corefile.c |  5 -----
>  gdb/exec.c     |  2 --
>  gdb/gdb_bfd.c  | 20 ++++++++++++++++++++
>  gdb/infcmd.c   |  1 -
>  gdb/inferior.c |  2 --
>  gdb/symfile.c  |  1 -
>  gdb/target.c   |  5 -----
>  7 files changed, 20 insertions(+), 16 deletions(-)
> 
> diff --git a/gdb/corefile.c b/gdb/corefile.c
> index c27061a3ae3..19a96bc6f86 100644
> --- a/gdb/corefile.c
> +++ b/gdb/corefile.c
> @@ -120,11 +120,6 @@ reopen_exec_file (void)
>        && current_program_space->ebfd_mtime
>        && current_program_space->ebfd_mtime != st.st_mtime)
>      exec_file_attach (filename.c_str (), 0);
> -  else
> -    /* If we accessed the file since last opening it, close it now;
> -       this stops GDB from holding the executable open after it
> -       exits.  */
> -    bfd_cache_close_all ();
>  }
>  \f
>  /* If we have both a core file and an exec file,
> diff --git a/gdb/exec.c b/gdb/exec.c
> index 5956012338f..59965b84d55 100644
> --- a/gdb/exec.c
> +++ b/gdb/exec.c
> @@ -500,8 +500,6 @@ exec_file_attach (const char *filename, int from_tty)
>  	(*deprecated_exec_file_display_hook) (filename);
>      }
>  
> -  bfd_cache_close_all ();
> -
>    /* Are are loading the same executable?  */
>    bfd *prev_bfd = exec_bfd_holder.get ();
>    bfd *curr_bfd = current_program_space->exec_bfd ();
> diff --git a/gdb/gdb_bfd.c b/gdb/gdb_bfd.c
> index 56a4c5ecc91..9b7a06da231 100644
> --- a/gdb/gdb_bfd.c
> +++ b/gdb/gdb_bfd.c
> @@ -33,6 +33,7 @@
>  #include "gdbsupport/fileio.h"
>  #include "inferior.h"
>  #include "cli/cli-style.h"
> +#include "observable.h"
>  #include <unordered_map>
>  
>  /* An object of this type is stored in the section's user data when
> @@ -1171,10 +1172,29 @@ gdb_bfd_error_handler (const char *fmt, va_list ap)
>    (*default_bfd_error_handler) (fmt, ap);
>  }
>  
> +/* A before_prompt observer.  */
> +
> +static void
> +gdb_bfd_before_prompt (const char * /* ignored */)
> +{
> +  bfd_cache_close_all ();
> +}
> +
> +/* A target_resumed observer.  */
> +
> +static void
> +gdb_bfd_target_resumed (ptid_t /* ignored */)
> +{
> +  bfd_cache_close_all ();
> +}
> +
>  void _initialize_gdb_bfd ();
>  void
>  _initialize_gdb_bfd ()
>  {
> +  gdb::observers::target_resumed.attach (gdb_bfd_target_resumed, "gdb-bfd");
> +  gdb::observers::before_prompt.attach (gdb_bfd_before_prompt, "gdb-bfd");
> +
>    all_bfds = htab_create_alloc (10, htab_hash_pointer, htab_eq_pointer,
>  				NULL, xcalloc, xfree);
>  
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index cf8cd527955..5153843dde8 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -2498,7 +2498,6 @@ kill_command (const char *arg, int from_tty)
>    int infnum = current_inferior ()->num;
>  
>    target_kill ();
> -  bfd_cache_close_all ();
>  
>    update_previous_thread ();
>  
> diff --git a/gdb/inferior.c b/gdb/inferior.c
> index 1778723863e..927c5f16ae2 100644
> --- a/gdb/inferior.c
> +++ b/gdb/inferior.c
> @@ -714,8 +714,6 @@ kill_inferior_command (const char *args, int from_tty)
>  
>        target_kill ();
>      }
> -
> -  bfd_cache_close_all ();
>  }
>  
>  /* See inferior.h.  */
> diff --git a/gdb/symfile.c b/gdb/symfile.c
> index eebc5ea44b9..24570372316 100644
> --- a/gdb/symfile.c
> +++ b/gdb/symfile.c
> @@ -1124,7 +1124,6 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name,
>  
>    gdb::observers::new_objfile.notify (objfile);
>  
> -  bfd_cache_close_all ();
>    return objfile;
>  }
>  
> diff --git a/gdb/target.c b/gdb/target.c
> index f688ff33e3b..aeb53dd91d0 100644
> --- a/gdb/target.c
> +++ b/gdb/target.c
> @@ -2729,11 +2729,6 @@ target_mourn_inferior (ptid_t ptid)
>  {
>    gdb_assert (ptid.pid () == inferior_ptid.pid ());
>    current_inferior ()->top_target ()->mourn_inferior ();
> -
> -  /* We no longer need to keep handles on any of the object files.
> -     Make sure to release them to avoid unnecessarily locking any
> -     of them while we're not actually debugging.  */
> -  bfd_cache_close_all ();
>  }
>  
>  /* Look for a target which can describe architectural features, starting
> -- 
> 2.25.4
> 

  reply	other threads:[~2023-10-26  9:06 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 15:08 [PATCH 0/2] BFD cache management And Exec file with target: prefix Andrew Burgess
2023-10-25 15:08 ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-26  9:06   ` Lancelot SIX [this message]
2023-10-27  8:49     ` Andrew Burgess
2023-10-27 19:30   ` Tom Tromey
2023-10-30 10:20     ` Andrew Burgess
2023-10-31 18:28       ` Tom Tromey
2023-11-01 10:46         ` Andrew Burgess
2023-11-12 23:38           ` Tom Tromey
2023-10-25 15:08 ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-10-27 18:39   ` Tom Tromey
2023-10-30 13:41 ` [PATCH 0/2] BFD cache management And Exec file " Andrew Burgess
2023-10-30 13:41   ` [PATCH 1/2] gdb: move all bfd_cache_close_all calls in gdb_bfd.c Andrew Burgess
2023-10-30 13:41   ` [PATCH 2/2] gdb: fix reopen_exec_file for files with target: prefix Andrew Burgess
2023-11-12 23:40   ` [PATCH 0/2] BFD cache management And Exec file " Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231026090601.awjsf5mrc2beyatn@octopus \
    --to=lsix@lancelotsix.com \
    --cc=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).