public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: [PING][PATCH][gdb] Use partial symbol table to find language for main
Date: Mon, 30 Mar 2020 23:20:29 +0200	[thread overview]
Message-ID: <82944a35-6f31-dda2-53ac-25a8708e13e5@suse.de> (raw)
In-Reply-To: <b673518c-eace-0260-54e1-871230122c29@suse.de>

On 17-03-2020 16:25, Tom de Vries wrote:
> On 13-03-2020 14:59, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>> Tom> [gdb] Use partial symbol table to find language for main
>>
>> Tom> When language is set to auto, part of loading an executable is to update the
>> Tom> language accordingly.  This is implemented by set_initial_language.
>>
>> ...
>>
>> Tom> +/* Psymtab version of lookup_symbol_language.  See its definition in
>> Tom> +   the definition of quick_symbol_functions in symfile.h.  */
>> Tom> +
>> Tom> +enum language
>> Tom> +psym_lookup_symbol_language (struct objfile *objfile,
>> Tom> +			     block_enum block_index,
>> Tom> +			     const char *name,
>> Tom> +			     domain_enum domain,
>> Tom> +			     bool *symbol_found_p)
>>
>> This should be static.
>>
> Ack, done.
> 
>> Tom> diff --git a/gdb/psymtab.h b/gdb/psymtab.h
>> Tom> index 040b973927..a8a5d344b0 100644
>> Tom> --- a/gdb/psymtab.h
>> Tom> +++ b/gdb/psymtab.h
>> Tom> @@ -151,4 +151,9 @@ extern const struct quick_symbol_functions dwarf2_debug_names_functions;
>> Tom>  extern psymtab_storage::partial_symtab_range require_partial_symbols
>> Tom>      (struct objfile *objfile, bool verbose);
>>  
>> Tom> +enum language psym_lookup_symbol_language (struct objfile *objfile,
>> Tom> +					   block_enum block_index,
>> Tom> +					   const char *name,
>> Tom> +					   domain_enum domain,
>> Tom> +					   bool *symbol_found_p);
>>
>> ... and not declared here.
>>
> done.
> 
>> Tom> +  /* Check to see if the symbol is defined in a "partial" symbol table
>> Tom> +     of OBJFILE.  BLOCK_INDEX should be either GLOBAL_BLOCK or STATIC_BLOCK,
>> Tom> +     depending on whether we want to search global symbols or static
>> Tom> +     symbols.  NAME is the name of the symbol to look for.  DOMAIN
>> Tom> +     indicates what sort of symbol to search for.
>> Tom> +
>> Tom> +     If found, sets *symbol_found_p to true and returns the symbol language.
>> Tom> +     defined, or NULL if no such symbol table exists.  */
>> Tom> +  enum language (*lookup_symbol_language) (struct objfile *objfile,
>> Tom> +					   block_enum block_index,
>>
>> Does this really need the block index?  It seems to me that main will
>> normally be global.  Also, the code seems to only ever use the global
>> scope.
>>
> It doesn't need to, no. I've hard-coded GLOBAL_BLOCK and added global_
> to the function names.
> 
>> Tom> +/* Find the language for partial symbol with NAME.  */
>> Tom> +
>> Tom> +enum language
>> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
>> Tom> +			    const domain_enum domain)
>>
>> This can be static.
>>
> Done.
> 
>> Tom> diff --git a/gdb/symtab.h b/gdb/symtab.h
>> Tom> index 5fa067b5f4..da8d3011ea 100644
>> Tom> --- a/gdb/symtab.h
>> Tom> +++ b/gdb/symtab.h
>> Tom> @@ -2349,4 +2349,8 @@ private:
>> Tom>    std::vector<bound_minimal_symbol> m_minimal_symbols;
>> Tom>  };
>>  
>> Tom> +enum language
>> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
>> Tom> +			    const domain_enum domain);
>>
>> No need to declare here.
> Done.
> 
> OK for trunk?

Ping.

Thanks,
- Tom

> 
> 
> 0001-gdb-Use-partial-symbol-table-to-find-language-for-main.patch
> 
> [gdb] Use partial symbol table to find language for main
> 
> When language is set to auto, part of loading an executable is to update the
> language accordingly.  This is implemented by set_initial_language.
> 
> The implementation of set_initial_language works as follows:
> - check if any objfile in the progspace has name_of_main/language_of_main
>   set, and if so, use the first one found. [ This is what you get f.i. when
>   using dwarf with DW_AT_main_subprogram. ]
> - otherwise, check for known names in the minimal symbols, and either:
> - use the associated language if any (f.i. for ada), or
> - lookup the symbol in the symtab for the name and use the symbol language
>   (f.i. for c/c++).
> 
> The symbol lookup can be slow though.
> 
> In the case of the cc1 binary from PR23710 comment 1, getting to the initial
> prompt takes ~8s:
> ...
> $ time.sh gdb cc1 -batch -ex "show language"
> The current source language is "auto; currently c++".
> maxmem: 1272260
> real: 8.05
> user: 7.73
> system: 0.38
> ...
> but if we skip guessing the initial language by setting it instead, it takes
> only ~4s:
> ...
> $ time.sh gdb -iex "set language c++" cc1 -batch -ex "show language"
> The current source language is "c++".
> maxmem: 498272
> real: 3.99
> user: 3.90
> system: 0.15
> ...
> 
> In both cases, we load the partial symbols for the executable, but in the
> first case only we also do a lookup of main, which causes the corresponding
> partial symtab to be expanded into a full symtab.
> 
> Ideally, we'd like to get the language of the symbol without triggering
> expansion into a full symtab, and get the speedup without having to set the
> language manually.
> 
> There's a related fixme in the header comment of set_initial_language:
> ...
> /* Set the initial language.
> 
>    FIXME: A better solution would be to record the language in the
>    psymtab when reading partial symbols, and then use it (if known) to
>    set the language.  This would be a win for formats that encode the
>    language in an easily discoverable place, such as DWARF.  For
>    stabs, we can jump through hoops looking for specially named
>    symbols or try to intuit the language from the specific type of
>    stabs we find, but we can't do that until later when we read in
>    full symbols.  */
> 
> void
> set_initial_language (void)
> ...
> 
> Since we're already tracking the language of partial symbols, use this to set
> the language for the main symbol.
> 
> Note that this search in partial symbol tables is not guaranteed to yield the
> same result as the lookup_symbol_in_language call currently done in
> set_initial_language.
> 
> Build and reg-tested on x86_64-linux.
> 
> This requires an update to test-case gdb.base/break-interp.exp to work around
> PR25613 - "[gdb] verbose message appears not at start of line during
> backtrace".
> 
> gdb/ChangeLog:
> 
> 2020-03-04  Tom de Vries  <tdevries@suse.de>
> 
> 	* dwarf2/read.c (dwarf2_gdb_index_functions,
> 	dwarf2_debug_names_functions): Init lookup_global_symbol_language with
> 	NULL.
> 	* psymtab.c (psym_lookup_global_symbol_language): New function.
> 	(psym_functions): Init psym_lookup_global_symbol_language with
> 	psym_lookup_global_symbol_language.
> 	* symfile-debug.c (debug_sym_quick_functions): Init
> 	lookup_global_symbol_language with NULL.
> 	* symfile.c (set_initial_language): Remove fixme comment.
> 	* symfile.h (struct quick_symbol_functions): Add
> 	lookup_global_symbol_language.
> 	* symtab.c (find_quick_global_symbol_language): New function.
> 	(find_main_name): Use find_quick_global_symbol_language.
> 
> gdb/testsuite/ChangeLog:
> 
> 2020-03-04  Tom de Vries  <tdevries@suse.de>
> 
> 	* gdb.base/break-interp.exp: Set verbose back to off.
> 	* gdb.base/c-linkage-name.exp (verify_psymtab_expanded): Move ...
> 	* lib/gdb.exp: ... here.
> 	* gdb.base/main-psymtab.exp: New file.
> 
> ---
>  gdb/dwarf2/read.c                         |  2 ++
>  gdb/psymtab.c                             | 31 ++++++++++++++++++++++++++
>  gdb/symfile-debug.c                       |  1 +
>  gdb/symfile.c                             | 11 +--------
>  gdb/symfile.h                             | 11 +++++++++
>  gdb/symtab.c                              | 37 +++++++++++++++++++++++++++++++
>  gdb/testsuite/gdb.base/c-linkage-name.exp | 13 -----------
>  gdb/testsuite/gdb.base/main-psymtab.exp   | 34 ++++++++++++++++++++++++++++
>  gdb/testsuite/lib/gdb.exp                 | 13 +++++++++++
>  9 files changed, 130 insertions(+), 23 deletions(-)
> 
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 0e879e08a0..699f1a14da 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -4801,6 +4801,7 @@ const struct quick_symbol_functions dwarf2_gdb_index_functions =
>    dw2_forget_cached_source_info,
>    dw2_map_symtabs_matching_filename,
>    dw2_lookup_symbol,
> +  NULL,
>    dw2_print_stats,
>    dw2_dump,
>    dw2_expand_symtabs_for_function,
> @@ -5683,6 +5684,7 @@ const struct quick_symbol_functions dwarf2_debug_names_functions =
>    dw2_forget_cached_source_info,
>    dw2_map_symtabs_matching_filename,
>    dw2_debug_names_lookup_symbol,
> +  NULL,
>    dw2_print_stats,
>    dw2_debug_names_dump,
>    dw2_debug_names_expand_symtabs_for_function,
> diff --git a/gdb/psymtab.c b/gdb/psymtab.c
> index f77f6d5108..3731540cbb 100644
> --- a/gdb/psymtab.c
> +++ b/gdb/psymtab.c
> @@ -517,6 +517,36 @@ psym_lookup_symbol (struct objfile *objfile,
>    return stab_best;
>  }
>  
> +/* Psymtab version of lookup_global_symbol_language.  See its definition in
> +   the definition of quick_symbol_functions in symfile.h.  */
> +
> +static enum language
> +psym_lookup_global_symbol_language (struct objfile *objfile, const char *name,
> +				    domain_enum domain, bool *symbol_found_p)
> +{
> +  *symbol_found_p = false;
> +  if (objfile->sf == NULL)
> +    return language_unknown;
> +
> +  lookup_name_info lookup_name (name, symbol_name_match_type::FULL);
> +
> +  for (partial_symtab *ps : require_partial_symbols (objfile, true))
> +    {
> +      struct partial_symbol *psym;
> +      if (ps->readin_p ())
> +	continue;
> +
> +      psym = lookup_partial_symbol (objfile, ps, name, 1, domain);
> +      if (psym)
> +	{
> +	  *symbol_found_p = true;
> +	  return psym->ginfo.language ();
> +	}
> +    }
> +
> +  return language_unknown;
> +}
> +
>  /* Returns true if PSYM matches LOOKUP_NAME.  */
>  
>  static bool
> @@ -1448,6 +1478,7 @@ const struct quick_symbol_functions psym_functions =
>    psym_forget_cached_source_info,
>    psym_map_symtabs_matching_filename,
>    psym_lookup_symbol,
> +  psym_lookup_global_symbol_language,
>    psym_print_stats,
>    psym_dump,
>    psym_expand_symtabs_for_function,
> diff --git a/gdb/symfile-debug.c b/gdb/symfile-debug.c
> index 53a77a5405..19dc83a8bd 100644
> --- a/gdb/symfile-debug.c
> +++ b/gdb/symfile-debug.c
> @@ -362,6 +362,7 @@ static const struct quick_symbol_functions debug_sym_quick_functions =
>    debug_qf_forget_cached_source_info,
>    debug_qf_map_symtabs_matching_filename,
>    debug_qf_lookup_symbol,
> +  NULL,
>    debug_qf_print_stats,
>    debug_qf_dump,
>    debug_qf_expand_symtabs_for_function,
> diff --git a/gdb/symfile.c b/gdb/symfile.c
> index 3b63887ce1..bd27a1fefe 100644
> --- a/gdb/symfile.c
> +++ b/gdb/symfile.c
> @@ -1667,16 +1667,7 @@ symbol_file_command (const char *args, int from_tty)
>      }
>  }
>  
> -/* Set the initial language.
> -
> -   FIXME: A better solution would be to record the language in the
> -   psymtab when reading partial symbols, and then use it (if known) to
> -   set the language.  This would be a win for formats that encode the
> -   language in an easily discoverable place, such as DWARF.  For
> -   stabs, we can jump through hoops looking for specially named
> -   symbols or try to intuit the language from the specific type of
> -   stabs we find, but we can't do that until later when we read in
> -   full symbols.  */
> +/* Set the initial language.  */
>  
>  void
>  set_initial_language (void)
> diff --git a/gdb/symfile.h b/gdb/symfile.h
> index 85f8e7c155..84fa283e85 100644
> --- a/gdb/symfile.h
> +++ b/gdb/symfile.h
> @@ -183,6 +183,17 @@ struct quick_symbol_functions
>  					    const char *name,
>  					    domain_enum domain);
>  
> +  /* Check to see if the global symbol is defined in a "partial" symbol table
> +     of OBJFILE. NAME is the name of the symbol to look for.  DOMAIN
> +     indicates what sort of symbol to search for.
> +
> +     If found, sets *symbol_found_p to true and returns the symbol language.
> +     defined, or NULL if no such symbol table exists.  */
> +  enum language (*lookup_global_symbol_language) (struct objfile *objfile,
> +						  const char *name,
> +						  domain_enum domain,
> +						  bool *symbol_found_p);
> +
>    /* Print statistics about any indices loaded for OBJFILE.  The
>       statistics should be printed to gdb_stdout.  This is used for
>       "maint print statistics".  */
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index aa415a9399..7303e3c9d9 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -2560,6 +2560,33 @@ lookup_symbol_in_objfile (struct objfile *objfile, enum block_enum block_index,
>    return result;
>  }
>  
> +/* Find the language for partial symbol with NAME.  */
> +
> +static enum language
> +find_quick_global_symbol_language (const char *name, const domain_enum domain)
> +{
> +  for (objfile *objfile : current_program_space->objfiles ())
> +    {
> +      if (objfile->sf && objfile->sf->qf
> +	  && objfile->sf->qf->lookup_global_symbol_language)
> +	continue;
> +      return language_unknown;
> +    }
> +
> +  for (objfile *objfile : current_program_space->objfiles ())
> +    {
> +      bool symbol_found_p;
> +      enum language lang
> +	= objfile->sf->qf->lookup_global_symbol_language (objfile, name, domain,
> +							  &symbol_found_p);
> +      if (!symbol_found_p)
> +	continue;
> +      return lang;
> +    }
> +
> +  return language_unknown;
> +}
> +
>  /* Private data to be used with lookup_symbol_global_iterator_cb.  */
>  
>  struct global_or_static_sym_lookup_data
> @@ -6123,6 +6150,16 @@ find_main_name (void)
>  
>    /* The languages above didn't identify the name of the main procedure.
>       Fallback to "main".  */
> +
> +  /* Try to find language for main in psymtabs.  */
> +  enum language lang
> +    = find_quick_global_symbol_language ("main", VAR_DOMAIN);
> +  if (lang != language_unknown)
> +    {
> +      set_main_name ("main", lang);
> +      return;
> +    }
> +
>    set_main_name ("main", language_unknown);
>  }
>  
> diff --git a/gdb/testsuite/gdb.base/c-linkage-name.exp b/gdb/testsuite/gdb.base/c-linkage-name.exp
> index 4551c8dcac..87036aa1ff 100644
> --- a/gdb/testsuite/gdb.base/c-linkage-name.exp
> +++ b/gdb/testsuite/gdb.base/c-linkage-name.exp
> @@ -27,19 +27,6 @@ if { [gdb_compile "${sources}" "${binfile}" executable {debug}] != "" } {
>  
>  clean_restart ${binfile}
>  
> -# Verify that partial symtab expansion for $filename has state $readin
> -
> -proc verify_psymtab_expanded { filename readin } {
> -    set cmd "maint info psymtab"
> -    set test "$cmd: $filename: $readin"
> -    set re [multi_line \
> -		"  \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
> -		"    readin $readin" \
> -		".*"]
> -
> -    gdb_test $cmd $re $test
> -}
> -
>  # Verify that partial symtab expansion has not taken place for
>  # c-linkage-name-2.c.
>  
> diff --git a/gdb/testsuite/gdb.base/main-psymtab.exp b/gdb/testsuite/gdb.base/main-psymtab.exp
> new file mode 100644
> index 0000000000..72596f632b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/main-psymtab.exp
> @@ -0,0 +1,34 @@
> +# Copyright 2020 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +standard_testfile persistent-lang.cc
> +
> +if {[build_executable "failed to prepare" $testfile $srcfile debug]} {
> +    return -1
> +}
> +
> +clean_restart
> +
> +set auto_cpp \
> +    {The current source language is "auto; currently c\+\+"\.}
> +
> +gdb_load ${binfile}
> +gdb_test "show language" $auto_cpp \
> +    "language auto/c++ after load"
> +
> +# Verify that partial symtab expansion has not taken place for
> +# persistent-lang.cc
> +
> +verify_psymtab_expanded persistent-lang.cc no
> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index b14b3a968e..a66d738713 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -6958,5 +6958,18 @@ gdb_caching_proc supports_statement_frontiers {
>      } executable "additional_flags=-gstatement-frontiers"]
>  }
>  
> +# Verify that partial symtab expansion for $filename has state $readin.
> +
> +proc verify_psymtab_expanded { filename readin } {
> +    set cmd "maint info psymtab"
> +    set test "$cmd: $filename: $readin"
> +    set re [multi_line \
> +		"  \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
> +		"    readin $readin" \
> +		".*"]
> +
> +    gdb_test $cmd $re $test
> +}
> +
>  # Always load compatibility stuff.
>  load_lib future.exp
> 

  reply	other threads:[~2020-03-30 21:20 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-01 23:18 [RFC][gdb] " Tom de Vries
2020-03-03 15:48 ` Tom Tromey
2020-03-04 16:07   ` Tom de Vries
2020-03-13 13:59     ` Tom Tromey
2020-03-17 15:25       ` [PATCH][gdb] " Tom de Vries
2020-03-30 21:20         ` Tom de Vries [this message]
2020-04-01 20:07         ` Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=82944a35-6f31-dda2-53ac-25a8708e13e5@suse.de \
    --to=tdevries@suse.de \
    --cc=gdb-patches@sourceware.org \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).