From: Tom de Vries <tdevries@suse.de>
To: Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: [PING][PATCH][gdb] Use partial symbol table to find language for main
Date: Mon, 30 Mar 2020 23:20:29 +0200 [thread overview]
Message-ID: <82944a35-6f31-dda2-53ac-25a8708e13e5@suse.de> (raw)
In-Reply-To: <b673518c-eace-0260-54e1-871230122c29@suse.de>
On 17-03-2020 16:25, Tom de Vries wrote:
> On 13-03-2020 14:59, Tom Tromey wrote:
>>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
>> Tom> [gdb] Use partial symbol table to find language for main
>>
>> Tom> When language is set to auto, part of loading an executable is to update the
>> Tom> language accordingly. This is implemented by set_initial_language.
>>
>> ...
>>
>> Tom> +/* Psymtab version of lookup_symbol_language. See its definition in
>> Tom> + the definition of quick_symbol_functions in symfile.h. */
>> Tom> +
>> Tom> +enum language
>> Tom> +psym_lookup_symbol_language (struct objfile *objfile,
>> Tom> + block_enum block_index,
>> Tom> + const char *name,
>> Tom> + domain_enum domain,
>> Tom> + bool *symbol_found_p)
>>
>> This should be static.
>>
> Ack, done.
>
>> Tom> diff --git a/gdb/psymtab.h b/gdb/psymtab.h
>> Tom> index 040b973927..a8a5d344b0 100644
>> Tom> --- a/gdb/psymtab.h
>> Tom> +++ b/gdb/psymtab.h
>> Tom> @@ -151,4 +151,9 @@ extern const struct quick_symbol_functions dwarf2_debug_names_functions;
>> Tom> extern psymtab_storage::partial_symtab_range require_partial_symbols
>> Tom> (struct objfile *objfile, bool verbose);
>>
>> Tom> +enum language psym_lookup_symbol_language (struct objfile *objfile,
>> Tom> + block_enum block_index,
>> Tom> + const char *name,
>> Tom> + domain_enum domain,
>> Tom> + bool *symbol_found_p);
>>
>> ... and not declared here.
>>
> done.
>
>> Tom> + /* Check to see if the symbol is defined in a "partial" symbol table
>> Tom> + of OBJFILE. BLOCK_INDEX should be either GLOBAL_BLOCK or STATIC_BLOCK,
>> Tom> + depending on whether we want to search global symbols or static
>> Tom> + symbols. NAME is the name of the symbol to look for. DOMAIN
>> Tom> + indicates what sort of symbol to search for.
>> Tom> +
>> Tom> + If found, sets *symbol_found_p to true and returns the symbol language.
>> Tom> + defined, or NULL if no such symbol table exists. */
>> Tom> + enum language (*lookup_symbol_language) (struct objfile *objfile,
>> Tom> + block_enum block_index,
>>
>> Does this really need the block index? It seems to me that main will
>> normally be global. Also, the code seems to only ever use the global
>> scope.
>>
> It doesn't need to, no. I've hard-coded GLOBAL_BLOCK and added global_
> to the function names.
>
>> Tom> +/* Find the language for partial symbol with NAME. */
>> Tom> +
>> Tom> +enum language
>> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
>> Tom> + const domain_enum domain)
>>
>> This can be static.
>>
> Done.
>
>> Tom> diff --git a/gdb/symtab.h b/gdb/symtab.h
>> Tom> index 5fa067b5f4..da8d3011ea 100644
>> Tom> --- a/gdb/symtab.h
>> Tom> +++ b/gdb/symtab.h
>> Tom> @@ -2349,4 +2349,8 @@ private:
>> Tom> std::vector<bound_minimal_symbol> m_minimal_symbols;
>> Tom> };
>>
>> Tom> +enum language
>> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
>> Tom> + const domain_enum domain);
>>
>> No need to declare here.
> Done.
>
> OK for trunk?
Ping.
Thanks,
- Tom
>
>
> 0001-gdb-Use-partial-symbol-table-to-find-language-for-main.patch
>
> [gdb] Use partial symbol table to find language for main
>
> When language is set to auto, part of loading an executable is to update the
> language accordingly. This is implemented by set_initial_language.
>
> The implementation of set_initial_language works as follows:
> - check if any objfile in the progspace has name_of_main/language_of_main
> set, and if so, use the first one found. [ This is what you get f.i. when
> using dwarf with DW_AT_main_subprogram. ]
> - otherwise, check for known names in the minimal symbols, and either:
> - use the associated language if any (f.i. for ada), or
> - lookup the symbol in the symtab for the name and use the symbol language
> (f.i. for c/c++).
>
> The symbol lookup can be slow though.
>
> In the case of the cc1 binary from PR23710 comment 1, getting to the initial
> prompt takes ~8s:
> ...
> $ time.sh gdb cc1 -batch -ex "show language"
> The current source language is "auto; currently c++".
> maxmem: 1272260
> real: 8.05
> user: 7.73
> system: 0.38
> ...
> but if we skip guessing the initial language by setting it instead, it takes
> only ~4s:
> ...
> $ time.sh gdb -iex "set language c++" cc1 -batch -ex "show language"
> The current source language is "c++".
> maxmem: 498272
> real: 3.99
> user: 3.90
> system: 0.15
> ...
>
> In both cases, we load the partial symbols for the executable, but in the
> first case only we also do a lookup of main, which causes the corresponding
> partial symtab to be expanded into a full symtab.
>
> Ideally, we'd like to get the language of the symbol without triggering
> expansion into a full symtab, and get the speedup without having to set the
> language manually.
>
> There's a related fixme in the header comment of set_initial_language:
> ...
> /* Set the initial language.
>
> FIXME: A better solution would be to record the language in the
> psymtab when reading partial symbols, and then use it (if known) to
> set the language. This would be a win for formats that encode the
> language in an easily discoverable place, such as DWARF. For
> stabs, we can jump through hoops looking for specially named
> symbols or try to intuit the language from the specific type of
> stabs we find, but we can't do that until later when we read in
> full symbols. */
>
> void
> set_initial_language (void)
> ...
>
> Since we're already tracking the language of partial symbols, use this to set
> the language for the main symbol.
>
> Note that this search in partial symbol tables is not guaranteed to yield the
> same result as the lookup_symbol_in_language call currently done in
> set_initial_language.
>
> Build and reg-tested on x86_64-linux.
>
> This requires an update to test-case gdb.base/break-interp.exp to work around
> PR25613 - "[gdb] verbose message appears not at start of line during
> backtrace".
>
> gdb/ChangeLog:
>
> 2020-03-04 Tom de Vries <tdevries@suse.de>
>
> * dwarf2/read.c (dwarf2_gdb_index_functions,
> dwarf2_debug_names_functions): Init lookup_global_symbol_language with
> NULL.
> * psymtab.c (psym_lookup_global_symbol_language): New function.
> (psym_functions): Init psym_lookup_global_symbol_language with
> psym_lookup_global_symbol_language.
> * symfile-debug.c (debug_sym_quick_functions): Init
> lookup_global_symbol_language with NULL.
> * symfile.c (set_initial_language): Remove fixme comment.
> * symfile.h (struct quick_symbol_functions): Add
> lookup_global_symbol_language.
> * symtab.c (find_quick_global_symbol_language): New function.
> (find_main_name): Use find_quick_global_symbol_language.
>
> gdb/testsuite/ChangeLog:
>
> 2020-03-04 Tom de Vries <tdevries@suse.de>
>
> * gdb.base/break-interp.exp: Set verbose back to off.
> * gdb.base/c-linkage-name.exp (verify_psymtab_expanded): Move ...
> * lib/gdb.exp: ... here.
> * gdb.base/main-psymtab.exp: New file.
>
> ---
> gdb/dwarf2/read.c | 2 ++
> gdb/psymtab.c | 31 ++++++++++++++++++++++++++
> gdb/symfile-debug.c | 1 +
> gdb/symfile.c | 11 +--------
> gdb/symfile.h | 11 +++++++++
> gdb/symtab.c | 37 +++++++++++++++++++++++++++++++
> gdb/testsuite/gdb.base/c-linkage-name.exp | 13 -----------
> gdb/testsuite/gdb.base/main-psymtab.exp | 34 ++++++++++++++++++++++++++++
> gdb/testsuite/lib/gdb.exp | 13 +++++++++++
> 9 files changed, 130 insertions(+), 23 deletions(-)
>
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 0e879e08a0..699f1a14da 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -4801,6 +4801,7 @@ const struct quick_symbol_functions dwarf2_gdb_index_functions =
> dw2_forget_cached_source_info,
> dw2_map_symtabs_matching_filename,
> dw2_lookup_symbol,
> + NULL,
> dw2_print_stats,
> dw2_dump,
> dw2_expand_symtabs_for_function,
> @@ -5683,6 +5684,7 @@ const struct quick_symbol_functions dwarf2_debug_names_functions =
> dw2_forget_cached_source_info,
> dw2_map_symtabs_matching_filename,
> dw2_debug_names_lookup_symbol,
> + NULL,
> dw2_print_stats,
> dw2_debug_names_dump,
> dw2_debug_names_expand_symtabs_for_function,
> diff --git a/gdb/psymtab.c b/gdb/psymtab.c
> index f77f6d5108..3731540cbb 100644
> --- a/gdb/psymtab.c
> +++ b/gdb/psymtab.c
> @@ -517,6 +517,36 @@ psym_lookup_symbol (struct objfile *objfile,
> return stab_best;
> }
>
> +/* Psymtab version of lookup_global_symbol_language. See its definition in
> + the definition of quick_symbol_functions in symfile.h. */
> +
> +static enum language
> +psym_lookup_global_symbol_language (struct objfile *objfile, const char *name,
> + domain_enum domain, bool *symbol_found_p)
> +{
> + *symbol_found_p = false;
> + if (objfile->sf == NULL)
> + return language_unknown;
> +
> + lookup_name_info lookup_name (name, symbol_name_match_type::FULL);
> +
> + for (partial_symtab *ps : require_partial_symbols (objfile, true))
> + {
> + struct partial_symbol *psym;
> + if (ps->readin_p ())
> + continue;
> +
> + psym = lookup_partial_symbol (objfile, ps, name, 1, domain);
> + if (psym)
> + {
> + *symbol_found_p = true;
> + return psym->ginfo.language ();
> + }
> + }
> +
> + return language_unknown;
> +}
> +
> /* Returns true if PSYM matches LOOKUP_NAME. */
>
> static bool
> @@ -1448,6 +1478,7 @@ const struct quick_symbol_functions psym_functions =
> psym_forget_cached_source_info,
> psym_map_symtabs_matching_filename,
> psym_lookup_symbol,
> + psym_lookup_global_symbol_language,
> psym_print_stats,
> psym_dump,
> psym_expand_symtabs_for_function,
> diff --git a/gdb/symfile-debug.c b/gdb/symfile-debug.c
> index 53a77a5405..19dc83a8bd 100644
> --- a/gdb/symfile-debug.c
> +++ b/gdb/symfile-debug.c
> @@ -362,6 +362,7 @@ static const struct quick_symbol_functions debug_sym_quick_functions =
> debug_qf_forget_cached_source_info,
> debug_qf_map_symtabs_matching_filename,
> debug_qf_lookup_symbol,
> + NULL,
> debug_qf_print_stats,
> debug_qf_dump,
> debug_qf_expand_symtabs_for_function,
> diff --git a/gdb/symfile.c b/gdb/symfile.c
> index 3b63887ce1..bd27a1fefe 100644
> --- a/gdb/symfile.c
> +++ b/gdb/symfile.c
> @@ -1667,16 +1667,7 @@ symbol_file_command (const char *args, int from_tty)
> }
> }
>
> -/* Set the initial language.
> -
> - FIXME: A better solution would be to record the language in the
> - psymtab when reading partial symbols, and then use it (if known) to
> - set the language. This would be a win for formats that encode the
> - language in an easily discoverable place, such as DWARF. For
> - stabs, we can jump through hoops looking for specially named
> - symbols or try to intuit the language from the specific type of
> - stabs we find, but we can't do that until later when we read in
> - full symbols. */
> +/* Set the initial language. */
>
> void
> set_initial_language (void)
> diff --git a/gdb/symfile.h b/gdb/symfile.h
> index 85f8e7c155..84fa283e85 100644
> --- a/gdb/symfile.h
> +++ b/gdb/symfile.h
> @@ -183,6 +183,17 @@ struct quick_symbol_functions
> const char *name,
> domain_enum domain);
>
> + /* Check to see if the global symbol is defined in a "partial" symbol table
> + of OBJFILE. NAME is the name of the symbol to look for. DOMAIN
> + indicates what sort of symbol to search for.
> +
> + If found, sets *symbol_found_p to true and returns the symbol language.
> + defined, or NULL if no such symbol table exists. */
> + enum language (*lookup_global_symbol_language) (struct objfile *objfile,
> + const char *name,
> + domain_enum domain,
> + bool *symbol_found_p);
> +
> /* Print statistics about any indices loaded for OBJFILE. The
> statistics should be printed to gdb_stdout. This is used for
> "maint print statistics". */
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index aa415a9399..7303e3c9d9 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -2560,6 +2560,33 @@ lookup_symbol_in_objfile (struct objfile *objfile, enum block_enum block_index,
> return result;
> }
>
> +/* Find the language for partial symbol with NAME. */
> +
> +static enum language
> +find_quick_global_symbol_language (const char *name, const domain_enum domain)
> +{
> + for (objfile *objfile : current_program_space->objfiles ())
> + {
> + if (objfile->sf && objfile->sf->qf
> + && objfile->sf->qf->lookup_global_symbol_language)
> + continue;
> + return language_unknown;
> + }
> +
> + for (objfile *objfile : current_program_space->objfiles ())
> + {
> + bool symbol_found_p;
> + enum language lang
> + = objfile->sf->qf->lookup_global_symbol_language (objfile, name, domain,
> + &symbol_found_p);
> + if (!symbol_found_p)
> + continue;
> + return lang;
> + }
> +
> + return language_unknown;
> +}
> +
> /* Private data to be used with lookup_symbol_global_iterator_cb. */
>
> struct global_or_static_sym_lookup_data
> @@ -6123,6 +6150,16 @@ find_main_name (void)
>
> /* The languages above didn't identify the name of the main procedure.
> Fallback to "main". */
> +
> + /* Try to find language for main in psymtabs. */
> + enum language lang
> + = find_quick_global_symbol_language ("main", VAR_DOMAIN);
> + if (lang != language_unknown)
> + {
> + set_main_name ("main", lang);
> + return;
> + }
> +
> set_main_name ("main", language_unknown);
> }
>
> diff --git a/gdb/testsuite/gdb.base/c-linkage-name.exp b/gdb/testsuite/gdb.base/c-linkage-name.exp
> index 4551c8dcac..87036aa1ff 100644
> --- a/gdb/testsuite/gdb.base/c-linkage-name.exp
> +++ b/gdb/testsuite/gdb.base/c-linkage-name.exp
> @@ -27,19 +27,6 @@ if { [gdb_compile "${sources}" "${binfile}" executable {debug}] != "" } {
>
> clean_restart ${binfile}
>
> -# Verify that partial symtab expansion for $filename has state $readin
> -
> -proc verify_psymtab_expanded { filename readin } {
> - set cmd "maint info psymtab"
> - set test "$cmd: $filename: $readin"
> - set re [multi_line \
> - " \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
> - " readin $readin" \
> - ".*"]
> -
> - gdb_test $cmd $re $test
> -}
> -
> # Verify that partial symtab expansion has not taken place for
> # c-linkage-name-2.c.
>
> diff --git a/gdb/testsuite/gdb.base/main-psymtab.exp b/gdb/testsuite/gdb.base/main-psymtab.exp
> new file mode 100644
> index 0000000000..72596f632b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/main-psymtab.exp
> @@ -0,0 +1,34 @@
> +# Copyright 2020 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +standard_testfile persistent-lang.cc
> +
> +if {[build_executable "failed to prepare" $testfile $srcfile debug]} {
> + return -1
> +}
> +
> +clean_restart
> +
> +set auto_cpp \
> + {The current source language is "auto; currently c\+\+"\.}
> +
> +gdb_load ${binfile}
> +gdb_test "show language" $auto_cpp \
> + "language auto/c++ after load"
> +
> +# Verify that partial symtab expansion has not taken place for
> +# persistent-lang.cc
> +
> +verify_psymtab_expanded persistent-lang.cc no
> diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
> index b14b3a968e..a66d738713 100644
> --- a/gdb/testsuite/lib/gdb.exp
> +++ b/gdb/testsuite/lib/gdb.exp
> @@ -6958,5 +6958,18 @@ gdb_caching_proc supports_statement_frontiers {
> } executable "additional_flags=-gstatement-frontiers"]
> }
>
> +# Verify that partial symtab expansion for $filename has state $readin.
> +
> +proc verify_psymtab_expanded { filename readin } {
> + set cmd "maint info psymtab"
> + set test "$cmd: $filename: $readin"
> + set re [multi_line \
> + " \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
> + " readin $readin" \
> + ".*"]
> +
> + gdb_test $cmd $re $test
> +}
> +
> # Always load compatibility stuff.
> load_lib future.exp
>
next prev parent reply other threads:[~2020-03-30 21:20 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-01 23:18 [RFC][gdb] " Tom de Vries
2020-03-03 15:48 ` Tom Tromey
2020-03-04 16:07 ` Tom de Vries
2020-03-13 13:59 ` Tom Tromey
2020-03-17 15:25 ` [PATCH][gdb] " Tom de Vries
2020-03-30 21:20 ` Tom de Vries [this message]
2020-04-01 20:07 ` Tom Tromey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=82944a35-6f31-dda2-53ac-25a8708e13e5@suse.de \
--to=tdevries@suse.de \
--cc=gdb-patches@sourceware.org \
--cc=tom@tromey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).