public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: [PATCH][gdb] Use partial symbol table to find language for main
Date: Tue, 17 Mar 2020 16:25:35 +0100	[thread overview]
Message-ID: <b673518c-eace-0260-54e1-871230122c29@suse.de> (raw)
In-Reply-To: <87mu8kfj9d.fsf@tromey.com>

[-- Attachment #1: Type: text/plain, Size: 3041 bytes --]

On 13-03-2020 14:59, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> Tom> [gdb] Use partial symbol table to find language for main
> 
> Tom> When language is set to auto, part of loading an executable is to update the
> Tom> language accordingly.  This is implemented by set_initial_language.
> 
> ...
> 
> Tom> +/* Psymtab version of lookup_symbol_language.  See its definition in
> Tom> +   the definition of quick_symbol_functions in symfile.h.  */
> Tom> +
> Tom> +enum language
> Tom> +psym_lookup_symbol_language (struct objfile *objfile,
> Tom> +			     block_enum block_index,
> Tom> +			     const char *name,
> Tom> +			     domain_enum domain,
> Tom> +			     bool *symbol_found_p)
> 
> This should be static.
> 

Ack, done.

> Tom> diff --git a/gdb/psymtab.h b/gdb/psymtab.h
> Tom> index 040b973927..a8a5d344b0 100644
> Tom> --- a/gdb/psymtab.h
> Tom> +++ b/gdb/psymtab.h
> Tom> @@ -151,4 +151,9 @@ extern const struct quick_symbol_functions dwarf2_debug_names_functions;
> Tom>  extern psymtab_storage::partial_symtab_range require_partial_symbols
> Tom>      (struct objfile *objfile, bool verbose);
>  
> Tom> +enum language psym_lookup_symbol_language (struct objfile *objfile,
> Tom> +					   block_enum block_index,
> Tom> +					   const char *name,
> Tom> +					   domain_enum domain,
> Tom> +					   bool *symbol_found_p);
> 
> ... and not declared here.
> 

done.

> Tom> +  /* Check to see if the symbol is defined in a "partial" symbol table
> Tom> +     of OBJFILE.  BLOCK_INDEX should be either GLOBAL_BLOCK or STATIC_BLOCK,
> Tom> +     depending on whether we want to search global symbols or static
> Tom> +     symbols.  NAME is the name of the symbol to look for.  DOMAIN
> Tom> +     indicates what sort of symbol to search for.
> Tom> +
> Tom> +     If found, sets *symbol_found_p to true and returns the symbol language.
> Tom> +     defined, or NULL if no such symbol table exists.  */
> Tom> +  enum language (*lookup_symbol_language) (struct objfile *objfile,
> Tom> +					   block_enum block_index,
> 
> Does this really need the block index?  It seems to me that main will
> normally be global.  Also, the code seems to only ever use the global
> scope.
> 

It doesn't need to, no. I've hard-coded GLOBAL_BLOCK and added global_
to the function names.

> Tom> +/* Find the language for partial symbol with NAME.  */
> Tom> +
> Tom> +enum language
> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
> Tom> +			    const domain_enum domain)
> 
> This can be static.
> 

Done.

> Tom> diff --git a/gdb/symtab.h b/gdb/symtab.h
> Tom> index 5fa067b5f4..da8d3011ea 100644
> Tom> --- a/gdb/symtab.h
> Tom> +++ b/gdb/symtab.h
> Tom> @@ -2349,4 +2349,8 @@ private:
> Tom>    std::vector<bound_minimal_symbol> m_minimal_symbols;
> Tom>  };
>  
> Tom> +enum language
> Tom> +find_quick_symbol_language (block_enum block_index, const char *name,
> Tom> +			    const domain_enum domain);
> 
> No need to declare here.

Done.

OK for trunk?

Thanks,
- Tom



[-- Attachment #2: 0001-gdb-Use-partial-symbol-table-to-find-language-for-main.patch --]
[-- Type: text/x-patch, Size: 12919 bytes --]

[gdb] Use partial symbol table to find language for main

When language is set to auto, part of loading an executable is to update the
language accordingly.  This is implemented by set_initial_language.

The implementation of set_initial_language works as follows:
- check if any objfile in the progspace has name_of_main/language_of_main
  set, and if so, use the first one found. [ This is what you get f.i. when
  using dwarf with DW_AT_main_subprogram. ]
- otherwise, check for known names in the minimal symbols, and either:
- use the associated language if any (f.i. for ada), or
- lookup the symbol in the symtab for the name and use the symbol language
  (f.i. for c/c++).

The symbol lookup can be slow though.

In the case of the cc1 binary from PR23710 comment 1, getting to the initial
prompt takes ~8s:
...
$ time.sh gdb cc1 -batch -ex "show language"
The current source language is "auto; currently c++".
maxmem: 1272260
real: 8.05
user: 7.73
system: 0.38
...
but if we skip guessing the initial language by setting it instead, it takes
only ~4s:
...
$ time.sh gdb -iex "set language c++" cc1 -batch -ex "show language"
The current source language is "c++".
maxmem: 498272
real: 3.99
user: 3.90
system: 0.15
...

In both cases, we load the partial symbols for the executable, but in the
first case only we also do a lookup of main, which causes the corresponding
partial symtab to be expanded into a full symtab.

Ideally, we'd like to get the language of the symbol without triggering
expansion into a full symtab, and get the speedup without having to set the
language manually.

There's a related fixme in the header comment of set_initial_language:
...
/* Set the initial language.

   FIXME: A better solution would be to record the language in the
   psymtab when reading partial symbols, and then use it (if known) to
   set the language.  This would be a win for formats that encode the
   language in an easily discoverable place, such as DWARF.  For
   stabs, we can jump through hoops looking for specially named
   symbols or try to intuit the language from the specific type of
   stabs we find, but we can't do that until later when we read in
   full symbols.  */

void
set_initial_language (void)
...

Since we're already tracking the language of partial symbols, use this to set
the language for the main symbol.

Note that this search in partial symbol tables is not guaranteed to yield the
same result as the lookup_symbol_in_language call currently done in
set_initial_language.

Build and reg-tested on x86_64-linux.

This requires an update to test-case gdb.base/break-interp.exp to work around
PR25613 - "[gdb] verbose message appears not at start of line during
backtrace".

gdb/ChangeLog:

2020-03-04  Tom de Vries  <tdevries@suse.de>

	* dwarf2/read.c (dwarf2_gdb_index_functions,
	dwarf2_debug_names_functions): Init lookup_global_symbol_language with
	NULL.
	* psymtab.c (psym_lookup_global_symbol_language): New function.
	(psym_functions): Init psym_lookup_global_symbol_language with
	psym_lookup_global_symbol_language.
	* symfile-debug.c (debug_sym_quick_functions): Init
	lookup_global_symbol_language with NULL.
	* symfile.c (set_initial_language): Remove fixme comment.
	* symfile.h (struct quick_symbol_functions): Add
	lookup_global_symbol_language.
	* symtab.c (find_quick_global_symbol_language): New function.
	(find_main_name): Use find_quick_global_symbol_language.

gdb/testsuite/ChangeLog:

2020-03-04  Tom de Vries  <tdevries@suse.de>

	* gdb.base/break-interp.exp: Set verbose back to off.
	* gdb.base/c-linkage-name.exp (verify_psymtab_expanded): Move ...
	* lib/gdb.exp: ... here.
	* gdb.base/main-psymtab.exp: New file.

---
 gdb/dwarf2/read.c                         |  2 ++
 gdb/psymtab.c                             | 31 ++++++++++++++++++++++++++
 gdb/symfile-debug.c                       |  1 +
 gdb/symfile.c                             | 11 +--------
 gdb/symfile.h                             | 11 +++++++++
 gdb/symtab.c                              | 37 +++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.base/c-linkage-name.exp | 13 -----------
 gdb/testsuite/gdb.base/main-psymtab.exp   | 34 ++++++++++++++++++++++++++++
 gdb/testsuite/lib/gdb.exp                 | 13 +++++++++++
 9 files changed, 130 insertions(+), 23 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 0e879e08a0..699f1a14da 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4801,6 +4801,7 @@ const struct quick_symbol_functions dwarf2_gdb_index_functions =
   dw2_forget_cached_source_info,
   dw2_map_symtabs_matching_filename,
   dw2_lookup_symbol,
+  NULL,
   dw2_print_stats,
   dw2_dump,
   dw2_expand_symtabs_for_function,
@@ -5683,6 +5684,7 @@ const struct quick_symbol_functions dwarf2_debug_names_functions =
   dw2_forget_cached_source_info,
   dw2_map_symtabs_matching_filename,
   dw2_debug_names_lookup_symbol,
+  NULL,
   dw2_print_stats,
   dw2_debug_names_dump,
   dw2_debug_names_expand_symtabs_for_function,
diff --git a/gdb/psymtab.c b/gdb/psymtab.c
index f77f6d5108..3731540cbb 100644
--- a/gdb/psymtab.c
+++ b/gdb/psymtab.c
@@ -517,6 +517,36 @@ psym_lookup_symbol (struct objfile *objfile,
   return stab_best;
 }
 
+/* Psymtab version of lookup_global_symbol_language.  See its definition in
+   the definition of quick_symbol_functions in symfile.h.  */
+
+static enum language
+psym_lookup_global_symbol_language (struct objfile *objfile, const char *name,
+				    domain_enum domain, bool *symbol_found_p)
+{
+  *symbol_found_p = false;
+  if (objfile->sf == NULL)
+    return language_unknown;
+
+  lookup_name_info lookup_name (name, symbol_name_match_type::FULL);
+
+  for (partial_symtab *ps : require_partial_symbols (objfile, true))
+    {
+      struct partial_symbol *psym;
+      if (ps->readin_p ())
+	continue;
+
+      psym = lookup_partial_symbol (objfile, ps, name, 1, domain);
+      if (psym)
+	{
+	  *symbol_found_p = true;
+	  return psym->ginfo.language ();
+	}
+    }
+
+  return language_unknown;
+}
+
 /* Returns true if PSYM matches LOOKUP_NAME.  */
 
 static bool
@@ -1448,6 +1478,7 @@ const struct quick_symbol_functions psym_functions =
   psym_forget_cached_source_info,
   psym_map_symtabs_matching_filename,
   psym_lookup_symbol,
+  psym_lookup_global_symbol_language,
   psym_print_stats,
   psym_dump,
   psym_expand_symtabs_for_function,
diff --git a/gdb/symfile-debug.c b/gdb/symfile-debug.c
index 53a77a5405..19dc83a8bd 100644
--- a/gdb/symfile-debug.c
+++ b/gdb/symfile-debug.c
@@ -362,6 +362,7 @@ static const struct quick_symbol_functions debug_sym_quick_functions =
   debug_qf_forget_cached_source_info,
   debug_qf_map_symtabs_matching_filename,
   debug_qf_lookup_symbol,
+  NULL,
   debug_qf_print_stats,
   debug_qf_dump,
   debug_qf_expand_symtabs_for_function,
diff --git a/gdb/symfile.c b/gdb/symfile.c
index 3b63887ce1..bd27a1fefe 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -1667,16 +1667,7 @@ symbol_file_command (const char *args, int from_tty)
     }
 }
 
-/* Set the initial language.
-
-   FIXME: A better solution would be to record the language in the
-   psymtab when reading partial symbols, and then use it (if known) to
-   set the language.  This would be a win for formats that encode the
-   language in an easily discoverable place, such as DWARF.  For
-   stabs, we can jump through hoops looking for specially named
-   symbols or try to intuit the language from the specific type of
-   stabs we find, but we can't do that until later when we read in
-   full symbols.  */
+/* Set the initial language.  */
 
 void
 set_initial_language (void)
diff --git a/gdb/symfile.h b/gdb/symfile.h
index 85f8e7c155..84fa283e85 100644
--- a/gdb/symfile.h
+++ b/gdb/symfile.h
@@ -183,6 +183,17 @@ struct quick_symbol_functions
 					    const char *name,
 					    domain_enum domain);
 
+  /* Check to see if the global symbol is defined in a "partial" symbol table
+     of OBJFILE. NAME is the name of the symbol to look for.  DOMAIN
+     indicates what sort of symbol to search for.
+
+     If found, sets *symbol_found_p to true and returns the symbol language.
+     defined, or NULL if no such symbol table exists.  */
+  enum language (*lookup_global_symbol_language) (struct objfile *objfile,
+						  const char *name,
+						  domain_enum domain,
+						  bool *symbol_found_p);
+
   /* Print statistics about any indices loaded for OBJFILE.  The
      statistics should be printed to gdb_stdout.  This is used for
      "maint print statistics".  */
diff --git a/gdb/symtab.c b/gdb/symtab.c
index aa415a9399..7303e3c9d9 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -2560,6 +2560,33 @@ lookup_symbol_in_objfile (struct objfile *objfile, enum block_enum block_index,
   return result;
 }
 
+/* Find the language for partial symbol with NAME.  */
+
+static enum language
+find_quick_global_symbol_language (const char *name, const domain_enum domain)
+{
+  for (objfile *objfile : current_program_space->objfiles ())
+    {
+      if (objfile->sf && objfile->sf->qf
+	  && objfile->sf->qf->lookup_global_symbol_language)
+	continue;
+      return language_unknown;
+    }
+
+  for (objfile *objfile : current_program_space->objfiles ())
+    {
+      bool symbol_found_p;
+      enum language lang
+	= objfile->sf->qf->lookup_global_symbol_language (objfile, name, domain,
+							  &symbol_found_p);
+      if (!symbol_found_p)
+	continue;
+      return lang;
+    }
+
+  return language_unknown;
+}
+
 /* Private data to be used with lookup_symbol_global_iterator_cb.  */
 
 struct global_or_static_sym_lookup_data
@@ -6123,6 +6150,16 @@ find_main_name (void)
 
   /* The languages above didn't identify the name of the main procedure.
      Fallback to "main".  */
+
+  /* Try to find language for main in psymtabs.  */
+  enum language lang
+    = find_quick_global_symbol_language ("main", VAR_DOMAIN);
+  if (lang != language_unknown)
+    {
+      set_main_name ("main", lang);
+      return;
+    }
+
   set_main_name ("main", language_unknown);
 }
 
diff --git a/gdb/testsuite/gdb.base/c-linkage-name.exp b/gdb/testsuite/gdb.base/c-linkage-name.exp
index 4551c8dcac..87036aa1ff 100644
--- a/gdb/testsuite/gdb.base/c-linkage-name.exp
+++ b/gdb/testsuite/gdb.base/c-linkage-name.exp
@@ -27,19 +27,6 @@ if { [gdb_compile "${sources}" "${binfile}" executable {debug}] != "" } {
 
 clean_restart ${binfile}
 
-# Verify that partial symtab expansion for $filename has state $readin
-
-proc verify_psymtab_expanded { filename readin } {
-    set cmd "maint info psymtab"
-    set test "$cmd: $filename: $readin"
-    set re [multi_line \
-		"  \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
-		"    readin $readin" \
-		".*"]
-
-    gdb_test $cmd $re $test
-}
-
 # Verify that partial symtab expansion has not taken place for
 # c-linkage-name-2.c.
 
diff --git a/gdb/testsuite/gdb.base/main-psymtab.exp b/gdb/testsuite/gdb.base/main-psymtab.exp
new file mode 100644
index 0000000000..72596f632b
--- /dev/null
+++ b/gdb/testsuite/gdb.base/main-psymtab.exp
@@ -0,0 +1,34 @@
+# Copyright 2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile persistent-lang.cc
+
+if {[build_executable "failed to prepare" $testfile $srcfile debug]} {
+    return -1
+}
+
+clean_restart
+
+set auto_cpp \
+    {The current source language is "auto; currently c\+\+"\.}
+
+gdb_load ${binfile}
+gdb_test "show language" $auto_cpp \
+    "language auto/c++ after load"
+
+# Verify that partial symtab expansion has not taken place for
+# persistent-lang.cc
+
+verify_psymtab_expanded persistent-lang.cc no
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index b14b3a968e..a66d738713 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -6958,5 +6958,18 @@ gdb_caching_proc supports_statement_frontiers {
     } executable "additional_flags=-gstatement-frontiers"]
 }
 
+# Verify that partial symtab expansion for $filename has state $readin.
+
+proc verify_psymtab_expanded { filename readin } {
+    set cmd "maint info psymtab"
+    set test "$cmd: $filename: $readin"
+    set re [multi_line \
+		"  \{ psymtab \[^\r\n\]*$filename\[^\r\n\]*" \
+		"    readin $readin" \
+		".*"]
+
+    gdb_test $cmd $re $test
+}
+
 # Always load compatibility stuff.
 load_lib future.exp

  reply	other threads:[~2020-03-17 15:25 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-01 23:18 [RFC][gdb] " Tom de Vries
2020-03-03 15:48 ` Tom Tromey
2020-03-04 16:07   ` Tom de Vries
2020-03-13 13:59     ` Tom Tromey
2020-03-17 15:25       ` Tom de Vries [this message]
2020-03-30 21:20         ` [PING][PATCH][gdb] " Tom de Vries
2020-04-01 20:07         ` [PATCH][gdb] " Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b673518c-eace-0260-54e1-871230122c29@suse.de \
    --to=tdevries@suse.de \
    --cc=gdb-patches@sourceware.org \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).