[PATCH 0/2] New test for slow DWARF reader issue

public inbox for gdb-patches@sourceware.org

* [PATCH 0/2] New test for slow DWARF reader issue
@ 2022-12-08 15:38 Andrew Burgess
  2022-12-08 15:38 ` [PATCH 1/2] gdb/testsuite: fix readnow detection Andrew Burgess
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Andrew Burgess @ 2022-12-08 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey, Andrew Burgess

This series adds a test for PR gdb/29105.  My motivation was to
better understand the commit that fixed this issue.

While writing the test in patch #2 I ran into a testsuite issue,
which is fixed in patch #1.

Thanks,
Andrew

---

Andrew Burgess (2):
  gdb/testsuite: fix readnow detection
  gdb/testsuite: new test for recent dwarf reader issue

 .../gdb.base/signed-builtin-types-lib.c       |  30 +++++
 gdb/testsuite/gdb.base/signed-builtin-types.c |  25 ++++
 .../gdb.base/signed-builtin-types.exp         | 112 ++++++++++++++++++
 gdb/testsuite/gdb.opt/break-on-_exit.exp      |   3 +-
 gdb/testsuite/gdb.rust/traits.exp             |   2 -
 gdb/testsuite/lib/gdb.exp                     |  30 +----
 gdb/testsuite/lib/mi-support.exp              |  28 +----
 7 files changed, 177 insertions(+), 53 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types-lib.c
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types.c
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types.exp


base-commit: 2d77a94ff17a81260b80997db476f87cba5f4b11
-- 
2.25.4



* [PATCH 1/2] gdb/testsuite: fix readnow detection
  2022-12-08 15:38 [PATCH 0/2] New test for slow DWARF reader issue Andrew Burgess
@ 2022-12-08 15:38 ` Andrew Burgess
  2022-12-08 15:38 ` [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue Andrew Burgess
  2022-12-09 18:18 ` [PATCH 0/2] New test for slow DWARF " Tom Tromey
  2 siblings, 0 replies; 17+ messages in thread
From: Andrew Burgess @ 2022-12-08 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey, Andrew Burgess

The following commit broke the readnow detection in the testsuite:

  commit dfaa040b440084dd73ebd359326752d5f44fc02c
  Date:   Mon Mar 29 18:31:31 2021 -0600

      Remove some "OBJF_READNOW" code from dwarf2_debug_names_index

The testsuite checks if GDB was started with the -readnow flag by
using the 'maintenance print objfiles' command, and looking for the
string 'faked for "readnow"' in the output.  This is implemented in
two helper procs `readnow` (gdb.exp) and `mi_readnow` (mi-support.exp).

The following tests all currently depend on this detection:

  gdb.base/maint.exp
  gdb.cp/nsalias.exp
  gdb.dwarf2/debug-aranges-duplicate-offset-warning.exp
  gdb.dwarf2/dw2-stack-boundary.exp
  gdb.dwarf2/dw2-zero-range.exp
  gdb.dwarf2/gdb-index-nodebug.exp
  gdb.mi/mi-info-sources.exp
  gdb.python/py-symbol.exp
  gdb.rust/traits.exp

The following test also includes detection of 'readnow', but does the
detection itself by checking $::GDBFLAGS for the readnow flag:

  gdb.opt/break-on-_exit.exp

The above commit removed from GDB the code that produced the 'faked
for "readnow"' string.  As a consequence, the testsuite can no longer
correctly spot when readnow is in use, and many of the above tests
will fail (at least partially).

When looking at the above tests, I noticed that gdb.rust/traits.exp
does call `readnow`, but doesn't actually use the result, so I've
removed the readnow call.  This simplifies the next part of this patch,
as gdb.rust/traits.exp was the only place an extra regexp was passed
to the readnow call.

Next I have rewritten `readnow` to check the $GDBFLAGS for the
-readnow flag, and removed the `maintenance print objfiles` check.  At
least for all the tests above, when using the readnow board, this is
good enough to get everything passing again.
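
As a rough illustration only (written in C rather than the testsuite's
Tcl, and not code from GDB), the new check amounts to an exact,
whole-word search of the flag list.  The helper name has_readnow_flag
and the pre-split flag array are assumptions made for this sketch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GDB: return true if FLAGS (an array
   of NFLAGS already-split words) contains -readnow or --readnow as a
   whole word, similar in spirit to 'lsearch -exact' on a Tcl list.  */
static bool
has_readnow_flag (const char *const *flags, size_t nflags)
{
  for (size_t i = 0; i < nflags; i++)
    if (strcmp (flags[i], "-readnow") == 0
        || strcmp (flags[i], "--readnow") == 0)
      return true;
  return false;
}
```

The exact match matters: a flag such as --readnever must not be
mistaken for --readnow.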

For the `mi_readnow` proc, I changed this to just call `readnow` from
gdb.exp.  I left the mi_readnow name in place, as in the future we
might want to do some different checks here.

Finally, I updated gdb.opt/break-on-_exit.exp to call the `readnow`
proc.

With these changes, all of the tests listed above now pass correctly
when using the readnow board.
---
 gdb/testsuite/gdb.opt/break-on-_exit.exp |  3 +--
 gdb/testsuite/gdb.rust/traits.exp        |  2 --
 gdb/testsuite/lib/gdb.exp                | 30 ++++--------------------
 gdb/testsuite/lib/mi-support.exp         | 28 ++++------------------
 4 files changed, 10 insertions(+), 53 deletions(-)

diff --git a/gdb/testsuite/gdb.opt/break-on-_exit.exp b/gdb/testsuite/gdb.opt/break-on-_exit.exp
index 7c2fda6af69..3d18cd70bb9 100644
--- a/gdb/testsuite/gdb.opt/break-on-_exit.exp
+++ b/gdb/testsuite/gdb.opt/break-on-_exit.exp
@@ -36,8 +36,7 @@
 standard_testfile
 
 # See if we have target board readnow.exp or similar.
-if { [lsearch -exact $GDBFLAGS -readnow] != -1 \
-	 || [lsearch -exact $GDBFLAGS --readnow] != -1 } {
+if {[readnow]} {
     untested "--readnever not allowed in combination with --readnow"
     return -1
 }
diff --git a/gdb/testsuite/gdb.rust/traits.exp b/gdb/testsuite/gdb.rust/traits.exp
index aa45e64b877..949e7cb919e 100644
--- a/gdb/testsuite/gdb.rust/traits.exp
+++ b/gdb/testsuite/gdb.rust/traits.exp
@@ -43,7 +43,5 @@ if {![runto ${srcfile}:$line]} {
     return -1
 }
 
-set readnow_p [readnow $binfile]
-
 gdb_test "print *td" " = 23.5"
 gdb_test "print *tu" " = 23"
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 008f59b9f30..132d538957c 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -8553,32 +8553,12 @@ gdb_caching_proc supports_fcf_protection {
   } executable "additional_flags=-fcf-protection=full"]
 }
 
-# Return 1 if symbols were read in using -readnow.  Otherwise, return 0.
+# Return true if symbols were read in using -readnow.  Otherwise,
+# return false.
 
-proc readnow { args } {
-    if { [llength $args] == 1 } {
-	set re [lindex $args 0]
-    } else {
-	set re ""
-    }
-
-    set readnow_p 0
-    # Given the listing from the following command can be very verbose, match
-    # the patterns line-by-line.  This prevents timeouts from waiting for
-    # too much data to come at once.
-    set cmd "maint print objfiles $re"
-    gdb_test_multiple $cmd "" -lbl {
-	-re "\r\n.gdb_index: faked for \"readnow\"" {
-	    # Record the we've seen the above pattern.
-	    set readnow_p 1
-	    exp_continue
-	}
-	-re -wrap "" {
-	    # We don't care about any other input.
-	}
-    }
-
-    return $readnow_p
+proc readnow { } {
+    return [expr {[lsearch -exact $::GDBFLAGS -readnow] != -1
+		  || [lsearch -exact $::GDBFLAGS --readnow] != -1}]
 }
 
 # Return index name if symbols were read in using an index.
diff --git a/gdb/testsuite/lib/mi-support.exp b/gdb/testsuite/lib/mi-support.exp
index 18a2a04def8..2dccb8924b1 100644
--- a/gdb/testsuite/lib/mi-support.exp
+++ b/gdb/testsuite/lib/mi-support.exp
@@ -671,32 +671,12 @@ proc mi_gdb_load { arg } {
     return 0
 }
 
-# Return 1 if symbols were read in using -readnow.  Otherwise, return 0.
-# Based on readnow from lib/gdb.exp.
+# Return true if symbols were read in using -readnow.  Otherwise,
+# return false.
 
 proc mi_readnow { args } {
-    global mi_gdb_prompt
-
-    if { [llength $args] == 1 } {
-	set re [lindex $args 0]
-    } else {
-	set re ""
-    }
-
-    set readnow_p 0
-    set cmd "maint print objfiles $re"
-    send_gdb "$cmd\n"
-    gdb_expect {
-	-re ".gdb_index: faked for ..readnow.." {
-	    # Record that we've seen the above pattern.
-	    set readnow_p 1
-	    exp_continue
-	}
-	-re "\\^done\r\n$mi_gdb_prompt$" {
-	}
-    }
-
-    return $readnow_p
+    # Just defer to gdb.exp.
+    return [readnow]
 }
 
 # mi_gdb_test COMMAND [PATTERN [MESSAGE [IPATTERN]]] -- send a command to gdb;
-- 
2.25.4



* [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-08 15:38 [PATCH 0/2] New test for slow DWARF reader issue Andrew Burgess
  2022-12-08 15:38 ` [PATCH 1/2] gdb/testsuite: fix readnow detection Andrew Burgess
@ 2022-12-08 15:38 ` Andrew Burgess
  2022-12-09 18:18   ` Tom Tromey
  2022-12-09 18:18 ` [PATCH 0/2] New test for slow DWARF " Tom Tromey
  2 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2022-12-08 15:38 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey, Andrew Burgess

This commit provides a test for this commit:

  commit 55fc1623f942fba10362cb199f9356d75ca5835b
  Date:   Thu Nov 3 13:49:17 2022 -0600

      Add name canonicalization for C

Which resolves PR gdb/29105.  My reason for writing this test was a
desire to better understand the above commit.  My process was to study
the commit until I thought I understood it, then write a test to
expose the issue.  As the original commit didn't have a test, I
thought it wouldn't hurt to commit this upstream.

The problem tested for here is already described in the above commit,
but I'll give a brief description here.  This description describes
GDB prior to the above commit:

  - Builtin types are added to GDB using their canonical name,
    e.g. "short", not "signed short",

  - When the user does something like 'p sizeof(short)', then this is
    handled in c-exp.y, and results in a call to lookup_signed_type
    for the name "int".  The "int" here is actually being looked up as
    the type for the result of the 'sizeof' expression,

  - In lookup_signed_type GDB first adds a 'signed' and looks for that
    type, so in this case 'signed int', and, if that lookup fails, GDB
    then looks up 'int',

  - The problem is that 'signed int' is not the canonical name for a
    signed int, so no builtin type with that name will be found, GDB
    will then go to each object file in turn looking for a matching
    type,

  - When checking each object file, GDB will first check the partial
    symtab to see if the full symtab should be expanded or not.
    Remember, at this point GDB is looking for 'signed int', there
    will be no partial symbols with that name, so GDB will not expand
    anything,

  - However, GDB checks each partial symbol using multiple languages,
    not just the current language (C in this case), so, when GDB
    checks using the C++ language, the symbol name is first demangled,
    the code that does this can be found in
    lookup_name_info::language_lookup_name.  As the demangled form of
    'signed int' is just 'int', GDB then looks for any symbols with
    the name 'int', most partial symtabs will contain such a symbol,
    so GDB ends up expanding pretty much every symtab.

The above commit fixes this by avoiding the use of non-canonical names
with C, now the initial builtin type lookup will succeed, and GDB
never even considers whether to expand any additional symtabs.
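
A toy model of the difference, in C.  Nothing here is GDB's actual
implementation: the table, find_builtin, and canonical_name are
hypothetical names, and the canonicalizer only handles a leading
"signed " on short, int, and long:

```c
#include <stddef.h>
#include <string.h>

/* Toy table of builtin types, keyed by canonical name only.  */
static const char *const builtin_names[] = { "char", "short", "int", "long" };

/* Return the index of NAME in the toy builtin table, or -1 if there
   is no entry with exactly that spelling.  */
static int
find_builtin (const char *name)
{
  size_t count = sizeof builtin_names / sizeof builtin_names[0];

  for (size_t i = 0; i < count; i++)
    if (strcmp (builtin_names[i], name) == 0)
      return (int) i;
  return -1;
}

/* Hypothetical canonicalizer: drop a leading "signed " so that
   "signed int" becomes "int".  "signed char" is left alone, because
   in C it is a distinct type from plain "char".  */
static const char *
canonical_name (const char *name)
{
  static const char prefix[] = "signed ";
  size_t len = sizeof prefix - 1;

  if (strncmp (name, prefix, len) == 0
      && strcmp (name + len, "char") != 0)
    return name + len;
  return name;
}
```

In this model find_builtin ("signed int") misses, which corresponds to
the case where GDB went on to search every object file, while
find_builtin (canonical_name ("signed int")) finds the "int" entry
straight away.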

The test case creates a library that includes char, short, int, and
long types, and a test program that links against the library.

In the test script we start the inferior, but don't allow it to
progress far enough that the debug information for the library has
been fully expanded yet.

Then we evaluate some 'sizeof(TYPE)' expressions.

In the buggy version of GDB this would cause the debug information
for the library to be fully expanded, while in the fixed version of
GDB this will not be the case.

We use 'info sources' to determine if the debug information has been
fully expanded or not.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105
---
 .../gdb.base/signed-builtin-types-lib.c       |  30 +++++
 gdb/testsuite/gdb.base/signed-builtin-types.c |  25 ++++
 .../gdb.base/signed-builtin-types.exp         | 112 ++++++++++++++++++
 3 files changed, 167 insertions(+)
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types-lib.c
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types.c
 create mode 100644 gdb/testsuite/gdb.base/signed-builtin-types.exp

diff --git a/gdb/testsuite/gdb.base/signed-builtin-types-lib.c b/gdb/testsuite/gdb.base/signed-builtin-types-lib.c
new file mode 100644
index 00000000000..a32ec223ec1
--- /dev/null
+++ b/gdb/testsuite/gdb.base/signed-builtin-types-lib.c
@@ -0,0 +1,30 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+extern int foo (void);
+
+short short_var = 1;
+int int_var = 2;
+long long_var = 3;
+char char_var = 4;
+
+int
+foo (void)
+{
+  /* Just use all the globals!  This works out as zero.  */
+  return (char_var / int_var) - (long_var - short_var);
+}
diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.c b/gdb/testsuite/gdb.base/signed-builtin-types.c
new file mode 100644
index 00000000000..1975b20e277
--- /dev/null
+++ b/gdb/testsuite/gdb.base/signed-builtin-types.c
@@ -0,0 +1,25 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2022 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+extern int foo (void);
+
+int
+main (void)
+{
+  int result = foo ();
+  return result;
+}
diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.exp b/gdb/testsuite/gdb.base/signed-builtin-types.exp
new file mode 100644
index 00000000000..e9784330fee
--- /dev/null
+++ b/gdb/testsuite/gdb.base/signed-builtin-types.exp
@@ -0,0 +1,112 @@
+# Copyright 2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+if {[skip_shlib_tests]} {
+    return -1
+}
+
+standard_testfile .c -lib.c
+
+# Compile the shared library.
+set srcdso [file join $srcdir $subdir $srcfile2]
+set objdso [standard_output_file lib${gdb_test_file_name}.so]
+if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
+    untested "failed to compile dso"
+    return -1
+}
+
+# Build the test executable and runto main.
+set opts [list debug shlib=$objdso]
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile $opts] } {
+    return -1
+}
+
+if {![runto_main]} {
+    return -1
+}
+
+if {[readnow]} {
+    untested "this test checks for delayed symtab expansion"
+    return -1
+}
+
+# Use 'info sources' to check if the debug information for the shared
+# library has been fully expanded or not.  Return true if the debug
+# information has NOT been fully expanded (which is what we want for this
+# test).
+proc shared_library_debug_not_fully_expanded {} {
+    set library_expanded ""
+    gdb_test_multiple "info sources" "" {
+	-re "^info sources\r\n" {
+	    exp_continue
+	}
+	-re "^(\[^\r\n\]+):\r\n\\(Full debug information has not yet been read for this file\\.\\)\r\n\r\n" {
+	    set libname $expect_out(1,string)
+	    if {$libname == $::objdso} {
+		set library_expanded "no"
+	    }
+	    exp_continue
+	}
+	-re "^(\[^\r\n\]+):\r\n\\(Objfile has no debug information\\.\\)\r\n\r\n" {
+	    set libname $expect_out(1,string)
+	    if {$libname == $::objdso} {
+		# For some reason the shared library has no debug
+		# information, this is not expected.
+		set library_expanded "missing debug"
+	    }
+	    exp_continue
+	}
+	-re "^(\[^\r\n\]+):\r\n\r\n" {
+	    set libname $expect_out(1,string)
+	    if {$libname == $::objdso} {
+		set library_expanded "yes"
+	    }
+	    exp_continue
+	}
+	-re "^$::gdb_prompt $" {
+	    gdb_assert {[string equal $library_expanded "yes"] \
+			    || [string equal $library_expanded "no"]} \
+		$gdb_test_name
+	}
+	-re "^(\[^\r\n:\]*)\r\n" {
+	    exp_continue
+	}
+    }
+
+    return [expr {$library_expanded == "no"}]
+}
+
+foreach_with_prefix type_name {"short" "int" "long" "char"} {
+    foreach_with_prefix type_prefix {"" "signed" "unsigned"} {
+	with_test_prefix "before sizeof expression" {
+	    # Check that the debug information for the shared library has
+	    # not yet been read in.
+	    gdb_assert { [shared_library_debug_not_fully_expanded] }
+	}
+
+	# Evaluate a sizeof expression for a builtin type.  At one point GDB
+	# would fail to find the builtin type, and would then start
+	# expanding compilation units looking for a suitable debug entry,
+	# for some builtin types GDB would never find a suitable match, and
+	# so would end up expanding all available compilation units.
+	gdb_test "print/d sizeof ($type_prefix $type_name)" " = $decimal"
+
+	with_test_prefix "after sizeof expression" {
+	    # Check that the debug information for the shared library has not
+	    # yet been read in.
+	    gdb_assert { [shared_library_debug_not_fully_expanded] }
+	}
+    }
+}
-- 
2.25.4



* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-08 15:38 ` [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue Andrew Burgess
@ 2022-12-09 18:18   ` Tom Tromey
  2022-12-09 19:24     ` Andrew Burgess
  0 siblings, 1 reply; 17+ messages in thread
From: Tom Tromey @ 2022-12-09 18:18 UTC (permalink / raw)
  To: Andrew Burgess via Gdb-patches; +Cc: Andrew Burgess, Tom Tromey

>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

Thank you for doing this.

Andrew>   - However, GDB checks each partial symbol using multiple languages,
Andrew>     not just the current language (C in this case), so, when GDB
Andrew>     checks using the C++ language, the symbol name is first demangled,
Andrew>     the code that does this can be found
Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
Andrew>     the name 'int', most partial symtabs will contain such a symbol,
Andrew>     so GDB ends up expanding pretty much every symtab.

It's a pedantic point but what happens here is name canonicalization,
not demangling.  Demangling is just used to refer to the translation
from a name like "_Zmumble" to "something::else" -- that is, the input
is a linkage name and the output is a C++ name.  Canonicalization takes
a C++ name as input and returns the standard form, basically dealing
with the fact that C++ (and as we discovered, C) has multiple possible
spellings for some symbols.

Tom


* Re: [PATCH 0/2] New test for slow DWARF reader issue
  2022-12-08 15:38 [PATCH 0/2] New test for slow DWARF reader issue Andrew Burgess
  2022-12-08 15:38 ` [PATCH 1/2] gdb/testsuite: fix readnow detection Andrew Burgess
  2022-12-08 15:38 ` [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue Andrew Burgess
@ 2022-12-09 18:18 ` Tom Tromey
  2022-12-14 10:25   ` Andrew Burgess
  2 siblings, 1 reply; 17+ messages in thread
From: Tom Tromey @ 2022-12-09 18:18 UTC (permalink / raw)
  To: Andrew Burgess via Gdb-patches; +Cc: Andrew Burgess, Tom Tromey

>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:

Andrew> This series adds a test for PR gdb/gdb/29105, my motivation was to
Andrew> better understand the commit that fixed this issue.

Andrew> While writting the test in patch #2 I ran into a testsuite issue,
Andrew> which is fixed in patch #1.

Thank you for doing this.  These look good to me.

Tom


* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-09 18:18   ` Tom Tromey
@ 2022-12-09 19:24     ` Andrew Burgess
  2022-12-14 14:47       ` Luis Machado
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2022-12-09 19:24 UTC (permalink / raw)
  To: Tom Tromey, Andrew Burgess via Gdb-patches; +Cc: Tom Tromey

Tom Tromey <tom@tromey.com> writes:

>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> Thank you for doing this.
>
> Andrew>   - However, GDB checks each partial symbol using multiple languages,
> Andrew>     not just the current language (C in this case), so, when GDB
> Andrew>     checks using the C++ language, the symbol name is first demangled,
> Andrew>     the code that does this can be found
> Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
> Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
> Andrew>     the name 'int', most partial symtabs will contain such a symbol,
> Andrew>     so GDB ends up expanding pretty much every symtab.
>
> It's a pedantic point but what happens here is name canonicalization,
> not demangling.  Demangling is just used to refer to the translation
> from a name like "_Zmumble" to "something::else" -- that is, the input
> is a linkage name and the output is a C++ name.  Canonicalization takes
> a C++ name as input and returns the standard form, basically dealing
> with the fact that C++ (and as we discovered, C) has multiple possible
> spellings for some symbols.

Please, be pedantic.  My goal here was to better understand this code;
there's no point in me understanding it wrong.

I'll reword that paragraph.

Thanks for taking a look.

Andrew



* Re: [PATCH 0/2] New test for slow DWARF reader issue
  2022-12-09 18:18 ` [PATCH 0/2] New test for slow DWARF " Tom Tromey
@ 2022-12-14 10:25   ` Andrew Burgess
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Burgess @ 2022-12-14 10:25 UTC (permalink / raw)
  To: Tom Tromey, Andrew Burgess via Gdb-patches; +Cc: Tom Tromey

Tom Tromey <tom@tromey.com> writes:

>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> Andrew> This series adds a test for PR gdb/gdb/29105, my motivation was to
> Andrew> better understand the commit that fixed this issue.
>
> Andrew> While writting the test in patch #2 I ran into a testsuite issue,
> Andrew> which is fixed in patch #1.
>
> Thank you for doing this.  These look good to me.

I fixed the demangle/canonicalization confusion in patch #2 and pushed
these patches.

Thanks,
Andrew



* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-09 19:24     ` Andrew Burgess
@ 2022-12-14 14:47       ` Luis Machado
  2022-12-15 11:22         ` Andrew Burgess
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2022-12-14 14:47 UTC (permalink / raw)
  To: Andrew Burgess, Tom Tromey, Andrew Burgess via Gdb-patches

Hi Andrew,

On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
> Tom Tromey <tom@tromey.com> writes:
> 
>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>>
>> Thank you for doing this.
>>
>> Andrew>   - However, GDB checks each partial symbol using multiple languages,
>> Andrew>     not just the current language (C in this case), so, when GDB
>> Andrew>     checks using the C++ language, the symbol name is first demangled,
>> Andrew>     the code that does this can be found
>> Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
>> Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
>> Andrew>     the name 'int', most partial symtabs will contain such a symbol,
>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>
>> It's a pedantic point but what happens here is name canonicalization,
>> not demangling.  Demangling is just used to refer to the translation
>> from a name like "_Zmumble" to "something::else" -- that is, the input
>> is a linkage name and the output is a C++ name.  Canonicalization takes
>> a C++ name as input and returns the standard form, basically dealing
>> with the fact that C++ (and as we discovered, C) has multiple possible
>> spellings for some symbols.
> 
> Please, be pedantic.  My goal here was to better understand this code,
> there's no point me understanding it wrong.
> 
> I'll reword that paragraph.
> 
> Thanks for taking a look.
> 
> Andrew
> 

I'm not saying you should investigate this, as it is a new test, but I'm getting a lot of these messages for this test:

ERROR: internal buffer is full.


* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-14 14:47       ` Luis Machado
@ 2022-12-15 11:22         ` Andrew Burgess
  2022-12-19 13:20           ` Luis Machado
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2022-12-15 11:22 UTC (permalink / raw)
  To: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

Luis Machado <luis.machado@arm.com> writes:

> Hi Andrew,
>
> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>> Tom Tromey <tom@tromey.com> writes:
>> 
>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>>>
>>> Thank you for doing this.
>>>
>>> Andrew>   - However, GDB checks each partial symbol using multiple languages,
>>> Andrew>     not just the current language (C in this case), so, when GDB
>>> Andrew>     checks using the C++ language, the symbol name is first demangled,
>>> Andrew>     the code that does this can be found
>>> Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
>>> Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
>>> Andrew>     the name 'int', most partial symtabs will contain such a symbol,
>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>
>>> It's a pedantic point but what happens here is name canonicalization,
>>> not demangling.  Demangling is just used to refer to the translation
>>> from a name like "_Zmumble" to "something::else" -- that is, the input
>>> is a linkage name and the output is a C++ name.  Canonicalization takes
>>> a C++ name as input and returns the standard form, basically dealing
>>> with the fact that C++ (and as we discovered, C) has multiple possible
>>> spellings for some symbols.
>> 
>> Please, be pedantic.  My goal here was to better understand this code,
>> there's no point me understanding it wrong.
>> 
>> I'll reword that paragraph.
>> 
>> Thanks for taking a look.
>> 
>> Andrew
>> 
>
> I'm not saying you should investigate this, as it is a new test, but I'm getting a lot of these messages for this test:
>
> ERROR: internal buffer is full.

Happy to take a look at the problem.

I guess the issue is coming from the gdb_test_multiple that I use in the
new test script.

I tried to write patterns that match and discard all the lines as they
arrive from GDB.  I guess you are seeing a pattern that I am not for
some reason.

Could you run just this test and attach the gdb.log file?  I'll take a
look.  I probably just need to tweak one of the patterns a little.

Thanks,
Andrew



* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-15 11:22         ` Andrew Burgess
@ 2022-12-19 13:20           ` Luis Machado
  2022-12-19 13:52             ` Andrew Burgess
  0 siblings, 1 reply; 17+ messages in thread
From: Luis Machado @ 2022-12-19 13:20 UTC (permalink / raw)
  To: Andrew Burgess, Tom Tromey, Andrew Burgess via Gdb-patches

On 12/15/22 11:22, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> Hi Andrew,
>>
>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>> Tom Tromey <tom@tromey.com> writes:
>>>
>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>>>>
>>>> Thank you for doing this.
>>>>
>>>> Andrew>   - However, GDB checks each partial symbol using multiple languages,
>>>> Andrew>     not just the current language (C in this case), so, when GDB
>>>> Andrew>     checks using the C++ language, the symbol name is first demangled,
>>>> Andrew>     the code that does this can be found
>>>> Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
>>>> Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
>>>> Andrew>     the name 'int', most partial symtabs will contain such a symbol,
>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>
>>>> It's a pedantic point but what happens here is name canonicalization,
>>>> not demangling.  Demangling is just used to refer to the translation
>>>> from a name like "_Zmumble" to "something::else" -- that is, the input
>>>> is a linkage name and the output is a C++ name.  Canonicalization takes
>>>> a C++ name as input and returns the standard form, basically dealing
>>>> with the fact that C++ (and as we discovered, C) has multiple possible
>>>> spellings for some symbols.
>>>
>>> Please, be pedantic.  My goal here was to better understand this code,
>>> there's no point me understanding it wrong.
>>>
>>> I'll reword that paragraph.
>>>
>>> Thanks for taking a look.
>>>
>>> Andrew
>>>
>>
>> I'm not saying you should investigate this, as it is a new test, but I'm getting a lot of these messages for this test:
>>
>> ERROR: internal buffer is full.
> 
> Happy to take a look at the problem.
> 
> I guess the issue is coming from the gdb_test_multiple that I use in the
> new test script.
> 
> I'm tried to write patterns that match and discard all the lines as they
> arrive from GDB.  I guess you are seeing a pattern that I am not for
> some reason.
> 
> Could you run just this test and attach the gdb.log file and I'll take a
> look.  I probably just need to tweak one of the patterns a little.
> 
> Thanks,
> Andrew
> 

I briefly looked into this. The problem seems to arise from the fact that sometimes we don't have multiple lines for the "info sources" output.

Some sections are output in a single line. For example, one of them has 133K characters. But each entry seems to be separated by a comma character:

./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h, ./elf/../sysdeps/generic/ldsodefs.h, ./elf/../sysdeps/aarch64/dl-machine.h, ...

It might be best (for the testsuite) if gdb outputs this data across more lines.
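
One possible way for a consumer to keep its pending buffer small with
output like this, sketched in C purely for illustration (consume_entry
is a hypothetical helper, not something in GDB or DejaGnu), is to
consume one entry at a time, treating ", " as a terminator as well as
a newline:

```c
#include <stddef.h>

/* Hypothetical helper: given LEN bytes of pending output in BUF,
   return how many bytes make up the first complete entry, including
   its terminator (either ", " or a newline).  Return 0 if no complete
   entry has arrived yet, in which case the caller waits for more.  */
static size_t
consume_entry (const char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      if (buf[i] == '\n')
        return i + 1;
      if (buf[i] == ',' && i + 1 < len && buf[i + 1] == ' ')
        return i + 2;
    }
  return 0;
}
```

With this approach the amount of unconsumed text is bounded by the
longest single file name rather than by the length of the whole line.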


* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-19 13:20           ` Luis Machado
@ 2022-12-19 13:52             ` Andrew Burgess
  2022-12-20  8:43               ` tdevries
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2022-12-19 13:52 UTC (permalink / raw)
  To: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

Luis Machado <luis.machado@arm.com> writes:

> On 12/15/22 11:22, Andrew Burgess wrote:
>> Luis Machado <luis.machado@arm.com> writes:
>> 
>>> Hi Andrew,
>>>
>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>> Tom Tromey <tom@tromey.com> writes:
>>>>
>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches <gdb-patches@sourceware.org> writes:
>>>>>
>>>>> Thank you for doing this.
>>>>>
>>>>> Andrew>   - However, GDB checks each partial symbol using multiple languages,
>>>>> Andrew>     not just the current language (C in this case), so, when GDB
>>>>> Andrew>     checks using the C++ language, the symbol name is first demangled,
>>>>> Andrew>     the code that does this can be found
>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the demangled form of
>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any symbols with
>>>>> Andrew>     the name 'int', most partial symtabs will contain such a symbol,
>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>
>>>>> It's a pedantic point but what happens here is name canonicalization,
>>>>> not demangling.  Demangling is just used to refer to the translation
>>>>> from a name like "_Zmumble" to "something::else" -- that is, the input
>>>>> is a linkage name and the output is a C++ name.  Canonicalization takes
>>>>> a C++ name as input and returns the standard form, basically dealing
>>>>> with the fact that C++ (and as we discovered, C) has multiple possible
>>>>> spellings for some symbols.
>>>>
>>>> Please, be pedantic.  My goal here was to better understand this code,
>>>> there's no point me understanding it wrong.
>>>>
>>>> I'll reword that paragraph.
>>>>
>>>> Thanks for taking a look.
>>>>
>>>> Andrew
>>>>
>>>
>>> I'm not saying you should investigate this, as it is a new test, but I'm getting a lot of these messages for this test:
>>>
>>> ERROR: internal buffer is full.
>> 
>> Happy to take a look at the problem.
>> 
>> I guess the issue is coming from the gdb_test_multiple that I use in the
>> new test script.
>> 
>> I've tried to write patterns that match and discard all the lines as they
>> arrive from GDB.  I guess you are seeing a pattern that I am not for
>> some reason.
>> 
>> Could you run just this test and attach the gdb.log file and I'll take a
>> look.  I probably just need to tweak one of the patterns a little.
>> 
>> Thanks,
>> Andrew
>> 
>
> I briefly looked into this. The problem seems to arise from the fact that sometimes we don't have multiple lines for the "info sources" output.
>
> Some sections are output in a single line. For example, one of them has 133K characters. But each entry seems to be separated by a comma character:
>
> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h, ./elf/../sysdeps/generic/ldsodefs.h, ./elf/../sysdeps/aarch64/dl-machine.h, ...

Ahh, that would explain it.  We don't appear to use 'info sources' that
frequently in the testsuite.  I wonder if you are also seeing failures
on those other tests?

  gdb.asm/asm-source.exp
  gdb.dwarf2/dup-psym.exp
  gdb.dwarf2/dw2-filename.exp

> It might be best (for the testsuite) if gdb outputs this data across more lines.

The other option might be to extend 'info sources' to allow filtering
based on the objfile name, then we can use this in the testsuite to
limit the output...

... or I wonder if we could trick GDB by setting the width to something
small, then I guess the lines would be broken after the ',' characters.

I'll have a play and see what I can come up with.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-19 13:52             ` Andrew Burgess
@ 2022-12-20  8:43               ` tdevries
  2022-12-20 10:32                 ` Andrew Burgess
  0 siblings, 1 reply; 17+ messages in thread
From: tdevries @ 2022-12-20  8:43 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> On 12/15/22 11:22, Andrew Burgess wrote:
>>> Luis Machado <luis.machado@arm.com> writes:
>>> 
>>>> Hi Andrew,
>>>> 
>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>> 
>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches 
>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>> 
>>>>>> Thank you for doing this.
>>>>>> 
>>>>>> Andrew>   - However, GDB checks each partial symbol using multiple 
>>>>>> languages,
>>>>>> Andrew>     not just the current language (C in this case), so, 
>>>>>> when GDB
>>>>>> Andrew>     checks using the C++ language, the symbol name is 
>>>>>> first demangled,
>>>>>> Andrew>     the code that does this can be found
>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the 
>>>>>> demangled form of
>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any 
>>>>>> symbols with
>>>>>> Andrew>     the name 'int', most partial symtabs will contain such 
>>>>>> a symbol,
>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>> 
>>>>>> It's a pedantic point but what happens here is name 
>>>>>> canonicalization,
>>>>>> not demangling.  Demangling is just used to refer to the 
>>>>>> translation
>>>>>> from a name like "_Zmumble" to "something::else" -- that is, the 
>>>>>> input
>>>>>> is a linkage name and the output is a C++ name.  Canonicalization 
>>>>>> takes
>>>>>> a C++ name as input and returns the standard form, basically 
>>>>>> dealing
>>>>>> with the fact that C++ (and as we discovered, C) has multiple 
>>>>>> possible
>>>>>> spellings for some symbols.
>>>>> 
>>>>> Please, be pedantic.  My goal here was to better understand this 
>>>>> code,
>>>>> there's no point me understanding it wrong.
>>>>> 
>>>>> I'll reword that paragraph.
>>>>> 
>>>>> Thanks for taking a look.
>>>>> 
>>>>> Andrew
>>>>> 
>>>> 
>>>> I'm not saying you should investigate this, as it is a new test, but 
>>>> I'm getting a lot of these messages for this test:
>>>> 
>>>> ERROR: internal buffer is full.
>>> 
>>> Happy to take a look at the problem.
>>> 
>>> I guess the issue is coming from the gdb_test_multiple that I use in 
>>> the
>>> new test script.
>>> 
>>> I've tried to write patterns that match and discard all the lines as 
>>> they
>>> arrive from GDB.  I guess you are seeing a pattern that I am not for
>>> some reason.
>>> 
>>> Could you run just this test and attach the gdb.log file and I'll 
>>> take a
>>> look.  I probably just need to tweak one of the patterns a little.
>>> 
>>> Thanks,
>>> Andrew
>>> 
>> 
>> I briefly looked into this. The problem seems to arise from the fact 
>> that sometimes we don't have multiple lines for the "info sources" 
>> output.
>> 
>> Some sections are output in a single line. For example, one of them 
>> has 133K characters. But each entry seems to be separated by a comma 
>> character:
>> 
>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h, 
>> ./elf/../sysdeps/generic/ldsodefs.h, 
>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
> 
> Ahh, that would explain it.  We don't appear to use 'info sources' that
> frequently in the testsuite.  I wonder if you are also seeing failures
> on those other tests?
> 
>   gdb.asm/asm-source.exp
>   gdb.dwarf2/dup-psym.exp
>   gdb.dwarf2/dw2-filename.exp
> 
>> It might be best (for the testsuite) if gdb outputs this data across 
>> more lines.
> 
> The other option might be to extend 'info sources' to allow filtering
> based on the objfile name, then we can use this in the testsuite to
> limit the output...
> 
> ... or I wonder if we could trick GDB by setting the width to something
> small, then I guess the lines would be broken after the ',' characters.
> 
> I'll have a play and see what I can come up with.
> 

I also ran into this issue on Ubuntu 22.04.1 x86_64.

AFAIK, the way we usually test for this type of information is "maint 
print objfile", which is less verbose, and doesn't have long lines.

Thanks,
- Tom

> Thanks,
> Andrew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-20  8:43               ` tdevries
@ 2022-12-20 10:32                 ` Andrew Burgess
  2022-12-20 13:20                   ` Andrew Burgess
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Burgess @ 2022-12-20 10:32 UTC (permalink / raw)
  To: tdevries; +Cc: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

tdevries <tdevries@suse.de> writes:

> On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
>> Luis Machado <luis.machado@arm.com> writes:
>> 
>>> On 12/15/22 11:22, Andrew Burgess wrote:
>>>> Luis Machado <luis.machado@arm.com> writes:
>>>> 
>>>>> Hi Andrew,
>>>>> 
>>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>> 
>>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches 
>>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>>> 
>>>>>>> Thank you for doing this.
>>>>>>> 
>>>>>>> Andrew>   - However, GDB checks each partial symbol using multiple 
>>>>>>> languages,
>>>>>>> Andrew>     not just the current language (C in this case), so, 
>>>>>>> when GDB
>>>>>>> Andrew>     checks using the C++ language, the symbol name is 
>>>>>>> first demangled,
>>>>>>> Andrew>     the code that does this can be found
>>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the 
>>>>>>> demangled form of
>>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any 
>>>>>>> symbols with
>>>>>>> Andrew>     the name 'int', most partial symtabs will contain such 
>>>>>>> a symbol,
>>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>>> 
>>>>>>> It's a pedantic point but what happens here is name 
>>>>>>> canonicalization,
>>>>>>> not demangling.  Demangling is just used to refer to the 
>>>>>>> translation
>>>>>>> from a name like "_Zmumble" to "something::else" -- that is, the 
>>>>>>> input
>>>>>>> is a linkage name and the output is a C++ name.  Canonicalization 
>>>>>>> takes
>>>>>>> a C++ name as input and returns the standard form, basically 
>>>>>>> dealing
>>>>>>> with the fact that C++ (and as we discovered, C) has multiple 
>>>>>>> possible
>>>>>>> spellings for some symbols.
>>>>>> 
>>>>>> Please, be pedantic.  My goal here was to better understand this 
>>>>>> code,
>>>>>> there's no point me understanding it wrong.
>>>>>> 
>>>>>> I'll reword that paragraph.
>>>>>> 
>>>>>> Thanks for taking a look.
>>>>>> 
>>>>>> Andrew
>>>>>> 
>>>>> 
>>>>> I'm not saying you should investigate this, as it is a new test, but 
>>>>> I'm getting a lot of these messages for this test:
>>>>> 
>>>>> ERROR: internal buffer is full.
>>>> 
>>>> Happy to take a look at the problem.
>>>> 
>>>> I guess the issue is coming from the gdb_test_multiple that I use in 
>>>> the
>>>> new test script.
>>>> 
>>>> I've tried to write patterns that match and discard all the lines as 
>>>> they
>>>> arrive from GDB.  I guess you are seeing a pattern that I am not for
>>>> some reason.
>>>> 
>>>> Could you run just this test and attach the gdb.log file and I'll 
>>>> take a
>>>> look.  I probably just need to tweak one of the patterns a little.
>>>> 
>>>> Thanks,
>>>> Andrew
>>>> 
>>> 
>>> I briefly looked into this. The problem seems to arise from the fact 
>>> that sometimes we don't have multiple lines for the "info sources" 
>>> output.
>>> 
>>> Some sections are output in a single line. For example, one of them 
>>> has 133K characters. But each entry seems to be separated by a comma 
>>> character:
>>> 
>>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h, 
>>> ./elf/../sysdeps/generic/ldsodefs.h, 
>>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
>> 
>> Ahh, that would explain it.  We don't appear to use 'info sources' that
>> frequently in the testsuite.  I wonder if you are also seeing failures
>> on those other tests?
>> 
>>   gdb.asm/asm-source.exp
>>   gdb.dwarf2/dup-psym.exp
>>   gdb.dwarf2/dw2-filename.exp
>> 
>>> It might be best (for the testsuite) if gdb outputs this data across 
>>> more lines.
>> 
>> The other option might be to extend 'info sources' to allow filtering
>> based on the objfile name, then we can use this in the testsuite to
>> limit the output...
>> 
>> ... or I wonder if we could trick GDB by setting the width to something
>> small, then I guess the lines would be broken after the ',' characters.
>> 
>> I'll have a play and see what I can come up with.
>> 
>
> I also ran into this issue on ubuntu 22.04.1 x86_64.
>
> AFAIK, the way we usually test for this type of information is "maint 
> print objfile", which is less verbose, and doesn't have long lines.

I'm looking at this issue today; I'll give 'maint print objfile' a go.
Thanks for the suggestion.

Thanks,
Andrew


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-20 10:32                 ` Andrew Burgess
@ 2022-12-20 13:20                   ` Andrew Burgess
  2022-12-20 14:04                     ` Luis Machado
  2022-12-20 14:54                     ` tdevries
  0 siblings, 2 replies; 17+ messages in thread
From: Andrew Burgess @ 2022-12-20 13:20 UTC (permalink / raw)
  To: tdevries; +Cc: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

Andrew Burgess <aburgess@redhat.com> writes:

> tdevries <tdevries@suse.de> writes:
>
>> On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
>>> Luis Machado <luis.machado@arm.com> writes:
>>> 
>>>> On 12/15/22 11:22, Andrew Burgess wrote:
>>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>> 
>>>>>> Hi Andrew,
>>>>>> 
>>>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>> 
>>>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches 
>>>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>>>> 
>>>>>>>> Thank you for doing this.
>>>>>>>> 
>>>>>>>> Andrew>   - However, GDB checks each partial symbol using multiple 
>>>>>>>> languages,
>>>>>>>> Andrew>     not just the current language (C in this case), so, 
>>>>>>>> when GDB
>>>>>>>> Andrew>     checks using the C++ language, the symbol name is 
>>>>>>>> first demangled,
>>>>>>>> Andrew>     the code that does this can be found
>>>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the 
>>>>>>>> demangled form of
>>>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any 
>>>>>>>> symbols with
>>>>>>>> Andrew>     the name 'int', most partial symtabs will contain such 
>>>>>>>> a symbol,
>>>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>>>> 
>>>>>>>> It's a pedantic point but what happens here is name 
>>>>>>>> canonicalization,
>>>>>>>> not demangling.  Demangling is just used to refer to the 
>>>>>>>> translation
>>>>>>>> from a name like "_Zmumble" to "something::else" -- that is, the 
>>>>>>>> input
>>>>>>>> is a linkage name and the output is a C++ name.  Canonicalization 
>>>>>>>> takes
>>>>>>>> a C++ name as input and returns the standard form, basically 
>>>>>>>> dealing
>>>>>>>> with the fact that C++ (and as we discovered, C) has multiple 
>>>>>>>> possible
>>>>>>>> spellings for some symbols.
>>>>>>> 
>>>>>>> Please, be pedantic.  My goal here was to better understand this 
>>>>>>> code,
>>>>>>> there's no point me understanding it wrong.
>>>>>>> 
>>>>>>> I'll reword that paragraph.
>>>>>>> 
>>>>>>> Thanks for taking a look.
>>>>>>> 
>>>>>>> Andrew
>>>>>>> 
>>>>>> 
>>>>>> I'm not saying you should investigate this, as it is a new test, but 
>>>>>> I'm getting a lot of these messages for this test:
>>>>>> 
>>>>>> ERROR: internal buffer is full.
>>>>> 
>>>>> Happy to take a look at the problem.
>>>>> 
>>>>> I guess the issue is coming from the gdb_test_multiple that I use in 
>>>>> the
>>>>> new test script.
>>>>> 
>>>>> I've tried to write patterns that match and discard all the lines as 
>>>>> they
>>>>> arrive from GDB.  I guess you are seeing a pattern that I am not for
>>>>> some reason.
>>>>> 
>>>>> Could you run just this test and attach the gdb.log file and I'll 
>>>>> take a
>>>>> look.  I probably just need to tweak one of the patterns a little.
>>>>> 
>>>>> Thanks,
>>>>> Andrew
>>>>> 
>>>> 
>>>> I briefly looked into this. The problem seems to arise from the fact 
>>>> that sometimes we don't have multiple lines for the "info sources" 
>>>> output.
>>>> 
>>>> Some sections are output in a single line. For example, one of them 
>>>> has 133K characters. But each entry seems to be separated by a comma 
>>>> character:
>>>> 
>>>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h, 
>>>> ./elf/../sysdeps/generic/ldsodefs.h, 
>>>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
>>> 
>>> Ahh, that would explain it.  We don't appear to use 'info sources' that
>>> frequently in the testsuite.  I wonder if you are also seeing failures
>>> on those other tests?
>>> 
>>>   gdb.asm/asm-source.exp
>>>   gdb.dwarf2/dup-psym.exp
>>>   gdb.dwarf2/dw2-filename.exp
>>> 
>>>> It might be best (for the testsuite) if gdb outputs this data across 
>>>> more lines.
>>> 
>>> The other option might be to extend 'info sources' to allow filtering
>>> based on the objfile name, then we can use this in the testsuite to
>>> limit the output...
>>> 
>>> ... or I wonder if we could trick GDB by setting the width to something
>>> small, then I guess the lines would be broken after the ',' characters.
>>> 
>>> I'll have a play and see what I can come up with.
>>> 
>>
>> I also ran into this issue on ubuntu 22.04.1 x86_64.
>>
>> AFAIK, the way we usually test for this type of information is "maint 
>> print objfile", which is less verbose, and doesn't have long lines.
>
> I'm looking at this issue today, I'll give 'maint print objfile' a go.
> Thanks for the suggestion.

I was able to reproduce the buffer overflow errors.  The patch below
addresses the issue for me.

Thoughts?

Thanks,
Andrew

---

commit e1f51c1b3b37d96e679fa2698eb83a6a3a05eb53
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Tue Dec 20 12:51:50 2022 +0000

    gdb/testsuite: fix buffer overflow in gdb.base/signed-builtin-types.exp
    
    In commit:
    
      commit 9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12
      Date:   Wed Dec 7 15:55:25 2022 +0000
    
          gdb/testsuite: new test for recent dwarf reader issue
    
    A new test (gdb.base/signed-builtin-types.exp) was added that made use
    of 'info sources' to figure out if the debug information for a
    particular object file had been fully expanded or not.  Unfortunately,
    some lines of the 'info sources' output can be very long.  This was
    observed on some systems where the debug information for the
    dynamic linker was installed; in this case, the list of source files
    associated with the dynamic linker was so long that it would cause
    expect's internal buffer to overflow.
    
    This commit switches from using 'info sources' to 'maint print
    objfiles'.  The output from the latter command is more compact and
    can also be restricted to a named object file.
    
    With this change in place I am no longer seeing buffer overflow errors
    from expect when running gdb.base/signed-builtin-types.exp.

diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.exp b/gdb/testsuite/gdb.base/signed-builtin-types.exp
index e9784330fee..fdb9251758e 100644
--- a/gdb/testsuite/gdb.base/signed-builtin-types.exp
+++ b/gdb/testsuite/gdb.base/signed-builtin-types.exp
@@ -21,7 +21,8 @@ standard_testfile .c -lib.c
 
 # Compile the shared library.
 set srcdso [file join $srcdir $subdir $srcfile2]
-set objdso [standard_output_file lib${gdb_test_file_name}.so]
+set libname "lib${gdb_test_file_name}.so"
+set objdso [standard_output_file $libname]
 if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
     untested "failed to compile dso"
     return -1
@@ -47,45 +48,39 @@ if {[readnow]} {
 # information has NOT been fully expanded (which is what we want for this
 # test).
 proc shared_library_debug_not_fully_expanded {} {
-    set library_expanded ""
-    gdb_test_multiple "info sources" "" {
-	-re "^info sources\r\n" {
+    set not_expanded true
+    gdb_test_multiple "maint print objfiles $::libname" "" {
+	-re "^maint print objfiles \[^\r\n\]+\r\n" {
 	    exp_continue
 	}
-	-re "^(\[^\r\n\]+):\r\n\\(Full debug information has not yet been read for this file\\.\\)\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		set library_expanded "no"
-	    }
+
+	-re "^\\s*\r\n" {
+	    exp_continue
+	}
+
+	-re "^Object file \[^\r\n\]+\r\n" {
 	    exp_continue
 	}
-	-re "^(\[^\r\n\]+):\r\n\\(Objfile has no debug information\\.\\)\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		# For some reason the shared library has no debug
-		# information, this is not expected.
-		set library_expanded "missing debug"
-	    }
+
+	-re "^Cooked index in use\r\n" {
 	    exp_continue
 	}
-	-re "^(\[^\r\n\]+):\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		set library_expanded "yes"
-	    }
+
+	-re "^Symtabs:\r\n" {
+	    set not_expanded false
 	    exp_continue
 	}
+
 	-re "^$::gdb_prompt $" {
-	    gdb_assert {[string equal $library_expanded "yes"] \
-			    || [string equal $library_expanded "no"]} \
-		$gdb_test_name
+	    pass $gdb_test_name
 	}
-	-re "^(\[^\r\n:\]*)\r\n" {
+
+	-re "^\[^\r\n\]+\r\n" {
 	    exp_continue
 	}
     }
 
-    return [expr $library_expanded == "no"]
+    return $not_expanded
 }
 
 foreach_with_prefix type_name {"short" "int" "long" "char"} {
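
For anyone reading along without the testsuite to hand, the decision
the new proc makes can be sketched in standalone Python.  This is only
an illustration of the logic in the patch above, not testsuite code; the
function name and the sample text are made up, loosely modelled on the
patterns the patch matches.

```python
def debug_not_fully_expanded(output: str) -> bool:
    """Return True unless the output contains a 'Symtabs:' line.

    Mirrors the patch: every other line is read and discarded, and
    only an exact 'Symtabs:' line flips the result to False.
    """
    for line in output.splitlines():
        if line == "Symtabs:":
            return False
    return True


# Made-up sample text, not real GDB output.
unexpanded = "Object file /tmp/libdemo.so\r\nCooked index in use\r\n"
expanded = unexpanded + "\r\nSymtabs:\r\n\t{ demo-lib.c }\r\n"

print(debug_not_fully_expanded(unexpanded))
print(debug_not_fully_expanded(expanded))
```

Because every line is consumed as soon as it arrives, and the command
is restricted to one object file, no single pattern has to hold a long
list of source files.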


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-20 13:20                   ` Andrew Burgess
@ 2022-12-20 14:04                     ` Luis Machado
  2022-12-20 14:54                     ` tdevries
  1 sibling, 0 replies; 17+ messages in thread
From: Luis Machado @ 2022-12-20 14:04 UTC (permalink / raw)
  To: Andrew Burgess, tdevries; +Cc: Tom Tromey, Andrew Burgess via Gdb-patches

On 12/20/22 13:20, Andrew Burgess wrote:
> Andrew Burgess <aburgess@redhat.com> writes:
> 
>> tdevries <tdevries@suse.de> writes:
>>
>>> On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>
>>>>> On 12/15/22 11:22, Andrew Burgess wrote:
>>>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>>>
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>>
>>>>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches
>>>>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>>>>>
>>>>>>>>> Thank you for doing this.
>>>>>>>>>
>>>>>>>>> Andrew>   - However, GDB checks each partial symbol using multiple
>>>>>>>>> languages,
>>>>>>>>> Andrew>     not just the current language (C in this case), so,
>>>>>>>>> when GDB
>>>>>>>>> Andrew>     checks using the C++ language, the symbol name is
>>>>>>>>> first demangled,
>>>>>>>>> Andrew>     the code that does this can be found
>>>>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the
>>>>>>>>> demangled form of
>>>>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any
>>>>>>>>> symbols with
>>>>>>>>> Andrew>     the name 'int', most partial symtabs will contain such
>>>>>>>>> a symbol,
>>>>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>>>>>
>>>>>>>>> It's a pedantic point but what happens here is name
>>>>>>>>> canonicalization,
>>>>>>>>> not demangling.  Demangling is just used to refer to the
>>>>>>>>> translation
>>>>>>>>> from a name like "_Zmumble" to "something::else" -- that is, the
>>>>>>>>> input
>>>>>>>>> is a linkage name and the output is a C++ name.  Canonicalization
>>>>>>>>> takes
>>>>>>>>> a C++ name as input and returns the standard form, basically
>>>>>>>>> dealing
>>>>>>>>> with the fact that C++ (and as we discovered, C) has multiple
>>>>>>>>> possible
>>>>>>>>> spellings for some symbols.
>>>>>>>>
>>>>>>>> Please, be pedantic.  My goal here was to better understand this
>>>>>>>> code,
>>>>>>>> there's no point me understanding it wrong.
>>>>>>>>
>>>>>>>> I'll reword that paragraph.
>>>>>>>>
>>>>>>>> Thanks for taking a look.
>>>>>>>>
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>
>>>>>>> I'm not saying you should investigate this, as it is a new test, but
>>>>>>> I'm getting a lot of these messages for this test:
>>>>>>>
>>>>>>> ERROR: internal buffer is full.
>>>>>>
>>>>>> Happy to take a look at the problem.
>>>>>>
>>>>>> I guess the issue is coming from the gdb_test_multiple that I use in
>>>>>> the
>>>>>> new test script.
>>>>>>
>>>>>> I've tried to write patterns that match and discard all the lines as
>>>>>> they
>>>>>> arrive from GDB.  I guess you are seeing a pattern that I am not for
>>>>>> some reason.
>>>>>>
>>>>>> Could you run just this test and attach the gdb.log file and I'll
>>>>>> take a
>>>>>> look.  I probably just need to tweak one of the patterns a little.
>>>>>>
>>>>>> Thanks,
>>>>>> Andrew
>>>>>>
>>>>>
>>>>> I briefly looked into this. The problem seems to arise from the fact
>>>>> that sometimes we don't have multiple lines for the "info sources"
>>>>> output.
>>>>>
>>>>> Some sections are output in a single line. For example, one of them
>>>>> has 133K characters. But each entry seems to be separated by a comma
>>>>> character:
>>>>>
>>>>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h,
>>>>> ./elf/../sysdeps/generic/ldsodefs.h,
>>>>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
>>>>
>>>> Ahh, that would explain it.  We don't appear to use 'info sources' that
>>>> frequently in the testsuite.  I wonder if you are also seeing failures
>>>> on those other tests?
>>>>
>>>>    gdb.asm/asm-source.exp
>>>>    gdb.dwarf2/dup-psym.exp
>>>>    gdb.dwarf2/dw2-filename.exp
>>>>
>>>>> It might be best (for the testsuite) if gdb outputs this data across
>>>>> more lines.
>>>>
>>>> The other option might be to extend 'info sources' to allow filtering
>>>> based on the objfile name, then we can use this in the testsuite to
>>>> limit the output...
>>>>
>>>> ... or I wonder if we could trick GDB by setting the width to something
>>>> small, then I guess the lines would be broken after the ',' characters.
>>>>
>>>> I'll have a play and see what I can come up with.
>>>>
>>>
>>> I also ran into this issue on ubuntu 22.04.1 x86_64.
>>>
>>> AFAIK, the way we usually test for this type of information is "maint
>>> print objfile", which is less verbose, and doesn't have long lines.
>>
>> I'm looking at this issue today, I'll give 'maint print objfile' a go.
>> Thanks for the suggestion.
> 
> I was able to reproduce the buffer overflow errors.  The patch below
> addresses the issue for me.
> 
> Thoughts?
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit e1f51c1b3b37d96e679fa2698eb83a6a3a05eb53
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Dec 20 12:51:50 2022 +0000
> 
>      gdb/testsuite: fix buffer overflow in gdb.base/signed-builtin-types.exp
>      
>      In commit:
>      
>        commit 9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12
>        Date:   Wed Dec 7 15:55:25 2022 +0000
>      
>            gdb/testsuite: new test for recent dwarf reader issue
>      
>      A new test (gdb.base/signed-builtin-types.exp) was added that made use
>      of 'info sources' to figure out if the debug information for a
>      particular object file had been fully expanded or not.  Unfortunately
>      some lines of the 'info sources' output can be very long, this was
>      observed on some systems where the debug information for the
>      dynamic-linker was installed, in this case, the list of source files
>      associated with the dynamic linker was so long it would cause expect's
>      internal buffer to overflow.
>      
>      This commit switches from using 'info sources' to 'maint print
>      objfile', the output from the latter command is more compact, but
>      also, can be restricted to a named object file.
>      
>      With this change in place I am no longer seeing buffer overflow errors
>      from expect when running gdb.base/signed-builtin-types.exp.
> 
> diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.exp b/gdb/testsuite/gdb.base/signed-builtin-types.exp
> index e9784330fee..fdb9251758e 100644
> --- a/gdb/testsuite/gdb.base/signed-builtin-types.exp
> +++ b/gdb/testsuite/gdb.base/signed-builtin-types.exp
> @@ -21,7 +21,8 @@ standard_testfile .c -lib.c
>   
>   # Compile the shared library.
>   set srcdso [file join $srcdir $subdir $srcfile2]
> -set objdso [standard_output_file lib${gdb_test_file_name}.so]
> +set libname "lib${gdb_test_file_name}.so"
> +set objdso [standard_output_file $libname]
>   if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
>       untested "failed to compile dso"
>       return -1
> @@ -47,45 +48,39 @@ if {[readnow]} {
>   # information has NOT been fully expanded (which is what we want for this
>   # test).
>   proc shared_library_debug_not_fully_expanded {} {
> -    set library_expanded ""
> -    gdb_test_multiple "info sources" "" {
> -	-re "^info sources\r\n" {
> +    set not_expanded true
> +    gdb_test_multiple "maint print objfiles $::libname" "" {
> +	-re "^maint print objfiles \[^\r\n\]+\r\n" {
>   	    exp_continue
>   	}
> -	-re "^(\[^\r\n\]+):\r\n\\(Full debug information has not yet been read for this file\\.\\)\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		set library_expanded "no"
> -	    }
> +
> +	-re "^\\s*\r\n" {
> +	    exp_continue
> +	}
> +
> +	-re "^Object file \[^\r\n\]+\r\n" {
>   	    exp_continue
>   	}
> -	-re "^(\[^\r\n\]+):\r\n\\(Objfile has no debug information\\.\\)\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		# For some reason the shared library has no debug
> -		# information, this is not expected.
> -		set library_expanded "missing debug"
> -	    }
> +
> +	-re "^Cooked index in use\r\n" {
>   	    exp_continue
>   	}
> -	-re "^(\[^\r\n\]+):\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		set library_expanded "yes"
> -	    }
> +
> +	-re "^Symtabs:\r\n" {
> +	    set not_expanded false
>   	    exp_continue
>   	}
> +
>   	-re "^$::gdb_prompt $" {
> -	    gdb_assert {[string equal $library_expanded "yes"] \
> -			    || [string equal $library_expanded "no"]} \
> -		$gdb_test_name
> +	    pass $gdb_test_name
>   	}
> -	-re "^(\[^\r\n:\]*)\r\n" {
> +
> +	-re "^\[^\r\n\]+\r\n" {
>   	    exp_continue
>   	}
>       }
>   
> -    return [expr $library_expanded == "no"]
> +    return $not_expanded
>   }
>   
>   foreach_with_prefix type_name {"short" "int" "long" "char"} {
> 

LGTM, and it fixes things for me as well. Thanks a lot for looking into this.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-20 13:20                   ` Andrew Burgess
  2022-12-20 14:04                     ` Luis Machado
@ 2022-12-20 14:54                     ` tdevries
  2022-12-24 16:05                       ` Andrew Burgess
  1 sibling, 1 reply; 17+ messages in thread
From: tdevries @ 2022-12-20 14:54 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

On 2022-12-20 13:20, Andrew Burgess wrote:
> Andrew Burgess <aburgess@redhat.com> writes:
> 
>> tdevries <tdevries@suse.de> writes:
>> 
>>> On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
>>>> Luis Machado <luis.machado@arm.com> writes:
>>>> 
>>>>> On 12/15/22 11:22, Andrew Burgess wrote:
>>>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>>> 
>>>>>>> Hi Andrew,
>>>>>>> 
>>>>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>> 
>>>>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches
>>>>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>>>>> 
>>>>>>>>> Thank you for doing this.
>>>>>>>>> 
>>>>>>>>> Andrew>   - However, GDB checks each partial symbol using 
>>>>>>>>> multiple
>>>>>>>>> languages,
>>>>>>>>> Andrew>     not just the current language (C in this case), so,
>>>>>>>>> when GDB
>>>>>>>>> Andrew>     checks using the C++ language, the symbol name is
>>>>>>>>> first demangled,
>>>>>>>>> Andrew>     the code that does this can be found
>>>>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the
>>>>>>>>> demangled form of
>>>>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any
>>>>>>>>> symbols with
>>>>>>>>> Andrew>     the name 'int', most partial symtabs will contain 
>>>>>>>>> such
>>>>>>>>> a symbol,
>>>>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>>>>> 
>>>>>>>>> It's a pedantic point but what happens here is name
>>>>>>>>> canonicalization,
>>>>>>>>> not demangling.  Demangling is just used to refer to the
>>>>>>>>> translation
>>>>>>>>> from a name like "_Zmumble" to "something::else" -- that is, 
>>>>>>>>> the
>>>>>>>>> input
>>>>>>>>> is a linkage name and the output is a C++ name.  
>>>>>>>>> Canonicalization
>>>>>>>>> takes
>>>>>>>>> a C++ name as input and returns the standard form, basically
>>>>>>>>> dealing
>>>>>>>>> with the fact that C++ (and as we discovered, C) has multiple
>>>>>>>>> possible
>>>>>>>>> spellings for some symbols.
>>>>>>>> 
>>>>>>>> Please, be pedantic.  My goal here was to better understand this
>>>>>>>> code,
>>>>>>>> there's no point me understanding it wrong.
>>>>>>>> 
>>>>>>>> I'll reword that paragraph.
>>>>>>>> 
>>>>>>>> Thanks for taking a look.
>>>>>>>> 
>>>>>>>> Andrew
>>>>>>>> 
>>>>>>> 
>>>>>>> I'm not saying you should investigate this, as it is a new test, 
>>>>>>> but
>>>>>>> I'm getting a lot of these messages for this test:
>>>>>>> 
>>>>>>> ERROR: internal buffer is full.
>>>>>> 
>>>>>> Happy to take a look at the problem.
>>>>>> 
>>>>>> I guess the issue is coming from the gdb_test_multiple that I use 
>>>>>> in
>>>>>> the
>>>>>> new test script.
>>>>>> 
>>>>>> I tried to write patterns that match and discard all the lines as
>>>>>> they arrive from GDB.  I guess you are seeing a pattern that I am
>>>>>> not for some reason.
>>>>>> 
>>>>>> Could you run just this test and attach the gdb.log file and I'll
>>>>>> take a
>>>>>> look.  I probably just need to tweak one of the patterns a little.
>>>>>> 
>>>>>> Thanks,
>>>>>> Andrew
>>>>>> 
>>>>> 
>>>>> I briefly looked into this. The problem seems to arise from the 
>>>>> fact
>>>>> that sometimes we don't have multiple lines for the "info sources"
>>>>> output.
>>>>> 
>>>>> Some sections are output in a single line. For example, one of them
>>>>> has 133K characters. But each entry seems to be separated by a 
>>>>> comma
>>>>> character:
>>>>> 
>>>>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h,
>>>>> ./elf/../sysdeps/generic/ldsodefs.h,
>>>>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
>>>> 
>>>> Ahh, that would explain it.  We don't appear to use 'info sources' 
>>>> that
>>>> frequently in the testsuite.  I wonder if you are also seeing 
>>>> failures
>>>> on those other tests?
>>>> 
>>>>   gdb.asm/asm-source.exp
>>>>   gdb.dwarf2/dup-psym.exp
>>>>   gdb.dwarf2/dw2-filename.exp
>>>> 
>>>>> It might be best (for the testsuite) if gdb outputs this data 
>>>>> across
>>>>> more lines.
>>>> 
>>>> The other option might be to extend 'info sources' to allow 
>>>> filtering
>>>> based on the objfile name, then we can use this in the testsuite to
>>>> limit the output...
>>>> 
>>>> ... or I wonder if we could trick GDB by setting the width to
>>>> something small, then I guess the lines would be broken after the
>>>> ',' characters.
>>>> 
>>>> I'll have a play and see what I can come up with.
>>>> 
>>> 
>>> I also ran into this issue on ubuntu 22.04.1 x86_64.
>>> 
>>> AFAIK, the way we usually test for this type of information is "maint
>>> print objfile", which is less verbose, and doesn't have long lines.
>> 
>> I'm looking at this issue today, I'll give 'maint print objfile' a go.
>> Thanks for the suggestion.
> 
> I was able to reproduce the buffer overflow errors.  The patch below
> addresses the issue for me.
> 
> Thoughts?

LGTM.

Though I wonder if we can make do with being less precise, and just do 
something like:
...
proc assert_shared_library_debug_not_fully_expanded {} {
     gdb_test_lines "maint print objfiles $::libname" "" \
         "Object file \[^\r\n\]*$::libname" \
         -re-not "Symtabs:"
}
...

Thanks,
- Tom

> Thanks,
> Andrew
> 
> ---
> 
> commit e1f51c1b3b37d96e679fa2698eb83a6a3a05eb53
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Dec 20 12:51:50 2022 +0000
> 
>     gdb/testsuite: fix buffer overflow in gdb.base/signed-builtin-types.exp
> 
>     In commit:
> 
>       commit 9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12
>       Date:   Wed Dec 7 15:55:25 2022 +0000
> 
>           gdb/testsuite: new test for recent dwarf reader issue
> 
>     A new test (gdb.base/signed-builtin-types.exp) was added that made use
>     of 'info sources' to figure out if the debug information for a
>     particular object file had been fully expanded or not.  Unfortunately
>     some lines of the 'info sources' output can be very long, this was
>     observed on some systems where the debug information for the
>     dynamic-linker was installed, in this case, the list of source files
>     associated with the dynamic linker was so long it would cause expect's
>     internal buffer to overflow.
> 
>     This commit switches from using 'info sources' to 'maint print
>     objfile', the output from the latter command is more compact, but
>     also, can be restricted to a named object file.
> 
>     With this change in place I am no longer seeing buffer overflow errors
>     from expect when running gdb.base/signed-builtin-types.exp.
> 
> diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.exp b/gdb/testsuite/gdb.base/signed-builtin-types.exp
> index e9784330fee..fdb9251758e 100644
> --- a/gdb/testsuite/gdb.base/signed-builtin-types.exp
> +++ b/gdb/testsuite/gdb.base/signed-builtin-types.exp
> @@ -21,7 +21,8 @@ standard_testfile .c -lib.c
> 
>  # Compile the shared library.
>  set srcdso [file join $srcdir $subdir $srcfile2]
> -set objdso [standard_output_file lib${gdb_test_file_name}.so]
> +set libname "lib${gdb_test_file_name}.so"
> +set objdso [standard_output_file $libname]
>  if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
>      untested "failed to compile dso"
>      return -1
> @@ -47,45 +48,39 @@ if {[readnow]} {
>  # information has NOT been fully expanded (which is what we want for this
>  # test).
>  proc shared_library_debug_not_fully_expanded {} {
> -    set library_expanded ""
> -    gdb_test_multiple "info sources" "" {
> -	-re "^info sources\r\n" {
> +    set not_expanded true
> +    gdb_test_multiple "maint print objfiles $::libname" "" {
> +	-re "^maint print objfiles \[^\r\n\]+\r\n" {
>  	    exp_continue
>  	}
> -	-re "^(\[^\r\n\]+):\r\n\\(Full debug information has not yet been read for this file\\.\\)\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		set library_expanded "no"
> -	    }
> +
> +	-re "^\\s*\r\n" {
> +	    exp_continue
> +	}
> +
> +	-re "^Object file \[^\r\n\]+\r\n" {
>  	    exp_continue
>  	}
> -	-re "^(\[^\r\n\]+):\r\n\\(Objfile has no debug information\\.\\)\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		# For some reason the shared library has no debug
> -		# information, this is not expected.
> -		set library_expanded "missing debug"
> -	    }
> +
> +	-re "^Cooked index in use\r\n" {
>  	    exp_continue
>  	}
> -	-re "^(\[^\r\n\]+):\r\n\r\n" {
> -	    set libname $expect_out(1,string)
> -	    if {$libname == $::objdso} {
> -		set library_expanded "yes"
> -	    }
> +
> +	-re "^Symtabs:\r\n" {
> +	    set not_expanded false
>  	    exp_continue
>  	}
> +
>  	-re "^$::gdb_prompt $" {
> -	    gdb_assert {[string equal $library_expanded "yes"] \
> -			    || [string equal $library_expanded "no"]} \
> -		$gdb_test_name
> +	    pass $gdb_test_name
>  	}
> -	-re "^(\[^\r\n:\]*)\r\n" {
> +
> +	-re "^\[^\r\n\]+\r\n" {
>  	    exp_continue
>  	}
>      }
> 
> -    return [expr $library_expanded == "no"]
> +    return $not_expanded
>  }
> 
>  foreach_with_prefix type_name {"short" "int" "long" "char"} {

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue
  2022-12-20 14:54                     ` tdevries
@ 2022-12-24 16:05                       ` Andrew Burgess
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Burgess @ 2022-12-24 16:05 UTC (permalink / raw)
  To: tdevries; +Cc: Luis Machado, Tom Tromey, Andrew Burgess via Gdb-patches

tdevries <tdevries@suse.de> writes:

> On 2022-12-20 13:20, Andrew Burgess wrote:
>> Andrew Burgess <aburgess@redhat.com> writes:
>> 
>>> tdevries <tdevries@suse.de> writes:
>>> 
>>>> On 2022-12-19 13:52, Andrew Burgess via Gdb-patches wrote:
>>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>> 
>>>>>> On 12/15/22 11:22, Andrew Burgess wrote:
>>>>>>> Luis Machado <luis.machado@arm.com> writes:
>>>>>>> 
>>>>>>>> Hi Andrew,
>>>>>>>> 
>>>>>>>> On 12/9/22 19:24, Andrew Burgess via Gdb-patches wrote:
>>>>>>>>> Tom Tromey <tom@tromey.com> writes:
>>>>>>>>> 
>>>>>>>>>>>>>>> "Andrew" == Andrew Burgess via Gdb-patches
>>>>>>>>>>>>>>> <gdb-patches@sourceware.org> writes:
>>>>>>>>>> 
>>>>>>>>>> Thank you for doing this.
>>>>>>>>>> 
>>>>>>>>>> Andrew>   - However, GDB checks each partial symbol using 
>>>>>>>>>> multiple
>>>>>>>>>> languages,
>>>>>>>>>> Andrew>     not just the current language (C in this case), so,
>>>>>>>>>> when GDB
>>>>>>>>>> Andrew>     checks using the C++ language, the symbol name is
>>>>>>>>>> first demangled,
>>>>>>>>>> Andrew>     the code that does this can be found
>>>>>>>>>> Andrew>     lookup_name_info::language_lookup_name.  As the
>>>>>>>>>> demangled form of
>>>>>>>>>> Andrew>     'signed int' is just 'int', GDB then looks for any
>>>>>>>>>> symbols with
>>>>>>>>>> Andrew>     the name 'int', most partial symtabs will contain 
>>>>>>>>>> such
>>>>>>>>>> a symbol,
>>>>>>>>>> Andrew>     so GDB ends up expanding pretty much every symtab.
>>>>>>>>>> 
>>>>>>>>>> It's a pedantic point but what happens here is name
>>>>>>>>>> canonicalization,
>>>>>>>>>> not demangling.  Demangling is just used to refer to the
>>>>>>>>>> translation
>>>>>>>>>> from a name like "_Zmumble" to "something::else" -- that is, 
>>>>>>>>>> the
>>>>>>>>>> input
>>>>>>>>>> is a linkage name and the output is a C++ name.  
>>>>>>>>>> Canonicalization
>>>>>>>>>> takes
>>>>>>>>>> a C++ name as input and returns the standard form, basically
>>>>>>>>>> dealing
>>>>>>>>>> with the fact that C++ (and as we discovered, C) has multiple
>>>>>>>>>> possible
>>>>>>>>>> spellings for some symbols.
>>>>>>>>> 
>>>>>>>>> Please, be pedantic.  My goal here was to better understand this
>>>>>>>>> code,
>>>>>>>>> there's no point me understanding it wrong.
>>>>>>>>> 
>>>>>>>>> I'll reword that paragraph.
>>>>>>>>> 
>>>>>>>>> Thanks for taking a look.
>>>>>>>>> 
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> I'm not saying you should investigate this, as it is a new test, 
>>>>>>>> but
>>>>>>>> I'm getting a lot of these messages for this test:
>>>>>>>> 
>>>>>>>> ERROR: internal buffer is full.
>>>>>>> 
>>>>>>> Happy to take a look at the problem.
>>>>>>> 
>>>>>>> I guess the issue is coming from the gdb_test_multiple that I use 
>>>>>>> in
>>>>>>> the
>>>>>>> new test script.
>>>>>>> 
>>>>>>> I tried to write patterns that match and discard all the lines as
>>>>>>> they arrive from GDB.  I guess you are seeing a pattern that I am
>>>>>>> not for some reason.
>>>>>>> 
>>>>>>> Could you run just this test and attach the gdb.log file and I'll
>>>>>>> take a
>>>>>>> look.  I probably just need to tweak one of the patterns a little.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Andrew
>>>>>>> 
>>>>>> 
>>>>>> I briefly looked into this. The problem seems to arise from the 
>>>>>> fact
>>>>>> that sometimes we don't have multiple lines for the "info sources"
>>>>>> output.
>>>>>> 
>>>>>> Some sections are output in a single line. For example, one of them
>>>>>> has 133K characters. But each entry seems to be separated by a 
>>>>>> comma
>>>>>> character:
>>>>>> 
>>>>>> ./elf/./elf/rtld.c, ./elf/../include/rtld-malloc.h,
>>>>>> ./elf/../sysdeps/generic/ldsodefs.h,
>>>>>> ./elf/../sysdeps/aarch64/dl-machine.h, ...
>>>>> 
>>>>> Ahh, that would explain it.  We don't appear to use 'info sources' 
>>>>> that
>>>>> frequently in the testsuite.  I wonder if you are also seeing 
>>>>> failures
>>>>> on those other tests?
>>>>> 
>>>>>   gdb.asm/asm-source.exp
>>>>>   gdb.dwarf2/dup-psym.exp
>>>>>   gdb.dwarf2/dw2-filename.exp
>>>>> 
>>>>>> It might be best (for the testsuite) if gdb outputs this data 
>>>>>> across
>>>>>> more lines.
>>>>> 
>>>>> The other option might be to extend 'info sources' to allow 
>>>>> filtering
>>>>> based on the objfile name, then we can use this in the testsuite to
>>>>> limit the output...
>>>>> 
>>>>> ... or I wonder if we could trick GDB by setting the width to
>>>>> something small, then I guess the lines would be broken after the
>>>>> ',' characters.
>>>>> 
>>>>> I'll have a play and see what I can come up with.
>>>>> 
>>>> 
>>>> I also ran into this issue on ubuntu 22.04.1 x86_64.
>>>> 
>>>> AFAIK, the way we usually test for this type of information is "maint
>>>> print objfile", which is less verbose, and doesn't have long lines.
>>> 
>>> I'm looking at this issue today, I'll give 'maint print objfile' a go.
>>> Thanks for the suggestion.
>> 
>> I was able to reproduce the buffer overflow errors.  The patch below
>> addresses the issue for me.
>> 
>> Thoughts?
>
> LGTM.
>
> Though I wonder if we can make do with being less precise, and just do 
> something like:
> ...
> proc assert_shared_library_debug_not_fully_expanded {} {
>      gdb_test_lines "maint print objfiles $::libname" "" \
>          "Object file \[^\r\n\]*$::libname" \
>          -re-not "Symtabs:"
> }
> ...
>

Thanks for that suggestion Tom, that really is much better than what I
had.

I've taken your suggestion and pushed the fix to master.  My final patch
is below.

Thanks,
Andrew

---

commit 3a98808c164b36c7023bd80fc6b019cbe6274365
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Tue Dec 20 12:51:50 2022 +0000

    gdb/testsuite: fix buffer overflow in gdb.base/signed-builtin-types.exp
    
    In commit:
    
      commit 9f50fe0835850645bd8ea9bb1efe1fe6c48dfb12
      Date:   Wed Dec 7 15:55:25 2022 +0000
    
          gdb/testsuite: new test for recent dwarf reader issue
    
    A new test (gdb.base/signed-builtin-types.exp) was added that made use
    of 'info sources' to figure out if the debug information for a
    particular object file had been fully expanded or not.  Unfortunately,
    some lines of the 'info sources' output can be very long.  This was
    observed on some systems where the debug information for the
    dynamic linker was installed; in this case, the list of source files
    associated with the dynamic linker was so long that it would cause
    expect's internal buffer to overflow.
    
    This commit switches from using 'info sources' to 'maint print
    objfiles'.  The output from the latter command is more compact and
    can also be restricted to a single named object file.
    
    With this change in place I am no longer seeing buffer overflow errors
    from expect when running gdb.base/signed-builtin-types.exp.
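
For anyone trying to picture the failure mode described above, here is a
rough sketch in plain Tcl.  It is only a model under stated assumptions:
the proc name consume_lines and the max_buffer parameter are made up for
illustration, and this is not how expect is actually implemented.  It
shows why patterns that discard one complete line at a time keep the
buffer small for short lines, but cannot help when a single line is
longer than the buffer, since nothing can be discarded until the line
terminator arrives.

```tcl
# Hypothetical model of a bounded match buffer.  This is an
# illustration only, not expect's real implementation.
proc consume_lines {chunks max_buffer} {
    set buffer ""
    set lines 0
    foreach chunk $chunks {
        append buffer $chunk
        # Discard every complete line, much like a catch-all
        # "^\[^\r\n\]+\r\n" pattern followed by exp_continue.
        while {[regexp {^[^\r\n]*\r\n} $buffer line]} {
            set buffer [string range $buffer [string length $line] end]
            incr lines
        }
        # Whatever is left is an incomplete line that has to stay
        # buffered until its terminator shows up.
        if {[string length $buffer] > $max_buffer} {
            return "overflow after $lines lines"
        }
    }
    return "ok after $lines lines"
}

# Many short lines: the buffer is drained after every chunk.
puts [consume_lines [list "a.c\r\n" "b.c\r\n" "c.c\r\n"] 100]

# One very long comma separated line arriving in pieces: nothing can
# be discarded, so the modelled buffer limit is exceeded.
puts [consume_lines [list [string repeat "x.c, " 50] \
                          [string repeat "y.c, " 50] "\r\n"] 100]
```

In this model the only ways out are a larger buffer or shorter lines,
which is consistent with switching to a command whose output is more
compact.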

diff --git a/gdb/testsuite/gdb.base/signed-builtin-types.exp b/gdb/testsuite/gdb.base/signed-builtin-types.exp
index e9784330fee..30e224fb439 100644
--- a/gdb/testsuite/gdb.base/signed-builtin-types.exp
+++ b/gdb/testsuite/gdb.base/signed-builtin-types.exp
@@ -21,7 +21,8 @@ standard_testfile .c -lib.c
 
 # Compile the shared library.
 set srcdso [file join $srcdir $subdir $srcfile2]
-set objdso [standard_output_file lib${gdb_test_file_name}.so]
+set libname "lib${gdb_test_file_name}.so"
+set objdso [standard_output_file $libname]
 if {[gdb_compile_shlib $srcdso $objdso {debug}] != ""} {
     untested "failed to compile dso"
     return -1
@@ -46,46 +47,10 @@ if {[readnow]} {
 # library has been fully expanded or not.  Return true if the debug
 # information has NOT been fully expanded (which is what we want for this
 # test).
-proc shared_library_debug_not_fully_expanded {} {
-    set library_expanded ""
-    gdb_test_multiple "info sources" "" {
-	-re "^info sources\r\n" {
-	    exp_continue
-	}
-	-re "^(\[^\r\n\]+):\r\n\\(Full debug information has not yet been read for this file\\.\\)\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		set library_expanded "no"
-	    }
-	    exp_continue
-	}
-	-re "^(\[^\r\n\]+):\r\n\\(Objfile has no debug information\\.\\)\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		# For some reason the shared library has no debug
-		# information, this is not expected.
-		set library_expanded "missing debug"
-	    }
-	    exp_continue
-	}
-	-re "^(\[^\r\n\]+):\r\n\r\n" {
-	    set libname $expect_out(1,string)
-	    if {$libname == $::objdso} {
-		set library_expanded "yes"
-	    }
-	    exp_continue
-	}
-	-re "^$::gdb_prompt $" {
-	    gdb_assert {[string equal $library_expanded "yes"] \
-			    || [string equal $library_expanded "no"]} \
-		$gdb_test_name
-	}
-	-re "^(\[^\r\n:\]*)\r\n" {
-	    exp_continue
-	}
-    }
-
-    return [expr $library_expanded == "no"]
+proc assert_shared_library_debug_not_fully_expanded {} {
+    gdb_test_lines "maint print objfiles $::libname" "" \
+	"Object file \[^\r\n\]*$::libname" \
+	-re-not "Symtabs:"
 }
 
 foreach_with_prefix type_name {"short" "int" "long" "char"} {
@@ -93,7 +58,7 @@ foreach_with_prefix type_name {"short" "int" "long" "char"} {
 	with_test_prefix "before sizeof expression" {
 	    # Check that the debug information for the shared library has
 	    # not yet been read in.
-	    gdb_assert { [shared_library_debug_not_fully_expanded] }
+	    assert_shared_library_debug_not_fully_expanded
 	}
 
 	# Evaluate a sizeof expression for a builtin type.  At one point GDB
@@ -106,7 +71,7 @@ foreach_with_prefix type_name {"short" "int" "long" "char"} {
 	with_test_prefix "after sizeof expression" {
 	    # Check that the debug information for the shared library has not
 	    # yet been read in.
-	    gdb_assert { [shared_library_debug_not_fully_expanded] }
+	    assert_shared_library_debug_not_fully_expanded
 	}
     }
 }
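
As a side note, the check in assert_shared_library_debug_not_fully_expanded
boils down to one pattern that must appear in the output and one that
must not.  The sketch below restates that logic in plain Tcl so it can
be tried outside the testsuite.  It does not call gdb_test_lines or any
other DejaGnu proc; the proc name looks_unexpanded and the sample
strings are hypothetical stand-ins assembled from the patterns used in
the patch, not real GDB output.

```tcl
# Hypothetical helper: returns 1 when OUTPUT names the object file for
# LIBNAME and contains no "Symtabs:" header, 0 otherwise.
proc looks_unexpanded {output libname} {
    # Positive check: some line mentions the object file.  Note that
    # glob characters in LIBNAME would be interpreted by string match.
    set has_objfile 0
    foreach line [split $output "\n"] {
        if {[string match "Object file *$libname*" $line]} {
            set has_objfile 1
        }
    }
    # Negative check: "Symtabs:" must not appear anywhere.
    set has_symtabs [expr {[string first "Symtabs:" $output] >= 0}]
    return [expr {$has_objfile && !$has_symtabs}]
}

# Made-up, abbreviated stand-ins for the command output.
set unexpanded "Object file /tmp/libdemo.so\nCooked index in use\n"
set expanded   "Object file /tmp/libdemo.so\nCooked index in use\nSymtabs:\n"

puts [looks_unexpanded $unexpanded libdemo.so]
puts [looks_unexpanded $expanded libdemo.so]
```

Requiring the positive match as well means that output which never
mentions the library is not mistaken for an unexpanded library.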


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-12-24 16:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
2022-12-08 15:38 [PATCH 0/2] New test for slow DWARF reader issue Andrew Burgess
2022-12-08 15:38 ` [PATCH 1/2] gdb/testsuite: fix readnow detection Andrew Burgess
2022-12-08 15:38 ` [PATCH 2/2] gdb/testsuite: new test for recent dwarf reader issue Andrew Burgess
2022-12-09 18:18   ` Tom Tromey
2022-12-09 19:24     ` Andrew Burgess
2022-12-14 14:47       ` Luis Machado
2022-12-15 11:22         ` Andrew Burgess
2022-12-19 13:20           ` Luis Machado
2022-12-19 13:52             ` Andrew Burgess
2022-12-20  8:43               ` tdevries
2022-12-20 10:32                 ` Andrew Burgess
2022-12-20 13:20                   ` Andrew Burgess
2022-12-20 14:04                     ` Luis Machado
2022-12-20 14:54                     ` tdevries
2022-12-24 16:05                       ` Andrew Burgess
2022-12-09 18:18 ` [PATCH 0/2] New test for slow DWARF " Tom Tromey
2022-12-14 10:25   ` Andrew Burgess
