public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
@ 2024-04-05 15:10 Tom de Vries
  2024-04-05 15:10 ` [PATCH v3 2/2] [gdb/testsuite] Add gdb.dwarf2/dw2-epilogue-begin-2.exp Tom de Vries
  2024-04-06  5:03 ` [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Bernd Edlinger
  0 siblings, 2 replies; 8+ messages in thread
From: Tom de Vries @ 2024-04-05 15:10 UTC (permalink / raw)
  To: gdb-patches

From: Bernd Edlinger <bernd.edlinger@hotmail.de>

An out of bounds array access in find_epilogue_using_linetable causes random
test failures like these:

FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $fn_fba
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: check frame-id matches
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: bt 2
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: up
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $sp_value == $::main_sp
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $::main_fba
FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: [string equal $fid $::main_fid]

Here the read happens below the first element of the line
table, and the test failure depends on the value that is
read from there.

It also happens that std::lower_bound returns a pointer exactly at the upper
bound of the line table, also here the read value is undefined, that happens
in this test:

FAIL: gdb.dwarf2/dw2-epilogue-begin.exp: confirm watchpoint doesn't trigger

Fixes: 528b729be1a2 ("gdb/dwarf2: Add support for DW_LNS_set_epilogue_begin in line-table")

Co-Authored-By: Tom de Vries <tdevries@suse.de>

PR symtab/31268
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31268
---
 gdb/symtab.c | 85 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 75 insertions(+), 10 deletions(-)

diff --git a/gdb/symtab.c b/gdb/symtab.c
index 86603dfebc3..0c126d99cd4 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
 	= unrelocated_addr (end_pc - objfile->text_section_offset ());
 
       const linetable *linetable = sal.symtab->linetable ();
-      /* This should find the last linetable entry of the current function.
-	 It is probably where the epilogue begins, but since the DWARF 5
-	 spec doesn't guarantee it, we iterate backwards through the function
-	 until we either find it or are sure that it doesn't exist.  */
+      if (linetable->nitems == 0)
+	{
+	  /* Empty line table.  */
+	  return {};
+	}
+
+      /* Find the first linetable entry after the current function.  Note that
+	 this also may be an end_sequence entry.  */
       auto it = std::lower_bound
 	(linetable->item, linetable->item + linetable->nitems, unrel_end,
 	 [] (const linetable_entry &lte, unrelocated_addr pc)
@@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
 	   return lte.unrelocated_pc () < pc;
 	 });
 
-      while (it->unrelocated_pc () >= unrel_start)
-      {
-	if (it->epilogue_begin)
-	  return {it->pc (objfile)};
-	it --;
-      }
+      if (it == linetable->item + linetable->nitems)
+	{
+	  /* We couldn't find either:
+	     - a linetable entry starting the function after the current
+	       function, or
+	     - an end_sequence entry that terminates the current function
+	       at unrel_end.
+	     This can happen when the linetable doesn't describe the full
+	     extent of the function.  Even though this is a corner case, which
+	     may not happen other than in dwarf assembly test-cases, let's
+	     handle this.
+
+	     Move to the last entry in the linetable, and check that it's an
+	     end_sequence terminating the current function.  */
+	  gdb_assert (it != &linetable->item[0]);
+	  it--;
+	  if (!(it->line == 0
+		&& unrel_start <= it->unrelocated_pc ()
+		&& it->unrelocated_pc () < unrel_end))
+	    return {};
+	}
+      else
+	gdb_assert (unrel_end <= it->unrelocated_pc ());
+
+      /* Move to the last linetable entry of the current function.  */
+      if (it == &linetable->item[0])
+	{
+	  /* Doing it-- would introduce undefined behaviour, avoid it by
+	     explicitly handling this case.  */
+	  return {};
+	}
+      it--;
+      if (it->unrelocated_pc () < unrel_start)
+	{
+	  /* Not in the current function.  */
+	  return {};
+	}
+      gdb_assert (it->unrelocated_pc () < unrel_end);
+
+      /* We're at the the last linetable entry of the current function.  This
+	 is probably where the epilogue begins, but since the DWARF 5 spec
+	 doesn't guarantee it, we iterate backwards through the current
+	 function until we either find the epilogue beginning, or are sure
+	 that it doesn't exist.  */
+      for (; it >= &linetable->item[0]; it--)
+	{
+	  if (it->unrelocated_pc () < unrel_start)
+	    {
+	      /* No longer in the current function.  */
+	      break;
+	    }
+
+	  if (it->epilogue_begin)
+	    {
+	      /* Found the beginning of the epilogue.  */
+	      return {it->pc (objfile)};
+	    }
+
+	  if (it == &linetable->item[0])
+	    {
+	      /* No more entries in the current function.
+		 Doing it-- would introduce undefined behaviour, avoid it by
+		 explicitly handling this case.  */
+	      break;
+	    }
+	}
     }
+
   return {};
 }
 

base-commit: c0419c024bf922128131671e40de0aed736e38ed
-- 
2.35.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 2/2] [gdb/testsuite] Add gdb.dwarf2/dw2-epilogue-begin-2.exp
  2024-04-05 15:10 [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Tom de Vries
@ 2024-04-05 15:10 ` Tom de Vries
  2024-04-06  5:03 ` [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Bernd Edlinger
  1 sibling, 0 replies; 8+ messages in thread
From: Tom de Vries @ 2024-04-05 15:10 UTC (permalink / raw)
  To: gdb-patches

Test-case gdb.dwarf2/dw2-epilogue-begin.exp has an end_sequence at 0x4004ff:
...
File name                Line number    Starting address    View    Stmt

dw2-epilogue-begin.c              20            0x4004b7               x
dw2-epilogue-begin.c              27            0x4004be               x
dw2-epilogue-begin.c              34            0x4004d0               x
dw2-epilogue-begin.c              37            0x4004de               x
dw2-epilogue-begin.c              38            0x4004ec               x
dw2-epilogue-begin.c              43            0x4004ef               x
dw2-epilogue-begin.c              47            0x4004fa               x
dw2-epilogue-begin.c              50            0x4004ff               x
dw2-epilogue-begin.c               -            0x4004ff
...
which is before the actual exclusive end of the main function at 0x40050d:
...
00000000004004ff <main_epilogue>:
  4004ff:       c6 05 1b 1b 00 00 0a    movb   $0xa,0x1b1b(%rip)
  400506:       b8 00 00 00 00          mov    $0x0,%eax
  40050b:       5d                      pop    %rbp
  40050c:       c3                      ret
...

This triggers the corner case in find_epilogue_using_linetable that
the call to std::lower_bound returns the "not found" case.

However, if we handle the corner case explicitly by returning something
incorrect:
...
+      if (it == linetable->item + linetable->nitems)
+	return {};
...
the test-case still passes.

Fix this by:
- rearranging the test-case to move the watch function to after the main
  function, and
- reworking the test-case into two variants:
  - end_sequence at the watch function end (the correct version), and
  - end_sequence before the watch function end (triggered the corner-case).

Tested on x86_64-linux.
---
 .../gdb.dwarf2/dw2-epilogue-begin-2.exp       |  20 ++
 gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.c |  28 +--
 .../gdb.dwarf2/dw2-epilogue-begin.exp         | 156 +-------------
 .../gdb.dwarf2/dw2-epilogue-begin.exp.tcl     | 191 ++++++++++++++++++
 4 files changed, 229 insertions(+), 166 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin-2.exp
 create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp.tcl

diff --git a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin-2.exp b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin-2.exp
new file mode 100644
index 00000000000..64cd85644e0
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin-2.exp
@@ -0,0 +1,20 @@
+# Copyright 2022-2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile dw2-epilogue-begin.c dw2-epilogue-begin.S
+
+set early_end_sequence 1
+
+source $srcdir/$subdir/dw2-epilogue-begin.exp.tcl
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.c b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.c
index 4ff445cf37d..2fcc4807904 100644
--- a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.c
+++ b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.c
@@ -22,6 +22,20 @@ trivial (void)
 
 char global;
 
+extern void watch (void);
+
+int
+main (void)
+{							/* main prologue */
+  asm ("main_label: .global main_label");
+  global = 0;
+  asm ("main_fun_call: .global main_fun_call");
+  watch ();						/* main function call */
+  asm ("main_epilogue: .global main_epilogue");
+  global = 10;
+  return 0;						/* main end */
+}
+
 void
 watch (void)
 {							/* watch start */
@@ -36,16 +50,6 @@ watch (void)
   asm ("mov $0x2, %rax");
   local = 2;						/* watch reassign */
   asm ("watch_end: .global watch_end");			/* watch end */
-}
-
-int
-main (void)
-{							/* main prologue */
-  asm ("main_label: .global main_label");
-  global = 0;
-  asm ("main_fun_call: .global main_fun_call");
-  watch ();						/* main function call */
-  asm ("main_epilogue: .global main_epilogue");
-  global = 10;
-  return 0;						/* main end */
+  local = 3;
+  asm ("watch_early_end_sequence: .global watch_early_end_sequence");
 }
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp
index f646e23da62..9552dd764dd 100644
--- a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp
+++ b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp
@@ -13,161 +13,9 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
-# Check that GDB can honor the epilogue_begin flag the compiler can place
-# in the line-table data.
-# We test 2 things: 1. that a software watchpoint triggered in an epilogue
-# is correctly ignored
-# 2. that GDB can mark the same line as both prologue and epilogue
-
-load_lib dwarf.exp
-
-# This test can only be run on targets which support DWARF-2 and use gas.
-require dwarf2_support
-# restricted to x86 to make it simpler to follow a variable
-require is_x86_64_m64_target
-
 standard_testfile .c .S
 
-set trivial_line [gdb_get_line_number "trivial function"]
-set main_prologue [gdb_get_line_number "main prologue"]
-set main_epilogue [gdb_get_line_number "main end"]
-set watch_start_line [gdb_get_line_number "watch start"]
-
-set asm_file [standard_output_file $srcfile2]
-
-# The producer will be set to clang because at the time of writing
-# we only care about epilogues if the producer is clang.  When the
-# producer is GCC, variables use CFA locations, so watchpoints can
-# continue working even on epilogues.
-Dwarf::assemble $asm_file {
-    global srcdir subdir srcfile srcfile2
-    global trivial_line main_prologue main_epilogue watch_start_line
-    declare_labels lines_label
-
-    get_func_info main
-    get_func_info trivial
-    get_func_info watch
-
-    cu {} {
-	compile_unit {
-	    {language @DW_LANG_C}
-	    {name dw2-prologue-end.c}
-	    {stmt_list ${lines_label} DW_FORM_sec_offset}
-	    {producer "clang version 17.0.1"}
-	} {
-	    declare_labels char_label
-
-	    char_label: base_type {
-		{name char}
-		{encoding @DW_ATE_signed}
-		{byte_size 1 DW_FORM_sdata}
-	    }
-
-	    subprogram {
-		{external 1 flag}
-		{name trivial}
-		{low_pc $trivial_start addr}
-		{high_pc "$trivial_start + $trivial_len" addr}
-	    }
-	    subprogram {
-		{external 1 flag}
-		{name watch}
-		{low_pc $watch_start addr}
-		{high_pc "$watch_start + $watch_len" addr}
-	    } {
-		DW_TAG_variable {
-		    {name local}
-		    {type :$char_label}
-		    {DW_AT_location {DW_OP_reg0} SPECIAL_expr}
-		}
-	    }
-	    subprogram {
-		{external 1 flag}
-		{name main}
-		{low_pc $main_start addr}
-		{high_pc "$main_start + $main_len" addr}
-	    }
-	}
-    }
-
-    lines {version 5} lines_label {
-	set diridx [include_dir "${srcdir}/${subdir}"]
-	file_name "$srcfile" $diridx
-
-	program {
-	    DW_LNS_set_file $diridx
-	    DW_LNE_set_address $trivial_start
-	    line $trivial_line
-	    DW_LNS_set_prologue_end
-	    DW_LNS_set_epilogue_begin
-	    DW_LNS_copy
-
-	    DW_LNE_set_address watch
-	    line $watch_start_line
-	    DW_LNS_copy
-
-	    DW_LNE_set_address watch_start
-	    line [gdb_get_line_number "watch assign"]
-	    DW_LNS_set_prologue_end
-	    DW_LNS_copy
-
-	    DW_LNE_set_address watch_reassign
-	    line [gdb_get_line_number "watch reassign"]
-	    DW_LNS_set_epilogue_begin
-	    DW_LNS_copy
-
-	    DW_LNE_set_address watch_end
-	    line [gdb_get_line_number "watch end"]
-	    DW_LNS_copy
-
-	    DW_LNE_set_address $main_start
-	    line $main_prologue
-	    DW_LNS_set_prologue_end
-	    DW_LNS_copy
-
-	    DW_LNE_set_address main_fun_call
-	    line [gdb_get_line_number "main function call"]
-	    DW_LNS_copy
-
-	    DW_LNE_set_address main_epilogue
-	    line $main_epilogue
-	    DW_LNS_set_epilogue_begin
-	    DW_LNS_copy
-
-	    DW_LNE_end_sequence
-	}
-    }
-}
-
-if { [prepare_for_testing "failed to prepare" ${testfile} \
-	  [list $srcfile $asm_file] {nodebug}] } {
-    return -1
-}
-
-if ![runto_main] {
-    return -1
-}
-
-# Moving to the scope with a local variable.
-gdb_breakpoint $watch_start_line
-gdb_continue_to_breakpoint "continuing to function" ".*"
-gdb_test "next" "local = 2.*" "stepping to epilogue"
-
-# Forcing software watchpoints because hardware ones don't care if we
-# are in the epilogue or not.
-gdb_test_no_output "set can-use-hw-watchpoints 0"
+set early_end_sequence 0
 
-# Test that the software watchpoint will not trigger in this case
-gdb_test "watch local" "\[W|w\]atchpoint .: local" "set watchpoint"
-gdb_test "continue" ".*\[W|w\]atchpoint . deleted.*" \
-    "confirm watchpoint doesn't trigger"
+source $srcdir/$subdir/dw2-epilogue-begin.exp.tcl
 
-# First we test that the trivial function has a line with both a prologue
-# and an epilogue. Do this by finding a line that has 3 Y columns
-set sep "\[ \t\]"
-set hex_number "0x\[0-9a-f\]+"
-gdb_test_multiple "maint info line-table" "test epilogue in linetable" -lbl {
-    -re "\[0-9\]$sep+$trivial_line$sep+$hex_number$sep+$hex_number$sep+Y$sep+Y$sep+Y" {
-	pass $gdb_test_name
-    }
-}
diff --git a/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp.tcl b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp.tcl
new file mode 100644
index 00000000000..18282e884e4
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/dw2-epilogue-begin.exp.tcl
@@ -0,0 +1,191 @@
+# Copyright 2022-2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check that GDB can honor the epilogue_begin flag the compiler can place
+# in the line-table data.
+# We test 2 things: 1. that a software watchpoint triggered in an epilogue
+# is correctly ignored
+# 2. that GDB can mark the same line as both prologue and epilogue
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+require dwarf2_support
+# restricted to x86 to make it simpler to follow a variable
+require is_x86_64_m64_target
+
+set trivial_line [gdb_get_line_number "trivial function"]
+set main_prologue [gdb_get_line_number "main prologue"]
+set main_epilogue [gdb_get_line_number "main end"]
+set watch_start_line [gdb_get_line_number "watch start"]
+
+set asm_file [standard_output_file $srcfile2]
+
+# The producer will be set to clang because at the time of writing
+# we only care about epilogues if the producer is clang.  When the
+# producer is GCC, variables use CFA locations, so watchpoints can
+# continue working even on epilogues.
+Dwarf::assemble $asm_file {
+    global srcdir subdir srcfile srcfile2
+    global trivial_line main_prologue main_epilogue watch_start_line
+    declare_labels lines_label
+
+    get_func_info main
+    get_func_info trivial
+    get_func_info watch
+
+    if { $::early_end_sequence == 1 } {
+	set watch_end_sequence watch_early_end_sequence
+    } else {
+	set watch_end_sequence $watch_end
+    }
+
+    cu {} {
+	compile_unit {
+	    {language @DW_LANG_C}
+	    {name dw2-prologue-end.c}
+	    {stmt_list ${lines_label} DW_FORM_sec_offset}
+	    {producer "clang version 17.0.1"}
+	} {
+	    declare_labels char_label
+
+	    char_label: base_type {
+		{name char}
+		{encoding @DW_ATE_signed}
+		{byte_size 1 DW_FORM_sdata}
+	    }
+
+	    subprogram {
+		{external 1 flag}
+		{name trivial}
+		{low_pc $trivial_start addr}
+		{high_pc "$trivial_start + $trivial_len" addr}
+	    }
+	    subprogram {
+		{external 1 flag}
+		{name watch}
+		{low_pc $watch_start addr}
+		{high_pc "$watch_start + $watch_len" addr}
+	    } {
+		DW_TAG_variable {
+		    {name local}
+		    {type :$char_label}
+		    {DW_AT_location {DW_OP_reg0} SPECIAL_expr}
+		}
+	    }
+	    subprogram {
+		{external 1 flag}
+		{name main}
+		{low_pc $main_start addr}
+		{high_pc "$main_start + $main_len" addr}
+	    }
+	}
+    }
+
+    lines {version 5} lines_label {
+	set diridx [include_dir "${srcdir}/${subdir}"]
+	file_name "$srcfile" $diridx
+
+	program {
+	    DW_LNS_set_file $diridx
+
+	    DW_LNE_set_address $trivial_start
+	    line $trivial_line
+	    DW_LNS_set_prologue_end
+	    DW_LNS_set_epilogue_begin
+	    DW_LNS_copy
+
+	    DW_LNE_set_address $trivial_end
+	    DW_LNE_end_sequence
+
+
+	    DW_LNS_set_file $diridx
+
+	    DW_LNE_set_address $main_start
+	    line $main_prologue
+	    DW_LNS_set_prologue_end
+	    DW_LNS_copy
+
+	    DW_LNE_set_address main_fun_call
+	    line [gdb_get_line_number "main function call"]
+	    DW_LNS_copy
+
+	    DW_LNE_set_address main_epilogue
+	    line $main_epilogue
+	    DW_LNS_set_epilogue_begin
+	    DW_LNS_copy
+
+	    DW_LNE_set_address $main_end
+	    DW_LNE_end_sequence
+
+
+	    DW_LNS_set_file $diridx
+
+	    DW_LNE_set_address $watch_start
+	    line $watch_start_line
+	    DW_LNS_copy
+
+	    DW_LNE_set_address watch_start
+	    line [gdb_get_line_number "watch assign"]
+	    DW_LNS_set_prologue_end
+	    DW_LNS_copy
+
+	    DW_LNE_set_address watch_reassign
+	    line [gdb_get_line_number "watch reassign"]
+	    DW_LNS_set_epilogue_begin
+	    DW_LNS_copy
+
+	    DW_LNE_set_address watch_end
+	    line [gdb_get_line_number "watch end"]
+	    DW_LNS_copy
+
+	    DW_LNE_set_address $watch_end_sequence
+	    DW_LNE_end_sequence
+	}
+    }
+}
+
+if { [prepare_for_testing "failed to prepare" ${testfile} \
+	  [list $srcfile $asm_file] {nodebug}] } {
+    return -1
+}
+
+if ![runto_main] {
+    return -1
+}
+
+# Moving to the scope with a local variable.
+gdb_breakpoint $watch_start_line
+gdb_continue_to_breakpoint "continuing to function" ".*"
+gdb_test "next" "local = 2.*" "stepping to epilogue"
+
+# Forcing software watchpoints because hardware ones don't care if we
+# are in the epilogue or not.
+gdb_test_no_output "set can-use-hw-watchpoints 0"
+
+# Test that the software watchpoint will not trigger in this case
+gdb_test "watch local" "\[W|w\]atchpoint .: local" "set watchpoint"
+gdb_test "continue" ".*\[W|w\]atchpoint . deleted.*" \
+    "confirm watchpoint doesn't trigger"
+
+# First we test that the trivial function has a line with both a prologue
+# and an epilogue. Do this by finding a line that has 3 Y columns
+set sep "\[ \t\]"
+set hex_number "0x\[0-9a-f\]+"
+gdb_test_multiple "maint info line-table" "test epilogue in linetable" -lbl {
+    -re "\[0-9\]$sep+$trivial_line$sep+$hex_number$sep+$hex_number$sep+Y$sep+Y$sep+Y" {
+	pass $gdb_test_name
+    }
+}
-- 
2.35.3


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-05 15:10 [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Tom de Vries
  2024-04-05 15:10 ` [PATCH v3 2/2] [gdb/testsuite] Add gdb.dwarf2/dw2-epilogue-begin-2.exp Tom de Vries
@ 2024-04-06  5:03 ` Bernd Edlinger
  2024-04-06  8:29   ` Tom de Vries
  1 sibling, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2024-04-06  5:03 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 4/5/24 17:10, Tom de Vries wrote:
> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
> 
> An out of bounds array access in find_epilogue_using_linetable causes random
> test failures like these:
> 
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $fn_fba
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: check frame-id matches
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: bt 2
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: up
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $sp_value == $::main_sp
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $::main_fba
> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: [string equal $fid $::main_fid]
> 
> Here the read happens below the first element of the line
> table, and the test failure depends on the value that is
> read from there.
> 
> It also happens that std::lower_bound returns a pointer exactly at the upper
> bound of the line table, also here the read value is undefined, that happens
> in this test:
> 
> FAIL: gdb.dwarf2/dw2-epilogue-begin.exp: confirm watchpoint doesn't trigger
> 
> Fixes: 528b729be1a2 ("gdb/dwarf2: Add support for DW_LNS_set_epilogue_begin in line-table")
> 
> Co-Authored-By: Tom de Vries <tdevries@suse.de>
> 
> PR symtab/31268
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31268
> ---
>  gdb/symtab.c | 85 +++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 75 insertions(+), 10 deletions(-)
> 
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 86603dfebc3..0c126d99cd4 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>  	= unrelocated_addr (end_pc - objfile->text_section_offset ());
>  
>        const linetable *linetable = sal.symtab->linetable ();
> -      /* This should find the last linetable entry of the current function.
> -	 It is probably where the epilogue begins, but since the DWARF 5
> -	 spec doesn't guarantee it, we iterate backwards through the function
> -	 until we either find it or are sure that it doesn't exist.  */
> +      if (linetable->nitems == 0)
> +	{
> +	  /* Empty line table.  */
> +	  return {};
> +	}
> +
> +      /* Find the first linetable entry after the current function.  Note that
> +	 this also may be an end_sequence entry.  */
>        auto it = std::lower_bound
>  	(linetable->item, linetable->item + linetable->nitems, unrel_end,
>  	 [] (const linetable_entry &lte, unrelocated_addr pc)
> @@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>  	   return lte.unrelocated_pc () < pc;
>  	 });
>  
> -      while (it->unrelocated_pc () >= unrel_start)
> -      {
> -	if (it->epilogue_begin)
> -	  return {it->pc (objfile)};
> -	it --;
> -      }
> +      if (it == linetable->item + linetable->nitems)
> +	{
> +	  /* We couldn't find either:
> +	     - a linetable entry starting the function after the current
> +	       function, or
> +	     - an end_sequence entry that terminates the current function
> +	       at unrel_end.
> +	     This can happen when the linetable doesn't describe the full
> +	     extent of the function.  Even though this is a corner case, which
> +	     may not happen other than in dwarf assembly test-cases, let's
> +	     handle this.
> +
> +	     Move to the last entry in the linetable, and check that it's an
> +	     end_sequence terminating the current function.  */
> +	  gdb_assert (it != &linetable->item[0]);
> +	  it--;
> +	  if (!(it->line == 0
> +		&& unrel_start <= it->unrelocated_pc ()
> +		&& it->unrelocated_pc () < unrel_end))
> +	    return {};

Why is this check necessary here, and not also when
this is not the last function in the line-table?

And why is this check necessary at all?

> +	}
> +      else
> +	gdb_assert (unrel_end <= it->unrelocated_pc ());

Why do you not check that 'it' points to an end_sequence
at exactly unrel_end?
It could be anything at an address much higher PC than unrel_end.


Bernd.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-06  5:03 ` [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Bernd Edlinger
@ 2024-04-06  8:29   ` Tom de Vries
  2024-04-07  8:17     ` Bernd Edlinger
  0 siblings, 1 reply; 8+ messages in thread
From: Tom de Vries @ 2024-04-06  8:29 UTC (permalink / raw)
  To: Bernd Edlinger, gdb-patches

On 4/6/24 07:03, Bernd Edlinger wrote:
> On 4/5/24 17:10, Tom de Vries wrote:
>> From: Bernd Edlinger <bernd.edlinger@hotmail.de>
>>
>> An out of bounds array access in find_epilogue_using_linetable causes random
>> test failures like these:
>>
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $fn_fba
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: check frame-id matches
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: bt 2
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: up
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $sp_value == $::main_sp
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: $fba_value == $::main_fba
>> FAIL: gdb.base/unwind-on-each-insn-amd64.exp: foo: instruction 6: [string equal $fid $::main_fid]
>>
>> Here the read happens below the first element of the line
>> table, and the test failure depends on the value that is
>> read from there.
>>
>> It also happens that std::lower_bound returns a pointer exactly at the upper
>> bound of the line table, also here the read value is undefined, that happens
>> in this test:
>>
>> FAIL: gdb.dwarf2/dw2-epilogue-begin.exp: confirm watchpoint doesn't trigger
>>
>> Fixes: 528b729be1a2 ("gdb/dwarf2: Add support for DW_LNS_set_epilogue_begin in line-table")
>>
>> Co-Authored-By: Tom de Vries <tdevries@suse.de>
>>
>> PR symtab/31268
>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31268
>> ---
>>   gdb/symtab.c | 85 +++++++++++++++++++++++++++++++++++++++++++++-------
>>   1 file changed, 75 insertions(+), 10 deletions(-)
>>
>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>> index 86603dfebc3..0c126d99cd4 100644
>> --- a/gdb/symtab.c
>> +++ b/gdb/symtab.c
>> @@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>   	= unrelocated_addr (end_pc - objfile->text_section_offset ());
>>   
>>         const linetable *linetable = sal.symtab->linetable ();
>> -      /* This should find the last linetable entry of the current function.
>> -	 It is probably where the epilogue begins, but since the DWARF 5
>> -	 spec doesn't guarantee it, we iterate backwards through the function
>> -	 until we either find it or are sure that it doesn't exist.  */
>> +      if (linetable->nitems == 0)
>> +	{
>> +	  /* Empty line table.  */
>> +	  return {};
>> +	}
>> +
>> +      /* Find the first linetable entry after the current function.  Note that
>> +	 this also may be an end_sequence entry.  */
>>         auto it = std::lower_bound
>>   	(linetable->item, linetable->item + linetable->nitems, unrel_end,
>>   	 [] (const linetable_entry &lte, unrelocated_addr pc)
>> @@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>   	   return lte.unrelocated_pc () < pc;
>>   	 });
>>   
>> -      while (it->unrelocated_pc () >= unrel_start)
>> -      {
>> -	if (it->epilogue_begin)
>> -	  return {it->pc (objfile)};
>> -	it --;
>> -      }
>> +      if (it == linetable->item + linetable->nitems)
>> +	{
>> +	  /* We couldn't find either:
>> +	     - a linetable entry starting the function after the current
>> +	       function, or
>> +	     - an end_sequence entry that terminates the current function
>> +	       at unrel_end.
>> +	     This can happen when the linetable doesn't describe the full
>> +	     extent of the function.  Even though this is a corner case, which
>> +	     may not happen other than in dwarf assembly test-cases, let's
>> +	     handle this.
>> +
>> +	     Move to the last entry in the linetable, and check that it's an
>> +	     end_sequence terminating the current function.  */
>> +	  gdb_assert (it != &linetable->item[0]);
>> +	  it--;
>> +	  if (!(it->line == 0
>> +		&& unrel_start <= it->unrelocated_pc ()
>> +		&& it->unrelocated_pc () < unrel_end))
>> +	    return {};
> 
> Why is this check necessary here, and not also when
> this is not the last function in the line-table?
> 
> And why is this check necessary at all?
> 

It spells out as much as possible the specific conditions of the 
corner-case we're handling.

We could also just simply handle the cornercase by returning {}, I went 
forth and back a bit on that, and decided to support it on the basis 
that at least currently we have dwarf assembly test-cases in the 
testsuite that trigger this path, though I've submitted a series to 
clean that up.

But I'm still on the fence about this, if you prefer a "return {}" I'm 
fine with that.

>> +	}
>> +      else
>> +	gdb_assert (unrel_end <= it->unrelocated_pc ());
> 
> Why do you not check that 'it' points to an end_sequence
> at exactly unrel_end?
> It could be anything at an address much higher PC than unrel_end.

This assert spells out the post-condition of the call to 
std::lower_bound, in case it found an entry.

If there's debug info where one line entry straddles two functions, the 
call returns the entry after it, which doesn't have address unrel_end.

Having said that, we can unsupport such a scenario by doing:
...
       else
         {
           if (unrel_end < it->unrelocated_pc ())
             return {};
           gdb_assert (unrel_end == it->unrelocated_pc ());
...

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-06  8:29   ` Tom de Vries
@ 2024-04-07  8:17     ` Bernd Edlinger
  2024-04-08 12:58       ` Tom de Vries
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2024-04-07  8:17 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 4/6/24 10:29, Tom de Vries wrote:
> On 4/6/24 07:03, Bernd Edlinger wrote:
>> On 4/5/24 17:10, Tom de Vries wrote:
>>>
>>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>>> index 86603dfebc3..0c126d99cd4 100644
>>> --- a/gdb/symtab.c
>>> +++ b/gdb/symtab.c
>>> @@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>       = unrelocated_addr (end_pc - objfile->text_section_offset ());
>>>           const linetable *linetable = sal.symtab->linetable ();
>>> -      /* This should find the last linetable entry of the current function.
>>> -     It is probably where the epilogue begins, but since the DWARF 5
>>> -     spec doesn't guarantee it, we iterate backwards through the function
>>> -     until we either find it or are sure that it doesn't exist.  */
>>> +      if (linetable->nitems == 0)
>>> +    {
>>> +      /* Empty line table.  */
>>> +      return {};
>>> +    }
>>> +

Hmm, this can be an assertion, because
the line table was found by find_pc_line (start_pc, 0);
so the linetable is guaranteed to be non-empty.
BTW: empty linetables are usually NULL pointers,
so that probably the assertion should
also assert like 
gdb_assert (linetable != NULL && linetable->nitems > 0);

>>> +      /* Find the first linetable entry after the current function.  Note that
>>> +     this also may be an end_sequence entry.  */
>>>         auto it = std::lower_bound
>>>       (linetable->item, linetable->item + linetable->nitems, unrel_end,
>>>        [] (const linetable_entry &lte, unrelocated_addr pc)
>>> @@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>          return lte.unrelocated_pc () < pc;
>>>        });
>>>   -      while (it->unrelocated_pc () >= unrel_start)
>>> -      {
>>> -    if (it->epilogue_begin)
>>> -      return {it->pc (objfile)};
>>> -    it --;
>>> -      }
>>> +      if (it == linetable->item + linetable->nitems)
>>> +    {
>>> +      /* We couldn't find either:
>>> +         - a linetable entry starting the function after the current
>>> +           function, or
>>> +         - an end_sequence entry that terminates the current function
>>> +           at unrel_end.
>>> +         This can happen when the linetable doesn't describe the full
>>> +         extent of the function.  Even though this is a corner case, which
>>> +         may not happen other than in dwarf assembly test-cases, let's
>>> +         handle this.
>>> +
>>> +         Move to the last entry in the linetable, and check that it's an
>>> +         end_sequence terminating the current function.  */
>>> +      gdb_assert (it != &linetable->item[0]);
>>> +      it--;
>>> +      if (!(it->line == 0
>>> +        && unrel_start <= it->unrelocated_pc ()
>>> +        && it->unrelocated_pc () < unrel_end))
>>> +        return {};
>>
>> Why is this check necessary here, and not also when
>> this is not the last function in the line-table?
>>
>> And why is this check necessary at all?
>>
> 
> It spells out as much as possible the specific conditions of the corner-case we're handling.
> 
> We could also just simply handle the cornercase by returning {}, I went forth and back a bit on that, and decided to support it on the basis that at least currently we have dwarf assembly test-cases in the testsuite that trigger this path, though I've submitted a series to clean that up.
> 
> But I'm still on the fence about this, if you prefer a "return {}" I'm fine with that.
> 
>>> +    }
>>> +      else
>>> +    gdb_assert (unrel_end <= it->unrelocated_pc ());
>>
>> Why do you not check that 'it' points to an end_sequence
>> at exactly unrel_end?
>> It could be anything at an address much higher PC than unrel_end.
> 
> This assert spells out the post-condition of the call to std::lower_bound, in case it found an entry.
> 
> If there's debug info where one line entry straddles two functions, the call returns the entry after it, which doesn't have address unrel_end.
> 
> Having said that, we can unsupport such a scenario by doing:
> ...
>       else
>         {
>           if (unrel_end < it->unrelocated_pc ())
>             return {};
>           gdb_assert (unrel_end == it->unrelocated_pc ());
> ...
> 

I think we should not look at the `it` in any case.
If there is an inconsistency in the debug info, a
debug message that can be enabled in maintainer mode
would be good enough.
But even in this case, I would prefer a best effort,
and continue whenever possible.

If you look at skip_prologue_using_linetable
you see that it does stop the search immediately, when
it->unrelocated_pc() reaches unrel_end, or when the
end of the linetable is reached:

      for (;
           (it < linetable->item + linetable->nitems
            && it->unrelocated_pc () < unrel_end);
           it++)
        if (it->prologue_end)
          return {it->pc (objfile)};

Therefore I would like find_epilogue_using_linetable
to use the same algorithm just in reverse direction.
Which means always do `it--` first before using `it`.

After all this is just a partial function range,
it can end with a jump or a return, and in both
cases the linetable entry at unrel_end can belong
to a completely different function, and it is not
guaranteed to be an end_sequence entry.

BTW: I am not sure what happens if the function
has multiple line tables, e.g. because of inline
functions, or #include to pull in parts of the function
body.  In that case I would expect that the line
table found by find_pc_line (start_pc, 0);
may be covering the prologue area, while the epilogue
may be missing.
Maybe find_pc_line (end_pc - 1, 0); would be better
candidate for a line table covering the epilogue area?


Thanks.
Bernd.

> Thanks,
> - Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-07  8:17     ` Bernd Edlinger
@ 2024-04-08 12:58       ` Tom de Vries
  2024-04-08 14:41         ` Bernd Edlinger
  0 siblings, 1 reply; 8+ messages in thread
From: Tom de Vries @ 2024-04-08 12:58 UTC (permalink / raw)
  To: Bernd Edlinger, gdb-patches

On 4/7/24 10:17, Bernd Edlinger wrote:
> On 4/6/24 10:29, Tom de Vries wrote:
>> On 4/6/24 07:03, Bernd Edlinger wrote:
>>> On 4/5/24 17:10, Tom de Vries wrote:
>>>>
>>>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>>>> index 86603dfebc3..0c126d99cd4 100644
>>>> --- a/gdb/symtab.c
>>>> +++ b/gdb/symtab.c
>>>> @@ -4166,10 +4166,14 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>>        = unrelocated_addr (end_pc - objfile->text_section_offset ());
>>>>            const linetable *linetable = sal.symtab->linetable ();
>>>> -      /* This should find the last linetable entry of the current function.
>>>> -     It is probably where the epilogue begins, but since the DWARF 5
>>>> -     spec doesn't guarantee it, we iterate backwards through the function
>>>> -     until we either find it or are sure that it doesn't exist.  */
>>>> +      if (linetable->nitems == 0)
>>>> +    {
>>>> +      /* Empty line table.  */
>>>> +      return {};
>>>> +    }
>>>> +
> 
> Hmm, this can be an assertion, because
> the line table was found by find_pc_line (start_pc, 0);
> so the linetable is guaranteed to be non-empty.
> BTW: empty linetables are usually NULL pointers,
> so that probably the assertion should
> also assert like
> gdb_assert (linetable != NULL && linetable->nitems > 0);
> 

I've updated the if condition to include the nullptr check, but didn't 
turn it into an assert.  By doing so we gain nothing, and add a 
potential user inconvenience.

>>>> +      /* Find the first linetable entry after the current function.  Note that
>>>> +     this also may be an end_sequence entry.  */
>>>>          auto it = std::lower_bound
>>>>        (linetable->item, linetable->item + linetable->nitems, unrel_end,
>>>>         [] (const linetable_entry &lte, unrelocated_addr pc)
>>>> @@ -4177,13 +4181,74 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>>>>           return lte.unrelocated_pc () < pc;
>>>>         });
>>>>    -      while (it->unrelocated_pc () >= unrel_start)
>>>> -      {
>>>> -    if (it->epilogue_begin)
>>>> -      return {it->pc (objfile)};
>>>> -    it --;
>>>> -      }
>>>> +      if (it == linetable->item + linetable->nitems)
>>>> +    {
>>>> +      /* We couldn't find either:
>>>> +         - a linetable entry starting the function after the current
>>>> +           function, or
>>>> +         - an end_sequence entry that terminates the current function
>>>> +           at unrel_end.
>>>> +         This can happen when the linetable doesn't describe the full
>>>> +         extent of the function.  Even though this is a corner case, which
>>>> +         may not happen other than in dwarf assembly test-cases, let's
>>>> +         handle this.
>>>> +
>>>> +         Move to the last entry in the linetable, and check that it's an
>>>> +         end_sequence terminating the current function.  */
>>>> +      gdb_assert (it != &linetable->item[0]);
>>>> +      it--;
>>>> +      if (!(it->line == 0
>>>> +        && unrel_start <= it->unrelocated_pc ()
>>>> +        && it->unrelocated_pc () < unrel_end))
>>>> +        return {};
>>>
>>> Why is this check necessary here, and not also when
>>> this is not the last function in the line-table?
>>>
>>> And why is this check necessary at all?
>>>
>>
>> It spells out as much as possible the specific conditions of the corner-case we're handling.
>>
>> We could also just simply handle the cornercase by returning {}, I went forth and back a bit on that, and decided to support it on the basis that at least currently we have dwarf assembly test-cases in the testsuite that trigger this path, though I've submitted a series to clean that up.
>>
>> But I'm still on the fence about this, if you prefer a "return {}" I'm fine with that.
>>

I've thought about this a bit more over the weekend, and I managed to 
convince myself that a simple "return {}" is the proper solution.  The 
rationale is that an incorrectly written dwarf assembly test-case is not 
a good reason to support a corner-case.  If we do stumble one day into a 
compiler that generates the incorrect debug info, we can always opt to 
add a workaround for this (which we then can test extensively).  But 
adding a workaround without such an incentive is pointless.

>>>> +    }
>>>> +      else
>>>> +    gdb_assert (unrel_end <= it->unrelocated_pc ());
>>>
>>> Why do you not check that 'it' points to an end_sequence
>>> at exactly unrel_end?
>>> It could be anything at an address much higher PC than unrel_end.
>>
>> This assert spells out the post-condition of the call to std::lower_bound, in case it found an entry.
>>
>> If there's debug info where one line entry straddles two functions, the call returns the entry after it, which doesn't have address unrel_end.
>>
>> Having said that, we can unsupport such a scenario by doing:
>> ...
>>        else
>>          {
>>            if (unrel_end < it->unrelocated_pc ())
>>              return {};
>>            gdb_assert (unrel_end == it->unrelocated_pc ());
>> ...
>>
> 
> I think we should not look at the `it` in any case.
> If there is an inconsistency in the debug info, a
> debug message that can be enabled in maintainer mode
> would be good enough.
> But even in this case, I would prefer a best effort,
> and continue whenever possible.
> 

IMO, a best effort only makes sense in case there's a compiler release 
generating the incorrect debug info.

> If you look at skip_prologue_using_linetable
> you see that it does stop the search immediately, when
> it->unrelocated_pc() reaches unrel_end, or when the
> end of the linetable is reached:
> 
>        for (;
>             (it < linetable->item + linetable->nitems
>              && it->unrelocated_pc () < unrel_end);
>             it++)
>          if (it->prologue_end)
>            return {it->pc (objfile)};
> 
> Therefore I would like find_epilogue_using_linetable
> to use the same algorithm just in reverse direction.
> Which means always do `it--` first before using `it`.
> 

I think each function should do what is appropriate, whatever that is.

> After all this is just a partial function range,
> it can end with a jump or a return, and in both
> cases the linetable entry at unrel_end can belong
> to a completely different function, and it is not
> guaranteed to be an end_sequence entry.
> 
> BTW: I am not sure what happens if the function
> has multiple line tables, e.g. because of inline
> functions, or #include to pull in parts of the function
> body.  In that case I would expect that the line
> table found by find_pc_line (start_pc, 0);
> may be covering the prologue area, while the epilogue
> may be missing.
> Maybe find_pc_line (end_pc - 1, 0); would be better
> candidate for a line table covering the epilogue area?
> 

The function was committed with this caveat documented:
...
    While the standard allows for multiple points marked with epilogue_begin
     in the same function, for performance reasons, the function that
     searches for the epilogue address will only find the last address that
     sets this flag for a given block.
...

So indeed there's work to be done to extend this to be more general, but 
that's beyond the scope of our current patch.

I've submitted a v4 here ( 
https://sourceware.org/pipermail/gdb-patches/2024-April/207925.html ).

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-08 12:58       ` Tom de Vries
@ 2024-04-08 14:41         ` Bernd Edlinger
  2024-04-09  9:37           ` Tom de Vries
  0 siblings, 1 reply; 8+ messages in thread
From: Bernd Edlinger @ 2024-04-08 14:41 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

Hi Tom,

we are making good progress.

On 4/8/24 14:58, Tom de Vries wrote:
> On 4/7/24 10:17, Bernd Edlinger wrote:
>>
>> Hmm, this can be an assertion, because
>> the line table was found by find_pc_line (start_pc, 0);
>> so the linetable is guaranteed to be non-empty.
>> BTW: empty linetables are usually NULL pointers,
>> so that probably the assertion should
>> also assert like
>> gdb_assert (linetable != NULL && linetable->nitems > 0);
>>
> 
> I've updated the if condition to include the nullptr check, but didn't turn it into an assert.  By doing so we gain nothing, and add a potential user inconvenience.
> 

ok.

>>>
>>> We could also just simply handle the cornercase by returning {}, I went forth and back a bit on that, and decided to support it on the basis that at least currently we have dwarf assembly test-cases in the testsuite that trigger this path, though I've submitted a series to clean that up.
>>>
>>> But I'm still on the fence about this, if you prefer a "return {}" I'm fine with that.
>>>
> 
> I've thought about this a bit more over the weekend, and I managed to convince myself that a simple "return {}" is the proper solution.  The rationale is that an incorrectly written dwarf assembly test-case is not a good reason to support a corner-case.  If we do stumble one day into a compiler that generates the incorrect debug info, we can always opt to add a workaround for this (which we then can test extensively).  But adding a workaround without such an incentive is pointless.
> 
>>>>> +    }
>>>>> +      else
>>>>> +    gdb_assert (unrel_end <= it->unrelocated_pc ());
>>>>
>>>> Why do you not check that 'it' points to an end_sequence
>>>> at exactly unrel_end?
>>>> It could be anything at an address much higher PC than unrel_end.
>>>
>>> This assert spells out the post-condition of the call to std::lower_bound, in case it found an entry.
>>>
>>> If there's debug info where one line entry straddles two functions, the call returns the entry after it, which doesn't have address unrel_end.
>>>
>>> Having said that, we can unsupport such a scenario by doing:
>>> ...
>>>        else
>>>          {
>>>            if (unrel_end < it->unrelocated_pc ())
>>>              return {};
>>>            gdb_assert (unrel_end == it->unrelocated_pc ());
>>> ...
>>>
>>
>> I think we should not look at the `it` in any case.
>> If there is an inconsistency in the debug info, a
>> debug message that can be enabled in maintainer mode
>> would be good enough.
>> But even in this case, I would prefer a best effort,
>> and continue whenever possible.
>>
> 
> IMO, a best effort only makes sense in case there's a compiler release generating the incorrect debug info.
> 

Consider this counter example:
The debug info is correct, but the internal representation
is not what you probably expect.

$ cat hello.c 
#include <stdio.h>
int main()
{
  printf("hello ");
  #include "world.inc"
/*** End of hello.c ***/

$ cat world.inc 
  printf("world\n");
  return 0;
}
/*** End of world.inc ***/

gcc -g -o hello hello.c

The line table starting at main ends between
the two printf statements.

I demonstrate the effect this little mod over
your v4 patch version:

diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
index 0bb7a24cbd0..35c1fb85409 100644
--- a/gdb/amd64-tdep.c
+++ b/gdb/amd64-tdep.c
@@ -2908,6 +2908,7 @@ amd64_epilogue_frame_sniffer_1 (const struct frame_unwind *self,
     /* We're not in the inner frame, so assume we're not in an epilogue.  */
     return 0;
 
+#if 0
   bool unwind_valid_p
     = compunit_epilogue_unwind_valid (find_pc_compunit_symtab (pc));
   if (override_p)
@@ -2924,6 +2925,7 @@ amd64_epilogue_frame_sniffer_1 (const struct frame_unwind *self,
           "amd64 epilogue".  */
        return 0;
     }
+#endif
 
   /* Check whether we're in an epilogue.  */
   return amd64_stack_frame_destroyed_p_1 (gdbarch, pc);
diff --git a/gdb/symtab.c b/gdb/symtab.c
index 427d7b9f8b2..bb3c1bba70b 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -4192,6 +4192,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
             extent of the function, which shouldn't happen with
             compiler-generated debug info.  Handle the corner case
             conservatively.  */
+         printf("at linetable end nitems=%d\n", linetable->nitems);
          return {};
        }
       else
@@ -4202,6 +4203,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
                 function.  This can happen if the previous entry straddles
                 two functions, which shouldn't happen with compiler-generated
                 debug info.  Handle the corner case conservatively.  */
+             printf("unrel_end = %lx < it->unrelocated_pc = %lx\n", (long)unrel_end, (long)it->unrelocated_pc());
              return {};
            }
          gdb_assert (unrel_end == it->unrelocated_pc ());

So this debug info which is not invalid at all,
triggers your error conditions:

$ ..../gdb-build/gdb/gdb hello
GNU gdb (GDB) 15.0.50.20240406-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from hello...
(gdb) b main
Breakpoint 1 at 0x114d: file hello.c, line 4.
(gdb) r
Starting program: /home/.../hello 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
at linetable end nitems=3

Breakpoint 1, main () at hello.c:4
4	  printf("hello ");
(gdb) n
at linetable end nitems=3
at linetable end nitems=3
1	  printf("world\n");
(gdb) n
hello world
at linetable end nitems=3
at linetable end nitems=3
2	  return 0;
(gdb) maint info line-table
objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
symtab: .../hello.c ((struct symtab *) 0x5641508bfc40)
linetable: ((struct linetable *) 0x564150920ce0):
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END EPILOGUE-BEGIN 
0      3      0x0000555555555149 0x0000000000001149 Y                                   
1      4      0x000055555555514d 0x000000000000114d Y                                   
2      END    0x0000555555555161 0x0000000000001161 Y                                   

objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
symtab: .../world.inc ((struct symtab *) 0x5641508bfc80)
linetable: ((struct linetable *) 0x564150920c80):
INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END EPILOGUE-BEGIN 
0      1      0x0000555555555161 0x0000000000001161 Y                                   
1      2      0x0000555555555170 0x0000000000001170 Y                                   
2      3      0x0000555555555175 0x0000000000001175 Y                                   
3      END    0x0000555555555177 0x0000000000001177 Y                                   

objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
symtab: /usr/include/stdio.h ((struct symtab *) 0x5641508bfcc0)
linetable: ((struct linetable *) 0x0):
No line table.


And this is also why I would suggest to look into using
the line-table that has addresses near the end of the range
like this:

--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -4156,7 +4156,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
   if (!find_pc_partial_function (func_addr, nullptr, &start_pc, &end_pc))
     return {};
 
-  const struct symtab_and_line sal = find_pc_line (start_pc, 0);
+  const struct symtab_and_line sal = find_pc_line (end_pc - 1, 0);
   if (sal.symtab != nullptr && sal.symtab->language () != language_asm)
     {
       struct objfile *objfile = sal.symtab->compunit ()->objfile ();



Thanks
Bernd.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable
  2024-04-08 14:41         ` Bernd Edlinger
@ 2024-04-09  9:37           ` Tom de Vries
  0 siblings, 0 replies; 8+ messages in thread
From: Tom de Vries @ 2024-04-09  9:37 UTC (permalink / raw)
  To: Bernd Edlinger, gdb-patches

On 4/8/24 16:41, Bernd Edlinger wrote:
> Hi Tom,
> 
> we are making good progress.
> 

Agreed :)

> On 4/8/24 14:58, Tom de Vries wrote:
>> On 4/7/24 10:17, Bernd Edlinger wrote:
>>>
>>> Hmm, this can be an assertion, because
>>> the line table was found by find_pc_line (start_pc, 0);
>>> so the linetable is guaranteed to be non-empty.
>>> BTW: empty linetables are usually NULL pointers,
>>> so that probably the assertion should
>>> also assert like
>>> gdb_assert (linetable != NULL && linetable->nitems > 0);
>>>
>>
>> I've updated the if condition to include the nullptr check, but didn't turn it into an assert.  By doing so we gain nothing, and add a potential user inconvenience.
>>
> 
> ok.
> 
>>>>
>>>> We could also just simply handle the cornercase by returning {}, I went forth and back a bit on that, and decided to support it on the basis that at least currently we have dwarf assembly test-cases in the testsuite that trigger this path, though I've submitted a series to clean that up.
>>>>
>>>> But I'm still on the fence about this, if you prefer a "return {}" I'm fine with that.
>>>>
>>
>> I've thought about this a bit more over the weekend, and I managed to convince myself that a simple "return {}" is the proper solution.  The rationale is that an incorrectly written dwarf assembly test-case is not a good reason to support a corner-case.  If we do stumble one day into a compiler that generates the incorrect debug info, we can always opt to add a workaround for this (which we then can test extensively).  But adding a workaround without such an incentive is pointless.
>>
>>>>>> +    }
>>>>>> +      else
>>>>>> +    gdb_assert (unrel_end <= it->unrelocated_pc ());
>>>>>
>>>>> Why do you not check that 'it' points to an end_sequence
>>>>> at exactly unrel_end?
>>>>> It could be anything at an address much higher PC than unrel_end.
>>>>
>>>> This assert spells out the post-condition of the call to std::lower_bound, in case it found an entry.
>>>>
>>>> If there's debug info where one line entry straddles two functions, the call returns the entry after it, which doesn't have address unrel_end.
>>>>
>>>> Having said that, we can unsupport such a scenario by doing:
>>>> ...
>>>>         else
>>>>           {
>>>>             if (unrel_end < it->unrelocated_pc ())
>>>>               return {};
>>>>             gdb_assert (unrel_end == it->unrelocated_pc ());
>>>> ...
>>>>
>>>
>>> I think we should not look at the `it` in any case.
>>> If there is an inconsistency in the debug info, a
>>> debug message that can be enabled in maintainer mode
>>> would be good enough.
>>> But even in this case, I would prefer a best effort,
>>> and continue whenever possible.
>>>
>>
>> IMO, a best effort only makes sense in case there's a compiler release generating the incorrect debug info.
>>
> 
> Consider this counter example:
> The debug info is correct, but the internal representation
> is not what you probably expect.
> 
> $ cat hello.c
> #include <stdio.h>
> int main()
> {
>    printf("hello ");
>    #include "world.inc"
> /*** End of hello.c ***/
> 
> $ cat world.inc
>    printf("world\n");
>    return 0;
> }
> /*** End of world.inc ***/
> 
> gcc -g -o hello hello.c
> 
> The line table starting at main ends between
> the two printf statements.
> 
> I demonstrate the effect this little mod over
> your v4 patch version:
> 
> diff --git a/gdb/amd64-tdep.c b/gdb/amd64-tdep.c
> index 0bb7a24cbd0..35c1fb85409 100644
> --- a/gdb/amd64-tdep.c
> +++ b/gdb/amd64-tdep.c
> @@ -2908,6 +2908,7 @@ amd64_epilogue_frame_sniffer_1 (const struct frame_unwind *self,
>       /* We're not in the inner frame, so assume we're not in an epilogue.  */
>       return 0;
>   
> +#if 0
>     bool unwind_valid_p
>       = compunit_epilogue_unwind_valid (find_pc_compunit_symtab (pc));
>     if (override_p)
> @@ -2924,6 +2925,7 @@ amd64_epilogue_frame_sniffer_1 (const struct frame_unwind *self,
>             "amd64 epilogue".  */
>          return 0;
>       }
> +#endif
>   
>     /* Check whether we're in an epilogue.  */
>     return amd64_stack_frame_destroyed_p_1 (gdbarch, pc);
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 427d7b9f8b2..bb3c1bba70b 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -4192,6 +4192,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>               extent of the function, which shouldn't happen with
>               compiler-generated debug info.  Handle the corner case
>               conservatively.  */
> +         printf("at linetable end nitems=%d\n", linetable->nitems);
>            return {};
>          }
>         else
> @@ -4202,6 +4203,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>                   function.  This can happen if the previous entry straddles
>                   two functions, which shouldn't happen with compiler-generated
>                   debug info.  Handle the corner case conservatively.  */
> +             printf("unrel_end = %lx < it->unrelocated_pc = %lx\n", (long)unrel_end, (long)it->unrelocated_pc());
>                return {};
>              }
>            gdb_assert (unrel_end == it->unrelocated_pc ());
> 

Nice.  I've filed this as a PR ( 
https://sourceware.org/bugzilla/show_bug.cgi?id=31622 ) since it's an 
orthogonal issue, and added your suggested fix with a test-case in a 
separate patch in v5.

> So this debug info which is not invalid at all,
> triggers your error conditions:
> 

IIUC, one of them.

Anyway, I've updated the comments to reflect this situation, and also 
added a comment that was part of the commit message of the initial 
commit adding find_epilogue_using_linetable.

V5 submitted here ( 
https://sourceware.org/pipermail/gdb-patches/2024-April/207959.html ).

Thanks,
- Tom

> $ ..../gdb-build/gdb/gdb hello
> GNU gdb (GDB) 15.0.50.20240406-git
> Copyright (C) 2024 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-pc-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>      <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from hello...
> (gdb) b main
> Breakpoint 1 at 0x114d: file hello.c, line 4.
> (gdb) r
> Starting program: /home/.../hello
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> at linetable end nitems=3
> 
> Breakpoint 1, main () at hello.c:4
> 4	  printf("hello ");
> (gdb) n
> at linetable end nitems=3
> at linetable end nitems=3
> 1	  printf("world\n");
> (gdb) n
> hello world
> at linetable end nitems=3
> at linetable end nitems=3
> 2	  return 0;
> (gdb) maint info line-table
> objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
> compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
> symtab: .../hello.c ((struct symtab *) 0x5641508bfc40)
> linetable: ((struct linetable *) 0x564150920ce0):
> INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END EPILOGUE-BEGIN
> 0      3      0x0000555555555149 0x0000000000001149 Y
> 1      4      0x000055555555514d 0x000000000000114d Y
> 2      END    0x0000555555555161 0x0000000000001161 Y
> 
> objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
> compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
> symtab: .../world.inc ((struct symtab *) 0x5641508bfc80)
> linetable: ((struct linetable *) 0x564150920c80):
> INDEX  LINE   REL-ADDRESS        UNREL-ADDRESS      IS-STMT PROLOGUE-END EPILOGUE-BEGIN
> 0      1      0x0000555555555161 0x0000000000001161 Y
> 1      2      0x0000555555555170 0x0000000000001170 Y
> 2      3      0x0000555555555175 0x0000000000001175 Y
> 3      END    0x0000555555555177 0x0000000000001177 Y
> 
> objfile: .../hello ((struct objfile *) 0x5641508bf6b0)
> compunit_symtab: hello.c ((struct compunit_symtab *) 0x5641508bfbc0)
> symtab: /usr/include/stdio.h ((struct symtab *) 0x5641508bfcc0)
> linetable: ((struct linetable *) 0x0):
> No line table.
> 
> 
> And this is also why I would suggest to look into using
> the line-table that has addresses near the end of the range
> like this:
> 
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -4156,7 +4156,7 @@ find_epilogue_using_linetable (CORE_ADDR func_addr)
>     if (!find_pc_partial_function (func_addr, nullptr, &start_pc, &end_pc))
>       return {};
>   
> -  const struct symtab_and_line sal = find_pc_line (start_pc, 0);
> +  const struct symtab_and_line sal = find_pc_line (end_pc - 1, 0);
>     if (sal.symtab != nullptr && sal.symtab->language () != language_asm)
>       {
>         struct objfile *objfile = sal.symtab->compunit ()->objfile ();
> 
> 
> 
> Thanks
> Bernd.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-04-09  9:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-05 15:10 [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Tom de Vries
2024-04-05 15:10 ` [PATCH v3 2/2] [gdb/testsuite] Add gdb.dwarf2/dw2-epilogue-begin-2.exp Tom de Vries
2024-04-06  5:03 ` [PATCH v3 1/2] [gdb/symtab] Fix an out of bounds array access in find_epilogue_using_linetable Bernd Edlinger
2024-04-06  8:29   ` Tom de Vries
2024-04-07  8:17     ` Bernd Edlinger
2024-04-08 12:58       ` Tom de Vries
2024-04-08 14:41         ` Bernd Edlinger
2024-04-09  9:37           ` Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).