public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb: avoid premature dummy frame garbage collection
@ 2021-05-19 12:29 Andrew Burgess
  2021-05-20 14:30 ` Simon Marchi
  2021-05-25 10:49 ` [PATCHv2] " Andrew Burgess
  0 siblings, 2 replies; 5+ messages in thread
From: Andrew Burgess @ 2021-05-19 12:29 UTC (permalink / raw)
  To: gdb-patches

From: Richard Bunt <richard.bunt@arm.com>

Under the following conditions, inferior calls would not complete.

  * GDB is performing a function call in the inferior, and
  * the inferior call executes set/long jumps, and
  * there is no usable unwind information at the long/set jump sites.

The above scenario causes GDB to prematurely remove the dummy frame
that was setup for the inferior call.  When GDB eventually returns to
the dummy frame the inferior attempts to execute the dummy frame which
can cause undefined behaviour.

This situation is already warned about in the comment on the function
check_longjmp_breakpoint_for_call_dummy where we say:

   You should call this function only at places where it is safe to currently
   unwind the whole stack.  Failed stack unwind would discard live dummy
   frames.

The warning here is fine, the problem is that, even though we call the
function from a location within GDB where we hope to be able to
unwind, sometime the state of the inferior means that the unwind will
not succeed.

This commit tries to improve the situation by adding the following
additional check; when GDB fails to find the dummy frame on the stack,
instead of just assuming that the dummy frame can be garbage
collected, first find the stop_reason for the last frame on the stack.
If this stop_reason indicates that the stack unwinding may have failed
then we assume that the dummy frame is still in use.  However, if the
last frame's stop_reason indicates that the stack unwind completed
successfully then we can be confident that the dummy frame is no
longer in use, and we garbage collect it.

Tested on x864-64 GNU/Linux.

gdb/ChangeLog:

	* breakpoint.c (check_longjmp_breakpoint_for_call_dummy): Add
	check for why the backtrace stopped.

gdb/testsuite/ChangeLog:

	* gdb.base/premature-dummy-frame-removal.c: New file.
	* gdb.base/premature-dummy-frame-removal.exp: New file.
	* gdb.base/premature-dummy-frame-removal.py: New file.

Change-Id: I8f330cfe0f3f33beb3a52a36994094c4abada07e
---
 gdb/ChangeLog                                 |  6 ++
 gdb/breakpoint.c                              | 38 +++++++++--
 gdb/testsuite/ChangeLog                       |  6 ++
 .../gdb.base/premature-dummy-frame-removal.c  | 65 +++++++++++++++++++
 .../premature-dummy-frame-removal.exp         | 50 ++++++++++++++
 .../gdb.base/premature-dummy-frame-removal.py | 58 +++++++++++++++++
 6 files changed, 219 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.py

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index d479f008948..d2dc2f98ce0 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -7452,9 +7452,10 @@ set_longjmp_breakpoint_for_call_dummy (void)
    TP.  Remove those which can no longer be found in the current frame
    stack.
 
-   You should call this function only at places where it is safe to currently
-   unwind the whole stack.  Failed stack unwind would discard live dummy
-   frames.  */
+   If the unwind fails then there is not sufficient information to discard
+   dummy frames.  In this case, elide the clean up and the dummy frames will
+   be cleaned up next time this function is called from a location where
+   unwinding is possible.  */
 
 void
 check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
@@ -7471,7 +7472,36 @@ check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
 	if (dummy_b->type != bp_call_dummy
 	    || frame_find_by_id (dummy_b->frame_id) != NULL)
 	  continue;
-	
+
+	/* We didn't find the dummy frame on the stack, this could be
+	   because we have longjmp'd to stack frame that is previous to the
+	   dummy frame, or it could be because the stack unwind is broken
+	   at some point between the longjmp frame and the dummy frame.
+
+	   Next we figure out why the stack unwind stopped.  If it looks
+	   like the unwind is complete then we assume the dummy frame has
+	   been jumped over, however, if the unwind stopped for an
+	   unexpected reason then we assume the stack unwind is currently
+	   broken, and that we will (eventually) return to the dummy
+	   frame.  */
+	bool unwind_finished_unexpectedly = false;
+	for (struct frame_info *fi = get_current_frame (); fi != nullptr; )
+	  {
+	    struct frame_info *prev = get_prev_frame (fi);
+	    if (prev == nullptr)
+	      {
+		/* FI is the last stack frame.  Why did this frame not
+		   unwind further?  */
+		auto stop_reason = get_frame_unwind_stop_reason (fi);
+		if (stop_reason != UNWIND_NO_REASON
+		    && stop_reason != UNWIND_OUTERMOST)
+		  unwind_finished_unexpectedly = true;
+	      }
+	    fi = prev;
+	  }
+	if (unwind_finished_unexpectedly)
+	  continue;
+
 	dummy_frame_discard (dummy_b->frame_id, tp);
 
 	while (b->related_breakpoint != b)
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
new file mode 100644
index 00000000000..868359c44b5
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
@@ -0,0 +1,65 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <setjmp.h>
+
+jmp_buf env;
+
+void
+worker ()
+{
+  longjmp (env, 1);
+}
+
+void
+test_inner ()
+{
+  if (setjmp (env) == 0)
+    {
+      /* Direct call.  */
+      worker ();
+
+      /* Will never get here.  */
+      abort ();
+    }
+  else
+    {
+      /* Called from longjmp.  */
+    }
+}
+
+void
+break_bt_here ()
+{
+  test_inner ();
+}
+
+int
+some_func ()
+{
+  break_bt_here ();
+  return 0;
+}
+
+int
+main ()
+{
+  some_func ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
new file mode 100644
index 00000000000..d0c089a5ccd
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
@@ -0,0 +1,50 @@
+# Copyright 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Make an inferior call to a function which uses longjmp.  However,
+# the backtrace for the function that is called is broken at the point
+# where the longjmp is handled.  This test is checking to see if the
+# inferior call still completes successfully.
+#
+# In "real life" the broken backtrace was caused by missing debug
+# information, and a failure of GDB's prologue based unwinder.  In
+# this test I force a broken backtrace using the Python unwinder API.
+#
+# The problem that was caused was that due to the broken backtrace GDB
+# failed to find the inferior call's dummy frame, GDB then concluded
+# that we had longjmp'd backward past the dummy frame and so garbage
+# collected the dummy frame, this causes the breakpoint within the
+# dummy frame to be deleted.
+#
+# When program flow naturally returned to the dummy frame we tried to
+# execute the instruction on the stack and received a SIGSEGV.
+
+standard_testfile .c
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile [list debug]] } {
+    return -1
+}
+
+if ![runto_main] then {
+    return 0
+}
+
+# Skip this test if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
+gdb_test_no_output "source ${pyfile}" "load python file again"
+
+gdb_test "p some_func ()" " = 0"
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
new file mode 100644
index 00000000000..b6be1e6efe0
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
@@ -0,0 +1,58 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This dummy unwinder will break GDB's backtrce at the function called
+# 'break_bt_here'.
+
+import gdb
+from gdb.unwinder import Unwinder
+
+class FrameId(object):
+    def __init__(self, sp, pc):
+        self._sp = sp
+        self._pc = pc
+
+    @property
+    def sp(self):
+        return self._sp
+
+    @property
+    def pc(self):
+        return self._pc
+
+class TestUnwinder(Unwinder):
+
+    def __init__(self):
+        Unwinder.__init__(self, "break unwinding")
+
+    def __call__(self, pending_frame):
+        pc_desc = pending_frame.architecture ().registers ().find ("pc")
+        pc = pending_frame.read_register (pc_desc)
+
+        sp_desc = pending_frame.architecture ().registers ().find ("sp")
+        sp = pending_frame.read_register (sp_desc)
+
+        block = gdb.block_for_pc (int (pc))
+        if block == None:
+            return None
+        func = block.function
+        if func == None:
+            return None
+        if str (func) != "break_bt_here":
+            return None
+        fid = FrameId (pc, sp)
+        return pending_frame.create_unwind_info (fid)
+
+gdb.unwinder.register_unwinder(None, TestUnwinder(), True)
-- 
2.25.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] gdb: avoid premature dummy frame garbage collection
  2021-05-19 12:29 [PATCH] gdb: avoid premature dummy frame garbage collection Andrew Burgess
@ 2021-05-20 14:30 ` Simon Marchi
  2021-05-21  8:42   ` Andrew Burgess
  2021-05-25 10:49 ` [PATCHv2] " Andrew Burgess
  1 sibling, 1 reply; 5+ messages in thread
From: Simon Marchi @ 2021-05-20 14:30 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches

On 2021-05-19 8:29 a.m., Andrew Burgess wrote:
> From: Richard Bunt <richard.bunt@arm.com>
> 
> Under the following conditions, inferior calls would not complete.
> 
>   * GDB is performing a function call in the inferior, and
>   * the inferior call executes set/long jumps, and
>   * there is no usable unwind information at the long/set jump sites.

I didn't really understand the sequence of events at first.  To clarify,
what happens is:

  - longjmp is called during an infcall
  - the magic longjmp breakpoint is hit
  - we are not able to unwind from the longjmp frame, because of missing
    info
  - we conclude, erroneously: "looks the infcall dummy frame doesn't
    exist anymore, it must mean the infcall has finished, let's clean
    things up"
  - we forget about that dummy frame and remove the magic breakpoint set
    on it to detect the end of the infcall
  - when the execution actually returns to the dummy frame, we execute
    garbage

Is that right?

And checking whether the dummy frame still exists when a longjmp is done
in an infcall would be important because it's possible for an infcall to
longjmp out of the infcall, if jumping to a location that was set before
the infcall.

> The above scenario causes GDB to prematurely remove the dummy frame
> that was setup for the inferior call.  When GDB eventually returns to
> the dummy frame the inferior attempts to execute the dummy frame which
> can cause undefined behaviour.
> 
> This situation is already warned about in the comment on the function
> check_longjmp_breakpoint_for_call_dummy where we say:
> 
>    You should call this function only at places where it is safe to currently
>    unwind the whole stack.  Failed stack unwind would discard live dummy
>    frames.
> 
> The warning here is fine, the problem is that, even though we call the
> function from a location within GDB where we hope to be able to
> unwind, sometime the state of the inferior means that the unwind will
> not succeed.
> 
> This commit tries to improve the situation by adding the following
> additional check; when GDB fails to find the dummy frame on the stack,
> instead of just assuming that the dummy frame can be garbage
> collected, first find the stop_reason for the last frame on the stack.
> If this stop_reason indicates that the stack unwinding may have failed
> then we assume that the dummy frame is still in use.  However, if the
> last frame's stop_reason indicates that the stack unwind completed
> successfully then we can be confident that the dummy frame is no
> longer in use, and we garbage collect it.

That sounds reasonable.

> Tested on x864-64 GNU/Linux.

x864 -> x86

> gdb/ChangeLog:
> 
> 	* breakpoint.c (check_longjmp_breakpoint_for_call_dummy): Add
> 	check for why the backtrace stopped.
> 
> gdb/testsuite/ChangeLog:
> 
> 	* gdb.base/premature-dummy-frame-removal.c: New file.
> 	* gdb.base/premature-dummy-frame-removal.exp: New file.
> 	* gdb.base/premature-dummy-frame-removal.py: New file.
> 
> Change-Id: I8f330cfe0f3f33beb3a52a36994094c4abada07e
> ---
>  gdb/ChangeLog                                 |  6 ++
>  gdb/breakpoint.c                              | 38 +++++++++--
>  gdb/testsuite/ChangeLog                       |  6 ++
>  .../gdb.base/premature-dummy-frame-removal.c  | 65 +++++++++++++++++++
>  .../premature-dummy-frame-removal.exp         | 50 ++++++++++++++
>  .../gdb.base/premature-dummy-frame-removal.py | 58 +++++++++++++++++
>  6 files changed, 219 insertions(+), 4 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
>  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
>  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> 
> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
> index d479f008948..d2dc2f98ce0 100644
> --- a/gdb/breakpoint.c
> +++ b/gdb/breakpoint.c
> @@ -7452,9 +7452,10 @@ set_longjmp_breakpoint_for_call_dummy (void)
>     TP.  Remove those which can no longer be found in the current frame
>     stack.
>  
> -   You should call this function only at places where it is safe to currently
> -   unwind the whole stack.  Failed stack unwind would discard live dummy
> -   frames.  */
> +   If the unwind fails then there is not sufficient information to discard
> +   dummy frames.  In this case, elide the clean up and the dummy frames will
> +   be cleaned up next time this function is called from a location where
> +   unwinding is possible.  */
>  
>  void
>  check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
> @@ -7471,7 +7472,36 @@ check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
>  	if (dummy_b->type != bp_call_dummy
>  	    || frame_find_by_id (dummy_b->frame_id) != NULL)
>  	  continue;

Could you add some one-liner comments to the while and if above,
explaining what they do?  I'm having a hard time decrypting what they
are trying to achieve, I think that having a small comment for each
would make the code much easier to understand.

> -	
> +
> +	/* We didn't find the dummy frame on the stack, this could be
> +	   because we have longjmp'd to stack frame that is previous to the

to "a" stack frame?

> +	   dummy frame, or it could be because the stack unwind is broken
> +	   at some point between the longjmp frame and the dummy frame.
> +
> +	   Next we figure out why the stack unwind stopped.  If it looks
> +	   like the unwind is complete then we assume the dummy frame has
> +	   been jumped over, however, if the unwind stopped for an
> +	   unexpected reason then we assume the stack unwind is currently
> +	   broken, and that we will (eventually) return to the dummy
> +	   frame.  */

Could we maybe also use frame_id_inner?  If, the unwind fails, but we see
that we have unwind past where the dummy frame would be, we can conclude
that it has been jumped over.  Just an idea, I haven't really thought
this through.

> +	bool unwind_finished_unexpectedly = false;
> +	for (struct frame_info *fi = get_current_frame (); fi != nullptr; )
> +	  {
> +	    struct frame_info *prev = get_prev_frame (fi);
> +	    if (prev == nullptr)
> +	      {
> +		/* FI is the last stack frame.  Why did this frame not
> +		   unwind further?  */
> +		auto stop_reason = get_frame_unwind_stop_reason (fi);
> +		if (stop_reason != UNWIND_NO_REASON
> +		    && stop_reason != UNWIND_OUTERMOST)
> +		  unwind_finished_unexpectedly = true;
> +	      }
> +	    fi = prev;
> +	  }
> +	if (unwind_finished_unexpectedly)
> +	  continue;
> +
>  	dummy_frame_discard (dummy_b->frame_id, tp);
>  
>  	while (b->related_breakpoint != b)
> diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
> new file mode 100644
> index 00000000000..868359c44b5
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
> @@ -0,0 +1,65 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2021 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <stdlib.h>
> +#include <setjmp.h>
> +
> +jmp_buf env;
> +
> +void
> +worker ()
> +{
> +  longjmp (env, 1);
> +}
> +
> +void
> +test_inner ()
> +{
> +  if (setjmp (env) == 0)
> +    {
> +      /* Direct call.  */
> +      worker ();
> +
> +      /* Will never get here.  */
> +      abort ();
> +    }
> +  else
> +    {
> +      /* Called from longjmp.  */
> +    }
> +}
> +
> +void
> +break_bt_here ()
> +{
> +  test_inner ();
> +}
> +
> +int
> +some_func ()
> +{
> +  break_bt_here ();
> +  return 0;
> +}
> +
> +int
> +main ()

Since this is C, use (void) everywhere in the prototypes.

> +{
> +  some_func ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
> new file mode 100644
> index 00000000000..d0c089a5ccd
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
> @@ -0,0 +1,50 @@
> +# Copyright 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Make an inferior call to a function which uses longjmp.  However,
> +# the backtrace for the function that is called is broken at the point
> +# where the longjmp is handled.  This test is checking to see if the
> +# inferior call still completes successfully.
> +#
> +# In "real life" the broken backtrace was caused by missing debug
> +# information, and a failure of GDB's prologue based unwinder.  In
> +# this test I force a broken backtrace using the Python unwinder API.

First person is discouraged.  You could say "This test forces a
broken...".

> +#
> +# The problem that was caused was that due to the broken backtrace GDB

"The problem that was caused" sounds weird.  Maybe "The problem was due
to..."?

> +# failed to find the inferior call's dummy frame, GDB then concluded
> +# that we had longjmp'd backward past the dummy frame and so garbage
> +# collected the dummy frame, this causes the breakpoint within the
> +# dummy frame to be deleted.
> +#
> +# When program flow naturally returned to the dummy frame we tried to
> +# execute the instruction on the stack and received a SIGSEGV.
> +
> +standard_testfile .c
> +
> +if { [prepare_for_testing "failed to prepare" $testfile $srcfile [list debug]] } {

I think you can omit [list debug], it's the default.

> +    return -1
> +}
> +
> +if ![runto_main] then {
> +    return 0
> +}
> +
> +# Skip this test if Python scripting is not enabled.
> +if { [skip_python_tests] } { continue }
> +
> +set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
> +gdb_test_no_output "source ${pyfile}" "load python file again"

again?

> +
> +gdb_test "p some_func ()" " = 0"
> diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> new file mode 100644
> index 00000000000..b6be1e6efe0
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> @@ -0,0 +1,58 @@
> +# Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This dummy unwinder will break GDB's backtrce at the function called
> +# 'break_bt_here'.
> +
> +import gdb
> +from gdb.unwinder import Unwinder
> +
> +class FrameId(object):
> +    def __init__(self, sp, pc):
> +        self._sp = sp
> +        self._pc = pc
> +
> +    @property
> +    def sp(self):
> +        return self._sp
> +
> +    @property
> +    def pc(self):
> +        return self._pc
> +
> +class TestUnwinder(Unwinder):
> +
> +    def __init__(self):
> +        Unwinder.__init__(self, "break unwinding")
> +
> +    def __call__(self, pending_frame):
> +        pc_desc = pending_frame.architecture ().registers ().find ("pc")
> +        pc = pending_frame.read_register (pc_desc)
> +
> +        sp_desc = pending_frame.architecture ().registers ().find ("sp")
> +        sp = pending_frame.read_register (sp_desc)
> +
> +        block = gdb.block_for_pc (int (pc))
> +        if block == None:
> +            return None
> +        func = block.function
> +        if func == None:
> +            return None
> +        if str (func) != "break_bt_here":
> +            return None
> +        fid = FrameId (pc, sp)
> +        return pending_frame.create_unwind_info (fid)
> +
> +gdb.unwinder.register_unwinder(None, TestUnwinder(), True)

"black ." :)

(this is so much easier than pointing out individual formatting issues,
I love it)

Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] gdb: avoid premature dummy frame garbage collection
  2021-05-20 14:30 ` Simon Marchi
@ 2021-05-21  8:42   ` Andrew Burgess
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Burgess @ 2021-05-21  8:42 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

* Simon Marchi <simon.marchi@polymtl.ca> [2021-05-20 10:30:05 -0400]:

> On 2021-05-19 8:29 a.m., Andrew Burgess wrote:
> > From: Richard Bunt <richard.bunt@arm.com>
> > 
> > Under the following conditions, inferior calls would not complete.
> > 
> >   * GDB is performing a function call in the inferior, and
> >   * the inferior call executes set/long jumps, and
> >   * there is no usable unwind information at the long/set jump sites.
> 
> I didn't really understand the sequence of events at first.  To clarify,
> what happens is:
> 
>   - longjmp is called during an infcall
>   - the magic longjmp breakpoint is hit
>   - we are not able to unwind from the longjmp frame, because of missing
>     info
>   - we conclude, erroneously: "looks the infcall dummy frame doesn't
>     exist anymore, it must mean the infcall has finished, let's clean
>     things up"
>   - we forget about that dummy frame and remove the magic breakpoint set
>     on it to detect the end of the infcall
>   - when the execution actually returns to the dummy frame, we execute
>     garbage
> 
> Is that right?

That is exactly right.

> 
> And checking whether the dummy frame still exists when a longjmp is done
> in an infcall would be important because it's possible for an infcall to
> longjmp out of the infcall, if jumping to a location that was set before
> the infcall.

Correct again.

Thanks for all your additional feedback, I'll roll another version of
this patch soon.

Andrew

> 
> > The above scenario causes GDB to prematurely remove the dummy frame
> > that was setup for the inferior call.  When GDB eventually returns to
> > the dummy frame the inferior attempts to execute the dummy frame which
> > can cause undefined behaviour.
> > 
> > This situation is already warned about in the comment on the function
> > check_longjmp_breakpoint_for_call_dummy where we say:
> > 
> >    You should call this function only at places where it is safe to currently
> >    unwind the whole stack.  Failed stack unwind would discard live dummy
> >    frames.
> > 
> > The warning here is fine, the problem is that, even though we call the
> > function from a location within GDB where we hope to be able to
> > unwind, sometime the state of the inferior means that the unwind will
> > not succeed.
> > 
> > This commit tries to improve the situation by adding the following
> > additional check; when GDB fails to find the dummy frame on the stack,
> > instead of just assuming that the dummy frame can be garbage
> > collected, first find the stop_reason for the last frame on the stack.
> > If this stop_reason indicates that the stack unwinding may have failed
> > then we assume that the dummy frame is still in use.  However, if the
> > last frame's stop_reason indicates that the stack unwind completed
> > successfully then we can be confident that the dummy frame is no
> > longer in use, and we garbage collect it.
> 
> That sounds reasonable.
> 
> > Tested on x864-64 GNU/Linux.
> 
> x864 -> x86
> 
> > gdb/ChangeLog:
> > 
> > 	* breakpoint.c (check_longjmp_breakpoint_for_call_dummy): Add
> > 	check for why the backtrace stopped.
> > 
> > gdb/testsuite/ChangeLog:
> > 
> > 	* gdb.base/premature-dummy-frame-removal.c: New file.
> > 	* gdb.base/premature-dummy-frame-removal.exp: New file.
> > 	* gdb.base/premature-dummy-frame-removal.py: New file.
> > 
> > Change-Id: I8f330cfe0f3f33beb3a52a36994094c4abada07e
> > ---
> >  gdb/ChangeLog                                 |  6 ++
> >  gdb/breakpoint.c                              | 38 +++++++++--
> >  gdb/testsuite/ChangeLog                       |  6 ++
> >  .../gdb.base/premature-dummy-frame-removal.c  | 65 +++++++++++++++++++
> >  .../premature-dummy-frame-removal.exp         | 50 ++++++++++++++
> >  .../gdb.base/premature-dummy-frame-removal.py | 58 +++++++++++++++++
> >  6 files changed, 219 insertions(+), 4 deletions(-)
> >  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
> >  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
> >  create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> > 
> > diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
> > index d479f008948..d2dc2f98ce0 100644
> > --- a/gdb/breakpoint.c
> > +++ b/gdb/breakpoint.c
> > @@ -7452,9 +7452,10 @@ set_longjmp_breakpoint_for_call_dummy (void)
> >     TP.  Remove those which can no longer be found in the current frame
> >     stack.
> >  
> > -   You should call this function only at places where it is safe to currently
> > -   unwind the whole stack.  Failed stack unwind would discard live dummy
> > -   frames.  */
> > +   If the unwind fails then there is not sufficient information to discard
> > +   dummy frames.  In this case, elide the clean up and the dummy frames will
> > +   be cleaned up next time this function is called from a location where
> > +   unwinding is possible.  */
> >  
> >  void
> >  check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
> > @@ -7471,7 +7472,36 @@ check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
> >  	if (dummy_b->type != bp_call_dummy
> >  	    || frame_find_by_id (dummy_b->frame_id) != NULL)
> >  	  continue;
> 
> Could you add some one-liner comments to the while and if above,
> explaining what they do?  I'm having a hard time decrypting what they
> are trying to achieve, I think that having a small comment for each
> would make the code much easier to understand.
> 
> > -	
> > +
> > +	/* We didn't find the dummy frame on the stack, this could be
> > +	   because we have longjmp'd to stack frame that is previous to the
> 
> to "a" stack frame?
> 
> > +	   dummy frame, or it could be because the stack unwind is broken
> > +	   at some point between the longjmp frame and the dummy frame.
> > +
> > +	   Next we figure out why the stack unwind stopped.  If it looks
> > +	   like the unwind is complete then we assume the dummy frame has
> > +	   been jumped over, however, if the unwind stopped for an
> > +	   unexpected reason then we assume the stack unwind is currently
> > +	   broken, and that we will (eventually) return to the dummy
> > +	   frame.  */
> 
> Could we maybe also use frame_id_inner?  If, the unwind fails, but we see
> that we have unwind past where the dummy frame would be, we can conclude
> that it has been jumped over.  Just an idea, I haven't really thought
> this through.
> 
> > +	bool unwind_finished_unexpectedly = false;
> > +	for (struct frame_info *fi = get_current_frame (); fi != nullptr; )
> > +	  {
> > +	    struct frame_info *prev = get_prev_frame (fi);
> > +	    if (prev == nullptr)
> > +	      {
> > +		/* FI is the last stack frame.  Why did this frame not
> > +		   unwind further?  */
> > +		auto stop_reason = get_frame_unwind_stop_reason (fi);
> > +		if (stop_reason != UNWIND_NO_REASON
> > +		    && stop_reason != UNWIND_OUTERMOST)
> > +		  unwind_finished_unexpectedly = true;
> > +	      }
> > +	    fi = prev;
> > +	  }
> > +	if (unwind_finished_unexpectedly)
> > +	  continue;
> > +
> >  	dummy_frame_discard (dummy_b->frame_id, tp);
> >  
> >  	while (b->related_breakpoint != b)
> > diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
> > new file mode 100644
> > index 00000000000..868359c44b5
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
> > @@ -0,0 +1,65 @@
> > +/* This testcase is part of GDB, the GNU debugger.
> > +
> > +   Copyright 2021 Free Software Foundation, Inc.
> > +
> > +   This program is free software; you can redistribute it and/or modify
> > +   it under the terms of the GNU General Public License as published by
> > +   the Free Software Foundation; either version 3 of the License, or
> > +   (at your option) any later version.
> > +
> > +   This program is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +   GNU General Public License for more details.
> > +
> > +   You should have received a copy of the GNU General Public License
> > +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> > +
> > +#include <stdlib.h>
> > +#include <setjmp.h>
> > +
> > +jmp_buf env;
> > +
> > +void
> > +worker ()
> > +{
> > +  longjmp (env, 1);
> > +}
> > +
> > +void
> > +test_inner ()
> > +{
> > +  if (setjmp (env) == 0)
> > +    {
> > +      /* Direct call.  */
> > +      worker ();
> > +
> > +      /* Will never get here.  */
> > +      abort ();
> > +    }
> > +  else
> > +    {
> > +      /* Called from longjmp.  */
> > +    }
> > +}
> > +
> > +void
> > +break_bt_here ()
> > +{
> > +  test_inner ();
> > +}
> > +
> > +int
> > +some_func ()
> > +{
> > +  break_bt_here ();
> > +  return 0;
> > +}
> > +
> > +int
> > +main ()
> 
> Since this is C, use (void) everywhere in the prototypes.
> 
> > +{
> > +  some_func ();
> > +
> > +  return 0;
> > +}
> > diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
> > new file mode 100644
> > index 00000000000..d0c089a5ccd
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
> > @@ -0,0 +1,50 @@
> > +# Copyright 2021 Free Software Foundation, Inc.
> > +
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 3 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +# Make an inferior call to a function which uses longjmp.  However,
> > +# the backtrace for the function that is called is broken at the point
> > +# where the longjmp is handled.  This test is checking to see if the
> > +# inferior call still completes successfully.
> > +#
> > +# In "real life" the broken backtrace was caused by missing debug
> > +# information, and a failure of GDB's prologue based unwinder.  In
> > +# this test I force a broken backtrace using the Python unwinder API.
> 
> First person is discouraged.  You could say "This test forces a
> broken...".
> 
> > +#
> > +# The problem that was caused was that due to the broken backtrace GDB
> 
> "The problem that was caused" sounds weird.  Maybe "The problem was due
> to..."?
> 
> > +# failed to find the inferior call's dummy frame, GDB then concluded
> > +# that we had longjmp'd backward past the dummy frame and so garbage
> > +# collected the dummy frame, this causes the breakpoint within the
> > +# dummy frame to be deleted.
> > +#
> > +# When program flow naturally returned to the dummy frame we tried to
> > +# execute the instruction on the stack and received a SIGSEGV.
> > +
> > +standard_testfile .c
> > +
> > +if { [prepare_for_testing "failed to prepare" $testfile $srcfile [list debug]] } {
> 
> I think you can omit [list debug], it's the default.
> 
> > +    return -1
> > +}
> > +
> > +if ![runto_main] then {
> > +    return 0
> > +}
> > +
> > +# Skip this test if Python scripting is not enabled.
> > +if { [skip_python_tests] } { continue }
> > +
> > +set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
> > +gdb_test_no_output "source ${pyfile}" "load python file again"
> 
> again?
> 
> > +
> > +gdb_test "p some_func ()" " = 0"
> > diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> > new file mode 100644
> > index 00000000000..b6be1e6efe0
> > --- /dev/null
> > +++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
> > @@ -0,0 +1,58 @@
> > +# Copyright (C) 2021 Free Software Foundation, Inc.
> > +
> > +# This program is free software; you can redistribute it and/or modify
> > +# it under the terms of the GNU General Public License as published by
> > +# the Free Software Foundation; either version 3 of the License, or
> > +# (at your option) any later version.
> > +#
> > +# This program is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +# This dummy unwinder will break GDB's backtrce at the function called
> > +# 'break_bt_here'.
> > +
> > +import gdb
> > +from gdb.unwinder import Unwinder
> > +
> > +class FrameId(object):
> > +    def __init__(self, sp, pc):
> > +        self._sp = sp
> > +        self._pc = pc
> > +
> > +    @property
> > +    def sp(self):
> > +        return self._sp
> > +
> > +    @property
> > +    def pc(self):
> > +        return self._pc
> > +
> > +class TestUnwinder(Unwinder):
> > +
> > +    def __init__(self):
> > +        Unwinder.__init__(self, "break unwinding")
> > +
> > +    def __call__(self, pending_frame):
> > +        pc_desc = pending_frame.architecture ().registers ().find ("pc")
> > +        pc = pending_frame.read_register (pc_desc)
> > +
> > +        sp_desc = pending_frame.architecture ().registers ().find ("sp")
> > +        sp = pending_frame.read_register (sp_desc)
> > +
> > +        block = gdb.block_for_pc (int (pc))
> > +        if block == None:
> > +            return None
> > +        func = block.function
> > +        if func == None:
> > +            return None
> > +        if str (func) != "break_bt_here":
> > +            return None
> > +        fid = FrameId (pc, sp)
> > +        return pending_frame.create_unwind_info (fid)
> > +
> > +gdb.unwinder.register_unwinder(None, TestUnwinder(), True)
> 
> "black ." :)
> 
> (this is so much easier than pointing out individual formatting issues,
> I love it)
> 
> Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCHv2] gdb: avoid premature dummy frame garbage collection
  2021-05-19 12:29 [PATCH] gdb: avoid premature dummy frame garbage collection Andrew Burgess
  2021-05-20 14:30 ` Simon Marchi
@ 2021-05-25 10:49 ` Andrew Burgess
  2021-05-29  2:03   ` Simon Marchi
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Burgess @ 2021-05-25 10:49 UTC (permalink / raw)
  To: gdb-patches

From: Richard Bunt <richard.bunt@arm.com>

Consider the following chain of events:

  * GDB is performing an inferior call, and

  * the inferior calls longjmp, and

  * GDB detects that the longjmp has completed, stops, and enters
    check_longjmp_breakpoint_for_call_dummy (in breakpoint.c), and

  * GDB tries to unwind the stack in order to check that the dummy
    frame (setup for the inferior call) is still on the stack, but

  * The unwind fails, possibly due to missing debug information, so

  * GDB incorrectly concludes that the inferior has longjmp'd past the
    dummy frame, and so deletes the dummy frame, including the dummy
    frame breakpoint, but then

  * The inferior continues, and eventually returns to the dummy frame,
    which is usually (always?) on the stack, the inferior starts
    trying to execute the random contents of the stack, this results
    in undefined behaviour.

This situation is already warned about in the comment on the function
check_longjmp_breakpoint_for_call_dummy where we say:

   You should call this function only at places where it is safe to currently
   unwind the whole stack.  Failed stack unwind would discard live dummy
   frames.

The warning here is fine, the problem is that, even though we call the
function from a location within GDB where we hope to be able to
unwind, sometime the state of the inferior means that the unwind will
not succeed.

This commit tries to improve the situation by adding the following
additional check; when GDB fails to find the dummy frame on the stack,
instead of just assuming that the dummy frame can be garbage
collected, first find the stop_reason for the last frame on the stack.
If this stop_reason indicates that the stack unwinding may have failed
then we assume that the dummy frame is still in use.  However, if the
last frame's stop_reason indicates that the stack unwind completed
successfully then we can be confident that the dummy frame is no
longer in use, and we garbage collect it.

Tested on x86-64 GNU/Linux.

gdb/ChangeLog:

	* breakpoint.c (check_longjmp_breakpoint_for_call_dummy): Add
	check for why the backtrace stopped.

gdb/testsuite/ChangeLog:

	* gdb.base/premature-dummy-frame-removal.c: New file.
	* gdb.base/premature-dummy-frame-removal.exp: New file.
	* gdb.base/premature-dummy-frame-removal.py: New file.

Change-Id: I8f330cfe0f3f33beb3a52a36994094c4abada07e
---
 gdb/ChangeLog                                 |  6 ++
 gdb/breakpoint.c                              | 52 +++++++++++++--
 gdb/testsuite/ChangeLog                       |  6 ++
 .../gdb.base/premature-dummy-frame-removal.c  | 65 +++++++++++++++++++
 .../premature-dummy-frame-removal.exp         | 53 +++++++++++++++
 .../gdb.base/premature-dummy-frame-removal.py | 60 +++++++++++++++++
 6 files changed, 238 insertions(+), 4 deletions(-)
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
 create mode 100644 gdb/testsuite/gdb.base/premature-dummy-frame-removal.py

diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
index d479f008948..a28539548cd 100644
--- a/gdb/breakpoint.c
+++ b/gdb/breakpoint.c
@@ -7452,9 +7452,10 @@ set_longjmp_breakpoint_for_call_dummy (void)
    TP.  Remove those which can no longer be found in the current frame
    stack.
 
-   You should call this function only at places where it is safe to currently
-   unwind the whole stack.  Failed stack unwind would discard live dummy
-   frames.  */
+   If the unwind fails then there is not sufficient information to discard
+   dummy frames.  In this case, elide the clean up and the dummy frames will
+   be cleaned up next time this function is called from a location where
+   unwinding is possible.  */
 
 void
 check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
@@ -7466,12 +7467,55 @@ check_longjmp_breakpoint_for_call_dummy (struct thread_info *tp)
       {
 	struct breakpoint *dummy_b = b->related_breakpoint;
 
+	/* Find the bp_call_dummy breakpoint in the list of breakpoints
+	   chained off b->related_breakpoint.  */
 	while (dummy_b != b && dummy_b->type != bp_call_dummy)
 	  dummy_b = dummy_b->related_breakpoint;
+
+	/* If there was no bp_call_dummy breakpoint then there's nothing
+	   more to do.  Or, if the dummy frame associated with the
+	   bp_call_dummy is still on the stack then we need to leave this
+	   bp_call_dummy in place.  */
 	if (dummy_b->type != bp_call_dummy
 	    || frame_find_by_id (dummy_b->frame_id) != NULL)
 	  continue;
-	
+
+	/* We didn't find the dummy frame on the stack, this could be
+	   because we have longjmp'd to a stack frame that is previous to
+	   the dummy frame, or it could be because the stack unwind is
+	   broken at some point between the longjmp frame and the dummy
+	   frame.
+
+	   Next we figure out why the stack unwind stopped.  If it looks
+	   like the unwind is complete then we assume the dummy frame has
+	   been jumped over, however, if the unwind stopped for an
+	   unexpected reason then we assume the stack unwind is currently
+	   broken, and that we will (eventually) return to the dummy
+	   frame.
+
+	   It might be tempting to consider using frame_id_inner here, but
+	   that is not safe.   There is no guarantee that the stack frames
+	   we are looking at here are even on the same stack as the
+	   original dummy frame, hence frame_id_inner can't be used.  See
+	   the comments on frame_id_inner for more details.  */
+	bool unwind_finished_unexpectedly = false;
+	for (struct frame_info *fi = get_current_frame (); fi != nullptr; )
+	  {
+	    struct frame_info *prev = get_prev_frame (fi);
+	    if (prev == nullptr)
+	      {
+		/* FI is the last stack frame.  Why did this frame not
+		   unwind further?  */
+		auto stop_reason = get_frame_unwind_stop_reason (fi);
+		if (stop_reason != UNWIND_NO_REASON
+		    && stop_reason != UNWIND_OUTERMOST)
+		  unwind_finished_unexpectedly = true;
+	      }
+	    fi = prev;
+	  }
+	if (unwind_finished_unexpectedly)
+	  continue;
+
 	dummy_frame_discard (dummy_b->frame_id, tp);
 
 	while (b->related_breakpoint != b)
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
new file mode 100644
index 00000000000..32deefc962f
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.c
@@ -0,0 +1,65 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <setjmp.h>
+
+jmp_buf env;
+
+void
+worker (void)
+{
+  longjmp (env, 1);
+}
+
+void
+test_inner (void)
+{
+  if (setjmp (env) == 0)
+    {
+      /* Direct call.  */
+      worker ();
+
+      /* Will never get here.  */
+      abort ();
+    }
+  else
+    {
+      /* Called from longjmp.  */
+    }
+}
+
+void
+break_bt_here (void)
+{
+  test_inner ();
+}
+
+int
+some_func (void)
+{
+  break_bt_here ();
+  return 0;
+}
+
+int
+main (void)
+{
+  some_func ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
new file mode 100644
index 00000000000..bf2a2a79756
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.exp
@@ -0,0 +1,53 @@
+# Copyright 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Make an inferior call to a function which uses longjmp.  However,
+# the backtrace for the function that is called is broken at the point
+# where the longjmp is handled.  This test is checking to see if the
+# inferior call still completes successfully.
+#
+# This test forces a broken backtrace using Python, but in real life a
+# broken backtrace can easily occur when calling through code for
+# which there is no debug information if the prologue unwinder fails,
+# which can often happen if the code has been optimized.
+#
+# The problem was that, due to the broken backtrace, GDB failed to
+# find the inferior call's dummy frame.  GDB then concluded that the
+# inferior had longjmp'd backward past the dummy frame and so garbage
+# collected the dummy frame, this causes the breakpoint within the
+# dummy frame to be deleted.
+#
+# When the inferior continued, and eventually returned to the dummy
+# frame, it would try to execute instruction from the dummy frame
+# (which for most, or even all, targets, is on the stack), and then
+# experience undefined behaviuor, often a SIGSEGV.
+
+standard_testfile .c
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile] } {
+    return -1
+}
+
+if ![runto_main] then {
+    return 0
+}
+
+# Skip this test if Python scripting is not enabled.
+if { [skip_python_tests] } { continue }
+
+set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
+gdb_test_no_output "source ${pyfile}" "load python file"
+
+gdb_test "p some_func ()" " = 0"
diff --git a/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
new file mode 100644
index 00000000000..31936658788
--- /dev/null
+++ b/gdb/testsuite/gdb.base/premature-dummy-frame-removal.py
@@ -0,0 +1,60 @@
+# Copyright (C) 2021 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This dummy unwinder will break GDB's backtrce at the function called
+# 'break_bt_here'.
+
+import gdb
+from gdb.unwinder import Unwinder
+
+
+class FrameId(object):
+    def __init__(self, sp, pc):
+        self._sp = sp
+        self._pc = pc
+
+    @property
+    def sp(self):
+        return self._sp
+
+    @property
+    def pc(self):
+        return self._pc
+
+
+class TestUnwinder(Unwinder):
+    def __init__(self):
+        Unwinder.__init__(self, "break unwinding")
+
+    def __call__(self, pending_frame):
+        pc_desc = pending_frame.architecture().registers().find("pc")
+        pc = pending_frame.read_register(pc_desc)
+
+        sp_desc = pending_frame.architecture().registers().find("sp")
+        sp = pending_frame.read_register(sp_desc)
+
+        block = gdb.block_for_pc(int(pc))
+        if block == None:
+            return None
+        func = block.function
+        if func == None:
+            return None
+        if str(func) != "break_bt_here":
+            return None
+        fid = FrameId(pc, sp)
+        return pending_frame.create_unwind_info(fid)
+
+
+gdb.unwinder.register_unwinder(None, TestUnwinder(), True)
-- 
2.25.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCHv2] gdb: avoid premature dummy frame garbage collection
  2021-05-25 10:49 ` [PATCHv2] " Andrew Burgess
@ 2021-05-29  2:03   ` Simon Marchi
  0 siblings, 0 replies; 5+ messages in thread
From: Simon Marchi @ 2021-05-29  2:03 UTC (permalink / raw)
  To: Andrew Burgess, gdb-patches



On 2021-05-25 6:49 a.m., Andrew Burgess wrote:
> From: Richard Bunt <richard.bunt@arm.com>
> 
> Consider the following chain of events:
> 
>   * GDB is performing an inferior call, and
> 
>   * the inferior calls longjmp, and
> 
>   * GDB detects that the longjmp has completed, stops, and enters
>     check_longjmp_breakpoint_for_call_dummy (in breakpoint.c), and
> 
>   * GDB tries to unwind the stack in order to check that the dummy
>     frame (setup for the inferior call) is still on the stack, but
> 
>   * The unwind fails, possibly due to missing debug information, so
> 
>   * GDB incorrectly concludes that the inferior has longjmp'd past the
>     dummy frame, and so deletes the dummy frame, including the dummy
>     frame breakpoint, but then
> 
>   * The inferior continues, and eventually returns to the dummy frame,
>     which is usually (always?) on the stack, the inferior starts
>     trying to execute the random contents of the stack, this results
>     in undefined behaviour.
> 
> This situation is already warned about in the comment on the function
> check_longjmp_breakpoint_for_call_dummy where we say:
> 
>    You should call this function only at places where it is safe to currently
>    unwind the whole stack.  Failed stack unwind would discard live dummy
>    frames.
> 
> The warning here is fine, the problem is that, even though we call the
> function from a location within GDB where we hope to be able to
> unwind, sometime the state of the inferior means that the unwind will
> not succeed.
> 
> This commit tries to improve the situation by adding the following
> additional check; when GDB fails to find the dummy frame on the stack,
> instead of just assuming that the dummy frame can be garbage
> collected, first find the stop_reason for the last frame on the stack.
> If this stop_reason indicates that the stack unwinding may have failed
> then we assume that the dummy frame is still in use.  However, if the
> last frame's stop_reason indicates that the stack unwind completed
> successfully then we can be confident that the dummy frame is no
> longer in use, and we garbage collect it.
> 
> Tested on x86-64 GNU/Linux.

LGTM, thanks!

Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-05-29  2:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-19 12:29 [PATCH] gdb: avoid premature dummy frame garbage collection Andrew Burgess
2021-05-20 14:30 ` Simon Marchi
2021-05-21  8:42   ` Andrew Burgess
2021-05-25 10:49 ` [PATCHv2] " Andrew Burgess
2021-05-29  2:03   ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).