[PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
@ 2014-10-22 14:02 Martin Galvan
  2014-10-22 15:10 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Martin Galvan @ 2014-10-22 14:02 UTC (permalink / raw)
  To: gdb-patches; +Cc: Martin Galvan

Added two new functions to the Python API: gdb.is_in_prologue and gdb.is_in_epilogue.
These expose the in_prologue and gdbarch_in_function_epilogue_p functions
respectively, which are useful for checking if the values of local variables are
valid when single-stepping through machine instructions.

Also added tests for gdb.is_in_prologue only. The reason of this is that
the tests work by checking the first instruction of a given function, as that's
conventionally part of the prologue. Testing gdb.is_in_epilogue seems to be
architecture-dependant (for instance, the last instruction of a function is
reported as part of an epilogue by gdbarch_in_function_epilogue_p for ARM,
but not always for x86_64), so I didn't include a test for it.

gdb/ChangeLog

2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>

    * NEWS: Mention new Python functions.
    * python/python.c:
    (gdbpy_is_in_prologue)
    (gdbpy_is_in_epilogue): New functions.

gdb/doc/ChangeLog

2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>

    * python.texi (Basic Python): Document new functions gdb.is_in_prologue and
    gdb.is_in_epilogue.

gdb/testsuite/ChangeLog

2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>

    * gdb.python/python.exp: Add tests for gdb.is_in_prologue.
---
 gdb/NEWS                            |  3 ++
 gdb/doc/python.texi                 | 45 +++++++++++++++++++++++++++
 gdb/python/python.c                 | 61 +++++++++++++++++++++++++++++++++++++
 gdb/testsuite/gdb.python/python.exp | 44 ++++++++++++++++++++++++++
 4 files changed, 153 insertions(+)

diff --git a/gdb/NEWS b/gdb/NEWS
index 606fd16..31959bb 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -13,6 +13,9 @@
      which is the gdb.Progspace object of the containing program space.
   ** A new event "gdb.clear_objfiles" has been added, triggered when
      selecting a new file to debug.
+  ** Two new functions: gdb.is_in_prologue and gdb.is_in_epilogue,
+     which are wrappers for in_prologue and gdbarch_in_function_epilogue_p
+     respectively.
 
 * New Python-based convenience functions:
 
diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
index f1fd841..d87913a 100644
--- a/gdb/doc/python.texi
+++ b/gdb/doc/python.texi
@@ -440,6 +440,51 @@ such as those used by readline for command input, and annotation
 related prompts are prohibited from being changed.
 @end defun
 
+@findex gdb.is_in_prologue
+@defun gdb.is_in_prologue (pc, @r{[}functionStart@r{]})
+Returns @code{True} if the given @var{pc} value *might* correspond to
+an instruction that executes before the stack frame of its containing
+function is completely set up, @code{False} if it definitely doesn't.
+
+When stepping by machine instructions it's possible that local variables
+appear to have wrong values at the beginning of a function.  This
+happens because it usually takes more than one instruction to set up
+a stack frame (including local variable definitions); such instructions
+are part of a function's prologue.  @value{GDBN} can identify the
+addresses where the local variables may show wrong values and inform
+you so.  @value{GDBN} will usually take a "conservative" approach when
+analyzing the prologue, assuming the result will be @code{True} unless
+it's completely sure it won't.  As such, sometimes a given @var{pc} may
+be reported as being before the stack frame is set up when it actually
+isn't; however if the result is @code{False} you can be sure
+@value{GDBN} is right.
+
+The optional @var{functionStart} argument is the start address of the
+function you want to check if @var{pc} belongs to.  If your binary
+doesn't have debugging info, @value{GDBN} may need to use this value
+to guess if @var{pc} belongs to the prologue.  If omitted it defaults
+to 0.
+
+In general you shouldn't worry about passing a @var{functionStart}
+argument unless your binary doesn't have debugging info, in which case
+ommiting @var{functionStart} may result in @code{True} being returned
+when the @var{pc} is not actually inside a prologue.
+@end defun
+
+@findex gdb.is_in_epilogue
+@defun gdb.is_in_epilogue (pc)
+Returns @code{True} if the given @var{pc} value corresponds to an
+instruction that executes after the stack of its containing function
+has been destroyed, @code{False} if it doesn't.
+
+When stepping by machine instructions it's possible that local variables
+appear to have wrong values at the end of a function.  This happens
+because it usually takes more than one instruction to tear down a stack
+frame; such instructions are part of a function's epilogue.  @value{GDBN}
+can identify the addresses where the local variables may show wrong
+values and inform you so.
+@end defun
+
 @node Exception Handling
 @subsubsection Exception Handling
 @cindex python exceptions
diff --git a/gdb/python/python.c b/gdb/python/python.c
index ca531e2..b1b4422 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -703,6 +703,55 @@ gdbpy_solib_name (PyObject *self, PyObject *args)
   return str_obj;
 }
 
+/* A Python wrapper for in_prologue. */
+
+static PyObject *
+gdbpy_is_in_prologue (PyObject *self, PyObject *args, PyObject *kw)
+{
+  /* If the user doesn't provide a function start address, assume 0 and hope
+     we have at least minimal symbols. If we don't, we might be returning
+     a false positive */
+  gdb_py_longest pc;
+  gdb_py_longest function_start = 0;
+  const PyObject *result;
+  char *keywords[] = {"pc", "functionStart", NULL};
+
+  if (PyArg_ParseTupleAndKeywords (args, kw,
+  				   GDB_PY_LLU_ARG "|" GDB_PY_LLU_ARG,
+  				   keywords, &pc, &function_start))
+    {
+      result = in_prologue (python_gdbarch, pc, function_start) ? Py_True : Py_False;
+      Py_INCREF (result);
+    }
+  else /* Couldn't parse the given args. */
+    {
+      result = NULL;
+    }
+
+  return result;
+}
+
+/* A Python wrapper for gdbarch_in_function_epilogue_p. */
+
+static PyObject *
+gdbpy_is_in_epilogue (PyObject *self, PyObject *args)
+{
+  gdb_py_longest pc;
+  const PyObject* result;
+
+  if (PyArg_ParseTuple (args, GDB_PY_LLU_ARG, &pc))
+    {
+      result = gdbarch_in_function_epilogue_p (python_gdbarch, pc) ? Py_True : Py_False;
+      Py_INCREF (result);
+    }
+  else /* Couldn't parse the given args. */
+    {
+      result = NULL;
+    }
+
+  return result;
+}
+
 /* A Python function which is a wrapper for decode_line_1.  */
 
 static PyObject *
@@ -2000,6 +2049,18 @@ Return the selected inferior object." },
   { "inferiors", gdbpy_inferiors, METH_NOARGS,
     "inferiors () -> (gdb.Inferior, ...).\n\
 Return a tuple containing all inferiors." },
+
+  { "is_in_prologue", gdbpy_is_in_prologue, METH_VARARGS | METH_KEYWORDS,
+    "is_in_prologue (pc, functionStart) -> Boolean.\n\
+Returns True if the given pc value *might* correspond to an instruction\n\
+that executes before the stack of its containing function is completely set up,\n\
+False if it definitely doesn't."},
+  { "is_in_epilogue", gdbpy_is_in_epilogue, METH_VARARGS | METH_KEYWORDS,
+    "is_in_epilogue (pc) -> Boolean.\n\
+Returns True if the given pc value corresponds to an instruction\n\
+that executes after the stack of its containing function has been destroyed,\n\
+False if it doesn't."},
+
   {NULL, NULL, 0, NULL}
 };
 
diff --git a/gdb/testsuite/gdb.python/python.exp b/gdb/testsuite/gdb.python/python.exp
index 3df9347..1611e0b 100644
--- a/gdb/testsuite/gdb.python/python.exp
+++ b/gdb/testsuite/gdb.python/python.exp
@@ -417,3 +417,47 @@ gdb_py_test_silent_cmd "step" "Step into func2" 1
 gdb_py_test_silent_cmd "up" "Step out of func2" 1
 
 gdb_test "python print (gdb.find_pc_line(gdb.selected_frame().pc()).line > line)" "True" "Test find_pc_line with resume address"
+
+# gdb.is_in_prologue and gdb.is_in_epilogue:
+
+# Start with a fresh gdb.
+clean_restart ${testfile}
+
+if ![runto_main] then {
+    fail "Can't run to main"
+    return 0
+}
+
+# Go somewhere in the function body.
+runto [gdb_get_line_number "Break at func2 call site."]
+
+# Get the current frame object.
+gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
+    "Get the current frame object." 0
+
+# Get an address inside the function body.
+gdb_py_test_silent_cmd "python functionBodyAddress = selectedFrame.pc()" \
+    "Get an address inside the function body." 0
+
+# Get the start address of the function.
+gdb_py_test_silent_cmd "python functionStartAddress = long(selectedFrame.function().value().address)" \
+    "Get the start address of the function." 0
+
+# Test the function's start address and an address somewhere inside the function body.
+
+# With functionStartAddress:
+gdb_test "python print(gdb.is_in_prologue(functionStartAddress, functionStartAddress))" "True" \
+    "Function start is in the prologue"
+
+gdb_test "python print(gdb.is_in_prologue(functionBodyAddress, functionStartAddress))" "False" \
+    "The function body isn't in the prologue"
+
+# Without functionStartAddress:
+gdb_test "python print(gdb.is_in_prologue(functionStartAddress))" "True" \
+    "Function start is in the prologue"
+
+gdb_test "python print(gdb.is_in_prologue(functionBodyAddress))" "False" \
+    "The function body isn't in the prologue (requires debug info to pass)"
+
+gdb_test "python print(gdb.is_in_epilogue(functionBodyAddress))" "False" \
+    "The function body isn't in the epilogue (requires debug info to pass)"

-- 
1.9.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 14:02 [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
@ 2014-10-22 15:10 ` Eli Zaretskii
  2014-10-22 15:14   ` Martin Galvan
  2014-10-22 17:33   ` Martin Galvan
  2014-10-22 19:23 ` Doug Evans
  2014-10-22 21:34 ` Pedro Alves
  2 siblings, 2 replies; 30+ messages in thread
From: Eli Zaretskii @ 2014-10-22 15:10 UTC (permalink / raw)
  To: Martin Galvan; +Cc: gdb-patches, martin.galvan

> From: Martin Galvan <martin.galvan@tallertechnologies.com>
> Cc: Martin Galvan <martin.galvan@tallertechnologies.com>
> Date: Wed, 22 Oct 2014 11:01:25 -0300
> 
> Added two new functions to the Python API: gdb.is_in_prologue and gdb.is_in_epilogue.

Thanks.

>     * NEWS: Mention new Python functions.

This part is OK.

> +@findex gdb.is_in_prologue
> +@defun gdb.is_in_prologue (pc, @r{[}functionStart@r{]})
> +Returns @code{True} if the given @var{pc} value *might* correspond to

Please use Texinfo markup for emphasis, not ASCII art.  I think you
want @emph{might} here.

> +The optional @var{functionStart} argument is the start address of the
> +function you want to check if @var{pc} belongs to.  If your binary
                              ^^
"whether"

> +doesn't have debugging info, @value{GDBN} may need to use this value
> +to guess if @var{pc} belongs to the prologue.  If omitted it defaults
            ^^
Likewise.

> +In general you shouldn't worry about passing a @var{functionStart}
> +argument unless your binary doesn't have debugging info, in which case
> +ommiting @var{functionStart} may result in @code{True} being returned
> +when the @var{pc} is not actually inside a prologue.

Isn't it better to require this argument in that case?  Zero is not
very useful starting address, in most cases.

> +When stepping by machine instructions it's possible that local variables
> +appear to have wrong values at the end of a function.  This happens
> +because it usually takes more than one instruction to tear down a stack
> +frame; such instructions are part of a function's epilogue.  @value{GDBN}
> +can identify the addresses where the local variables may show wrong
> +values and inform you so.

This repeats almost verbatim what has been already said about the
prologue.  It would be better to make a single description that covers
both the prologue and the epilogue, before you describe the 2 methods.

OK with those changes.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 15:10 ` Eli Zaretskii
@ 2014-10-22 15:14   ` Martin Galvan
  2014-10-22 17:33   ` Martin Galvan
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Galvan @ 2014-10-22 15:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Wed, Oct 22, 2014 at 12:10 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> OK with those changes.

Awesome, will send v2 right away.

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 15:10 ` Eli Zaretskii
  2014-10-22 15:14   ` Martin Galvan
@ 2014-10-22 17:33   ` Martin Galvan
  2014-10-22 17:47     ` Eli Zaretskii
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Galvan @ 2014-10-22 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Wed, Oct 22, 2014 at 12:10 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: Martin Galvan <martin.galvan@tallertechnologies.com>
>> Cc: Martin Galvan <martin.galvan@tallertechnologies.com>
>> +In general you shouldn't worry about passing a @var{functionStart}
>> +argument unless your binary doesn't have debugging info, in which case
>> +ommiting @var{functionStart} may result in @code{True} being returned
>> +when the @var{pc} is not actually inside a prologue.
>
> Isn't it better to require this argument in that case?  Zero is not
> very useful starting address, in most cases.

I looked at this again and I think it should stay as an optional
argument. GDB's in_prologue will only check for the functionStart
argument when it has no other way to determine whether the given PC
belongs to a prologue; otherwise it'll just ignore it. Because of
this, is_in_prologue will return True if we pass it a pc that belongs
to a prologue together with any valid address regardless of it being a
function start, let alone the function PC belongs to. Making it a
required argument to me sounds like we're asking is_in_prologue
whether PC is in the prologue of the function starting at
functionStart, instead of simply telling whether PC belongs to any
prologue (which was my original intent).

I thought of removing the functionStart argument altogether, however
if we're working without debugging info but we're certain of where the
function starts we can still get an accurate result instead of a false
positive. Let me know what you think of this so I can send v2 (I'll
also include a few minor corrections such as correctly casting
gdbpy_is_in_prologue to PyCFunction in the method table).

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 17:33   ` Martin Galvan
@ 2014-10-22 17:47     ` Eli Zaretskii
  2014-10-22 18:06       ` Martin Galvan
  2014-10-22 18:07       ` Eli Zaretskii
  0 siblings, 2 replies; 30+ messages in thread
From: Eli Zaretskii @ 2014-10-22 17:47 UTC (permalink / raw)
  To: Martin Galvan; +Cc: gdb-patches

> Date: Wed, 22 Oct 2014 14:33:15 -0300
> From: Martin Galvan <martin.galvan@tallertechnologies.com>
> Cc: gdb-patches@sourceware.org
> 
> On Wed, Oct 22, 2014 at 12:10 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> >> From: Martin Galvan <martin.galvan@tallertechnologies.com>
> >> Cc: Martin Galvan <martin.galvan@tallertechnologies.com>
> >> +In general you shouldn't worry about passing a @var{functionStart}
> >> +argument unless your binary doesn't have debugging info, in which case
> >> +ommiting @var{functionStart} may result in @code{True} being returned
> >> +when the @var{pc} is not actually inside a prologue.
> >
> > Isn't it better to require this argument in that case?  Zero is not
> > very useful starting address, in most cases.
> 
> I looked at this again and I think it should stay as an optional
> argument. GDB's in_prologue will only check for the functionStart
> argument when it has no other way to determine whether the given PC
> belongs to a prologue; otherwise it'll just ignore it. Because of
> this, is_in_prologue will return True if we pass it a pc that belongs
> to a prologue together with any valid address regardless of it being a
> function start, let alone the function PC belongs to. Making it a
> required argument to me sounds like we're asking is_in_prologue
> whether PC is in the prologue of the function starting at
> functionStart, instead of simply telling whether PC belongs to any
> prologue (which was my original intent).

I think we are miscommunicating.  What I had in mind is raise an
exception or display an error message when GDB has no other means to
determine where the function's start is (e.g., no debug info), and no
functionStart argument was passed.  That is what I meant by
"require".  IOW, it's up to the user to decide when to provide it, but
GDB will refuse to use some arbitrary number, such as zero, if it
cannot determine the starting address.

(Btw, using mixed-case argument names in thge manual is a bad idea,
because in the Info format @var causes its argument to be up-cased.)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 17:47     ` Eli Zaretskii
@ 2014-10-22 18:06       ` Martin Galvan
  2014-10-22 18:07       ` Eli Zaretskii
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Galvan @ 2014-10-22 18:06 UTC (permalink / raw)
  To: Eli Zaretskii, gdb-patches

On Wed, Oct 22, 2014 at 2:47 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> I think we are miscommunicating.  What I had in mind is raise an
> exception or display an error message when GDB has no other means to
> determine where the function's start is (e.g., no debug info), and no
> functionStart argument was passed.  That is what I meant by
> "require".  IOW, it's up to the user to decide when to provide it, but
> GDB will refuse to use some arbitrary number, such as zero, if it
> cannot determine the starting address.

Oh, I see. That's a great idea.

I think the simplest way to do this is not to call GDB's in_prologue
directly from gdbpy_is_in_prologue, but reproduce some of its behavior
and adding a check on the return value of find_pc_partial_function
which will throw an exception if functionStart wasn't provided. This
may result in a bit of duplicated code, but in_prologue isn't that
long a function if you take the comments out so I don't think that
would be a problem. What do you think?

> (Btw, using mixed-case argument names in thge manual is a bad idea,
> because in the Info format @var causes its argument to be up-cased.)

Indeed, will also change that.

Thanks a lot!

-- 


Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 17:47     ` Eli Zaretskii
  2014-10-22 18:06       ` Martin Galvan
@ 2014-10-22 18:07       ` Eli Zaretskii
  2014-10-22 18:32         ` Martin Galvan
  1 sibling, 1 reply; 30+ messages in thread
From: Eli Zaretskii @ 2014-10-22 18:07 UTC (permalink / raw)
  To: martin.galvan; +Cc: gdb-patches

> Date: Wed, 22 Oct 2014 20:47:08 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: gdb-patches@sourceware.org
> 
> I think we are miscommunicating.  What I had in mind is raise an
> exception or display an error message when GDB has no other means to
> determine where the function's start is (e.g., no debug info), and no
> functionStart argument was passed.  That is what I meant by
> "require".  IOW, it's up to the user to decide when to provide it, but
> GDB will refuse to use some arbitrary number, such as zero, if it
> cannot determine the starting address.

Having said all that, please don't read this as a rejection of your
code.  Even if you agree with me, please wait for others to chime in
and say what they think, before acting on my opinions.

Thanks.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 18:07       ` Eli Zaretskii
@ 2014-10-22 18:32         ` Martin Galvan
  2014-10-22 18:37           ` Eli Zaretskii
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Galvan @ 2014-10-22 18:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On Wed, Oct 22, 2014 at 3:07 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> Having said all that, please don't read this as a rejection of your
> code.  Even if you agree with me, please wait for others to chime in
> and say what they think, before acting on my opinions.
>
> Thanks.

Don't worry about that.

Looking at the documentation again, I think saying "functionStart [..]
is the start address of the function you want to check if pc belongs
to" is a bit misleading in this case. I think that paragraph should be
changed to something like:

"The optional @var{functionStart} argument is the start address of the
function @var{pc} belongs to.  If your binary doesn't have debugging info,
@value{GDBN} will need to use this value to guess whether @var{pc}
belongs to the
prologue, otherwise it'll ignore it. Notice that in the latter case you can pass
any valid address as @var{functionStart}; the result will only depend
on @var{pc}
being in a prologue, even if it's not the prologue of the function starting at
@var{functionStart}."

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 18:32         ` Martin Galvan
@ 2014-10-22 18:37           ` Eli Zaretskii
  0 siblings, 0 replies; 30+ messages in thread
From: Eli Zaretskii @ 2014-10-22 18:37 UTC (permalink / raw)
  To: Martin Galvan; +Cc: gdb-patches

> Date: Wed, 22 Oct 2014 15:32:38 -0300
> From: Martin Galvan <martin.galvan@tallertechnologies.com>
> Cc: gdb-patches@sourceware.org
> 
> "The optional @var{functionStart} argument is the start address of the
> function @var{pc} belongs to.  If your binary doesn't have debugging info,
> @value{GDBN} will need to use this value to guess whether @var{pc}
> belongs to the                           ^^^^^^^^
> prologue, otherwise it'll ignore it. Notice that in the latter case you can pass
> any valid address as @var{functionStart}; the result will only depend
> on @var{pc}
> being in a prologue, even if it's not the prologue of the function starting at
> @var{functionStart}."

I'd say "to determine" instead of "to guess"/

Otherwise, fine with me, thanks.  (But please remember to leave 2
spaces between sentences.)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 14:02 [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
  2014-10-22 15:10 ` Eli Zaretskii
@ 2014-10-22 19:23 ` Doug Evans
  2014-10-22 21:34 ` Pedro Alves
  2 siblings, 0 replies; 30+ messages in thread
From: Doug Evans @ 2014-10-22 19:23 UTC (permalink / raw)
  To: Martin Galvan; +Cc: gdb-patches

Martin Galvan writes:
 > Added two new functions to the Python API: gdb.is_in_prologue and gdb.is_in_epilogue.
 > These expose the in_prologue and gdbarch_in_function_epilogue_p functions
 > respectively, which are useful for checking if the values of local variables are
 > valid when single-stepping through machine instructions.
 > 
 > Also added tests for gdb.is_in_prologue only. The reason of this is that
 > the tests work by checking the first instruction of a given function, as that's
 > conventionally part of the prologue. Testing gdb.is_in_epilogue seems to be
 > architecture-dependant (for instance, the last instruction of a function is
 > reported as part of an epilogue by gdbarch_in_function_epilogue_p for ARM,
 > but not always for x86_64), so I didn't include a test for it.

Hi.

We should have a test for is_in_epilogue anyway.
The canonical choice is to just make it amd64-linux specific.

API functions like these are problematic.
Users don't expect API functions to be heuristic-based,
and that is all these can ever be in the general case.
The patch does try to provide the user some guarantees
("however if the result is @code{False} you can be sure
@value{GDBN} is right."),
but it's not clear to me this will be true in the general case.
I may have missed something so I'm certainly willing to be
persuaded otherwise.
I might be ok with this patch if the functions were named something like
"maybe_is_in_prologue" and "maybe_is_in_epilogue".
That way they scream to the user (and future code readers) "Heads Up!"

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 14:02 [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
  2014-10-22 15:10 ` Eli Zaretskii
  2014-10-22 19:23 ` Doug Evans
@ 2014-10-22 21:34 ` Pedro Alves
  2014-10-22 21:59   ` Pedro Alves
  2 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2014-10-22 21:34 UTC (permalink / raw)
  To: Martin Galvan, gdb-patches

On 10/22/2014 03:01 PM, Martin Galvan wrote:
> Added two new functions to the Python API: gdb.is_in_prologue and gdb.is_in_epilogue.
> These expose the in_prologue and gdbarch_in_function_epilogue_p functions
> respectively, which are useful for checking if the values of local variables are
> valid when single-stepping through machine instructions.

gdbarch_in_function_epilogue_p is misnamed, see:

 https://sourceware.org/ml/gdb-patches/2014-08/msg00592.html

From that thread:

> > # A target might have problems with watchpoints as soon as the stack
> > # frame of the current function has been destroyed.  This mostly happens
> > # as the first action in a funtion's epilogue.  in_function_epilogue_p()
> > # is defined to return a non-zero value if either the given addr is one
> >                                                                  ^^^^^^
> > # instruction after the stack destroying instruction up to the trailing
> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > # return instruction or if we can figure out that the stack frame has
> > # already been invalidated regardless of the value of addr.  Targets
> > # which don't suffer from that problem could just let this functionality
> > # untouched.
> > m:int:in_function_epilogue_p:CORE_ADDR addr:addr:0:generic_in_function_epilogue_p::0
> >
>
> Yes, this gdbarch hook's name is misleading.  How about renaming it to
> stack_frame_destroyed_p?

(I agreed to that name.)

target's whose gdbarch_in_function_epilogue_p is returning true for
instructions in the epilogue before the stack is destroyed, will
have software watchpoints broken in the tail call case Yao described,
like ARM had.

And so continuing with the in_function_epilogue_p name results in a
misnamed Python hook too; I'd much rather not carry that confusion to
the exported API.  That function could well return true if for some
reason we're stopped at code in the middle of the function, not
an in epilogue, that for some reason destroys the stack.

Or, consider the case of at some point we wanting to expose a
function that actually returns whether the PC is in the
epilogue, no matter whether the stack/frame has been destroyed
or not.

> 
> Also added tests for gdb.is_in_prologue only. The reason of this is that
> the tests work by checking the first instruction of a given function, as that's
> conventionally part of the prologue. Testing gdb.is_in_epilogue seems to be
> architecture-dependant (for instance, the last instruction of a function is
> reported as part of an epilogue by gdbarch_in_function_epilogue_p for ARM,
> but not always for x86_64), so I didn't include a test for it.

See above.  This confusion is a consequence of the bad function name.

How about a test that single-steps out of a function, and checks
in_function_epilogue at each single-step until the function is
finished?  We can use istarget "foo" to tweak whether to expect
whether in_function_epilogue returns true in one of the steps
or not.

> 
> gdb/ChangeLog
> 
> 2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>
> 
>     * NEWS: Mention new Python functions.
>     * python/python.c:
>     (gdbpy_is_in_prologue)
>     (gdbpy_is_in_epilogue): New functions.
> 
> gdb/doc/ChangeLog
> 
> 2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>
> 
>     * python.texi (Basic Python): Document new functions gdb.is_in_prologue and
>     gdb.is_in_epilogue.
> 
> gdb/testsuite/ChangeLog
> 
> 2014-10-22  Martin Galvan  <martin.galvan@tallertechnologies.com>
> 
>     * gdb.python/python.exp: Add tests for gdb.is_in_prologue.
> ---
>  gdb/NEWS                            |  3 ++
>  gdb/doc/python.texi                 | 45 +++++++++++++++++++++++++++
>  gdb/python/python.c                 | 61 +++++++++++++++++++++++++++++++++++++
>  gdb/testsuite/gdb.python/python.exp | 44 ++++++++++++++++++++++++++
>  4 files changed, 153 insertions(+)
> 
> diff --git a/gdb/NEWS b/gdb/NEWS
> index 606fd16..31959bb 100644
> --- a/gdb/NEWS
> +++ b/gdb/NEWS
> @@ -13,6 +13,9 @@
>       which is the gdb.Progspace object of the containing program space.
>    ** A new event "gdb.clear_objfiles" has been added, triggered when
>       selecting a new file to debug.
> +  ** Two new functions: gdb.is_in_prologue and gdb.is_in_epilogue,
> +     which are wrappers for in_prologue and gdbarch_in_function_epilogue_p
> +     respectively.

This is talking about wrapping some GDB internals.  I think it'd be
better to say something in user-speak.  The implementations of
these new Python functions could change, even.


>  
>  * New Python-based convenience functions:
>  
> diff --git a/gdb/doc/python.texi b/gdb/doc/python.texi
> index f1fd841..d87913a 100644
> --- a/gdb/doc/python.texi
> +++ b/gdb/doc/python.texi
> @@ -440,6 +440,51 @@ such as those used by readline for command input, and annotation
>  related prompts are prohibited from being changed.
>  @end defun
>  
> +@findex gdb.is_in_prologue
> +@defun gdb.is_in_prologue (pc, @r{[}functionStart@r{]})
> +Returns @code{True} if the given @var{pc} value *might* correspond to
> +an instruction that executes before the stack frame of its containing
> +function is completely set up, @code{False} if it definitely doesn't.
> +
> +When stepping by machine instructions it's possible that local variables
> +appear to have wrong values at the beginning of a function.  This
> +happens because it usually takes more than one instruction to set up
> +a stack frame (including local variable definitions); such instructions
> +are part of a function's prologue.  @value{GDBN} can identify the
> +addresses where the local variables may show wrong values and inform
> +you so.  @value{GDBN} will usually take a "conservative" approach when
> +analyzing the prologue, assuming the result will be @code{True} unless
> +it's completely sure it won't.  As such, sometimes a given @var{pc} may
> +be reported as being before the stack frame is set up when it actually
> +isn't; however if the result is @code{False} you can be sure
> +@value{GDBN} is right.
> +
> +The optional @var{functionStart} argument is the start address of the
> +function you want to check if @var{pc} belongs to.  If your binary
> +doesn't have debugging info, @value{GDBN} may need to use this value
> +to guess if @var{pc} belongs to the prologue.  If omitted it defaults
> +to 0.

Some targets have code at address 0.  Seems like we may be exposing a
bad interface for these targets here?

> +
> +In general you shouldn't worry about passing a @var{functionStart}
> +argument unless your binary doesn't have debugging info, in which case
> +ommiting @var{functionStart} may result in @code{True} being returned
> +when the @var{pc} is not actually inside a prologue.
> +@end defun
> +
> +@findex gdb.is_in_epilogue
> +@defun gdb.is_in_epilogue (pc)
> +Returns @code{True} if the given @var{pc} value corresponds to an
> +instruction that executes after the stack of its containing function
> +has been destroyed, @code{False} if it doesn't.
> +
> +When stepping by machine instructions it's possible that local variables
> +appear to have wrong values at the end of a function.  This happens
> +because it usually takes more than one instruction to tear down a stack
> +frame; such instructions are part of a function's epilogue.  @value{GDBN}
> +can identify the addresses where the local variables may show wrong
> +values and inform you so.
> +@end defun
> +
>  @node Exception Handling
>  @subsubsection Exception Handling
>  @cindex python exceptions
> diff --git a/gdb/python/python.c b/gdb/python/python.c
> index ca531e2..b1b4422 100644
> --- a/gdb/python/python.c
> +++ b/gdb/python/python.c
> @@ -703,6 +703,55 @@ gdbpy_solib_name (PyObject *self, PyObject *args)
>    return str_obj;
>  }
>  
> +/* A Python wrapper for in_prologue. */
> +
> +static PyObject *
> +gdbpy_is_in_prologue (PyObject *self, PyObject *args, PyObject *kw)
> +{
> +  /* If the user doesn't provide a function start address, assume 0 and hope
> +     we have at least minimal symbols. If we don't, we might be returning
> +     a false positive */
> +  gdb_py_longest pc;
> +  gdb_py_longest function_start = 0;
> +  const PyObject *result;
> +  char *keywords[] = {"pc", "functionStart", NULL};
> +
> +  if (PyArg_ParseTupleAndKeywords (args, kw,
> +  				   GDB_PY_LLU_ARG "|" GDB_PY_LLU_ARG,
> +  				   keywords, &pc, &function_start))
> +    {
> +      result = in_prologue (python_gdbarch, pc, function_start) ? Py_True : Py_False;
> +      Py_INCREF (result);
> +    }
> +  else /* Couldn't parse the given args. */
> +    {
> +      result = NULL;
> +    }
> +
> +  return result;
> +}
> +
> +/* A Python wrapper for gdbarch_in_function_epilogue_p. */
> +
> +static PyObject *
> +gdbpy_is_in_epilogue (PyObject *self, PyObject *args)
> +{
> +  gdb_py_longest pc;
> +  const PyObject* result;
> +
> +  if (PyArg_ParseTuple (args, GDB_PY_LLU_ARG, &pc))
> +    {
> +      result = gdbarch_in_function_epilogue_p (python_gdbarch, pc) ? Py_True : Py_False;
> +      Py_INCREF (result);
> +    }
> +  else /* Couldn't parse the given args. */
> +    {
> +      result = NULL;
> +    }
> +
> +  return result;
> +}
> +
>  /* A Python function which is a wrapper for decode_line_1.  */
>  
>  static PyObject *
> @@ -2000,6 +2049,18 @@ Return the selected inferior object." },
>    { "inferiors", gdbpy_inferiors, METH_NOARGS,
>      "inferiors () -> (gdb.Inferior, ...).\n\
>  Return a tuple containing all inferiors." },
> +
> +  { "is_in_prologue", gdbpy_is_in_prologue, METH_VARARGS | METH_KEYWORDS,
> +    "is_in_prologue (pc, functionStart) -> Boolean.\n\
> +Returns True if the given pc value *might* correspond to an instruction\n\
> +that executes before the stack of its containing function is completely set up,\n\
> +False if it definitely doesn't."},
> +  { "is_in_epilogue", gdbpy_is_in_epilogue, METH_VARARGS | METH_KEYWORDS,
> +    "is_in_epilogue (pc) -> Boolean.\n\
> +Returns True if the given pc value corresponds to an instruction\n\
> +that executes after the stack of its containing function has been destroyed,\n\
> +False if it doesn't."},
> +
>    {NULL, NULL, 0, NULL}
>  };
>  
> diff --git a/gdb/testsuite/gdb.python/python.exp b/gdb/testsuite/gdb.python/python.exp
> index 3df9347..1611e0b 100644
> --- a/gdb/testsuite/gdb.python/python.exp
> +++ b/gdb/testsuite/gdb.python/python.exp
> @@ -417,3 +417,47 @@ gdb_py_test_silent_cmd "step" "Step into func2" 1
>  gdb_py_test_silent_cmd "up" "Step out of func2" 1
>  
>  gdb_test "python print (gdb.find_pc_line(gdb.selected_frame().pc()).line > line)" "True" "Test find_pc_line with resume address"
> +
> +# gdb.is_in_prologue and gdb.is_in_epilogue:
> +
> +# Start with a fresh gdb.
> +clean_restart ${testfile}
> +
> +if ![runto_main] then {
> +    fail "Can't run to main"
> +    return 0
> +}

Any reason these new tests aren't in a separate file?

> +
> +# Go somewhere in the function body.
> +runto [gdb_get_line_number "Break at func2 call site."]
> +
> +# Get the current frame object.
> +gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
> +    "Get the current frame object." 0

Lowercase, and no "." please.  Here and throughout.  Usual style is to
drop the "the"s to be more concise:

gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
    "get current frame object" 0

> +
> +# Get an address inside the function body.
> +gdb_py_test_silent_cmd "python functionBodyAddress = selectedFrame.pc()" \
> +    "Get an address inside the function body." 0

gdb_py_test_silent_cmd "python functionBodyAddress = selectedFrame.pc()" \
    "get address inside function body" 0

etc.

> +
> +# Get the start address of the function.
> +gdb_py_test_silent_cmd "python functionStartAddress = long(selectedFrame.function().value().address)" \
> +    "Get the start address of the function." 0
> +
> +# Test the function's start address and an address somewhere inside the function body.
> +
> +# With functionStartAddress:
> +gdb_test "python print(gdb.is_in_prologue(functionStartAddress, functionStartAddress))" "True" \
> +    "Function start is in the prologue"
> +
> +gdb_test "python print(gdb.is_in_prologue(functionBodyAddress, functionStartAddress))" "False" \
> +    "The function body isn't in the prologue"
> +
> +# Without functionStartAddress:
> +gdb_test "python print(gdb.is_in_prologue(functionStartAddress))" "True" \
> +    "Function start is in the prologue"
> +
> +gdb_test "python print(gdb.is_in_prologue(functionBodyAddress))" "False" \
> +    "The function body isn't in the prologue (requires debug info to pass)"
> +
> +gdb_test "python print(gdb.is_in_epilogue(functionBodyAddress))" "False" \
> +    "The function body isn't in the epilogue (requires debug info to pass)"
> 

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 21:34 ` Pedro Alves
@ 2014-10-22 21:59   ` Pedro Alves
  2014-10-23 17:36     ` Martin Galvan
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2014-10-22 21:59 UTC (permalink / raw)
  To: Martin Galvan, gdb-patches

On 10/22/2014 10:34 PM, Pedro Alves wrote:
>> +# Get the current frame object.
>> > +gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>> > +    "Get the current frame object." 0
> Lowercase, and no "." please.  Here and throughout.  Usual style is to
> drop the "the"s to be more concise:
> 
> gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
                                                     ^^^^^^^^
>     "get current frame object" 0
           ^^^^^^^

BTW, I happened to read this back, and noticed: "current" and "selected"
have distinct meanings in the frame machinery.  Best not mix them up:

gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
    "get selected frame object" 0

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-22 21:59   ` Pedro Alves
@ 2014-10-23 17:36     ` Martin Galvan
  2014-10-23 17:57       ` Ulrich Weigand
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Martin Galvan @ 2014-10-23 17:36 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, dje, Eli Zaretskii

First of all, thanks a lot to all of you for the feedback.

On Wed, Oct 22, 2014 at 4:23 PM, Doug Evans <dje@google.com> wrote:
> We should have a test for is_in_epilogue anyway.
> The canonical choice is to just make it amd64-linux specific.

I briefly thought of that. The test I originally wrote for
is_in_epilogue checked for the last PC of a known, simple function
which ended with retq on amd64 and pop on thumb ARM. It worked fine on
ARM, however for some reason the retq was returning False. I looked at
amd64_in_function_epilogue_p and saw all it did was check if the
opcode was the one for ret. Not being too familiar with amd64 I
initially assumed the opcode for retq was different, but after doing
an objdump I noticed it's the same (0xc3). I was surprised to see
this, and debugged GDB itself only to find out that
gdbarch->in_function_epilogue_p was the generic one instead of amd64.
Is this a bug?

> API functions like these are problematic.
> Users don't expect API functions to be heuristic-based,
> and that is all these can ever be in the general case.
> The patch does try to provide the user some guarantees
> ("however if the result is @code{False} you can be sure
> @value{GDBN} is right."),
> but it's not clear to me this will be true in the general case.
> I may have missed something so I'm certainly willing to be
> persuaded otherwise.

Well, the comment at the top of in_prologue says "Returns 1 if PC
*might* be in prologue, 0 if definately (sic) *not* in prologue".
However, looking at the function itself it relies on the compiler
having marked the prologue as its own "line", and if it can't use that
info it falls back to using the architecture-dependant
gdbarch_skip_prologue. So far every time I've tested it it's worked
fine, and if it didn't it would've probably already been reported as a
bug of in_prologue or gdbarch_skip_prologue. I guess if you want to be
*really* careful I could change that line in the documentation for
something like "if the result is False, you can be almost sure GDB is
right", or "GDB is most likely right".

I think in this case it's more important to emphasize false positives
rather than false negatives.

> I might be ok with this patch if the functions were named something like
> "maybe_is_in_prologue" and "maybe_is_in_epilogue".
> That way they scream to the user (and future code readers) "Heads Up!"

Agreed, although from what Pedro said I think it would be even more
accurate if we renamed the epilogue function to something like
"stack_frame_destroyed".

On Wed, Oct 22, 2014 at 6:34 PM, Pedro Alves <palves@redhat.com> wrote:
> target's whose gdbarch_in_function_epilogue_p is returning true for
> instructions in the epilogue before the stack is destroyed, will
> have software watchpoints broken in the tail call case Yao described,
> like ARM had.
>
> And so continuing with the in_function_epilogue_p name results in a
> misnamed Python hook too; I'd much rather not carry that confusion to
> the exported API.  That function could well return true if for some
> reason we're stopped at code in the middle of the function, not
> an in epilogue, that for some reason destroys the stack.
>
> Or, consider the case of at some point we wanting to expose a
> function that actually returns whether the PC is in the
> epilogue, no matter whether the stack/frame has been destroyed
> or not.

Agreed.

> How about a test that single-steps out of a function, and checks
> in_function_epilogue at each single-step until the function is
> finished?  We can use istarget "foo" to tweak whether to expect
> whether in_function_epilogue returns true in one of the steps
> or not.

I actually wrote a test that checked is_in_epilogue with the last PC
in a very simple function. I used it to test on ARM and it worked
fine, however for amd64 the story was different (see my answer to
Doug), and that's why I didn't include it.

If we were to single-step and do architecture-dependant checks for
every single instruction, however, I think the test would be quite
complicated.

>> +  ** Two new functions: gdb.is_in_prologue and gdb.is_in_epilogue,
>> +     which are wrappers for in_prologue and gdbarch_in_function_epilogue_p
>> +     respectively.
>
> This is talking about wrapping some GDB internals.  I think it'd be
> better to say something in user-speak.  The implementations of
> these new Python functions could change, even.

Agreed. How about:

** Two new functions: gdb.maybe_in_prologue and gdb.stack_destroyed.
    The former returns True if a given PC may be inside a function's prologue
    and thus the local variables may show wrong values at that point,
    and the latter returns True if a given PC is at a point where GDB detected
    the stack frame as having been destroyed, usually by the instructions
    in a function's epilogue.

>> +The optional @var{functionStart} argument is the start address of the
>> +function you want to check if @var{pc} belongs to.  If your binary
>> +doesn't have debugging info, @value{GDBN} may need to use this value
>> +to guess if @var{pc} belongs to the prologue.  If omitted it defaults
>> +to 0.
>
> Some targets have code at address 0.  Seems like we may be exposing a
> bad interface for these targets here?

I used 0 because in_prologue expects it to be non-zero. If it's 0 and
we have no debugging info, it'll always return true:

      /* We don't even have minsym information, so fall back to using
         func_start, if given.  */
    if (! func_start)
        return 1;        /* We *might* be in a prologue.  */

Again, I did it because of the way in_prologue works, but as Eli said
this would probably be better handled with a Python exception or a
message of some kind.

> Any reason these new tests aren't in a separate file?

Not particularly. I thought they should go in python.exp since my
functions belong to the gdb module itself, but I could place them in a
separate file if you want. I'm thinking of naming the file
py-prologue-epilogue.exp or maybe py-frame-is-invalid.exp. Any better
naming suggestions are welcome!

>> +# Get the current frame object.
>> +gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>> +    "Get the current frame object." 0
>
> Lowercase, and no "." please.  Here and throughout.  Usual style is to
> drop the "the"s to be more concise:
>
> gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>     "get current frame object" 0
>
> BTW, I happened to read this back, and noticed: "current" and "selected"
> have distinct meanings in the frame machinery.  Best not mix them up:
>
> gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>     "get selected frame object" 0

Will do.

One more thing: I'd like to know if everyone's ok with this:

On Wed, Oct 22, 2014 at 3:06 PM, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> On Wed, Oct 22, 2014 at 2:47 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> I think we are miscommunicating.  What I had in mind is raise an
>> exception or display an error message when GDB has no other means to
>> determine where the function's start is (e.g., no debug info), and no
>> functionStart argument was passed.  That is what I meant by
>> "require".  IOW, it's up to the user to decide when to provide it, but
>> GDB will refuse to use some arbitrary number, such as zero, if it
>> cannot determine the starting address.
>
> Oh, I see. That's a great idea.
>
> I think the simplest way to do this is not to call GDB's in_prologue
> directly from gdbpy_is_in_prologue, but reproduce some of its behavior
> and adding a check on the return value of find_pc_partial_function
> which will throw an exception if functionStart wasn't provided. This
> may result in a bit of duplicated code, but in_prologue isn't that
> long a function if you take the comments out so I don't think that
> would be a problem. What do you think?

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 17:36     ` Martin Galvan
@ 2014-10-23 17:57       ` Ulrich Weigand
  2014-10-23 18:09         ` Martin Galvan
  2014-10-24 14:58         ` Pedro Alves
  2014-10-24  4:57       ` Doug Evans
  2014-10-24 14:57       ` Pedro Alves
  2 siblings, 2 replies; 30+ messages in thread
From: Ulrich Weigand @ 2014-10-23 17:57 UTC (permalink / raw)
  To: Martin Galvan; +Cc: Pedro Alves, gdb-patches, dje, Eli Zaretskii

Martin Galvan wrote:

> Well, the comment at the top of in_prologue says "Returns 1 if PC
> *might* be in prologue, 0 if definately (sic) *not* in prologue".
> However, looking at the function itself it relies on the compiler
> having marked the prologue as its own "line", and if it can't use that
> info it falls back to using the architecture-dependant
> gdbarch_skip_prologue. So far every time I've tested it it's worked
> fine, and if it didn't it would've probably already been reported as a
> bug of in_prologue or gdbarch_skip_prologue. I guess if you want to be
> *really* careful I could change that line in the documentation for
> something like "if the result is False, you can be almost sure GDB is
> right", or "GDB is most likely right".

The fundamental problem is that the notion of "prologue" and "epilogue"
simply no longer exists in optimized code generated by modern compilers;
and even more compiler features get implemented that make those notions
even less useful (e.g. shrink-wrapping).

As a result, we have been trying to the rid of using those notions as
much as possible; for example, when debugging optimized code with modern
DWARF information present, GDB will today no longer even use prologue
skipping at all.  Instead, the debug information is good enough that
the correct location of local variables can be recovered at every
instruction in the function, making the distinction no longer needed.

The in_prologue routine is likewise only still uses under certain rather
rare circumstances; in fact it might even today be possible to simply
remove it.  Once more platforms provide correct DWARF covering epilogues
as well, the gdbarch_in_function_epilogue_p calls in breakpoint.c may
likewise become unnecessary.

So if we hope at some point to get rid of those routines, then it seems
counterproductive to now export them as part of a fixed external API ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 17:57       ` Ulrich Weigand
@ 2014-10-23 18:09         ` Martin Galvan
  2014-10-23 18:14           ` Daniel Gutson
  2014-10-24 14:58         ` Pedro Alves
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Galvan @ 2014-10-23 18:09 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Pedro Alves, gdb-patches, dje, Eli Zaretskii, Daniel Gutson

On Thu, Oct 23, 2014 at 2:57 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> The fundamental problem is that the notion of "prologue" and "epilogue"
> simply no longer exists in optimized code generated by modern compilers;
> and even more compiler features get implemented that make those notions
> even less useful (e.g. shrink-wrapping).
>
> As a result, we have been trying to the rid of using those notions as
> much as possible; for example, when debugging optimized code with modern
> DWARF information present, GDB will today no longer even use prologue
> skipping at all.  Instead, the debug information is good enough that
> the correct location of local variables can be recovered at every
> instruction in the function, making the distinction no longer needed.
>
> The in_prologue routine is likewise only still uses under certain rather
> rare circumstances; in fact it might even today be possible to simply
> remove it.  Once more platforms provide correct DWARF covering epilogues
> as well, the gdbarch_in_function_epilogue_p calls in breakpoint.c may
> likewise become unnecessary.
>
> So if we hope at some point to get rid of those routines, then it seems
> counterproductive to now export them as part of a fixed external API ...

While that may be true, it's also true that at some points we still
see the local variables having wrong values when stepping through
machine code. The aim of this patch is to expose a way of detecting
such situations for scripts that may need it. Until we have a safer
way to do it I think this should be integrated to the code base.

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 18:09         ` Martin Galvan
@ 2014-10-23 18:14           ` Daniel Gutson
  2014-10-24  2:42             ` Doug Evans
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Gutson @ 2014-10-23 18:14 UTC (permalink / raw)
  To: Martin Galvan
  Cc: Ulrich Weigand, Pedro Alves, gdb-patches, Doug Evans, Eli Zaretskii

On Thu, Oct 23, 2014 at 3:09 PM, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> On Thu, Oct 23, 2014 at 2:57 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> The fundamental problem is that the notion of "prologue" and "epilogue"
>> simply no longer exists in optimized code generated by modern compilers;
>> and even more compiler features get implemented that make those notions
>> even less useful (e.g. shrink-wrapping).
>>
>> As a result, we have been trying to the rid of using those notions as
>> much as possible; for example, when debugging optimized code with modern
>> DWARF information present, GDB will today no longer even use prologue
>> skipping at all.  Instead, the debug information is good enough that
>> the correct location of local variables can be recovered at every
>> instruction in the function, making the distinction no longer needed.
>>
>> The in_prologue routine is likewise only still uses under certain rather
>> rare circumstances; in fact it might even today be possible to simply
>> remove it.  Once more platforms provide correct DWARF covering epilogues
>> as well, the gdbarch_in_function_epilogue_p calls in breakpoint.c may
>> likewise become unnecessary.
>>
>> So if we hope at some point to get rid of those routines, then it seems
>> counterproductive to now export them as part of a fixed external API ...
>
> While that may be true, it's also true that at some points we still
> see the local variables having wrong values when stepping through
> machine code. The aim of this patch is to expose a way of detecting
> such situations for scripts that may need it. Until we have a safer
> way to do it I think this should be integrated to the code base.

Hi all,
   (Hi Pedro!)

 we badly need this. If you think the patch is in a shape good enough
to be committed, please commit it for Martín since he doesn't have
write access.

We can then start a fresh new thread to discuss future directions
specially related to optimized code and exactly what/how DWARF
tags should be handled.

Thanks!

   Daniel.

>
> --
>
> Martín Galván
>
> Software Engineer
>
> Taller Technologies Argentina
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
> Phone: 54 351 4217888 / +54 351 4218211



-- 

Daniel F. Gutson
Chief Engineering Officer, SPD


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina


Phone: +54 351 4217888 / +54 351 4218211

Skype: dgutson

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 18:14           ` Daniel Gutson
@ 2014-10-24  2:42             ` Doug Evans
  0 siblings, 0 replies; 30+ messages in thread
From: Doug Evans @ 2014-10-24  2:42 UTC (permalink / raw)
  To: Daniel Gutson
  Cc: Martin Galvan, Ulrich Weigand, Pedro Alves, gdb-patches, Eli Zaretskii

On Thu, Oct 23, 2014 at 11:13 AM, Daniel Gutson
<daniel.gutson@tallertechnologies.com> wrote:
> On Thu, Oct 23, 2014 at 3:09 PM, Martin Galvan
> <martin.galvan@tallertechnologies.com> wrote:
>> On Thu, Oct 23, 2014 at 2:57 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>>> The fundamental problem is that the notion of "prologue" and "epilogue"
>>> simply no longer exists in optimized code generated by modern compilers;
>>> and even more compiler features get implemented that make those notions
>>> even less useful (e.g. shrink-wrapping).
>>>
>>> As a result, we have been trying to the rid of using those notions as
>>> much as possible; for example, when debugging optimized code with modern
>>> DWARF information present, GDB will today no longer even use prologue
>>> skipping at all.  Instead, the debug information is good enough that
>>> the correct location of local variables can be recovered at every
>>> instruction in the function, making the distinction no longer needed.
>>>
>>> The in_prologue routine is likewise only still uses under certain rather
>>> rare circumstances; in fact it might even today be possible to simply
>>> remove it.  Once more platforms provide correct DWARF covering epilogues
>>> as well, the gdbarch_in_function_epilogue_p calls in breakpoint.c may
>>> likewise become unnecessary.
>>>
>>> So if we hope at some point to get rid of those routines, then it seems
>>> counterproductive to now export them as part of a fixed external API ...
>>
>> While that may be true, it's also true that at some points we still
>> see the local variables having wrong values when stepping through
>> machine code. The aim of this patch is to expose a way of detecting
>> such situations for scripts that may need it. Until we have a safer
>> way to do it I think this should be integrated to the code base.
>
> Hi all,
>    (Hi Pedro!)
>
>  we badly need this. If you think the patch is in a shape good enough
> to be committed, please commit it for Martín since he doesn't have
> write access.
>
> We can then start a fresh new thread to discuss future directions
> specially related to optimized code and exactly what/how DWARF
> tags should be handled.

Ulrich raises a valid point though.
API design needs to be done with care.
I'd rather not rush this.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 17:57       ` Ulrich Weigand
  2014-10-23 18:09         ` Martin Galvan
@ 2014-10-24 14:58         ` Pedro Alves
  1 sibling, 0 replies; 30+ messages in thread
From: Pedro Alves @ 2014-10-24 14:58 UTC (permalink / raw)
  To: Ulrich Weigand, Martin Galvan; +Cc: gdb-patches, dje, Eli Zaretskii

On 10/23/2014 06:57 PM, Ulrich Weigand wrote:

> The in_prologue routine is likewise only still uses under certain rather
> rare circumstances; in fact it might even today be possible to simply
> remove it.  

Yes, I think so, see:

 https://sourceware.org/ml/gdb-patches/2014-10/msg00643.html

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 17:36     ` Martin Galvan
  2014-10-23 17:57       ` Ulrich Weigand
@ 2014-10-24  4:57       ` Doug Evans
  2014-10-24 15:02         ` Pedro Alves
  2014-10-24 14:57       ` Pedro Alves
  2 siblings, 1 reply; 30+ messages in thread
From: Doug Evans @ 2014-10-24  4:57 UTC (permalink / raw)
  To: Martin Galvan; +Cc: Pedro Alves, gdb-patches, Eli Zaretskii

On Thu, Oct 23, 2014 at 10:36 AM, Martin Galvan
<martin.galvan@tallertechnologies.com> wrote:
> First of all, thanks a lot to all of you for the feedback.
>
> On Wed, Oct 22, 2014 at 4:23 PM, Doug Evans <dje@google.com> wrote:
>> We should have a test for is_in_epilogue anyway.
>> The canonical choice is to just make it amd64-linux specific.
>
> I briefly thought of that. The test I originally wrote for
> is_in_epilogue checked for the last PC of a known, simple function
> which ended with retq on amd64 and pop on thumb ARM. It worked fine on
> ARM, however for some reason the retq was returning False. I looked at
> amd64_in_function_epilogue_p and saw all it did was check if the
> opcode was the one for ret. Not being too familiar with amd64 I
> initially assumed the opcode for retq was different, but after doing
> an objdump I noticed it's the same (0xc3). I was surprised to see
> this, and debugged GDB itself only to find out that
> gdbarch->in_function_epilogue_p was the generic one instead of amd64.
> Is this a bug?

Dunno offhand.
However, I still think there needs to be a testcase that exercises the
API call, however minimally.
[I don't care much about the details of the exercising in this
particular case, just that it is at least minimally exercised.  The
more exercise the better of course.]

>> API functions like these are problematic.
>> Users don't expect API functions to be heuristic-based,
>> and that is all these can ever be in the general case.
>> The patch does try to provide the user some guarantees
>> ("however if the result is @code{False} you can be sure
>> @value{GDBN} is right."),
>> but it's not clear to me this will be true in the general case.
>> I may have missed something so I'm certainly willing to be
>> persuaded otherwise.
>
> Well, the comment at the top of in_prologue says "Returns 1 if PC
> *might* be in prologue, 0 if definately (sic) *not* in prologue".
> However, looking at the function itself it relies on the compiler
> having marked the prologue as its own "line", and if it can't use that
> info it falls back to using the architecture-dependant
> gdbarch_skip_prologue.

Yep.

> So far every time I've tested it it's worked
> fine, and if it didn't it would've probably already been reported as a
> bug of in_prologue or gdbarch_skip_prologue.

Optimized code, and hand written assembler can make this problematic.

From the point of view of the user trying to use this function, if the
compiler moves an instruction into the middle of the prologue, what
should the function return when the pc is back in the prologue after
having left it?

E.g.,
my_function:
    prologue_insn_1
    random_instruction_scheduled_into_prologue
    prologue_insn_2   ; <<<<

The function will always return 0 for prologue_insn_2, but users not
familiar such things may find this confusing.

> I guess if you want to be
> *really* careful I could change that line in the documentation for
> something like "if the result is False, you can be almost sure GDB is
> right", or "GDB is most likely right".
>
> I think in this case it's more important to emphasize false positives
> rather than false negatives.

I don't have a problem with this.
Naming the function maybe_is_in_prologue helps emphasize "*might* be
in prologue".

>> I might be ok with this patch if the functions were named something like
>> "maybe_is_in_prologue" and "maybe_is_in_epilogue".
>> That way they scream to the user (and future code readers) "Heads Up!"
>
> Agreed, although from what Pedro said I think it would be even more
> accurate if we renamed the epilogue function to something like
> "stack_frame_destroyed".

If one went that route then I wonder whether we need two API functions.
[If we did go with only one function I'd choose a different name than
foo_destroyed of course.]
The problem being solved is wanting to know whether the debug
information can be used to look up local variables.  Is there another
problem being solved by knowing whether we're in the prologue?

> On Wed, Oct 22, 2014 at 6:34 PM, Pedro Alves <palves@redhat.com> wrote:
>> target's whose gdbarch_in_function_epilogue_p is returning true for
>> instructions in the epilogue before the stack is destroyed, will
>> have software watchpoints broken in the tail call case Yao described,
>> like ARM had.
>>
>> And so continuing with the in_function_epilogue_p name results in a
>> misnamed Python hook too; I'd much rather not carry that confusion to
>> the exported API.  That function could well return true if for some
>> reason we're stopped at code in the middle of the function, not
>> an in epilogue, that for some reason destroys the stack.
>>
>> Or, consider the case of at some point we wanting to expose a
>> function that actually returns whether the PC is in the
>> epilogue, no matter whether the stack/frame has been destroyed
>> or not.
>
> Agreed.
>
>> How about a test that single-steps out of a function, and checks
>> in_function_epilogue at each single-step until the function is
>> finished?  We can use istarget "foo" to tweak whether to expect
>> whether in_function_epilogue returns true in one of the steps
>> or not.
>
> I actually wrote a test that checked is_in_epilogue with the last PC
> in a very simple function. I used it to test on ARM and it worked
> fine, however for amd64 the story was different (see my answer to
> Doug), and that's why I didn't include it.
>
> If we were to single-step and do architecture-dependant checks for
> every single instruction, however, I think the test would be quite
> complicated.

Not necessarily that complicated.
We have very explicit dwarf testcases to exercise gdb processing of
special dwarf conditions.
We *could* do a similar thing here (e.g., have an assembler function
with a known number of instructions and exercise the API call(s) at
each instruction).

>>> +  ** Two new functions: gdb.is_in_prologue and gdb.is_in_epilogue,
>>> +     which are wrappers for in_prologue and gdbarch_in_function_epilogue_p
>>> +     respectively.
>>
>> This is talking about wrapping some GDB internals.  I think it'd be
>> better to say something in user-speak.  The implementations of
>> these new Python functions could change, even.
>
> Agreed. How about:
>
> ** Two new functions: gdb.maybe_in_prologue and gdb.stack_destroyed.
>     The former returns True if a given PC may be inside a function's prologue
>     and thus the local variables may show wrong values at that point,
>     and the latter returns True if a given PC is at a point where GDB detected
>     the stack frame as having been destroyed, usually by the instructions
>     in a function's epilogue.
>
>>> +The optional @var{functionStart} argument is the start address of the
>>> +function you want to check if @var{pc} belongs to.  If your binary
>>> +doesn't have debugging info, @value{GDBN} may need to use this value
>>> +to guess if @var{pc} belongs to the prologue.  If omitted it defaults
>>> +to 0.
>>
>> Some targets have code at address 0.  Seems like we may be exposing a
>> bad interface for these targets here?
>
> I used 0 because in_prologue expects it to be non-zero. If it's 0 and
> we have no debugging info, it'll always return true:
>
>       /* We don't even have minsym information, so fall back to using
>          func_start, if given.  */
>     if (! func_start)
>         return 1;        /* We *might* be in a prologue.  */
>
> Again, I did it because of the way in_prologue works, but as Eli said
> this would probably be better handled with a Python exception or a
> message of some kind.
>
>> Any reason these new tests aren't in a separate file?
>
> Not particularly. I thought they should go in python.exp since my
> functions belong to the gdb module itself, but I could place them in a
> separate file if you want.

A separate file would be preferable.

> I'm thinking of naming the file
> py-prologue-epilogue.exp or maybe py-frame-is-invalid.exp. Any better
> naming suggestions are welcome!
>
>>> +# Get the current frame object.
>>> +gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>>> +    "Get the current frame object." 0
>>
>> Lowercase, and no "." please.  Here and throughout.  Usual style is to
>> drop the "the"s to be more concise:
>>
>> gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>>     "get current frame object" 0
>>
>> BTW, I happened to read this back, and noticed: "current" and "selected"
>> have distinct meanings in the frame machinery.  Best not mix them up:
>>
>> gdb_py_test_silent_cmd "python selectedFrame = gdb.selected_frame()" \
>>     "get selected frame object" 0
>
> Will do.
>
> One more thing: I'd like to know if everyone's ok with this:
>
> On Wed, Oct 22, 2014 at 3:06 PM, Martin Galvan
> <martin.galvan@tallertechnologies.com> wrote:
>> On Wed, Oct 22, 2014 at 2:47 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>>> I think we are miscommunicating.  What I had in mind is raise an
>>> exception or display an error message when GDB has no other means to
>>> determine where the function's start is (e.g., no debug info), and no
>>> functionStart argument was passed.  That is what I meant by
>>> "require".  IOW, it's up to the user to decide when to provide it, but
>>> GDB will refuse to use some arbitrary number, such as zero, if it
>>> cannot determine the starting address.
>>
>> Oh, I see. That's a great idea.
>>
>> I think the simplest way to do this is not to call GDB's in_prologue
>> directly from gdbpy_is_in_prologue, but reproduce some of its behavior
>> and adding a check on the return value of find_pc_partial_function
>> which will throw an exception if functionStart wasn't provided. This
>> may result in a bit of duplicated code, but in_prologue isn't that
>> long a function if you take the comments out so I don't think that
>> would be a problem. What do you think?

I prefer the robustness of flagging an exception over a heuristic in
the API, so yeah, works for me.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24  4:57       ` Doug Evans
@ 2014-10-24 15:02         ` Pedro Alves
  2014-10-24 15:34           ` Ulrich Weigand
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2014-10-24 15:02 UTC (permalink / raw)
  To: Doug Evans, Martin Galvan; +Cc: gdb-patches, Eli Zaretskii

On 10/24/2014 05:57 AM, Doug Evans wrote:
> If one went that route then I wonder whether we need two API functions.
> [If we did go with only one function I'd choose a different name than
> foo_destroyed of course.]

Do you have a better suggestion for the gdbarch hook?  I think we
should just rename it for good, avoiding these confusions further.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 15:02         ` Pedro Alves
@ 2014-10-24 15:34           ` Ulrich Weigand
  2014-10-24 15:47             ` Doug Evans
  0 siblings, 1 reply; 30+ messages in thread
From: Ulrich Weigand @ 2014-10-24 15:34 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Doug Evans, Martin Galvan, gdb-patches, Eli Zaretskii

Pedro Alves wrote:
> On 10/24/2014 05:57 AM, Doug Evans wrote:
> > If one went that route then I wonder whether we need two API functions.
> > [If we did go with only one function I'd choose a different name than
> > foo_destroyed of course.]
> 
> Do you have a better suggestion for the gdbarch hook?  I think we
> should just rename it for good, avoiding these confusions further.

So if the only use of this interface is to check whether the result of
some other interface (I assume something like Frame.read_var ?) is
reliable, then I guess we might consider moving the check actually
into that other interface.  E.g. have Frame.read_var itself check
in_epilogue and return an unavailable or optimized-out value if
the value would be unreliable otherwise.

This might in the end even push the check into the GDB core itself,
so that we might no longer need the explicit checks in breakpoint.c.
Not sure if that it feasible ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 15:34           ` Ulrich Weigand
@ 2014-10-24 15:47             ` Doug Evans
  0 siblings, 0 replies; 30+ messages in thread
From: Doug Evans @ 2014-10-24 15:47 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Pedro Alves, Martin Galvan, gdb-patches, Eli Zaretskii

On Fri, Oct 24, 2014 at 8:34 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Pedro Alves wrote:
>> On 10/24/2014 05:57 AM, Doug Evans wrote:
>> > If one went that route then I wonder whether we need two API functions.
>> > [If we did go with only one function I'd choose a different name than
>> > foo_destroyed of course.]
>>
>> Do you have a better suggestion for the gdbarch hook?  I think we
>> should just rename it for good, avoiding these confusions further.
>
> So if the only use of this interface is to check whether the result of
> some other interface (I assume something like Frame.read_var ?) is
> reliable, then I guess we might consider moving the check actually
> into that other interface.  E.g. have Frame.read_var itself check
> in_epilogue and return an unavailable or optimized-out value if
> the value would be unreliable otherwise.

I can imagine someone wanting to do the check before doing
gdb.parse_and_eval (an escape hatch for general expression
evaluation).

Also, FAOD, the API function in question still should check whether
the pc is in the prologue (unless, e.g., gdb knows the debug info is
usable) as there too locals may not be accessible.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-23 17:36     ` Martin Galvan
  2014-10-23 17:57       ` Ulrich Weigand
  2014-10-24  4:57       ` Doug Evans
@ 2014-10-24 14:57       ` Pedro Alves
  2014-10-24 15:13         ` Ulrich Weigand
  2014-10-24 19:49         ` [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
  2 siblings, 2 replies; 30+ messages in thread
From: Pedro Alves @ 2014-10-24 14:57 UTC (permalink / raw)
  To: Martin Galvan; +Cc: gdb-patches, dje, Eli Zaretskii, Ulrich Weigand

On 10/23/2014 06:36 PM, Martin Galvan wrote:
>> > Some targets have code at address 0.  Seems like we may be exposing a
>> > bad interface for these targets here?
> I used 0 because in_prologue expects it to be non-zero. If it's 0 and
> we have no debugging info, it'll always return true:
> 
>       /* We don't even have minsym information, so fall back to using
>          func_start, if given.  */
>     if (! func_start)
>         return 1;        /* We *might* be in a prologue.  */

Design mistakes in the internal APIs shouldn't be exposed to a public
API.  I'd even suggest that whatever Python API we end up with, it'd
be good to make the internal API match it.

> 
> Again, I did it because of the way in_prologue works, but as Eli said
> this would probably be better handled with a Python exception or a
> message of some kind.

Not sure an exception makes sense given the function's
interface.  Say in the future another optional parameter is added.
What would you do then?  What of old code that passed in func_start
but not that new argument?  Those might not expect an exception.
So for the case of the new argument not being specified, we'd
have to return 1, which is right -- the PC _might_ be pointing
at a prologue.

But, how exactly were you planning using the gdb.is_in_prologue
function?  GDB itself doesn't use this to determine whether locals
are valid, only gdbarch_in_function_epilogue_p/gdb.is_in_epilogue.

I think we can just remove the in_prologue function entirely.
Looking at the code in GDB that makes use of this, all we find is this
one caller:

      if ((ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
	  || ((ecs->event_thread->control.step_range_end == 1)
	      && in_prologue (gdbarch, ecs->event_thread->prev_pc,
			      ecs->stop_func_start)))
	{
	  /* I presume that step_over_calls is only 0 when we're
	     supposed to be stepping at the assembly language level
	     ("stepi").  Just stop.  */
	  /* Also, maybe we just did a "nexti" inside a prolog, so we
	     thought it was a subroutine call but it was not.  Stop as
	     well.  FENN */
	  /* And this works the same backward as frontward.  MVS */
	  end_stepping_range (ecs);
	  return;
	}

So this is only used for "nexti", and it's itself a dubious use.
It looks like old code papering over unwinder problems.

Doing some archaeology, I found the revision that code
was added:

 commit 100a02e1deec2f037a15cdf232f026dc79763bf8
 Author:     Andrew Cagney <cagney@redhat.com>
 AuthorDate: Thu Jun 28 21:48:41 2001 +0000
 Commit:     Andrew Cagney <cagney@redhat.com>
 CommitDate: Thu Jun 28 21:48:41 2001 +0000

     From Fernando Nasser:
     * infrun.c (handle_inferior_event): Handle "nexti" inside function
     prologues.

And the mailing list thread:

  https://sourceware.org/ml/gdb-patches/2001-01/msg00047.html

Not much discussion there, and no test...

Looking at the code around what was patched in that revision, we see
that the checks that detect whether the program has just stepped into a
subroutine didn't rely on the unwinders _at all_ back then!

From 'git show 100a02e1:gdb/infrun.c':

    if (stop_pc == ecs->stop_func_start         /* Quick test */
        || (in_prologue (stop_pc, ecs->stop_func_start) &&
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            !IN_SOLIB_RETURN_TRAMPOLINE (stop_pc, ecs->stop_func_name))
        || IN_SOLIB_CALL_TRAMPOLINE (stop_pc, ecs->stop_func_name)
        || ecs->stop_func_name == 0)
      {
        /* It's a subroutine call.  */

        if ((step_over_calls == STEP_OVER_NONE)
            || ((step_range_end == 1)
                && in_prologue (prev_pc, ecs->stop_func_start)))
          {
            /* I presume that step_over_calls is only 0 when we're
               supposed to be stepping at the assembly language level
               ("stepi").  Just stop.  */
            /* Also, maybe we just did a "nexti" inside a prolog,
               so we thought it was a subroutine call but it was not.
               Stop as well.  FENN */
            stop_step = 1;
            print_stop_reason (END_STEPPING_RANGE, 0);
            stop_stepping (ecs);
            return;
          }


Stripping the IN_SOLIB_RETURN_TRAMPOLINE checks for simplicity, we had:

    if (stop_pc == ecs->stop_func_start         /* Quick test */
        || in_prologue (stop_pc, ecs->stop_func_start)
        || ecs->stop_func_name == 0)
      {
        /* It's a subroutine call.  */

That's it.  Detecting a subroutine call was based on prologue
detection.  And that's why nexti needed the fix it needed back
then.  IOW, the in_prologue check in the current tree was undoing
a bad decision the in_prologue check just above did...

Compare to today's subroutine call check:

  /* Check for subroutine calls.  The check for the current frame
     equalling the step ID is not necessary - the check of the
     previous frame's ID is sufficient - but it is a common case and
     cheaper than checking the previous frame's ID.

     NOTE: frame_id_eq will never report two invalid frame IDs as
     being equal, so to get into this block, both the current and
     previous frame must have valid frame IDs.  */
  /* The outer_frame_id check is a heuristic to detect stepping
     through startup code.  If we step over an instruction which
     sets the stack pointer from an invalid value to a valid value,
     we may detect that as a subroutine call from the mythical
     "outermost" function.  This could be fixed by marking
     outermost frames as !stack_p,code_p,special_p.  Then the
     initial outermost frame, before sp was valid, would
     have code_addr == &_start.  See the comment in frame_id_eq
     for more.  */
  if (!frame_id_eq (get_stack_frame_id (frame),
		    ecs->event_thread->control.step_stack_frame_id)
      && (frame_id_eq (frame_unwind_caller_id (get_current_frame ()),
		       ecs->event_thread->control.step_stack_frame_id)
	  && (!frame_id_eq (ecs->event_thread->control.step_stack_frame_id,
			    outer_frame_id)
	      || step_start_function != find_pc_function (stop_pc))))
    {

Nothing to do with prologues.  Completely relying on frame ids,
which are stable throughout the function.

It seems to me we can just remove the in_prologue check for nexti,
and the whole in_prologue function along with it...

I nexti'd manually a prologue for smoke testing, and it passes
regression testing for me on x86_64 Fedora 20, FWIW...

--------
From 822267fed3994acf191071653d10caf4c9dd7247 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 24 Oct 2014 14:49:33 +0100
Subject: [PATCH] Revert nexti prologue check

commit 100a02e1deec2f037a15cdf232f026dc79763bf8
Author:     Andrew Cagney <cagney@redhat.com>
AuthorDate: Thu Jun 28 21:48:41 2001 +0000
Commit:     Andrew Cagney <cagney@redhat.com>
CommitDate: Thu Jun 28 21:48:41 2001 +0000

    From Fernando Nasser:
    * infrun.c (handle_inferior_event): Handle "nexti" inside function
    prologues.
---
 gdb/infrun.c |  5 +----
 gdb/symtab.c | 73 ------------------------------------------------------------
 gdb/symtab.h |  3 ---
 3 files changed, 1 insertion(+), 80 deletions(-)

diff --git a/gdb/infrun.c b/gdb/infrun.c
index eca8eec..7a7d10d 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4945,10 +4945,7 @@ process_event_stop_test (struct execution_control_state *ecs)
       if (debug_infrun)
 	 fprintf_unfiltered (gdb_stdlog, "infrun: stepped into subroutine\n");

-      if ((ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
-	  || ((ecs->event_thread->control.step_range_end == 1)
-	      && in_prologue (gdbarch, ecs->event_thread->prev_pc,
-			      ecs->stop_func_start)))
+      if (ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
 	{
 	  /* I presume that step_over_calls is only 0 when we're
 	     supposed to be stepping at the assembly language level
diff --git a/gdb/symtab.c b/gdb/symtab.c
index c530d50..f39fc6c 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -3034,79 +3034,6 @@ skip_prologue_sal (struct symtab_and_line *sal)
     }
 }

-/* Determine if PC is in the prologue of a function.  The prologue is the area
-   between the first instruction of a function, and the first executable line.
-   Returns 1 if PC *might* be in prologue, 0 if definately *not* in prologue.
-
-   If non-zero, func_start is where we think the prologue starts, possibly
-   by previous examination of symbol table information.  */
-
-int
-in_prologue (struct gdbarch *gdbarch, CORE_ADDR pc, CORE_ADDR func_start)
-{
-  struct symtab_and_line sal;
-  CORE_ADDR func_addr, func_end;
-
-  /* We have several sources of information we can consult to figure
-     this out.
-     - Compilers usually emit line number info that marks the prologue
-       as its own "source line".  So the ending address of that "line"
-       is the end of the prologue.  If available, this is the most
-       reliable method.
-     - The minimal symbols and partial symbols, which can usually tell
-       us the starting and ending addresses of a function.
-     - If we know the function's start address, we can call the
-       architecture-defined gdbarch_skip_prologue function to analyze the
-       instruction stream and guess where the prologue ends.
-     - Our `func_start' argument; if non-zero, this is the caller's
-       best guess as to the function's entry point.  At the time of
-       this writing, handle_inferior_event doesn't get this right, so
-       it should be our last resort.  */
-
-  /* Consult the partial symbol table, to find which function
-     the PC is in.  */
-  if (! find_pc_partial_function (pc, NULL, &func_addr, &func_end))
-    {
-      CORE_ADDR prologue_end;
-
-      /* We don't even have minsym information, so fall back to using
-         func_start, if given.  */
-      if (! func_start)
-	return 1;		/* We *might* be in a prologue.  */
-
-      prologue_end = gdbarch_skip_prologue (gdbarch, func_start);
-
-      return func_start <= pc && pc < prologue_end;
-    }
-
-  /* If we have line number information for the function, that's
-     usually pretty reliable.  */
-  sal = find_pc_line (func_addr, 0);
-
-  /* Now sal describes the source line at the function's entry point,
-     which (by convention) is the prologue.  The end of that "line",
-     sal.end, is the end of the prologue.
-
-     Note that, for functions whose source code is all on a single
-     line, the line number information doesn't always end up this way.
-     So we must verify that our purported end-of-prologue address is
-     *within* the function, not at its start or end.  */
-  if (sal.line == 0
-      || sal.end <= func_addr
-      || func_end <= sal.end)
-    {
-      /* We don't have any good line number info, so use the minsym
-	 information, together with the architecture-specific prologue
-	 scanning code.  */
-      CORE_ADDR prologue_end = gdbarch_skip_prologue (gdbarch, func_addr);
-
-      return func_addr <= pc && pc < prologue_end;
-    }
-
-  /* We have line number info, and it looks good.  */
-  return func_addr <= pc && pc < sal.end;
-}
-
 /* Given PC at the function's start address, attempt to find the
    prologue end using SAL information.  Return zero if the skip fails.

diff --git a/gdb/symtab.h b/gdb/symtab.h
index 4a47d48..aaef9a0 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1320,9 +1320,6 @@ extern enum language deduce_language_from_filename (const char *);

 /* symtab.c */

-extern int in_prologue (struct gdbarch *gdbarch,
-			CORE_ADDR pc, CORE_ADDR func_start);
-
 extern CORE_ADDR skip_prologue_using_sal (struct gdbarch *gdbarch,
 					  CORE_ADDR func_addr);

-- 
1.9.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 14:57       ` Pedro Alves
@ 2014-10-24 15:13         ` Ulrich Weigand
  2014-11-07 14:45           ` [push] Revert old nexti prologue check and eliminate in_prologue Pedro Alves
  2014-10-24 19:49         ` [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
  1 sibling, 1 reply; 30+ messages in thread
From: Ulrich Weigand @ 2014-10-24 15:13 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Martin Galvan, gdb-patches, dje, Eli Zaretskii

Pedro Alves wrote:

> Subject: [PATCH] Revert nexti prologue check
> 
>     From Fernando Nasser:
>     * infrun.c (handle_inferior_event): Handle "nexti" inside function
>     prologues.

This makes sense to me.  Minor nit: please also remove the comment

> 	  /* Also, maybe we just did a "nexti" inside a prolog, so we
> 	     thought it was a subroutine call but it was not.  Stop as
> 	     well.  FENN */

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [push] Revert old nexti prologue check and eliminate in_prologue
  2014-10-24 15:13         ` Ulrich Weigand
@ 2014-11-07 14:45           ` Pedro Alves
  0 siblings, 0 replies; 30+ messages in thread
From: Pedro Alves @ 2014-11-07 14:45 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Martin Galvan, gdb-patches, dje, Eli Zaretskii

On 10/24/2014 04:12 PM, Ulrich Weigand wrote:
> Pedro Alves wrote:
> 
>> Subject: [PATCH] Revert nexti prologue check
>>
>>     From Fernando Nasser:
>>     * infrun.c (handle_inferior_event): Handle "nexti" inside function
>>     prologues.
> 
> This makes sense to me.  Minor nit: please also remove the comment
> 
>> 	  /* Also, maybe we just did a "nexti" inside a prolog, so we
>> 	     thought it was a subroutine call but it was not.  Stop as
>> 	     well.  FENN */

Done, and pushed now.

Thanks.

From b7a084bebe979a4743540349025561ce82208843 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 7 Nov 2014 13:53:01 +0000
Subject: [PATCH] Revert old nexti prologue check and eliminate in_prologue

The in_prologue check in the nexti code is obsolete; this commit
removes that, and then removes the in_prologue function as nothing
else uses it.

Looking at the code in GDB that makes use in_prologue, all we find is
this one caller:

      if ((ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
	  || ((ecs->event_thread->control.step_range_end == 1)
	      && in_prologue (gdbarch, ecs->event_thread->prev_pc,
			      ecs->stop_func_start)))
	{
	  /* I presume that step_over_calls is only 0 when we're
	     supposed to be stepping at the assembly language level
	     ("stepi").  Just stop.  */
	  /* Also, maybe we just did a "nexti" inside a prolog, so we
	     thought it was a subroutine call but it was not.  Stop as
	     well.  FENN */
	  /* And this works the same backward as frontward.  MVS */
	  end_stepping_range (ecs);
	  return;
	}

This was added by:

 commit 100a02e1deec2f037a15cdf232f026dc79763bf8
 ...
     From Fernando Nasser:
     * infrun.c (handle_inferior_event): Handle "nexti" inside function
     prologues.

The mailing list thread is here:

  https://sourceware.org/ml/gdb-patches/2001-01/msg00047.html

Not much discussion there, and no test, but looking at the code around
what was patched in that revision, we see that the checks that detect
whether the program has just stepped into a subroutine didn't rely on
the unwinders at all back then.

From 'git show 100a02e1:gdb/infrun.c':

    if (stop_pc == ecs->stop_func_start         /* Quick test */
        || (in_prologue (stop_pc, ecs->stop_func_start) &&
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            !IN_SOLIB_RETURN_TRAMPOLINE (stop_pc, ecs->stop_func_name))
        || IN_SOLIB_CALL_TRAMPOLINE (stop_pc, ecs->stop_func_name)
        || ecs->stop_func_name == 0)
      {
        /* It's a subroutine call.  */

        if ((step_over_calls == STEP_OVER_NONE)
            || ((step_range_end == 1)
                && in_prologue (prev_pc, ecs->stop_func_start)))
          {
            /* I presume that step_over_calls is only 0 when we're
               supposed to be stepping at the assembly language level
               ("stepi").  Just stop.  */
            /* Also, maybe we just did a "nexti" inside a prolog,
               so we thought it was a subroutine call but it was not.
               Stop as well.  FENN */
            stop_step = 1;
            print_stop_reason (END_STEPPING_RANGE, 0);
            stop_stepping (ecs);
            return;
          }

Stripping the IN_SOLIB_RETURN_TRAMPOLINE checks for simplicity, we had:

    if (stop_pc == ecs->stop_func_start         /* Quick test */
        || in_prologue (stop_pc, ecs->stop_func_start)
        || ecs->stop_func_name == 0)
      {
        /* It's a subroutine call.  */

That is, detecting a subroutine call was based on prologue detection
back then.  So the in_prologue check in the current tree only made
sense back then as it was undoing a bad decision the in_prologue check
that used to exist above did.

Today, the check for a subroutine call relies on frame ids instead,
which are stable throughout the function.  So we can just remove the
in_prologue check for nexti, and the whole in_prologue function along
with it.

Tested on x86_64 Fedora 20, and also by nexti-ing manually a prologue.

gdb/
2014-11-07  Pedro Alves  <palves@redhat.com>

	* infrun.c (process_event_stop_test) <subroutine check>: Don't
	check if we did a "nexti" inside a prologue.
	* symtab.c (in_prologue): Delete function.
	* symtab.h (in_prologue): Delete declaration.
---
 gdb/ChangeLog |  7 ++++++
 gdb/infrun.c  |  8 +------
 gdb/symtab.c  | 73 -----------------------------------------------------------
 gdb/symtab.h  |  3 ---
 4 files changed, 8 insertions(+), 83 deletions(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index f75a5ba..0a8bee8 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2014-11-07  Pedro Alves  <palves@redhat.com>
+
+	* infrun.c (process_event_stop_test) <subroutine check>: Don't
+	check if we did a "nexti" inside a prologue.
+	* symtab.c (in_prologue): Delete function.
+	* symtab.h (in_prologue): Delete declaration.
+
 2014-11-06  Doug Evans  <xdje42@gmail.com>
 
 	* symtab.h (lookup_global_symbol): Improve function comment.
diff --git a/gdb/infrun.c b/gdb/infrun.c
index b950b74..b2e182f 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -5004,17 +5004,11 @@ process_event_stop_test (struct execution_control_state *ecs)
       if (debug_infrun)
 	 fprintf_unfiltered (gdb_stdlog, "infrun: stepped into subroutine\n");
 
-      if ((ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
-	  || ((ecs->event_thread->control.step_range_end == 1)
-	      && in_prologue (gdbarch, ecs->event_thread->prev_pc,
-			      ecs->stop_func_start)))
+      if (ecs->event_thread->control.step_over_calls == STEP_OVER_NONE)
 	{
 	  /* I presume that step_over_calls is only 0 when we're
 	     supposed to be stepping at the assembly language level
 	     ("stepi").  Just stop.  */
-	  /* Also, maybe we just did a "nexti" inside a prolog, so we
-	     thought it was a subroutine call but it was not.  Stop as
-	     well.  FENN */
 	  /* And this works the same backward as frontward.  MVS */
 	  end_stepping_range (ecs);
 	  return;
diff --git a/gdb/symtab.c b/gdb/symtab.c
index 2aae04c..df974bf 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -2954,79 +2954,6 @@ skip_prologue_sal (struct symtab_and_line *sal)
     }
 }
 
-/* Determine if PC is in the prologue of a function.  The prologue is the area
-   between the first instruction of a function, and the first executable line.
-   Returns 1 if PC *might* be in prologue, 0 if definately *not* in prologue.
-
-   If non-zero, func_start is where we think the prologue starts, possibly
-   by previous examination of symbol table information.  */
-
-int
-in_prologue (struct gdbarch *gdbarch, CORE_ADDR pc, CORE_ADDR func_start)
-{
-  struct symtab_and_line sal;
-  CORE_ADDR func_addr, func_end;
-
-  /* We have several sources of information we can consult to figure
-     this out.
-     - Compilers usually emit line number info that marks the prologue
-       as its own "source line".  So the ending address of that "line"
-       is the end of the prologue.  If available, this is the most
-       reliable method.
-     - The minimal symbols and partial symbols, which can usually tell
-       us the starting and ending addresses of a function.
-     - If we know the function's start address, we can call the
-       architecture-defined gdbarch_skip_prologue function to analyze the
-       instruction stream and guess where the prologue ends.
-     - Our `func_start' argument; if non-zero, this is the caller's
-       best guess as to the function's entry point.  At the time of
-       this writing, handle_inferior_event doesn't get this right, so
-       it should be our last resort.  */
-
-  /* Consult the partial symbol table, to find which function
-     the PC is in.  */
-  if (! find_pc_partial_function (pc, NULL, &func_addr, &func_end))
-    {
-      CORE_ADDR prologue_end;
-
-      /* We don't even have minsym information, so fall back to using
-         func_start, if given.  */
-      if (! func_start)
-	return 1;		/* We *might* be in a prologue.  */
-
-      prologue_end = gdbarch_skip_prologue (gdbarch, func_start);
-
-      return func_start <= pc && pc < prologue_end;
-    }
-
-  /* If we have line number information for the function, that's
-     usually pretty reliable.  */
-  sal = find_pc_line (func_addr, 0);
-
-  /* Now sal describes the source line at the function's entry point,
-     which (by convention) is the prologue.  The end of that "line",
-     sal.end, is the end of the prologue.
-
-     Note that, for functions whose source code is all on a single
-     line, the line number information doesn't always end up this way.
-     So we must verify that our purported end-of-prologue address is
-     *within* the function, not at its start or end.  */
-  if (sal.line == 0
-      || sal.end <= func_addr
-      || func_end <= sal.end)
-    {
-      /* We don't have any good line number info, so use the minsym
-	 information, together with the architecture-specific prologue
-	 scanning code.  */
-      CORE_ADDR prologue_end = gdbarch_skip_prologue (gdbarch, func_addr);
-
-      return func_addr <= pc && pc < prologue_end;
-    }
-
-  /* We have line number info, and it looks good.  */
-  return func_addr <= pc && pc < sal.end;
-}
-
 /* Given PC at the function's start address, attempt to find the
    prologue end using SAL information.  Return zero if the skip fails.
 
diff --git a/gdb/symtab.h b/gdb/symtab.h
index 9b3ea80..d78b832 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1347,9 +1347,6 @@ extern enum language deduce_language_from_filename (const char *);
 
 /* symtab.c */
 
-extern int in_prologue (struct gdbarch *gdbarch,
-			CORE_ADDR pc, CORE_ADDR func_start);
-
 extern CORE_ADDR skip_prologue_using_sal (struct gdbarch *gdbarch,
 					  CORE_ADDR func_addr);
 
-- 
1.9.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 14:57       ` Pedro Alves
  2014-10-24 15:13         ` Ulrich Weigand
@ 2014-10-24 19:49         ` Martin Galvan
  2014-10-24 20:09           ` Pedro Alves
  1 sibling, 1 reply; 30+ messages in thread
From: Martin Galvan @ 2014-10-24 19:49 UTC (permalink / raw)
  To: Pedro Alves
  Cc: gdb-patches, Doug Evans, Eli Zaretskii, Ulrich Weigand, Daniel Gutson

On Fri, Oct 24, 2014 at 11:56 AM, Pedro Alves <palves@redhat.com> wrote:
> On 10/23/2014 06:36 PM, Martin Galvan wrote:
>>> > Some targets have code at address 0.  Seems like we may be exposing a
>>> > bad interface for these targets here?
>> I used 0 because in_prologue expects it to be non-zero. If it's 0 and
>> we have no debugging info, it'll always return true:
>>
>>       /* We don't even have minsym information, so fall back to using
>>          func_start, if given.  */
>>     if (! func_start)
>>         return 1;        /* We *might* be in a prologue.  */
>
> Design mistakes in the internal APIs shouldn't be exposed to a public
> API.  I'd even suggest that whatever Python API we end up with, it'd
> be good to make the internal API match it.
>
>>
>> Again, I did it because of the way in_prologue works, but as Eli said
>> this would probably be better handled with a Python exception or a
>> message of some kind.
>
> Not sure an exception makes sense given the function's
> interface.  Say in the future another optional parameter is added.
> What would you do then?  What of old code that passed in func_start
> but not that new argument?  Those might not expect an exception.
> So for the case of the new argument not being specified, we'd
> have to return 1, which is right -- the PC _might_ be pointing
> at a prologue.

I probably didn't make myself clear-- I wasn't talking about using
in_prologue directly anymore, but to follow its approach in the API
function. Of course it wouldn't make sense to put Python exception
raising directly inside in_prologue.

> But, how exactly were you planning using the gdb.is_in_prologue
> function?  GDB itself doesn't use this to determine whether locals
> are valid, only gdbarch_in_function_epilogue_p/gdb.is_in_epilogue.

Well, I followed the code while testing a rather simple function and
noticed that handle_step_into_function is very similar (in terms of
the approach) to in_prologue plus some address corrections and setting
a breakpoint to proceed to. The API function needs only the address
calculation part.

What if:
   1) I split handle_step_into_function in the address calc part and
the brakpoint insertion part,
moving the address calc to a new function (publicly available from infrun.h).
   2) I expose such function to the Python API.

Would that be accepted? Would you want to see a patch?

Please keep in mind that what I actually need is not really messing
with the prologue, but to know where the local variables are
accessible. If I could simply use DWARF info to accomplish that then I
wouldn't even touch the prologue at all.

Thanks!

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 19:49         ` [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
@ 2014-10-24 20:09           ` Pedro Alves
  2014-10-24 21:11             ` Martin Galvan
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2014-10-24 20:09 UTC (permalink / raw)
  To: Martin Galvan
  Cc: gdb-patches, Doug Evans, Eli Zaretskii, Ulrich Weigand, Daniel Gutson

On 10/24/2014 08:49 PM, Martin Galvan wrote:
> On Fri, Oct 24, 2014 at 11:56 AM, Pedro Alves <palves@redhat.com> wrote:
>> On 10/23/2014 06:36 PM, Martin Galvan wrote:
>>>>> Some targets have code at address 0.  Seems like we may be exposing a
>>>>> bad interface for these targets here?
>>> I used 0 because in_prologue expects it to be non-zero. If it's 0 and
>>> we have no debugging info, it'll always return true:
>>>
>>>       /* We don't even have minsym information, so fall back to using
>>>          func_start, if given.  */
>>>     if (! func_start)
>>>         return 1;        /* We *might* be in a prologue.  */
>>
>> Design mistakes in the internal APIs shouldn't be exposed to a public
>> API.  I'd even suggest that whatever Python API we end up with, it'd
>> be good to make the internal API match it.
>>
>>>
>>> Again, I did it because of the way in_prologue works, but as Eli said
>>> this would probably be better handled with a Python exception or a
>>> message of some kind.
>>
>> Not sure an exception makes sense given the function's
>> interface.  Say in the future another optional parameter is added.
>> What would you do then?  What of old code that passed in func_start
>> but not that new argument?  Those might not expect an exception.
>> So for the case of the new argument not being specified, we'd
>> have to return 1, which is right -- the PC _might_ be pointing
>> at a prologue.
> 
> I probably didn't make myself clear-- I wasn't talking about using
> in_prologue directly anymore, but to follow its approach in the API
> function. Of course it wouldn't make sense to put Python exception
> raising directly inside in_prologue.

That concern with about clients of the Python API, and if another
optional parameter is added to the Python API.

> 
>> But, how exactly were you planning using the gdb.is_in_prologue
>> function?  GDB itself doesn't use this to determine whether locals
>> are valid, only gdbarch_in_function_epilogue_p/gdb.is_in_epilogue.
> 
> Well, I followed the code while testing a rather simple function and
> noticed that handle_step_into_function is very similar (in terms of
> the approach) to in_prologue plus some address corrections and setting
> a breakpoint to proceed to. The API function needs only the address
> calculation part.
> 
> What if:
>    1) I split handle_step_into_function in the address calc part and
> the brakpoint insertion part,
> moving the address calc to a new function (publicly available from infrun.h).
>    2) I expose such function to the Python API.
> 
> Would that be accepted? Would you want to see a patch?
> 
> Please keep in mind that what I actually need is not really messing
> with the prologue, but to know where the local variables are
> accessible. If I could simply use DWARF info to accomplish that then I
> wouldn't even touch the prologue at all.

Hmm, how is this different from simply doing "break function" ?
GDB sets function breakpoints after the prologue already.  A "step"
into a function should stop at the exact same address as if the user
did "b function; c" to run to said function.

So, when you detect that you stepped into a function, you could
just set the breakpoint by function name?

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 20:09           ` Pedro Alves
@ 2014-10-24 21:11             ` Martin Galvan
  2014-10-24 22:34               ` Pedro Alves
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Galvan @ 2014-10-24 21:11 UTC (permalink / raw)
  To: Pedro Alves
  Cc: gdb-patches, Doug Evans, Eli Zaretskii, Ulrich Weigand, Daniel Gutson

On Fri, Oct 24, 2014 at 5:09 PM, Pedro Alves <palves@redhat.com> wrote:
>> Well, I followed the code while testing a rather simple function and
>> noticed that handle_step_into_function is very similar (in terms of
>> the approach) to in_prologue plus some address corrections and setting
>> a breakpoint to proceed to. The API function needs only the address
>> calculation part.
>>
>> What if:
>>    1) I split handle_step_into_function in the address calc part and
>> the brakpoint insertion part,
>> moving the address calc to a new function (publicly available from infrun.h).
>>    2) I expose such function to the Python API.
>>
>> Would that be accepted? Would you want to see a patch?
>>
>> Please keep in mind that what I actually need is not really messing
>> with the prologue, but to know where the local variables are
>> accessible. If I could simply use DWARF info to accomplish that then I
>> wouldn't even touch the prologue at all.
>
> Hmm, how is this different from simply doing "break function" ?
> GDB sets function breakpoints after the prologue already.  A "step"
> into a function should stop at the exact same address as if the user
> did "b function; c" to run to said function.
>
> So, when you detect that you stepped into a function, you could
> just set the breakpoint by function name?

In order for that to work, I'd have to run the program up to that
point. I really need to be able to determine if at a given PC the
local variables will be accessible without actually running the
program. Ideally I'd use only DWARF info to know that.

I looked up the approach GDB takes when setting a breakpoint at a
function name. From what I saw it appears to be similar as the
"optimistic" path from in_prologue (that is, using symtab and line
info). I guess that makes sense since setting a breakpoint by function
name by definition requires us to have debugging info.

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 21:11             ` Martin Galvan
@ 2014-10-24 22:34               ` Pedro Alves
  2014-10-27 16:40                 ` Martin Galvan
  0 siblings, 1 reply; 30+ messages in thread
From: Pedro Alves @ 2014-10-24 22:34 UTC (permalink / raw)
  To: Martin Galvan
  Cc: gdb-patches, Doug Evans, Eli Zaretskii, Ulrich Weigand, Daniel Gutson

On 10/24/2014 10:11 PM, Martin Galvan wrote:
> On Fri, Oct 24, 2014 at 5:09 PM, Pedro Alves <palves@redhat.com> wrote:
>>> Well, I followed the code while testing a rather simple function and
>>> noticed that handle_step_into_function is very similar (in terms of
>>> the approach) to in_prologue plus some address corrections and setting
>>> a breakpoint to proceed to. The API function needs only the address
>>> calculation part.
>>>
>>> What if:
>>>    1) I split handle_step_into_function in the address calc part and
>>> the brakpoint insertion part,
>>> moving the address calc to a new function (publicly available from infrun.h).
>>>    2) I expose such function to the Python API.
>>>
>>> Would that be accepted? Would you want to see a patch?
>>>
>>> Please keep in mind that what I actually need is not really messing
>>> with the prologue, but to know where the local variables are
>>> accessible. If I could simply use DWARF info to accomplish that then I
>>> wouldn't even touch the prologue at all.
>>
>> Hmm, how is this different from simply doing "break function" ?
>> GDB sets function breakpoints after the prologue already.  A "step"
>> into a function should stop at the exact same address as if the user
>> did "b function; c" to run to said function.
>>
>> So, when you detect that you stepped into a function, you could
>> just set the breakpoint by function name?
> 
> In order for that to work, I'd have to run the program up to that
> point. 

You can set breakpoints before running the program:

(top-gdb) b *main
Breakpoint 3 at 0x45ed30: file /home/pedro/gdb/mygit/build/../src/gdb/gdb.c, line 25.
(top-gdb) b main
Breakpoint 4 at 0x45ed3f: file /home/pedro/gdb/mygit/build/../src/gdb/gdb.c, line 28.

That offset is your "prologue", however meaningful that is.

> I really need to be able to determine if at a given PC the
> local variables will be accessible

Why?

> without actually running the
> program. Ideally I'd use only DWARF info to know that.

I think we still don't know what you're trying to do, only
a bit of _how_ you're trying to do it.  :-)  It makes it
harder to understand the use case, and to suggest solutions.

> I looked up the approach GDB takes when setting a breakpoint at a
> function name. From what I saw it appears to be similar as the
> "optimistic" path from in_prologue (that is, using symtab and line
> info). I guess that makes sense since setting a breakpoint by function
> name by definition requires us to have debugging info.

If you need access to local variables, then you're already
relying on debug info.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue.
  2014-10-24 22:34               ` Pedro Alves
@ 2014-10-27 16:40                 ` Martin Galvan
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Galvan @ 2014-10-27 16:40 UTC (permalink / raw)
  To: Pedro Alves
  Cc: gdb-patches, Doug Evans, Eli Zaretskii, Ulrich Weigand, Daniel Gutson

On Fri, Oct 24, 2014 at 7:34 PM, Pedro Alves <palves@redhat.com> wrote:
> You can set breakpoints before running the program:
>
> (top-gdb) b *main
> Breakpoint 3 at 0x45ed30: file /home/pedro/gdb/mygit/build/../src/gdb/gdb.c, line 25.
> (top-gdb) b main
> Breakpoint 4 at 0x45ed3f: file /home/pedro/gdb/mygit/build/../src/gdb/gdb.c, line 28.
>
> That offset is your "prologue", however meaningful that is.

Indeed. However, setting breakpoints just to parse that output is ugly :)

> I think we still don't know what you're trying to do, only
> a bit of _how_ you're trying to do it.  :-)  It makes it
> harder to understand the use case, and to suggest solutions.

In my case, I'm working on a Gdb-based fault injection machine which
will select some random PC from a given program, check if local
variables are visible at that point, drive the target to said point
and corrupt the contents of said variables. If I had a way to know at
which point a function's arguments will be ready to be used, the
attacks could be more effective: it's useless to corrupt the temporary
value the args have before they're initialized in the function's
prologue. However, as the PC is selected at random and it won't always
target the prologue, I wouldn't mind a few "missed" attacks if you
really don't want to include the prologue function.

The epilogue is a bit trickier: after the stack frame is destroyed,
Gdb will show the variables' addresses as pointing to other places in
the stack. If I were to corrupt those places, I could change the value
of the previous PC (which was stored as we entered the function),
which would make the program crash. While that seems like a desirable
thing, it's not for what I'm trying to test with this particular
behavior.

>> I looked up the approach GDB takes when setting a breakpoint at a
>> function name. From what I saw it appears to be similar as the
>> "optimistic" path from in_prologue (that is, using symtab and line
>> info). I guess that makes sense since setting a breakpoint by function
>> name by definition requires us to have debugging info.
>
> If you need access to local variables, then you're already
> relying on debug info.

Yes, but here's the thing: what in_prologue (and
handle_step_into_function, etc) do is taking the first address of a
function, getting its line number info and assuming the prologue ends
at the last address covered by that line. This is based on the
assumption that the compiler will mark the prologue as its own "line".
What confused me initially was that in_prologue had what appeared to
be a check for the case where a function would have all its code on a
single line. If that check didn't pass, it called
gdbarch_skip_prologue (which is what we're trying to avoid). I didn't
see something like that for the breakpoint setting code, so I tried
doing "break myFunction", where myFunction is defined in a single
line. It worked as expected: the breakpoint was still set at the end
of the prologue. I'm not certain, however, of how would this work on
hand-written assembly or the case where a compiler would schedule an
instruction from the function body to be inside the prologue (would it
mark that instruction as part of the first "line"?).

The bottom line is: I'm willing to drop the prologue API function, as
it's not essential to me particularly, but if we were to keep it we'd
have to require the function_start argument. I really need to keep the
epilogue (or perhaps I should say "stack destroyed"?) function,
though.

-- 

Martín Galván

Software Engineer

Taller Technologies Argentina

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: 54 351 4217888 / +54 351 4218211

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2014-11-07 14:45 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-22 14:02 [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
2014-10-22 15:10 ` Eli Zaretskii
2014-10-22 15:14   ` Martin Galvan
2014-10-22 17:33   ` Martin Galvan
2014-10-22 17:47     ` Eli Zaretskii
2014-10-22 18:06       ` Martin Galvan
2014-10-22 18:07       ` Eli Zaretskii
2014-10-22 18:32         ` Martin Galvan
2014-10-22 18:37           ` Eli Zaretskii
2014-10-22 19:23 ` Doug Evans
2014-10-22 21:34 ` Pedro Alves
2014-10-22 21:59   ` Pedro Alves
2014-10-23 17:36     ` Martin Galvan
2014-10-23 17:57       ` Ulrich Weigand
2014-10-23 18:09         ` Martin Galvan
2014-10-23 18:14           ` Daniel Gutson
2014-10-24  2:42             ` Doug Evans
2014-10-24 14:58         ` Pedro Alves
2014-10-24  4:57       ` Doug Evans
2014-10-24 15:02         ` Pedro Alves
2014-10-24 15:34           ` Ulrich Weigand
2014-10-24 15:47             ` Doug Evans
2014-10-24 14:57       ` Pedro Alves
2014-10-24 15:13         ` Ulrich Weigand
2014-11-07 14:45           ` [push] Revert old nexti prologue check and eliminate in_prologue Pedro Alves
2014-10-24 19:49         ` [PATCH] Python API: Add gdb.is_in_prologue and gdb.is_in_epilogue Martin Galvan
2014-10-24 20:09           ` Pedro Alves
2014-10-24 21:11             ` Martin Galvan
2014-10-24 22:34               ` Pedro Alves
2014-10-27 16:40                 ` Martin Galvan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).