public inbox for gdb@sourceware.org
* Is nexti confused by pushq?
@ 2019-02-25 15:40 David Griffiths
  2019-02-25 15:54 ` dwk
  0 siblings, 1 reply; 11+ messages in thread
From: David Griffiths @ 2019-02-25 15:40 UTC (permalink / raw)
  To: gdb

Hi, when I get to the following instructions:

  0x00007fffe192413e: rex.W pushq 0x28(%rsp)
  0x00007fffe1924143: rex.W popq (%rsp)
  0x00007fffe1924147: callq  0x00007fffe1045de0

and do "nexti" at the first, it doesn't stop at the second but instead acts
as though I'd done "continue". For some reason I can't reproduce with a
little test though.

(gdb 8.1 on Ubuntu 16.04)

BTW I'm doing nexti programmatically and trying to avoid looking at the
next instruction to decide whether to do stepi or nexti.

Cheers,

David

-- 

David Griffiths, Senior Software Engineer

Undo <https://undo.io> | Resolve even the most challenging software defects
with software flight recorder technology

Software reliability report: optimizing the software supplier and customer
relationship
<https://info.undo.io/software-reliability-report-optimizing-supplier-and-customer-relationship>

* Re: Is nexti confused by pushq?
  2019-02-25 15:40 Is nexti confused by pushq? David Griffiths
@ 2019-02-25 15:54 ` dwk
  2019-02-26  7:32   ` Andrew Burgess
  0 siblings, 1 reply; 11+ messages in thread
From: dwk @ 2019-02-25 15:54 UTC (permalink / raw)
  To: David Griffiths; +Cc: GDB

I encounter this frequently, although I don't have a minimal case yet
either. I think it may have something to do with symbol information, as
I've only encountered the case when symbol information is not present (as
in the example you gave). stepi always works, but nexti sometimes turns into
a continue. I assumed this was because GDB could not work out where the "next"
instruction was in the absence of symbols.

dwk


* Re: Is nexti confused by pushq?
  2019-02-25 15:54 ` dwk
@ 2019-02-26  7:32   ` Andrew Burgess
  2019-02-26 10:12     ` Jan Kratochvil
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Burgess @ 2019-02-26  7:32 UTC (permalink / raw)
  To: dwk; +Cc: David Griffiths, GDB

* dwk <dwks42@gmail.com> [2019-02-25 10:54:19 -0500]:

> I encounter this frequently, although I don't have a minimal case yet
> either. I think it may have something to do with symbol information, as
> I've only encountered the case when symbol information is not present (as
> in the example you gave). stepi always works, but nexti sometimes turns into
> a continue. I assumed this was because GDB could not work out where the "next"
> instruction was in the absence of symbols.

The problem here is that pushq changes the stack pointer, and this is
evidently interacting badly with the unwinder in some cases.

If we consider the difference between 'stepi' and 'nexti' we will see
what is going wrong.

A 'stepi' simply single steps the machine.  There's very little extra
logic, it's just a single step.

A 'nexti', however, steps to the next instruction in the current function,
stepping over function calls.  The way this works is that when the
'nexti' is issued, GDB caches the current frame-id, which is (roughly)
the function's entry $pc plus the frame base pointer (related to $sp at
entry to the function).  Once this is cached, GDB single-steps, and
after each step it checks the current frame-id.  If the frame-id has
changed, GDB believes we have entered a new (nested) function,
places a breakpoint at the return address, and then continues.  Once
we hit the breakpoint we should be back in the original frame and we
have completed the 'nexti'.

The problem arises if the frame-id changes when we single-step over
the 'pushq'.  If that happens, GDB wrongly concludes that we have
entered a nested function, so it inserts its breakpoint and continues.
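
The frame-id check described above can be sketched as a toy model.  This
is not GDB's code: FrameId, frame_id_without_unwind_info,
frame_id_with_unwind_info and nexti_decision are hypothetical names, and
the addresses are made up.  It only illustrates why a frame base guessed
from the current $sp makes a 'pushq' look like a call.

```python
from typing import NamedTuple


class FrameId(NamedTuple):
    """Simplified frame-id: frame base plus function entry address."""
    frame_base: int
    func_entry: int


def frame_id_without_unwind_info(sp: int, func_entry: int) -> FrameId:
    # ASSUMPTION: with nothing better to go on, the unwinder guesses the
    # frame base from the current $sp, so the id moves whenever $sp moves.
    return FrameId(sp, func_entry)


def frame_id_with_unwind_info(sp: int, cfa_offset: int, func_entry: int) -> FrameId:
    # With unwind info, the offset from $sp to the frame base (the CFA) is
    # known at every instruction, so the id stays put across push/pop.
    return FrameId(sp + cfa_offset, func_entry)


def nexti_decision(id_before: FrameId, id_after: FrameId) -> str:
    """What the simplified nexti logic does after one single step."""
    if id_before == id_after:
        return "stop"      # same frame: the nexti is complete
    return "continue"      # looks like a new frame: run to the "return address"


FUNC = 0x00007FFFE1924100   # made-up function entry
SP = 0x00007FFFFFFFD000     # made-up $sp before the pushq

# pushq moves $sp down by 8 bytes.
no_info = nexti_decision(frame_id_without_unwind_info(SP, FUNC),
                         frame_id_without_unwind_info(SP - 8, FUNC))
with_info = nexti_decision(frame_id_with_unwind_info(SP, 0x30, FUNC),
                           frame_id_with_unwind_info(SP - 8, 0x38, FUNC))
print(no_info, with_info)
```

In the first case the id changes across the push and the model
"continues"; in the second the recorded offset grows by 8 as $sp drops
by 8, so the id is unchanged and the model stops.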

To check this, try walking over your problem code using
'stepi', running the 'bt' command at each step.  What you should see is
that as you step over the 'pushq' the backtrace changes in
some way; this indicates the change in frame-id that is causing
problems for you.
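
Since GDB is being driven programmatically here, the check could be
scripted roughly as follows.  This is only a sketch: outer_frames and
find_backtrace_changes are hypothetical helper names, and 'execute' is
assumed to be any callable that runs a GDB command and returns its
output as a string (inside GDB, gdb.execute with to_string=True does
that).  Frame #0 is ignored because its pc changes on every step.

```python
def outer_frames(bt_text):
    """Return the backtrace text from frame #1 outwards.

    Frame #0 is dropped because its pc changes on every single step.
    """
    lines = bt_text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("#1 "):
            return "\n".join(lines[index:])
    return ""


def find_backtrace_changes(execute, steps):
    """Single-step `steps` times; report steps where outer frames changed.

    `execute` is a callable taking a GDB command string and returning the
    command's output as a string.
    """
    changes = []
    previous = outer_frames(execute("bt"))
    for step in range(1, steps + 1):
        execute("stepi")
        current = outer_frames(execute("bt"))
        if current != previous:
            changes.append((step, previous, current))
        previous = current
    return changes
```

Inside GDB's Python interpreter this might be called as
find_backtrace_changes(lambda cmd: gdb.execute(cmd, to_string=True), 3).
A genuine call instruction also changes the outer frames, so a reported
change is only suspicious when the stepped instruction was a push or pop.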

Of course, this doesn't solve the problem for you, but at least you
know what's going wrong now :)

Thanks,
Andrew

* Re: Is nexti confused by pushq?
  2019-02-26  7:32   ` Andrew Burgess
@ 2019-02-26 10:12     ` Jan Kratochvil
  2019-02-26 11:50       ` David Griffiths
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kratochvil @ 2019-02-26 10:12 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: dwk, David Griffiths, GDB

On Tue, 26 Feb 2019 08:32:37 +0100, Andrew Burgess wrote:
> Of course, this doesn't solve the problem for you, but at least you
> know what's going wrong now :)

To be clear, the debuggee has wrong/insufficient debug info: its
.eh_frame/.debug_frame should annotate the push (and pop) instructions.


Jan


* Re: Is nexti confused by pushq?
  2019-02-26 10:12     ` Jan Kratochvil
@ 2019-02-26 11:50       ` David Griffiths
  2019-02-26 11:58         ` Jan Kratochvil
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: David Griffiths @ 2019-02-26 11:50 UTC (permalink / raw)
  To: Jan Kratochvil; +Cc: Andrew Burgess, dwk, GDB

Ok, so in my case this is generated code with no debug info (Java JIT
generated) so does that mean I shouldn't attempt to use nexti? (I've got
other issues which probably preclude using nexti anyway but just curious)

Cheers,

David


* Re: Is nexti confused by pushq?
  2019-02-26 11:50       ` David Griffiths
@ 2019-02-26 11:58         ` Jan Kratochvil
  2019-02-26 14:19         ` Dmitry Samersoff
  2019-02-26 19:05         ` Tom Tromey
  2 siblings, 0 replies; 11+ messages in thread
From: Jan Kratochvil @ 2019-02-26 11:58 UTC (permalink / raw)
  To: David Griffiths; +Cc: Andrew Burgess, dwk, GDB

On Tue, 26 Feb 2019 12:50:37 +0100, David Griffiths wrote:
> Ok, so in my case this is generated code with no debug info (Java JIT
> generated) so does that mean I shouldn't attempt to use nexti? (I've got
> other issues which probably preclude using nexti anyway but just curious)

The proper fix is for OpenJDK to produce proper debug info for the
JITted module (though I do not know the details of GDB JIT modules).
Anything else is some sort of workaround.


Jan


* Re: Is nexti confused by pushq?
  2019-02-26 11:50       ` David Griffiths
  2019-02-26 11:58         ` Jan Kratochvil
@ 2019-02-26 14:19         ` Dmitry Samersoff
  2019-02-26 14:42           ` David Griffiths
  2019-02-26 19:05         ` Tom Tromey
  2 siblings, 1 reply; 11+ messages in thread
From: Dmitry Samersoff @ 2019-02-26 14:19 UTC (permalink / raw)
  To: David Griffiths, Jan Kratochvil; +Cc: Andrew Burgess, dwk, GDB

David,

On 26.02.2019 14:50, David Griffiths wrote:
> Ok, so in my case this is generated code with no debug info (Java JIT
> generated) so does that mean I shouldn't attempt to use nexti? (I've got
> other issues which probably preclude using nexti anyway but just curious)

In my experience with code produced by the Java JIT (C2), it's better to
avoid using nexti.

If you do it programmatically, you can try to mimic nexti's behavior in
some cases by analyzing the instructions ahead and setting a breakpoint
where appropriate.
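
A minimal sketch of that idea, using GDB's Python API, might look like
the following.  call_fallthrough and emulated_nexti are hypothetical
names, the prefix list is an assumption about what the x86 disassembler
prints, and only the simplest case is handled: if the current
instruction is a call, put a temporary breakpoint on the instruction
after it and continue; otherwise just stepi.

```python
try:
    import gdb  # only importable when running inside GDB
except ImportError:
    gdb = None

# Instruction prefixes that GDB may print before the mnemonic (assumed list).
PREFIXES = {"lock", "rep", "repz", "repnz", "data16", "addr32", "bnd", "notrack"}


def call_fallthrough(insn):
    """Return the address following `insn` if it is a call, else None.

    `insn` is a dict with 'addr', 'asm' and 'length' keys, the shape
    returned by gdb.Architecture.disassemble().
    """
    for token in insn["asm"].split():
        if token in PREFIXES or token.startswith("rex"):
            continue  # skip prefixes such as "rex.W"
        if token.startswith("call") or token == "lcall":
            return insn["addr"] + insn["length"]
        return None  # first real mnemonic is not a call
    return None


def emulated_nexti():
    """Step one instruction, running calls to completion (x86 only)."""
    frame = gdb.selected_frame()
    insn = frame.architecture().disassemble(frame.pc(), count=1)[0]
    target = call_fallthrough(insn)
    if target is None:
        gdb.execute("stepi")
    else:
        gdb.Breakpoint("*%#x" % target, temporary=True)
        gdb.execute("continue")
```

This does not depend on frame-ids at all, which is why a push or pop
cannot confuse it.  The price is that it is weaker than a real nexti:
the breakpoint can be hit by a recursive invocation or another thread,
and it is never hit if the callee does not return to that address.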

-Dmitry


* Re: Is nexti confused by pushq?
  2019-02-26 14:19         ` Dmitry Samersoff
@ 2019-02-26 14:42           ` David Griffiths
  0 siblings, 0 replies; 11+ messages in thread
From: David Griffiths @ 2019-02-26 14:42 UTC (permalink / raw)
  To: Dmitry Samersoff; +Cc: Jan Kratochvil, Andrew Burgess, dwk, GDB

Thanks Dmitry, I will avoid nexti. It's pretty weird stepping through JITed
code anyway. Sometimes even a breakpoint/continue is not enough, because it
dives off into deopt functions and re-emerges in the interpreter!


* Re: Is nexti confused by pushq?
  2019-02-26 11:50       ` David Griffiths
  2019-02-26 11:58         ` Jan Kratochvil
  2019-02-26 14:19         ` Dmitry Samersoff
@ 2019-02-26 19:05         ` Tom Tromey
  2019-02-27  7:59           ` Dmitry Samersoff
  2 siblings, 1 reply; 11+ messages in thread
From: Tom Tromey @ 2019-02-26 19:05 UTC (permalink / raw)
  To: David Griffiths; +Cc: Jan Kratochvil, Andrew Burgess, dwk, GDB

>>>>> "David" == David Griffiths <dgriffiths@undo.io> writes:

David> Ok, so in my case this is generated code with no debug info (Java JIT
David> generated) so does that mean I shouldn't attempt to use nexti? (I've got
David> other issues which probably preclude using nexti anyway but just curious)

There are a few options to deal with this sort of problem.

As Jan said, the JIT could generate debug info using one of the
gdb-provided JIT interfaces.  That's kind of heavyweight but gives a lot
of control.

Another option is to write an unwinder in Python.  The crucial thing
here is to ensure that the frame ID is constant for the duration of a
frame.  In DWARF this is done by using the CFA as part of the identity;
for the JIT you'd want to do something similar.  I thought there was
already such an unwinder for OpenJDK at least... ?
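
As a rough illustration of the Python option, here is a sketch built on
the gdb.unwinder API.  It rests on assumptions that may not hold for
real JIT output: that the code keeps %rbp as a conventional frame
pointer, and that the code cache occupies the made-up range
JIT_START..JIT_END.  frame_layout, FrameId and JitFramePointerUnwinder
are hypothetical names.  The point is only that the frame ID is derived
from %rbp, not from $sp, so it stays constant across push and pop.

```python
try:
    import gdb
    from gdb.unwinder import Unwinder, register_unwinder
except ImportError:  # not running inside GDB; only the pure helpers work
    gdb = None

# HYPOTHETICAL bounds of the JIT code cache; a real script would have to
# obtain these from the JVM.
JIT_START = 0x00007FFFE1000000
JIT_END = 0x00007FFFE2000000


def frame_layout(rbp):
    """Return (cfa, return_address_slot) for a conventional %rbp frame.

    ASSUMPTION: the code ran 'push %rbp; mov %rsp,%rbp' on entry, so the
    saved %rbp is at rbp, the return address at rbp+8, and the CFA (the
    caller's $sp just before the call) is rbp+16.
    """
    return rbp + 16, rbp + 8


class FrameId:
    """Object with the 'sp' and 'pc' attributes GDB expects of a frame ID."""

    def __init__(self, sp, pc):
        self.sp = sp
        self.pc = pc


if gdb is not None:

    class JitFramePointerUnwinder(Unwinder):
        def __init__(self):
            super().__init__("jit-frame-pointer")

        def __call__(self, pending_frame):
            pc = pending_frame.read_register("rip")
            if not JIT_START <= int(pc) < JIT_END:
                return None  # not ours; let other unwinders try
            rbp = pending_frame.read_register("rbp")
            cfa, _ = frame_layout(int(rbp))
            slots = rbp.cast(rbp.type.pointer())  # view the frame as words
            # The ID depends only on %rbp, so push/pop cannot change it.
            # JIT_START stands in for the unknown function entry address.
            frame_id = FrameId(gdb.Value(cfa).cast(rbp.type),
                               gdb.Value(JIT_START).cast(pc.type))
            info = pending_frame.create_unwind_info(frame_id)
            info.add_saved_register("rbp", slots.dereference())
            info.add_saved_register("rip", (slots + 1).dereference())
            info.add_saved_register("rsp", gdb.Value(cfa).cast(rbp.type))
            return info

    register_unwinder(None, JitFramePointerUnwinder(), replace=True)
```

Sourcing the file in GDB registers the unwinder globally.  It would
still give a wrong ID in a prologue or epilogue, before %rbp has been
set up or after it has been restored, so it is a starting point at best.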

Tom


* Re: Is nexti confused by pushq?
  2019-02-26 19:05         ` Tom Tromey
@ 2019-02-27  7:59           ` Dmitry Samersoff
  2019-02-27 14:53             ` Tom Tromey
  0 siblings, 1 reply; 11+ messages in thread
From: Dmitry Samersoff @ 2019-02-27  7:59 UTC (permalink / raw)
  To: Tom Tromey, David Griffiths; +Cc: Jan Kratochvil, Andrew Burgess, dwk, GDB

Tom,

> for the JIT you'd want to do something similar.  I thought there was
> already such an unwinder for OpenJDK at least... ?

Yes, the JDK has several kinds of unwinders, but unfortunately porting
them to Python is problematic.

It reminds me of an old discussion about a native plugin interface for gdb.

-Dmitry


* Re: Is nexti confused by pushq?
  2019-02-27  7:59           ` Dmitry Samersoff
@ 2019-02-27 14:53             ` Tom Tromey
  0 siblings, 0 replies; 11+ messages in thread
From: Tom Tromey @ 2019-02-27 14:53 UTC (permalink / raw)
  To: Dmitry Samersoff
  Cc: Tom Tromey, David Griffiths, Jan Kratochvil, Andrew Burgess, dwk, GDB

>> for the JIT you'd want to do something similar.  I thought there was
>> already such an unwinder for OpenJDK at least... ?

Dmitry> Yes, the JDK has several kinds of unwinders, but unfortunately porting
Dmitry> them to Python is problematic.

Dmitry> It reminds me of an old discussion about a native plugin interface for gdb.

I was referring to this:

http://mail.openjdk.java.net/pipermail/jdk9-dev/2016-May/004379.html

Tom
