how to fix internal errors on connection to remote stub?

public inbox for gdb@sourceware.org

* how to fix internal errors on connection to remote stub?
@ 2015-01-23 18:07 Sandra Loosemore
  2015-01-23 19:37 ` Paul Koning
  0 siblings, 1 reply; 3+ messages in thread
From: Sandra Loosemore @ 2015-01-23 18:07 UTC
  To: gdb; +Cc: Luis Machado

We have a GDB stub we use to interface to various hardware probes.  In 
GDB we normally run programs on the target using a series of commands like

(gdb) target remote ....
(gdb) load
(gdb) c

This works about 99.9% of the time.  The other 0.1%, we get "a problem 
internal to GDB has been detected" from the "target remote" command, 
because it is trying to ask the target for a backtrace *before* it has 
loaded the program.  At that point, the contents of memory and registers 
on the target have no relation to the program gdb is trying to debug, 
and GDB gets mighty confused.  Maybe this isn't so awful for manual use, 
but it certainly screws up automated testing.
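
For automated runs, one way to keep this failure from being counted as an ordinary test failure is to have the harness look for the internal-error banner in GDB's output.  The sketch below is only an illustration: the helper names, the `gdb`, `program` and `remote` parameters, and the injectable `run` hook are assumptions made for the example; only `-batch` and `-ex` are real GDB options.

```python
import subprocess

# Substring of the banner GDB prints on an internal error (matched case-insensitively).
INTERNAL_ERROR_BANNER = "problem internal to gdb has been detected"


def build_gdb_command(gdb, program, remote):
    """Batch-mode command line for the connect / load / continue sequence."""
    return [
        gdb, "-batch",
        "-ex", f"target remote {remote}",
        "-ex", "load",
        "-ex", "c",
        program,
    ]


def classify_output(output):
    """Return 'gdb-internal-error' if the banner appears, else 'ok'."""
    if INTERNAL_ERROR_BANNER in output.lower():
        return "gdb-internal-error"
    return "ok"


def run_session(gdb, program, remote, run=subprocess.run):
    """Run one session; `run` is injectable so the logic can be exercised offline."""
    result = run(build_gdb_command(gdb, program, remote),
                 capture_output=True, text=True)
    output = result.stdout + result.stderr
    return classify_output(output), output
```

A harness could then rerun or set aside sessions flagged as `gdb-internal-error`.  This only works around the symptom; it does not address the underlying problem.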

Luis and I have been discussing this and we both think the most likely 
solution is for the stub to answer the initial '?' packet with some 
response that indicates it's in an inconsistent state, rather than the 
"S00" it is sending now, and have GDB recognize it and at least suppress 
the initial backtrace, and perhaps also set a flag disallowing other 
commands to view target state (variable values, etc) until it's cleared 
by the "load" command.  Maybe the 'W' stop reply could be used for this 
purpose, or maybe we need to introduce a new one?
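
For readers less familiar with the remote protocol, here is roughly what the exchange under discussion looks like on the wire.  Each packet is sent as `$payload#xx`, where `xx` is the sum of the payload bytes modulo 256, written as two hex digits.  The sketch below frames the '?' query and the two candidate replies; the function names are made up for illustration, and it deliberately ignores escaping, run-length encoding and acknowledgements.

```python
def checksum(payload: str) -> int:
    """Sum of the payload bytes, modulo 256."""
    return sum(payload.encode("ascii")) % 256


def frame(payload: str) -> str:
    """Wrap a payload as $payload#xx with a two-digit hex checksum."""
    return f"${payload}#{checksum(payload):02x}"


def unframe(packet: str) -> str:
    """Check the framing and checksum, then return the bare payload."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if int(packet[-2:], 16) != checksum(payload):
        raise ValueError("bad checksum")
    return payload


print(frame("?"))    # query sent by GDB on connection
print(frame("S00"))  # what the stub answers today
print(frame("W00"))  # the 'W' reply being considered
```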

Alternatively, maybe we could have a separate GDB command that can be 
issued before the "target remote" to suppress the auto-backtrace?

Any other ideas?  What do other folks think is the right solution?  We'd 
much rather fix this in a way that's acceptable for mainline rather than 
having to carry a patch locally.

-Sandra

* Re: how to fix internal errors on connection to remote stub?
  2015-01-23 18:07 how to fix internal errors on connection to remote stub? Sandra Loosemore
@ 2015-01-23 19:37 ` Paul Koning
  2015-01-25  5:54   ` Sandra Loosemore
  0 siblings, 1 reply; 3+ messages in thread
From: Paul Koning @ 2015-01-23 19:37 UTC
  To: Sandra Loosemore; +Cc: gdb, Luis Machado

> On Jan 23, 2015, at 12:47 PM, Sandra Loosemore <sandra@codesourcery.com> wrote:
> 
> We have a GDB stub we use to interface to various hardware probes.  In GDB we normally run programs on the target using a series of commands like
> 
> (gdb) target remote ....
> (gdb) load
> (gdb) c
> 
> This works about 99.9% of the time.  The other 0.1%, we get "a problem internal to GDB has been detected" from the "target remote" command, because it is trying to ask the target for a backtrace *before* it has loaded the program.  At that point, the contents of memory and registers on the target have no relation to the program gdb is trying to debug, and GDB gets mighty confused.  Maybe this isn't so awful for manual use, but it certainly screws up automated testing.
> 
> Luis and I have been discussing this and we both think the most likely solution is for the stub to answer the initial '?' packet with some response that indicates it's in an inconsistent state, rather than the "S00" it is sending now, and have GDB recognize it and at least suppress the initial backtrace, and perhaps also set a flag disallowing other commands to view target state (variable values, etc) until it's cleared by the "load" command.  Maybe the 'W' stop reply could be used for this purpose, or maybe we need to introduce a new one?
> 
> Alternatively, maybe we could have a separate GDB command that can be issued before the "target remote" to suppress the auto-backtrace?
> 
> Any other ideas?  What do other folks think is the right solution?  We'd much rather fix this in a way that's acceptable for mainline rather than having to carry a patch locally.

If gdbserver is sending something that confuses gdb, the default answer is that this is a gdb bug (it should not fall over) and possibly in addition a gdbserver bug (it should obey the protocol spec).   The reason I say “default answer” is because of the standard distributed systems rule that it’s always your bug if a received packet causes you to malfunction; the fact that the packet was invalid is not an excuse.

You said that the stub is in an “inconsistent state”.  I’m not sure about that.  The target is stopped by the initial connection, and at that point you have a target thread, it’s stopped, it has registers, so it’s in some state that can be reported.  Yes, that state has no connection to the program GDB knows about, because it’s not in the target yet.  So the target might be in some boot loader or other bit of skeleton code, but it’s obviously executing something.  So I don’t think “inconsistent” applies from the gdbserver point of view.  

Instead, it seems that gdb, when it queries gdbserver for the stopped inferior state, gets back stuff that doesn’t fit in the program it’s been told about.  But so what?  That can happen in other places for other reasons, and gdb usually handles that just fine.  Consider the “heuristic fencepost” machinery that protects from wild backtraces.  So it seems that we just have some gaps in gdb’s robustness, and those are bugs that should be fixed.

New commands or new protocol mechanisms don’t seem like the right answer; it’s not the user’s job to work around gdb bugs, nor is it gdbserver’s job to know that it is out of sync with gdb.

	paul

* Re: how to fix internal errors on connection to remote stub?
  2015-01-23 19:37 ` Paul Koning
@ 2015-01-25  5:54   ` Sandra Loosemore
  0 siblings, 0 replies; 3+ messages in thread
From: Sandra Loosemore @ 2015-01-25  5:54 UTC
  To: Paul Koning; +Cc: gdb, Luis Machado

On 01/23/2015 11:07 AM, Paul Koning wrote:
>
> If gdbserver is sending something that confuses gdb, the default
> answer is that this is a gdb bug (it should not fall over) and
> possibly in addition a gdbserver bug (it should obey the protocol
> spec).   The reason I say “default answer” is because of the standard
> distributed systems rule that it’s always your bug if a received
> packet causes you to malfunction; the fact that the packet was
> invalid is not an excuse.
>
> You said that the stub is in an “inconsistent state”.  I’m not sure
> about that.  The target is stopped by the initial connection, and at
> that point you have a target thread, it’s stopped, it has registers,
> so it’s in some state that can be reported.  Yes, that state has no
> connection to the program GDB knows about, because it’s not in the
> target yet.  So the target might be in some boot loader or other bit
> of skeleton code, but it’s obviously executing something.  So I don’t
> think “inconsistent” applies from the gdbserver point of view.

Hmmm, I'm not so sure about this.  In the situations where we have been 
hitting this problem, a more exact description of what is going on in 
the stub is this:  it previously completed normal execution of some 
other program in a different gdb instance and sent a 'W' packet.  When a 
new gdb instance reconnects to the stub, the target is still sitting 
stopped at the semihosting breakpoint that triggered the 'W' packet. 
That's why I'm wondering whether the response it should be giving to the 
initial '?' packet on the new connection should be 'W' ("the program has 
exited and has no meaningful state any more") instead of 'S' ("the 
program is stopped").  But GDB only accepts a 'W' reply to '?' in 
extended-remote mode, which isn't supported by this stub.
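
To make the choice concrete, here is a minimal sketch of how a stub might pick its answer to the initial '?'.  It is not code from our stub: the function name and its parameters (`program_exited`, `exit_status`, `extended_mode`, `last_signal`) are hypothetical, and it assumes the stub somehow knows whether the connection is in extended-remote mode.  The reply formats themselves ('S' plus a two-digit hex signal number, 'W' plus a two-digit hex exit status) are the standard stop replies.

```python
def initial_stop_reply(program_exited: bool,
                       exit_status: int = 0,
                       extended_mode: bool = False,
                       last_signal: int = 0) -> str:
    """Choose the payload sent in answer to the initial '?' packet.

    All parameters are hypothetical stub state, named for illustration only.
    """
    if program_exited and extended_mode:
        # "W AA": the program exited with status AA and has no live state.
        return f"W{exit_status & 0xFF:02x}"
    # "S AA": the program is stopped, last signal AA.  This is also the
    # fallback in plain remote mode, where a 'W' reply to '?' is not accepted.
    return f"S{last_signal & 0xFF:02x}"
```

In plain remote mode this still degrades to "S00" after an exit, which is exactly the case that is causing trouble.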

> Instead, it seems that gdb, when it queries gdbserver for the stopped
> inferior state, gets back stuff that doesn’t fit in the program it’s
> been told about.  But so what?  That can happen in other places for
> other reasons, and gdb usually handles that just fine.  Consider the
> “heuristic fencepost” machinery that protects from wild backtraces.
> So it seems that we just have some gaps in gdb’s robustness, and
> those are bugs that should be fixed.
>
> New commands or new protocol mechanisms don’t seem like the right
> answer; it’s not the user’s job to work around gdb bugs, nor is it
> gdbserver’s job to know that it is out of sync with gdb.

It does seem like GDB could do a better job here of checking that the 
code in target memory (e.g., the instruction at the reported PC) matches 
what's expected from the program it's trying to debug.  While fixing 
that might make these errors less likely, it wouldn't be as foolproof as 
the stub simply telling GDB that it definitely has no useful program 
state to report yet.
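
As a rough illustration of the kind of check meant here (not anything GDB actually does), the debugger side could read a few bytes at the reported PC with an 'm' packet and compare them with the bytes the program image has at that address.  The helper names and the `expected` argument are assumptions for the example; the packet shapes ("m addr,length" in hex, a hex-encoded reply, "E NN" on error) follow the remote protocol.

```python
def read_memory_request(addr: int, length: int) -> str:
    """Payload of an 'm' packet: read `length` bytes starting at `addr`."""
    return f"m{addr:x},{length:x}"


def decode_memory_reply(reply: str):
    """Return the bytes in a memory reply, or None for an error or garbage."""
    if len(reply) == 3 and reply.startswith("E"):
        return None  # "E NN" error reply
    try:
        return bytes.fromhex(reply)
    except ValueError:
        return None  # not valid hex


def target_matches_image(reply: str, expected: bytes) -> bool:
    """True only if target memory holds exactly the bytes the image expects."""
    return decode_memory_reply(reply) == expected
```

A match can still be a coincidence, for instance stale code left behind by an earlier load of the same program, so a check like this lowers the odds of confusion without removing them.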

-Sandra
