public inbox for archer@sourceware.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: archer@sourceware.org, utrace-devel@redhat.com
Subject: BUG: gdb && notification packets (Was: gdbstub initial code, v12)
Date: Tue, 05 Oct 2010 17:31:00 -0000	[thread overview]
Message-ID: <20101005172729.GA27882@redhat.com> (raw)
In-Reply-To: <20101004181053.GA30217@redhat.com>

On 10/04, Oleg Nesterov wrote:
>
> While trying to understand what does this mean, I hit another bug
> in ugdb.

No, /usr/bin/gdb is buggy.

> A multithreaded tracee can "hang" if gdb simulates
> watchpoints with single-steps + mem-fetch/compare.

It doesn't, but gdb "forgets" about the pending "Stop:" notification,
see below.

> This never
> happens if the tracee is single-threaded.

This is clear. And this doesn't happen with the real gdbbserver.
Probably timing issues, with ugdb the tracee can report itself
as stopped much faster (not because ugdb is faster, but because
it sends the notification before the tracee actually stops).


Consider this simplified artifitial test-case:

	#include <stdio.h>
	#include <unistd.h>
	#include <pthread.h>

	struct {
		char x[128];
	} VAR;

	void *tfunc(void *arg)
	{
		for (;;)
			;
	}

	int main(void)
	{
		pthread_t thr;

		printf("pid: %d\n", getpid());

		pthread_create(&thr, NULL, tfunc, NULL);

		tfunc(NULL);

		return 0;
	}

It just creates 2 threads, each threads spins in the trivial endless loop.
VAR is the dummy variable for "watch", it is never changed. It should be big
enough to ensure can_use_hardware_watchpoint->default_region_ok_for_hw_watchpoint()
returns 0 and gdb falls back to simulating.

	(gdb) set target-async on
	(gdb) set non-stop
	(gdb) target extended-remote :2000
	(gdb) attach 1404
	(gdb) watch VAR
	(gdb) c -a

After that, sooner or later the tracee "hangs". gdb sleeps in poll(), both
traced threads sleep after reporting SIGTRAP (single-step).

The more or less "typical" transcript is:

	[... snip ...]
	=> s
	<= OK
	<= %Stop:T05thread:p57c.57d;
	=> vStopped
	<= T05thread:p57c.57c;
	=> vStopped
	<= OK
	=> g
	<= 50994242000000000000000000000000761f0a16677f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> g
	<= 50994242000000000000000000000000761f0a16677f000000000000000000...
	=> s
	<= OK
	<= %Stop:T05thread:p57c.57d;
	=> Hgp57c.57c
	<= OK
	=> g
	<= 000000000000000030064000000000008162e115677f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> Hcp57c.57c
	<= OK
	=> s#73$vStopped
	<= OK#9a$OK
	<= %Stop:T05thread:p57c.57c;
	=> Hgp57c.57d#ec$g
	<= OK#9a$50994242000000000000000000000000761f0a16677f000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> g
	<= 50994242000000000000000000000000761f0a16677f000000000000000000...
	=> Hcp57c.57d
	<= OK
	=> s#73$vStopped
	<= OK#9a$OK
	<= %Stop:T05thread:p57c.57d;
	=> Hgp57c.57c
	<= OK
	=> g
	<= 000000000000000030064000000000008162e115677f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> Hcp57c.57c
	<= OK
	=> s#73$vStopped
	<= OK#9a$OK

Everything is fine so far. And this is how it works with the real gdbserver:
gdb always issues vStopped until it gets the final "OK". Like it should,
I guess.

So. At this point gdb knows that 57d is stopped, and it sends '$s' to 57c.

	<= %Stop:T05thread:p57c.57c;

57c reports the next stop, but gdb temporary "ignores" this notification
and plays with 57d. This is fine.

	=> Hgp57c.57d#ec$g
	<= OK#9a$50994242000000000000000000000000761f0a16677f000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> g
	<= 50994242000000000000000000000000761f0a16677f000000000000000000...
	=> Hcp57c.57d
	<= OK
	=> s#73$vStopped

Finally it does vStopped. But, by this time 57d is stopped again (note
that gdb sends '$s' before vStopped).

	<= OK#9a$T05thread:p57c.57d;

ugdb reports 57d as stopped.

_If_ gdb sent vStopped right now, he would get "OK" in return. But it
doesn't. Again, this is not the bug yet.

	=> Hgp57c.57c
	<= OK
	=> g
	<= 000000000000000030064000000000008162e115677f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> Hcp57c.57c
	<= OK
	=> s
	<= OK

Now, 57c stops. But it can't report this stop. We are "inside" the
"vStopped" sequence, ugdb waits for gdb which should complete it
by sending the next vStopped packet which should find that 57c is
stopped.

From the documentation:

	If the stub receives a `vStopped' packet and there are no additional
	stop events to report, the stub shall return an `OK' response.  At this
	point, if further stop events occur, the stub shall send a new stop
	reply notification,

IOW, 57c should not send the notification and it doesn't. But this confuses
gdb.

In short: gdb forgets that the last 'vStopped' didn't return 'OK' and
thus we probably have more stopped tracees but waits for the next
notification. ugdb doesn't send the notification because it shouldn't
do this until gdb sends the "final" vStopped according to the doc
above.

Another similar transcript:

	[... snip ...]
	=> Hcp589.589
	<= OK
	=> s
	<= OK#9a%Stop:T05thread:p589.589;
	=> vStopped
	<= OK
	=> g
	<= 0000000000000000300640000000000081621fd01f7f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> s
	<= OK#9a%Stop:T05thread:p589.589;
	=> vStopped
	<= OK
	=> g
	<= 0000000000000000300640000000000081621fd01f7f000000000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> s#73$vStopped
	<= OK#9a$T05thread:p589.589;
	=> Hgp589.58a#c1$g
	<= OK#9a$50990141000000000000000000000000761f48d01f7f000000000000...
	=> m600a60,80
	<= 00000000000000000000000000000000000000000000000000000000000000...
	=> g
	<= 50990141000000000000000000000000761f48d01f7f000000000000000000...
	=> Hcp589.58a
	<= OK
	=> s
	<= OK

Almost the same.


Once again, whatever I did I failed to reproduce this hang with the
real gdbserver.

So I hacked gdb to send the similar sequence of packets instead, and
result is the same:

	getpkt ("Hcp589.589");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]
	getpkt ("s");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]
	pending stop replies: 1
	putpkt ("%Stop:T0506:f089f2deff7f0* ;07:f089f2deff7f0* ;10:d40540*';thread:p589.589;core:1;#94"); [notif]

	getpkt ("Hcp589.58a");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]

	getpkt ("s");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]
	pending stop replies: 2

	getpkt ("vStopped");  [no ack sent]
	vStopped: acking LWP 1417.1417
	putpkt ("$T0506:309101410*"00;07:309101410*"00;10:d40540*';thread:p589.58a;core:1;#24"); [noack mode]
	getpkt ("Hcp589.589");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]

	getpkt ("s");  [no ack sent]
	putpkt ("$OK#9a"); [noack mode]
	pending stop replies: 2

Like ugdb, gdbserver does NOT send the new notification, but waits for gdb
which should continue and complete the vStopped sequence.

So, I strongly believe gdb is buggy and should be fixed.

Oleg.

  reply	other threads:[~2010-10-05 17:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-30 18:27 gdbstub initial code, v12 Oleg Nesterov
2010-10-04 18:14 ` Oleg Nesterov
2010-10-05 17:31   ` Oleg Nesterov [this message]
2010-10-05 18:30     ` BUG: gdb && notification packets (Was: gdbstub initial code, v12) Pedro Alves
2010-10-05 18:35       ` Pedro Alves
2010-10-06 17:23       ` Oleg Nesterov
2010-10-06 18:32         ` Pedro Alves
2010-10-07 23:03           ` Oleg Nesterov
2010-10-08  0:03             ` Pedro Alves
2010-10-08  0:23               ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101005172729.GA27882@redhat.com \
    --to=oleg@redhat.com \
    --cc=archer@sourceware.org \
    --cc=utrace-devel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).