* [PATCH] Fix crash of gdbserver when kill threads
@ 2014-06-23 10:10 Hui Zhu
2014-06-29 3:28 ` Hui Zhu
0 siblings, 1 reply; 10+ messages in thread
From: Hui Zhu @ 2014-06-23 10:10 UTC (permalink / raw)
To: gdb-patches ml
gdbserver :1234 gdb.base/watch_thread_num
gdb gdb.base/watch_thread_num
(gdb) b 48
Breakpoint 1 at 0x400737: file ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c, line 48.
(gdb) c
Continuing.
Breakpoint 1, main () at ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c:48
48 thread_result = thread_function ((void *) i);
(gdb) k
Kill the program being debugged? (y or n) y
gdbserver :1234 gdb.base/watch_thread_num
Process gdb.base/watch_thread_num created; pid = 9719
Listening on port 1234
Remote debugging from host 127.0.0.1
Killing all inferiors
Segmentation fault (core dumped)
Backtrace:
(gdb) bt
#0 find_inferior (list=<optimized out>, func=func@entry=0x423990 <kill_one_lwp_callback>, arg=arg@entry=0x7fffe97405dc)
at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:199
#1 0x0000000000425bff in linux_kill (pid=10130) at ../../../binutils-gdb/gdb/gdbserver/linux-low.c:966
#2 0x000000000040ae8c in kill_inferior_callback (entry=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:2934
#3 0x0000000000405c61 in for_each_inferior (list=<optimized out>, action=action@entry=0x40ae60 <kill_inferior_callback>)
at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:57
#4 0x000000000040d5e2 in process_serial_event () at ../../../binutils-gdb/gdb/gdbserver/server.c:3767
#5 handle_serial_event (err=<optimized out>, client_data=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:3880
#6 0x0000000000412cda in handle_file_event (event_file_desc=event_file_desc@entry=4)
at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:434
#7 0x000000000041357a in process_event () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:189
#8 start_event_loop () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:552
#9 0x0000000000403088 in main (argc=3, argv=0x7fffe9740938) at ../../../binutils-gdb/gdb/gdbserver/server.c:3283
The cause of this issue is when linux_kill call "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
to kill all the lwp of pid.
In linux_wait_for_event, it will delete_lwp any lwp in all_threads if it
get exit event of it. Then it make find_inferior crash.
I make a patch that let kill_one_lwp_callback return 1, then after
linux_wait_for_event is called(Maybe all_threads is changed), find_inferior
will return.
And change call "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
to be a loop. It will stop when all_threads doesn't have any lwp is pid.
It pass regression test in x86_64 Linux.
Thanks,
Hui
2014-06-23 Hui Zhu <hui@codesourcery.com>
* linux-low.c (kill_one_lwp_callback): Change last return to 1.
(linux_kill): Call find_inferior with a loop.
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
} while (pid > 0 && WIFSTOPPED (wstat));
- return 0;
+ /* Let find_inferior return because maybe other lwp in the list will
+ be deleted by delete_lwp. */
+ return 1;
}
static int
@@ -963,7 +965,9 @@ linux_kill (int pid)
first, as PTRACE_KILL will not work otherwise. */
stop_all_lwps (0, NULL);
- find_inferior (&all_threads, kill_one_lwp_callback , &pid);
+ /* Keep call kill_one_lwp_callback until find_inferior cannot find any
+ lwps that is for pid. */
+ while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);
/* See the comment in linux_kill_one_lwp. We did not kill the first
thread in the list, so do so now. */
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Fix crash of gdbserver when kill threads
2014-06-23 10:10 [PATCH] Fix crash of gdbserver when kill threads Hui Zhu
@ 2014-06-29 3:28 ` Hui Zhu
2014-07-02 9:07 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Hui Zhu @ 2014-06-29 3:28 UTC (permalink / raw)
To: Pedro Alves, gdb-patches ml
Hi Pedro,
This issue is introduced by the patches for PR 12702 that was pushed by you.
Maybe you could take a look on this patch.
Thanks,
Hui
On 06/23/14 18:09, Hui Zhu wrote:
> gdbserver :1234 gdb.base/watch_thread_num
> gdb gdb.base/watch_thread_num
> (gdb) b 48
> Breakpoint 1 at 0x400737: file ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c, line 48.
> (gdb) c
> Continuing.
>
> Breakpoint 1, main () at ../../../binutils-gdb/gdb/testsuite/gdb.base/watch_thread_num.c:48
> 48 thread_result = thread_function ((void *) i);
> (gdb) k
> Kill the program being debugged? (y or n) y
> gdbserver :1234 gdb.base/watch_thread_num
> Process gdb.base/watch_thread_num created; pid = 9719
> Listening on port 1234
> Remote debugging from host 127.0.0.1
> Killing all inferiors
> Segmentation fault (core dumped)
>
> Backtrace:
> (gdb) bt
> #0 find_inferior (list=<optimized out>, func=func@entry=0x423990 <kill_one_lwp_callback>, arg=arg@entry=0x7fffe97405dc)
> at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:199
> #1 0x0000000000425bff in linux_kill (pid=10130) at ../../../binutils-gdb/gdb/gdbserver/linux-low.c:966
> #2 0x000000000040ae8c in kill_inferior_callback (entry=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:2934
> #3 0x0000000000405c61 in for_each_inferior (list=<optimized out>, action=action@entry=0x40ae60 <kill_inferior_callback>)
> at ../../../binutils-gdb/gdb/gdbserver/inferiors.c:57
> #4 0x000000000040d5e2 in process_serial_event () at ../../../binutils-gdb/gdb/gdbserver/server.c:3767
> #5 handle_serial_event (err=<optimized out>, client_data=<optimized out>) at ../../../binutils-gdb/gdb/gdbserver/server.c:3880
> #6 0x0000000000412cda in handle_file_event (event_file_desc=event_file_desc@entry=4)
> at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:434
> #7 0x000000000041357a in process_event () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:189
> #8 start_event_loop () at ../../../binutils-gdb/gdb/gdbserver/event-loop.c:552
> #9 0x0000000000403088 in main (argc=3, argv=0x7fffe9740938) at ../../../binutils-gdb/gdb/gdbserver/server.c:3283
>
> The cause of this issue is when linux_kill call "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
> to kill all the lwp of pid.
> In linux_wait_for_event, it will delete_lwp any lwp in all_threads if it
> get exit event of it. Then it make find_inferior crash.
>
> I make a patch that let kill_one_lwp_callback return 1, then after
> linux_wait_for_event is called(Maybe all_threads is changed), find_inferior
> will return.
> And change call "find_inferior (&all_threads, kill_one_lwp_callback , &pid)"
> to be a loop. It will stop when all_threads doesn't have any lwp is pid.
>
> It pass regression test in x86_64 Linux.
>
> Thanks,
> Hui
>
> 2014-06-23 Hui Zhu <hui@codesourcery.com>
>
> * linux-low.c (kill_one_lwp_callback): Change last return to 1.
> (linux_kill): Call find_inferior with a loop.
>
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
> pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
> } while (pid > 0 && WIFSTOPPED (wstat));
>
> - return 0;
> + /* Let find_inferior return because maybe other lwp in the list will
> + be deleted by delete_lwp. */
> + return 1;
> }
>
> static int
> @@ -963,7 +965,9 @@ linux_kill (int pid)
> first, as PTRACE_KILL will not work otherwise. */
> stop_all_lwps (0, NULL);
>
> - find_inferior (&all_threads, kill_one_lwp_callback , &pid);
> + /* Keep call kill_one_lwp_callback until find_inferior cannot find any
> + lwps that is for pid. */
> + while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);
>
> /* See the comment in linux_kill_one_lwp. We did not kill the first
> thread in the list, so do so now. */
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Fix crash of gdbserver when kill threads
2014-06-29 3:28 ` Hui Zhu
@ 2014-07-02 9:07 ` Pedro Alves
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2014-07-02 9:07 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml
On 06/29/2014 04:28 AM, Hui Zhu wrote:
> Hi Pedro,
>
> This issue is introduced by the patches for PR 12702 that was pushed by you.
> Maybe you could take a look on this patch.
Thanks. I'll need to think a bit about whether this is the right
solution.
I've added it to the 7.8 regression queue at:
https://sourceware.org/gdb/wiki/GDB_7.8_Release
I'll look better at it as soon as I have a chance.
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-02 9:07 ` Pedro Alves
@ 2014-07-10 15:17 ` Pedro Alves
2014-07-11 8:21 ` Hui Zhu
2015-07-13 16:07 ` Yao Qi
0 siblings, 2 replies; 10+ messages in thread
From: Pedro Alves @ 2014-07-10 15:17 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml
Hi Hui,
On 07/02/2014 10:07 AM, Pedro Alves wrote:
>> This issue is introduced by the patches for PR 12702 that was pushed by you.
>> Maybe you could take a look on this patch.
>
> Thanks. I'll need to think a bit about whether this is the right
> solution.
>
> I've added it to the 7.8 regression queue at:
>
> https://sourceware.org/gdb/wiki/GDB_7.8_Release
>
> I'll look better at it as soon as I have a chance.
>
First, it's really annoying that the testsuite doesn't catch this!
We're really missing such a basic test, so I went ahead and wrote one.
I was actually quite surprised that we didn't have any
test that makes sure "kill" works without complaints/spurious output.
However, unfortunately, the new test revealed that the fix isn't right. :-/
Sometimes the test passed, but other times (that'll depend on
machine, of course), I'd get:
kill
Kill the program being debugged? (y or n) y
Ignoring packet error, continuing...
(gdb) FAIL: gdb.threads/kill.exp: kill
That means GDBserver didn't reply to the vKill packet.
Attaching to GDBserver at this point shows that it's stuck forever
in the new loop:
+ /* Keep call kill_one_lwp_callback until find_inferior cannot find any
+ lwps that is for pid. */
+ while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);
The reason was that the leader lwp goes zombie before the other
lwps are reaped, and linux_wait_for_event would delete it. We'd be
left with e.g., only one lwp that is _not_ the leader.
That lwp had disappeared already (due to SIGKILL), but there's
no status to collect (waitpid would return -1). Nothing in linux-low.c
would notice the lwp is already gone. Because that lwp has lwp->stopped == 0,
linux_wait_for_event would return, here:
if ((find_inferior (&all_threads,
not_stopped_callback,
&wait_ptid) == NULL))
{
if (debug_threads)
debug_printf ("LLW: exit (no unwaited-for LWP)\n");
sigprocmask (SIG_SETMASK, &prev_mask, NULL);
return -1;
}
So this would return true:
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
} while (pid > 0 && WIFSTOPPED (wstat));
- return 0;
+ /* Let find_inferior return because maybe other lwp in the list will
+ be deleted by delete_lwp. */
+ return 1;
}
So the new loop would never break.
Here's a different patch that passes the test cleanly for me.
Let me know if it works for you.
8<-------------------
From ed3e42e5447e0276d6fc759a27408d06e9ccf0b7 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Thu, 10 Jul 2014 15:13:26 +0100
Subject: [PATCH] GDBserver crashes when killing a multi-thread process
Here's an example, with the new test:
gdbserver :9999 gdb.threads/kill
gdb gdb.threads/kill
(gdb) b 52
Breakpoint 1 at 0x4007f4: file kill.c, line 52.
Continuing.
Breakpoint 1, main () at kill.c:52
52 return 0; /* set break here */
(gdb) k
Kill the program being debugged? (y or n) y
gdbserver :9999 gdb.threads/kill
Process gdb.base/watch_thread_num created; pid = 9719
Listening on port 1234
Remote debugging from host 127.0.0.1
Killing all inferiors
Segmentation fault (core dumped)
Backtrace:
(gdb) bt
#0 0x00000000004068a0 in find_inferior (list=0x66b060 <all_threads>, func=0x427637 <kill_one_lwp_callback>, arg=0x7fffffffd3fc) at src/gdb/gdbserver/inferiors.c:199
#1 0x00000000004277b6 in linux_kill (pid=15708) at src/gdb/gdbserver/linux-low.c:966
#2 0x000000000041354d in kill_inferior (pid=15708) at src/gdb/gdbserver/target.c:163
#3 0x00000000004107e9 in kill_inferior_callback (entry=0x6704f0) at src/gdb/gdbserver/server.c:2934
#4 0x0000000000406522 in for_each_inferior (list=0x66b050 <all_processes>, action=0x4107a6 <kill_inferior_callback>) at src/gdb/gdbserver/inferiors.c:57
#5 0x0000000000412377 in process_serial_event () at src/gdb/gdbserver/server.c:3767
#6 0x000000000041267c in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:3880
#7 0x00000000004189ff in handle_file_event (event_file_desc=4) at src/gdb/gdbserver/event-loop.c:434
#8 0x00000000004181c6 in process_event () at src/gdb/gdbserver/event-loop.c:189
#9 0x0000000000418f45 in start_event_loop () at src/gdb/gdbserver/event-loop.c:552
#10 0x0000000000411272 in main (argc=3, argv=0x7fffffffd8d8) at src/gdb/gdbserver/server.c:3283
The problem is that linux_wait_for_event deletes lwps that have exited
(even those not passed in as lwps of interest), while the lwp/thread
list is being walked on with find_inferior. find_inferior can handle
the current iterated inferior being deleted, but not others.
When killing lwps, we don't really care about any of the pending
status handling of linux_wait_for_event. We can just waitpid the lwps
directly, which is also what GDB does (see
linux-nat.c:kill_wait_callback). This way the lwps are not deleted
while we're walking the list. They'll be deleted by linux_mourn
afterwards.
This crash triggers several times when running the testsuite against
GDBserver with the native-gdbserver board (target remote), but as GDB
can't distinguish between GDBserver crashing and "kill" being
sucessful, as in both cases the connection is closed (the 'k' packet
doesn't require a reply), and the inferior is gone, that results in no
FAIL.
The patch adds a generic test that catches the issue with
extended-remote mode (and works fine with native testing too). Here's
how it fails with the native-extended-gdbserver board without the fix:
(gdb) info threads
Id Target Id Frame
6 Thread 15367.15374 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
5 Thread 15367.15373 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
4 Thread 15367.15372 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
3 Thread 15367.15371 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
2 Thread 15367.15370 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
* 1 Thread 15367.15367 main () at .../gdb.threads/kill.c:52
(gdb) kill
Kill the program being debugged? (y or n) y
Remote connection closed
^^^^^^^^^^^^^^^^^^^^^^^^
(gdb) FAIL: gdb.threads/kill.exp: kill
Extended remote should remain connected after the kill.
gdb/gdbserver/
2014-07-10 Pedro Alves <palves@redhat.com>
* linux-low.c (kill_wait_lwp): New function, based on
kill_one_lwp_callback, but use my_waitpid directly.
(kill_one_lwp_callback, linux_kill): Use it.
gdb/testsuite/
2014-07-10 Pedro Alves <palves@redhat.com>
* gdb.threads/kill.c: New file.
* gdb.threads/kill.exp: New file.
---
gdb/gdbserver/linux-low.c | 68 ++++++++++++++++++++-------------
gdb/testsuite/gdb.threads/kill.c | 64 +++++++++++++++++++++++++++++++
gdb/testsuite/gdb.threads/kill.exp | 77 ++++++++++++++++++++++++++++++++++++++
3 files changed, 183 insertions(+), 26 deletions(-)
create mode 100644 gdb/testsuite/gdb.threads/kill.c
create mode 100644 gdb/testsuite/gdb.threads/kill.exp
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 61552f4..bdc9aaa 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -909,6 +909,46 @@ linux_kill_one_lwp (struct lwp_info *lwp)
errno ? strerror (errno) : "OK");
}
+/* Kill LWP and wait for it to die. */
+
+static void
+kill_wait_lwp (struct lwp_info *lwp)
+{
+ struct thread_info *thr = get_lwp_thread (lwp);
+ int pid = ptid_get_pid (ptid_of (thr));
+ int lwpid = ptid_get_lwp (ptid_of (thr));
+ int wstat;
+ int res;
+
+ if (debug_threads)
+ debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
+
+ do
+ {
+ linux_kill_one_lwp (lwp);
+
+ /* Make sure it died. Notes:
+
+ - The loop is most likely unnecessary.
+
+ - We don't use linux_wait_for_event as that could delete lwps
+ while we're iterating over them. We're not interested in
+ any pending status at this point, only in making sure all
+ wait status on the kernel side are collected until the
+ process is reaped.
+
+ - We don't use __WALL here as the __WALL emulation relies on
+ SIGCHLD, and killing a stopped process doesn't generate
+ one, nor an exit status.
+ */
+ res = my_waitpid (lwpid, &wstat, 0);
+ if (res == -1 && errno == ECHILD)
+ res = my_waitpid (lwpid, &wstat, __WCLONE);
+ } while (res > 0 && WIFSTOPPED (wstat));
+
+ gdb_assert (res > 0);
+}
+
/* Callback for `find_inferior'. Kills an lwp of a given process,
except the leader. */
@@ -917,7 +957,6 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
{
struct thread_info *thread = (struct thread_info *) entry;
struct lwp_info *lwp = get_thread_lwp (thread);
- int wstat;
int pid = * (int *) args;
if (ptid_get_pid (entry->id) != pid)
@@ -936,14 +975,7 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
return 0;
}
- do
- {
- linux_kill_one_lwp (lwp);
-
- /* Make sure it died. The loop is most likely unnecessary. */
- pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
- } while (pid > 0 && WIFSTOPPED (wstat));
-
+ kill_wait_lwp (lwp);
return 0;
}
@@ -952,8 +984,6 @@ linux_kill (int pid)
{
struct process_info *process;
struct lwp_info *lwp;
- int wstat;
- int lwpid;
process = find_process_pid (pid);
if (process == NULL)
@@ -976,21 +1006,7 @@ linux_kill (int pid)
pid);
}
else
- {
- struct thread_info *thr = get_lwp_thread (lwp);
-
- if (debug_threads)
- debug_printf ("lk_1: killing lwp %ld, for pid: %d\n",
- lwpid_of (thr), pid);
-
- do
- {
- linux_kill_one_lwp (lwp);
-
- /* Make sure it died. The loop is most likely unnecessary. */
- lwpid = linux_wait_for_event (thr->entry.id, &wstat, __WALL);
- } while (lwpid > 0 && WIFSTOPPED (wstat));
- }
+ kill_wait_lwp (lwp);
the_target->mourn (process);
diff --git a/gdb/testsuite/gdb.threads/kill.c b/gdb/testsuite/gdb.threads/kill.c
new file mode 100644
index 0000000..aefbb06
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/kill.c
@@ -0,0 +1,64 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+ Copyright 2014 Free Software Foundation, Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+#ifdef USE_THREADS
+
+#include <unistd.h>
+#include <pthread.h>
+
+#define NUM 5
+
+pthread_barrier_t barrier;
+
+void *
+thread_function (void *arg)
+{
+ volatile unsigned int counter = 1;
+
+ pthread_barrier_wait (&barrier);
+
+ while (counter > 0)
+ {
+ counter++;
+ usleep (1);
+ }
+
+ pthread_exit (NULL);
+}
+
+#endif /* USE_THREADS */
+
+void
+setup (void)
+{
+#ifdef USE_THREADS
+ pthread_t threads[NUM];
+ int i;
+
+ pthread_barrier_init (&barrier, NULL, NUM + 1);
+ for (i = 0; i < NUM; i++)
+ pthread_create (&threads[i], NULL, thread_function, NULL);
+ pthread_barrier_wait (&barrier);
+#endif /* USE_THREADS */
+}
+
+int
+main (void)
+{
+ setup ();
+ return 0; /* set break here */
+}
diff --git a/gdb/testsuite/gdb.threads/kill.exp b/gdb/testsuite/gdb.threads/kill.exp
new file mode 100644
index 0000000..a3f921f
--- /dev/null
+++ b/gdb/testsuite/gdb.threads/kill.exp
@@ -0,0 +1,77 @@
+# This testcase is part of GDB, the GNU debugger.
+
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+standard_testfile
+
+# Run the test proper. THREADED indicates whether to build a threaded
+# program and spawn several threads before trying to kill the program.
+
+proc test {threaded} {
+ global testfile srcfile
+
+ with_test_prefix [expr ($threaded)?"threaded":"non-threaded"] {
+
+ set options {debug}
+ if {$threaded} {
+ lappend options "pthreads"
+ lappend options "additional_flags=-DUSE_THREADS"
+ set prog ${testfile}_threads
+ } else {
+ set prog ${testfile}_nothreads
+ }
+
+ if {[prepare_for_testing "failed to prepare" $prog $srcfile $options] == -1} {
+ return -1
+ }
+
+ if { ![runto main] } then {
+ fail "run to main"
+ return
+ }
+
+ set linenum [gdb_get_line_number "set break here"]
+ gdb_breakpoint "$srcfile:$linenum"
+ gdb_continue_to_breakpoint "break here" ".*break here.*"
+
+ if {$threaded} {
+ gdb_test "info threads" "6.*5.*4.*3.*2.*1.*" "all threads started"
+ }
+
+ # This kills and ensures no output other than the prompt comes out,
+ # like:
+ #
+ # (gdb) kill
+ # Kill the program being debugged? (y or n) y
+ # (gdb)
+ #
+ # If we instead saw more output, like e.g., with an extended-remote
+ # connection:
+ #
+ # (gdb) kill
+ # Kill the program being debugged? (y or n) y
+ # Remote connection closed
+ # (gdb)
+ #
+ # the above would mean that the remote end crashed.
+
+ gdb_test "kill" "^y" "kill program" "Kill the program being debugged\\? \\(y or n\\) $" "y"
+ }
+}
+
+foreach threaded {true false} {
+ test $threaded
+}
--
1.9.3
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
@ 2014-07-11 8:21 ` Hui Zhu
2014-07-11 10:53 ` [PUSHED+7.8] " Pedro Alves
2015-07-13 16:07 ` Yao Qi
1 sibling, 1 reply; 10+ messages in thread
From: Hui Zhu @ 2014-07-11 8:21 UTC (permalink / raw)
To: Pedro Alves, gdb-patches ml
Hi Pedro,
Thanks for your patch. It fixed the issue in my part.
Best,
Hui
On 07/10/14 23:16, Pedro Alves wrote:
> Hi Hui,
>
> On 07/02/2014 10:07 AM, Pedro Alves wrote:
>>> This issue is introduced by the patches for PR 12702 that was pushed by you.
>>> Maybe you could take a look on this patch.
>>
>> Thanks. I'll need to think a bit about whether this is the right
>> solution.
>>
>> I've added it to the 7.8 regression queue at:
>>
>> https://sourceware.org/gdb/wiki/GDB_7.8_Release
>>
>> I'll look better at it as soon as I have a chance.
>>
>
> First, it's really annoying that the testsuite doesn't catch this!
> We're really missing such a basic test, so I went ahead and wrote one.
> I was actually quite surprised that we didn't have any
> test that makes sure "kill" works without complaints/spurious output.
>
> However, unfortunately, the new test revealed that the fix isn't right. :-/
>
> Sometimes the test passed, but other times (that'll depend on
> machine, of course), I'd get:
>
> kill
> Kill the program being debugged? (y or n) y
> Ignoring packet error, continuing...
> (gdb) FAIL: gdb.threads/kill.exp: kill
>
> That means GDBserver didn't reply to the vKill packet.
>
> Attaching to GDBserver at this point shows that it's stuck forever
> in the new loop:
>
>
> + /* Keep call kill_one_lwp_callback until find_inferior cannot find any
> + lwps that is for pid. */
> + while (find_inferior (&all_threads, kill_one_lwp_callback , &pid) != NULL);
>
>
> The reason was that the leader lwp goes zombie before the other
> lwps are reaped, and linux_wait_for_event would delete it. We'd be
> left with e.g., only one lwp that is _not_ the leader.
>
> That lwp had disappeared already (due to SIGKILL), but there's
> no status to collect (waitpid would return -1). Nothing in linux-low.c
> would notice the lwp is already gone. Because that lwp has lwp->stopped == 0,
> linux_wait_for_event would return, here:
>
> if ((find_inferior (&all_threads,
> not_stopped_callback,
> &wait_ptid) == NULL))
> {
> if (debug_threads)
> debug_printf ("LLW: exit (no unwaited-for LWP)\n");
> sigprocmask (SIG_SETMASK, &prev_mask, NULL);
> return -1;
> }
>
> So this would return true:
>
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -944,7 +944,9 @@ kill_one_lwp_callback (struct inferior_l
> pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
> } while (pid > 0 && WIFSTOPPED (wstat));
>
> - return 0;
> + /* Let find_inferior return because maybe other lwp in the list will
> + be deleted by delete_lwp. */
> + return 1;
> }
>
> So the new loop would never break.
>
> Here's a different patch that passes the test cleanly for me.
> Let me know if it works for you.
>
> 8<-------------------
> From ed3e42e5447e0276d6fc759a27408d06e9ccf0b7 Mon Sep 17 00:00:00 2001
> From: Pedro Alves <palves@redhat.com>
> Date: Thu, 10 Jul 2014 15:13:26 +0100
> Subject: [PATCH] GDBserver crashes when killing a multi-thread process
>
> Here's an example, with the new test:
>
> gdbserver :9999 gdb.threads/kill
> gdb gdb.threads/kill
> (gdb) b 52
> Breakpoint 1 at 0x4007f4: file kill.c, line 52.
> Continuing.
>
> Breakpoint 1, main () at kill.c:52
> 52 return 0; /* set break here */
> (gdb) k
> Kill the program being debugged? (y or n) y
>
> gdbserver :9999 gdb.threads/kill
> Process gdb.base/watch_thread_num created; pid = 9719
> Listening on port 1234
> Remote debugging from host 127.0.0.1
> Killing all inferiors
> Segmentation fault (core dumped)
>
> Backtrace:
>
> (gdb) bt
> #0 0x00000000004068a0 in find_inferior (list=0x66b060 <all_threads>, func=0x427637 <kill_one_lwp_callback>, arg=0x7fffffffd3fc) at src/gdb/gdbserver/inferiors.c:199
> #1 0x00000000004277b6 in linux_kill (pid=15708) at src/gdb/gdbserver/linux-low.c:966
> #2 0x000000000041354d in kill_inferior (pid=15708) at src/gdb/gdbserver/target.c:163
> #3 0x00000000004107e9 in kill_inferior_callback (entry=0x6704f0) at src/gdb/gdbserver/server.c:2934
> #4 0x0000000000406522 in for_each_inferior (list=0x66b050 <all_processes>, action=0x4107a6 <kill_inferior_callback>) at src/gdb/gdbserver/inferiors.c:57
> #5 0x0000000000412377 in process_serial_event () at src/gdb/gdbserver/server.c:3767
> #6 0x000000000041267c in handle_serial_event (err=0, client_data=0x0) at src/gdb/gdbserver/server.c:3880
> #7 0x00000000004189ff in handle_file_event (event_file_desc=4) at src/gdb/gdbserver/event-loop.c:434
> #8 0x00000000004181c6 in process_event () at src/gdb/gdbserver/event-loop.c:189
> #9 0x0000000000418f45 in start_event_loop () at src/gdb/gdbserver/event-loop.c:552
> #10 0x0000000000411272 in main (argc=3, argv=0x7fffffffd8d8) at src/gdb/gdbserver/server.c:3283
>
> The problem is that linux_wait_for_event deletes lwps that have exited
> (even those not passed in as lwps of interest), while the lwp/thread
> list is being walked on with find_inferior. find_inferior can handle
> the current iterated inferior being deleted, but not others.
>
> When killing lwps, we don't really care about any of the pending
> status handling of linux_wait_for_event. We can just waitpid the lwps
> directly, which is also what GDB does (see
> linux-nat.c:kill_wait_callback). This way the lwps are not deleted
> while we're walking the list. They'll be deleted by linux_mourn
> afterwards.
>
> This crash triggers several times when running the testsuite against
> GDBserver with the native-gdbserver board (target remote), but as GDB
> can't distinguish between GDBserver crashing and "kill" being
> sucessful, as in both cases the connection is closed (the 'k' packet
> doesn't require a reply), and the inferior is gone, that results in no
> FAIL.
>
> The patch adds a generic test that catches the issue with
> extended-remote mode (and works fine with native testing too). Here's
> how it fails with the native-extended-gdbserver board without the fix:
>
> (gdb) info threads
> Id Target Id Frame
> 6 Thread 15367.15374 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> 5 Thread 15367.15373 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> 4 Thread 15367.15372 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> 3 Thread 15367.15371 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> 2 Thread 15367.15370 0x000000373bcbc98d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
> * 1 Thread 15367.15367 main () at .../gdb.threads/kill.c:52
> (gdb) kill
> Kill the program being debugged? (y or n) y
> Remote connection closed
> ^^^^^^^^^^^^^^^^^^^^^^^^
> (gdb) FAIL: gdb.threads/kill.exp: kill
>
> Extended remote should remain connected after the kill.
>
> gdb/gdbserver/
> 2014-07-10 Pedro Alves <palves@redhat.com>
>
> * linux-low.c (kill_wait_lwp): New function, based on
> kill_one_lwp_callback, but use my_waitpid directly.
> (kill_one_lwp_callback, linux_kill): Use it.
>
> gdb/testsuite/
> 2014-07-10 Pedro Alves <palves@redhat.com>
>
> * gdb.threads/kill.c: New file.
> * gdb.threads/kill.exp: New file.
> ---
> gdb/gdbserver/linux-low.c | 68 ++++++++++++++++++++-------------
> gdb/testsuite/gdb.threads/kill.c | 64 +++++++++++++++++++++++++++++++
> gdb/testsuite/gdb.threads/kill.exp | 77 ++++++++++++++++++++++++++++++++++++++
> 3 files changed, 183 insertions(+), 26 deletions(-)
> create mode 100644 gdb/testsuite/gdb.threads/kill.c
> create mode 100644 gdb/testsuite/gdb.threads/kill.exp
>
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 61552f4..bdc9aaa 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -909,6 +909,46 @@ linux_kill_one_lwp (struct lwp_info *lwp)
> errno ? strerror (errno) : "OK");
> }
>
> +/* Kill LWP and wait for it to die. */
> +
> +static void
> +kill_wait_lwp (struct lwp_info *lwp)
> +{
> + struct thread_info *thr = get_lwp_thread (lwp);
> + int pid = ptid_get_pid (ptid_of (thr));
> + int lwpid = ptid_get_lwp (ptid_of (thr));
> + int wstat;
> + int res;
> +
> + if (debug_threads)
> + debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
> +
> + do
> + {
> + linux_kill_one_lwp (lwp);
> +
> + /* Make sure it died. Notes:
> +
> + - The loop is most likely unnecessary.
> +
> + - We don't use linux_wait_for_event as that could delete lwps
> + while we're iterating over them. We're not interested in
> + any pending status at this point, only in making sure all
> + wait status on the kernel side are collected until the
> + process is reaped.
> +
> + - We don't use __WALL here as the __WALL emulation relies on
> + SIGCHLD, and killing a stopped process doesn't generate
> + one, nor an exit status.
> + */
> + res = my_waitpid (lwpid, &wstat, 0);
> + if (res == -1 && errno == ECHILD)
> + res = my_waitpid (lwpid, &wstat, __WCLONE);
> + } while (res > 0 && WIFSTOPPED (wstat));
> +
> + gdb_assert (res > 0);
> +}
> +
> /* Callback for `find_inferior'. Kills an lwp of a given process,
> except the leader. */
>
> @@ -917,7 +957,6 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
> {
> struct thread_info *thread = (struct thread_info *) entry;
> struct lwp_info *lwp = get_thread_lwp (thread);
> - int wstat;
> int pid = * (int *) args;
>
> if (ptid_get_pid (entry->id) != pid)
> @@ -936,14 +975,7 @@ kill_one_lwp_callback (struct inferior_list_entry *entry, void *args)
> return 0;
> }
>
> - do
> - {
> - linux_kill_one_lwp (lwp);
> -
> - /* Make sure it died. The loop is most likely unnecessary. */
> - pid = linux_wait_for_event (thread->entry.id, &wstat, __WALL);
> - } while (pid > 0 && WIFSTOPPED (wstat));
> -
> + kill_wait_lwp (lwp);
> return 0;
> }
>
> @@ -952,8 +984,6 @@ linux_kill (int pid)
> {
> struct process_info *process;
> struct lwp_info *lwp;
> - int wstat;
> - int lwpid;
>
> process = find_process_pid (pid);
> if (process == NULL)
> @@ -976,21 +1006,7 @@ linux_kill (int pid)
> pid);
> }
> else
> - {
> - struct thread_info *thr = get_lwp_thread (lwp);
> -
> - if (debug_threads)
> - debug_printf ("lk_1: killing lwp %ld, for pid: %d\n",
> - lwpid_of (thr), pid);
> -
> - do
> - {
> - linux_kill_one_lwp (lwp);
> -
> - /* Make sure it died. The loop is most likely unnecessary. */
> - lwpid = linux_wait_for_event (thr->entry.id, &wstat, __WALL);
> - } while (lwpid > 0 && WIFSTOPPED (wstat));
> - }
> + kill_wait_lwp (lwp);
>
> the_target->mourn (process);
>
> diff --git a/gdb/testsuite/gdb.threads/kill.c b/gdb/testsuite/gdb.threads/kill.c
> new file mode 100644
> index 0000000..aefbb06
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/kill.c
> @@ -0,0 +1,64 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2014 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +#ifdef USE_THREADS
> +
> +#include <unistd.h>
> +#include <pthread.h>
> +
> +#define NUM 5
> +
> +pthread_barrier_t barrier;
> +
> +void *
> +thread_function (void *arg)
> +{
> + volatile unsigned int counter = 1;
> +
> + pthread_barrier_wait (&barrier);
> +
> + while (counter > 0)
> + {
> + counter++;
> + usleep (1);
> + }
> +
> + pthread_exit (NULL);
> +}
> +
> +#endif /* USE_THREADS */
> +
> +void
> +setup (void)
> +{
> +#ifdef USE_THREADS
> + pthread_t threads[NUM];
> + int i;
> +
> + pthread_barrier_init (&barrier, NULL, NUM + 1);
> + for (i = 0; i < NUM; i++)
> + pthread_create (&threads[i], NULL, thread_function, NULL);
> + pthread_barrier_wait (&barrier);
> +#endif /* USE_THREADS */
> +}
> +
> +int
> +main (void)
> +{
> + setup ();
> + return 0; /* set break here */
> +}
> diff --git a/gdb/testsuite/gdb.threads/kill.exp b/gdb/testsuite/gdb.threads/kill.exp
> new file mode 100644
> index 0000000..a3f921f
> --- /dev/null
> +++ b/gdb/testsuite/gdb.threads/kill.exp
> @@ -0,0 +1,77 @@
> +# This testcase is part of GDB, the GNU debugger.
> +
> +# Copyright 2014 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>. */
> +
> +standard_testfile
> +
> +# Run the test proper. THREADED indicates whether to build a threaded
> +# program and spawn several threads before trying to kill the program.
> +
> +proc test {threaded} {
> + global testfile srcfile
> +
> + with_test_prefix [expr ($threaded)?"threaded":"non-threaded"] {
> +
> + set options {debug}
> + if {$threaded} {
> + lappend options "pthreads"
> + lappend options "additional_flags=-DUSE_THREADS"
> + set prog ${testfile}_threads
> + } else {
> + set prog ${testfile}_nothreads
> + }
> +
> + if {[prepare_for_testing "failed to prepare" $prog $srcfile $options] == -1} {
> + return -1
> + }
> +
> + if { ![runto main] } then {
> + fail "run to main"
> + return
> + }
> +
> + set linenum [gdb_get_line_number "set break here"]
> + gdb_breakpoint "$srcfile:$linenum"
> + gdb_continue_to_breakpoint "break here" ".*break here.*"
> +
> + if {$threaded} {
> + gdb_test "info threads" "6.*5.*4.*3.*2.*1.*" "all threads started"
> + }
> +
> + # This kills and ensures no output other than the prompt comes out,
> + # like:
> + #
> + # (gdb) kill
> + # Kill the program being debugged? (y or n) y
> + # (gdb)
> + #
> + # If we instead saw more output, like e.g., with an extended-remote
> + # connection:
> + #
> + # (gdb) kill
> + # Kill the program being debugged? (y or n) y
> + # Remote connection closed
> + # (gdb)
> + #
> + # the above would mean that the remote end crashed.
> +
> + gdb_test "kill" "^y" "kill program" "Kill the program being debugged\\? \\(y or n\\) $" "y"
> + }
> +}
> +
> +foreach threaded {true false} {
> + test $threaded
> +}
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PUSHED+7.8] Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-11 8:21 ` Hui Zhu
@ 2014-07-11 10:53 ` Pedro Alves
0 siblings, 0 replies; 10+ messages in thread
From: Pedro Alves @ 2014-07-11 10:53 UTC (permalink / raw)
To: Hui Zhu, gdb-patches ml
On 07/11/2014 08:33 AM, Hui Zhu wrote:
> Hi Pedro,
>
> Thanks for your patch. It fixed the issue in my part.
Thank you. I went ahead and pushed it to both master and gdb-7.8-branch.
--
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
2014-07-11 8:21 ` Hui Zhu
@ 2015-07-13 16:07 ` Yao Qi
2015-07-13 17:32 ` Pedro Alves
1 sibling, 1 reply; 10+ messages in thread
From: Yao Qi @ 2015-07-13 16:07 UTC (permalink / raw)
To: Pedro Alves, gdb-patches ml
On 10/07/14 16:16, Pedro Alves wrote:
> +static void
> +kill_wait_lwp (struct lwp_info *lwp)
> +{
> + struct thread_info *thr = get_lwp_thread (lwp);
> + int pid = ptid_get_pid (ptid_of (thr));
> + int lwpid = ptid_get_lwp (ptid_of (thr));
> + int wstat;
> + int res;
> +
> + if (debug_threads)
> + debug_printf ("kwl: killing lwp %d, for pid: %d\n", lwpid, pid);
> +
> + do
> + {
> + linux_kill_one_lwp (lwp);
> +
> + /* Make sure it died. Notes:
> +
> + - The loop is most likely unnecessary.
> +
> + - We don't use linux_wait_for_event as that could delete lwps
> + while we're iterating over them. We're not interested in
> + any pending status at this point, only in making sure all
> + wait status on the kernel side are collected until the
> + process is reaped.
> +
> + - We don't use __WALL here as the __WALL emulation relies on
> + SIGCHLD, and killing a stopped process doesn't generate
> + one, nor an exit status.
> + */
> + res = my_waitpid (lwpid, &wstat, 0);
> + if (res == -1 && errno == ECHILD)
> + res = my_waitpid (lwpid, &wstat, __WCLONE);
> + } while (res > 0 && WIFSTOPPED (wstat));
> +
> + gdb_assert (res > 0);
> +}
Hi Pedro,
do you still remember why did you add this assert? It wasn't
mentioned in the mail
https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html
I am looking at a GDBserver internal error on x86_64 when I run
gdb.threads/thread-unwindonsignal.exp with GDBserver,
continue^M
Continuing.^M
warning: Remote failure reply: E.No unwaited-for children left.^M
PC register is not available^M
(gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
Remote debugging from host 127.0.0.1^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
monitor exit^M
Killing process(es): 30694^M
(gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A
problem internal to GDBserver has been detected.^M
kill_wait_lwp: Assertion `res > 0' failed.
After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
GDBserver starts to swallows errors if the LWP is gone. Then, when
GDBservers kills non-exist LWP, the assert will be triggered.
Why don't we implement kill_wait_lwp like its counterpart in GDB
linux-nat.c:kill_wait_callback? we can loop and assert like this
patch below, (note that this patch fixes the internal error, and
the FAIL is still there).
--
Yao (é½å°§)
diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 7bb9f7f..07d051a 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1101,9 +1101,9 @@ kill_wait_lwp (struct lwp_info *lwp)
res = my_waitpid (lwpid, &wstat, 0);
if (res == -1 && errno == ECHILD)
res = my_waitpid (lwpid, &wstat, __WCLONE);
- } while (res > 0 && WIFSTOPPED (wstat));
+ } while (res == lwpid);
- gdb_assert (res > 0);
+ gdb_assert (res == -1 && errno == ECHILD);
}
/* Callback for `find_inferior'. Kills an lwp of a given process,
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-13 16:07 ` Yao Qi
@ 2015-07-13 17:32 ` Pedro Alves
2015-07-14 8:00 ` Yao Qi
0 siblings, 1 reply; 10+ messages in thread
From: Pedro Alves @ 2015-07-13 17:32 UTC (permalink / raw)
To: Yao Qi, gdb-patches ml
On 07/13/2015 05:07 PM, Yao Qi wrote:
> Hi Pedro,
> do you still remember why did you add this assert? It wasn't
> mentioned in the mail
> https://sourceware.org/ml/gdb-patches/2014-07/msg00206.html
>
Simply because getting here was supposed to indicate
something went wrong elsewhere, but at the time I didn't consider
that the child could die while ptrace-stopped.
> I am looking at a GDBserver internal error on x86_64 when I run
> gdb.threads/thread-unwindonsignal.exp with GDBserver,
>
> continue^M
> Continuing.^M
> warning: Remote failure reply: E.No unwaited-for children left.^M
> PC register is not available^M
> (gdb) FAIL: gdb.threads/thread-unwindonsignal.exp: continue until exit
> Remote debugging from host 127.0.0.1^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> ptrace(regsets_fetch_inferior_registers) PID=30700: No such process^M
> monitor exit^M
> Killing process(es): 30694^M
> (gdb) /home/yao/SourceCode/gnu/gdb/git/gdb/gdbserver/linux-low.c:1106: A
> problem internal to GDBserver has been detected.^M
> kill_wait_lwp: Assertion `res > 0' failed.
>
> After your patch https://sourceware.org/ml/gdb-patches/2015-03/msg00597.html
> GDBserver starts to swallows errors if the LWP is gone. Then, when
> GDBservers kills non-exist LWP, the assert will be triggered.
>
Looks like I forgot to push the rest of that series:
https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html
What do you think of that one?
> Why don't we implement kill_wait_lwp like its counterpart in GDB
> linux-nat.c:kill_wait_callback? we can loop and assert like this
> patch below, (note that this patch fixes the internal error, and
> the FAIL is still there).
>
Seems to me it's not 100% correct to waitpid the pid one more time
after we've already reaped it, because there's a minuscule chance
another process that we're debugging could clone a new lwp that reuses
the PID of the one we've just killed/reaped, and then another iteration
could collect the initial SIGSTOP of the wrong LWP and we'd kill it:
-> kill (pid1, SIGKILL);
<- waitpid (pid1) returns pid1/WSIGNALLED
-> on iteration1: new pid1 clone lwp is spawned
-> ret==pid1, continue iterating
-> kill (pid1, SIGKILL); // killing wrong process
<- waitpid (pid1) returns either SIGSTOP or WSIGNALLED
...
Thanks,
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-13 17:32 ` Pedro Alves
@ 2015-07-14 8:00 ` Yao Qi
2015-07-14 9:13 ` Pedro Alves
0 siblings, 1 reply; 10+ messages in thread
From: Yao Qi @ 2015-07-14 8:00 UTC (permalink / raw)
To: Pedro Alves; +Cc: Yao Qi, gdb-patches ml
Pedro Alves <palves@redhat.com> writes:
> Looks like I forgot to push the rest of that series:
>
> https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html
>
> What do you think of that one?
Yes, it looks good to me. We also need it on 7.10 branch.
>
>> Why don't we implement kill_wait_lwp like its counterpart in GDB
>> linux-nat.c:kill_wait_callback? we can loop and assert like this
>> patch below, (note that this patch fixes the internal error, and
>> the FAIL is still there).
>>
>
> Seems to me it's not 100% correct to waitpid the pid one more time
> after we've already reaped it, because there's a minuscule chance
> another process that we're debugging could clone a new lwp that reuses
> the PID of the one we've just killed/reaped, and then another iteration
> could collect the initial SIGSTOP of the wrong LWP and we'd kill it:
>
> -> kill (pid1, SIGKILL);
> <- waitpid (pid1) returns pid1/WSIGNALLED
> -> on iteration1: new pid1 clone lwp is spawned
> -> ret==pid1, continue iterating
> -> kill (pid1, SIGKILL); // killing wrong process
> <- waitpid (pid1) returns either SIGSTOP or WSIGNALLED
> ...
Yes, that is possible.
--
Yao (齐尧)
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] GDBserver crashes when killing a multi-thread process
2015-07-14 8:00 ` Yao Qi
@ 2015-07-14 9:13 ` Pedro Alves
0 siblings, 0 replies; 10+ messages in thread
From: Pedro Alves @ 2015-07-14 9:13 UTC (permalink / raw)
To: Yao Qi; +Cc: gdb-patches ml
On 07/14/2015 09:00 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> Looks like I forgot to push the rest of that series:
>>
>> https://sourceware.org/ml/gdb-patches/2015-03/msg00182.html
>>
>> What do you think of that one?
>
> Yes, it looks good to me. We also need it on 7.10 branch.
OK, I'll push it.
Thanks,
Pedro Alves
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-07-14 9:13 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-23 10:10 [PATCH] Fix crash of gdbserver when kill threads Hui Zhu
2014-06-29 3:28 ` Hui Zhu
2014-07-02 9:07 ` Pedro Alves
2014-07-10 15:17 ` [PATCH v2] GDBserver crashes when killing a multi-thread process Pedro Alves
2014-07-11 8:21 ` Hui Zhu
2014-07-11 10:53 ` [PUSHED+7.8] " Pedro Alves
2015-07-13 16:07 ` Yao Qi
2015-07-13 17:32 ` Pedro Alves
2015-07-14 8:00 ` Yao Qi
2015-07-14 9:13 ` Pedro Alves
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).