From: Thomas Schwinge
To: Joel Brobecker
Cc: gdb-patches@sourceware.org
Subject: Re: [RFA/gdbserver] Unexpected EOF read from socket after inferior exits.
References: <1278451801-10588-1-git-send-email-brobecker@adacore.com>
Date: Tue, 20 Jul 2010 21:14:00 -0000
In-Reply-To: <1278451801-10588-1-git-send-email-brobecker@adacore.com> (Joel Brobecker's message of "Tue, 6 Jul 2010 14:30:01 -0700")
Message-ID: <87mxtlkgac.fsf@dirichlet.schwinge.homeip.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature"

--=-=-=

Hello!

I'm quoting a lot of text, as it's been two weeks since this has been
posted.

On 2010-07-06 21:30, Joel Brobecker wrote:
> This is on GNU/Linux.
>
> GDBserver does not exit properly when the inferior exits, as demonstrated
> with any program using the procedure below:
>
>     % gdbserver-head :4444 simple_main
>     Process simple_main created; pid = 25681
>     Listening on port 4444
>
> Then, in another terminal, start GDB, connect to GDBserver, and run
> the program to completion:
>
>     % gdb-head simple_main
>     (gdb) tar rem :4444
>     (gdb) cont
>     Continuing.
>
>     Program exited normally.
>
> Going back to the terminal where GDBserver is running, we see the following
> output:
>
>     Child exited with status 0
>     readchar: Got EOF
>     Remote side has terminated connection.  GDBserver will reopen the connection.
>     Listening on port 4444

Without your patch, I'm seeing the same thing in a mips64el-linux-gnu
configuration while the GDB testsuite is working on the
gdb.threads/current-lwp-dead test:

    [...]
    Remote debugging from host 10.0.0.218

    Child exited with status 0
    readchar: Got EOF
    Remote side has terminated connection.  GDBserver will reopen the connection.
    Listening on port 1234

Then dejagnu is out of sync, and all following tests fail with timeout
errors.

> The problem is that we're missing a call to mourn_inferior.  As a result,
> after we've handled the vCont packet, we fail to notice that there are
> no process left to debug (target_running() returns true), and thus try
> to continue reading from the remote socket.  However, since GDB just
> disconnected after having received the "exit with status 0" reply to the
> vCont request, the read triggers the EOF exception.
>
> This patch fixes the problem by calling mourn_inferior after receiving
> an inferior-exited event when in all-stop mode.  I know there are reasons
> why we don't call the mourning code right after the wait like we used to.
> It seems that, in that case, we can because we exit gdbserver shortly
> after.
>
> I really don't know whether this is the right approach or not - it feels
> fragile to me.
> For now, I took the lazy approach, but I noticed that
> the resume-and-if-nonstop-send-ok-else-wait-send-reply code in both
> handle_v_cont and myresume are identical and I think should be identical,
> so perhaps we should factorize this code as well.
>
> The problem is not all that serious in terms of the actual damage, but
> I do feel that it's sufficiently visible and annoying that we should
> have a fix for GDB 7.2.
>
> gdb/ChangeLog:
>
>         * server.c (handle_v_cont): Call mourn_inferior if process
>         just exited.
>         (myresume): Likewise.
>
> Tested on x86_64-linux.
>
> ---
>  gdb/gdbserver/server.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> index 226d123..9125f0e 100644
> --- a/gdb/gdbserver/server.c
> +++ b/gdb/gdbserver/server.c
> @@ -1779,6 +1779,10 @@ handle_v_cont (char *own_buf)
>            last_ptid = mywait (minus_one_ptid, &last_status, 0, 1);
>            prepare_resume_reply (own_buf, last_ptid, &last_status);
>            disable_async_io ();
> +
> +          if (last_status.kind == TARGET_WAITKIND_EXITED
> +              || last_status.kind == TARGET_WAITKIND_SIGNALLED)
> +            mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
>          }
>        return;
>
> @@ -2079,6 +2083,10 @@ myresume (char *own_buf, int step, int sig)
>        last_ptid = mywait (minus_one_ptid, &last_status, 0, 1);
>        prepare_resume_reply (own_buf, last_ptid, &last_status);
>        disable_async_io ();
> +
> +      if (last_status.kind == TARGET_WAITKIND_EXITED
> +          || last_status.kind == TARGET_WAITKIND_SIGNALLED)
> +        mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
>      }
>  }

However with this patch, the following happens:

    Running /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.exp ...
    [...]
    > 'LD_LIBRARY_PATH=[...]
      [ld.so] [gdbserver] ':2887' 'current-lwp-dead'
    Process current-lwp-dead created; pid = 20009
    Listening on port 2887
    target remote philidor:2887
    Remote debugging using philidor:2887
    Remote debugging from host 10.0.0.218
    [...]
    Breakpoint 2, fn_return (unused=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.c:45
    45        return 0; /* at-fn_return */
    (gdb) PASS: gdb.threads/current-lwp-dead.exp: continue to breakpoint: fn_return
    testcase /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.exp completed in 2 seconds
    Running /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/execl.exp ...

Here this test has finished, the remote gdbserver terminates, and the
next test is about to be started.

    Executing on host: mips64el-linux-gnu-gcc /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/execl.c [...]
    Killing all inferiors
    Segmentation fault

Instead of terminating, the remote gdbserver crashed with a segfault.

    (gdb) bt
    #0  0x1003b068 in thread_db_mourn (proc=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/thread-db.c:889
    #1  0x1002bc3c in linux_mourn (process=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/linux-low.c:896
    #2  0x1001085c in handle_v_cont (own_buf=0x555800d0 "W00") at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1785
    #3  0x10011154 in handle_v_requests (own_buf=0x555800d0 "W00", packet_len=7, new_packet_len=0x7faf0d64)
        at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1976
    #4  0x100148e8 in process_serial_event () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:3048
    #5  0x10014a9c in handle_serial_event (err=0, client_data=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:3093
    #6  0x1001bd70 in handle_file_event (event_file_desc=4) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:488
    #7  0x1001b134 in process_event () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:244
    #8  0x1001c2e0 in start_event_loop () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:606
    #9  0x1001303c in main (argc=3, argv=0x7faf1040) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:2589
    (gdb) x/i $pc
    => 0x1003b068 <thread_db_mourn+36>:  lw      v0,40(v0)
    (gdb) info registers v0
    v0: 0x0
    (gdb) frame 0
    #0  0x1003b068 in thread_db_mourn (proc=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/thread-db.c:889
    889       struct thread_db *thread_db = proc->private->thread_db;
    (gdb) list
    884     /* Disconnect from libthread_db and free resources.
               */
    885
    886     void
    887     thread_db_mourn (struct process_info *proc)
    888     {
    889       struct thread_db *thread_db = proc->private->thread_db;
    890       if (thread_db)
    891         {
    892           td_err_e (*td_ta_delete_p) (td_thragent_t *);
    893
    Dump of assembler code for function thread_db_mourn:
    888     {
       0x1003b044 <+0>:     addiu   sp,sp,-64
       0x1003b048 <+4>:     sd      ra,56(sp)
       0x1003b04c <+8>:     sd      s8,48(sp)
       0x1003b050 <+12>:    sd      gp,40(sp)
       0x1003b054 <+16>:    move    s8,sp
       0x1003b058 <+20>:    lui     gp,0x1006
       0x1003b05c <+24>:    addiu   gp,gp,368
       0x1003b060 <+28>:    sw      a0,16(s8)

    889       struct thread_db *thread_db = proc->private->thread_db;
       0x1003b064 <+32>:    lw      v0,16(s8)
    => 0x1003b068 <+36>:    lw      v0,40(v0)
       0x1003b06c <+40>:    lw      v0,4(v0)
       0x1003b070 <+44>:    sw      v0,0(s8)
    [...]
    (gdb) print proc
    $1 = (struct process_info *) 0x0
    (gdb) # right, proc = 0 as bt already told us...
    (gdb) frame 1
    #1  0x1002bc3c in linux_mourn (process=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/linux-low.c:896
    896       thread_db_mourn (process);
    (gdb) list
    891     linux_mourn (struct process_info *process)
    892     {
    893       struct process_info_private *priv;
    894
    895     #ifdef USE_THREAD_DB
    896       thread_db_mourn (process);
    897     #endif
    898
    899       find_inferior (&all_lwps, delete_lwp_callback, process);
    900
    (gdb) frame 2
    #2  0x1001085c in handle_v_cont (own_buf=0x555800d0 "W00") at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1785
    1785                mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
    (gdb) list
    1780              prepare_resume_reply (own_buf, last_ptid, &last_status);
    1781              disable_async_io ();
    1782
    1783              if (last_status.kind == TARGET_WAITKIND_EXITED
    1784                  || last_status.kind == TARGET_WAITKIND_SIGNALLED)
    1785                mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
    1786            }
    1787          return;
    1788
    1789        err:

So, the find_process_pid thing returns 0 where gdbserver doesn't expect
it to do so.
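
To make the failure mode concrete: the backtrace shows the result of
find_process_pid being handed straight to mourn_inferior, so a failed
lookup is only noticed when thread_db_mourn dereferences it.  The
following is just a minimal, self-contained sketch of guarding such a
call.  It is not gdbserver code: every toy_* name and the struct are
hypothetical stand-ins, and whether skipping the mourn is the right fix
(as opposed to finding out why the lookup fails) is an assumption, not
something the session above establishes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gdbserver's struct process_info.  */
struct toy_process
{
  int pid;
  int mourned;
};

/* Toy process table; a NULL slot means "no process here".  */
#define TOY_MAX_PROCS 4
static struct toy_process *toy_procs[TOY_MAX_PROCS];

/* Add PROC to the table.  Returns 1 on success, 0 if the table is full.  */
static int
toy_add_process (struct toy_process *proc)
{
  size_t i;

  for (i = 0; i < TOY_MAX_PROCS; i++)
    if (toy_procs[i] == NULL)
      {
        toy_procs[i] = proc;
        return 1;
      }
  return 0;
}

/* Look up PID.  Returns NULL when no entry matches, which is the
   situation the backtrace shows (proc=0x0).  */
static struct toy_process *
toy_find_process_pid (int pid)
{
  size_t i;

  for (i = 0; i < TOY_MAX_PROCS; i++)
    if (toy_procs[i] != NULL && toy_procs[i]->pid == pid)
      return toy_procs[i];
  return NULL;
}

/* Mark PROC as mourned and drop it from the table.  PROC must not be
   NULL; the assert documents that precondition.  */
static void
toy_mourn (struct toy_process *proc)
{
  size_t i;

  assert (proc != NULL);
  proc->mourned = 1;
  for (i = 0; i < TOY_MAX_PROCS; i++)
    if (toy_procs[i] == proc)
      toy_procs[i] = NULL;
}

/* Guarded variant: mourn the process with PID only if the lookup
   succeeds.  Returns 1 if a process was mourned, 0 if none was found.  */
static int
toy_mourn_pid (int pid)
{
  struct toy_process *proc = toy_find_process_pid (pid);

  if (proc == NULL)
    return 0;
  toy_mourn (proc);
  return 1;
}
```

With one process registered under pid 20009, the first call to
toy_mourn_pid (20009) returns 1 and a second call returns 0 rather than
dereferencing a null pointer.
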
As I don't know this code, I can't easily tell if that is a
gdbserver bug (that is, with the new mourn_inferior code you added), or
another problem, and how it should be tackled.  Any suggestions?


Regards,
 Thomas

--=-=-=
Content-Type: application/pgp-signature
Content-length: 197

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAkxGEYsACgkQC9ZuxbdEiFgWsQCfYx02yQBoQVntXuKeYjiYod0i
AxIAoJ5UG/i6BP1qpjQ7TGtA6/2qqHVB
=F4NF
-----END PGP SIGNATURE-----
--=-=-=--