public inbox for gdb-patches@sourceware.org
From: Thomas Schwinge <thomas@codesourcery.com>
To: Joel Brobecker <brobecker@adacore.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [RFA/gdbserver] Unexpected EOF read from socket after inferior exits.
Date: Tue, 20 Jul 2010 21:14:00 -0000
Message-ID: <87mxtlkgac.fsf@dirichlet.schwinge.homeip.net>
In-Reply-To: <1278451801-10588-1-git-send-email-brobecker@adacore.com> (Joel Brobecker's message of "Tue, 6 Jul 2010 14:30:01 -0700")

Hello!

I'm quoting a lot of text, as it's been two weeks since this was
posted.


On 2010-07-06 21:30, Joel Brobecker wrote:
> This is on GNU/Linux.
>
> GDBserver does not exit properly when the inferior exits, as demonstrated
> with any program using the procedure below:
>
>    % gdbserver-head :4444 simple_main
>    Process simple_main created; pid = 25681
>    Listening on port 4444
>
> Then, in another terminal, start GDB, connect to GDBserver, and run
> the program to completion:
>
>    % gdb-head simple_main
>    (gdb) tar rem :4444
>    (gdb) cont
>    Continuing.
>
>    Program exited normally.
>
> Going back to the terminal where GDBserver is running, we see the following
> output:
>
>     Child exited with status 0
>     readchar: Got EOF
>     Remote side has terminated connection.  GDBserver will reopen the connection.
>     Listening on port 4444

Without your patch, I'm seeing the same thing in a mips64el-linux-gnu
configuration while the GDB testsuite is running the
gdb.threads/current-lwp-dead test:

    [...]
    Remote debugging from host 10.0.0.218
    
    Child exited with status 0
    readchar: Got EOF
    Remote side has terminated connection.  GDBserver will reopen the connection.
    Listening on port 1234

DejaGnu then gets out of sync, and all subsequent tests fail with timeout
errors.

> The problem is that we're missing a call to mourn_inferior.  As a result,
> after we've handled the vCont packet, we fail to notice that there are
> no process left to debug (target_running() returns true), and thus try
> to continue reading from the remote socket.  However, since GDB just
> disconnected after having received the "exit with status 0" reply to the
> vCont request, the read triggers the EOF exception.
>
> This patch fixes the problem by calling mourn_inferior after receiving
> an inferior-exited event when in all-stop mode.  I know there are reasons
> why we don't call the mourning code right after the wait like we used to.
> It seems that, in that case, we can because we exit gdbserver shortly
> after.
>
> I really don't know whether this is the right approach or not - it feels
> fragile to me.  For now, I took the lazy approach, but I noticed that
> the resume-and-if-nonstop-send-ok-else-wait-send-reply code in both
> handle_v_cont and myresume are identical and I think should be identical,
> so perhaps we should factorize this code as well.
>
> The problem is not all that serious in terms of the actual damage, but
> I do feel that it's sufficiently visible and annoying that we should
> have a fix for GDB 7.2.
>
> gdb/ChangeLog:
>
>         * server.c (handle_v_cont): Call mourn_inferior if process
>         just exited.
>         (myresume): Likewise.
>
> Tested on x86_64-linux.
>
> ---
>  gdb/gdbserver/server.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> index 226d123..9125f0e 100644
> --- a/gdb/gdbserver/server.c
> +++ b/gdb/gdbserver/server.c
> @@ -1779,6 +1779,10 @@ handle_v_cont (char *own_buf)
>        last_ptid = mywait (minus_one_ptid, &last_status, 0, 1);
>        prepare_resume_reply (own_buf, last_ptid, &last_status);
>        disable_async_io ();
> +
> +      if (last_status.kind == TARGET_WAITKIND_EXITED
> +          || last_status.kind == TARGET_WAITKIND_SIGNALLED)
> +        mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
>      }
>    return;
>  
> @@ -2079,6 +2083,10 @@ myresume (char *own_buf, int step, int sig)
>        last_ptid = mywait (minus_one_ptid, &last_status, 0, 1);
>        prepare_resume_reply (own_buf, last_ptid, &last_status);
>        disable_async_io ();
> +
> +      if (last_status.kind == TARGET_WAITKIND_EXITED
> +          || last_status.kind == TARGET_WAITKIND_SIGNALLED)
> +        mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
>      }
>  }

However, with this patch applied, the following happens:

    Running /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.exp ...
    [...]
    > 'LD_LIBRARY_PATH=[...] [ld.so] [gdbserver] ':2887' 'current-lwp-dead'
    Process current-lwp-dead created; pid = 20009
    Listening on port 2887
    target remote philidor:2887
    Remote debugging using philidor:2887
    Remote debugging from host 10.0.0.218
    [...]
    Breakpoint 2, fn_return (unused=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.c:45
    45        return 0;     /* at-fn_return */
    (gdb) PASS: gdb.threads/current-lwp-dead.exp: continue to breakpoint: fn_return
    testcase /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/current-lwp-dead.exp completed in 2 seconds
    Running /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/execl.exp ...

Here this test has finished, the remote gdbserver terminates, and the
next test is about to be started.

    Executing on host: mips64el-linux-gnu-gcc /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/testsuite/gdb.threads/execl.c  [...]
    Killing all inferiors
    Segmentation fault

Instead of terminating, the remote gdbserver crashed with a segfault.

    (gdb) bt
    #0  0x1003b068 in thread_db_mourn (proc=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/thread-db.c:889
    #1  0x1002bc3c in linux_mourn (process=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/linux-low.c:896
    #2  0x1001085c in handle_v_cont (own_buf=0x555800d0 "W00") at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1785
    #3  0x10011154 in handle_v_requests (own_buf=0x555800d0 "W00", packet_len=7, new_packet_len=0x7faf0d64)
        at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1976
    #4  0x100148e8 in process_serial_event () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:3048
    #5  0x10014a9c in handle_serial_event (err=0, client_data=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:3093
    #6  0x1001bd70 in handle_file_event (event_file_desc=4) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:488
    #7  0x1001b134 in process_event () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:244
    #8  0x1001c2e0 in start_event_loop () at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/event-loop.c:606
    #9  0x1001303c in main (argc=3, argv=0x7faf1040) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:2589
    (gdb) x/i $pc
    => 0x1003b068 <thread_db_mourn+36>:     lw      v0,40(v0)
    (gdb) info registers v0
    v0: 0x0
    (gdb) frame 0
    #0  0x1003b068 in thread_db_mourn (proc=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/thread-db.c:889
    889       struct thread_db *thread_db = proc->private->thread_db;
    (gdb) list
    884     /* Disconnect from libthread_db and free resources.  */
    885
    886     void
    887     thread_db_mourn (struct process_info *proc)
    888     {
    889       struct thread_db *thread_db = proc->private->thread_db;
    890       if (thread_db)
    891         {
    892           td_err_e (*td_ta_delete_p) (td_thragent_t *);
    893
    Dump of assembler code for function thread_db_mourn:
    888     {
       0x1003b044 <+0>:     addiu   sp,sp,-64
       0x1003b048 <+4>:     sd      ra,56(sp)
       0x1003b04c <+8>:     sd      s8,48(sp)
       0x1003b050 <+12>:    sd      gp,40(sp)
       0x1003b054 <+16>:    move    s8,sp
       0x1003b058 <+20>:    lui     gp,0x1006
       0x1003b05c <+24>:    addiu   gp,gp,368
       0x1003b060 <+28>:    sw      a0,16(s8)
    
    889       struct thread_db *thread_db = proc->private->thread_db;
       0x1003b064 <+32>:    lw      v0,16(s8)
    => 0x1003b068 <+36>:    lw      v0,40(v0)
       0x1003b06c <+40>:    lw      v0,4(v0)
       0x1003b070 <+44>:    sw      v0,0(s8)
    [...]
    (gdb) print proc
    $1 = (struct process_info *) 0x0
    (gdb) # right, proc = 0 as bt already told us...
    (gdb) frame 1
    #1  0x1002bc3c in linux_mourn (process=0x0) at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/linux-low.c:896
    896       thread_db_mourn (process);
    (gdb) list
    891     linux_mourn (struct process_info *process)
    892     {
    893       struct process_info_private *priv;
    894
    895     #ifdef USE_THREAD_DB
    896       thread_db_mourn (process);
    897     #endif
    898
    899       find_inferior (&all_lwps, delete_lwp_callback, process);
    900
    (gdb) frame 2
    #2  0x1001085c in handle_v_cont (own_buf=0x555800d0 "W00") at /scratch/thomas/issue8927-FM_mips64el-linux-gnu/src/gdb-mainline/gdb/gdbserver/server.c:1785
    1785            mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
    (gdb) list
    1780          prepare_resume_reply (own_buf, last_ptid, &last_status);
    1781          disable_async_io ();
    1782
    1783          if (last_status.kind == TARGET_WAITKIND_EXITED
    1784              || last_status.kind == TARGET_WAITKIND_SIGNALLED)
    1785            mourn_inferior (find_process_pid (ptid_get_pid (last_ptid)));
    1786        }
    1787      return;
    1788
    1789    err:

So, find_process_pid returns NULL (0) where gdbserver doesn't expect it
to.  As I don't know this code, I can't easily tell whether that is a
gdbserver bug (that is, in the new mourn_inferior code you added) or
another problem, or how it should be tackled.  Any suggestions?
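
For illustration only, here is a minimal standalone sketch of the failure
pattern in the backtrace and of one defensive way around it: look the
process up first, and mourn only if the lookup succeeds.  None of this is
gdbserver code.  The names process_table, lookup_process, mourn and
mourn_if_exited, and the enum wait_kind, are hypothetical stand-ins for
find_process_pid, mourn_inferior and the TARGET_WAITKIND_* values.
Whether a guard like this is the right fix, or whether it would merely
hide the reason the process entry is already gone, is exactly the open
question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gdbserver's types; not the real ones.  */
enum wait_kind { KIND_STOPPED, KIND_EXITED, KIND_SIGNALLED };

struct process_info
{
  int pid;
  int mourned;
};

/* Toy process table.  A slot with pid 0 is treated as empty.  */
struct process_info process_table[4];

/* Return the entry for PID, or NULL if there is none.  The NULL case
   models the proc=0x0 seen in the backtrace.  */
struct process_info *
lookup_process (int pid)
{
  size_t i;

  for (i = 0; i < sizeof process_table / sizeof process_table[0]; i++)
    if (process_table[i].pid != 0 && process_table[i].pid == pid)
      return &process_table[i];
  return NULL;
}

/* Stand-in for the mourn routine.  Like thread_db_mourn, it
   dereferences PROC, so it must never be handed NULL.  */
void
mourn (struct process_info *proc)
{
  assert (proc != NULL);
  proc->mourned = 1;
  proc->pid = 0;	/* Remove the entry from the table.  */
}

/* Mourn only on an exit event, and only if the process is still
   listed.  Return 1 if something was mourned, 0 otherwise.  */
int
mourn_if_exited (enum wait_kind kind, int pid)
{
  struct process_info *proc;

  if (kind != KIND_EXITED && kind != KIND_SIGNALLED)
    return 0;

  proc = lookup_process (pid);
  if (proc == NULL)
    return 0;		/* Already gone; nothing to mourn.  */

  mourn (proc);
  return 1;
}
```

With a single entry for pid 42 in the table, a first
mourn_if_exited (KIND_EXITED, 42) mourns it, and a second call for the
same pid finds nothing and returns 0 instead of passing NULL down to
mourn.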


Regards,
 Thomas
