[Bug threads/25478] gdb hangs after external SIGKILL

public inbox for gdb-prs@sourceware.org

* [Bug threads/25478] gdb hangs after external SIGKILL
       [not found] <bug-25478-4717@http.sourceware.org/bugzilla/>
@ 2020-05-14 12:16 ` cvs-commit at gcc dot gnu.org
  2020-11-06 22:42 ` vries at gcc dot gnu.org
  1 sibling, 0 replies; 2+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-05-14 12:16 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25478

--- Comment #15 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tankut Baris Aktemur
<aktemur@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a05575d39a5348bd9979fc09e658a03ff22722b9

commit a05575d39a5348bd9979fc09e658a03ff22722b9
Author: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
Date:   Thu May 14 13:59:54 2020 +0200

    gdb/infrun: handle already-exited threads when attempting to stop

    In stop_all_threads, GDB sends signals to other threads in an attempt
    to stop them.  While in a typical scenario the expected wait status is
    TARGET_WAITKIND_STOPPED, it is possible that the thread GDB attempted
    to stop has already terminated.  If so, a waitstatus other than
    TARGET_WAITKIND_STOPPED would be received.  Handle this case
    appropriately.

    If a wait status that denotes thread termination is ignored, GDB goes
    into an infinite loop in stop_all_threads.
    E.g.:

      $ gdb ./a.out
      (gdb) start
      ...
      (gdb) add-inferior -exec ./a.out
      ...
      (gdb) inferior 2
      ...
      (gdb) start
      ...
      (gdb) set schedule-multiple on
      (gdb) set debug infrun 2
      (gdb) continue
      Continuing.
      infrun: clear_proceed_status_thread (process 10449)
      infrun: clear_proceed_status_thread (process 10453)
      infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
      infrun: proceed: resuming process 10449
      infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10449] at 0x55555555514e
      infrun: infrun_async(1)
      infrun: prepare_to_wait
      infrun: proceed: resuming process 10453
      infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 10453] at 0x55555555514e
      infrun: prepare_to_wait
      infrun: Found 2 inferiors, starting at #0
      infrun: target_wait (-1.0.0, status) =
      infrun:   10449.10449.0 [process 10449],
      infrun:   status->kind = exited, status = 0
      infrun: handle_inferior_event status->kind = exited, status = 0
      [Inferior 1 (process 10449) exited normally]
      infrun: stop_waiting
      infrun: stop_all_threads
      infrun: stop_all_threads, pass=0, iterations=0
      infrun:   process 10453 executing, need stop
      infrun: target_wait (-1.0.0, status) =
      infrun:   10453.10453.0 [process 10453],
      infrun:   status->kind = exited, status = 0
      infrun: stop_all_threads status->kind = exited, status = 0 process 10453
      infrun:   process 10453 executing, already stopping
      infrun: target_wait (-1.0.0, status) =
      infrun:   -1.0.0 [process -1],
      infrun:   status->kind = no-resumed
      infrun: infrun_async(0)
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      infrun: stop_all_threads status->kind = no-resumed process -1
      infrun:   process 10453 executing, already stopping
      ...

    And this polling goes on forever.  This patch prevents the infinite
    looping behavior.  For the same scenario above, we obtain the
    following behavior:

      ...
      (gdb) continue
      Continuing.
      infrun: clear_proceed_status_thread (process 31229)
      infrun: clear_proceed_status_thread (process 31233)
      infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
      infrun: proceed: resuming process 31229
      infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31229] at 0x55555555514e
      infrun: infrun_async(1)
      infrun: prepare_to_wait
      infrun: proceed: resuming process 31233
      infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [process 31233] at 0x55555555514e
      infrun: prepare_to_wait
      infrun: Found 2 inferiors, starting at #0
      infrun: target_wait (-1.0.0, status) =
      infrun:   31229.31229.0 [process 31229],
      infrun:   status->kind = exited, status = 0
      infrun: handle_inferior_event status->kind = exited, status = 0
      [Inferior 1 (process 31229) exited normally]
      infrun: stop_waiting
      infrun: stop_all_threads
      infrun: stop_all_threads, pass=0, iterations=0
      infrun:   process 31233 executing, need stop
      infrun: target_wait (-1.0.0, status) =
      infrun:   31233.31233.0 [process 31233],
      infrun:   status->kind = exited, status = 0
      infrun: stop_all_threads status->kind = exited, status = 0 process 31233
      infrun: saving status status->kind = exited, status = 0 for 31233.31233.0
      infrun:   process 31233 not executing
      infrun: stop_all_threads, pass=1, iterations=1
      infrun:   process 31233 not executing
      infrun: stop_all_threads done
      (gdb)

    The exit event from Inferior 1 is received and shown to the user.
    The exit event from Inferior 2 is not displayed, but kept pending.

      (gdb) info inferiors
        Num  Description       Connection           Executable
      * 1    <null>                                 a.out
        2    process 31233     1 (native)           a.out
      (gdb) inferior 2
      [Switching to inferior 2 [process 31233] (a.out)]
      [Switching to thread 2.1 (process 31233)]
      Couldn't get registers: No such process.
      (gdb) continue
      Continuing.
      infrun: clear_proceed_status_thread (process 31233)
      infrun: clear_proceed_status_thread: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
      infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
      infrun: proceed: resuming process 31233
      infrun: resume: thread process 31233 has pending wait status status->kind = exited, status = 0 (currently_stepping=0).
      infrun: prepare_to_wait
      infrun: Using pending wait status status->kind = exited, status = 0 for process 31233.
      infrun: target_wait (-1.0.0, status) =
      infrun:   31233.31233.0 [process 31233],
      infrun:   status->kind = exited, status = 0
      infrun: handle_inferior_event status->kind = exited, status = 0
      [Inferior 2 (process 31233) exited normally]
      infrun: stop_waiting
      (gdb) info inferiors
        Num  Description       Connection           Executable
        1    <null>                                 a.out
      * 2    <null>                                 a.out
      (gdb)
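
The difference between the looping trace and the fixed trace can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not GDB source: `Thread`, `Event`, `WaitKind`, `stop_all_threads_model` and the `max_iterations` cap are all hypothetical names invented for illustration, and the real stop_all_threads does considerably more work. The model only shows why dropping an "exited" status leaves a thread marked as executing forever, and why saving it as a pending status lets the loop finish.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy model only: none of these types exist in GDB.
enum class WaitKind { Stopped, Exited, NoResumed };

struct Event {
  int pid;        // -1 when no thread is associated (no-resumed)
  WaitKind kind;
};

struct Thread {
  int pid;
  bool executing = true;
  bool stop_requested = false;
  bool has_pending_exit = false;  // exit status kept for later reporting
};

// Returns the number of wait iterations needed to see every thread stopped,
// or -1 if max_iterations is reached (the model's stand-in for a hang).
int stop_all_threads_model(std::vector<Thread> &threads,
                           std::deque<Event> events,
                           bool handle_exit_statuses,
                           int max_iterations = 50)
{
  assert(max_iterations > 0);
  for (int iterations = 0; iterations < max_iterations; ++iterations) {
    bool need_wait = false;
    for (Thread &t : threads)
      if (t.executing) {
        t.stop_requested = true;  // "executing, need stop"
        need_wait = true;
      }
    if (!need_wait)
      return iterations;          // "stop_all_threads done"

    // Once the queue is drained, the target only reports no-resumed.
    Event ev{-1, WaitKind::NoResumed};
    if (!events.empty()) {
      ev = events.front();
      events.pop_front();
    }

    for (Thread &t : threads) {
      if (t.pid != ev.pid)
        continue;
      if (ev.kind == WaitKind::Stopped) {
        t.executing = false;
        t.stop_requested = false;
      } else if (ev.kind == WaitKind::Exited && handle_exit_statuses) {
        // Fixed behavior: keep the status pending, stop waiting on the thread.
        t.has_pending_exit = true;
        t.executing = false;
        t.stop_requested = false;
      }
      // Otherwise the event is dropped, as in the looping trace.
    }
  }
  return -1;
}
```

Called with a single thread for process 10453 and one queued exited event, the model hits the iteration cap when `handle_exit_statuses` is false and finishes after one iteration when it is true, which mirrors "pass=1, iterations=1" in the fixed trace.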

    When a process exits and we leave the process exit event pending, we
    need to make sure that at least one thread is left listed in the
    inferior's thread list.  This is necessary in order to make sure we
    have a thread that we can later resume, so the process exit event can
    be collected/reported.

    When native debugging, the GNU/Linux back end already makes sure that
    the last LWP isn't deleted.

    When remote debugging against GNU/Linux GDBserver, the GNU/Linux
    GDBserver backend also makes sure that the last thread isn't deleted
    until the process exit event is reported to GDBserver core.

    However, between the backend reporting the process exit event to
    GDBserver core, and GDB consuming the event, GDB may update the thread
    list and find no thread left in the process.  The process exit event
    will be pending somewhere in GDBserver's stop reply queue, or
    gdb/remote.c's queue, or whatever other event queue in between
    GDBserver and infrun.c's handle_inferior_event.

    This patch tweaks remote.c's target_update_thread_list implementation
    to avoid deleting the last thread of an inferior.
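
A rough sketch of the "never delete the last thread" rule follows. Only the helper name has_single_non_exited_thread comes from the ChangeLog; its body here, the `ThreadInfo` and `State` types, and `update_thread_list_model` are assumptions made for illustration and are not the code in remote.c.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model only: these types are not GDB's thread_info.
enum class State { Live, Exited };

struct ThreadInfo {
  int pid;
  int tid;
  State state = State::Live;
};

// Name taken from the ChangeLog; the body is a guess at the intent:
// true when exactly one thread of PID has not exited yet.
bool has_single_non_exited_thread(const std::vector<ThreadInfo> &threads,
                                  int pid)
{
  int live = 0;
  for (const ThreadInfo &t : threads)
    if (t.pid == pid && t.state != State::Exited)
      ++live;
  return live == 1;
}

// Mark threads missing from the remote side's list as exited, except that
// the last remaining thread of each process is always kept, so a pending
// process exit event still has a thread to be reported against.
void update_thread_list_model(std::vector<ThreadInfo> &threads,
                              const std::set<int> &remote_tids)
{
  for (ThreadInfo &t : threads) {
    if (t.state == State::Exited)
      continue;
    if (remote_tids.count(t.tid) != 0)
      continue;                       // still listed by the remote side
    if (has_single_non_exited_thread(threads, t.pid))
      continue;                       // last thread of the process: keep it
    t.state = State::Exited;
  }
}
```

With two threads in process 100 and one thread in process 200, and an empty list from the remote side, the model retires one thread of process 100 and keeps one thread per process.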

    In the past, this case of inferior-with-no-threads led to a special
    case at the bottom of handle_no_resumed, where it reads:

      /* Note however that we may find no resumed thread because the whole
         process exited meanwhile (thus updating the thread list results
         in an empty thread list).  In this case we know we'll be getting
         a process exit event shortly.  */
      for (inferior *inf : all_non_exited_inferiors (ecs->target))

    In current master, that code path is still reachable with the
    gdb.threads/continue-pending-after-query.exp testcase, when tested
    against GDBserver, with "maint set target-non-stop" forced "on".

    With this patch, the scenario that loop was concerned about is still
    properly handled, because the loop above it finds the process's last
    thread with "executing" set to true, and thus the handle_no_resumed
    function still returns true.

    Since GNU/Linux native and remote are the only targets that support
    non-stop mode, and with this patch, we always make sure the inferior
    has at least one thread, this patch also removes that "inferior with
    no threads" special case handling from handle_no_resumed.

    Since remote.c now has a special case where we treat a thread that has
    already exited as if it was still alive, we might need to tweak
    remote.c's target_thread_alive implementation to return true for that
    thread without querying the remote side (which would say "no, not
    alive").  After inspecting all the target_thread_alive calls in the
    codebase, it seems that only the one from prune_threads could result
    in that thread being accidentally deleted.  There's only one call to
    prune_threads in GDB's common code, so this patch handles this by
    replacing the prune_threads call with a delete_exited_threads call.
    This seems like an improvement anyway, because we'll still be doing
    what the comment suggests we want to do, and we avoid remote protocol
    traffic.
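
The reasoning about prune_threads versus delete_exited_threads can be sketched as below. This is a simplified model, not GDB's implementation: `Thr`, `ThreadState`, both `*_model` functions and the `alive` callback are hypothetical, and the callback merely stands in for a target_thread_alive query to the remote side.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy model only: not GDB's thread list.
enum class ThreadState { Live, Exited };

struct Thr {
  int tid;
  ThreadState state;
};

// prune-style cleanup: ask the target about every thread and drop the ones
// it reports as dead.  Returns how many target queries were issued.
int prune_threads_model(std::vector<Thr> &threads,
                        const std::function<bool(int)> &alive)
{
  int queries = 0;
  std::vector<Thr> kept;
  for (const Thr &t : threads) {
    ++queries;
    if (alive(t.tid))
      kept.push_back(t);
  }
  threads = kept;
  return queries;
}

// delete-exited-style cleanup: drop only threads already marked exited
// locally; the target is never queried.
void delete_exited_threads_model(std::vector<Thr> &threads)
{
  std::vector<Thr> kept;
  for (const Thr &t : threads)
    if (t.state != ThreadState::Exited)
      kept.push_back(t);
  threads = kept;
}
```

When the remote side would answer "not alive" for a process's retained last thread, the prune-style model deletes it after one query, while the delete-exited-style model keeps it and issues no queries.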

    Regression-tested on X86_64 Linux.

    gdb/ChangeLog:
    2020-05-14  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
                Tom de Vries  <tdevries@suse.de>
                Pedro Alves  <palves@redhat.com>

            PR threads/25478
            * infrun.c (stop_all_threads): Do NOT ignore
            TARGET_WAITKIND_NO_RESUMED, TARGET_WAITKIND_THREAD_EXITED,
            TARGET_WAITKIND_EXITED, TARGET_WAITKIND_SIGNALLED wait statuses
            received.
            (handle_no_resumed): Remove code handling a live inferior with no
            threads.
            * remote.c (has_single_non_exited_thread): New.
            (remote_target::update_thread_list): Do not delete a thread if it
            is the last thread of the process.
            * thread.c (thread_select): Call delete_exited_threads instead of
            prune_threads.

    gdb/testsuite/ChangeLog:
    2020-05-14  Tankut Baris Aktemur  <tankut.baris.aktemur@intel.com>
                Pedro Alves  <palves@redhat.com>

            * gdb.multi/multi-exit.c: New file.
            * gdb.multi/multi-exit.exp: New file.
            * gdb.multi/multi-kill.c: New file.
            * gdb.multi/multi-kill.exp: New file.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug threads/25478] gdb hangs after external SIGKILL
       [not found] <bug-25478-4717@http.sourceware.org/bugzilla/>
  2020-05-14 12:16 ` [Bug threads/25478] gdb hangs after external SIGKILL cvs-commit at gcc dot gnu.org
@ 2020-11-06 22:42 ` vries at gcc dot gnu.org
  1 sibling, 0 replies; 2+ messages in thread
From: vries at gcc dot gnu.org @ 2020-11-06 22:42 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=25478

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.1
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #16 from Tom de Vries <vries at gcc dot gnu.org> ---
Tested gdb 10.1 build with original test-case, no issues found, marking
resolved-fixed.
