[Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state

public inbox for gdb-prs@sourceware.org

* [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state
@ 2015-03-15  7:56 palves at redhat dot com
  2015-05-05 15:31 ` [Bug threads/18127] " richard_sharman at mitel dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: palves at redhat dot com @ 2015-03-15  7:56 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

            Bug ID: 18127
           Summary: threads spawned by infcall end up stuck in "running"
                    state
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: threads
          Assignee: unassigned at sourceware dot org
          Reporter: palves at redhat dot com

Ref: https://sourceware.org/ml/gdb/2015-03/msg00033.html

Calling a function that spawns new threads results in the new threads getting
stuck in "running" state.

On GNU/Linux, and a trivial program that has:

~~~
void
start_thread (void)
{
  pthread_t thread;

  pthread_create (&thread, NULL, thread_function, NULL);
}
~~~

calling that from GDB results in:

(gdb) p start_thread ()
[New Thread 0x7ffff7fc1700 (LWP 9903)]
$1 = void
(gdb) info threads
  Id   Target Id         Frame
  2    Thread 0x7ffff7fc1700 (LWP 9903) "start-thread-in" (running)
* 1    Thread 0x7ffff7fc2740 (LWP 9899) "start-thread-in" main () at start-thread-infcall.c:35
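
For anyone who wants to try this, a fuller version of the program might look
like the sketch below.  Only start_thread comes from the report; the body of
thread_function, the thread_started flag, and the choice to keep the new
thread alive in a sleep loop are assumptions, made so that the thread still
exists when "info threads" runs.  A main with a line to break on before
issuing "p start_thread ()" is left out.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed helper flag (not in the report): set once the new thread
   has actually started running.  */
static atomic_int thread_started;

/* Assumed body: the report does not show thread_function.  The thread
   stays alive so that it is still listed by "info threads".  */
static void *
thread_function (void *arg)
{
  (void) arg;
  atomic_store (&thread_started, 1);
  for (;;)
    sleep (1);
}

/* As in the report: spawn a thread and return without joining it.  */
void
start_thread (void)
{
  pthread_t thread;

  pthread_create (&thread, NULL, thread_function, NULL);
}
```

Building with -g and -pthread should be enough to debug it under GDB.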

-- 
You are receiving this mail because:
You are on the CC list for the bug.



* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
@ 2015-05-05 15:31 ` richard_sharman at mitel dot com
  2015-05-05 17:07 ` palves at redhat dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: richard_sharman at mitel dot com @ 2015-05-05 15:31 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

Richard Sharman <richard_sharman at mitel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |richard_sharman at mitel dot com

--- Comment #1 from Richard Sharman <richard_sharman at mitel dot com> ---
I have seen something similar: when stopped at a breakpoint and printing a
variable that has a [Python] pretty printer (with scheduler-locking off),
quite often other threads run before the printing is finished.  If any of
these threads create other threads, then the newly created threads (and, for
some reason, some existing threads) appear as running.


The following shows a normal "info thread" with all threads stopped.  After
"p msg" [msg is a structure that contains a pretty-printed field], "info
thread" shows many threads as running, and not just the new ones.  At this
point we cannot continue.


(gdb) c
Continuing.
[New Thread 0xf70e3930 (LWP 7456)]
[Switching to Thread 0xf71a67b0 (LWP 7427)]

Breakpoint 1, ccsend (msg=...) at cgen/message0.cc:498
(gdb) info thread
  Id   Target Id         Frame 
  19   Thread 0xf70e3930 (LWP 7456) "McastAudit" 0xf7ffd430 in __kernel_vsyscall ()
  18   Thread 0xf7105930 (LWP 7452) "McastAudit" 0xf7ffd430 in __kernel_vsyscall ()
  17   Thread 0xf70ed7b0 (LWP 7453) "auditmgr       " 0xf7ffd430 in __kernel_vsyscall ()
  16   Thread 0xf7134930 (LWP 7451) "McastAudit" 0xf7ffd430 in __kernel_vsyscall ()
  15   Thread 0xf078d7b0 (LWP 7447) "debug_term     " 0xf7ffd430 in __kernel_vsyscall ()
  14   Thread 0xf71527b0 (LWP 7448) "auditwork      " 0xf7ffd430 in __kernel_vsyscall ()
  13   Thread 0xf7148930 (LWP 7449) "tSsuSrvHS" 0x0072b505 in dl_open_worker () from /lib/ld-linux.so.2
  12   Thread 0xf713e930 (LWP 7450) "McastAudit" 0xf7ffd430 in __kernel_vsyscall ()
  11   Thread 0xf07977b0 (LWP 7446) "maint_term     " 0xf7ffd430 in __kernel_vsyscall ()
  10   Thread 0xf7ffa7b0 (LWP 7441) "dbgmtterm      " 0xf7ffd430 in __kernel_vsyscall ()
  9    Thread 0xf719c7b0 (LWP 7442) "cmsgsysin      " 0xf7ffd430 in __kernel_vsyscall ()
  8    Thread 0xf71667b0 (LWP 7443) "auditmgr       " 0xf7ffd430 in __kernel_vsyscall ()
  7    Thread 0xf715c930 (LWP 7444) "tMainApplQ" 0xf7ffd430 in __kernel_vsyscall ()
  6    Thread 0xf13bc7b0 (LWP 7423) "dispatch       " 0xf7ffd430 in __kernel_vsyscall ()
  5    Thread 0xf7ff07b0 (LWP 7424) "msgtimer       " DynArray<Array<unsigned short, 0, 3> >::operator[] (this=0x99f2c46 <msgtimer_allocation>, index=4262) at /home/gx5000/sharman/mpascaldir/d08/x86-linux/Array.h:180
  4    Thread 0xf7fe67b0 (LWP 7425) "guardian       " 0xf7ffd430 in __kernel_vsyscall ()
  3    Thread 0xf71b07b0 (LWP 7426) "cleanup        " 0xf7ffd430 in __kernel_vsyscall ()
* 2    Thread 0xf71a67b0 (LWP 7427) "sysinit        " ccsend (msg=...) at cgen/message0.cc:498
  1    Thread 0xf7741a30 (LWP 7412) "cc" 0xf7ffd430 in __kernel_vsyscall ()
(gdb) p msg
$4 = (message &) @0xf71a5a3a: {control_byte = {msg_type = reg_msg, redun =
{sdrd = plane_0, rxrd = plane_0}}, tx_node = {
    group = 0 '\000', level = maincpu, {{subsystem_id = message_switch,
upper_id_byte = 0 '\000', lower_id_byte = 0 '\000'}, {
        cntrlr_no = 0 '\000', card_no = 0 '\000', circuit_no = 0 '\000'}}},
rx_node = {group = 0 '\000', level = maincpu, {{
        subsystem_id = message_switch, upper_id_byte = 0 '\000', lower_id_byte
= 0 '\000'}, {cntrlr_no = 0 '\000', 
        card_no = 0 '\000', circuit_no = 0 '\000'}}}, enter_lld_when_received =
0 '\000', [New Thread 0xf70d9930 (LWP 7457)]

  tx_sw = 0x70006  (7,6)        "sysinit   ", [New Thread 0xf70c5930 (LWP
7459)]
[New Thread 0xf70cf930 (LWP 7458)]
rx_sw = 0x7000d  (7,13) "auditmgr  ", tx_applic_id = nil_applic, 
  function_code = 13 '\r', data = {storage = "\002", '\000' <repeats 18
times>}, checksum = 0 '\000', icb = 0 '\000'}
(gdb) info thread
  Id   Target Id         Frame 
  22   Thread 0xf70cf930 (LWP 7458) "McastAudit" 0xf7ffd430 in __kernel_vsyscall ()
  21   Thread 0xf70c5930 (LWP 7459) "SsuPeerSrvI" 0xf7ffd430 in __kernel_vsyscall ()
  20   Thread 0xf70d9930 (LWP 7457) "McastAudit" (running)
  19   Thread 0xf70e3930 (LWP 7456) "McastAudit" (running)
  18   Thread 0xf7105930 (LWP 7452) "McastAudit" (running)
  17   Thread 0xf70ed7b0 (LWP 7453) "auditmgr       " (running)
  16   Thread 0xf7134930 (LWP 7451) "McastAudit" (running)
  15   Thread 0xf078d7b0 (LWP 7447) "debug_term     " (running)
  14   Thread 0xf71527b0 (LWP 7448) "auditwork      " (running)
  13   Thread 0xf7148930 (LWP 7449) "tSsuSrvHS" (running)
  12   Thread 0xf713e930 (LWP 7450) "McastAudit" (running)
  11   Thread 0xf07977b0 (LWP 7446) "maint_term     " (running)
  10   Thread 0xf7ffa7b0 (LWP 7441) "dbgmtterm      " (running)
  9    Thread 0xf719c7b0 (LWP 7442) "cmsgsysin      " (running)
  8    Thread 0xf71667b0 (LWP 7443) "auditmgr       " (running)
  7    Thread 0xf715c930 (LWP 7444) "tMainApplQ" (running)
  6    Thread 0xf13bc7b0 (LWP 7423) "dispatch       " (running)
  5    Thread 0xf7ff07b0 (LWP 7424) "msgtimer       " (running)
  4    Thread 0xf7fe67b0 (LWP 7425) "guardian       " (running)
  3    Thread 0xf71b07b0 (LWP 7426) "cleanup        " (running)
* 2    Thread 0xf71a67b0 (LWP 7427) "sysinit        " (running)
  1    Thread 0xf7741a30 (LWP 7412) "cc" (running)
(gdb) c
Continuing.
Cannot execute this command while the selected thread is running.
(gdb)


* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
  2015-05-05 15:31 ` [Bug threads/18127] " richard_sharman at mitel dot com
@ 2015-05-05 17:07 ` palves at redhat dot com
  2015-05-05 18:05 ` richard.sharman at mitel dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: palves at redhat dot com @ 2015-05-05 17:07 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

--- Comment #2 from Pedro Alves <palves at redhat dot com> ---
Thanks.   

> quite often other threads run before the printing is finished

That means the pretty printer is calling functions in the inferior, which then
ends up being the same problem.

The workaround is to switch to one of the threads that GDB knows is stopped
(e.g., thread 22 in your case) and continue that one, or do "stepi" -- when
that step finishes, the threads' states sync up.


* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
  2015-05-05 15:31 ` [Bug threads/18127] " richard_sharman at mitel dot com
  2015-05-05 17:07 ` palves at redhat dot com
@ 2015-05-05 18:05 ` richard.sharman at mitel dot com
  2015-06-10 15:33 ` eliz at gnu dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: richard.sharman at mitel dot com @ 2015-05-05 18:05 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

--- Comment #3 from richard.sharman at mitel dot com ---
Thanks - I thought of that workaround after I'd sent mail.
On a subsequent run all threads were marked as stopped!

I forgot to mention the version of gdb I was running;  it was 7.9.

Richard



* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
                   ` (2 preceding siblings ...)
  2015-05-05 18:05 ` richard.sharman at mitel dot com
@ 2015-06-10 15:33 ` eliz at gnu dot org
  2015-06-29 15:55 ` cvs-commit at gcc dot gnu.org
  2015-06-29 18:58 ` palves at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: eliz at gnu dot org @ 2015-06-10 15:33 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

Eli Zaretskii <eliz at gnu dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eliz at gnu dot org

--- Comment #4 from Eli Zaretskii <eliz at gnu dot org> ---
On MS-Windows during native MinGW debugging, this issue, when it happens, makes
the debugging session unusable.  MinGW native debugging doesn't support async
execution, and therefore there's no command to stop the threads that GDB
considers "running", nor one to help GDB re-synchronize its notion of thread states
with the actual situation (which of course is that the threads are all
suspended by the OS).

Unlike in the examples brought here from Unix and GNU systems, I see this on
Windows when I call functions from the inferior.  Those functions don't start
any threads; the threads that trigger the problem are started by Windows for
reasons unknown to me.  And because in Windows native debugging the set_running
function is called with minus_one_ptid, it marks all the threads as running.

This is an acute problem that needs to be solved at least for the above
configuration.
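
The effect of a wildcard set_running can be pictured with a toy model.
Everything in the sketch below (toy_thread, toy_set_running, TOY_WILDCARD) is
hypothetical and only mimics the behaviour described above; it is not GDB's
actual code or data structures.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GDB's notion of a thread's state.  */
enum toy_state { TOY_STOPPED, TOY_RUNNING };

struct toy_thread
{
  int lwp;
  enum toy_state state;
};

/* Plays the role of a wildcard ptid: it matches every thread.  */
#define TOY_WILDCARD (-1)

/* Mark every thread matching LWP as running and return how many
   threads were marked.  With TOY_WILDCARD, all of them are.  */
static size_t
toy_set_running (struct toy_thread *threads, size_t n, int lwp)
{
  size_t marked = 0;

  for (size_t i = 0; i < n; i++)
    if (lwp == TOY_WILDCARD || threads[i].lwp == lwp)
      {
        threads[i].state = TOY_RUNNING;
        marked++;
      }
  return marked;
}
```

If nothing later marks the threads as stopped again, every entry stays in
TOY_RUNNING even though the OS has suspended them all.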


* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
                   ` (3 preceding siblings ...)
  2015-06-10 15:33 ` eliz at gnu dot org
@ 2015-06-29 15:55 ` cvs-commit at gcc dot gnu.org
  2015-06-29 18:58 ` palves at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2015-06-29 15:55 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

--- Comment #5 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pedro Alves <palves@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=28bf096c62d7da6b349605f3940f4c586a850f78

commit 28bf096c62d7da6b349605f3940f4c586a850f78
Author: Pedro Alves <palves@redhat.com>
Date:   Mon Jun 29 16:07:57 2015 +0100

    PR threads/18127 - threads spawned by infcall end up stuck in "running" state

    Refs:
     https://sourceware.org/ml/gdb/2015-03/msg00024.html
     https://sourceware.org/ml/gdb/2015-06/msg00005.html

    On GNU/Linux, if an infcall spawns a thread, that thread ends up
    stuck in "running" state.  This happens because:

     - when linux-nat.c detects a new thread, it marks it as running,
       and does not report anything to the core.

     - we skip finish_thread_state when the thread that is running the
       infcall stops.

    As a result, that new thread ends up stuck in "running" state, even
    though it really is stopped.

    On Windows, _all_ threads end up stuck in running state, not just the
    one that was spawned.  That happens because when a new thread is
    detected, unlike linux-nat.c, windows-nat.c reports
    TARGET_WAITKIND_SPURIOUS to infrun.  It's the fact that that event
    does not cause a user-visible stop that triggers the problem.  When
    the target is re-resumed, we call set_running with a wildcard ptid,
    which marks all threads as running.  That set_running is not suppressed
    because the (leader) thread being resumed does not have in_infcall
    set.  Later, when the infcall finally finishes successfully, nothing
    marks all threads back to stopped.

    We can trigger the same problem on all targets by having a thread
    other than the one that is running the infcall report a breakpoint hit
    to infrun, and then have that breakpoint not cause a stop.  That's
    what the included test does.

    The fix is to stop GDB from suppressing the set_running calls while
    doing an infcall, and then set the threads back to stopped when the
    call finishes, iff they were originally stopped before the infcall
    started.  (Note the MI *running/*stopped event suppression isn't
    affected.)

    Tested on x86_64 GNU/Linux.

    gdb/ChangeLog:
    2015-06-29  Pedro Alves  <palves@redhat.com>

        PR threads/18127
        * infcall.c (run_inferior_call): On infcall success, if the thread
        was marked stopped before, reset it back to stopped.
        * infrun.c (resume): Don't suppress the set_running calls when
        doing an infcall.
        (normal_stop): Only discard the finish_thread_state cleanup if the
        infcall succeeded.

    gdb/testsuite/ChangeLog:
    2015-06-29  Pedro Alves  <palves@redhat.com>

        PR threads/18127
        * gdb.threads/hand-call-new-thread.c: New file.
        * gdb.threads/hand-call-new-thread.exp: New file.
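
The bookkeeping described in the commit message can be sketched as a toy
model.  All names below (model_inferior, model_infcall, and so on) are
hypothetical, and the code is not taken from infcall.c or infrun.c.  It
assumes a simulated infcall that resumes every thread and spawns one new
thread; with restore_on_finish false it loosely reproduces the Windows-style
symptom, and with it true it follows the idea of the fix, marking threads
stopped again only if they were stopped before the call.  Treating the newly
spawned thread as stopped afterwards is also an assumption of the model.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_STOPPED, MODEL_RUNNING };

#define MODEL_MAX_THREADS 8

/* Hypothetical record of what the debugger believes about each thread.  */
struct model_inferior
{
  enum model_state state[MODEL_MAX_THREADS];
  size_t count;
};

/* Resuming the target marks every known thread as running.  */
static void
model_resume_all (struct model_inferior *inf)
{
  for (size_t i = 0; i < inf->count; i++)
    inf->state[i] = MODEL_RUNNING;
}

/* A thread detected while the target runs is recorded as running.  */
static void
model_add_running_thread (struct model_inferior *inf)
{
  if (inf->count < MODEL_MAX_THREADS)
    inf->state[inf->count++] = MODEL_RUNNING;
}

/* Simulate an infcall that spawns one thread.  When the call returns,
   the target is really stopped again.  */
static void
model_infcall (struct model_inferior *inf, bool restore_on_finish)
{
  bool was_stopped[MODEL_MAX_THREADS];
  size_t before = inf->count;

  /* Remember which threads were stopped before the call started.  */
  for (size_t i = 0; i < before; i++)
    was_stopped[i] = inf->state[i] == MODEL_STOPPED;

  model_resume_all (inf);
  model_add_running_thread (inf);

  /* Buggy path: nothing marks the threads as stopped again.  */
  if (!restore_on_finish)
    return;

  /* Fixed path: new threads, and threads that were stopped before
     the call, go back to stopped.  */
  for (size_t i = 0; i < inf->count; i++)
    if (i >= before || was_stopped[i])
      inf->state[i] = MODEL_STOPPED;
}
```

In this model a thread that was already marked running before the call is
left alone, which matches the "iff they were originally stopped" condition.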


* [Bug threads/18127] threads spawned by infcall end up stuck in "running" state
  2015-03-15  7:56 [Bug threads/18127] New: threads spawned by infcall end up stuck in "running" state palves at redhat dot com
                   ` (4 preceding siblings ...)
  2015-06-29 15:55 ` cvs-commit at gcc dot gnu.org
@ 2015-06-29 18:58 ` palves at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: palves at redhat dot com @ 2015-06-29 18:58 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=18127

Pedro Alves <palves at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |7.10

--- Comment #6 from Pedro Alves <palves at redhat dot com> ---
Fixed.

