public inbox for gdb-prs@sourceware.org
* [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
@ 2024-01-29  6:49 vries at gcc dot gnu.org
  2024-01-29 10:00 ` [Bug dap/31306] " vries at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29  6:49 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

            Bug ID: 31306
           Summary: [gdb/dap] segfault in new_threadstate during
                    gdb.dap/eof.exp
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: dap
          Assignee: unassigned at sourceware dot org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

I noticed after doing a full test run on aarch64-linux that I was left with two
core files.

One of them is from the gdb internal error in gdb.threads/detach-step-over.exp,
which triggers regularly (but not always) for me.

The other one I couldn't relate back to a test-case, so I did another run which
lists the build/gdb/testsuite dir after each test, which suggested the core is
generated by gdb.dap/eof.exp.

The backtrace looks like:
...
(gdb) bt
#0  0x0000ffff76e92280 in __pthread_kill_implementation () from
/lib64/libc.so.6
#1  0x0000ffff76e45800 [PAC] in raise () from /lib64/libc.so.6
#2  0x00000000007aeeb0 [PAC] in handle_fatal_signal (sig=11)
    at /home/vries/gdb/src/gdb/event-top.c:926
#3  0x00000000007aef38 in handle_sigsegv (sig=11)
    at /home/vries/gdb/src/gdb/event-top.c:976
#4  <signal handler called>
#5  0x0000ffff7780db14 in new_threadstate () from /lib64/libpython3.12.so.1.0
#6  0x0000ffff77870548 [PAC] in PyGILState_Ensure () from
/lib64/libpython3.12.so.1.0
#7  0x0000000000a6adf0 [PAC] in gdbpy_gil::gdbpy_gil (this=0xffffd6b572a8)
    at /home/vries/gdb/src/gdb/python/python-internal.h:787
#8  0x0000000000ab6568 in gdbpy_event::~gdbpy_event (this=0xffff3c001e80, 
    __in_chrg=<optimized out>) at /home/vries/gdb/src/gdb/python/python.c:1051
#9  0x0000000000ab721c in
std::_Function_base::_Base_manager<gdbpy_event>::_M_destroy (
    __victim=...) at /usr/include/c++/13/bits/std_function.h:175
#10 0x0000000000ab7098 in
std::_Function_base::_Base_manager<gdbpy_event>::_M_manager (
    __dest=..., __source=..., __op=std::__destroy_functor)
    at /usr/include/c++/13/bits/std_function.h:203
#11 0x0000000000ab6cd0 in std::_Function_handler<void (),
gdbpy_event>::_M_manager(std::_Any_data&, std::_Any_data const&,
std::_Manager_operation) (__dest=..., __source=..., 
    __op=std::__destroy_functor) at /usr/include/c++/13/bits/std_function.h:282
#12 0x000000000042dd9c in std::_Function_base::~_Function_base
(this=0xffff3c001bb0, 
    __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:244
#13 0x000000000042e654 in std::function<void ()>::~function()
(this=0xffff3c001bb0, 
    __in_chrg=<optimized out>) at /usr/include/c++/13/bits/std_function.h:334
#14 0x0000000000b66c30 in std::_Destroy<std::function<void ()>
>(std::function<void ()>*) (__pointer=0xffff3c001bb0) at
/usr/include/c++/13/bits/stl_construct.h:151
#15 0x0000000000b66aa0 in
std::_Destroy_aux<false>::__destroy<std::function<void ()>*>(std::function<void
()>*, std::function<void ()>*) (__first=0xffff3c001bb0, 
    __last=0xffff3c001bd0) at /usr/include/c++/13/bits/stl_construct.h:163
#16 0x0000000000b667a8 in std::_Destroy<std::function<void
()>*>(std::function<void ()>*, std::function<void ()>*)
(__first=0xffff3c001bb0, __last=0xffff3c001bd0)
    at /usr/include/c++/13/bits/stl_construct.h:196
#17 0x0000000000b661e4 in std::_Destroy<std::function<void ()>*,
std::function<void ()> >(std::function<void ()>*, std::function<void ()>*,
std::allocator<std::function<void ()> >&) (__last=0xffff3c001bd0,
__first=0xffff3c001bb0)
    at /usr/include/c++/13/bits/alloc_traits.h:948
#18 std::vector<std::function<void ()>, std::allocator<std::function<void ()> >
>::~vector() (this=0x2a183c8 <runnables>, __in_chrg=<optimized out>)
    at /usr/include/c++/13/bits/stl_vector.h:732
#19 0x0000ffff76e48370 in __run_exit_handlers () from /lib64/libc.so.6
#20 0x0000ffff76e48450 [PAC] in exit () from /lib64/libc.so.6
#21 0x0000000000c933dc [PAC] in quit_force (exit_arg=0x0, from_tty=0)
    at /home/vries/gdb/src/gdb/top.c:1826
#22 0x0000000000607b44 in quit_command (args=0x0, from_tty=0)
    at /home/vries/gdb/src/gdb/cli/cli-cmds.c:508
#23 0x0000000000c90480 in quit_cover () at /home/vries/gdb/src/gdb/top.c:304
#24 0x00000000007af49c in async_disconnect (arg=0x0)
    at /home/vries/gdb/src/gdb/event-top.c:1230
#25 0x00000000005474fc in invoke_async_signal_handlers ()
    at /home/vries/gdb/src/gdb/async-event.c:234
#26 0x000000000157af8c in gdb_do_one_event (mstimeout=-1)
    at /home/vries/gdb/src/gdbsupport/event-loop.cc:199
#27 0x0000000000941898 in start_event_loop () at
/home/vries/gdb/src/gdb/main.c:401
#28 0x0000000000941a10 in captured_command_loop () at
/home/vries/gdb/src/gdb/main.c:465
#29 0x0000000000943490 in captured_main (data=0xffffd6b57878)
    at /home/vries/gdb/src/gdb/main.c:1335
#30 0x0000000000943514 in gdb_main (args=0xffffd6b57878)
    at /home/vries/gdb/src/gdb/main.c:1354
#31 0x0000000000423ab4 in main (argc=14, argv=0xffffd6b57a08)
    at /home/vries/gdb/src/gdb/gdb.c:39
...

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
@ 2024-01-29 10:00 ` vries at gcc dot gnu.org
  2024-01-29 10:56 ` vries at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29 10:00 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
Found some related info here:
https://docs.python.org/3/c-api/init.html#c.PyGILState_Ensure

This fixes it:
...
diff --git a/gdb/python/python.c b/gdb/python/python.c
index de3a94dfc9a..537db58d8ab 100644
--- a/gdb/python/python.c
+++ b/gdb/python/python.c
@@ -1048,6 +1048,8 @@ struct gdbpy_event

   ~gdbpy_event ()
   {
+    if (_Py_IsFinalizing ())
+      return;
     gdbpy_gil gil;
     Py_XDECREF (m_func);
   }
...
but I'm not familiar with the code, so this might paper over the problem rather
than fix it, I'm not sure.
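
For illustration only, here is the same kind of guard sketched at the Python level. This is not gdb code: GuardedEvent and its release method are hypothetical names, and sys.is_finalizing() is used as the Python-level counterpart of the C-level check in the diff above. The predicate is injectable so that the shutdown case can be exercised without actually finalizing an interpreter.

```python
import sys


class GuardedEvent:
    """Hypothetical stand-in for an object that holds a Python callable."""

    def __init__(self, func, is_finalizing=sys.is_finalizing):
        self.func = func
        self._is_finalizing = is_finalizing
        self.released = False

    def release(self):
        # Mirrors the early return in the tentative patch: if the
        # interpreter is shutting down, skip the cleanup entirely.
        if self._is_finalizing():
            return False
        # Normal path: drop the reference to the callable.
        self.func = None
        self.released = True
        return True


if __name__ == "__main__":
    ev = GuardedEvent(print)
    print(ev.release())
```

As with the C++ version, the guard only avoids touching the interpreter during shutdown; it does not explain why an event is still pending at that point.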


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
  2024-01-29 10:00 ` [Bug dap/31306] " vries at gcc dot gnu.org
@ 2024-01-29 10:56 ` vries at gcc dot gnu.org
  2024-01-29 11:53 ` vries at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29 10:56 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
Could be a dup of PR31172.  The tentative patch posted there makes this PR less
likely to occur, but it still does.  Which almost looks like a runnable was
posted while quitting.


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
  2024-01-29 10:00 ` [Bug dap/31306] " vries at gcc dot gnu.org
  2024-01-29 10:56 ` vries at gcc dot gnu.org
@ 2024-01-29 11:53 ` vries at gcc dot gnu.org
  2024-01-29 11:54 ` vries at gcc dot gnu.org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29 11:53 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
Created attachment 15339
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15339&action=edit
Tentative patch

(In reply to Tom de Vries from comment #2)
> Could be a dup of PR31172.  The tentative patch posted there makes this PR
> less likely to occur, but it still does.  Which almost looks like a runnable
> was posted while quitting.

This is a more elaborate version of the patch, which seems to fix all the
runnables-related problems I ran into.

However, now I run into:
...
(gdb) bt
#0  0x0000ffff61d02280 in __pthread_kill_implementation () from
/lib64/libc.so.6
#1  0x0000ffff61cb5800 [PAC] in raise () from /lib64/libc.so.6
#2  0x00000000007aeeb0 [PAC] in handle_fatal_signal (sig=11)
    at /home/vries/gdb/src/gdb/event-top.c:926
#3  0x00000000007aef38 in handle_sigsegv (sig=11)
    at /home/vries/gdb/src/gdb/event-top.c:976
#4  <signal handler called>
#5  0x0000000000604a84 in cli_ui_out::do_message (this=0xffff4ee9d728,
style=..., 
    format=0xffff2c0029f1 "%s", args=...) at
/home/vries/gdb/src/gdb/cli-out.c:232
#6  0x0000000000ce4268 in ui_out::call_do_message (this=0xffff4ee9d728,
style=..., 
    format=0xffff2c0029f1 "%s") at /home/vries/gdb/src/gdb/ui-out.c:584
#7  0x0000000000ce4520 in ui_out::vmessage (this=0xffff4ee9d728, in_style=..., 
    format=0x16f8e62 "", args=...) at /home/vries/gdb/src/gdb/ui-out.c:621
#8  0x0000000000ce19ac in ui_file::vprintf (this=0xffffcc6aa958,
format=0x16f8e60 "%s", 
    args=...) at /home/vries/gdb/src/gdb/ui-file.c:74
#9  0x0000000000d29024 in gdb_vprintf (stream=0xffffcc6aa958, format=0x16f8e60
"%s", 
    args=...) at /home/vries/gdb/src/gdb/utils.c:1879
#10 0x0000000000d29118 in gdb_printf (stream=0xffffcc6aa958, format=0x16f8e60
"%s")
    at /home/vries/gdb/src/gdb/utils.c:1894
#11 0x0000000000ab2fc4 in gdbpy_write (self=0x67db720, args=0x6b1c5a0,
kw=0x6d8df40)
    at /home/vries/gdb/src/gdb/python/python.c:1464
#12 0x0000ffff625fcedc in cfunction_call () from /lib64/libpython3.12.so.1.0
#13 0x0000ffff625cc500 [PAC] in _PyObject_MakeTpCall () from
/lib64/libpython3.12.so.1.0
#14 0x0000ffff625d8b64 [PAC] in _PyEval_EvalFrameDefault ()
   from /lib64/libpython3.12.so.1.0
#15 0x0000ffff62628cd0 [PAC] in method_vectorcall () from
/lib64/libpython3.12.so.1.0
#16 0x0000ffff62609824 [PAC] in PyObject_CallOneArg () from
/lib64/libpython3.12.so.1.0
#17 0x0000ffff626a7674 [PAC] in PyFile_WriteObject () from
/lib64/libpython3.12.so.1.0
#18 0x0000ffff626a77a0 [PAC] in PyFile_WriteString () from
/lib64/libpython3.12.so.1.0
#19 0x0000ffff625b5354 [PAC] in thread_excepthook () from
/lib64/libpython3.12.so.1.0
#20 0x0000ffff625fc6e0 [PAC] in cfunction_vectorcall_O ()
   from /lib64/libpython3.12.so.1.0
#21 0x0000ffff625f32d8 [PAC] in PyObject_Vectorcall () from
/lib64/libpython3.12.so.1.0
#22 0x0000ffff625d8b64 [PAC] in _PyEval_EvalFrameDefault ()
   from /lib64/libpython3.12.so.1.0
#23 0x0000ffff62628d88 [PAC] in method_vectorcall () from
/lib64/libpython3.12.so.1.0
#24 0x0000ffff62730ef4 [PAC] in thread_run () from /lib64/libpython3.12.so.1.0
#25 0x0000ffff626e1ec0 [PAC] in pythread_wrapper () from
/lib64/libpython3.12.so.1.0
#26 0x0000ffff61d00584 [PAC] in start_thread () from /lib64/libc.so.6
#27 0x0000ffff61d6fc4c [PAC] in thread_start () from /lib64/libc.so.6
(gdb) 
...
with:
...
(gdb) down
#11 0x0000000000ab2fc4 in gdbpy_write (self=0x67db720, args=0x6b1c5a0,
kw=0x6d8df40)
    at /home/vries/gdb/src/gdb/python/python.c:1464
1464                gdb_printf (gdb_stderr, "%s", arg);
(gdb) p arg
$5 = 0xffff2c002948 "Exception in thread "
(gdb) 
...

At this point, with:
...
(gdb) p *(&current_ui->m_gdb_stderr)
$23 = (ui_file *) 0x63e2ed0
...
I get what looks to me like a valid ui_file *:
...
(gdb) p **(&current_ui->m_gdb_stderr)
$25 = {_vptr.ui_file = 0x175c908 <vtable for stderr_file+16>, m_applied_style =
{
    m_foreground = {m_simple = true, {m_value = -1, {m_red = 255 '\377', 
          m_green = 255 '\377', m_blue = 255 '\377'}}}, m_background = {
      m_simple = true, {m_value = -1, {m_red = 255 '\377', m_green = 255
'\377', 
          m_blue = 255 '\377'}}}, m_intensity = ui_file_style::NORMAL, 
    m_reverse = false}}
...

But if we go one frame down:
...
(gdb) down
#10 0x0000000000d29118 in gdb_printf (stream=0xffffcc6aa958, format=0x16f8e60
"%s")
    at /home/vries/gdb/src/gdb/utils.c:1894
1894      gdb_vprintf (stream, format, args);
...
we have an invalid ui_file:
...
(gdb) p stream
$27 = (ui_file *) 0xffffcc6aa958
(gdb) p *stream
$28 = {_vptr.ui_file = 0x0, m_applied_style = {m_foreground = {m_simple = true,
{
        m_value = 0, {m_red = 0 '\000', m_green = 0 '\000', m_blue = 0
'\000'}}}, 
    m_background = {m_simple = 32, {m_value = 65535, {m_red = 255 '\377', 
          m_green = 255 '\377', m_blue = 0 '\000'}}}, 
    m_intensity = (unknown: 0x62a4e710), m_reverse = 255}}
...
and AFAICT that ultimately causes the segfault.


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2024-01-29 11:53 ` vries at gcc dot gnu.org
@ 2024-01-29 11:54 ` vries at gcc dot gnu.org
  2024-01-29 17:20 ` tromey at sourceware dot org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29 11:54 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luis.machado at arm dot com,
                   |                            |tromey at sourceware dot org


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2024-01-29 11:54 ` vries at gcc dot gnu.org
@ 2024-01-29 17:20 ` tromey at sourceware dot org
  2024-01-29 21:18 ` vries at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: tromey at sourceware dot org @ 2024-01-29 17:20 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #4 from Tom Tromey <tromey at sourceware dot org> ---
(In reply to Tom de Vries from comment #2)
> Could be a dup of PR31172.  The tentative patch posted there makes this PR
> less likely to occur, but it still does.  Which almost looks like a runnable
> was posted while quitting.

Could it be the runnable that is posted to cause the quit?
io.py does:

            if obj is None:
                # This is an exit request.  The stream is already
                # flushed, so all that's left to do is request an
                # exit.
                send_gdb("quit")
                break


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2024-01-29 17:20 ` tromey at sourceware dot org
@ 2024-01-29 21:18 ` vries at gcc dot gnu.org
  2024-02-05 16:24 ` vries at gcc dot gnu.org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-01-29 21:18 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #5 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #3)
> However, now I run into:

Doesn't reproduce on x86_64-linux, for me.  Different gcc version though.


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-01-29 21:18 ` vries at gcc dot gnu.org
@ 2024-02-05 16:24 ` vries at gcc dot gnu.org
  2024-02-07  9:03 ` vries at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-02-05 16:24 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #6 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom Tromey from comment #4)
> (In reply to Tom de Vries from comment #2)
> > Could be a dup of PR31172.  The tentative patch posted there makes this PR
> > less likely to occur, but it still does.  Which almost looks like a runnable
> > was posted while quitting.
> 
> Could it be the runnable that is posted to cause the quit?
> io.py does:
> 
>             if obj is None:
>                 # This is an exit request.  The stream is already
>                 # flushed, so all that's left to do is request an
>                 # exit.
>                 send_gdb("quit")
>                 break

Thanks for the hint.  I think there's a race between:
- a SIGHUP, and
- send_gdb("quit")

So, with clean sources, we have:
- gdb receives a SIGHUP, calls quit_force
- at that point, runnables contains quit in a python wrapper
- python is finalized, so the python global interpreter lock is
  no longer available
- runnables is destroyed, which triggers the destruction of a gdbpy_event,
  which requires the python global interpreter lock
- segfault when trying to acquire the no longer available python global
  interpreter lock
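
The sequence above can be sketched as a toy model. This is a simplification under stated assumptions, not gdb code: ToyInterpreter, ToyEvent and shutdown are hypothetical names, and the model raises an exception at the point where the real PyGILState_Ensure call segfaults.

```python
class ToyInterpreter:
    """Hypothetical stand-in for the embedded Python runtime."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True

    def ensure_gil(self):
        # The real call crashes after finalization; the model raises.
        if self.finalized:
            raise RuntimeError("GIL requested after finalization")


class ToyEvent:
    """Hypothetical stand-in for gdbpy_event: destroying it needs the GIL."""

    def __init__(self, interp):
        self.interp = interp

    def destroy(self):
        self.interp.ensure_gil()


def shutdown(interp, runnables):
    """Model the quit path: finalize first, destroy leftover runnables after."""
    log = []
    interp.finalize()
    log.append("finalized")
    try:
        while runnables:
            runnables.pop().destroy()
            log.append("destroyed")
    except RuntimeError as exc:
        log.append("crash: " + str(exc))
    return log


if __name__ == "__main__":
    interp = ToyInterpreter()
    print(shutdown(interp, [ToyEvent(interp)]))
```

In this model the failure only appears when an event is still pending at shutdown; with an empty runnables list the same path completes cleanly.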


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2024-02-05 16:24 ` vries at gcc dot gnu.org
@ 2024-02-07  9:03 ` vries at gcc dot gnu.org
  2024-02-12 18:58 ` tromey at sourceware dot org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-02-07  9:03 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #7 from Tom de Vries <vries at gcc dot gnu.org> ---
Posted RFC:
https://sourceware.org/pipermail/gdb-patches/2024-February/206403.html


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2024-02-07  9:03 ` vries at gcc dot gnu.org
@ 2024-02-12 18:58 ` tromey at sourceware dot org
  2024-02-14 17:24 ` cvs-commit at gcc dot gnu.org
  2024-02-14 17:25 ` vries at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: tromey at sourceware dot org @ 2024-02-12 18:58 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

Tom Tromey <tromey at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.1


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2024-02-12 18:58 ` tromey at sourceware dot org
@ 2024-02-14 17:24 ` cvs-commit at gcc dot gnu.org
  2024-02-14 17:25 ` vries at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-02-14 17:24 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

--- Comment #8 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tom de Vries <vries@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=98e1896364c7db2a3088fa7e36c334683566fe97

commit 98e1896364c7db2a3088fa7e36c334683566fe97
Author: Tom de Vries <tdevries@suse.de>
Date:   Wed Feb 14 18:24:39 2024 +0100

    [gdb/dap] Fix exit race

    When running test-case gdb.dap/eof.exp, we're likely to get a coredump due
    to a segfault in new_threadstate.

    At the point of the core dump, the gdb main thread looks like:
    ...
     (gdb) bt
     #0  0x0000fffee30d2280 in __pthread_kill_implementation () from
/lib64/libc.so.6
     #1  0x0000fffee3085800 [PAC] in raise () from /lib64/libc.so.6
     #2  0x00000000007b03e8 [PAC] in handle_fatal_signal (sig=11)
         at gdb/event-top.c:926
     #3  0x00000000007b0470 in handle_sigsegv (sig=11)
         at gdb/event-top.c:976
     #4  <signal handler called>
     #5  0x0000fffee3a4db14 in new_threadstate () from
/lib64/libpython3.12.so.1.0
     #6  0x0000fffee3ab0548 [PAC] in PyGILState_Ensure () from
/lib64/libpython3.12.so.1.0
     #7  0x0000000000a6d034 [PAC] in gdbpy_gil::gdbpy_gil (this=0xffffcb279738)
         at gdb/python/python-internal.h:787
     #8  0x0000000000ab87ac in gdbpy_event::~gdbpy_event (this=0xfffea8001ee0,
         __in_chrg=<optimized out>) at gdb/python/python.c:1051
     #9  0x0000000000ab9460 in
std::_Function_base::_Base_manager<...>::_M_destroy
         (__victim=...) at /usr/include/c++/13/bits/std_function.h:175
     #10 0x0000000000ab92dc in
std::_Function_base::_Base_manager<...>::_M_manager
         (__dest=..., __source=..., __op=std::__destroy_functor)
         at /usr/include/c++/13/bits/std_function.h:203
     #11 0x0000000000ab8f14 in std::_Function_handler<...>::_M_manager(...)
(...)
         at /usr/include/c++/13/bits/std_function.h:282
     #12 0x000000000042dd9c in std::_Function_base::~_Function_base
(this=0xfffea8001c10,
         __in_chrg=<optimized out>) at
/usr/include/c++/13/bits/std_function.h:244
     #13 0x000000000042e654 in std::function<void ()>::~function()
(this=0xfffea8001c10,
         __in_chrg=<optimized out>) at
/usr/include/c++/13/bits/std_function.h:334
     #14 0x0000000000b68e60 in std::_Destroy<std::function<void ()> >(...)
(...)
         at /usr/include/c++/13/bits/stl_construct.h:151
     #15 0x0000000000b68cd0 in std::_Destroy_aux<false>::__destroy<...>(...)
(...)
         at /usr/include/c++/13/bits/stl_construct.h:163
     #16 0x0000000000b689d8 in std::_Destroy<...>(...) (...)
         at /usr/include/c++/13/bits/stl_construct.h:196
     #17 0x0000000000b68414 in std::_Destroy<...>(...) (...)
         at /usr/include/c++/13/bits/alloc_traits.h:948
     #18 std::vector<...>::~vector() (this=0x2a183c8 <runnables>)
         at /usr/include/c++/13/bits/stl_vector.h:732
     #19 0x0000fffee3088370 in __run_exit_handlers () from /lib64/libc.so.6
     #20 0x0000fffee3088450 [PAC] in exit () from /lib64/libc.so.6
     #21 0x0000000000c95600 [PAC] in quit_force (exit_arg=0x0, from_tty=0)
         at gdb/top.c:1822
     #22 0x0000000000609140 in quit_command (args=0x0, from_tty=0)
         at gdb/cli/cli-cmds.c:508
     #23 0x0000000000c926a4 in quit_cover () at gdb/top.c:300
     #24 0x00000000007b09d4 in async_disconnect (arg=0x0)
         at gdb/event-top.c:1230
     #25 0x0000000000548acc in invoke_async_signal_handlers ()
         at gdb/async-event.c:234
     #26 0x000000000157d2d4 in gdb_do_one_event (mstimeout=-1)
         at gdbsupport/event-loop.cc:199
     #27 0x0000000000943a84 in start_event_loop () at gdb/main.c:401
     #28 0x0000000000943bfc in captured_command_loop () at gdb/main.c:465
     #29 0x000000000094567c in captured_main (data=0xffffcb279d08)
         at gdb/main.c:1335
     #30 0x0000000000945700 in gdb_main (args=0xffffcb279d08)
         at gdb/main.c:1354
     #31 0x0000000000423ab4 in main (argc=14, argv=0xffffcb279e98)
         at gdb/gdb.c:39
    ...

    The direct cause of the segfault is calling PyGILState_Ensure after
    calling Py_Finalize.

    AFAICT the problem is a race between the gdb main thread and DAP's JSON
    writer thread.

    On one side, we have the following events:
    - DAP's JSON reader thread reads an EOF, and lets DAP's main thread know
      by writing None into read_queue
    - DAP's main thread lets DAP's JSON writer thread know by writing None
      into write_queue
    - DAP's JSON writer thread sees the None in its queue, and calls
      send_gdb("quit")
    - a corresponding gdbpy_event is deposited in the runnables vector, to be
      run by the gdb main thread

    On the other side, we have the following events:
    - the gdb main thread receives a SIGHUP
    - the corresponding handler calls quit_force, which calls do_final_cleanups
    - one of the final cleanups is finalize_python, which calls Py_Finalize
    - quit_force calls exit, which triggers the exit handlers
    - one of the exit handlers is the destructor of the runnables vector
    - destruction of the vector triggers destruction of the remaining element
    - the remaining element is a gdbpy_event, and the destructor (indirectly)
      calls PyGILState_Ensure

    It's good to note that both events (EOF and SIGHUP) are caused by this line
    in the test-case:
    ...
    catch "close -i $gdb_spawn_id"
    ...
    where "expect close" closes the stdin and stdout file descriptors, which
    causes the SIGHUP to be sent.

    So, for the system I'm running this on, the send_gdb("quit") is actually
    not needed.

    I'm not sure if we support any systems where it's actually needed.

    Fix this by removing the send_gdb("quit").

    Tested on aarch64-linux.

    PR dap/31306
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31306
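
The EOF hand-off described in the commit message can be sketched roughly as follows. This is a minimal model, not the actual DAP implementation: run_dap_shutdown, the thread functions and the plain list standing in for gdb's runnables vector are all hypothetical, and the post_quit flag only toggles between the behaviour before the fix (a "quit" event is posted) and after it (nothing is posted).

```python
import queue
import threading


def run_dap_shutdown(post_quit):
    """Model the EOF hand-off between reader, main and writer threads.

    Returns what is left pending for the gdb main thread.
    """
    read_queue = queue.Queue()
    write_queue = queue.Queue()
    runnables = []  # stand-in for gdb's runnables vector

    def reader():
        # EOF on the input stream is reported to the main thread as None.
        read_queue.put(None)

    def dap_main():
        # The main thread forwards the exit request to the writer thread.
        if read_queue.get() is None:
            write_queue.put(None)

    def writer():
        while True:
            if write_queue.get() is None:
                if post_quit:
                    # Behaviour before the fix: post a quit request.
                    runnables.append("quit")
                break

    threads = [threading.Thread(target=f) for f in (reader, dap_main, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return runnables


if __name__ == "__main__":
    print(run_dap_shutdown(post_quit=True))
    print(run_dap_shutdown(post_quit=False))
```

With post_quit set to False nothing is left in the list when the threads finish, so in this model there is no pending event to destroy after Python has been finalized.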


* [Bug dap/31306] [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp
  2024-01-29  6:49 [Bug dap/31306] New: [gdb/dap] segfault in new_threadstate during gdb.dap/eof.exp vries at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2024-02-14 17:24 ` cvs-commit at gcc dot gnu.org
@ 2024-02-14 17:25 ` vries at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: vries at gcc dot gnu.org @ 2024-02-14 17:25 UTC (permalink / raw)
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31306

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #9 from Tom de Vries <vries at gcc dot gnu.org> ---
Fixed.

