* [Bug gdb/27830] Attempt to resume already running thread
2021-05-07 14:16 [Bug gdb/27830] New: Attempt to resume already running thread andrew.burgess at embecosm dot com
2021-05-07 14:26 ` [Bug gdb/27830] " andrew.burgess at embecosm dot com
@ 2021-05-10 17:27 ` andrew.burgess at embecosm dot com
2021-05-10 19:24 ` andrew.burgess at embecosm dot com
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: andrew.burgess at embecosm dot com @ 2021-05-10 17:27 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=27830
--- Comment #2 from Andrew Burgess <andrew.burgess at embecosm dot com> ---
The problem seems to be that in the following situation:
- gdb is doing a single step over a clone syscall, and
- displaced stepping is off
Then, when GDB steps it might be that the first event reported to GDB is the
creation of the new thread.
If this is the case the thread is added to GDB with state THREAD_RUNNING and
'executing=true'. However, the resumed field is left as false.
Once the in-place step over has completed then GDB tries to restart all
threads, however, the new thread will have already been set running by the
Linux code from resume_stopped_resumed_lwps.
If feels like, when the new thread is spotted we should actually, in this case,
realise that all other threads are currently stopped, and so add the thread to
GDB in the stopped state.
Annoyingly, the same problem is seen with the remote protocol, only, I think
the situation here is worse.
When the stepi is started the remote end will see the new thread, but as this
is non-stop mode, the new thread will be set running again, and only the
completed step is reported to GDB.
Now, when GDB tries to complete the step-over and resume all threads, first GDB
updates the threads list and sees the new thread, then GDB tries to set the new
thread running. Unfortunately, the new thread is already running, and so we
get an error.
Of course, as the new thread was allowed to run unchecked while we were doing
the step over, it could easily have missed hitting a breakpoint.
It would seem (to me) that when doing an in-place step over, we almost need to
take the remote target out of non-stop mode temporarily so that any new threads
that appear will be held until the step has completed??
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug gdb/27830] Attempt to resume already running thread
2021-05-07 14:16 [Bug gdb/27830] New: Attempt to resume already running thread andrew.burgess at embecosm dot com
` (3 preceding siblings ...)
2022-06-21 11:28 ` pedro at palves dot net
@ 2023-11-13 14:25 ` cvs-commit at gcc dot gnu.org
2023-11-13 14:25 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-13 14:25 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=27830
--- Comment #5 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pedro Alves <palves@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=0d36baa9af0d9929c96b89a184a469c432c68b0d
commit 0d36baa9af0d9929c96b89a184a469c432c68b0d
Author: Pedro Alves <pedro@palves.net>
Date: Fri Nov 12 20:50:29 2021 +0000
Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
(A good chunk of the problem statement in the commit log below is
Andrew's, adjusted for a different solution, and for covering
displaced stepping too. The testcase is mostly Andrew's too.)
This commit addresses bugs gdb/19675 and gdb/27830, which are about
stepping over a breakpoint set at a clone syscall instruction, one is
about displaced stepping, and the other about in-line stepping.
Currently, when a new thread is created through a clone syscall, GDB
sets the new thread running. With 'continue' this makes sense
(assuming no schedlock):
- all-stop mode, user issues 'continue', all threads are set running,
a newly created thread should also be set running.
- non-stop mode, user issues 'continue', other pre-existing threads
are not affected, but as the new thread is (sort-of) a child of the
thread the user asked to run, it makes sense that the new threads
should be created in the running state.
Similarly, if we are stopped at the clone syscall, and there's no
software breakpoint at this address, then the current behaviour is
fine:
- all-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). While stepping the thread
of interest all the other threads will be allowed to continue. A
newly created thread will be set running, and then stopped once the
thread of interest has completed its step.
- non-stop mode, user issues 'stepi', stepping will be done in place
(as there's no breakpoint to step over). Other threads might be
running or stopped, but as with the continue case above, the new
thread will be created running. The only possible issue here is
that the new thread will be left running after the initial thread
has completed its stepi. The user would need to manually select
the thread and interrupt it, this might not be what the user
expects. However, this is not something this commit tries to
change.
The problem then is what happens when we try to step over a clone
syscall if there is a breakpoint at the syscall address.
- For both all-stop and non-stop modes, with in-line stepping:
+ user issues 'stepi',
+ [non-stop mode only] GDB stops all threads. In all-stop mode all
threads are already stopped.
+ GDB removes s/w breakpoint at syscall address,
+ GDB single steps just the thread of interest, all other threads
are left stopped,
+ New thread is created running,
+ Initial thread completes its step,
+ [non-stop mode only] GDB resumes all threads that it previously
stopped.
There are two problems in the in-line stepping scenario above:
1. The new thread might pass through the same code that the initial
thread is in (i.e. the clone syscall code), in which case it will
fail to hit the breakpoint in clone as this was removed so the
first thread can single step,
2. The new thread might trigger some other stop event before the
initial thread reports its step completion. If this happens we
end up triggering an assertion as GDB assumes that only the
thread being stepped should stop. The assert looks like this:
infrun.c:5899: internal-error: int
finish_step_over(execution_control_state*): Assertion
`ecs->event_thread->control.trap_expected' failed.
- For both all-stop and non-stop modes, with displaced stepping:
+ user issues 'stepi',
+ GDB starts the displaced step, moves thread's PC to the
out-of-line scratch pad, maybe adjusts registers,
+ GDB single steps the thread of interest, [non-stop mode only] all
other threads are left as they were, either running or stopped.
In all-stop, all other threads are left stopped.
+ New thread is created running,
+ Initial thread completes its step, GDB re-adjusts its PC,
restores/releases scratchpad,
+ [non-stop mode only] GDB resumes the thread, now past its
breakpoint.
+ [all-stop mode only] GDB resumes all threads.
There is one problem with the displaced stepping scenario above:
3. When the parent thread completed its step, GDB adjusted its PC,
but did not adjust the child's PC, thus that new child thread
will continue execution in the scratch pad, invoking undefined
behavior. If you're lucky, you see a crash. If unlucky, the
inferior gets silently corrupted.
What is needed is for GDB to have more control over whether the new
thread is created running or not. Issue #1 above requires that the
new thread not be allowed to run until the breakpoint has been
reinserted. The only way to guarantee this is if the new thread is
held in a stopped state until the single step has completed. Issue #3
above requires that GDB is informed of when a thread clones itself,
and of what is the child's ptid, so that GDB can fixup both the parent
and the child.
When looking for solutions to this problem I considered how GDB
handles fork/vfork as these have some of the same issues. The main
difference between fork/vfork and clone is that the clone events are
not reported back to core GDB. Instead, the clone event is handled
automatically in the target code and the child thread is immediately
set running.
Note we have support for requesting thread creation events out of the
target (TARGET_WAITKIND_THREAD_CREATED). However, those are reported
for the new/child thread. That would be sufficient to address in-line
stepping (issue #1), but not for displaced-stepping (issue #3). To
handle displaced-stepping, we need an event that is reported to the
_parent_ of the clone, as the information about the displaced step is
associated with the clone parent. TARGET_WAITKIND_THREAD_CREATED
includes no indication of which thread is the parent that spawned the
new child. In fact, for some targets, like e.g., Windows, it would be
impossible to know which thread that was, as thread creation there
doesn't work by "cloning".
The solution implemented here is to model clone on fork/vfork, and
introduce a new TARGET_WAITKIND_THREAD_CLONED event. This event is
similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
that we end up with a new thread in the same process, instead of a new
thread of a new process. Like FORKED and VFORKED, THREAD_CLONED
waitstatuses have a child_ptid property, and the child is held stopped
until GDB explicitly resumes it. This addresses the in-line stepping
case (issues #1 and #2).
The infrun code that handles displaced stepping fixup for the child
after a fork/vfork event is thus reused for THREAD_CLONE, with some
minimal conditions added, addressing the displaced stepping case
(issue #3).
The native Linux backend is adjusted to unconditionally report
TARGET_WAITKIND_THREAD_CLONED events to the core.
Following the follow_fork model in core GDB, we introduce a
target_follow_clone target method, which is responsible for making the
new clone child visible to the rest of GDB.
Subsequent patches will add clone events support to the remote
protocol and gdbserver.
displaced_step_in_progress_thread becomes unused with this patch, but
a new use will reappear later in the series. To avoid deleting it and
readding it back, this patch marks it with attribute unused, and the
latter patch removes the attribute again. We need to do this because
the function is static, and with no callers, the compiler would warn,
(error with -Werror), breaking the build.
This adds a new gdb.threads/stepi-over-clone.exp testcase, which
exercises stepping over a clone syscall, with displaced stepping vs
inline stepping, and all-stop vs non-stop. We already test stepping
over clone syscalls with gdb.base/step-over-syscall.exp, but this test
uses pthreads, while the other test uses raw clone, and this one is
more thorough. The testcase passes on native GNU/Linux, but fails
against GDBserver. GDBserver will be fixed by a later patch in the
series.
Co-authored-by: Andrew Burgess <aburgess@redhat.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Change-Id: I95c06024736384ae8542a67ed9fdf6534c325c8e
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [Bug gdb/27830] Attempt to resume already running thread
2021-05-07 14:16 [Bug gdb/27830] New: Attempt to resume already running thread andrew.burgess at embecosm dot com
` (5 preceding siblings ...)
2023-11-13 14:25 ` cvs-commit at gcc dot gnu.org
@ 2023-11-13 14:25 ` cvs-commit at gcc dot gnu.org
2023-11-13 14:25 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-13 14:25 UTC (permalink / raw)
To: gdb-prs
https://sourceware.org/bugzilla/show_bug.cgi?id=27830
--- Comment #7 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pedro Alves <palves@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=65c459abebf70bd5a64dcee11d4d7d4a8498465f
commit 65c459abebf70bd5a64dcee11d4d7d4a8498465f
Author: Pedro Alves <pedro@palves.net>
Date: Tue Nov 23 20:35:12 2021 +0000
Thread options & clone events (core + remote)
A previous patch taught GDB about a new TARGET_WAITKIND_THREAD_CLONED
event kind, and made the Linux target report clone events.
A following patch will teach Linux GDBserver to do the same thing.
However, for remote debugging, it wouldn't be ideal for GDBserver to
report every clone event to GDB, when GDB only cares about such events
in some specific situations. Reporting clone events all the time
would be potentially chatty. We don't enable thread create/exit
events all the time for the same reason. Instead we have the
QThreadEvents packet. QThreadEvents is target-wide, though.
This patch makes GDB instead explicitly request that the target
reports clone events or not, on a per-thread basis.
In order to be able to do that with GDBserver, we need a new remote
protocol feature. Since a following patch will want to enable thread
exit events on per-thread basis too, the packet introduced here is
more generic than just for clone events. It lets you enable/disable a
set of options at once, modelled on Linux ptrace's PTRACE_SETOPTIONS.
IOW, this commit introduces a new QThreadOptions packet, that lets you
specify a set of per-thread event options you want to enable. The
packet accepts a list of options/thread-id pairs, similarly to vCont,
processed left to right, with the options field being a number
interpreted as a bit mask of options. The only option defined in this
commit is GDB_THREAD_OPTION_CLONE (0x1), which ask the remote target
to report clone events. Another patch later in the series will
introduce another option.
For example, this packet sets option "1" (clone events) on thread
p1000.2345:
QThreadOptions;1:p1000.2345
and this clears options for all threads of process 1000, and then sets
option "1" (clone events) on thread p1000.2345:
QThreadOptions;0:p1000.-1;1:p1000.2345
This clears options of all threads of all processes:
QThreadOptions;0
The target reports the set of supported options by including
"QThreadOptions=<supported options>" in its qSupported response.
infrun is then tweaked to enable GDB_THREAD_OPTION_CLONE when stepping
over a breakpoint.
Unlike PTRACE_SETOPTIONS, fork/vfork/clone children do NOT inherit
their parent's thread options. This is so that GDB can send e.g.,
"QThreadOptions;0;1:TID" without worrying about threads it doesn't
know about yet.
Documentation for this new remote protocol feature is included in a
documentation patch later in the series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=19675
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27830
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Change-Id: Ie41e5093b2573f14cf6ac41b0b5804eba75be37e
--
You are receiving this mail because:
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 11+ messages in thread