Re: [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Simon Marchi <simark@simark.ca>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED
Date: Wed, 20 Jul 2022 21:35:14 -0400	[thread overview]
Message-ID: <caf519e8-e940-4e1c-067a-3dcbbf0274cf@simark.ca> (raw)
In-Reply-To: <20220713222433.374898-5-pedro@palves.net>


Just nits, from what I can tell the patch looks OK.

On 2022-07-13 18:24, Pedro Alves wrote:
> (A good chunk of the problem statement in the commit log below is
> Andrew's, adjusted for a different solution, and for covering
> displaced stepping too.)
> 
> This commit addresses bugs gdb/19675 and gdb/27830, which are about
> stepping over a breakpoint set at a clone syscall instruction, one is
> about displaced stepping, and the other about in-line stepping.
> 
> Currently, when a new thread is created through a clone syscall, GDB
> sets the new thread running.  With 'continue' this makes sense
> (assuming no schedlock):
> 
>  - all-stop mode, user issues 'continue', all threads are set running,
>    a newly created thread should also be set running.
> 
>  - non-stop mode, user issues 'continue', other pre-existing threads
>    are not effected, but as the new thread is (sort-of) a child of the

effected -> affected ?

>    thread the user asked to run, it makes sense that the new threads
>    should be created in the running state.
> 
> Similarly, if we are stopped at the clone syscall, and there's no
> software breakpoint at this address, then the current behaviour is
> fine:
> 
>  - all-stop mode, user issues 'stepi', stepping will be done in place
>    (as there's no breakpoint to step over).  While stepping the thread
>    of interest all the other threads will be allowed to continue.  A
>    newly created thread will be set running, and then stopped once the
>    thread of interest has completed its step.
> 
>  - non-stop mode, user issues 'stepi', stepping will be done in place
>    (as there's no breakpoint to step over).  Other threads might be
>    running or stopped, but as with the continue case above, the new
>    thread will be created running.  The only possible issue here is
>    that the new thread will be left running after the initial thread
>    has completed its stepi.  The user would need to manually select
>    the thread and interrupt it, this might not be what the user
>    expects.  However, this is not something this commit tries to
>    change.
> 
> The problem then is what happens when we try to step over a clone
> syscall if there is a breakpoint at the syscall address.
> 
> - For both all-stop and non-stop modes, with in-line stepping:
> 
>    + user issues 'stepi',
>    + [non-stop mode only] GDB stops all threads.  In all-stop mode all
>      threads are already stopped.
>    + GDB removes s/w breakpoint at syscall address,
>    + GDB single steps just the thread of interest, all other threads
>      are left stopped,
>    + New thread is created running,
>    + Initial thread completes its step,
>    + [non-stop mode only] GDB resumes all threads that it previously
>      stopped.
> 
> There are two problems in the in-line stepping scenario above:
> 
>   1. The new thread might pass through the same code that the initial
>      thread is in (i.e. the clone syscall code), in which case it will
>      fail to hit the breakpoint in clone as this was removed so the
>      first thread can single step,
> 
>   2. The new thread might trigger some other stop event before the
>      initial thread reports its step completion.  If this happens we
>      end up triggering an assertion as GDB assumes that only the
>      thread being stepped should stop.  The assert looks like this:
> 
>      infrun.c:5899: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> 
> - For both all-stop and non-stop modes, with displaced stepping:
> 
>    + user issues 'stepi',
>    + GDB starts the displaced step, moves thread's PC to the
>      out-of-line scratch pad, maybe adjusts registers,
>    + GDB single steps the thread of interest, [non-stop mode only] all
>      other threads are left as they were, either running or stopped.
>      In all-stop, all other threads are left stopped.
>    + New thread is created running,
>    + Initial thread completes its step, GDB re-adjusts its PC,
>      restores/releases scratchpad,
>    + [non-stop mode only] GDB resumes the thread, now past its
>      breakpoint.
>    + [all-stop mode only] GDB resumes all threads.
> 
> There is one problem with the displaced stepping scenario above:
> 
>   3. When the parent thread completed its step, GDB adjusted its PC,
>      but did not adjust the child's PC, thus that new child thread
>      will continue execution in the scratch pad, invoking undefined
>      behavior.  If you're lucky, you see a crash.  If unlucky, the
>      inferior gets silently corrupted.
> 
> What is needed is for GDB to have more control over whether the new
> thread is created running or not.  Issue #1 above requires that the
> new thread not be allowed to run until the breakpoint has been
> reinserted.  The only way to guarantee this is if the new thread is
> held in a stopped state until the single step has completed.  Issue #3
> above requires that GDB is informed of when a thread clones itself,
> and of what is the child's ptid, so that GDB can fixup both the parent
> and the child.
> 
> When looking for solutions to this problem I considered how GDB
> handles fork/vfork as these have some of the same issues.  The main
> difference between fork/vfork and clone is that the clone events are
> not reported back to core GDB.  Instead, the clone event is handled
> automatically in the target code and the child thread is immediately
> set running.
> 
> Note we have support for requesting thread creation events out of the
> target (TARGET_WAITKIND_THREAD_CREATED).  However, those are reported
> for the new/child thread.  That would be sufficient to address in-line
> stepping (issue #1), but not for displaced-stepping (issue #3).  To
> handle displaced-stepping, we need an event that is reported to the
> _parent_ of the clone, as the information about the displaced step is
> associated with the clone parent.  TARGET_WAITKIND_THREAD_CREATED
> includes no indication of which thread is the parent that spawned the
> new child.  In fact, for some targets, like e.g., Windows, it would be
> impossible to know which thread that was, as thread creation there
> doesn't work by "cloning".
> 
> The solution implemented here is to model clone on fork/vfork, and
> introduce a new TARGET_WAITKIND_THREAD_CLONED event.  This event is
> similar to TARGET_WAITKIND_FORKED and TARGET_WAITKIND_VFORKED, except
> that we end up with a new thread in the same process, instead of a new
> thread of a new process.  Like FORKED and VFORKED, THREAD_CLONED
> waitstatuses have a child_ptid property, and the child is help stopped

help -> held

> until GDB explicitly resumes it.  This addresses the in-line stepping
> case (issues #1 and #2).
> 
> The infrun code that handles displaced stepping fixup for the child
> after a fork/vfork event is thus reused for THREAD_CLONE, with some
> minimal conditions added, addressing the displaced stepping case
> (issue #3).
> 
> The native Linux backend is adjusted to unconditionally report
> TARGET_WAITKIND_THREAD_CLONED events to the core.
> 
> Following the follow_fork model in core GDB, we introduce a
> target_follow_clone target method, which is responsible for making the
> new clone child visible to the rest of GDB.
> 
> Subsequent patches will add clone events support to the remote
> protocol and gdbserver.
> 
> A testcase will be added by a later patch.
> 
> displaced_step_in_progress_thread is removed in this patch, but it is
> added back again in a subsequent patch.  We need to do this because
> the function is static, and with no callers, the compiler would warn,
> (error with -Werror), breaking the build.

If you prefer, you could annotate temporarily the function with the
"used" attribute.

> @@ -1848,8 +1863,39 @@ displaced_step_finish (thread_info *event_thread,
>  
>    /* Do the fixup, and release the resources acquired to do the displaced
>       step. */
> -  return gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
> -					event_thread, event_status);
> +  displaced_step_finish_status status
> +    = gdbarch_displaced_step_finish (displaced->get_original_gdbarch (),
> +				     event_thread, event_status);
> +
> +  if (event_status.kind () == TARGET_WAITKIND_FORKED
> +      || event_status.kind () == TARGET_WAITKIND_VFORKED
> +      || event_status.kind () == TARGET_WAITKIND_THREAD_CLONED)
> +    {
> +      /* Since the vfork/fork/clone syscall instruction was executed
> +	 in the scratchpad, the child's PC is also within the
> +	 scratchpad.  Set the child's PC to the parent's PC value,
> +	 which has already been fixed up.  Note: we use the parent's
> +	 aspace here, although we're touching the child, because the
> +	 child hasn't been added to the inferior list yet at this
> +	 point.  */
> +
> +      struct regcache *child_regcache
> +	= get_thread_arch_aspace_regcache (parent_inf->process_target (),
> +					   event_status.child_ptid (),
> +					   gdbarch,
> +					   parent_inf->aspace);
> +      /* Read PC value of parent.  */
> +      CORE_ADDR parent_pc = regcache_read_pc (regcache);
> +
> +      displaced_debug_printf ("write child pc from %s to %s",
> +			      paddress (gdbarch,
> +					regcache_read_pc (child_regcache)),
> +			      paddress (gdbarch, parent_pc));
> +
> +      regcache_write_pc (child_regcache, parent_pc);

Since you talked about Windows in the commit message, that made me
wonder: the code just above wouldn't work for Windows, right?  My
understanding is that Windows threads start with the PC directly at the
function the user provided to CreateThread.

Would the Windows target just not report TARGET_WAITKIND_THREAD_CLONED?

> @@ -5032,7 +5076,7 @@ handle_one (const wait_one_event &event)
>  		global_thread_step_over_chain_enqueue (t);
>  	    }
>  
> -	  regcache = get_thread_regcache (t);
> +	  struct regcache *regcache = get_thread_regcache (t);

This change is kind of unrelated to this patch.

> +/* If LP has a pending fork/vfork/clone status, store it in WS and
> +   return true.  Otherwise, return false.  */
>  
> +static bool
> +get_pending_child_status (lwp_info *lp, target_waitstatus *ws)

It would maybe be a bit more C++ to return an
optional<target_waitstatus>?  The callers can then look like:

  if (gdb::optional<target_waitstatus> ws = get_pending_child_status (lp))

> @@ -1819,6 +1853,58 @@ linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
>    return 1;
>  }
>  
> +void
> +linux_nat_target::follow_clone (ptid_t child_ptid)
> +{
> +  linux_nat_debug_printf
> +    ("Got clone event from LWP %ld, new child is LWP %ld",
> +     inferior_ptid.lwp (), child_ptid.lwp ());

Even if we are in the linux-nat target and know the LWPs are unique, I
think it's nicer to print full ptids anyway.  When you are debugging a
problem that involves multiples processes, it becomes easier to identify
which LWPs belong to which process.

> +
> +  lwp_info *new_lp = add_lwp (child_ptid);
> +  new_lp->stopped = 1;
> +
> +  /* If the thread_db layer is active, let it record the user
> +     level thread id and status, and add the thread to GDB's
> +     list.  */
> +  if (!thread_db_notice_clone (inferior_ptid, new_lp->ptid))
> +    {
> +      /* The process is not using thread_db.  Add the LWP to
> +	 GDB's list.  */
> +      target_post_attach (new_lp->ptid.lwp ());

Can you explain why that target_post_attach call?  Oh, I just noticed it
was already there.  Still, can you explain?

> +      add_thread (linux_target, new_lp->ptid);
> +    }
> +
> +  /* We just created NEW_LP so it cannot yet contain STATUS.  */
> +  gdb_assert (new_lp->status == 0);
> +
> +  if (!pull_pid_from_list (&stopped_pids, child_ptid.lwp (), &new_lp->status))
> +    internal_error (__FILE__, __LINE__, _("no saved status for clone lwp"));

gdb_assert_not_reached maybe?

Otherwise, IBWN to have a version of gdb_assert that lets us provide a
message.

Simon

next prev parent reply	other threads:[~2022-07-21  1:35 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 22:24 [PATCH v2 00/29] Step over thread clone and thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 01/29] displaced step: pass down target_waitstatus instead of gdb_signal Pedro Alves
2022-07-20 20:21   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 02/29] linux-nat: introduce pending_status_str Pedro Alves
2022-07-20 20:38   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 03/29] gdb/linux: Delete all other LWPs immediately on ptrace exec event Pedro Alves
2022-07-21  0:45   ` Simon Marchi
2022-07-13 22:24 ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-21  1:35   ` Simon Marchi [this message]
2022-10-17 18:54     ` Pedro Alves
2022-10-18 12:40     ` [PATCH] Don't explicitly set clone child ptrace options (was: Re: [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED) Pedro Alves
2022-10-28 14:50       ` Simon Marchi
2022-12-12 20:13     ` [PATCH v2 04/29] Step over clone syscall w/ breakpoint, TARGET_WAITKIND_THREAD_CLONED Pedro Alves
2022-07-13 22:24 ` [PATCH v2 05/29] Support clone events in the remote protocol Pedro Alves
2022-07-21  2:25   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 06/29] Avoid duplicate QThreadEvents packets Pedro Alves
2022-07-21  2:30   ` Simon Marchi
2022-12-12 20:14     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-21  3:14   ` Simon Marchi
2022-10-27 19:55     ` [PATCH] enum_flags to_string (was: Re: [PATCH v2 07/29] Thread options & clone events (core + remote)) Pedro Alves
2022-10-28 10:26       ` [PATCH v2] " Pedro Alves
2022-10-28 11:08         ` [PATCH v3] " Pedro Alves
2022-10-28 15:59           ` Simon Marchi
2022-10-28 18:23             ` Pedro Alves
2022-10-31 12:47               ` Simon Marchi
2022-11-07 17:26                 ` [PATCH v5] " Pedro Alves
2022-11-07 18:29                   ` Simon Marchi
2022-11-08 14:56                     ` Pedro Alves
2022-12-12 20:15     ` [PATCH v2 07/29] Thread options & clone events (core + remote) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 08/29] Thread options & clone events (native Linux) Pedro Alves
2022-07-21 12:38   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 09/29] Thread options & clone events (Linux GDBserver) Pedro Alves
2022-07-21 13:11   ` Simon Marchi
2022-12-12 20:16     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 10/29] gdbserver: Hide and don't detach pending clone children Pedro Alves
2022-07-13 22:24 ` [PATCH v2 11/29] Remove gdb/19675 kfails (displaced stepping + clone) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 12/29] Add test for stepping over clone syscall Pedro Alves
2022-07-13 22:24 ` [PATCH v2 13/29] all-stop/synchronous RSP support thread-exit events Pedro Alves
2022-07-13 22:24 ` [PATCH v2 14/29] gdbserver/linux-low.cc: Ignore event_ptid if TARGET_WAITKIND_IGNORE Pedro Alves
2022-07-13 22:24 ` [PATCH v2 15/29] Introduce GDB_TO_EXIT thread option, fix step-over-thread-exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 16/29] Implement GDB_TO_EXIT support for Linux GDBserver Pedro Alves
2022-07-13 22:24 ` [PATCH v2 17/29] Implement GDB_TO_EXIT support for native Linux Pedro Alves
2022-07-21 15:26   ` Simon Marchi
2022-12-12 20:17     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 18/29] gdb: clear step over information on thread exit (PR gdb/27338) Pedro Alves
2022-07-21 18:12   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 19/29] stop_all_threads: (re-)enable async before waiting for stops Pedro Alves
2022-07-21 18:21   ` Simon Marchi
2022-12-12 20:18     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 20/29] gdbserver: Queue no-resumed event after thread exit Pedro Alves
2022-07-13 22:24 ` [PATCH v2 21/29] Don't resume new threads if scheduler-locking is in effect Pedro Alves
2022-07-14  5:28   ` Eli Zaretskii
2022-07-21 18:49   ` Simon Marchi
2022-12-12 20:19     ` Pedro Alves
2022-07-13 22:24 ` [PATCH v2 22/29] Report thread exit event for leader if reporting thread exit events Pedro Alves
2022-07-13 22:24 ` [PATCH v2 23/29] Ignore failure to read PC when resuming Pedro Alves
2022-07-13 22:24 ` [PATCH v2 24/29] gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro Pedro Alves
2022-07-13 22:24 ` [PATCH v2 25/29] Testcases for stepping over thread exit syscall (PR gdb/27338) Pedro Alves
2022-07-13 22:24 ` [PATCH v2 26/29] Document remote clone events, and QThreadOptions packet Pedro Alves
2022-07-14  5:27   ` Eli Zaretskii
2022-07-13 22:24 ` [PATCH v2 27/29] inferior::clear_thread_list always silent Pedro Alves
2022-07-13 22:24 ` [PATCH v2 28/29] Centralize "[Thread ...exited]" notifications Pedro Alves
2022-07-13 22:24 ` [PATCH v2 29/29] Cancel execution command on thread exit, when stepping, nexting, etc Pedro Alves
2022-07-21 19:28 ` [PATCH v2 00/29] Step over thread clone and thread exit Simon Marchi
2022-10-03 13:46 ` Tom de Vries
2022-10-03 18:37   ` Tom de Vries
2022-12-12 20:20     ` Pedro Alves
2022-12-12 20:19   ` Pedro Alves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=caf519e8-e940-4e1c-067a-3dcbbf0274cf@simark.ca \
    --to=simark@simark.ca \
    --cc=gdb-patches@sourceware.org \
    --cc=pedro@palves.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).