public inbox for gdb-prs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla@sourceware.org>
To: gdb-prs@sourceware.org
Subject: [Bug gdb/28942] Problem with breakpoint condition calling a function in multi-threaded program
Date: Mon, 25 Mar 2024 17:40:59 +0000
Message-ID: <bug-28942-4717-q1LTc40jSc@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-28942-4717@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=28942

--- Comment #8 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=3df7843699ff3610f89ac880685396b531d8ec1b

commit 3df7843699ff3610f89ac880685396b531d8ec1b
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Fri Oct 9 13:27:13 2020 +0200

    gdb: fix b/p conditions with infcalls in multi-threaded inferiors

    This commit fixes bug PR 28942, that is, creating a conditional
    breakpoint in a multi-threaded inferior, where the breakpoint
    condition includes an inferior function call.

    Currently, when a user creates such a breakpoint and then continues
    the inferior, GDB fails with:

      (gdb) break infcall-from-bp-cond-single.c:61 if (return_true ())
      Breakpoint 2 at 0x4011fa: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-single.c, line 61.
      (gdb) continue
      Continuing.
      [New Thread 0x7ffff7c5d700 (LWP 2460150)]
      [New Thread 0x7ffff745c700 (LWP 2460151)]
      [New Thread 0x7ffff6c5b700 (LWP 2460152)]
      [New Thread 0x7ffff645a700 (LWP 2460153)]
      [New Thread 0x7ffff5c59700 (LWP 2460154)]
      Error in testing breakpoint condition:
      Couldn't get registers: No such process.
      An error occurred while in a function called from GDB.
      Evaluation of the expression containing the function
      (return_true) will be abandoned.
      When the function is done executing, GDB will silently stop.
      Selected thread is running.
      (gdb)

    Or, in some cases, like this:

      (gdb) break infcall-from-bp-cond-simple.c:56 if (is_matching_tid (arg, 1))
      Breakpoint 2 at 0x401194: file /tmp/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/infcall-from-bp-cond-simple.c, line 56.
      (gdb) continue
      Continuing.
      [New Thread 0x7ffff7c5d700 (LWP 2461106)]
      [New Thread 0x7ffff745c700 (LWP 2461107)]
      ../../src.release/gdb/nat/x86-linux-dregs.c:146: internal-error: x86_linux_update_debug_registers: Assertion `lwp_is_stopped (lwp)' failed.
      A problem internal to GDB has been detected,
      further debugging may prove unreliable.

    The precise error depends on the exact thread state, so there are
    race conditions depending on which threads have fully started and
    which have not.  But the underlying problem is always the same: when
    GDB tries to execute the inferior function call from within the
    breakpoint condition, GDB will, incorrectly, try to resume threads
    that are already running, because GDB doesn't realise that some
    threads might already be running.

    The solution proposed in this patch requires an additional member
    variable thread_info::in_cond_eval.  This flag is set to true (in
    breakpoint.c) when GDB is evaluating a breakpoint condition.

    In user_visible_resume_ptid (infrun.c), when the in_cond_eval flag is
    true, then GDB will only try to resume the current thread, that is,
    the thread for which the breakpoint condition is being evaluated.
    This solves the problem of GDB trying to resume threads that are
    already running.
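
    The check described above can be sketched as follows.  This is a
    minimal model, not GDB's code: thread_sketch, resume_scope and
    pick_resume_scope are hypothetical names, and the other inputs that
    the real user_visible_resume_ptid considers are left out.

```cpp
// Simplified stand-in for GDB's thread_info; only the fields needed here.
struct thread_sketch
{
  int pid;
  long lwp;
  bool in_cond_eval;  // true while a breakpoint condition is evaluated
};

// Which threads a resume request should cover.
struct resume_scope
{
  bool all_threads;  // true: resume every thread
  int pid;           // meaningful only when all_threads is false
  long lwp;          // meaningful only when all_threads is false
};

// Hypothetical helper: while a condition is being evaluated, resume only
// the thread evaluating it; otherwise fall back to resuming everything.
resume_scope
pick_resume_scope (const thread_sketch &current)
{
  if (current.in_cond_eval)
    return {false, current.pid, current.lwp};
  return {true, -1, 0};
}
```

    With in_cond_eval set, the returned scope names exactly one thread,
    so threads that are already running are never asked to resume again.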

    The next problem is that inferior function calls are assumed to be
    synchronous, that is, GDB doesn't expect to start an inferior function
    call in thread #1, then receive a stop from thread #2 for some other,
    unrelated reason.  To prevent GDB responding to an event from another
    thread, we update fetch_inferior_event and do_target_wait in infrun.c,
    so that, when an inferior function call (on behalf of a breakpoint
    condition) is in progress, we only wait for events from the current
    thread (the one evaluating the condition).
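
    As a rough illustration of that filtering, assuming simplified
    stand-in types (cond_thread, wait_filter, pick_wait_filter and
    event_accepted are hypothetical names, not functions in infrun.c):

```cpp
// Simplified stand-ins; these are not GDB's real types.
struct cond_thread
{
  int pid;
  long lwp;
  bool in_cond_eval;  // true while this thread's condition is evaluated
};

struct wait_filter
{
  bool any_thread;  // true: accept an event from any thread
  int pid;          // meaningful only when any_thread is false
  long lwp;         // meaningful only when any_thread is false
};

// Narrow the wait to the current thread while it evaluates a condition.
wait_filter
pick_wait_filter (const cond_thread &current)
{
  if (current.in_cond_eval)
    return {false, current.pid, current.lwp};
  return {true, -1, 0};
}

// Decide whether an incoming event passes the filter.
bool
event_accepted (const wait_filter &filter, int event_pid, long event_lwp)
{
  if (filter.any_thread)
    return true;
  return filter.pid == event_pid && filter.lwp == event_lwp;
}
```

    An event from another thread is simply not accepted while the
    narrow filter is in place; in this model it stays pending until the
    filter is widened again.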

    In do_target_wait I had to change the inferior_matches lambda
    function, which is used to select which inferior to wait on.
    Previously the logic was this:

       auto inferior_matches = [&wait_ptid] (inferior *inf)
         {
           return (inf->process_target () != nullptr
                   && ptid_t (inf->pid).matches (wait_ptid));
         };

    This compares the pid of the inferior against the complete ptid we
    want to wait on.  Before this commit wait_ptid was only ever
    minus_one_ptid (which is special, and means any process), and so every
    inferior would match.

    After this commit though wait_ptid might represent a specific thread
    in a specific inferior.  If we compare the pid of the inferior to a
    specific ptid then these will not match.  The fix is to compare
    against the pid extracted from the wait_ptid, not against the complete
    wait_ptid itself.
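
    The difference can be reproduced with a small model.  ptid_sketch
    below is a simplified stand-in whose matches method is assumed to
    follow the same three rules as ptid_t::matches (any-process filter,
    whole-process filter, exact equality); the two helper functions are
    hypothetical and only contrast the old and new comparisons.

```cpp
// Simplified stand-in for ptid_t: a (pid, lwp, tid) triple.
struct ptid_sketch
{
  int pid;
  long lwp;
  long tid;

  bool operator== (const ptid_sketch &o) const
  { return pid == o.pid && lwp == o.lwp && tid == o.tid; }

  // Models minus_one_ptid, meaning "any process".
  bool is_any () const { return pid == -1 && lwp == 0 && tid == 0; }

  // True when this value names a whole process rather than one thread.
  bool is_pid () const
  {
    if (is_any () || (pid == 0 && lwp == 0 && tid == 0))
      return false;
    return lwp == 0 && tid == 0;
  }

  // Assumed matching rules: any-process filter, whole-process filter,
  // otherwise exact equality.
  bool matches (const ptid_sketch &filter) const
  {
    if (filter.is_any ())
      return true;
    if (filter.is_pid () && pid == filter.pid)
      return true;
    return *this == filter;
  }
};

// Old logic: compare the inferior's pid against the complete wait_ptid.
bool
old_inferior_matches (int inferior_pid, const ptid_sketch &wait_ptid)
{
  return ptid_sketch{inferior_pid, 0, 0}.matches (wait_ptid);
}

// New logic: compare against only the pid extracted from wait_ptid.
bool
new_inferior_matches (int inferior_pid, const ptid_sketch &wait_ptid)
{
  return ptid_sketch{inferior_pid, 0, 0}
    .matches (ptid_sketch{wait_ptid.pid, 0, 0});
}
```

    For a wait_ptid naming one thread of process 42, the old comparison
    rejects inferior 42 while the new one accepts it; both still accept
    every inferior when wait_ptid means any process.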

    In fetch_inferior_event, after receiving the event, we only want to
    stop all the other threads, and call inferior_event_handler with
    INF_EXEC_COMPLETE, if we are not evaluating a conditional breakpoint.
    If we are, then all the other threads should be left doing whatever
    they were before.  The inferior_event_handler call will be performed
    once the breakpoint condition has finished being evaluated, and GDB
    decides to stop or not.

    The final problem that needs solving relates to GDB's commit-resume
    mechanism, which allows GDB to collect resume requests into a single
    packet in order to reduce traffic to a remote target.

    The problem is that the commit-resume mechanism will not send any
    resume requests for an inferior if there are already events pending on
    the GDB side.

    Imagine an inferior with two threads.  Both threads hit a breakpoint,
    maybe the same conditional breakpoint.  At this point there are two
    pending events, one for each thread.

    GDB selects one of the events and spots that this is a conditional
    breakpoint, so GDB evaluates the condition.

    The condition includes an inferior function call, so GDB sets up for
    the call and resumes the one thread; the resume request is added to
    the commit-resume queue.

    When the commit-resume queue is committed, GDB sees that there is a
    pending event from another thread, and so doesn't send any resume
    requests to the actual target.  GDB is assuming that when we wait we
    will select the event from the other thread.

    However, as this is an inferior function call for a condition
    evaluation, we will not select the event from the other thread.  We
    only care about events from the thread that is evaluating the
    condition, and the resume for this thread was never sent to the
    target.

    And so, GDB hangs, waiting for an event from a thread that was never
    fully resumed.

    To fix this issue I have added the concept of "forcing" the
    commit-resume queue.  When enabling commit resume, if the force flag
    is true, then any resumes will be committed to the target, even if
    there are other threads with pending events.
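
    The effect of the force flag can be sketched with a toy queue.
    resume_request and commit_resumes are hypothetical names used only
    for illustration; GDB's actual commit-resume code is organised
    differently.

```cpp
#include <vector>

// One queued request to resume a thread, identified here by its LWP.
struct resume_request
{
  long lwp;
};

// Returns the requests that would actually be sent to the target.
// Without force, pending events from other threads suppress the commit;
// with force, the queued resumes are sent regardless.
std::vector<resume_request>
commit_resumes (const std::vector<resume_request> &queued,
                bool other_events_pending, bool force)
{
  if (other_events_pending && !force)
    return {};
  return queued;
}
```

    In the two-thread scenario above, the unforced commit sends nothing
    and the thread making the inferior function call never runs; the
    forced commit sends its resume request.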

    A note on authorship: this patch was based on some work done by
    Natalia Saiapova and Tankut Baris Aktemur from Intel[1].  I have made
    some changes to their work in this version.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28942

    [1] https://sourceware.org/pipermail/gdb-patches/2020-October/172454.html

    Co-authored-by: Natalia Saiapova <natalia.saiapova@intel.com>
    Co-authored-by: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
    Reviewed-By: Tankut Baris Aktemur <tankut.baris.aktemur@intel.com>
    Tested-By: Luis Machado <luis.machado@arm.com>
    Tested-By: Keith Seitz <keiths@redhat.com>
