public inbox for gdb-prs@sourceware.org
help / color / mirror / Atom feed
From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla@sourceware.org>
To: gdb-prs@sourceware.org
Subject: [Bug breakpoints/22921] breakpoint in PLT corrupts function argument in $rcx
Date: Thu, 06 Apr 2023 13:27:30 +0000	[thread overview]
Message-ID: <bug-22921-4717-omqIcZREbS@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-22921-4717@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=22921

--- Comment #14 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Burgess <aburgess@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=cf141dd8ccd36efe833aae3ccdb060b517cc1112

commit cf141dd8ccd36efe833aae3ccdb060b517cc1112
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Feb 22 12:15:34 2023 +0000

    gdb: fix reg corruption from displaced stepping on amd64

    This commit aims to address a problem that exists with the current
    approach to displaced stepping, and was identified in PR gdb/22921.

    Displaced stepping is currently supported on AArch64, ARM, amd64,
    i386, rs6000 (ppc), and s390.  Of these, I believe there is a problem
    with the current approach which will impact amd64 and ARM, and can
    lead to random register corruption when the inferior makes use of
    asynchronous signals and GDB is using displaced stepping.

    The problem can be found in displaced_step_buffers::finish in
    displaced-stepping.c, and is this; after GDB tries to perform a
    displaced step, and the inferior stops, GDB classifies the stop into
    one of two states, either the displaced step succeeded, or the
    displaced step failed.

    If the displaced step succeeded then gdbarch_displaced_step_fixup is
    called, which has the job of fixing up the state of the current
    inferior as if the step had not been performed in a displaced manner.
    This all seems just fine.

    However, if the displaced step is considered to have not completed
    then GDB doesn't call gdbarch_displaced_step_fixup, instead GDB
    remains in displaced_step_buffers::finish and just performs a minimal
    fixup which involves adjusting the program counter back to its
    original value.

    The problem here is that for amd64 and ARM setting up for a displaced
    step can involve changing the values in some temporary registers.  If
    the displaced step succeeds then this is fine; after the step the
    temporary registers are restored to their original values in the
    architecture specific code.

    But if the displaced step does not succeed then the temporary
    registers are never restored, and they retain their modified values.

    In this context a temporary register is simply any register that is
    not otherwise used by the instruction being stepped that the
    architecture specific code considers safe to borrow for the lifetime
    of the instruction being stepped.

    In the bug PR gdb/22921, the amd64 instruction being stepped is
    an rip-relative instruction like this:

      jmp    *0x2fe2(%rip)

    When we displaced step this instruction we borrow a register, and
    modify the instruction to something like:

      jmp    *0x2fe2(%rcx)

    with %rcx having its value adjusted to contain the original %rip
    value.

    Now if the displaced step does not succeed, then %rcx will be left
    with a corrupted value.  Obviously corrupting any register is bad; in
    the bug report this problem was spotted because %rcx is used as a
    function argument register.

    And finally, why might a displaced step not succeed?  Asynchronous
    signals provides one reason.  GDB sets up for the displaced step and,
    at that precise moment, the OS delivers a signal (SIGALRM in the bug
    report), the signal stops the inferior at the address of the displaced
    instruction.  GDB cancels the displaced instruction, handles the
    signal, and then tries again with the displaced step.  But it is that
    first cancellation of the displaced step that causes the problem; in
    that case GDB (correctly) sees the displaced step as having not
    completed, and so does not perform the architecture specific fixup,
    leaving the register corrupted.

    The reason why I think AArch64, rs600, i386, and s390 are not effected
    by this problem is that I don't believe these architectures make use
    of any temporary registers, so when a displaced step is not completed
    successfully, the minimal fix up is sufficient.

    On amd64 we use at most one temporary register.

    On ARM, looking at arm_displaced_step_copy_insn_closure, we could
    modify up to 16 temporary registers, and the instruction being
    displaced stepped could be expanded to multiple replacement
    instructions, which increases the chances of this bug triggering.

    This commit only aims to address the issue on amd64 for now, though I
    believe that the approach I'm proposing here might be applicable for
    ARM too.

    What I propose is that we always call gdbarch_displaced_step_fixup.

    We will now pass an extra argument to gdbarch_displaced_step_fixup,
    this a boolean that indicates whether GDB thinks the displaced step
    completed successfully or not.

    When this flag is false this indicates that the displaced step halted
    for some "other" reason.  On ARM GDB can potentially read the
    inferior's program counter in order figure out how far through the
    sequence of replacement instructions we got, and from that GDB can
    figure out what fixup needs to be performed.

    On targets like amd64 the problem is slightly easier as displaced
    stepping only uses a single replacement instruction.  If the displaced
    step didn't complete the GDB knows that the single instruction didn't
    execute.

    The point is that by always calling gdbarch_displaced_step_fixup, each
    architecture can now ensure that the inferior state is fixed up
    correctly in all cases, not just the success case.

    On amd64 this ensures that we always restore the temporary register
    value, and so bug PR gdb/22921 is resolved.

    In order to move all architectures to this new API, I have moved the
    minimal roll-back version of the code inside the architecture specific
    fixup functions for AArch64, rs600, s390, and ARM.  For all of these
    except ARM I think this is good enough, as no temporaries are used all
    that's needed is the program counter restore anyway.

    For ARM the minimal code is no worse than what we had before, though I
    do consider this architecture's displaced-stepping broken.

    I've updated the gdb.arch/amd64-disp-step.exp test to cover the
    'jmpq*' instruction that was causing problems in the original bug, and
    also added support for testing the displaced step in the presence of
    asynchronous signal delivery.

    I've also added two new tests (for amd64 and i386) that check that GDB
    can correctly handle displaced stepping over a single instruction that
    branches to itself.  I added these tests after a first version of this
    patch relied too much on checking the program-counter value in order
    to see if the displaced instruction had executed.  This works fine in
    almost all cases, but when an instruction branches to itself a pure
    program counter check is not sufficient.  The new tests expose this
    problem.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=22921

    Approved-By: Pedro Alves <pedro@palves.net>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

  parent reply	other threads:[~2023-04-06 13:27 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-22921-4717@http.sourceware.org/bugzilla/>
2023-02-12  6:57 ` aburgess at redhat dot com
2023-02-12  7:35 ` stsp at users dot sourceforge.net
2023-02-12  8:10 ` aburgess at redhat dot com
2023-02-12  8:17 ` stsp at users dot sourceforge.net
2023-02-12 10:17 ` stsp at users dot sourceforge.net
2023-02-12 10:20 ` stsp at users dot sourceforge.net
2023-02-12 15:26 ` aburgess at redhat dot com
2023-02-12 15:27 ` aburgess at redhat dot com
2023-02-12 15:32 ` aburgess at redhat dot com
2023-02-12 15:38 ` stsp at users dot sourceforge.net
2023-02-28 15:03 ` aburgess at redhat dot com
2023-04-06 13:27 ` cvs-commit at gcc dot gnu.org [this message]
2023-04-06 13:29 ` aburgess at redhat dot com

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-22921-4717-omqIcZREbS@http.sourceware.org/bugzilla/ \
    --to=sourceware-bugzilla@sourceware.org \
    --cc=gdb-prs@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).