From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id C4AB4385772A; Thu, 6 Apr 2023 13:27:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C4AB4385772A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1680787651; bh=b7/r5HxjaljzW4VOtPJzP3TGIpFW70PiR20f6kAdabY=; h=From:To:Subject:Date:In-Reply-To:References:From; b=M/rsR82rT0zJb0mhzIaMCUnpw/a2/dHQ9twJBa3AOoCLuDIV9+o/IwvVYZJ+K+OLW VAMWDypLrFgmxyiQ9lJLPEAzDvlLL7dCTSYvZK4SFhO/Y81q82vlnE0eiYD+Khw5p3 A0orm8QQK3hn8DY6jdBrtKo5zbHoeyeEmM3xw664= From: "cvs-commit at gcc dot gnu.org" To: gdb-prs@sourceware.org Subject: [Bug breakpoints/22921] breakpoint in PLT corrupts function argument in $rcx Date: Thu, 06 Apr 2023 13:27:30 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gdb X-Bugzilla-Component: breakpoints X-Bugzilla-Version: 8.0.1 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: cvs-commit at gcc dot gnu.org X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: unassigned at sourceware dot org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://sourceware.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://sourceware.org/bugzilla/show_bug.cgi?id=3D22921 --- Comment #14 from cvs-commit at gcc dot gnu.org --- The master branch has been updated by Andrew Burgess : https://sourceware.org/git/gitweb.cgi?p=3Dbinutils-gdb.git;h=3Dcf141dd8ccd3= 6efe833aae3ccdb060b517cc1112 commit cf141dd8ccd36efe833aae3ccdb060b517cc1112 Author: Andrew Burgess Date: Wed Feb 22 12:15:34 2023 +0000 gdb: fix reg corruption from displaced stepping on amd64 This commit aims to address a problem that exists with the current approach to displaced stepping, and was identified in PR gdb/22921. Displaced stepping is currently supported on AArch64, ARM, amd64, i386, rs6000 (ppc), and s390. Of these, I believe there is a problem with the current approach which will impact amd64 and ARM, and can lead to random register corruption when the inferior makes use of asynchronous signals and GDB is using displaced stepping. The problem can be found in displaced_step_buffers::finish in displaced-stepping.c, and is this; after GDB tries to perform a displaced step, and the inferior stops, GDB classifies the stop into one of two states, either the displaced step succeeded, or the displaced step failed. If the displaced step succeeded then gdbarch_displaced_step_fixup is called, which has the job of fixing up the state of the current inferior as if the step had not been performed in a displaced manner. This all seems just fine. However, if the displaced step is considered to have not completed then GDB doesn't call gdbarch_displaced_step_fixup, instead GDB remains in displaced_step_buffers::finish and just performs a minimal fixup which involves adjusting the program counter back to its original value. The problem here is that for amd64 and ARM setting up for a displaced step can involve changing the values in some temporary registers. If the displaced step succeeds then this is fine; after the step the temporary registers are restored to their original values in the architecture specific code. But if the displaced step does not succeed then the temporary registers are never restored, and they retain their modified values. In this context a temporary register is simply any register that is not otherwise used by the instruction being stepped that the architecture specific code considers safe to borrow for the lifetime of the instruction being stepped. In the bug PR gdb/22921, the amd64 instruction being stepped is an rip-relative instruction like this: jmp *0x2fe2(%rip) When we displaced step this instruction we borrow a register, and modify the instruction to something like: jmp *0x2fe2(%rcx) with %rcx having its value adjusted to contain the original %rip value. Now if the displaced step does not succeed, then %rcx will be left with a corrupted value. Obviously corrupting any register is bad; in the bug report this problem was spotted because %rcx is used as a function argument register. And finally, why might a displaced step not succeed? Asynchronous signals provides one reason. GDB sets up for the displaced step and, at that precise moment, the OS delivers a signal (SIGALRM in the bug report), the signal stops the inferior at the address of the displaced instruction. GDB cancels the displaced instruction, handles the signal, and then tries again with the displaced step. But it is that first cancellation of the displaced step that causes the problem; in that case GDB (correctly) sees the displaced step as having not completed, and so does not perform the architecture specific fixup, leaving the register corrupted. The reason why I think AArch64, rs600, i386, and s390 are not effected by this problem is that I don't believe these architectures make use of any temporary registers, so when a displaced step is not completed successfully, the minimal fix up is sufficient. On amd64 we use at most one temporary register. On ARM, looking at arm_displaced_step_copy_insn_closure, we could modify up to 16 temporary registers, and the instruction being displaced stepped could be expanded to multiple replacement instructions, which increases the chances of this bug triggering. This commit only aims to address the issue on amd64 for now, though I believe that the approach I'm proposing here might be applicable for ARM too. What I propose is that we always call gdbarch_displaced_step_fixup. We will now pass an extra argument to gdbarch_displaced_step_fixup, this a boolean that indicates whether GDB thinks the displaced step completed successfully or not. When this flag is false this indicates that the displaced step halted for some "other" reason. On ARM GDB can potentially read the inferior's program counter in order figure out how far through the sequence of replacement instructions we got, and from that GDB can figure out what fixup needs to be performed. On targets like amd64 the problem is slightly easier as displaced stepping only uses a single replacement instruction. If the displaced step didn't complete the GDB knows that the single instruction didn't execute. The point is that by always calling gdbarch_displaced_step_fixup, each architecture can now ensure that the inferior state is fixed up correctly in all cases, not just the success case. On amd64 this ensures that we always restore the temporary register value, and so bug PR gdb/22921 is resolved. In order to move all architectures to this new API, I have moved the minimal roll-back version of the code inside the architecture specific fixup functions for AArch64, rs600, s390, and ARM. For all of these except ARM I think this is good enough, as no temporaries are used all that's needed is the program counter restore anyway. For ARM the minimal code is no worse than what we had before, though I do consider this architecture's displaced-stepping broken. I've updated the gdb.arch/amd64-disp-step.exp test to cover the 'jmpq*' instruction that was causing problems in the original bug, and also added support for testing the displaced step in the presence of asynchronous signal delivery. I've also added two new tests (for amd64 and i386) that check that GDB can correctly handle displaced stepping over a single instruction that branches to itself. I added these tests after a first version of this patch relied too much on checking the program-counter value in order to see if the displaced instruction had executed. This works fine in almost all cases, but when an instruction branches to itself a pure program counter check is not sufficient. The new tests expose this problem. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=3D22921 Approved-By: Pedro Alves --=20 You are receiving this mail because: You are on the CC list for the bug.=