From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-f41.google.com (mail-wr1-f41.google.com [209.85.221.41]) by sourceware.org (Postfix) with ESMTPS id 7B4B43858D37 for ; Tue, 21 Mar 2023 13:23:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7B4B43858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f41.google.com with SMTP id t15so13660938wrz.7 for ; Tue, 21 Mar 2023 06:23:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679404993; h=content-transfer-encoding:content-language:in-reply-to:mime-version :user-agent:date:message-id:from:references:to:subject :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=wIhW9o2ld+EaNib+cqZ558OQ/sDFk9eRPzY0Q4ybi6k=; b=oEqT5JNlLvzkWehYxRoR2PiCuO+hxlugGWWYIYes37btpnSx5Kz5PH3era0aQV02Ut USlSgjFmuSYTWThqjK781a2PIPGX0U5X53M4z2axpb9FhvcX1tjHyV9NMWSTQuwUg77q Gc+XkO2YYPt5awGW3B4dimoLeRBWgxQky5LcAbm8TohSdd2I/g153hQxiuYUytGrFd4k yxuAOpwbmKoCj4SqK/9d6fy3ZNuW62LIy9eO+IYtNope4T2aYbcQ3dv2YB6JUYW+jZpN l0zdBYfZN7VEC0BucgtVZisxvgTpSWEkKfkj8wXh4d3BQbkziSwJqSv6xOoVMcIqib9T tvzQ== X-Gm-Message-State: AO0yUKXUQOqRUF0yWDWmVyXBpS95JINj7zH1LOVy9T6IMsazR/A96Pml DhLnO8g2svHpq3ZlheGiAXqCafFe85BQ7A== X-Google-Smtp-Source: AK7set/AwELoX7/F5/GJ0TcVGLcLW/E61MuK4qhCyYkeuIc4TyRWdNpaCiT0Uu64sj99ualZQIQrDQ== X-Received: by 2002:a5d:4e8f:0:b0:2d7:1ec1:9e47 with SMTP id e15-20020a5d4e8f000000b002d71ec19e47mr2112827wru.19.1679404992712; Tue, 21 Mar 2023 06:23:12 -0700 (PDT) Received: from ?IPv6:2001:8a0:f93c:5900::1fe? ([2001:8a0:f93c:5900::1fe]) by smtp.gmail.com with ESMTPSA id g9-20020a5d4889000000b002c559843748sm11358995wrq.10.2023.03.21.06.23.11 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 21 Mar 2023 06:23:12 -0700 (PDT) Subject: Re: [PATCHv2 3/4] gdb: fix reg corruption from displaced stepping on amd64 To: Andrew Burgess , gdb-patches@sourceware.org References: <14d2a5493727b8b3c9743c7d1cdd9e1f31c5c2fe.1678984664.git.aburgess@redhat.com> From: Pedro Alves Message-ID: Date: Tue, 21 Mar 2023 13:23:19 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <14d2a5493727b8b3c9743c7d1cdd9e1f31c5c2fe.1678984664.git.aburgess@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,KAM_NUMSUBJECT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 2023-03-16 4:39 p.m., Andrew Burgess via Gdb-patches wrote: > This commit aims to address a problem that exists with the current > approach to displaced stepping, and was identified in PR gdb/22921. > > Displaced stepping is currently supported on AArch64, ARM, amd64, > i386, rs6000 (ppc), and s390. Of these, I believe there is a problem > with the current approach which will impact amd64 and ARM, and can > lead to random register corruption when the inferior makes use of > asynchronous signals and GDB is using displaced stepping. > > The problem can be found in displaced_step_buffers::finish in > displaced-stepping.c, and is this; after GDB tries to perform a > displaced step, and the inferior stops, GDB classifies the stop into > one of two states, either the displaced step succeeded, or the > displaced step failed. > > If the displaced step succeeded then gdbarch_displaced_step_fixup is > called, which has the job of fixing up the state of the current > inferior as if the step had not been performed in a displaced manor. typo: "manor" -> "manner" > This all seems just fine. > > However, if the displaced step is considered to have not completed > then GDB doesn't call gdbarch_displaced_step_fixup, instead GDB > remains in displaced_step_buffers::finish and just performs a minimal > fixup which involves adjusting the program counter back to its > original value. > > The problem here is that for amd64 and ARM setting up for a displaced > step can involve changing the values in some temporary registers. If > the displaced step succeeds then this is fine; after the step the > temporary registers are restored to their original values in the > architecture specific code. > > But if the displaced step does not succeed then the temporary > registers are never restored, and they retain their modified values. > > In this context a temporary register is simply any register that is > not otherwise used by the instruction being stepped that the > architecture specific code considers safe to borrow for the lifetime > of the instruction being stepped. > > In the bug PR gdb/22921, the amd64 instruction being stepped is > an rip-relative instruction like this: > > jmp *0x2fe2(%rip) > > When we displaced step this instruction we borrow a register, and > modify the instruction to something like: > > jmp *0x2fe2(%rcx) > > with %rcx having its value adjusted to contain the original %rip > value. > > Now if the displaced step does not succeed then %rcx will be left with > a corrupted value. Obviously corrupting any register is bad; in the > bug report this problem was spotted because %rcx is used as a function > argument register. > > And finally, why might a displaced step not succeed? Asynchronous > signals provides one reason. GDB sets up for the displaced step and, > at that precise moment, the OS delivers a signal (SIGALRM in the > test), the signal stops the inferior at the address of the displaced > instruction. GDB cancels the displaced instruction, handles the > signal, and then tries again with the displaced step. But it is that > first cancellation of the displaced step that causes the problem; in > that case GDB (correctly) sees the displaced step as having not > completed, and so does not perform the architecture specific fixup, > leaving the register corrupted. > > The reason why I think AArch64, rs600, i386, and s390 are not effected > by this problem is that I don't believe these architectures make use > of any temporary registers, so when a displaced step is not completed > successfully, the minimal fix up is sufficient. > > On amd64 we use at most one temporary register. > > On ARM, looking at arm_displaced_step_copy_insn_closure, we could > modify up to 16 temporary registers, and the instruction being > displaced stepped could be expanded to multiple replacement > instructions, which increases the chances of this bug triggering. > > This commit only aims to address the issue on amd64 for now, though I > believe that the approach I'm proposing here might be applicable for > ARM too. > Agree with everything so far. > What I propose is that we always call gdbarch_displaced_step_fixup. > > We will now pass an extra argument to gdbarch_displaced_step_fixup, > this is the program-counter value at which the inferior stopped. > > Using this program-counter value GDB can determine if the displaced > stepped instruction completed or not. This switching to comparing PC I disagree with. I don't think that it is correct to use the PC to determine if the displaced stepped instruction completed. We can simply be stepping over a jump to self instruction, in which case the PC will not have changed in either success or failure scenarios. > Or (in the future) for ARM, how > far through the displaced step sequence the inferior got before being > interrupted. OK, that justifies passing down the PC reguardless, though it could be left to do until ARM is actually fixed. > > It is then possible for the architecture specific code in GDB to > conditionally roll back some or all of the instructions effects, > i.e. on amd64 we can roll back the temporary register value in all > cases. > > In order to move all architectures to this new API, I have moved the > minimal roll-back version of the code inside the architecture specific > fixup functions for AArch64, rs600, s390, and ARM. For all of these > except ARM I think this is good enough, as no temporaries are used all > that's needed is the program counter restore anyway. > > For ARM the minimal code is no worse than what we had before, though I > do consider this architectures displaced-stepping broken. > > For amd64 and i386 I make use of the program counter to conditionally > roll back the inferior state. For i386 this is not strictly > necessary, as no temporaries are used. However, the structure of > i386_displaced_step_fixup is so similar to amd64_displaced_step_fixup > that is seems a shame to take a different approach in these two > functions. > > One thing to note is that previously we determined if the displaced > step had completed based on how the inferior stopped. Stopping with a > SIGTRAP we assumed meant that we had hit the breakpoint at the end of > the displaced instruction. Anything else indicated the displaced > instruction had not completed successfully. > > After this patch GDB now relies entirely on the program-counter value > to make this decision. If the program-counter points that the start > of the displaced instruction buffer (i.e. it hasn't changed) then the > displaced step didn't complete, otherwise (for amd64 and i386, which > always replace with a single instruction), we assume the displaced > step did complete. I disagree with this part of the change. Pedro Alves > > I've updated the gdb.arch/amd64-disp-step.exp test to cover the > 'jmpq*' instruction that was causing problems in the original bug, and > also added support for testing the displaced step in the presence of > asynchronous signal delivery. > > Finally, after this commit the 'sig' argument passed to > displaced_step_buffers::finish is redundant. This redundancy can be > chased all the way back to displaced_step_finish in infrun.c, so to > strip it out is a significant change in its own right. So I'm > deferring that until the next commit. > > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=22921