From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=zt5C=7U=simark.ca=simark@sourceware.org>
Received: from simark.ca (simark.ca [158.69.221.121])
	by sourceware.org (Postfix) with ESMTPS id 9F2823858D39
	for <gdb-patches@sourceware.org>; Tue, 28 Mar 2023 15:12:00 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9F2823858D39
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=simark.ca
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=simark.ca
Received: from [172.16.0.146] (192-222-143-198.qc.cable.ebox.net [192.222.143.198])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits))
	(No client certificate requested)
	by simark.ca (Postfix) with ESMTPSA id 30E821E0D2;
	Tue, 28 Mar 2023 11:12:00 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=simark.ca; s=mail;
	t=1680016320; bh=/4nmA0SH4gzSaz5L56ho5d2QajChW3jO0mH4WPF6ggQ=;
	h=Date:Subject:To:Cc:References:From:In-Reply-To:From;
	b=NRdYdrxzBQLEzaMRMiHqThcdoQnR3Qxlm+gjFxvugNTxyt+7rdLdzTrkX0nUMiCm6
	 gCGAyARLxBbrepFVFiLVlkTne/Y+x9Sy9IHvGhnhknOosJWKNOUQGfvfZCZ34tORWP
	 l8rZWzVkiOolcBsThB5Zj6S+qcFCILpyt+Bw9/5M=
Message-ID: <fc5050c6-c433-1b8a-d5c9-4fc2ec2fb007@simark.ca>
Date: Tue, 28 Mar 2023 11:11:59 -0400
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.9.0
Subject: Re: [PATCHv3 1/3] gdb: more debug output for displaced stepping
Content-Language: fr
To: Andrew Burgess <aburgess@redhat.com>, gdb-patches@sourceware.org
Cc: Pedro Alves <pedro@palves.net>
References: <cover.1678984664.git.aburgess@redhat.com>
 <cover.1679919937.git.aburgess@redhat.com>
 <20744f2c843ca8bffb773634350b8479a58c05e5.1679919937.git.aburgess@redhat.com>
 <6e0638df-b0b1-29e8-a9ba-acf091f717c5@simark.ca> <87h6u4eww0.fsf@redhat.com>
From: Simon Marchi <simark@simark.ca>
In-Reply-To: <87h6u4eww0.fsf@redhat.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,NICE_REPLY_A,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gdb-patches.sourceware.org>

On 3/28/23 11:08, Andrew Burgess wrote:
> Simon Marchi <simark@simark.ca> writes:
> 
>> On 3/27/23 08:32, Andrew Burgess wrote:
>>> While investigating a displaced stepping issue I wanted an easy way to
>>> see what GDB thought the original instruction was, and what
>>> instruction GDB replaced that with when performing the displaced step.
>>>
>>> We do print out the address that is being stepped, so I can track down
>>> the original instruction; I just need to go find the information
>>> myself.
>>>
>>> And we do print out the bytes of the new instruction, so I can figure
>>> out what the replacement instruction was, but it's not really easy.
>>>
>>> Also, the code that prints the bytes of the replacement instruction
>>> only prints 4 bytes, which clearly isn't always going to be correct.
>>>
>>> In this commit I remove the existing code that prints the bytes of the
>>> replacement instruction, and add two new blocks of code to
>>> displaced_step_prepare_throw.  This new code prints the original
>>> instruction, and the replacement instruction.  In each case we print
>>> both the bytes that make up the instruction and the completely
>>> disassembled instruction.
>>>
>>> Here's an example of what the output looks like on x86-64 (this is
>>> with 'set debug displaced on').  The two interesting lines contain the
>>> strings 'original insn' and 'replacement insn':
>>>
>>>   (gdb) step
>>>   [displaced] displaced_step_prepare_throw: displaced-stepping 2892655.2892655.0 now
>>>   [displaced] displaced_step_prepare_throw: original insn 0x401030: ff 25 e2 2f 00 00	jmp    *0x2fe2(%rip)        # 0x404018 <puts@got.plt>
>>>   [displaced] prepare: selected buffer at 0x401052
>>>   [displaced] prepare: saved 0x401052: 1e fa 31 ed 49 89 d1 5e 48 89 e2 48 83 e4 f0 50
>>>   [displaced] fixup_riprel: %rip-relative addressing used.
>>>   [displaced] fixup_riprel: using temp reg 2, old value 0x7ffff7f8a578, new value 0x401036
>>>   [displaced] amd64_displaced_step_copy_insn: copy 0x401030->0x401052: ff a1 e2 2f 00 00 68 00 00 00 00 e9 e0 ff ff ff
>>>   [displaced] displaced_step_prepare_throw: prepared successfully thread=2892655.2892655.0, original_pc=0x401030, displaced_pc=0x401052
>>>   [displaced] displaced_step_prepare_throw: replacement insn 0x401052: ff a1 e2 2f 00 00	jmp    *0x2fe2(%rcx)
>>>   [displaced] finish: restored 2892655.2892655.0 0x401052
>>>   [displaced] amd64_displaced_step_fixup: fixup (0x401030, 0x401052), insn = 0xff 0xa1 ...
>>>   [displaced] amd64_displaced_step_fixup: restoring reg 2 to 0x7ffff7f8a578
>>>   0x00007ffff7e402c0 in puts () from /lib64/libc.so.6
>>>   (gdb)
>>>
>>> One final note.  For many targets that support displaced stepping (in
>>> fact all targets except ARM) the replacement instruction is always a
>>> single instruction.  But on ARM the replacement could actually be a
>>> series of instructions.
>>>
>>> The debug code tries to handle this by disassembling the entire
>>> displaced stepping buffer.  Obviously this might actually print more
>>> than is necessary, but there's (currently) no easy way to know how
>>> many instructions to disassemble; that knowledge is all locked in the
>>> architecture specific code.  Still, I don't think it really hurts; if
>>> someone is looking at this debug output then hopefully they know what
>>> to expect.
>>>
>>> Obviously we can imagine schemes where the architecture specific
>>> displaced stepping code could communicate back how many bytes its
>>> replacement sequence was, and then our debug print code could use this
>>> to limit the disassembly.  But this seems like a lot of effort just to
>>> save printing a few additional instructions in some debug output.
>>>
>>> I'm not proposing to do anything about this issue for now.
>>> ---
>>>  gdb/infrun.c | 85 +++++++++++++++++++++++++++++++++++++++++-----------
>>>  1 file changed, 68 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/gdb/infrun.c b/gdb/infrun.c
>>> index 5c9babb9104..8c56a9a4dfb 100644
>>> --- a/gdb/infrun.c
>>> +++ b/gdb/infrun.c
>>> @@ -74,6 +74,7 @@
>>>  #include "gdbsupport/common-debug.h"
>>>  #include "gdbsupport/buildargv.h"
>>>  #include "extension.h"
>>> +#include "disasm.h"
>>>  
>>>  /* Prototypes for local functions */
>>>  
>>> @@ -1807,6 +1808,31 @@ displaced_step_prepare_throw (thread_info *tp)
>>>    CORE_ADDR original_pc = regcache_read_pc (regcache);
>>>    CORE_ADDR displaced_pc;
>>>  
>>> +  /* Display the instruction we are going to displaced step.  */
>>> +  if (debug_displaced)
>>> +    {
>>> +      string_file tmp_stream;
>>> +      int dislen = gdb_print_insn (gdbarch, original_pc, &tmp_stream,
>>> +				   nullptr);
>>> +
>>> +      if (dislen > 0)
>>> +	{
>>> +	  gdb::byte_vector insn_buf (dislen);
>>> +	  read_memory (original_pc, insn_buf.data (), insn_buf.size ());
>>> +
>>> +	  std::string insn_bytes
>>> +	    = displaced_step_dump_bytes (insn_buf.data (), insn_buf.size ());
>>> +
>>> +	  displaced_debug_printf ("original insn %s: %s \t %s",
>>> +				  paddress (gdbarch, original_pc),
>>> +				  insn_bytes.c_str (),
>>> +				  tmp_stream.string ().c_str ());
>>
>> If the bytes disassemble to more than one instruction, does tmp_stream
>> contain newline characters?  Just wondering what the output would look
>> like (not a big deal in any case).
> 
> No.  gdb_print_insn will only disassemble a single instruction and
> return its length.  In this bit of debug, we assume the original
> instruction is always a single instruction.  If that's not true then
> I've seriously not understood how displaced stepping works.

Eh yeah, that's the input instruction, you're right.

> 
> For the replacement instructions the call to gdb_print_insn is placed
> inside a loop which calls gdb_print_insn multiple times, so you'll see
> multiple lines like:
> 
>   [displaced] displaced_step_prepare_throw: replacement insn <ADDRESS>: <BYTES> <DISASSEMBLY>
> 
> I use this trick:
> 
>   CORE_ADDR end
>     = addr + (gdbarch_displaced_step_hw_singlestep (gdbarch)
>               ? 1 : gdbarch_displaced_step_buffer_length (gdbarch));
> 
> Which means for targets that do a 1:1 replacement we only disassemble a
> single instruction.  But for everyone else we'll always disassemble the
> entire displaced step buffer.  Currently the only such target is ARM.

Ok, I missed that; that makes sense.
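
To spell out how that end-address trick bounds the loop, here is a minimal
standalone sketch.  It is not the code from the patch: fake_print_insn is a
hypothetical stand-in for gdb_print_insn (it pretends every instruction is
4 bytes long), disassemble_range is an invented helper, and the hw_singlestep
and buffer_len parameters stand in for the values returned by
gdbarch_displaced_step_hw_singlestep and
gdbarch_displaced_step_buffer_length.  The loop shape is my assumption about
how such a bound would be used.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for gdb_print_insn: "disassembles" the single
// instruction at ADDR into OUT and returns its length in bytes.  Every
// instruction is assumed to be 4 bytes long for this sketch.
static int
fake_print_insn (uint64_t addr, std::string &out)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "insn@0x%llx", (unsigned long long) addr);
  out = buf;
  return 4;
}

// Return one entry per instruction disassembled starting at ADDR.
// HW_SINGLESTEP and BUFFER_LEN are assumed inputs standing in for the
// gdbarch queries mentioned in the thread.
static std::vector<std::string>
disassemble_range (uint64_t addr, bool hw_singlestep, uint64_t buffer_len)
{
  // For a 1:1 replacement END is ADDR + 1, so the loop runs exactly once:
  // after the first instruction ADDR has already moved past END.
  // Otherwise END covers the whole displaced step buffer.
  uint64_t end = addr + (hw_singlestep ? 1 : buffer_len);

  std::vector<std::string> lines;
  while (addr < end)
    {
      std::string text;
      int len = fake_print_insn (addr, text);
      if (len <= 0)
	break;	// Disassembly failed, stop rather than loop forever.
      lines.push_back (text);
      addr += len;
    }
  return lines;
}
```

With a 16 byte buffer, calling disassemble_range (0x1000, true, 16) yields a
single entry, while disassemble_range (0x1000, false, 16) yields four, one
for each 4 byte slot in the buffer.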

Simon