From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=QWNh=56=fbl.cz=vanekt@sourceware.org>
Received: from smtp00.avonet.cz (smtp00.avonet.cz [217.112.162.55])
	by sourceware.org (Postfix) with ESMTP id 33E983858401
	for <gdb-patches@sourceware.org>; Thu,  2 Feb 2023 06:38:44 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 33E983858401
Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=fbl.cz
Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=fbl.cz
Received: from ktus.lan (217-115-245-101.cust.avonet.cz [217.115.245.101])
	by smtp00.avonet.cz (Postfix) with ESMTP id 4P6pyq1rl0z1xq5
	for <gdb-patches@sourceware.org>; Thu,  2 Feb 2023 07:38:43 +0100 (CET)
Received: by ktus.lan (Postfix, from userid 209)
	id 2ADEA3203FF; Thu,  2 Feb 2023 07:38:43 +0100 (CET)
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,NICE_REPLY_A,SPF_HELO_NONE,TXREP,T_SPF_PERMERROR autolearn=ham autolearn_force=no version=3.4.6
Received: from [192.168.33.9] (217-115-245-101.cust.avonet.cz [217.115.245.101])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: vanekt)
	by ktus.lan (Postfix) with ESMTPSA id 71AA03203F9;
	Thu,  2 Feb 2023 07:38:32 +0100 (CET)
Message-ID: <1e1f3a74-38e9-1d16-d585-9d40e437bf7a@fbl.cz>
Date: Thu, 2 Feb 2023 07:38:33 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
 Thunderbird/91.13.1
Subject: Re: [PATCH v2] gdb: Modify until_break_command to act correctly on
 SIGTRAMP_FRAME
Content-Language: en-US
To: Simon Marchi <simark@simark.ca>, gdb-patches@sourceware.org
References: <1666353538-15846-1-git-send-email-vanekt@fbl.cz>
 <dd7a85c5-a1ff-f687-f44e-640c8191d3c4@simark.ca>
 <7a8c358c-7de3-8390-2566-cee03278184e@fbl.cz>
 <3bb9f09e-68a8-8829-c3b2-380f66bf5974@simark.ca>
From: Tomas Vanek <vanekt@fbl.cz>
Cc: Luis Machado <luis.machado@arm.com>
In-Reply-To: <3bb9f09e-68a8-8829-c3b2-380f66bf5974@simark.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: <gdb-patches.sourceware.org>

On 11/01/2023 02:38, Simon Marchi wrote:
>
> On 1/10/23 18:22, Tomas Vanek via Gdb-patches wrote:
>> On 10/01/2023 18:48, Simon Marchi wrote:
>>> On 10/21/22 07:58, Tomas Vanek wrote:
>>>> This patch partially depends on
>>>> gdb/arm: Terminate frame unwinding in M-profile lockup state
>>>> (without it lockup state is unwound as if it were a normal
>>>> stack frame).
>>>>
>>>> The commands 'advance' and 'until' try to set a breakpoint
>>>> on the bogus return address derived from Arm M-profile magic
>>>> address (actually EXC_RETURN or a PC value indicating lockup).
>>>>
>>>> The offending breakpoint should be set at the return address in
>>>> the caller. The magic value 0xffffffff in LR indicates
>>>> there is no caller (return to this address would lock up the CPU).
>>>>
>>>> Similar behaviour of 'advance' and 'until' is observed in
>>>> an exception handler routine. In this case LR contains e.g.
>>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at
>>>> 0xfffffff0. It should use a return value stacked by the exception
>>>> instead.
>>>>
>>>> Testbench setup:
>>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used.
>>>> A test application (an ordinary blink) with a standard startup
>>>> is loaded to the device flash.
>>>>
>>>> Steps to reproduce the problem:
>>>>
>>>> start GDB server
>>>>    $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg
>>>>
>>>> start GDB in second terminal
>>>>    $ arm-none-eabi-gdb blink.elf
>>>>
>>>>    (gdb) target extended-remote localhost:3333
>>>>
>>>> Reset the device and halt it:
>>>>    (gdb) monitor reset halt
>>>>    target halted due to debug-request, current mode: Thread
>>>>    xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000
>>>>
>>>> Step by one instruction to re-read GDB register cache:
>>>>    (gdb) stepi
>>>>
>>>> Check registers, LR should be 0xffffffff after reset:
>>>>    (gdb) info registers
>>>>    ...
>>>>    sp             0x20020000          0x20020000
>>>>    lr             0xffffffff          -1
>>>>    pc             0x8000e16           0x8000e16
>>>>    xPSR           0x1000000           16777216
>>>>    ...
>>>>
>>>>    (gdb) set debug remote
>>>>
>>>> Issue 'advance' command:
>>>>    (gdb) advance main
>>>>    [remote] Sending packet: $mfffffffe,2#fa
>>>>    [remote] Packet received: 0000
>>>>    [remote] Sending packet: $mfffffffe,2#fa
>>>>    [remote] Packet received: 0000
>>>>    [remote] Sending packet: $m8000526,2#30
>>>>    [remote] Packet received: 2046
>>>>    [remote] Sending packet: $Z1,8000526,2#7a
>>>>    [remote] Packet received: OK
>>>>    [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported
>>>>    [remote] Sending packet: $Z0,fffffffe,2#43
>>>>    [remote] Packet received: E0E
>>>>    [remote] packet_ok: Packet Z0 (software-breakpoint) is supported
>>>>    Warning:
>>>>    Cannot insert breakpoint 0.
>>>>    Cannot access memory at address 0xfffffffe
>>>>
>>>>    Command aborted.
>>>>    (gdb)
>>>>
>>>> Relevant messages from OpenOCD:
>>>>    Error: Failed to read memory at 0xfffff000
>>>>    Error: can't add breakpoint: unknown reason
>>>>
>>>> This patch adds skipping over frames that are not suitable for
>>>> guarding with a breakpoint inspired by 'finish' command.
>>>> If no suitable frame is found, a momentary breakpoint is not set.
>>>>
>>>> v2: Comment fixes, bug reference.
>>>>
>>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>>>> Signed-off-by: Tomas Vanek <vanekt@fbl.cz>
>>>> ---
>>>>    gdb/breakpoint.c | 22 +++++++++++++++++-----
>>>>    1 file changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c
>>>> index f6591d4..bb85342 100644
>>>> --- a/gdb/breakpoint.c
>>>> +++ b/gdb/breakpoint.c
>>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason
>>>>    until_break_command (const char *arg, int from_tty, int anywhere)
>>>>    {
>>>>      frame_info_ptr frame;
>>>> +  frame_info_ptr caller_frame;
>>>>      struct gdbarch *frame_gdbarch;
>>>>      struct frame_id stack_frame_id;
>>>>      struct frame_id caller_frame_id;
>>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason
>>>>      frame = get_selected_frame (NULL);
>>>>      frame_gdbarch = get_frame_arch (frame);
>>>>      stack_frame_id = get_stack_frame_id (frame);
>>>> -  caller_frame_id = frame_unwind_caller_id (frame);
>>>> +
>>>> +  caller_frame = get_prev_frame_always (frame);
>>>> +
>>>> +  while (caller_frame)
>>>> +    {
>>>> +      if (get_frame_type (caller_frame) != TAILCALL_FRAME
>>>> +      && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame))
>>>> +    break;
>>>> +
>>>> +      caller_frame = get_prev_frame_always (caller_frame);
>>>> +    }
>>> frame_unwind_caller_id does skip inline frames (through
>>> skip_artificial_frames), whereas your version does not, is that correct?
>> TBH not sure.
>> I just reused the code in 'finish' command
>> https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1790-1808
>> because it does not suffer from breakpoints at magic addresses and changed it to prevent stopping at main()
>> as explained in
>> https://sourceware.org/pipermail/gdb-patches/2022-October/193223.htmlo
> For finish, we do not want to skip inline frames, because that's the
> semantic of the command.  finish for inlined frames is implemented by
> single-stepping in the current function until the current frame changes.
> It is taken care specially here:
>
>    https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1851
>
> Note that the pc for the inlined frame is the same as the pc for the
> frame it is inlined in.  So if you were in an inlined frame and simply
> took the next frame's pc and put a breakpoint there, you would end up
> putting a breakpoint at the current pc, so I think you would stop
> immediately.  Not what you want.  Hence the need to skip inline frames
> for those.
>
> If we had common code between the finish command and this other stuff, I
> think it would be harmless for the finish command if the common code
> skipped inline frames, since the finish command would have taken care of
> the inline frame case earlier, and returned early.
>
>>> I'm wondering if we should make frame_unwind_caller_id /
>>> skip_artificial_frames handle it instead.  In other words, would other
>>> callers of frame_unwind_caller_id / skip_artificial_frames need to skip
>>> such frames, and benefit from it.
>>>
>>> If we do that, frame_unwind_caller_id would return null_frame_id in your
>>> case, and I think that until_break_command would not need to be
>>> modified.
>>>
>>> Looking at watch_command_1, for instance, there seems to be the same
>>> situation, where we create a temporary breakpoint to detect when we get
>>> out of scope:
>>>
>>>     https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/breakpoint.c#L10215
>>>
>>> I guess if you try to create a watchpoint in that special frame, you'll
>>> hit the same problem?
>> Indeed.
>> I uploaded a slightly modified example
>> https://github.com/libopencm3/libopencm3-examples/blob/master/examples/stm32/f4/stm32f4-discovery/tick_blink/tick_blink.c
>> to the original bug ticket https://sourceware.org/bugzilla/show_bug.cgi?id=28683
>> It runs in QEMU and has a local variable in the ISR routine.
>> Setting 'watch loc'  while stopped at tick_blink.c:39 and 'continue' results in
>>   [remote] Sending packet: $Z0,fffffff9,2#17
>> as you expected.
> Cool, thanks.
>
>>>   From what I understand, in the ARM case, the sigtramp frame does not
>>> represent any code, and has no real PC.  When the normal frame #0
>>> returns, control goes back directly to the frame above the sigtramp
>>> frame.  It does not go execute instructions in the sigtramp frame.  Is
>>> that right?
>> Yes, exactly.
>>> With Linux on amd64, after returning from the normal frame, control goes
>>> to some instructions that will call the sigreturn syscall, so the
>>> sigtramp frame does have a real pc.
>> Didn't know that.
>> At least the current 'finish' command implementation skips that part.
>> Isn't it intentional?
> Using this program:
>
>    #include <signal.h>
>
>    int c;
>
>    void handler(int a)
>    {
>      c++;
>    }
>
>    int main(int argc, char ** argv)
>    {
>      signal(SIGSEGV, handler);
>      raise(SIGSEGV);
>    }
>
> GDB lets me debug the sigtramp frame just like a normal frame, including
> finish:
>
>      $ ./gdb -q -nx --data-directory=data-directory a.out -iex "set debuginfod enabled on"
>      Reading symbols from a.out...
>      (gdb) b handler
>      Breakpoint 1 at 0x1150: file test.c, line 7.
>      (gdb) run
>      Starting program: /home/simark/build/binutils-gdb/gdb/a.out
>      Downloading separate debug info for system-supplied DSO at 0x7ffff7fc8000
>      [Thread debugging using libthread_db enabled]
>      Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1".
>
>      Program received signal SIGSEGV, Segmentation fault.
>      __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
>      44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
>      (gdb) c
>      Continuing.
>
>      Breakpoint 1, handler (a=11) at test.c:7
>      7         c++;
>      (gdb) bt
>      #0  handler (a=11) at test.c:7
>      #1  <signal handler called>
>      #2  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
>      #3  0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=<optimized out>) at pthread_kill.c:78
>      #4  0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
>      #5  0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13
>      (gdb) fin
>      Run till exit from #0  handler (a=11) at test.c:7
>      <signal handler called>
>      (gdb) disas
>      Dump of assembler code for function __restore_rt:
>      => 0x00007ffff7ddca00 <+0>:     mov    $0xf,%rax
>         0x00007ffff7ddca07 <+7>:     syscall
>         0x00007ffff7ddca09 <+9>:     nopl   0x0(%rax)
>      End of assembler dump.
>      (gdb) bt
>      #0  <signal handler called>
>      #1  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
>      #2  0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=<optimized out>) at pthread_kill.c:78
>      #3  0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
>      #4  0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13
>      (gdb) si
>      <signal handler called>
>
> I'm honestly surprised that GDB lets you debug the sigtramp frame.  Not
> that there's anything wrong with it, but the user typically doesn't
> care, unless they are debugging the delivery of signals of their OS.
>
>>> If so, I think that's the distinction we are dealing with here.  We want
>>> to know what's the first instruction that is going to be executed after
>>> leaving the current frame.  And it may sometimes be the sigtramp frame,
>>> or it may sometimes be the frame above that.
>> No idea how to handle that.
> I think you are on the right track.  We need a function that has that
> semantic: what is the closest place after returning from the given
> frame, where it's possible to put a breakpoint.  I think this function
> could be used by the various commands that have this similar need.  The
> until/advance commands, probably finish, and maybe the watch command, as
> you have found earlier.
>
> Simon
Simon,

I addressed your remarks in

https://sourceware.org/pipermail/gdb-patches/2023-January/195561.html

I also covered the case you pointed out, the 'watch' command of a local 
variable.

insert_step_resume_breakpoint_at_caller() is covered too, unfortunately 
I was not
able to prepare a test case when 
insert_step_resume_breakpoint_at_caller() operates
with SIGTRAMP caller frame - perhaps it's not possible at all?

Could you take a look?

Tomas