From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp00.avonet.cz (smtp00.avonet.cz [217.112.162.55]) by sourceware.org (Postfix) with ESMTP id 33E983858401 for ; Thu, 2 Feb 2023 06:38:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 33E983858401 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=fbl.cz Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=fbl.cz Received: from ktus.lan (217-115-245-101.cust.avonet.cz [217.115.245.101]) by smtp00.avonet.cz (Postfix) with ESMTP id 4P6pyq1rl0z1xq5 for ; Thu, 2 Feb 2023 07:38:43 +0100 (CET) Received: by ktus.lan (Postfix, from userid 209) id 2ADEA3203FF; Thu, 2 Feb 2023 07:38:43 +0100 (CET) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,NICE_REPLY_A,SPF_HELO_NONE,TXREP,T_SPF_PERMERROR autolearn=ham autolearn_force=no version=3.4.6 Received: from [192.168.33.9] (217-115-245-101.cust.avonet.cz [217.115.245.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: vanekt) by ktus.lan (Postfix) with ESMTPSA id 71AA03203F9; Thu, 2 Feb 2023 07:38:32 +0100 (CET) Message-ID: <1e1f3a74-38e9-1d16-d585-9d40e437bf7a@fbl.cz> Date: Thu, 2 Feb 2023 07:38:33 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.13.1 Subject: Re: [PATCH v2] gdb: Modify until_break_command to act correctly on SIGTRAMP_FRAME Content-Language: en-US To: Simon Marchi , gdb-patches@sourceware.org References: <1666353538-15846-1-git-send-email-vanekt@fbl.cz> <7a8c358c-7de3-8390-2566-cee03278184e@fbl.cz> <3bb9f09e-68a8-8829-c3b2-380f66bf5974@simark.ca> From: Tomas Vanek Cc: Luis Machado In-Reply-To: <3bb9f09e-68a8-8829-c3b2-380f66bf5974@simark.ca> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: On 11/01/2023 02:38, Simon Marchi wrote: > > On 1/10/23 18:22, Tomas Vanek via Gdb-patches wrote: >> On 10/01/2023 18:48, Simon Marchi wrote: >>> On 10/21/22 07:58, Tomas Vanek wrote: >>>> This patch partially depends on >>>> gdb/arm: Terminate frame unwinding in M-profile lockup state >>>> (without it lockup state is unwound as if it were a normal >>>> stack frame). >>>> >>>> The commands 'advance' and 'until' try to set a breakpoint >>>> on the bogus return address derived from Arm M-profile magic >>>> address (actually EXC_RETURN or a PC value indicating lockup). >>>> >>>> The offending breakpoint should be set at the return address in >>>> the caller. The magic value 0xffffffff in LR indicates >>>> there is no caller (return to this address would lock up the CPU). >>>> >>>> Similar behaviour of 'advance' and 'until' is observed in >>>> an exception handler routine. In this case LR contains e.g. >>>> 0xfffffff1 (EXC_RETURN) and GDB tries to set a breakpoint at >>>> 0xfffffff0. It should use a return value stacked by the exception >>>> instead. >>>> >>>> Testbench setup: >>>> STM32G474, a Cortex-M4 device. Any Cortex-M device can be used. >>>> A test application (an ordinary blink) with a standard startup >>>> is loaded to the device flash. >>>> >>>> Steps to reproduce the problem: >>>> >>>> start GDB server >>>> $ openocd -f interface/cmsis-dap.cfg -f target/stm32g4x.cfg >>>> >>>> start GDB in second terminal >>>> $ arm-none-eabi-gdb blink.elf >>>> >>>> (gdb) target extended-remote localhost:3333 >>>> >>>> Reset the device and halt it: >>>> (gdb) monitor reset halt >>>> target halted due to debug-request, current mode: Thread >>>> xPSR: 0x01000000 pc: 0x08000e14 msp: 0x20020000 >>>> >>>> Step by one instruction to re-read GDB register cache: >>>> (gdb) stepi >>>> >>>> Check registers, LR should be 0xffffffff after reset: >>>> (gdb) info registers >>>> ... >>>> sp 0x20020000 0x20020000 >>>> lr 0xffffffff -1 >>>> pc 0x8000e16 0x8000e16 >>>> xPSR 0x1000000 16777216 >>>> ... >>>> >>>> (gdb) set debug remote >>>> >>>> Issue 'advance' command: >>>> (gdb) advance main >>>> [remote] Sending packet: $mfffffffe,2#fa >>>> [remote] Packet received: 0000 >>>> [remote] Sending packet: $mfffffffe,2#fa >>>> [remote] Packet received: 0000 >>>> [remote] Sending packet: $m8000526,2#30 >>>> [remote] Packet received: 2046 >>>> [remote] Sending packet: $Z1,8000526,2#7a >>>> [remote] Packet received: OK >>>> [remote] packet_ok: Packet Z1 (hardware-breakpoint) is supported >>>> [remote] Sending packet: $Z0,fffffffe,2#43 >>>> [remote] Packet received: E0E >>>> [remote] packet_ok: Packet Z0 (software-breakpoint) is supported >>>> Warning: >>>> Cannot insert breakpoint 0. >>>> Cannot access memory at address 0xfffffffe >>>> >>>> Command aborted. >>>> (gdb) >>>> >>>> Relevant messages from OpenOCD: >>>> Error: Failed to read memory at 0xfffff000 >>>> Error: can't add breakpoint: unknown reason >>>> >>>> This patch adds skipping over frames that are not suitable for >>>> guarding with a breakpoint inspired by 'finish' command. >>>> If no suitable frame is found, a momentary breakpoint is not set. >>>> >>>> v2: Comment fixes, bug reference. >>>> >>>> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28683 >>>> Signed-off-by: Tomas Vanek >>>> --- >>>> gdb/breakpoint.c | 22 +++++++++++++++++----- >>>> 1 file changed, 17 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/gdb/breakpoint.c b/gdb/breakpoint.c >>>> index f6591d4..bb85342 100644 >>>> --- a/gdb/breakpoint.c >>>> +++ b/gdb/breakpoint.c >>>> @@ -10467,6 +10467,7 @@ enum async_reply_reason >>>> until_break_command (const char *arg, int from_tty, int anywhere) >>>> { >>>> frame_info_ptr frame; >>>> + frame_info_ptr caller_frame; >>>> struct gdbarch *frame_gdbarch; >>>> struct frame_id stack_frame_id; >>>> struct frame_id caller_frame_id; >>>> @@ -10505,7 +10506,17 @@ enum async_reply_reason >>>> frame = get_selected_frame (NULL); >>>> frame_gdbarch = get_frame_arch (frame); >>>> stack_frame_id = get_stack_frame_id (frame); >>>> - caller_frame_id = frame_unwind_caller_id (frame); >>>> + >>>> + caller_frame = get_prev_frame_always (frame); >>>> + >>>> + while (caller_frame) >>>> + { >>>> + if (get_frame_type (caller_frame) != TAILCALL_FRAME >>>> + && gdbarch_code_of_frame_writable (get_frame_arch (caller_frame), caller_frame)) >>>> + break; >>>> + >>>> + caller_frame = get_prev_frame_always (caller_frame); >>>> + } >>> frame_unwind_caller_id does skip inline frames (through >>> skip_artificial_frames), whereas your version does not, is that correct? >> TBH not sure. >> I just reused the code in 'finish' command >> https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1790-1808 >> because it does not suffer from breakpoints at magic addresses and changed it to prevent stopping at main() >> as explained in >> https://sourceware.org/pipermail/gdb-patches/2022-October/193223.htmlo > For finish, we do not want to skip inline frames, because that's the > semantic of the command. finish for inlined frames is implemented by > single-stepping in the current function until the current frame changes. > It is taken care specially here: > > https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/infcmd.c#L1851 > > Note that the pc for the inlined frame is the same as the pc for the > frame it is inlined in. So if you were in an inlined frame and simply > took the next frame's pc and put a breakpoint there, you would end up > putting a breakpoint at the current pc, so I think you would stop > immediately. Not what you want. Hence the need to skip inline frames > for those. > > If we had common code between the finish command and this other stuff, I > think it would be harmless for the finish command if the common code > skipped inline frames, since the finish command would have taken care of > the inline frame case earlier, and returned early. > >>> I'm wondering if we should make frame_unwind_caller_id / >>> skip_artificial_frames handle it instead. In other words, would other >>> callers of frame_unwind_caller_id / skip_artificial_frames need to skip >>> such frames, and benefit from it. >>> >>> If we do that, frame_unwind_caller_id would return null_frame_id in your >>> case, and I think that until_break_command would not need to be >>> modified. >>> >>> Looking at watch_command_1, for instance, there seems to be the same >>> situation, where we create a temporary breakpoint to detect when we get >>> out of scope: >>> >>> https://gitlab.com/gnutools/binutils-gdb/-/blob/8ec0b0b5df0ebe28c32900afc7ae8ff22b21f381/gdb/breakpoint.c#L10215 >>> >>> I guess if you try to create a watchpoint in that special frame, you'll >>> hit the same problem? >> Indeed. >> I uploaded a slightly modified example >> https://github.com/libopencm3/libopencm3-examples/blob/master/examples/stm32/f4/stm32f4-discovery/tick_blink/tick_blink.c >> to the original bug ticket https://sourceware.org/bugzilla/show_bug.cgi?id=28683 >> It runs in QEMU and has a local variable in the ISR routine. >> Setting 'watch loc' while stopped at tick_blink.c:39 and 'continue' results in >> [remote] Sending packet: $Z0,fffffff9,2#17 >> as you expected. > Cool, thanks. > >>> From what I understand, in the ARM case, the sigtramp frame does not >>> represent any code, and has no real PC. When the normal frame #0 >>> returns, control goes back directly to the frame above the sigtramp >>> frame. It does not go execute instructions in the sigtramp frame. Is >>> that right? >> Yes, exactly. >>> With Linux on amd64, after returning from the normal frame, control goes >>> to some instructions that will call the sigreturn syscall, so the >>> sigtramp frame does have a real pc. >> Didn't know that. >> At least the current 'finish' command implementation skips that part. >> Isn't it intentional? > Using this program: > > #include > > int c; > > void handler(int a) > { > c++; > } > > int main(int argc, char ** argv) > { > signal(SIGSEGV, handler); > raise(SIGSEGV); > } > > GDB lets me debug the sigtramp frame just like a normal frame, including > finish: > > $ ./gdb -q -nx --data-directory=data-directory a.out -iex "set debuginfod enabled on" > Reading symbols from a.out... > (gdb) b handler > Breakpoint 1 at 0x1150: file test.c, line 7. > (gdb) run > Starting program: /home/simark/build/binutils-gdb/gdb/a.out > Downloading separate debug info for system-supplied DSO at 0x7ffff7fc8000 > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/usr/lib/../lib/libthread_db.so.1". > > Program received signal SIGSEGV, Segmentation fault. > __pthread_kill_implementation (threadid=, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44 > 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; > (gdb) c > Continuing. > > Breakpoint 1, handler (a=11) at test.c:7 > 7 c++; > (gdb) bt > #0 handler (a=11) at test.c:7 > #1 > #2 __pthread_kill_implementation (threadid=, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44 > #3 0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=) at pthread_kill.c:78 > #4 0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26 > #5 0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13 > (gdb) fin > Run till exit from #0 handler (a=11) at test.c:7 > > (gdb) disas > Dump of assembler code for function __restore_rt: > => 0x00007ffff7ddca00 <+0>: mov $0xf,%rax > 0x00007ffff7ddca07 <+7>: syscall > 0x00007ffff7ddca09 <+9>: nopl 0x0(%rax) > End of assembler dump. > (gdb) bt > #0 > #1 __pthread_kill_implementation (threadid=, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44 > #2 0x00007ffff7e2c6b3 in __pthread_kill_internal (signo=11, threadid=) at pthread_kill.c:78 > #3 0x00007ffff7ddc958 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26 > #4 0x000055555555518f in main (argc=1, argv=0x7fffffffd758) at test.c:13 > (gdb) si > > > I'm honestly surprised that GDB lets you debug the sigtramp frame. Not > that there's anything wrong with it, but the user typically doesn't > care, unless they are debugging the delivery of signals of their OS. > >>> If so, I think that's the distinction we are dealing with here. We want >>> to know what's the first instruction that is going to be executed after >>> leaving the current frame. And it may sometimes be the sigtramp frame, >>> or it may sometimes be the frame above that. >> No idea how to handle that. > I think you are on the right track. We need a function that has that > semantic: what is the closest place after returning from the given > frame, where it's possible to put a breakpoint. I think this function > could be used by the various commands that have this similar need. The > until/advance commands, probably finish, and maybe the watch command, as > you have found earlier. > > Simon Simon, I addressed your remarks in https://sourceware.org/pipermail/gdb-patches/2023-January/195561.html I also covered the case you pointed out, the 'watch' command of a local variable. insert_step_resume_breakpoint_at_caller() is covered too, unfortunately I was not able to prepare a test case when insert_step_resume_breakpoint_at_caller() operates with SIGTRAMP caller frame - perhaps it's not possible at all? Could you take a look? Tomas