* GDB Frame Unwinding for Pure Assembly Code @ 2020-06-22 21:29 Ahmad Nouralizadeh 2020-06-22 21:39 ` Jan Kratochvil 0 siblings, 1 reply; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-22 21:29 UTC (permalink / raw) To: gdb Hi, I ran a simple ffmpeg benchmark in perf to record dram access samples with backtraces. Perf copies stack contents during execution and uses libunwind to unwind the backtraces in a post-process phase. The problem is that ffmpeg has many pure assembly implementations which do not have entries in ".eh_frame". Therefore, libunwind can not unwind backtraces containing these access points. For example, GDB reports the following backtrace which has a pure assembly code address in the innermost frame (i.e., #0): #0 0x00007fffeab68840 in x264_pixel_avg_w16_avx2 () at /usr/lib/x86_64-linux-gnu/libx264.so.152 #1 0x00007fffeaabdad9 in x264_mb_mc_01xywh (h=0x5555561dea00, x=0, y=0, width=4, height=<optimized out>) at common/macroblock.c:128 #2 0x00007fffeab0dd37 in x264_macroblock_analyse (h=h@entry=0x5555561dea00) at encoder/analyse.c:3456 #3 0x00007fffeab48c6e in x264_slice_write (h=0x5555561dea00) at encoder/encoder.c:2739 #4 0x00007fffeab85286 in x264_stack_align () at /usr/lib/x86_64-linux-gnu/libx264.so.152 #5 0x00007fffeab45bdc in x264_slices_write (h=h@entry=0x5555561dea00) at encoder/encoder.c:3079 #6 0x00007fffeab52062 in x264_encoder_encode (h=<optimized out>, pp_nal=pp_nal@entry=0x7fffffffca28, pi_nal=pi_nal@entry=0x7fffffffca20, pic_in=pic_in@entry=0x555555828098, pic_out=pic_out@entry=0x7fffffffca30) at encoder/encoder.c:3738 #7 0x00007ffff5ff8cd1 in X264_frame (ctx=0x555555827800, pkt=0x55555610acc0, frame=0x55555633a500, got_packet=0x7fffffffcc64) at src/libavcodec/libx264.c:337 #8 0x00007ffff5dbc6c7 in avcodec_encode_video2 (avctx=avctx@entry=0x555555827800, avpkt=0x55555610acc0, frame=frame@entry=0x55555633a500, got_packet_ptr=got_packet_ptr@entry=0x7fffffffcc64) at src/libavcodec/encode.c:307 #9 0x00007ffff5dbcaad in do_encode (avctx=0x555555827800, frame=0x55555633a500, got_packet=0x7fffffffcc64) at src/libavcodec/encode.c:378 #10 0x00007ffff5dbcc5a in avcodec_send_frame (avctx=0x555555827800, frame=0x55555633a500) at src/libavcodec/encode.c:427 #11 0x000055555557e253 in do_video_out (of=0x555555825020, ost=0x5555558208a0, next_picture=0x55555633a500, sync_ipts=<optimized out>) at src/fftools/ffmpeg.c:1288 #12 0x000055555558220f in reap_filters (flush=0) at src/fftools/ffmpeg.c:1506 #13 0x00005555555611b2 in transcode_step () at src/fftools/ffmpeg.c:4588 #14 0x00005555555611b2 in transcode () at src/fftools/ffmpeg.c:4632 #15 0x00005555555611b2 in main (argc=<optimized out>, argv=<optimized out>) at src/fftools/ffmpeg.c:4838 Unlike GDB, Perf (precisely, libunwind) can not unwind this backtrace and only reports the innermost frame (i.e., "x264_pixel_avg_w16_avx2()"). I traced GDB execution at backtrace report time and this seems to be the point where the RIP value (i.e., "0x00007fffeaabdad9") for the next to the innermost frame (frame 1?!) is retrieved: #0 _Z27frame_unwind_register_valueP10frame_infoi (frame=0x55555673c2e0, regnum=16) at frame.c:1234 #1 0x00005555558b231e in _Z21frame_register_unwindP10frame_infoiPiS1_P9lval_typePmS1_Ph (frame=0x55555673c2e0, regnum=16, optimizedp=0x7fffffffd6d0, unavailablep=0x7fffffffd6d4, lvalp=0x7fffffffd6dc, addrp=0x7fffffffd6e0, realnump=0x7fffffffd6d8, bufferp=0x7fffffffd710 " \327\377\377\377\177") at frame.c:1096 #2 0x00005555558b2643 in _Z21frame_unwind_registerP10frame_infoiPh (frame=0x55555673c2e0, regnum=16, buf=0x7fffffffd710 " \327\377\377\377\177") at frame.c:1153 #3 0x0000555555602706 in i386_unwind_pc (gdbarch=0x555556692200, next_frame=0x55555673c2e0) at i386-tdep.c:1965 #4 0x00005555558c6056 in _Z17gdbarch_unwind_pcP7gdbarchP10frame_info (gdbarch=0x555556692200, next_frame=0x55555673c2e0) at gdbarch.c:3075 #5 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x55555673c2e0) at frame.c:885 #6 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info (frame=0x55555673c490) at frame.c:2379 #7 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info (this_frame=0x55555673c490) at frame.c:2410 #8 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 <inline_frame_unwind>, this_frame=0x55555673c490, this_cache=0x55555673c4a8) at inline-frame.c:215 #9 0x00005555558b719a in frame_unwind_try_unwinder (this_frame=0x55555673c490, this_cache=0x55555673c4a8, unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 #10 0x00005555558b737d in _Z26frame_unwind_find_by_frameP10frame_infoPPv (this_frame=0x55555673c490, this_cache=0x55555673c4a8) at frame-unwind.c:164 #11 0x00005555558b0d8f in compute_frame_id (fi=0x55555673c490) at frame.c:501 #12 0x00005555558b407d in get_prev_frame_if_no_cycle (this_frame=0x55555673c2e0) at frame.c:1913 #13 0x00005555558b478e in get_prev_frame_always_1 (this_frame=0x55555673c2e0) at frame.c:2087 #14 0x00005555558b47e6 in _Z21get_prev_frame_alwaysP10frame_info (this_frame=0x55555673c2e0) at frame.c:2103 #15 0x00005555558b4e39 in _Z14get_prev_frameP10frame_info (this_frame=0x55555673c2e0) at frame.c:2356 #16 0x00005555556b7fab in _Z26frame_info_to_frame_objectP10frame_info (frame=0x55555673c2e0) at python/py-frame.c:372 #17 0x00005555556bd003 in bootstrap_python_frame_filters (frame=0x55555673c2e0, frame_low=0, frame_high=-1) at python/py-framefilter.c:1283 #18 0x00005555556bd3a2 in _Z24gdbpy_apply_frame_filterPK23extension_language_defnP10frame_infoi19ext_lang_frame_argsP6ui_outii ( extlang=0x5555560c3c40 <extension_language_python>, frame=0x55555673c2e0, flags=7, args_type=CLI_SCALAR_VALUES, out=0x55555636bef0, frame_low=0, frame_high=-1) at python/py-framefilter.c:1359 #19 0x00005555558a8664 in _Z27apply_ext_lang_frame_filterP10frame_infoi19ext_lang_frame_argsP6ui_outii (frame=0x55555673c2e0, flags=7, args_type=CLI_SCALAR_VALUES, out=0x55555636bef0, frame_low=0, frame_high=-1) at extension.c:570 #20 0x00005555559e33b4 in backtrace_command_1 (count_exp=0x0, show_locals=0, no_filters=0, from_tty=1) at stack.c:1789 #21 0x00005555559e3816 in backtrace_command (arg=0x0, from_tty=1) at stack.c:1902 #22 0x000055555566c8a4 in do_const_cfunc (c=0x555556344a70, args=0x0, from_tty=1) at cli/cli-decode.c:106 #23 0x000055555566fba8 in _Z8cmd_funcP16cmd_list_elementPKci (cmd=0x555556344a70, args=0x0, from_tty=1) at cli/cli-decode.c:1886 #24 0x0000555555a4066e in _Z15execute_commandPKci (p=0x55555660a1d2 "", from_tty=1) at top.c:630 #25 0x00005555558a1b58 in _Z15command_handlerPKc (command=0x55555660a1d0 "bt") at event-top.c:583 #26 0x00005555558a1f6c in _Z20command_line_handlerPc (rl=0x55555ac9fe70 "") at event-top.c:774 #27 0x00005555558a12c6 in gdb_rl_callback_handler (rl=0x55555ac9fe70 "") at event-top.c:213 #28 0x0000555555aca378 in rl_callback_read_char () at callback.c:220 #29 0x00005555558a11b2 in gdb_rl_callback_read_char_wrapper_noexcept () at event-top.c:175 #30 0x00005555558a1231 in gdb_rl_callback_read_char_wrapper (client_data=0x5555561f9030) at event-top.c:192 #31 0x00005555558a19ea in _Z19stdin_event_handleriPv (error=0, client_data=0x5555561f9030) at event-top.c:511 #32 0x000055555589f930 in handle_file_event (file_ptr=0x555560306720, ready_mask=1) at event-loop.c:733 #33 0x000055555589fef9 in gdb_wait_for_event (block=1) at event-loop.c:859 #34 0x000055555589ec76 in _Z16gdb_do_one_eventv () at event-loop.c:347 #35 0x000055555589ecbd in _Z16start_event_loopv () at event-loop.c:371 #36 0x000055555592869b in captured_command_loop () at main.c:330 #37 0x0000555555929c7c in captured_main (data=0x7fffffffe3f0) at main.c:1157 #38 0x0000555555929d51 in _Z8gdb_mainP18captured_main_args (args=0x7fffffffe3f0) at main.c:1173 #39 0x00005555555ff4a3 in main (argc=2, argv=0x7fffffffe4f8) at gdb.c:32 I'm not sure if this is the moment when that RIP value (i.e., "0x00007fffeaabdad9") is actually calculated (perhaps it was calculated as soon as the inferior process stopped and retrieved in the above backtrace). 1) Am I right? My other question is: 2) How can GDB get around the problem with pure assembly files (those assembled using NASM,... and not GCC)? Based on the second backtrace, it seems that, GDB differentiates between frames and has detected that the innermost frame should use "inline_frame_sniffer" as the unwinder to find its return point (i.e., "0x00007fffeaabdad9"). I also checked for two such pure assembly cases and saw that the return address is exactly where $RSP points. Is it possible that GDB, luckily, found the return address? Perhaps, because the assembly code is only a single level of function call (e.g., there is no pure assembly call to another pure assembly code) and no stack manipulation (i.e., push/pop) happened after the assembly function is called and before the breakpoint is triggered (Note that the above backtrace triggers immediately when the assembly function is called and is not a good example. But in reality my samples are in the middle of the assembly functions)? Any guesses/ideas is appreciated. The code is really advanced and hard to grasp. In other words, I'm stuck. Regards. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-22 21:29 GDB Frame Unwinding for Pure Assembly Code Ahmad Nouralizadeh @ 2020-06-22 21:39 ` Jan Kratochvil 2020-06-22 21:55 ` Ahmad Nouralizadeh 0 siblings, 1 reply; 13+ messages in thread From: Jan Kratochvil @ 2020-06-22 21:39 UTC (permalink / raw) To: Ahmad Nouralizadeh; +Cc: gdb On Mon, 22 Jun 2020 23:29:01 +0200, Ahmad Nouralizadeh via Gdb wrote: > Perf copies stack contents during execution and uses > libunwind to unwind the backtraces in a post-process phase. perf can use also elfutils unwinder by a compile time option. IIRC someone was implementing non-DWARF unwinder for elfutils but not sure if it got upstreamed or not. You can try also the elfutils unwinder. as has .cfi_startproc to generate .eh_frame, doesn't nasm also have some such? Jan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-22 21:39 ` Jan Kratochvil @ 2020-06-22 21:55 ` Ahmad Nouralizadeh 2020-06-23 7:26 ` Jan Kratochvil 0 siblings, 1 reply; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-22 21:55 UTC (permalink / raw) To: Jan Kratochvil; +Cc: gdb Hi Jan, Thanks! I will try with elfutils. I could not find any ".cfi_*" directive in libx264 source code. But knowing the GDB mechanism to get over the problem will be helpful. Regards. On Tue, 23 Jun 2020 at 02:10, Jan Kratochvil <jan.kratochvil@redhat.com> wrote: > > On Mon, 22 Jun 2020 23:29:01 +0200, Ahmad Nouralizadeh via Gdb wrote: > > Perf copies stack contents during execution and uses > > libunwind to unwind the backtraces in a post-process phase. > > perf can use also elfutils unwinder by a compile time option. > IIRC someone was implementing non-DWARF unwinder for elfutils but not sure if > it got upstreamed or not. You can try also the elfutils unwinder. > > as has .cfi_startproc to generate .eh_frame, doesn't nasm also have some such? > > > Jan > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-22 21:55 ` Ahmad Nouralizadeh @ 2020-06-23 7:26 ` Jan Kratochvil 2020-06-23 11:48 ` Ahmad Nouralizadeh 0 siblings, 1 reply; 13+ messages in thread From: Jan Kratochvil @ 2020-06-23 7:26 UTC (permalink / raw) To: Ahmad Nouralizadeh; +Cc: gdb On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb wrote: > But knowing the GDB mechanism to get over the problem will be helpful. GDB disassembles the code and tries to guess how to unwind it. amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. That is just a last resort way of unwinding (=a bug in the debuggee), there should always be .eh_frame in the debuggee, also for throwing exceptions across such .eh_frame-less functions if there is any callback there. Jan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-23 7:26 ` Jan Kratochvil @ 2020-06-23 11:48 ` Ahmad Nouralizadeh 2020-06-23 12:23 ` Jan Kratochvil ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-23 11:48 UTC (permalink / raw) To: Jan Kratochvil; +Cc: gdb Hi Jan, Thanks for the answer! Could you tell me why is an inline unwinder used? Regards. On Tuesday, 23 June 2020, Jan Kratochvil <jan.kratochvil@redhat.com> wrote: > On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb wrote: > > But knowing the GDB mechanism to get over the problem will be helpful. > > GDB disassembles the code and tries to guess how to unwind it. > amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. > > That is just a last resort way of unwinding (=a bug in the debuggee), there > should always be .eh_frame in the debuggee, also for throwing exceptions > across such .eh_frame-less functions if there is any callback there. > > > Jan > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-23 11:48 ` Ahmad Nouralizadeh @ 2020-06-23 12:23 ` Jan Kratochvil 2020-06-23 14:23 ` Ahmad Nouralizadeh 2020-06-24 14:00 ` Simon Marchi 2020-06-24 15:06 ` Andrew Burgess 2 siblings, 1 reply; 13+ messages in thread From: Jan Kratochvil @ 2020-06-23 12:23 UTC (permalink / raw) To: Ahmad Nouralizadeh via Gdb On Tue, 23 Jun 2020 13:48:42 +0200, Ahmad Nouralizadeh via Gdb wrote: > Thanks for the answer! Could you tell me why is an inline unwinder used? No. That is already too much dependent on the build, you should have provided reproducibility instructions in the first place. But then also I do not want to invest more time into it as I am LLDB developer and not GDB developer. Jan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-23 12:23 ` Jan Kratochvil @ 2020-06-23 14:23 ` Ahmad Nouralizadeh 0 siblings, 0 replies; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-23 14:23 UTC (permalink / raw) To: Jan Kratochvil; +Cc: Ahmad Nouralizadeh via Gdb Thanks Jan! On Tuesday, 23 June 2020, Jan Kratochvil via Gdb <gdb@sourceware.org> wrote: > On Tue, 23 Jun 2020 13:48:42 +0200, Ahmad Nouralizadeh via Gdb wrote: > > Thanks for the answer! Could you tell me why is an inline unwinder used? > > No. That is already too much dependent on the build, you should have > provided > reproducibility instructions in the first place. > > But then also I do not want to invest more time into it as I am LLDB > developer > and not GDB developer. > > > Jan > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-23 11:48 ` Ahmad Nouralizadeh 2020-06-23 12:23 ` Jan Kratochvil @ 2020-06-24 14:00 ` Simon Marchi 2020-06-24 19:09 ` Ahmad Nouralizadeh 2020-06-24 15:06 ` Andrew Burgess 2 siblings, 1 reply; 13+ messages in thread From: Simon Marchi @ 2020-06-24 14:00 UTC (permalink / raw) To: Ahmad Nouralizadeh, Jan Kratochvil; +Cc: gdb On 2020-06-23 7:48 a.m., Ahmad Nouralizadeh via Gdb wrote: > Hi Jan, > Thanks for the answer! Could you tell me why is an inline unwinder used? > Regards. I'd break on / add prints to the `inline_frame_sniffer` function and see what it does. Simon ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-24 14:00 ` Simon Marchi @ 2020-06-24 19:09 ` Ahmad Nouralizadeh 0 siblings, 0 replies; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-24 19:09 UTC (permalink / raw) To: Simon Marchi; +Cc: Jan Kratochvil, gdb Hi Simon, Thanks, I will try. Regards. On Wednesday, 24 June 2020, Simon Marchi <simark@simark.ca> wrote: > On 2020-06-23 7:48 a.m., Ahmad Nouralizadeh via Gdb wrote: > > Hi Jan, > > Thanks for the answer! Could you tell me why is an inline unwinder used? > > Regards. > > I'd break on / add prints to the `inline_frame_sniffer` function and see > what it does. > > Simon > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-23 11:48 ` Ahmad Nouralizadeh 2020-06-23 12:23 ` Jan Kratochvil 2020-06-24 14:00 ` Simon Marchi @ 2020-06-24 15:06 ` Andrew Burgess 2020-06-25 9:10 ` Ahmad Nouralizadeh 2 siblings, 1 reply; 13+ messages in thread From: Andrew Burgess @ 2020-06-24 15:06 UTC (permalink / raw) To: Ahmad Nouralizadeh; +Cc: Jan Kratochvil, gdb * Ahmad Nouralizadeh via Gdb <gdb@sourceware.org> [2020-06-23 16:18:42 +0430]: > Thanks for the answer! Could you tell me why is an inline unwinder used? What you saw in your GDB backtrace was this: #5 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x55555673c2e0) at frame.c:885 #6 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info (frame=0x55555673c490) at frame.c:2379 #7 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info (this_frame=0x55555673c490) at frame.c:2410 #8 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 <inline_frame_unwind>, this_frame=0x55555673c490, this_cache=0x55555673c4a8) at inline-frame.c:215 #9 0x00005555558b719a in frame_unwind_try_unwinder (this_frame=0x55555673c490, this_cache=0x55555673c4a8, unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 So GDB hasn't decided for sure that a frame is an inline frame, instead it is running the inline_frame_sniffer to see if a particular frame is an inline frame or not. In order to figure this out GDB needs to know the value of $pc in the frame that is being sniffed. To get the $pc value GDB asks the next frame (that would be a frame with a lower frame number in GDB terms) to unwind the $pc register. You can see this happening between frames #8 and #6 in the above, before finally in #5 we ask the next frame to unwind the $pc. Almost every frame will have had the inline frame unwinder run on it in order to figure out if it was an inline frame, that doesn't mean the inline frame unwinder will claim the frame. Hope that helps, Thanks, Andrew > > On Tuesday, 23 June 2020, Jan Kratochvil <jan.kratochvil@redhat.com> wrote: > > > On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb wrote: > > > But knowing the GDB mechanism to get over the problem will be helpful. > > > > GDB disassembles the code and tries to guess how to unwind it. > > amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. > > > > That is just a last resort way of unwinding (=a bug in the debuggee), there > > should always be .eh_frame in the debuggee, also for throwing exceptions > > across such .eh_frame-less functions if there is any callback there. > > > > > > Jan > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* GDB Frame Unwinding for Pure Assembly Code 2020-06-24 15:06 ` Andrew Burgess @ 2020-06-25 9:10 ` Ahmad Nouralizadeh 2020-06-25 16:51 ` Andrew Burgess 0 siblings, 1 reply; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-25 9:10 UTC (permalink / raw) To: Andrew Burgess; +Cc: Jan Kratochvil, gdb Hi Andrew, Thanks! This makes many things clear to me. I enabled frame debugging (by setting the global variable, "frame_debug", to 1). The frame debugging information shows that, the frame that Perf could not unwind (i.e., the next to the innermost frame of ffmpeg benchmark), is extracted using the inline unwinder. It knows that "0x7fffeaabdad9" is stored at $RSP. The value is accessed using the following backtrace: #0 _Z23frame_unwind_got_memoryP10frame_infoim (frame=0x555556314750, regnum=16, addr=140737488309304) at frame-unwind.c:236 #1 0x000055555561dc64 in amd64_frame_prev_register (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at amd64-tdep.c:2642 #2 0x00005555558b27d4 in _Z27frame_unwind_register_valueP10frame_infoi (frame=0x555556314750, regnum=16) at frame.c:1194 #3 0x00005555558b231e in _Z21frame_register_unwindP10frame_infoiPiS1_P9lval_typePmS1_Ph (frame=0x555556314750, regnum=16, optimizedp=0x7fffffffd6e0, unavailablep=0x7fffffffd6e4, lvalp=0x7fffffffd6ec, addrp=0x7fffffffd6f0, realnump=0x7fffffffd6e8, bufferp=0x7fffffffd720 "0\327\377\377\377\177") at frame.c:1096 #4 0x00005555558b2643 in _Z21frame_unwind_registerP10frame_infoiPh (frame=0x555556314750, regnum=16, buf=0x7fffffffd720 "0\327\377\377\377\177") at frame.c:1153 #5 0x0000555555602706 in i386_unwind_pc (gdbarch=0x5555566924c0, next_frame=0x555556314750) at i386-tdep.c:1965 #6 0x00005555558c6056 in _Z17gdbarch_unwind_pcP7gdbarchP10frame_info (gdbarch=0x5555566924c0, next_frame=0x555556314750) at gdbarch.c:3075 #7 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x555556314750) at frame.c:885 #8 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info (frame=0x555556314900) at frame.c:2379 #9 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info (this_frame=0x555556314900) at frame.c:2410 #10 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 <inline_frame_unwind>, this_frame=0x555556314900, this_cache=0x555556314918) at inline-frame.c:215 #11 0x00005555558b719a in frame_unwind_try_unwinder (this_frame=0x555556314900, this_cache=0x555556314918, unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 #12 0x00005555558b737d in _Z26frame_unwind_find_by_frameP10frame_infoPPv (this_frame=0x555556314900, this_cache=0x555556314918) at frame-unwind.c:164 #13 0x00005555558b0d8f in compute_frame_id (fi=0x555556314900) at frame.c:501 #14 0x00005555558b407d in get_prev_frame_if_no_cycle (this_frame=0x555556314750) at frame.c:1913 #15 0x00005555558b478e in get_prev_frame_always_1 (this_frame=0x555556314750) at frame.c:2087 #16 0x00005555558b47e6 in _Z21get_prev_frame_alwaysP10frame_info (this_frame=0x555556314750) at frame.c:2103 #17 0x00005555558b4e39 in _Z14get_prev_frameP10frame_info (this_frame=0x555556314750) at frame.c:2356 #18 0x00005555556b7fab in _Z26frame_info_to_frame_objectP10frame_info (frame=0x555556314750) at python/py-frame.c:372 #19 0x00005555556bd003 in bootstrap_python_frame_filters (frame=0x555556314750, frame_low=0, frame_high=-1) at python/py-framefilter.c:1283 #20 0x00005555556bd3a2 in _Z24gdbpy_apply_frame_filterPK23extension_language_defnP10frame_infoi19ext_lang_frame_argsP6ui_outii ( extlang=0x5555560c3c40 <extension_language_python>, frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, out=0x55555636bed0, frame_low=0, frame_high=-1) at python/py-framefilter.c:1359 #21 0x00005555558a8664 in _Z27apply_ext_lang_frame_filterP10frame_infoi19ext_lang_frame_argsP6ui_outii (frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, out=0x55555636bed0, frame_low=0, frame_high=-1) at extension.c:570 #22 0x00005555559e33b4 in backtrace_command_1 (count_exp=0x0, show_locals=0, no_filters=0, from_tty=1) at stack.c:1789 #23 0x00005555559e3816 in backtrace_command (arg=0x0, from_tty=1) at stack.c:1902 #24 0x000055555566c8a4 in do_const_cfunc (c=0x555556344a60, args=0x0, from_tty=1) at cli/cli-decode.c:106 #25 0x000055555566fba8 in _Z8cmd_funcP16cmd_list_elementPKci (cmd=0x555556344a60, args=0x0, from_tty=1) at cli/cli-decode.c:1886 #26 0x0000555555a4066e in _Z15execute_commandPKci (p=0x555556609e52 "", from_tty=1) at top.c:630 #27 0x00005555558a1b58 in _Z15command_handlerPKc (command=0x555556609e50 "bt") at event-top.c:583 #28 0x00005555558a1f6c in _Z20command_line_handlerPc (rl=0x55555ab70790 "") at event-top.c:774 #29 0x00005555558a12c6 in gdb_rl_callback_handler (rl=0x55555ab70790 "") at event-top.c:213 #30 0x0000555555aca378 in rl_callback_read_char () at callback.c:220 #31 0x00005555558a11b2 in gdb_rl_callback_read_char_wrapper_noexcept () at event-top.c:175 #32 0x00005555558a1231 in gdb_rl_callback_read_char_wrapper (client_data=0x5555561f9030) at event-top.c:192 #33 0x00005555558a19ea in _Z19stdin_event_handleriPv (error=0, client_data=0x5555561f9030) at event-top.c:511 #34 0x000055555589f930 in handle_file_event (file_ptr=0x55555af2fd60, ready_mask=1) at event-loop.c:733 #35 0x000055555589fef9 in gdb_wait_for_event (block=1) at event-loop.c:859 #36 0x000055555589ec76 in _Z16gdb_do_one_eventv () at event-loop.c:347 #37 0x000055555589ecbd in _Z16start_event_loopv () at event-loop.c:371 #38 0x000055555592869b in captured_command_loop () at main.c:330 #39 0x0000555555929c7c in captured_main (data=0x7fffffffe400) at main.c:1157 #40 0x0000555555929d51 in _Z8gdb_mainP18captured_main_args (args=0x7fffffffe400) at main.c:1173 #41 0x00005555555ff4a3 in main (argc=2, argv=0x7fffffffe508) at gdb.c:32 "#1 0x000055555561dc64 in amd64_frame_prev_register (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at amd64-tdep.c:2642", is this code snippet: return frame_unwind_got_memory (this_frame, regnum, cache->saved_regs[regnum]); There seems to be a cache of registers. The last parameter of "#0 _Z23frame_unwind_got_memoryP10frame_infoim (frame=0x555556314750, regnum=16, addr=140737488309304) at frame-unwind.c:236" shows that addr is 140737488309304 which is "0x7fffffff4c38" in hex (RSP for the innermost ffmpeg backtrace function (frame -1) that I mentioned in the first thread, is in fact equal to "0x7fffffff4c38"). GDB, somehow, knows that the $RIP value for frame 0 (i.e. "#1 0x00007fffeaabdad9 in x264_mb_mc_01xywh (h=0x5555561dea00, x=0, y=0, width=4, height=<optimized out>) at common/macroblock.c:128") is stored at "0x7fffffff4c38" (i.e., the RSP value for frame -1). But I do not know how! Perhaps, luckily, because there is no push/pop in the innermost function of the ffmpeg workload?! Here is the initial portion of the GDB output, when frame_debug is enabled: Thread 1 "ffmpeg" hit Breakpoint 1, 0x00007fffeab68840 in x264_pixel_avg_w16_avx2 () from /usr/lib/x86_64-linux-gnu/libx264.so.152 (gdb) bt { get_prev_frame_always (this_frame=-1) -> {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc=0x7fffeab68840,id=<unknown>,func=<unknown>} // cached { get_prev_frame_always (this_frame=-1) -> {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc=0x7fffeab68840,id=<unknown>,func=<unknown>} // cached { compute_frame_id (fi=0) { get_frame_func (this_frame=0) -> 0x7fffeab68840 } { frame_unwind_register_value (frame=-1,regnum=7(rsp),...) -> register=7 bytes=[f845ffffff7f0000] } { frame_id_p (l={stack=0x7fffffff4600,code=0x7fffeab68840,!special}) -> 1 } -> {stack=0x7fffffff4600,code=0x7fffeab68840,!special} } { get_prev_frame_always (this_frame=0) -> {level=1,type=<unknown>,unwind=<unknown>,pc=<unknown>,id=<unknown>,func=<unknown>} } { compute_frame_id (fi=1) { frame_unwind_arch (next_frame=0) -> i386:x86-64 } { frame_unwind_register_value (frame=0,regnum=16(rip),...) -> address=0x7fffffff45f8 bytes=[d9daabeaff7f0000] } { frame_unwind_pc (this_frame=0) -> 0x7fffeaabdad9 } { get_frame_func (this_frame=1) -> 0x7fffeaabd860 } { frame_id_p (l={!stack,!code,!special}) -> 0 } { frame_unwind_register_value (frame=0,regnum=7(rsp),...) -> computed bytes=[0046ffffff7f0000] } { frame_id_p (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> 1 } -> {stack=0x7fffffff4940,code=0x7fffeaabd860,!special} } #0 0x00007fffeab68840 in x264_pixel_avg_w16_avx2 () from /usr/lib/x86_64-linux-gnu/libx264.so.152 { get_prev_frame_always (this_frame=0) -> {level=1,type=NORMAL_FRAME,unwind=0x5555560c8820,pc=0x7fffeaabdad9,id={stack=0x7fffffff4940,code=0x7fffeaabd860,!special},func=0x7fffeaabd860} // cached #1 0x00007fffeaabdad9 in x264_mb_mc_01xywh ({ frame_id_p (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> 1 } { frame_id_eq (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special},r={stack=<sentinel>,!code,special=0x0}) -> 0 } The $RIP value for frame 0 is reported as "{ frame_unwind_register_value (frame=0,regnum=16(rip),...) -> address=0x7fffffff45f8 bytes=[d9daabeaff7f0000] }". Regards. On Wednesday, 24 June 2020, Andrew Burgess <andrew.burgess@embecosm.com> wrote: > * Ahmad Nouralizadeh via Gdb <gdb@sourceware.org> [2020-06-23 16:18:42 > +0430]: > > > Thanks for the answer! Could you tell me why is an inline unwinder used? > > What you saw in your GDB backtrace was this: > > #5 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x55555673c2e0) at > frame.c:885 > #6 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info > (frame=0x55555673c490) at frame.c:2379 > #7 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info > (this_frame=0x55555673c490) at frame.c:2410 > #8 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 > <inline_frame_unwind>, this_frame=0x55555673c490, > this_cache=0x55555673c4a8) at inline-frame.c:215 > #9 0x00005555558b719a in frame_unwind_try_unwinder > (this_frame=0x55555673c490, this_cache=0x55555673c4a8, > unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 > > So GDB hasn't decided for sure that a frame is an inline frame, > instead it is running the inline_frame_sniffer to see if a particular > frame is an inline frame or not. > > In order to figure this out GDB needs to know the value of $pc in the > frame that is being sniffed. To get the $pc value GDB asks the next > frame (that would be a frame with a lower frame number in GDB terms) > to unwind the $pc register. You can see this happening between frames > #8 and #6 in the above, before finally in #5 we ask the next frame to > unwind the $pc. > > Almost every frame will have had the inline frame unwinder run on it > in order to figure out if it was an inline frame, that doesn't mean > the inline frame unwinder will claim the frame. > > Hope that helps, > Thanks, > Andrew > > > > > > > > > On Tuesday, 23 June 2020, Jan Kratochvil <jan.kratochvil@redhat.com> > wrote: > > > > > On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb wrote: > > > > But knowing the GDB mechanism to get over the problem will be > helpful. > > > > > > GDB disassembles the code and tries to guess how to unwind it. > > > amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. > > > > > > That is just a last resort way of unwinding (=a bug in the debuggee), > there > > > should always be .eh_frame in the debuggee, also for throwing > exceptions > > > across such .eh_frame-less functions if there is any callback there. > > > > > > > > > Jan > > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-25 9:10 ` Ahmad Nouralizadeh @ 2020-06-25 16:51 ` Andrew Burgess 2020-06-26 11:33 ` Ahmad Nouralizadeh 0 siblings, 1 reply; 13+ messages in thread From: Andrew Burgess @ 2020-06-25 16:51 UTC (permalink / raw) To: Ahmad Nouralizadeh; +Cc: Jan Kratochvil, gdb * Ahmad Nouralizadeh <ahmad.mlists@gmail.com> [2020-06-25 13:40:21 +0430]: > Hi Andrew, > > Thanks! This makes many things clear to me. > > I enabled frame debugging (by setting the global variable, "frame_debug", > to 1). The frame debugging information shows that, the frame that Perf > could not unwind (i.e., the next to the innermost frame of ffmpeg > benchmark), is extracted using the inline unwinder. It knows that > "0x7fffeaabdad9" is stored at $RSP. The value is accessed using the > following backtrace: > #0 _Z23frame_unwind_got_memoryP10frame_infoim (frame=0x555556314750, > regnum=16, addr=140737488309304) at frame-unwind.c:236 > #1 0x000055555561dc64 in amd64_frame_prev_register > (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at > amd64-tdep.c:2642 > #2 0x00005555558b27d4 in _Z27frame_unwind_register_valueP10frame_infoi > (frame=0x555556314750, regnum=16) at frame.c:1194 > #3 0x00005555558b231e in > _Z21frame_register_unwindP10frame_infoiPiS1_P9lval_typePmS1_Ph > (frame=0x555556314750, regnum=16, optimizedp=0x7fffffffd6e0, > unavailablep=0x7fffffffd6e4, lvalp=0x7fffffffd6ec, > addrp=0x7fffffffd6f0, realnump=0x7fffffffd6e8, bufferp=0x7fffffffd720 > "0\327\377\377\377\177") at frame.c:1096 > #4 0x00005555558b2643 in _Z21frame_unwind_registerP10frame_infoiPh > (frame=0x555556314750, regnum=16, buf=0x7fffffffd720 > "0\327\377\377\377\177") at frame.c:1153 > #5 0x0000555555602706 in i386_unwind_pc (gdbarch=0x5555566924c0, > next_frame=0x555556314750) at i386-tdep.c:1965 > #6 0x00005555558c6056 in _Z17gdbarch_unwind_pcP7gdbarchP10frame_info > (gdbarch=0x5555566924c0, next_frame=0x555556314750) at gdbarch.c:3075 > #7 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x555556314750) at > frame.c:885 > #8 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info > (frame=0x555556314900) at frame.c:2379 > #9 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info > (this_frame=0x555556314900) at frame.c:2410 > #10 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 > <inline_frame_unwind>, this_frame=0x555556314900, > this_cache=0x555556314918) at inline-frame.c:215 > #11 0x00005555558b719a in frame_unwind_try_unwinder > (this_frame=0x555556314900, this_cache=0x555556314918, > unwinder=0x555556193520 <inline_frame_unwind>) > at frame-unwind.c:106 > #12 0x00005555558b737d in _Z26frame_unwind_find_by_frameP10frame_infoPPv > (this_frame=0x555556314900, this_cache=0x555556314918) at frame-unwind.c:164 > #13 0x00005555558b0d8f in compute_frame_id (fi=0x555556314900) at > frame.c:501 > #14 0x00005555558b407d in get_prev_frame_if_no_cycle > (this_frame=0x555556314750) at frame.c:1913 > #15 0x00005555558b478e in get_prev_frame_always_1 > (this_frame=0x555556314750) at frame.c:2087 > #16 0x00005555558b47e6 in _Z21get_prev_frame_alwaysP10frame_info > (this_frame=0x555556314750) at frame.c:2103 > #17 0x00005555558b4e39 in _Z14get_prev_frameP10frame_info > (this_frame=0x555556314750) at frame.c:2356 > #18 0x00005555556b7fab in _Z26frame_info_to_frame_objectP10frame_info > (frame=0x555556314750) at python/py-frame.c:372 > #19 0x00005555556bd003 in bootstrap_python_frame_filters > (frame=0x555556314750, frame_low=0, frame_high=-1) at > python/py-framefilter.c:1283 > #20 0x00005555556bd3a2 in > _Z24gdbpy_apply_frame_filterPK23extension_language_defnP10frame_infoi19ext_lang_frame_argsP6ui_outii > ( > extlang=0x5555560c3c40 <extension_language_python>, > frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, > out=0x55555636bed0, frame_low=0, frame_high=-1) > at python/py-framefilter.c:1359 > #21 0x00005555558a8664 in > _Z27apply_ext_lang_frame_filterP10frame_infoi19ext_lang_frame_argsP6ui_outii > (frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, > out=0x55555636bed0, frame_low=0, frame_high=-1) at extension.c:570 > #22 0x00005555559e33b4 in backtrace_command_1 (count_exp=0x0, > show_locals=0, no_filters=0, from_tty=1) at stack.c:1789 > #23 0x00005555559e3816 in backtrace_command (arg=0x0, from_tty=1) at > stack.c:1902 > #24 0x000055555566c8a4 in do_const_cfunc (c=0x555556344a60, args=0x0, > from_tty=1) at cli/cli-decode.c:106 > #25 0x000055555566fba8 in _Z8cmd_funcP16cmd_list_elementPKci > (cmd=0x555556344a60, args=0x0, from_tty=1) at cli/cli-decode.c:1886 > #26 0x0000555555a4066e in _Z15execute_commandPKci (p=0x555556609e52 "", > from_tty=1) at top.c:630 > #27 0x00005555558a1b58 in _Z15command_handlerPKc (command=0x555556609e50 > "bt") at event-top.c:583 > #28 0x00005555558a1f6c in _Z20command_line_handlerPc (rl=0x55555ab70790 "") > at event-top.c:774 > #29 0x00005555558a12c6 in gdb_rl_callback_handler (rl=0x55555ab70790 "") at > event-top.c:213 > #30 0x0000555555aca378 in rl_callback_read_char () at callback.c:220 > #31 0x00005555558a11b2 in gdb_rl_callback_read_char_wrapper_noexcept () at > event-top.c:175 > #32 0x00005555558a1231 in gdb_rl_callback_read_char_wrapper > (client_data=0x5555561f9030) at event-top.c:192 > #33 0x00005555558a19ea in _Z19stdin_event_handleriPv (error=0, > client_data=0x5555561f9030) at event-top.c:511 > #34 0x000055555589f930 in handle_file_event (file_ptr=0x55555af2fd60, > ready_mask=1) at event-loop.c:733 > #35 0x000055555589fef9 in gdb_wait_for_event (block=1) at event-loop.c:859 > #36 0x000055555589ec76 in _Z16gdb_do_one_eventv () at event-loop.c:347 > #37 0x000055555589ecbd in _Z16start_event_loopv () at event-loop.c:371 > #38 0x000055555592869b in captured_command_loop () at main.c:330 > #39 0x0000555555929c7c in captured_main (data=0x7fffffffe400) at main.c:1157 > #40 0x0000555555929d51 in _Z8gdb_mainP18captured_main_args > (args=0x7fffffffe400) at main.c:1173 > #41 0x00005555555ff4a3 in main (argc=2, argv=0x7fffffffe508) at gdb.c:32 > > "#1 0x000055555561dc64 in amd64_frame_prev_register > (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at > amd64-tdep.c:2642", is this code snippet: > return frame_unwind_got_memory (this_frame, regnum, > cache->saved_regs[regnum]); > > There seems to be a cache of registers. > The last parameter of "#0 _Z23frame_unwind_got_memoryP10frame_infoim > (frame=0x555556314750, regnum=16, addr=140737488309304) at > frame-unwind.c:236" shows that addr is 140737488309304 which is > "0x7fffffff4c38" in hex (RSP for the innermost ffmpeg backtrace function > (frame -1) that I mentioned in the first thread, is in fact equal > to "0x7fffffff4c38"). GDB, somehow, knows that the $RIP value for frame 0 > (i.e. "#1 0x00007fffeaabdad9 in x264_mb_mc_01xywh (h=0x5555561dea00, x=0, > y=0, width=4, height=<optimized out>) at common/macroblock.c:128") is > stored at "0x7fffffff4c38" (i.e., the RSP value for frame -1). But I do not > know how! Perhaps, luckily, because there is no push/pop in the innermost > function of the ffmpeg workload?! If you're on x86-64 then the place to start in answering this question would be amd64-tdep.c:amd64_analyze_prologue. I'm no expert on this code, but the basic idea is we actually inspect the instructions, and so long as the prologue of the function follows some basic conventions then we can figure out the size of the stack frame, and where the important bits of state have been written too. Obviously it is trivially easy to write a prologue that _can't_ be scanned, as Jan said earlier in the thread, the prologue scanner is a last ditched fall back for when no other scanner wants to claim the frame. I think the most common usage for the prologue scanning unwinder is to unwind compiler generated code when the debug info has been stripped. As such the "conventions" that I mention are actually just "what does the compiler spit out", and from time to time the prologue scanner will get an update as GCC/LLVM starts spitting out some new instruction into the prologue. Hope that helps a little. Thanks, Andrew > > Here is the initial portion of the GDB output, when frame_debug is enabled: > Thread 1 "ffmpeg" hit Breakpoint 1, 0x00007fffeab68840 in > x264_pixel_avg_w16_avx2 () from /usr/lib/x86_64-linux-gnu/libx264.so.152 > (gdb) bt > { get_prev_frame_always (this_frame=-1) -> > {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc=0x7fffeab68840,id=<unknown>,func=<unknown>} > // cached > { get_prev_frame_always (this_frame=-1) -> > {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc=0x7fffeab68840,id=<unknown>,func=<unknown>} > // cached > { compute_frame_id (fi=0) { get_frame_func (this_frame=0) -> 0x7fffeab68840 > } > { frame_unwind_register_value (frame=-1,regnum=7(rsp),...) -> register=7 > bytes=[f845ffffff7f0000] } > { frame_id_p (l={stack=0x7fffffff4600,code=0x7fffeab68840,!special}) -> 1 } > -> {stack=0x7fffffff4600,code=0x7fffeab68840,!special} } > { get_prev_frame_always (this_frame=0) -> > {level=1,type=<unknown>,unwind=<unknown>,pc=<unknown>,id=<unknown>,func=<unknown>} > } > { compute_frame_id (fi=1) { frame_unwind_arch (next_frame=0) -> i386:x86-64 > } > { frame_unwind_register_value (frame=0,regnum=16(rip),...) -> > address=0x7fffffff45f8 bytes=[d9daabeaff7f0000] } > { frame_unwind_pc (this_frame=0) -> 0x7fffeaabdad9 } > { get_frame_func (this_frame=1) -> 0x7fffeaabd860 } > { frame_id_p (l={!stack,!code,!special}) -> 0 } > { frame_unwind_register_value (frame=0,regnum=7(rsp),...) -> computed > bytes=[0046ffffff7f0000] } > { frame_id_p (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> 1 } > -> {stack=0x7fffffff4940,code=0x7fffeaabd860,!special} } > #0 0x00007fffeab68840 in x264_pixel_avg_w16_avx2 () from > /usr/lib/x86_64-linux-gnu/libx264.so.152 > { get_prev_frame_always (this_frame=0) -> > {level=1,type=NORMAL_FRAME,unwind=0x5555560c8820,pc=0x7fffeaabdad9,id={stack=0x7fffffff4940,code=0x7fffeaabd860,!special},func=0x7fffeaabd860} > // cached > #1 0x00007fffeaabdad9 in x264_mb_mc_01xywh ({ frame_id_p > (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> 1 } > { frame_id_eq > (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special},r={stack=<sentinel>,!code,special=0x0}) > -> 0 } > > The $RIP value for frame 0 is reported as "{ frame_unwind_register_value > (frame=0,regnum=16(rip),...) -> address=0x7fffffff45f8 > bytes=[d9daabeaff7f0000] }". > > Regards. > > On Wednesday, 24 June 2020, Andrew Burgess <andrew.burgess@embecosm.com> > wrote: > > > * Ahmad Nouralizadeh via Gdb <gdb@sourceware.org> [2020-06-23 16:18:42 > > +0430]: > > > > > Thanks for the answer! Could you tell me why is an inline unwinder used? > > > > What you saw in your GDB backtrace was this: > > > > #5 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x55555673c2e0) at > > frame.c:885 > > #6 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info > > (frame=0x55555673c490) at frame.c:2379 > > #7 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info > > (this_frame=0x55555673c490) at frame.c:2410 > > #8 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 > > <inline_frame_unwind>, this_frame=0x55555673c490, > > this_cache=0x55555673c4a8) at inline-frame.c:215 > > #9 0x00005555558b719a in frame_unwind_try_unwinder > > (this_frame=0x55555673c490, this_cache=0x55555673c4a8, > > unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 > > > > So GDB hasn't decided for sure that a frame is an inline frame, > > instead it is running the inline_frame_sniffer to see if a particular > > frame is an inline frame or not. > > > > In order to figure this out GDB needs to know the value of $pc in the > > frame that is being sniffed. To get the $pc value GDB asks the next > > frame (that would be a frame with a lower frame number in GDB terms) > > to unwind the $pc register. You can see this happening between frames > > #8 and #6 in the above, before finally in #5 we ask the next frame to > > unwind the $pc. > > > > Almost every frame will have had the inline frame unwinder run on it > > in order to figure out if it was an inline frame, that doesn't mean > > the inline frame unwinder will claim the frame. > > > > Hope that helps, > > Thanks, > > Andrew > > > > > > > > > > > > > > > > On Tuesday, 23 June 2020, Jan Kratochvil <jan.kratochvil@redhat.com> > > wrote: > > > > > > > On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb wrote: > > > > > But knowing the GDB mechanism to get over the problem will be > > helpful. > > > > > > > > GDB disassembles the code and tries to guess how to unwind it. > > > > amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. > > > > > > > > That is just a last resort way of unwinding (=a bug in the debuggee), > > there > > > > should always be .eh_frame in the debuggee, also for throwing > > exceptions > > > > across such .eh_frame-less functions if there is any callback there. > > > > > > > > > > > > Jan > > > > > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GDB Frame Unwinding for Pure Assembly Code 2020-06-25 16:51 ` Andrew Burgess @ 2020-06-26 11:33 ` Ahmad Nouralizadeh 0 siblings, 0 replies; 13+ messages in thread From: Ahmad Nouralizadeh @ 2020-06-26 11:33 UTC (permalink / raw) To: Andrew Burgess; +Cc: Jan Kratochvil, gdb Thanks, Andrew, I will try harder. Regards. On Thursday, 25 June 2020, Andrew Burgess <andrew.burgess@embecosm.com> wrote: > * Ahmad Nouralizadeh <ahmad.mlists@gmail.com> [2020-06-25 13:40:21 +0430]: > > > Hi Andrew, > > > > Thanks! This makes many things clear to me. > > > > I enabled frame debugging (by setting the global variable, "frame_debug", > > to 1). The frame debugging information shows that, the frame that Perf > > could not unwind (i.e., the next to the innermost frame of ffmpeg > > benchmark), is extracted using the inline unwinder. It knows that > > "0x7fffeaabdad9" is stored at $RSP. The value is accessed using the > > following backtrace: > > #0 _Z23frame_unwind_got_memoryP10frame_infoim (frame=0x555556314750, > > regnum=16, addr=140737488309304) at frame-unwind.c:236 > > #1 0x000055555561dc64 in amd64_frame_prev_register > > (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at > > amd64-tdep.c:2642 > > #2 0x00005555558b27d4 in _Z27frame_unwind_register_valueP10frame_infoi > > (frame=0x555556314750, regnum=16) at frame.c:1194 > > #3 0x00005555558b231e in > > _Z21frame_register_unwindP10frame_infoiPiS1_P9lval_typePmS1_Ph > > (frame=0x555556314750, regnum=16, optimizedp=0x7fffffffd6e0, > > unavailablep=0x7fffffffd6e4, lvalp=0x7fffffffd6ec, > > addrp=0x7fffffffd6f0, realnump=0x7fffffffd6e8, bufferp=0x7fffffffd720 > > "0\327\377\377\377\177") at frame.c:1096 > > #4 0x00005555558b2643 in _Z21frame_unwind_registerP10frame_infoiPh > > (frame=0x555556314750, regnum=16, buf=0x7fffffffd720 > > "0\327\377\377\377\177") at frame.c:1153 > > #5 0x0000555555602706 in i386_unwind_pc (gdbarch=0x5555566924c0, > > next_frame=0x555556314750) at i386-tdep.c:1965 > > #6 0x00005555558c6056 in _Z17gdbarch_unwind_pcP7gdbarchP10frame_info > > (gdbarch=0x5555566924c0, next_frame=0x555556314750) at gdbarch.c:3075 > > #7 0x00005555558b1b0b in frame_unwind_pc (this_frame=0x555556314750) at > > frame.c:885 > > #8 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info > > (frame=0x555556314900) at frame.c:2379 > > #9 0x00005555558b50ea in _Z26get_frame_address_in_blockP10frame_info > > (this_frame=0x555556314900) at frame.c:2410 > > #10 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 > > <inline_frame_unwind>, this_frame=0x555556314900, > > this_cache=0x555556314918) at inline-frame.c:215 > > #11 0x00005555558b719a in frame_unwind_try_unwinder > > (this_frame=0x555556314900, this_cache=0x555556314918, > > unwinder=0x555556193520 <inline_frame_unwind>) > > at frame-unwind.c:106 > > #12 0x00005555558b737d in _Z26frame_unwind_find_by_frameP10frame_infoPPv > > (this_frame=0x555556314900, this_cache=0x555556314918) at > frame-unwind.c:164 > > #13 0x00005555558b0d8f in compute_frame_id (fi=0x555556314900) at > > frame.c:501 > > #14 0x00005555558b407d in get_prev_frame_if_no_cycle > > (this_frame=0x555556314750) at frame.c:1913 > > #15 0x00005555558b478e in get_prev_frame_always_1 > > (this_frame=0x555556314750) at frame.c:2087 > > #16 0x00005555558b47e6 in _Z21get_prev_frame_alwaysP10frame_info > > (this_frame=0x555556314750) at frame.c:2103 > > #17 0x00005555558b4e39 in _Z14get_prev_frameP10frame_info > > (this_frame=0x555556314750) at frame.c:2356 > > #18 0x00005555556b7fab in _Z26frame_info_to_frame_objectP10frame_info > > (frame=0x555556314750) at python/py-frame.c:372 > > #19 0x00005555556bd003 in bootstrap_python_frame_filters > > (frame=0x555556314750, frame_low=0, frame_high=-1) at > > python/py-framefilter.c:1283 > > #20 0x00005555556bd3a2 in > > _Z24gdbpy_apply_frame_filterPK23extension_language_ > defnP10frame_infoi19ext_lang_frame_argsP6ui_outii > > ( > > extlang=0x5555560c3c40 <extension_language_python>, > > frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, > > out=0x55555636bed0, frame_low=0, frame_high=-1) > > at python/py-framefilter.c:1359 > > #21 0x00005555558a8664 in > > _Z27apply_ext_lang_frame_filterP10frame_infoi19ext_ > lang_frame_argsP6ui_outii > > (frame=0x555556314750, flags=7, args_type=CLI_SCALAR_VALUES, > > out=0x55555636bed0, frame_low=0, frame_high=-1) at extension.c:570 > > #22 0x00005555559e33b4 in backtrace_command_1 (count_exp=0x0, > > show_locals=0, no_filters=0, from_tty=1) at stack.c:1789 > > #23 0x00005555559e3816 in backtrace_command (arg=0x0, from_tty=1) at > > stack.c:1902 > > #24 0x000055555566c8a4 in do_const_cfunc (c=0x555556344a60, args=0x0, > > from_tty=1) at cli/cli-decode.c:106 > > #25 0x000055555566fba8 in _Z8cmd_funcP16cmd_list_elementPKci > > (cmd=0x555556344a60, args=0x0, from_tty=1) at cli/cli-decode.c:1886 > > #26 0x0000555555a4066e in _Z15execute_commandPKci (p=0x555556609e52 "", > > from_tty=1) at top.c:630 > > #27 0x00005555558a1b58 in _Z15command_handlerPKc (command=0x555556609e50 > > "bt") at event-top.c:583 > > #28 0x00005555558a1f6c in _Z20command_line_handlerPc (rl=0x55555ab70790 > "") > > at event-top.c:774 > > #29 0x00005555558a12c6 in gdb_rl_callback_handler (rl=0x55555ab70790 "") > at > > event-top.c:213 > > #30 0x0000555555aca378 in rl_callback_read_char () at callback.c:220 > > #31 0x00005555558a11b2 in gdb_rl_callback_read_char_wrapper_noexcept () > at > > event-top.c:175 > > #32 0x00005555558a1231 in gdb_rl_callback_read_char_wrapper > > (client_data=0x5555561f9030) at event-top.c:192 > > #33 0x00005555558a19ea in _Z19stdin_event_handleriPv (error=0, > > client_data=0x5555561f9030) at event-top.c:511 > > #34 0x000055555589f930 in handle_file_event (file_ptr=0x55555af2fd60, > > ready_mask=1) at event-loop.c:733 > > #35 0x000055555589fef9 in gdb_wait_for_event (block=1) at > event-loop.c:859 > > #36 0x000055555589ec76 in _Z16gdb_do_one_eventv () at event-loop.c:347 > > #37 0x000055555589ecbd in _Z16start_event_loopv () at event-loop.c:371 > > #38 0x000055555592869b in captured_command_loop () at main.c:330 > > #39 0x0000555555929c7c in captured_main (data=0x7fffffffe400) at > main.c:1157 > > #40 0x0000555555929d51 in _Z8gdb_mainP18captured_main_args > > (args=0x7fffffffe400) at main.c:1173 > > #41 0x00005555555ff4a3 in main (argc=2, argv=0x7fffffffe508) at gdb.c:32 > > > > "#1 0x000055555561dc64 in amd64_frame_prev_register > > (this_frame=0x555556314750, this_cache=0x555556314768, regnum=16) at > > amd64-tdep.c:2642", is this code snippet: > > return frame_unwind_got_memory (this_frame, regnum, > > cache->saved_regs[regnum]); > > > > There seems to be a cache of registers. > > The last parameter of "#0 _Z23frame_unwind_got_memoryP10frame_infoim > > (frame=0x555556314750, regnum=16, addr=140737488309304) at > > frame-unwind.c:236" shows that addr is 140737488309304 which is > > "0x7fffffff4c38" in hex (RSP for the innermost ffmpeg backtrace function > > (frame -1) that I mentioned in the first thread, is in fact equal > > to "0x7fffffff4c38"). GDB, somehow, knows that the $RIP value for frame 0 > > (i.e. "#1 0x00007fffeaabdad9 in x264_mb_mc_01xywh (h=0x5555561dea00, > x=0, > > y=0, width=4, height=<optimized out>) at common/macroblock.c:128") is > > stored at "0x7fffffff4c38" (i.e., the RSP value for frame -1). But I do > not > > know how! Perhaps, luckily, because there is no push/pop in the innermost > > function of the ffmpeg workload?! > > If you're on x86-64 then the place to start in answering this question > would be amd64-tdep.c:amd64_analyze_prologue. I'm no expert on this > code, but the basic idea is we actually inspect the instructions, and > so long as the prologue of the function follows some basic conventions > then we can figure out the size of the stack frame, and where the > important bits of state have been written too. > > Obviously it is trivially easy to write a prologue that _can't_ be > scanned, as Jan said earlier in the thread, the prologue scanner is a > last ditched fall back for when no other scanner wants to claim the > frame. > > I think the most common usage for the prologue scanning unwinder is to > unwind compiler generated code when the debug info has been stripped. > As such the "conventions" that I mention are actually just "what does > the compiler spit out", and from time to time the prologue scanner > will get an update as GCC/LLVM starts spitting out some new > instruction into the prologue. > > Hope that helps a little. > > Thanks, > Andrew > > > > > > > > > Here is the initial portion of the GDB output, when frame_debug is > enabled: > > Thread 1 "ffmpeg" hit Breakpoint 1, 0x00007fffeab68840 in > > x264_pixel_avg_w16_avx2 () from /usr/lib/x86_64-linux-gnu/libx264.so.152 > > (gdb) bt > > { get_prev_frame_always (this_frame=-1) -> > > {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc= > 0x7fffeab68840,id=<unknown>,func=<unknown>} > > // cached > > { get_prev_frame_always (this_frame=-1) -> > > {level=0,type=NORMAL_FRAME,unwind=0x555556191780,pc= > 0x7fffeab68840,id=<unknown>,func=<unknown>} > > // cached > > { compute_frame_id (fi=0) { get_frame_func (this_frame=0) -> > 0x7fffeab68840 > > } > > { frame_unwind_register_value (frame=-1,regnum=7(rsp),...) -> register=7 > > bytes=[f845ffffff7f0000] } > > { frame_id_p (l={stack=0x7fffffff4600,code=0x7fffeab68840,!special}) -> > 1 } > > -> {stack=0x7fffffff4600,code=0x7fffeab68840,!special} } > > { get_prev_frame_always (this_frame=0) -> > > {level=1,type=<unknown>,unwind=<unknown>,pc=<unknown>, > id=<unknown>,func=<unknown>} > > } > > { compute_frame_id (fi=1) { frame_unwind_arch (next_frame=0) -> > i386:x86-64 > > } > > { frame_unwind_register_value (frame=0,regnum=16(rip),...) -> > > address=0x7fffffff45f8 bytes=[d9daabeaff7f0000] } > > { frame_unwind_pc (this_frame=0) -> 0x7fffeaabdad9 } > > { get_frame_func (this_frame=1) -> 0x7fffeaabd860 } > > { frame_id_p (l={!stack,!code,!special}) -> 0 } > > { frame_unwind_register_value (frame=0,regnum=7(rsp),...) -> computed > > bytes=[0046ffffff7f0000] } > > { frame_id_p (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> > 1 } > > -> {stack=0x7fffffff4940,code=0x7fffeaabd860,!special} } > > #0 0x00007fffeab68840 in x264_pixel_avg_w16_avx2 () from > > /usr/lib/x86_64-linux-gnu/libx264.so.152 > > { get_prev_frame_always (this_frame=0) -> > > {level=1,type=NORMAL_FRAME,unwind=0x5555560c8820,pc= > 0x7fffeaabdad9,id={stack=0x7fffffff4940,code= > 0x7fffeaabd860,!special},func=0x7fffeaabd860} > > // cached > > #1 0x00007fffeaabdad9 in x264_mb_mc_01xywh ({ frame_id_p > > (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special}) -> 1 } > > { frame_id_eq > > (l={stack=0x7fffffff4940,code=0x7fffeaabd860,!special},r={ > stack=<sentinel>,!code,special=0x0}) > > -> 0 } > > > > The $RIP value for frame 0 is reported as "{ frame_unwind_register_value > > (frame=0,regnum=16(rip),...) -> address=0x7fffffff45f8 > > bytes=[d9daabeaff7f0000] }". > > > > Regards. > > > > On Wednesday, 24 June 2020, Andrew Burgess <andrew.burgess@embecosm.com> > > wrote: > > > > > * Ahmad Nouralizadeh via Gdb <gdb@sourceware.org> [2020-06-23 16:18:42 > > > +0430]: > > > > > > > Thanks for the answer! Could you tell me why is an inline unwinder > used? > > > > > > What you saw in your GDB backtrace was this: > > > > > > #5 0x00005555558b1b0b in frame_unwind_pc > (this_frame=0x55555673c2e0) at > > > frame.c:885 > > > #6 0x00005555558b4f72 in _Z12get_frame_pcP10frame_info > > > (frame=0x55555673c490) at frame.c:2379 > > > #7 0x00005555558b50ea in _Z26get_frame_address_in_ > blockP10frame_info > > > (this_frame=0x55555673c490) at frame.c:2410 > > > #8 0x0000555555905d53 in inline_frame_sniffer (self=0x555556193520 > > > <inline_frame_unwind>, this_frame=0x55555673c490, > > > this_cache=0x55555673c4a8) at inline-frame.c:215 > > > #9 0x00005555558b719a in frame_unwind_try_unwinder > > > (this_frame=0x55555673c490, this_cache=0x55555673c4a8, > > > unwinder=0x555556193520 <inline_frame_unwind>) at frame-unwind.c:106 > > > > > > So GDB hasn't decided for sure that a frame is an inline frame, > > > instead it is running the inline_frame_sniffer to see if a particular > > > frame is an inline frame or not. > > > > > > In order to figure this out GDB needs to know the value of $pc in the > > > frame that is being sniffed. To get the $pc value GDB asks the next > > > frame (that would be a frame with a lower frame number in GDB terms) > > > to unwind the $pc register. You can see this happening between frames > > > #8 and #6 in the above, before finally in #5 we ask the next frame to > > > unwind the $pc. > > > > > > Almost every frame will have had the inline frame unwinder run on it > > > in order to figure out if it was an inline frame, that doesn't mean > > > the inline frame unwinder will claim the frame. > > > > > > Hope that helps, > > > Thanks, > > > Andrew > > > > > > > > > > > > > > > > > > > > > > > On Tuesday, 23 June 2020, Jan Kratochvil <jan.kratochvil@redhat.com> > > > wrote: > > > > > > > > > On Mon, 22 Jun 2020 23:55:30 +0200, Ahmad Nouralizadeh via Gdb > wrote: > > > > > > But knowing the GDB mechanism to get over the problem will be > > > helpful. > > > > > > > > > > GDB disassembles the code and tries to guess how to unwind it. > > > > > amd64-tdep.c amd64_analyze_prologue(), amd64_frame_cache_1() etc. > > > > > > > > > > That is just a last resort way of unwinding (=a bug in the > debuggee), > > > there > > > > > should always be .eh_frame in the debuggee, also for throwing > > > exceptions > > > > > across such .eh_frame-less functions if there is any callback > there. > > > > > > > > > > > > > > > Jan > > > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2020-06-26 11:33 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-06-22 21:29 GDB Frame Unwinding for Pure Assembly Code Ahmad Nouralizadeh 2020-06-22 21:39 ` Jan Kratochvil 2020-06-22 21:55 ` Ahmad Nouralizadeh 2020-06-23 7:26 ` Jan Kratochvil 2020-06-23 11:48 ` Ahmad Nouralizadeh 2020-06-23 12:23 ` Jan Kratochvil 2020-06-23 14:23 ` Ahmad Nouralizadeh 2020-06-24 14:00 ` Simon Marchi 2020-06-24 19:09 ` Ahmad Nouralizadeh 2020-06-24 15:06 ` Andrew Burgess 2020-06-25 9:10 ` Ahmad Nouralizadeh 2020-06-25 16:51 ` Andrew Burgess 2020-06-26 11:33 ` Ahmad Nouralizadeh
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).