public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* Need help with understanding truncated and corrupted backtraces
@ 2021-03-16 13:24 Eli Zaretskii
  2021-03-24 20:05 ` Eli Zaretskii
  2021-03-24 22:48 ` Simon Marchi
  0 siblings, 2 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-16 13:24 UTC (permalink / raw)
  To: gdb-patches

Hi,

This is most probably not a problem with GDB, but what GDB does in the
case described below is the only thread I can pull to unlock the
mystery.  So I'm asking here, in the hope the your expertise in how
GDB works will give me some ideas.  TIA.

I'm debugging a tricky problem in the development version of Emacs,
the branch which uses libgccjit to compile Lisp files into native
code.  In that build, I see strange phenomena with backtraces reported
by GDB 10.1.  These problems, which AFAIK only happen on MS-Windows in
a 32-bit build of Emacs, result in corrupted and truncated backtraces.
I show the details below, and I'm asking if someone here could
describe how GDB determines which call-stack frame is the "initial
frame", the one above which one cannot go "up", and what problem in
the program being debugged could possibly cause GDB to "lose" that
initial frame and decide that the backtrace ends much sooner than it
should, or in fact does.

Here's a demonstration of the problem.  I put a breakpoint in a
certain Emacs function, called Fskip_chars_backward, which is
implemented in C, and run Emacs in a way that I know will hit that
breakpoint.  The initial backtrace is almost impeccable, see backtrace
#1 below.  I say "almost" because one of the functions in the
backtrace is shown as "??" -- not sure if this is part of the problem
or a separate problem.  But at least the backtrace is otherwise full
and goes up all the way to 'main'.  (This is an unoptimized build,
btw, and with -gdwarf-4 as the debug info switch.  GCC version is
9.2.0.  In case you wonder what are those strange *.eln names: those
are DLLs in disguise produced by compiling Lisp into native code in
the form of a shared library that Emacs then loads.)

Then I "continue" the program, and after a few times this breakpoint
breaks, I see a very different backtrace (see backtrace #2 below):
truncated soon after the innermost frame, and without any error
message from GDB indicating that it thinks the last frame it shows has
problems.

I'm looking for possible reasons for this strange behavior of GDB,
under the assumption that the libgccjit build somehow causes this (I
see no such problems in any other build of Emacs, and never did see
them in the past).  I hope that if I understand better how this could
happen, and/or if I read and step through the code in GDB which could
be involved, I'll be able to track this tricky problem.

Thanks in advance for any help or ideas.

Here's backtrace #1, which is almost normal (and quite long):

  Thread 1 hit Breakpoint 3, Fskip_chars_backward (
      string=XIL(0x80000000068bbe40), lim=XIL(0)) at syntax.c:1610
  1610      return skip_chars (0, string, lim, 1);
  (gdb) bt
  #0  Fskip_chars_backward (string=XIL(0x80000000068bbe40), lim=XIL(0))
      at syntax.c:1610
  #1  0x09980e90 in ?? ()
  #2  0x012608c9 in funcall_subr (subr=0x75f33a8, numargs=0, args=0x828d10)
      at eval.c:3082
  #3  0x01260359 in Ffuncall (nargs=1, args=0x828d08) at eval.c:3009
  #4  0x6d14e5ff in F632d666f6e742d6c6f636b2d636f6d706c65782d6465636c2d70726570617265_c_font_lock_complex_decl_prepare_0 ()
     from d:\usr\eli\.emacs.d\eln-cache\28.0.50-88a242e7\cc-fonts-d7d8a7f5-b7c359cd.eln
  #5  0x01261a74 in funcall_lambda (fun=XIL(0xa000000007779d08), nargs=1,
      arg_vector=0x829090) at eval.c:3292
  #6  0x012603c9 in Ffuncall (nargs=2, args=0x829088) at eval.c:3013
  #7  0x6544afd3 in F666f6e742d6c6f636b2d666f6e746966792d6b6579776f7264732d726567696f6e_font_lock_fontify_keywords_region_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
  #8  0x0126095c in funcall_subr (subr=0x5db766c, numargs=3, args=0x8293e8)
      at eval.c:3089
  #9  0x01260359 in Ffuncall (nargs=4, args=0x8293e0) at eval.c:3009
  #10 0x65446fb9 in F666f6e742d6c6f636b2d64656661756c742d666f6e746966792d726567696f6e_font_lock_default_fontify_region_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
  #11 0x0126095c in funcall_subr (subr=0x5b7b264, numargs=3, args=0x8296f8)
      at eval.c:3089
  #12 0x01260359 in Ffuncall (nargs=4, args=0x8296f0) at eval.c:3009
  #13 0x6c99c7e9 in F632d666f6e742d6c6f636b2d666f6e746966792d726567696f6e_c_font_lock_fontify_region_0 ()
     from d:\usr\eli\.emacs.d\eln-cache\28.0.50-88a242e7\cc-mode-515a98a5-c7761ade.eln
  #14 0x01261a74 in funcall_lambda (fun=XIL(0xa0000000077a2c28), nargs=3,
      arg_vector=0x8298a8) at eval.c:3292
  #15 0x012603c9 in Ffuncall (nargs=4, args=0x8298a0) at eval.c:3013
  #16 0x6544590e in F666f6e742d6c6f636b2d666f6e746966792d726567696f6e_font_lock_fontify_region_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
  #17 0x0126095c in funcall_subr (subr=0x5b8ff1c, numargs=2, args=0x829a98)
      at eval.c:3089
  #18 0x01260359 in Ffuncall (nargs=3, args=0x829a90) at eval.c:3009
  #19 0x012cbf88 in exec_byte_code (bytestr=XIL(0x8000000005b803ac),
      vector=XIL(0xa00000000781d360), maxdepth=make_fixnum(10),
      args_template=make_fixnum(257), nargs=1, args=0x82a1c8) at bytecode.c:632
  #20 0x01260e9e in fetch_and_exec_byte_code (fun=XIL(0xa00000000781d3a8),
      syms_left=make_fixnum(257), nargs=1, args=0x82a1c0) at eval.c:3133
  #21 0x0126141b in funcall_lambda (fun=XIL(0xa00000000781d3a8), nargs=1,
      arg_vector=0x82a1c0) at eval.c:3214
  #22 0x012603c9 in Ffuncall (nargs=2, args=0x82a1b8) at eval.c:3013
  #23 0x0125ef81 in run_hook_wrapped_funcall (nargs=2, args=0x82a1b8)
      at eval.c:2746
  #24 0x0125f432 in run_hook_with_args (nargs=2, args=0x82a1b8,
      funcall=0x125ef39 <run_hook_wrapped_funcall>) at eval.c:2827
  #25 0x0125efd0 in Frun_hook_wrapped (nargs=2, args=0x82a1b8) at eval.c:2761
  #26 0x6e984a40 in F6a69742d6c6f636b2d2d72756e2d66756e6374696f6e73_jit_lock__run_functions_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
  #27 0x01260914 in funcall_subr (subr=0x5b7ff44, numargs=2, args=0x82a440)
      at eval.c:3086
  #28 0x01260359 in Ffuncall (nargs=3, args=0x82a438) at eval.c:3009
  #29 0x6e9858ff in F6a69742d6c6f636b2d666f6e746966792d6e6f77_jit_lock_fontify_now_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
  #30 0x01260914 in funcall_subr (subr=0x5b8046c, numargs=2, args=0x82a700)
      at eval.c:3086
  #31 0x01260359 in Ffuncall (nargs=3, args=0x82a6f8) at eval.c:3009
  #32 0x6e98431b in F6a69742d6c6f636b2d66756e6374696f6e_jit_lock_function_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
  #33 0x012608e5 in funcall_subr (subr=0x5b80ebc, numargs=1, args=0x82a928)
      at eval.c:3084
  #34 0x01260359 in Ffuncall (nargs=2, args=0x82a920) at eval.c:3009
  #35 0x0125a974 in internal_condition_case_n (bfun=0x1260164 <Ffuncall>,
      nargs=2, args=0x82a920, handlers=XIL(0x30),
      hfun=0x104150e <safe_eval_handler>) at eval.c:1612
  #36 0x01041741 in safe__call (inhibit_quit=false, nargs=2,
      func=XIL(0x4316b2c), ap=0x82a9e4 "") at xdisp.c:2981
  #37 0x010417ba in safe_call (nargs=2, func=XIL(0x4316b2c)) at xdisp.c:2996
  #38 0x0104180e in safe_call1 (fn=XIL(0x4316b2c), arg=make_fixnum(31810))
      at xdisp.c:3007
  #39 0x01045486 in handle_fontified_prop (it=0x82bc78) at xdisp.c:4313
  #40 0x01043e0d in handle_stop (it=0x82bc78) at xdisp.c:3841
  #41 0x0104f655 in reseat (it=0x82bc78, pos=..., force_p=true) at xdisp.c:7087
  #42 0x0104306b in init_iterator (it=0x82bc78, w=0x68df710, charpos=31810,
      bytepos=31810, row=0x0, base_face_id=DEFAULT_FACE_ID) at xdisp.c:3442
  #43 0x01074987 in redisplay_window (window=XIL(0xa0000000068df710),
      just_this_one_p=false) at xdisp.c:18960
  #44 0x0106c131 in redisplay_window_0 (window=XIL(0xa0000000068df710))
      at xdisp.c:16581
  #45 0x0125a3d3 in internal_condition_case_1 (
      bfun=0x106c0d9 <redisplay_window_0>, arg=XIL(0xa0000000068df710),
      handlers=XIL(0xc000000006060ffc), hfun=0x106c096 <redisplay_window_error>)
      at eval.c:1472
  #46 0x0106c058 in redisplay_windows (window=XIL(0xa0000000068df710))
      at xdisp.c:16561
  #47 0x0106a8a9 in redisplay_internal () at xdisp.c:16029
  #48 0x0106b73b in redisplay_preserve_echo_area (from_where=2) at xdisp.c:16378
  #49 0x01012141 in Fredisplay (force=XIL(0)) at dispnew.c:6157
  #50 0x68d5b65f in F7369742d666f72_sit_for_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\subr-13adf6a6-4a99f497.eln
  #51 0x0126095c in funcall_subr (subr=0x5b97abc, numargs=1, args=0x82e040)
      at eval.c:3089
  #52 0x01260359 in Ffuncall (nargs=2, args=0x82e038) at eval.c:3009
  #53 0x707e4c86 in F697365617263682d6c617a792d686967686c696768742d6e65772d6c6f6f70_isearch_lazy_highlight_new_loop_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #54 0x01260914 in funcall_subr (subr=0x5f3b99c, numargs=0, args=0x82e320)
      at eval.c:3086
  #55 0x01260359 in Ffuncall (nargs=1, args=0x82e318) at eval.c:3009
  #56 0x707cb91f in F697365617263682d757064617465_isearch_update_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #57 0x012608c9 in funcall_subr (subr=0x5c026e4, numargs=0, args=0x82e620)
      at eval.c:3082
  #58 0x01260359 in Ffuncall (nargs=1, args=0x82e618) at eval.c:3009
  #59 0x707db33a in F697365617263682d7365617263682d616e642d757064617465_isearch_search_and_update_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #60 0x012608c9 in funcall_subr (subr=0x5ed2a4c, numargs=0, args=0x82e830)
      at eval.c:3082
  #61 0x01260359 in Ffuncall (nargs=1, args=0x82e828) at eval.c:3009
  #62 0x707de798 in F697365617263682d70726f636573732d7365617263682d737472696e67_isearch_process_search_string_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #63 0x01260914 in funcall_subr (subr=0x5ecbc1c, numargs=2, args=0x82ea58)
      at eval.c:3086
  #64 0x01260359 in Ffuncall (nargs=3, args=0x82ea50) at eval.c:3009
  #65 0x707de54b in F697365617263682d70726f636573732d7365617263682d63686172_isearch_process_search_char_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #66 0x01260914 in funcall_subr (subr=0x5ed9e5c, numargs=2, args=0x82ec90)
      at eval.c:3086
  #67 0x01260359 in Ffuncall (nargs=3, args=0x82ec88) at eval.c:3009
  #68 0x707de31e in F697365617263682d7072696e74696e672d63686172_isearch_printing_char_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
  #69 0x01260914 in funcall_subr (subr=0x5c296f4, numargs=2, args=0x82f030)
      at eval.c:3086
  #70 0x01260359 in Ffuncall (nargs=3, args=0x82f028) at eval.c:3009
  #71 0x0125028d in Ffuncall_interactively (nargs=3, args=0x82f028)
      at callint.c:260
  #72 0x01260801 in funcall_subr (subr=0x1730200 <Sfuncall_interactively>,
      numargs=3, args=0x82f028) at eval.c:3064
  #73 0x01260359 in Ffuncall (nargs=4, args=0x82f020) at eval.c:3009
  #74 0x0125ed8c in Fapply (nargs=3, args=0x82f1a8) at eval.c:2639
  #75 0x012509a8 in Fcall_interactively (function=XIL(0x43bf364),
      record_flag=XIL(0), keys=XIL(0xa00000000679da50)) at callint.c:353
  #76 0x6a9daca1 in F636f6d6d616e642d65786563757465_command_execute_0 ()
     from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\simple-fab5b0cf-9fc01ab7.eln
  #77 0x012609cf in funcall_subr (subr=0x5b3a3c4, numargs=1, args=0x82f598)
      at eval.c:3092
  #78 0x01260359 in Ffuncall (nargs=2, args=0x82f590) at eval.c:3009
  #79 0x0125f66c in call1 (fn=XIL(0x4110), arg1=XIL(0x43bf364)) at eval.c:2869
  #80 0x011654b4 in command_loop_1 () at keyboard.c:1466
  #81 0x0125a2e9 in internal_condition_case (bfun=0x1164873 <command_loop_1>,
      handlers=XIL(0x90), hfun=0x1163ad1 <cmd_error>) at eval.c:1448
  #82 0x011642e0 in command_loop_2 (ignore=XIL(0)) at keyboard.c:1094
  #83 0x012593cc in internal_catch (tag=XIL(0xee80),
      func=0x11642a3 <command_loop_2>, arg=XIL(0)) at eval.c:1198
  #84 0x0116425d in command_loop () at keyboard.c:1073
  #85 0x01163561 in recursive_edit_1 () at keyboard.c:720
  #86 0x011637cf in Frecursive_edit () at keyboard.c:789
  #87 0x0115ee6e in main (argc=2, argv=0xa42828) at emacs.c:2095

And here, by contrast, is backtrace#2.  (It doesn't always get so
short, sometimes it has several dozens of frames, but it always stops
short of reaching 'main'.)

  Thread 1 hit Breakpoint 3, Fskip_chars_backward (
      string=XIL(0x80000000068bc060), lim=XIL(0)) at syntax.c:1610
  1610      return skip_chars (0, string, lim, 1);
  (gdb) bt
  #0  Fskip_chars_backward (string=XIL(0x80000000068bc060), lim=XIL(0))
      at syntax.c:1610
  #1  0x09977f08 in ?? ()
  #2  0x01260359 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
  #3  0x099685bb in ?? ()
  #4  0x00000000 in ?? ()


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-16 13:24 Need help with understanding truncated and corrupted backtraces Eli Zaretskii
@ 2021-03-24 20:05 ` Eli Zaretskii
  2021-03-24 22:48 ` Simon Marchi
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-24 20:05 UTC (permalink / raw)
  To: gdb-patches

Ping!  Could someone please give me some hints about this, even if
that's just a pointer to the relevant code in GDB where I could
understand why backtrace could become truncated like that?  Please?

> Date: Tue, 16 Mar 2021 15:24:56 +0200
> From: Eli Zaretskii via Gdb-patches <gdb-patches@sourceware.org>
> 
> Hi,
> 
> This is most probably not a problem with GDB, but what GDB does in the
> case described below is the only thread I can pull to unlock the
> mystery.  So I'm asking here, in the hope the your expertise in how
> GDB works will give me some ideas.  TIA.
> 
> I'm debugging a tricky problem in the development version of Emacs,
> the branch which uses libgccjit to compile Lisp files into native
> code.  In that build, I see strange phenomena with backtraces reported
> by GDB 10.1.  These problems, which AFAIK only happen on MS-Windows in
> a 32-bit build of Emacs, result in corrupted and truncated backtraces.
> I show the details below, and I'm asking if someone here could
> describe how GDB determines which call-stack frame is the "initial
> frame", the one above which one cannot go "up", and what problem in
> the program being debugged could possibly cause GDB to "lose" that
> initial frame and decide that the backtrace ends much sooner than it
> should, or in fact does.
> 
> Here's a demonstration of the problem.  I put a breakpoint in a
> certain Emacs function, called Fskip_chars_backward, which is
> implemented in C, and run Emacs in a way that I know will hit that
> breakpoint.  The initial backtrace is almost impeccable, see backtrace
> #1 below.  I say "almost" because one of the functions in the
> backtrace is shown as "??" -- not sure if this is part of the problem
> or a separate problem.  But at least the backtrace is otherwise full
> and goes up all the way to 'main'.  (This is an unoptimized build,
> btw, and with -gdwarf-4 as the debug info switch.  GCC version is
> 9.2.0.  In case you wonder what are those strange *.eln names: those
> are DLLs in disguise produced by compiling Lisp into native code in
> the form of a shared library that Emacs then loads.)
> 
> Then I "continue" the program, and after a few times this breakpoint
> breaks, I see a very different backtrace (see backtrace #2 below):
> truncated soon after the innermost frame, and without any error
> message from GDB indicating that it thinks the last frame it shows has
> problems.
> 
> I'm looking for possible reasons for this strange behavior of GDB,
> under the assumption that the libgccjit build somehow causes this (I
> see no such problems in any other build of Emacs, and never did see
> them in the past).  I hope that if I understand better how this could
> happen, and/or if I read and step through the code in GDB which could
> be involved, I'll be able to track this tricky problem.
> 
> Thanks in advance for any help or ideas.
> 
> Here's backtrace #1, which is almost normal (and quite long):
> 
>   Thread 1 hit Breakpoint 3, Fskip_chars_backward (
>       string=XIL(0x80000000068bbe40), lim=XIL(0)) at syntax.c:1610
>   1610      return skip_chars (0, string, lim, 1);
>   (gdb) bt
>   #0  Fskip_chars_backward (string=XIL(0x80000000068bbe40), lim=XIL(0))
>       at syntax.c:1610
>   #1  0x09980e90 in ?? ()
>   #2  0x012608c9 in funcall_subr (subr=0x75f33a8, numargs=0, args=0x828d10)
>       at eval.c:3082
>   #3  0x01260359 in Ffuncall (nargs=1, args=0x828d08) at eval.c:3009
>   #4  0x6d14e5ff in F632d666f6e742d6c6f636b2d636f6d706c65782d6465636c2d70726570617265_c_font_lock_complex_decl_prepare_0 ()
>      from d:\usr\eli\.emacs.d\eln-cache\28.0.50-88a242e7\cc-fonts-d7d8a7f5-b7c359cd.eln
>   #5  0x01261a74 in funcall_lambda (fun=XIL(0xa000000007779d08), nargs=1,
>       arg_vector=0x829090) at eval.c:3292
>   #6  0x012603c9 in Ffuncall (nargs=2, args=0x829088) at eval.c:3013
>   #7  0x6544afd3 in F666f6e742d6c6f636b2d666f6e746966792d6b6579776f7264732d726567696f6e_font_lock_fontify_keywords_region_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
>   #8  0x0126095c in funcall_subr (subr=0x5db766c, numargs=3, args=0x8293e8)
>       at eval.c:3089
>   #9  0x01260359 in Ffuncall (nargs=4, args=0x8293e0) at eval.c:3009
>   #10 0x65446fb9 in F666f6e742d6c6f636b2d64656661756c742d666f6e746966792d726567696f6e_font_lock_default_fontify_region_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
>   #11 0x0126095c in funcall_subr (subr=0x5b7b264, numargs=3, args=0x8296f8)
>       at eval.c:3089
>   #12 0x01260359 in Ffuncall (nargs=4, args=0x8296f0) at eval.c:3009
>   #13 0x6c99c7e9 in F632d666f6e742d6c6f636b2d666f6e746966792d726567696f6e_c_font_lock_fontify_region_0 ()
>      from d:\usr\eli\.emacs.d\eln-cache\28.0.50-88a242e7\cc-mode-515a98a5-c7761ade.eln
>   #14 0x01261a74 in funcall_lambda (fun=XIL(0xa0000000077a2c28), nargs=3,
>       arg_vector=0x8298a8) at eval.c:3292
>   #15 0x012603c9 in Ffuncall (nargs=4, args=0x8298a0) at eval.c:3013
>   #16 0x6544590e in F666f6e742d6c6f636b2d666f6e746966792d726567696f6e_font_lock_fontify_region_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\font-lock-895216f6-4cc69ac6.eln
>   #17 0x0126095c in funcall_subr (subr=0x5b8ff1c, numargs=2, args=0x829a98)
>       at eval.c:3089
>   #18 0x01260359 in Ffuncall (nargs=3, args=0x829a90) at eval.c:3009
>   #19 0x012cbf88 in exec_byte_code (bytestr=XIL(0x8000000005b803ac),
>       vector=XIL(0xa00000000781d360), maxdepth=make_fixnum(10),
>       args_template=make_fixnum(257), nargs=1, args=0x82a1c8) at bytecode.c:632
>   #20 0x01260e9e in fetch_and_exec_byte_code (fun=XIL(0xa00000000781d3a8),
>       syms_left=make_fixnum(257), nargs=1, args=0x82a1c0) at eval.c:3133
>   #21 0x0126141b in funcall_lambda (fun=XIL(0xa00000000781d3a8), nargs=1,
>       arg_vector=0x82a1c0) at eval.c:3214
>   #22 0x012603c9 in Ffuncall (nargs=2, args=0x82a1b8) at eval.c:3013
>   #23 0x0125ef81 in run_hook_wrapped_funcall (nargs=2, args=0x82a1b8)
>       at eval.c:2746
>   #24 0x0125f432 in run_hook_with_args (nargs=2, args=0x82a1b8,
>       funcall=0x125ef39 <run_hook_wrapped_funcall>) at eval.c:2827
>   #25 0x0125efd0 in Frun_hook_wrapped (nargs=2, args=0x82a1b8) at eval.c:2761
>   #26 0x6e984a40 in F6a69742d6c6f636b2d2d72756e2d66756e6374696f6e73_jit_lock__run_functions_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
>   #27 0x01260914 in funcall_subr (subr=0x5b7ff44, numargs=2, args=0x82a440)
>       at eval.c:3086
>   #28 0x01260359 in Ffuncall (nargs=3, args=0x82a438) at eval.c:3009
>   #29 0x6e9858ff in F6a69742d6c6f636b2d666f6e746966792d6e6f77_jit_lock_fontify_now_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
>   #30 0x01260914 in funcall_subr (subr=0x5b8046c, numargs=2, args=0x82a700)
>       at eval.c:3086
>   #31 0x01260359 in Ffuncall (nargs=3, args=0x82a6f8) at eval.c:3009
>   #32 0x6e98431b in F6a69742d6c6f636b2d66756e6374696f6e_jit_lock_function_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\jit-lock-8a988e43-73eb7028.eln
>   #33 0x012608e5 in funcall_subr (subr=0x5b80ebc, numargs=1, args=0x82a928)
>       at eval.c:3084
>   #34 0x01260359 in Ffuncall (nargs=2, args=0x82a920) at eval.c:3009
>   #35 0x0125a974 in internal_condition_case_n (bfun=0x1260164 <Ffuncall>,
>       nargs=2, args=0x82a920, handlers=XIL(0x30),
>       hfun=0x104150e <safe_eval_handler>) at eval.c:1612
>   #36 0x01041741 in safe__call (inhibit_quit=false, nargs=2,
>       func=XIL(0x4316b2c), ap=0x82a9e4 "") at xdisp.c:2981
>   #37 0x010417ba in safe_call (nargs=2, func=XIL(0x4316b2c)) at xdisp.c:2996
>   #38 0x0104180e in safe_call1 (fn=XIL(0x4316b2c), arg=make_fixnum(31810))
>       at xdisp.c:3007
>   #39 0x01045486 in handle_fontified_prop (it=0x82bc78) at xdisp.c:4313
>   #40 0x01043e0d in handle_stop (it=0x82bc78) at xdisp.c:3841
>   #41 0x0104f655 in reseat (it=0x82bc78, pos=..., force_p=true) at xdisp.c:7087
>   #42 0x0104306b in init_iterator (it=0x82bc78, w=0x68df710, charpos=31810,
>       bytepos=31810, row=0x0, base_face_id=DEFAULT_FACE_ID) at xdisp.c:3442
>   #43 0x01074987 in redisplay_window (window=XIL(0xa0000000068df710),
>       just_this_one_p=false) at xdisp.c:18960
>   #44 0x0106c131 in redisplay_window_0 (window=XIL(0xa0000000068df710))
>       at xdisp.c:16581
>   #45 0x0125a3d3 in internal_condition_case_1 (
>       bfun=0x106c0d9 <redisplay_window_0>, arg=XIL(0xa0000000068df710),
>       handlers=XIL(0xc000000006060ffc), hfun=0x106c096 <redisplay_window_error>)
>       at eval.c:1472
>   #46 0x0106c058 in redisplay_windows (window=XIL(0xa0000000068df710))
>       at xdisp.c:16561
>   #47 0x0106a8a9 in redisplay_internal () at xdisp.c:16029
>   #48 0x0106b73b in redisplay_preserve_echo_area (from_where=2) at xdisp.c:16378
>   #49 0x01012141 in Fredisplay (force=XIL(0)) at dispnew.c:6157
>   #50 0x68d5b65f in F7369742d666f72_sit_for_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\subr-13adf6a6-4a99f497.eln
>   #51 0x0126095c in funcall_subr (subr=0x5b97abc, numargs=1, args=0x82e040)
>       at eval.c:3089
>   #52 0x01260359 in Ffuncall (nargs=2, args=0x82e038) at eval.c:3009
>   #53 0x707e4c86 in F697365617263682d6c617a792d686967686c696768742d6e65772d6c6f6f70_isearch_lazy_highlight_new_loop_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #54 0x01260914 in funcall_subr (subr=0x5f3b99c, numargs=0, args=0x82e320)
>       at eval.c:3086
>   #55 0x01260359 in Ffuncall (nargs=1, args=0x82e318) at eval.c:3009
>   #56 0x707cb91f in F697365617263682d757064617465_isearch_update_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #57 0x012608c9 in funcall_subr (subr=0x5c026e4, numargs=0, args=0x82e620)
>       at eval.c:3082
>   #58 0x01260359 in Ffuncall (nargs=1, args=0x82e618) at eval.c:3009
>   #59 0x707db33a in F697365617263682d7365617263682d616e642d757064617465_isearch_search_and_update_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #60 0x012608c9 in funcall_subr (subr=0x5ed2a4c, numargs=0, args=0x82e830)
>       at eval.c:3082
>   #61 0x01260359 in Ffuncall (nargs=1, args=0x82e828) at eval.c:3009
>   #62 0x707de798 in F697365617263682d70726f636573732d7365617263682d737472696e67_isearch_process_search_string_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #63 0x01260914 in funcall_subr (subr=0x5ecbc1c, numargs=2, args=0x82ea58)
>       at eval.c:3086
>   #64 0x01260359 in Ffuncall (nargs=3, args=0x82ea50) at eval.c:3009
>   #65 0x707de54b in F697365617263682d70726f636573732d7365617263682d63686172_isearch_process_search_char_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #66 0x01260914 in funcall_subr (subr=0x5ed9e5c, numargs=2, args=0x82ec90)
>       at eval.c:3086
>   #67 0x01260359 in Ffuncall (nargs=3, args=0x82ec88) at eval.c:3009
>   #68 0x707de31e in F697365617263682d7072696e74696e672d63686172_isearch_printing_char_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\isearch-693bdd75-d165ea70.eln
>   #69 0x01260914 in funcall_subr (subr=0x5c296f4, numargs=2, args=0x82f030)
>       at eval.c:3086
>   #70 0x01260359 in Ffuncall (nargs=3, args=0x82f028) at eval.c:3009
>   #71 0x0125028d in Ffuncall_interactively (nargs=3, args=0x82f028)
>       at callint.c:260
>   #72 0x01260801 in funcall_subr (subr=0x1730200 <Sfuncall_interactively>,
>       numargs=3, args=0x82f028) at eval.c:3064
>   #73 0x01260359 in Ffuncall (nargs=4, args=0x82f020) at eval.c:3009
>   #74 0x0125ed8c in Fapply (nargs=3, args=0x82f1a8) at eval.c:2639
>   #75 0x012509a8 in Fcall_interactively (function=XIL(0x43bf364),
>       record_flag=XIL(0), keys=XIL(0xa00000000679da50)) at callint.c:353
>   #76 0x6a9daca1 in F636f6d6d616e642d65786563757465_command_execute_0 ()
>      from d:\gnu\git\emacs\native-comp\native-lisp\28.0.50-88a242e7\simple-fab5b0cf-9fc01ab7.eln
>   #77 0x012609cf in funcall_subr (subr=0x5b3a3c4, numargs=1, args=0x82f598)
>       at eval.c:3092
>   #78 0x01260359 in Ffuncall (nargs=2, args=0x82f590) at eval.c:3009
>   #79 0x0125f66c in call1 (fn=XIL(0x4110), arg1=XIL(0x43bf364)) at eval.c:2869
>   #80 0x011654b4 in command_loop_1 () at keyboard.c:1466
>   #81 0x0125a2e9 in internal_condition_case (bfun=0x1164873 <command_loop_1>,
>       handlers=XIL(0x90), hfun=0x1163ad1 <cmd_error>) at eval.c:1448
>   #82 0x011642e0 in command_loop_2 (ignore=XIL(0)) at keyboard.c:1094
>   #83 0x012593cc in internal_catch (tag=XIL(0xee80),
>       func=0x11642a3 <command_loop_2>, arg=XIL(0)) at eval.c:1198
>   #84 0x0116425d in command_loop () at keyboard.c:1073
>   #85 0x01163561 in recursive_edit_1 () at keyboard.c:720
>   #86 0x011637cf in Frecursive_edit () at keyboard.c:789
>   #87 0x0115ee6e in main (argc=2, argv=0xa42828) at emacs.c:2095
> 
> And here, by contrast, is backtrace#2.  (It doesn't always get so
> short, sometimes it has several dozens of frames, but it always stops
> short of reaching 'main'.)
> 
>   Thread 1 hit Breakpoint 3, Fskip_chars_backward (
>       string=XIL(0x80000000068bc060), lim=XIL(0)) at syntax.c:1610
>   1610      return skip_chars (0, string, lim, 1);
>   (gdb) bt
>   #0  Fskip_chars_backward (string=XIL(0x80000000068bc060), lim=XIL(0))
>       at syntax.c:1610
>   #1  0x09977f08 in ?? ()
>   #2  0x01260359 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
>   #3  0x099685bb in ?? ()
>   #4  0x00000000 in ?? ()
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-16 13:24 Need help with understanding truncated and corrupted backtraces Eli Zaretskii
  2021-03-24 20:05 ` Eli Zaretskii
@ 2021-03-24 22:48 ` Simon Marchi
  2021-03-25  3:47   ` Simon Marchi
  2021-03-25 10:04   ` Eli Zaretskii
  1 sibling, 2 replies; 17+ messages in thread
From: Simon Marchi @ 2021-03-24 22:48 UTC (permalink / raw)
  To: Eli Zaretskii, gdb-patches

Hi Eli,

First, you could try to enable "set debug frame 1" before backtracing,
It will show information about each frame unwind, in between each frame.
In particular, you should see the reason why unwinding the last shown
frame didn't yield any more frame.

On 2021-03-16 9:24 a.m., Eli Zaretskii via Gdb-patches wrote:
> Here's backtrace #1, which is almost normal (and quite long):
> 
>   Thread 1 hit Breakpoint 3, Fskip_chars_backward (
>       string=XIL(0x80000000068bbe40), lim=XIL(0)) at syntax.c:1610
>   1610      return skip_chars (0, string, lim, 1);
>   (gdb) bt
>   #0  Fskip_chars_backward (string=XIL(0x80000000068bbe40), lim=XIL(0))
>       at syntax.c:1610
>   #1  0x09980e90 in ?? ()
>   #2  0x012608c9 in funcall_subr (subr=0x75f33a8, numargs=0, args=0x828d10)
>       at eval.c:3082
>   #3  0x01260359 in Ffuncall (nargs=1, args=0x828d08) at eval.c:3009
>   #4  0x6d14e5ff in F632d666f6e742d6c6f636b2d636f6d706c65782d6465636c2d70726570617265_c_font_lock_complex_decl_prepare_0 ()
>      from d:\usr\eli\.emacs.d\eln-cache\28.0.50-88a242e7\cc-fonts-d7d8a7f5-b7c359cd.eln
>   #5  0x01261a74 in funcall_lambda (fun=XIL(0xa000000007779d08), nargs=1,
>       arg_vector=0x829090) at eval.c:3292

[truncated for brevity]

> 
> And here, by contrast, is backtrace#2.  (It doesn't always get so
> short, sometimes it has several dozens of frames, but it always stops
> short of reaching 'main'.)
> 
>   Thread 1 hit Breakpoint 3, Fskip_chars_backward (
>       string=XIL(0x80000000068bc060), lim=XIL(0)) at syntax.c:1610
>   1610      return skip_chars (0, string, lim, 1);
>   (gdb) bt
>   #0  Fskip_chars_backward (string=XIL(0x80000000068bc060), lim=XIL(0))
>       at syntax.c:1610
>   #1  0x09977f08 in ?? ()
>   #2  0x01260359 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
>   #3  0x099685bb in ?? ()
>   #4  0x00000000 in ?? ()
> 

From these backtraces, what I notice is that you have debug info for the
code written in C, such as Ffuncall: we see the argument values.

But the code coming from the JIT (in eln files) doesn't appear to have
debug info: we don't see any argument values.  Perhaps that's expected
for JIT code, I don't know.  But there isn't debug info for these
frames, GDB will use its own unwinder, which tries to analyze the
assembly instructions in the prologue to understand what was saved where
and the previous frame address.  This is heuristic-ish, so if something
goes wrong in one frame, the backtrace will be wrong from there.  In
comparison, frames for which there is unwind information from the DWARF
debug information are unwound very precisely.

It would therefore be interesting to know which unwinder is selected for
each frame: the unwinder that uses the DWARF debug info or the heuristic
unwinder.  Unfortunately, "set debug frame 1" doesn't seem to show any
information about that.  You could always break on the
frame_unwind_find_by_frame function to find that out.  But I think it
would be benificial to everyone to add some debug output in that
function, to say things like:

  Trying unwinder foo for frame 1 -> nope
  Trying unwinder bar for frame 1 -> yes!

Once you know which unwinder is selected for each frame, you can try to
find starting at which unwind things went wrong and focus on that one.

And if this is indeed what is happening to you (the heuristic unwinder
gets things wrong), I see two choices:

 - make the JIT producer generate DWARF info for the JIT-ed code, which
   will make GDB use the DWARF-based unwinder
 - improve the heuristic-based unwinder

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-24 22:48 ` Simon Marchi
@ 2021-03-25  3:47   ` Simon Marchi
  2021-03-25 10:09     ` Eli Zaretskii
  2021-03-25 10:04   ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: Simon Marchi @ 2021-03-25  3:47 UTC (permalink / raw)
  To: Eli Zaretskii, gdb-patches

> It would therefore be interesting to know which unwinder is selected for
> each frame: the unwinder that uses the DWARF debug info or the heuristic
> unwinder.  Unfortunately, "set debug frame 1" doesn't seem to show any
> information about that.  You could always break on the
> frame_unwind_find_by_frame function to find that out.  But I think it
> would be benificial to everyone to add some debug output in that
> function, to say things like:
>
>   Trying unwinder foo for frame 1 -> nope
>   Trying unwinder bar for frame 1 -> yes!

I implemented this, it is available on the users/simark/frame-debug
branch on sourceware:

    https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/simark/frame-debug

It is not suitable for an upstream submission yet, but you can give it a
try.  I converted the "debug frame" messages to the new-ish system,
which nests messages.  It dramatically improves readability, IMO.

Here's a small example of what you can expect:

    [frame] get_prev_frame: enter
      [frame] get_prev_frame_always_1: enter
        [frame] get_prev_frame_always_1: this_frame=-1
        [frame] get_prev_frame_always_1:   -> {level=0, type=<unknown>, unwind=<unknown>, pc=0x7ffff7fd08e8, id=<not computed>, func=<unknown>} // cached
      [frame] get_prev_frame_always_1: exit
    [frame] get_prev_frame: exit
    [frame] compute_frame_id: enter
      [frame] compute_frame_id: fi=0
      [frame] frame_unwind_find_by_frame: enter
        [frame] frame_unwind_find_by_frame: this_frame=0
        [frame] frame_unwind_try_unwinder: trying unwinder "dummy"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "inline"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "jit"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "python"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "amd64 epilogue"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "i386 epilogue"
        [frame] frame_unwind_try_unwinder: no
        [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
        [frame] frame_unwind_try_unwinder: yes
      [frame] frame_unwind_find_by_frame: exit
      [frame] get_frame_func_if_available: this_frame=0 -> 0x7ffff7fce4c0
      [frame] frame_id_p: l={!stack, !code, !special} -> 0
      [frame] frame_unwind_register_value: frame=-1, regnum=6(rbp)
      [frame] frame_unwind_register_value:   -> register=6 bytes=[b0ddffffff7f0000]
      [frame] frame_unwind_arch: next_frame=0 -> i386:x86-64
      [frame] frame_unwind_register_value: frame=0, regnum=16(rip)
      [frame] frame_unwind_register_value:   -> address=0x7fffffffddb8 bytes=[9f86fef7ff7f0000]
      [frame] frame_id_p: l={stack=0x7fffffffddc0, code=0x7ffff7fce4c0, !special} -> 1
      [frame] compute_frame_id:   -> {stack=0x7fffffffddc0, code=0x7ffff7fce4c0, !special}
    [frame] compute_frame_id: exit

>
> Once you know which unwinder is selected for each frame, you can try to
> find starting at which unwind things went wrong and focus on that one.
>
> And if this is indeed what is happening to you (the heuristic unwinder
> gets things wrong), I see two choices:
>
>  - make the JIT producer generate DWARF info for the JIT-ed code, which
>    will make GDB use the DWARF-based unwinder
>  - improve the heuristic-based unwinder

While doing this work, I saw and remembered that another possibility is
to provide a custom unwinder:

    https://sourceware.org/gdb/current/onlinedocs/gdb/Custom-Debug-Info.html#Custom-Debug-Info

But anyway, let's start with figuring out what goes wrong.

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-24 22:48 ` Simon Marchi
  2021-03-25  3:47   ` Simon Marchi
@ 2021-03-25 10:04   ` Eli Zaretskii
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-25 10:04 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Wed, 24 Mar 2021 18:48:18 -0400
> 
> First, you could try to enable "set debug frame 1" before backtracing,
> It will show information about each frame unwind, in between each frame.
> In particular, you should see the reason why unwinding the last shown
> frame didn't yield any more frame.

Thanks, I did that, but how do I tell what is the reason for not
showing more frames?  This setting spews gobs of information, and I
don't know what should I look for.

> >   Thread 1 hit Breakpoint 3, Fskip_chars_backward (
> >       string=XIL(0x80000000068bc060), lim=XIL(0)) at syntax.c:1610
> >   1610      return skip_chars (0, string, lim, 1);
> >   (gdb) bt
> >   #0  Fskip_chars_backward (string=XIL(0x80000000068bc060), lim=XIL(0))
> >       at syntax.c:1610
> >   #1  0x09977f08 in ?? ()
> >   #2  0x01260359 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
> >   #3  0x099685bb in ?? ()
> >   #4  0x00000000 in ?? ()
> > 
> 
> >From these backtraces, what I notice is that you have debug info for the
> code written in C, such as Ffuncall: we see the argument values.

Yes.  Emacs is compiled with -gdwarf-4 (and if you are wondering about
those XIL(..) things, you will find a custom Python printer in
src/.gdbinit in the Emacs sources).

> But the code coming from the JIT (in eln files) doesn't appear to have
> debug info: we don't see any argument values.  Perhaps that's expected
> for JIT code, I don't know.

Yes, currently the native-compiled code doesn't have any debug info.
I'm thinking of perhaps changing that.

> Once you know which unwinder is selected for each frame, you can try to
> find starting at which unwind things went wrong and focus on that one.
> 
> And if this is indeed what is happening to you (the heuristic unwinder
> gets things wrong), I see two choices:
> 
>  - make the JIT producer generate DWARF info for the JIT-ed code, which
>    will make GDB use the DWARF-based unwinder
>  - improve the heuristic-based unwinder

Let me know if you'd like to see the info emitted by "debug frame 1".

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-25  3:47   ` Simon Marchi
@ 2021-03-25 10:09     ` Eli Zaretskii
  2021-03-25 14:19       ` Simon Marchi
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-25 10:09 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Wed, 24 Mar 2021 23:47:32 -0400
> 
> >   Trying unwinder foo for frame 1 -> nope
> >   Trying unwinder bar for frame 1 -> yes!
> 
> I implemented this, it is available on the users/simark/frame-debug
> branch on sourceware:
> 
>     https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/simark/frame-debug
> 
> It is not suitable for an upstream submission yet, but you can give it a
> try.  I converted the "debug frame" messages to the new-ish system,
> which nests messages.  It dramatically improves readability, IMO.

Thanks.  Is this supposed to work for 32-bit GDB, though?  It looks
like the non-trivial changes are in amd64-windows-tdep.c, so I wonder
whether you only implemented this for 64-bit Widnows?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-25 10:09     ` Eli Zaretskii
@ 2021-03-25 14:19       ` Simon Marchi
  2021-03-26  7:19         ` Eli Zaretskii
  2021-03-29  8:03         ` Eli Zaretskii
  0 siblings, 2 replies; 17+ messages in thread
From: Simon Marchi @ 2021-03-25 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On 2021-03-25 6:09 a.m., Eli Zaretskii wrote:
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Date: Wed, 24 Mar 2021 23:47:32 -0400
>>
>>>   Trying unwinder foo for frame 1 -> nope
>>>   Trying unwinder bar for frame 1 -> yes!
>>
>> I implemented this, it is available on the users/simark/frame-debug
>> branch on sourceware:
>>
>>     https://sourceware.org/git/?p=binutils-gdb.git;a=shortlog;h=refs/heads/users/simark/frame-debug
>>
>> It is not suitable for an upstream submission yet, but you can give it a
>> try.  I converted the "debug frame" messages to the new-ish system,
>> which nests messages.  It dramatically improves readability, IMO.
> 
> Thanks.  Is this supposed to work for 32-bit GDB, though?  It looks
> like the non-trivial changes are in amd64-windows-tdep.c, so I wonder
> whether you only implemented this for 64-bit Widnows?

The interesting bits are in the generic frame handling code, including
the new printouts I added in the frame_unwind_try_unwinder function.
amd64-windows-tdep.c just happens to have some frame-related debug
prints, so simply I updated them as well.

It would be interesting to see the output of your two backtraces, with
"set debug frame 1" enabled (with my patches applied).

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-25 14:19       ` Simon Marchi
@ 2021-03-26  7:19         ` Eli Zaretskii
  2021-03-29  8:03         ` Eli Zaretskii
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-26  7:19 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Thu, 25 Mar 2021 10:19:53 -0400
> 
> It would be interesting to see the output of your two backtraces, with
> "set debug frame 1" enabled (with my patches applied).

OK, I will build a snapshot of the master branch with your patches,
and get back with that info.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-25 14:19       ` Simon Marchi
  2021-03-26  7:19         ` Eli Zaretskii
@ 2021-03-29  8:03         ` Eli Zaretskii
  2021-03-29 14:33           ` Simon Marchi
  1 sibling, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-29  8:03 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Thu, 25 Mar 2021 10:19:53 -0400
> 
> It would be interesting to see the output of your two backtraces, with
> "set debug frame 1" enabled (with my patches applied).

I've built the GDB snapshot from Mar 25, after applying the patches
from your branch wrt master.

I attach below the output for the "short" backtrace.  The backtrace
itself is just this:

  #0  Fskip_chars_backward (string=-9223372036743969912, lim=0) at syntax.c:1610
  #1  0x09957f08 in ?? ()
  #2  0x01260805 in Ffuncall (nargs=1, args=0x8282e0) at eval.c:3009
  #3  0x099485bb in ?? ()
  #4  0x00000000 in ?? ()

The output from "set debug frame 1" is:

  Thread 1 hit Breakpoint 2, [frame] frame_unwind_find_by_frame: enter
    [frame] frame_unwind_find_by_frame: this_frame=0
    [frame] frame_unwind_try_unwinder: trying unwinder "dummy"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "inline"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "jit"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "python"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "i386 epilogue"
    [frame] frame_unwind_try_unwinder: no
    [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
    [frame] frame_unwind_try_unwinder: yes
  [frame] frame_unwind_find_by_frame: exit
  Fskip_chars_backward ([frame] get_prev_frame_always_1: enter
    [frame] get_prev_frame_always_1: this_frame=0
    [frame] compute_frame_id: enter
      [frame] compute_frame_id: fi=0
      [frame] get_frame_func_if_available: this_frame=0 -> 0x12c3e1b
      [frame] frame_id_p: l={!stack, !code, !special} -> 0
      [frame] frame_unwind_register_value: frame=-1, regnum=5(ebp)
      [frame] frame_unwind_register_value:   -> register=5 bytes=[08808200]
      [frame] frame_unwind_arch: next_frame=0 -> i386
      [frame] frame_unwind_register_value: frame=0, regnum=8(eip)
      [frame] frame_unwind_register_value:   -> address=0x82800c bytes=[087f9509]
      [frame] frame_unwind_register_value: frame=0, regnum=4(esp)
      [frame] frame_unwind_register_value:   -> computed bytes=[10808200]
      [frame] frame_id_p: l={stack=0x828010, code=0x12c3e1b, !special} -> 1
      [frame] compute_frame_id:   -> {stack=0x828010, code=0x12c3e1b, !special}
    [frame] compute_frame_id: exit
    [frame] get_prev_frame_raw:   -> {level=1, type=<unknown>, unwind=<unknown>, pc=<unknown>, id=<not computed>, func=<unknown>}
    [frame] compute_frame_id: enter
      [frame] compute_frame_id: fi=1
      [frame] frame_unwind_find_by_frame: enter
	[frame] frame_unwind_find_by_frame: this_frame=1
	[frame] frame_unwind_try_unwinder: trying unwinder "dummy"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "inline"
	[frame] frame_unwind_register_value: frame=0, regnum=8(eip)
	[frame] frame_unwind_register_value:   -> address=0x82800c bytes=[087f9509]
	[frame] frame_unwind_pc: this_frame=0 -> 0x9957f08
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "jit"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "python"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "i386 epilogue"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 signal"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "i386 stack tramp"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "i386 sigtramp"
	[frame] frame_unwind_try_unwinder: no
	[frame] frame_unwind_try_unwinder: trying unwinder "i386 prologue"
	[frame] frame_unwind_try_unwinder: yes
      [frame] frame_unwind_find_by_frame: exit
      [frame] get_frame_func_if_available: this_frame=1 -> 0x0
      [frame] frame_unwind_register_value: frame=0, regnum=5(ebp)
      [frame] frame_unwind_register_value:   -> address=0x828008 bytes=[e8818200]
      [frame] frame_id_p: l={stack=0x8281f0, code=0x0, !special} -> 1
      [frame] compute_frame_id:   -> {stack=0x8281f0, code=0x0, !special}
    [frame] compute_frame_id: exit
  [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: enter
    [frame] get_prev_frame_always_1: enter
      [frame] get_prev_frame_always_1: this_frame=0
      [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
    [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: exit
  [frame] get_prev_frame: enter
    [frame] get_prev_frame_always_1: enter
      [frame] get_prev_frame_always_1: this_frame=0
      [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
    [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: exit
  string=-9223372036743969912, [frame] get_prev_frame_always_1: enter
    [frame] get_prev_frame_always_1: this_frame=0
    [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
  [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: enter
    [frame] get_prev_frame_always_1: enter
      [frame] get_prev_frame_always_1: this_frame=0
      [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
    [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: exit
  [frame] get_prev_frame: enter
    [frame] get_prev_frame_always_1: enter
      [frame] get_prev_frame_always_1: this_frame=0
      [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
    [frame] get_prev_frame_always_1: exit
  [frame] get_prev_frame: exit
  lim=0[frame] frame_id_p: l={!stack, !code, !special} -> 0
  ) at syntax.c:1610
  1610	  return skip_chars (0, string, lim, 1);
  [frame] get_prev_frame_always_1: enter
    [frame] get_prev_frame_always_1: this_frame=-1
    [frame] get_prev_frame_always_1:   -> {level=0, type=NORMAL_FRAME, unwind=0x9f16e0, pc=0x12c3e39, id={stack=0x828010, code=0x12c3e1b, !special}, func=0x12c3e1b} // cached
  [frame] get_prev_frame_always_1: exit

Let me know if you want to see the output for "long" backtrace as
well, and whether you'd like to see it for the case where GDB is able
to show all the frames back to 'main'.

TIA

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29  8:03         ` Eli Zaretskii
@ 2021-03-29 14:33           ` Simon Marchi
  2021-03-29 14:57             ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Marchi @ 2021-03-29 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches



On 2021-03-29 4:03 a.m., Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Date: Thu, 25 Mar 2021 10:19:53 -0400
>>
>> It would be interesting to see the output of your two backtraces, with
>> "set debug frame 1" enabled (with my patches applied).
> 
> I've built the GDB snapshot from Mar 25, after applying the patches
> from your branch wrt master.
> 
> I attach below the output for the "short" backtrace.  The backtrace
> itself is just this:
> 
>   #0  Fskip_chars_backward (string=-9223372036743969912, lim=0) at syntax.c:1610
>   #1  0x09957f08 in ?? ()
>   #2  0x01260805 in Ffuncall (nargs=1, args=0x8282e0) at eval.c:3009
>   #3  0x099485bb in ?? ()
>   #4  0x00000000 in ?? ()
> 
> The output from "set debug frame 1" is:
> 
>   Thread 1 hit Breakpoint 2, [frame] frame_unwind_find_by_frame: enter
>     [frame] frame_unwind_find_by_frame: this_frame=0
>     [frame] frame_unwind_try_unwinder: trying unwinder "dummy"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "inline"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "jit"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "python"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "i386 epilogue"
>     [frame] frame_unwind_try_unwinder: no
>     [frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
>     [frame] frame_unwind_try_unwinder: yes
>   [frame] frame_unwind_find_by_frame: exit
>   Fskip_chars_backward ([frame] get_prev_frame_always_1: enter
>     [frame] get_prev_frame_always_1: this_frame=0
>     [frame] compute_frame_id: enter
>       [frame] compute_frame_id: fi=0
>       [frame] get_frame_func_if_available: this_frame=0 -> 0x12c3e1b
>       [frame] frame_id_p: l={!stack, !code, !special} -> 0
>       [frame] frame_unwind_register_value: frame=-1, regnum=5(ebp)
>       [frame] frame_unwind_register_value:   -> register=5 bytes=[08808200]
>       [frame] frame_unwind_arch: next_frame=0 -> i386
>       [frame] frame_unwind_register_value: frame=0, regnum=8(eip)
>       [frame] frame_unwind_register_value:   -> address=0x82800c bytes=[087f9509]
>       [frame] frame_unwind_register_value: frame=0, regnum=4(esp)
>       [frame] frame_unwind_register_value:   -> computed bytes=[10808200]
>       [frame] frame_id_p: l={stack=0x828010, code=0x12c3e1b, !special} -> 1
>       [frame] compute_frame_id:   -> {stack=0x828010, code=0x12c3e1b, !special}
>     [frame] compute_frame_id: exit
>     [frame] get_prev_frame_raw:   -> {level=1, type=<unknown>, unwind=<unknown>, pc=<unknown>, id=<not computed>, func=<unknown>}
>     [frame] compute_frame_id: enter
>       [frame] compute_frame_id: fi=1
>       [frame] frame_unwind_find_by_frame: enter
> 	[frame] frame_unwind_find_by_frame: this_frame=1
> 	[frame] frame_unwind_try_unwinder: trying unwinder "dummy"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 tailcall"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "inline"
> 	[frame] frame_unwind_register_value: frame=0, regnum=8(eip)
> 	[frame] frame_unwind_register_value:   -> address=0x82800c bytes=[087f9509]
> 	[frame] frame_unwind_pc: this_frame=0 -> 0x9957f08
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "jit"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "python"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "i386 epilogue"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "dwarf2 signal"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "i386 stack tramp"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "i386 sigtramp"
> 	[frame] frame_unwind_try_unwinder: no
> 	[frame] frame_unwind_try_unwinder: trying unwinder "i386 prologue"
> 	[frame] frame_unwind_try_unwinder: yes
>       [frame] frame_unwind_find_by_frame: exit
>       [frame] get_frame_func_if_available: this_frame=1 -> 0x0
>       [frame] frame_unwind_register_value: frame=0, regnum=5(ebp)
>       [frame] frame_unwind_register_value:   -> address=0x828008 bytes=[e8818200]
>       [frame] frame_id_p: l={stack=0x8281f0, code=0x0, !special} -> 1
>       [frame] compute_frame_id:   -> {stack=0x8281f0, code=0x0, !special}
>     [frame] compute_frame_id: exit
>   [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: enter
>     [frame] get_prev_frame_always_1: enter
>       [frame] get_prev_frame_always_1: this_frame=0
>       [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
>     [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: exit
>   [frame] get_prev_frame: enter
>     [frame] get_prev_frame_always_1: enter
>       [frame] get_prev_frame_always_1: this_frame=0
>       [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
>     [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: exit
>   string=-9223372036743969912, [frame] get_prev_frame_always_1: enter
>     [frame] get_prev_frame_always_1: this_frame=0
>     [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
>   [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: enter
>     [frame] get_prev_frame_always_1: enter
>       [frame] get_prev_frame_always_1: this_frame=0
>       [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
>     [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: exit
>   [frame] get_prev_frame: enter
>     [frame] get_prev_frame_always_1: enter
>       [frame] get_prev_frame_always_1: this_frame=0
>       [frame] get_prev_frame_always_1:   -> {level=1, type=NORMAL_FRAME, unwind=0xa297c0, pc=0x9957f08, id={stack=0x8281f0, code=0x0, !special}, func=0x0} // cached
>     [frame] get_prev_frame_always_1: exit
>   [frame] get_prev_frame: exit
>   lim=0[frame] frame_id_p: l={!stack, !code, !special} -> 0
>   ) at syntax.c:1610
>   1610	  return skip_chars (0, string, lim, 1);
>   [frame] get_prev_frame_always_1: enter
>     [frame] get_prev_frame_always_1: this_frame=-1
>     [frame] get_prev_frame_always_1:   -> {level=0, type=NORMAL_FRAME, unwind=0x9f16e0, pc=0x12c3e39, id={stack=0x828010, code=0x12c3e1b, !special}, func=0x12c3e1b} // cached
>   [frame] get_prev_frame_always_1: exit
> 
> Let me know if you want to see the output for "long" backtrace as
> well, and whether you'd like to see it for the case where GDB is able
> to show all the frames back to 'main'.
> 
> TIA
> 

That seems to be the output of "set debug frame 1" while you "run" until
the location.  We would need the output of "set debug frame 1" while you
execute the "backtrace" command.

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 14:33           ` Simon Marchi
@ 2021-03-29 14:57             ` Eli Zaretskii
  2021-03-29 15:32               ` Simon Marchi
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-29 14:57 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

[-- Attachment #1: Type: text/plain, Size: 483 bytes --]

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Mon, 29 Mar 2021 10:33:33 -0400
> 
> That seems to be the output of "set debug frame 1" while you "run" until
> the location.  We would need the output of "set debug frame 1" while you
> execute the "backtrace" command.

Sorry, I thought you wanted it the other way around, since the sample
output you showed me seemed to be like that.

Anyway, the output during "bt" (compressed) is attached.


[-- Attachment #2: gdb2.txt.xz --]
[-- Type: application/octet-stream, Size: 7000 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 14:57             ` Eli Zaretskii
@ 2021-03-29 15:32               ` Simon Marchi
  2021-03-29 15:37                 ` Eli Zaretskii
  2021-04-06 15:18                 ` Eli Zaretskii
  0 siblings, 2 replies; 17+ messages in thread
From: Simon Marchi @ 2021-03-29 15:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches



On 2021-03-29 10:57 a.m., Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Date: Mon, 29 Mar 2021 10:33:33 -0400
>>
>> That seems to be the output of "set debug frame 1" while you "run" until
>> the location.  We would need the output of "set debug frame 1" while you
>> execute the "backtrace" command.
> 
> Sorry, I thought you wanted it the other way around, since the sample
> output you showed me seemed to be like that.
> 
> Anyway, the output during "bt" (compressed) is attached.

Ok, so given this backtrace:

    #0  Fskip_chars_backward (string=XIL(0x8000000006a71ad0), lim=XIL(0)) at syntax.c:1610
    #1  0x09957f08 in ?? ()
    #2  0x01260805 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
    #3  0x099485bb in ?? ()
    #4  0x00000000 in ?? ()

What I can see is that the following unwinders are used for each frame:

    #0 dwarf2
    #1 i386 prologue
    #2 dwarf2
    #3 i386 prologue
    #4 i386 prologue

So we can assume that going from frame 2 to 3 went well.  The code
address in frame 3 makes sense, it is close to the address in frame 1,
and I assume it's the jit-ed code.  It looks like going from frame 3 to
frame 4 went wrong (the code address 0x00000000 is a sign of that).  The
backtrace is stopped because when GDB encounters a frame with PC == 0,
it stops:

    [frame] get_prev_frame: enter
      [frame] frame_debug_got_null_frame: this_frame=4 -> zero PC
    [frame] get_prev_frame: exit

    https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/frame.c;h=dc9fdd41646fbb52463ec8bbc73d670011ed6448;hb=HEAD#l2589

So, the JIT code produces code and a stack layout that GDB's prologue
analyzer doesn't understand.  As mentioned previously, you have two
options: looking into having dwarf info produced for the jit-ed code, or
tweaking GDB's prologue analyzer to make it understand your code and
stack.

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 15:32               ` Simon Marchi
@ 2021-03-29 15:37                 ` Eli Zaretskii
  2021-03-29 15:39                   ` Simon Marchi
  2021-04-06 15:18                 ` Eli Zaretskii
  1 sibling, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-29 15:37 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Mon, 29 Mar 2021 11:32:14 -0400
> 
> So, the JIT code produces code and a stack layout that GDB's prologue
> analyzer doesn't understand.  As mentioned previously, you have two
> options: looking into having dwarf info produced for the jit-ed code, or
> tweaking GDB's prologue analyzer to make it understand your code and
> stack.

Thanks.

The JIT code was produced by libgccjit, so is it reasonable that its
prologue is different from the one produced by GCC?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 15:37                 ` Eli Zaretskii
@ 2021-03-29 15:39                   ` Simon Marchi
  2021-03-29 16:55                     ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Marchi @ 2021-03-29 15:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On 2021-03-29 11:37 a.m., Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Date: Mon, 29 Mar 2021 11:32:14 -0400
>>
>> So, the JIT code produces code and a stack layout that GDB's prologue
>> analyzer doesn't understand.  As mentioned previously, you have two
>> options: looking into having dwarf info produced for the jit-ed code, or
>> tweaking GDB's prologue analyzer to make it understand your code and
>> stack.
> 
> Thanks.
> 
> The JIT code was produced by libgccjit, so is it reasonable that its
> prologue is different from the one produced by GCC?

Maybe, I have no idea as I have never used it.

Simon

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 15:39                   ` Simon Marchi
@ 2021-03-29 16:55                     ` Eli Zaretskii
  2021-03-29 17:13                       ` Simon Marchi
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2021-03-29 16:55 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Mon, 29 Mar 2021 11:39:41 -0400
> 
> On 2021-03-29 11:37 a.m., Eli Zaretskii wrote:
> >> Cc: gdb-patches@sourceware.org
> >> From: Simon Marchi <simon.marchi@polymtl.ca>
> >> Date: Mon, 29 Mar 2021 11:32:14 -0400
> >>
> >> So, the JIT code produces code and a stack layout that GDB's prologue
> >> analyzer doesn't understand.  As mentioned previously, you have two
> >> options: looking into having dwarf info produced for the jit-ed code, or
> >> tweaking GDB's prologue analyzer to make it understand your code and
> >> stack.
> > 
> > Thanks.
> > 
> > The JIT code was produced by libgccjit, so is it reasonable that its
> > prologue is different from the one produced by GCC?
> 
> Maybe, I have no idea as I have never used it.

I'm out of my depth here, but just FTR, I took one function whose name
is shown as "??" and used "x/10i" to display instructions back from
$pc until I saw RET.  I presume that's where this anonymous function
starts.  Here's what I see immediately following that RET instruction:

   0x996e6b2:   lea    0x0(%esi,%eiz,1),%esi
   0x996e6b9:   lea    0x0(%esi,%eiz,1),%esi
   0x996e6c0:   push   %ebp
   0x996e6c1:   mov    %esp,%ebp
   0x996e6c3:   push   %edi
   0x996e6c4:   push   %esi
   0x996e6c5:   push   %ebx
   0x996e6c6:   sub    $0x8c,%esp

Which AFAIK is the "normal" prologue (perhaps modulo the 2 lea
instructions)?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 16:55                     ` Eli Zaretskii
@ 2021-03-29 17:13                       ` Simon Marchi
  0 siblings, 0 replies; 17+ messages in thread
From: Simon Marchi @ 2021-03-29 17:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches

On 2021-03-29 12:55 p.m., Eli Zaretskii wrote:
>> Cc: gdb-patches@sourceware.org
>> From: Simon Marchi <simon.marchi@polymtl.ca>
>> Date: Mon, 29 Mar 2021 11:39:41 -0400
>>
>> On 2021-03-29 11:37 a.m., Eli Zaretskii wrote:
>>>> Cc: gdb-patches@sourceware.org
>>>> From: Simon Marchi <simon.marchi@polymtl.ca>
>>>> Date: Mon, 29 Mar 2021 11:32:14 -0400
>>>>
>>>> So, the JIT code produces code and a stack layout that GDB's prologue
>>>> analyzer doesn't understand.  As mentioned previously, you have two
>>>> options: looking into having dwarf info produced for the jit-ed code, or
>>>> tweaking GDB's prologue analyzer to make it understand your code and
>>>> stack.
>>>
>>> Thanks.
>>>
>>> The JIT code was produced by libgccjit, so is it reasonable that its
>>> prologue is different from the one produced by GCC?
>>
>> Maybe, I have no idea as I have never used it.
> 
> I'm out of my depth here, but just FTR, I took one function whose name
> is shown as "??" and used "x/10i" to display instructions back from
> $pc until I saw RET.  I presume that's where this anonymous function
> starts.  Here's what I see immediately following that RET instruction:
> 
>    0x996e6b2:   lea    0x0(%esi,%eiz,1),%esi
>    0x996e6b9:   lea    0x0(%esi,%eiz,1),%esi
>    0x996e6c0:   push   %ebp
>    0x996e6c1:   mov    %esp,%ebp
>    0x996e6c3:   push   %edi
>    0x996e6c4:   push   %esi
>    0x996e6c5:   push   %ebx
>    0x996e6c6:   sub    $0x8c,%esp
> 
> Which AFAIK is the "normal" prologue (perhaps modulo the 2 lea
> instructions)?

"push %ebp" and "mov %esp, %ebp" is the start of a typical prologue.  So
the above looks quite typical.

The lea instructions are just NOPs for alignment:

    https://stackoverflow.com/questions/2553517/what-is-register-eiz

When GDB attempts to analyze the prologue of the function of frame #3
(from your initial example), then it must have some way of finding the
start of the function that PC "0x099485bb" is in.  We want it to find
the PC of the "push %ebp" instruction, if it goes back too far and finds
the PC of one of those lea instructions, then it won't look at the right
instructions when analyzing the prologue.  It would be good to check how
it does that, and if it gets that right.

I think you would need to inspect the beginning of i386_frame_cache_1,
where it does:

  cache->pc = get_frame_func (this_frame);

That PC should point to "push %ebp", ideally.  It is passed to
i386_analyze_prologue a few lines below, in i386_frame_cache_1:

  if (cache->pc != 0)
    i386_analyze_prologue (gdbarch, cache->pc, get_frame_pc (this_frame),
			   cache);

Simon


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Need help with understanding truncated and corrupted backtraces
  2021-03-29 15:32               ` Simon Marchi
  2021-03-29 15:37                 ` Eli Zaretskii
@ 2021-04-06 15:18                 ` Eli Zaretskii
  1 sibling, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2021-04-06 15:18 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches

> Cc: gdb-patches@sourceware.org
> From: Simon Marchi <simon.marchi@polymtl.ca>
> Date: Mon, 29 Mar 2021 11:32:14 -0400
> 
> Ok, so given this backtrace:
> 
>     #0  Fskip_chars_backward (string=XIL(0x8000000006a71ad0), lim=XIL(0)) at syntax.c:1610
>     #1  0x09957f08 in ?? ()
>     #2  0x01260805 in Ffuncall (nargs=1, args=0x828880) at eval.c:3009
>     #3  0x099485bb in ?? ()
>     #4  0x00000000 in ?? ()
> 
> What I can see is that the following unwinders are used for each frame:
> 
>     #0 dwarf2
>     #1 i386 prologue
>     #2 dwarf2
>     #3 i386 prologue
>     #4 i386 prologue
> 
> So we can assume that going from frame 2 to 3 went well.  The code
> address in frame 3 makes sense, it is close to the address in frame 1,
> and I assume it's the jit-ed code.  It looks like going from frame 3 to
> frame 4 went wrong (the code address 0x00000000 is a sign of that).  The
> backtrace is stopped because when GDB encounters a frame with PC == 0,
> it stops:
> 
>     [frame] get_prev_frame: enter
>       [frame] frame_debug_got_null_frame: this_frame=4 -> zero PC
>     [frame] get_prev_frame: exit
> 
>     https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/frame.c;h=dc9fdd41646fbb52463ec8bbc73d670011ed6448;hb=HEAD#l2589
> 
> So, the JIT code produces code and a stack layout that GDB's prologue
> analyzer doesn't understand.  As mentioned previously, you have two
> options: looking into having dwarf info produced for the jit-ed code, or
> tweaking GDB's prologue analyzer to make it understand your code and
> stack.

FTR: this problem was caused by GDB's being unaware of some shared
libraries that were loaded by the inferior.  Emacs with
native-compilation feature compiles Lisp packages into shared
libraries using libgccjit, and then loads them at run time when they
are first needed.  The problem is that windows-nat.c uses an
unreliable technique to glean the DLL name from the information
provided by the debug event that announces the loading of the DLL.
Then the symbols from that DLL were not read, and in fact GDB was
completely unaware that the DLL was at all loaded.  For some reason,
this problem only happened for some DLLs, but not for others.

The patch I presented in

  https://sourceware.org/pipermail/gdb-patches/2021-April/177513.html

fixes the problem with recording the DLLs loaded at run time, and that
completely eliminated the problems with backtraces.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-04-06 15:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-16 13:24 Need help with understanding truncated and corrupted backtraces Eli Zaretskii
2021-03-24 20:05 ` Eli Zaretskii
2021-03-24 22:48 ` Simon Marchi
2021-03-25  3:47   ` Simon Marchi
2021-03-25 10:09     ` Eli Zaretskii
2021-03-25 14:19       ` Simon Marchi
2021-03-26  7:19         ` Eli Zaretskii
2021-03-29  8:03         ` Eli Zaretskii
2021-03-29 14:33           ` Simon Marchi
2021-03-29 14:57             ` Eli Zaretskii
2021-03-29 15:32               ` Simon Marchi
2021-03-29 15:37                 ` Eli Zaretskii
2021-03-29 15:39                   ` Simon Marchi
2021-03-29 16:55                     ` Eli Zaretskii
2021-03-29 17:13                       ` Simon Marchi
2021-04-06 15:18                 ` Eli Zaretskii
2021-03-25 10:04   ` Eli Zaretskii

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).