On Mon, Jan 30, 2023 at 9:04 PM Simon Marchi via Gdb-patches < gdb-patches@sourceware.org> wrote: > The test gdb.base/frame-view.exp fails like this on AArch64: > > frame^M > #0 baz (z1=hahaha, /home/simark/src/binutils-gdb/gdb/value.c:4056: > internal-error: value_fetch_lazy_register: Assertion `next_frame != NULL' > failed.^M > A problem internal to GDB has been detected,^M > further debugging may prove unreliable.^M > FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB > internal error) > > The sequence of events leading to this is the following: > > - When we create the user frame (the "select-frame view" command), we > create a sentinel frame just for our user-created frame, in > create_new_frame. This sentinel frame has the same id as the regular > sentinel frame. > > - When printing the frame, after doing the "select-frame view" command, > the argument's pretty printer is invoked, which does an inferior > function call (this is the point of the test). This clears the frame > cache, including the "real" sentinel frame, which sets the > sentinel_frame global to nullptr. > > - Later in the frame-printing process (when printing the second > argument), the auto-reinflation mechanism re-creates the user frame > by calling create_new_frame again, creating its own special sentinel > frame again. However, note that the "real" sentinel frame, the > sentinel_frame global, is still nullptr. If the selected frame had > been a regular frame, we would have called get_current_frame at some > point during the reinflation, which would have re-created the "real" > sentinel frame. But it's not the case when reinflating a user frame. > > - Deep down the stack, something wants to fill in the unwind stop > reason for frame 0, which requires trying to unwind frame 1. This > leads us to trying to unwind the PC of frame 1: > > #0 gdbarch_unwind_pc (gdbarch=0xffff8d010080, next_frame=...) at > /home/simark/src/binutils-gdb/gdb/gdbarch.c:2955 > #1 0x000000000134569c in dwarf2_tailcall_sniffer_first > (this_frame=..., tailcall_cachep=0xffff773fcae0, > entry_cfa_sp_offsetp=0xfffff7f7d450) > at /home/simark/src/binutils-gdb/gdb/dwarf2/frame-tailcall.c:390 > #2 0x0000000001355d84 in dwarf2_frame_cache (this_frame=..., > this_cache=0xffff773fc928) at > /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1089 > #3 0x00000000013562b0 in dwarf2_frame_unwind_stop_reason > (this_frame=..., this_cache=0xffff773fc928) at > /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1101 > #4 0x0000000001990f64 in get_prev_frame_always_1 (this_frame=...) at > /home/simark/src/binutils-gdb/gdb/frame.c:2281 > #5 0x0000000001993034 in get_prev_frame_always (this_frame=...) at > /home/simark/src/binutils-gdb/gdb/frame.c:2376 > #6 0x000000000199b814 in get_frame_unwind_stop_reason (frame=...) at > /home/simark/src/binutils-gdb/gdb/frame.c:3051 > #7 0x0000000001359cd8 in dwarf2_frame_cfa (this_frame=...) at > /home/simark/src/binutils-gdb/gdb/dwarf2/frame.c:1356 > #8 0x000000000132122c in dwarf_expr_context::execute_stack_op > (this=0xfffff7f80170, op_ptr=0xffff8d8883ee "\217\002", > op_end=0xffff8d8883ee "\217\002") > at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:2110 > #9 0x0000000001317b30 in dwarf_expr_context::eval > (this=0xfffff7f80170, addr=0xffff8d8883ed "\234\217\002", len=1) at > /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239 > #10 0x000000000131d68c in dwarf_expr_context::execute_stack_op > (this=0xfffff7f80170, op_ptr=0xffff8d88840e "", op_end=0xffff8d88840e "") > at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1811 > #11 0x0000000001317b30 in dwarf_expr_context::eval > (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2) at > /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1239 > #12 0x0000000001314c3c in dwarf_expr_context::evaluate > (this=0xfffff7f80170, addr=0xffff8d88840c "\221p", len=2, as_lval=true, > per_cu=0xffff90b03700, frame=..., addr_info=0x0, > type=0xffff8f6c8400, subobj_type=0xffff8f6c8400, subobj_offset=0) > at /home/simark/src/binutils-gdb/gdb/dwarf2/expr.c:1078 > #13 0x000000000149f9e0 in dwarf2_evaluate_loc_desc_full > (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, > per_cu=0xffff90b03700, per_objfile=0xffff9070b980, > subobj_type=0xffff8f6c8400, subobj_byte_offset=0, as_lval=true) > at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1513 > #14 0x00000000014a0100 in dwarf2_evaluate_loc_desc > (type=0xffff8f6c8400, frame=..., data=0xffff8d88840c "\221p", size=2, > per_cu=0xffff90b03700, per_objfile=0xffff9070b980, as_lval=true) > at /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:1557 > #15 0x00000000014aa584 in locexpr_read_variable > (symbol=0xffff8f6cd770, frame=...) at > /home/simark/src/binutils-gdb/gdb/dwarf2/loc.c:3052 > > - AArch64 defines a special "prev register" function, > aarch64_dwarf2_prev_register, to handle unwinding the PC. This > function does > > frame_unwind_register_unsigned (this_frame, AARCH64_LR_REGNUM); > > - frame_unwind_register_unsigned ultimately creates a lazy register > value, saving the frame id of this_frame->next. this_frame is the > user-created frame, to this_frame->next is the special sentinel frame > we created for it. So the saved ID is the sentinel frame ID. > > - When time comes to un-lazify the value, value_fetch_lazy_register > calls frame_find_by_id, to find the frame with the ID we saved. > > - frame_find_by_id sees it's the sentinel frame ID, so returns the > sentinel_frame global, which is, if you remember, nullptr. > > - We hit the `gdb_assert (next_frame != NULL)` assertion in > value_fetch_lazy_register. > > The issues I see here are: > > - The ID of the sentinel frame created for the user-created frame is > not distinguishable from the ID of the regular sentinel frame. So > there's no way frame_find_by_id could find the right frame, in > value_fetch_lazy_register. > - Even if they had distinguishable IDs, sentinel frames created for > user frames are not registered anywhere, so there's no easy way > frame_find_by_id could find it. > > This patch addresses these two issues: > > - Give sentinel frames created for user frames their own distinct IDs > - Register sentinel frames in the frame cache, so they can be found > with frame_find_by_id. > > I initially had this split in two patches, but I then found that it was > easier to explain as a single patch. > > Rergarding the first part of the change: with this patch, the sentinel > frames created for user frames (in create_new_frame) still have > stack_status == FID_STACK_SENTINEL, but their code_addr and stack_addr > fields are now filled with the addresses used to create the user frame. > This ensures this sentinel frame ID is different from the "target" > sentinel frame ID, as well as any other "user" sentinel frame ID. If > the user tries to create the same frame, with the same addresses, > multiple times, create_sentinel_frame just reuses the existing frame. > So we won't end up with multiple user sentinels with the same ID. > > Regular "target" sentinel frames remain with code_addr and stack_addr > unset. > > The concrete changes for that part are: > > - Remove the sentinel_frame_id constant, since there isn't one > "sentinel frame ID" now. Add the frame_id_build_sentinel function > for building sentinel frame IDs and a is_sentinel_frame_id function > to check if a frame id represents a sentinel frame. > - Replace the sentinel_frame_id check in frame_find_by_id with a > comparison to `frame_id_build_sentinel (0, 0)`. The sentinel_frame > global is meant to contain a reference to the "target" sentinel, so > the one with addresses (0, 0). > - Add stack and code address parameters to create_sentinel_frame, to be > able to create the various types of sentinel frames. > - Adjust get_current_frame to create the regular "target" sentinel. > - Adjust create_new_frame to create a sentinel with the ID specific to > the created user frame. > - Adjust sentinel_frame_prev_register to get the sentinel frame ID from > the frame_info object, since there isn't a single "sentinel frame ID" > now. > - Change get_next_frame_sentinel_okay to check for a > sentinel-frame-id-like frame ID, rather than for sentinel_frame > specifically, since this function could be called with another > sentinel frame (and we would want the assert to catch it). > > The rest of the change is about registering the sentinel frame in the > frame cache: > > - Change frame_stash_add's assertion to allow sentinel frame levels > (-1). > - Make create_sentinel_frame add the frame to the frame cache. > - Change the "sentinel_frame != NULL" check in reinit_frame_cache for a > check that the frame stash is not empty. The idea is that if we only > have some user-created frames in the cache when reinit_frame_cache is > called, we probably want to emit the frames invalid annotation. The > goal of that check is to avoid unnecessary repeated annotations, I > suppose, so the "frame cache not empty" check should achieve that. > > After this change, I think we could theoritically get rid of the > sentienl_frame global. That sentinel frame could always be found by > looking up `frame_id_build_sentinel (0, 0)` in the frame cache. > However, I left the global there to avoid slowing the typical case down > for nothing. I however, noted in its comment that it is an > optimization. > > With this fix applied, the gdb.base/frame-view.exp now passes for me on > AArch64. value_of_register_lazy now saves the special sentinel frame ID > in the value, and value_fetch_lazy_register is able to find that > sentinel frame after the frame cache reinit and after the user-created > frame was reinflated. > > Change-Id: I8b77b3448822c8aab3e1c3dda76ec434eb62704f > --- > gdb/frame-id.h | 10 ++++--- > gdb/frame.c | 63 ++++++++++++++++++++++++++++++++++---------- > gdb/frame.h | 10 +++++++ > gdb/sentinel-frame.c | 5 +++- > 4 files changed, 70 insertions(+), 18 deletions(-) > > diff --git a/gdb/frame-id.h b/gdb/frame-id.h > index 5978d99bba8c..8ddf7d114082 100644 > --- a/gdb/frame-id.h > +++ b/gdb/frame-id.h > @@ -128,12 +128,16 @@ struct frame_id > /* For convenience. All fields are zero. This means "there is no > frame". */ > extern const struct frame_id null_frame_id; > > -/* Sentinel frame. */ > -extern const struct frame_id sentinel_frame_id; > - > /* This means "there is no frame ID, but there is a frame". It should be > replaced by best-effort frame IDs for the outermost frame, somehow. > The implementation is only special_addr_p set. */ > extern const struct frame_id outer_frame_id; > > +/* Return true if ID represents a sentinel frame. */ > +static inline bool > +is_sentinel_frame_id (frame_id id) > +{ > + return id.stack_status == FID_STACK_SENTINEL; > +} > + > #endif /* ifdef GDB_FRAME_ID_H */ > diff --git a/gdb/frame.c b/gdb/frame.c > index fed961b2a8df..9235a2ceb38c 100644 > --- a/gdb/frame.c > +++ b/gdb/frame.c > @@ -49,7 +49,10 @@ > innermost frame. > > The current frame, which is the innermost frame, can be found at > - sentinel_frame->prev. */ > + sentinel_frame->prev. > + > + This is an optimization to be able to find the sentinel frame quickly, > + it could otherwise be found in the frame cache. */ > > static frame_info *sentinel_frame; > > @@ -294,8 +297,8 @@ frame_stash_create (void) > static bool > frame_stash_add (frame_info *frame) > { > - /* Do not try to stash the sentinel frame. */ > - gdb_assert (frame->level >= 0); > + /* Valid frame levels are -1 (sentinel frames) and above. */ > + gdb_assert (frame->level >= -1); > > frame_info **slot = (frame_info **) htab_find_slot (frame_stash, > frame, INSERT); > @@ -681,7 +684,6 @@ frame_unwind_caller_id (frame_info_ptr next_frame) > } > > const struct frame_id null_frame_id = { 0 }; /* All zeros. */ > -const struct frame_id sentinel_frame_id = { 0, 0, 0, FID_STACK_SENTINEL, > 0, 1, 0 }; > const struct frame_id outer_frame_id = { 0, 0, 0, FID_STACK_OUTER, 0, 1, > 0 }; > > struct frame_id > @@ -750,6 +752,29 @@ frame_id_build_wild (CORE_ADDR stack_addr) > return id; > } > > +/* See frame.h. */ > + > +frame_id > +frame_id_build_sentinel (CORE_ADDR stack_addr, CORE_ADDR code_addr) > +{ > + frame_id id = null_frame_id; > + > + id.stack_status = FID_STACK_SENTINEL; > + id.special_addr_p = 1; > + > + if (stack_addr != 0 || code_addr != 0) > + { > + /* The purpose of saving these in the sentinel frame ID is to be > able to > + differentiate the IDs of several sentinel frames that could exist > + simultaneously in the frame cache. */ > + id.stack_addr = stack_addr; > + id.code_addr = code_addr; > + id.code_addr_p = 1; > + } > + > + return id; > +} > + > bool > frame_id_p (frame_id l) > { > @@ -896,7 +921,7 @@ frame_find_by_id (struct frame_id id) > return NULL; > > /* Check for the sentinel frame. */ > - if (id == sentinel_frame_id) > + if (id == frame_id_build_sentinel (0, 0)) > return frame_info_ptr (sentinel_frame); > > /* Try using the frame stash first. Finding it there removes the need > @@ -1587,10 +1612,14 @@ put_frame_register_bytes (frame_info_ptr frame, > int regnum, > } > } > > -/* Create a sentinel frame. */ > +/* Create a sentinel frame. > > -static frame_info * > -create_sentinel_frame (struct program_space *pspace, struct regcache > *regcache) > + See frame_id_build_sentinel for the description of STACK_ADDR and > + CODE_ADDR. */ > + > +static frame_info_ptr > +create_sentinel_frame (struct program_space *pspace, struct regcache > *regcache, > + CORE_ADDR stack_addr, CORE_ADDR code_addr) > { > frame_info *frame = FRAME_OBSTACK_ZALLOC (struct frame_info); > > @@ -1608,11 +1637,14 @@ create_sentinel_frame (struct program_space > *pspace, struct regcache *regcache) > frame->next = frame; > /* The sentinel frame has a special ID. */ > frame->this_id.p = frame_id_status::COMPUTED; > - frame->this_id.value = sentinel_frame_id; > + frame->this_id.value = frame_id_build_sentinel (stack_addr, code_addr); > + > + bool added = frame_stash_add (frame); > + gdb_assert (added); > > frame_debug_printf (" -> %s", frame->to_string ().c_str ()); > > - return frame; > + return frame_info_ptr (frame); > } > > /* Cache for frame addresses already read by gdb. Valid only while > @@ -1654,7 +1686,8 @@ get_current_frame (void) > > if (sentinel_frame == NULL) > sentinel_frame = > - create_sentinel_frame (current_program_space, get_current_regcache > ()); > + create_sentinel_frame (current_program_space, get_current_regcache > (), > + 0, 0).get (); > > /* Set the current frame before computing the frame id, to avoid > recursion inside compute_frame_id, in case the frame's > @@ -1980,7 +2013,8 @@ create_new_frame (frame_id id) > frame_info *fi = FRAME_OBSTACK_ZALLOC (struct frame_info); > > fi->next = create_sentinel_frame (current_program_space, > - get_current_regcache ()); > + get_current_regcache (), > + id.stack_addr, id.code_addr).get (); > > /* Set/update this frame's cached PC value, found in the next frame. > Do this before looking for this frame's unwinder. A sniffer is > @@ -2044,7 +2078,8 @@ get_next_frame_sentinel_okay (frame_info_ptr > this_frame) > is the sentinel frame. But we disallow it here anyway because > calling get_next_frame_sentinel_okay() on the sentinel frame > is likely a coding error. */ > - gdb_assert (this_frame != sentinel_frame); > + if (this_frame->this_id.p == frame_id_status::COMPUTED) > + gdb_assert (!is_sentinel_frame_id (this_frame->this_id.value)); > > return frame_info_ptr (this_frame->next); > } > @@ -2064,7 +2099,7 @@ reinit_frame_cache (void) > { > ++frame_cache_generation; > > - if (sentinel_frame != NULL) > + if (htab_elements (frame_stash) > 0) > annotate_frames_invalid (); > > invalidate_selected_frame (); > diff --git a/gdb/frame.h b/gdb/frame.h > index c4f7c121391b..4a99bd660dfc 100644 > --- a/gdb/frame.h > +++ b/gdb/frame.h > @@ -159,6 +159,16 @@ extern struct frame_id > as the special identifier address are set to indicate wild cards. */ > extern struct frame_id frame_id_build_wild (CORE_ADDR stack_addr); > > +/* Construct a frame ID for a sentinel frame. > + > + If either STACK_ADDR or CODE_ADDR is not 0, the ID represents a > sentinel > + frame for a user-created frame. STACK_ADDR and CODE_ADDR are the > addresses > + used to create the frame. > + > + If STACK_ADDR and CODE_ADDR are both 0, the ID represents a regular > sentinel > + frame (i.e. the "next" frame of the target's current frame). */ > +extern frame_id frame_id_build_sentinel (CORE_ADDR stack_addr, CORE_ADDR > code_addr); > + > /* Returns true when L is a valid frame. */ > extern bool frame_id_p (frame_id l); > > diff --git a/gdb/sentinel-frame.c b/gdb/sentinel-frame.c > index f323de3bb5d3..3e5b9be2edf6 100644 > --- a/gdb/sentinel-frame.c > +++ b/gdb/sentinel-frame.c > @@ -50,8 +50,11 @@ sentinel_frame_prev_register (frame_info_ptr this_frame, > = (struct frame_unwind_cache *) *this_prologue_cache; > struct value *value; > > + frame_id this_frame_id = get_frame_id (this_frame); > + gdb_assert (is_sentinel_frame_id (this_frame_id)); > + > value = cache->regcache->cooked_read_value (regnum); > - VALUE_NEXT_FRAME_ID (value) = sentinel_frame_id; > + VALUE_NEXT_FRAME_ID (value) = this_frame_id; > > return value; > } > -- > 2.39.1 > > The commit message is awesome, very informative, I really enjoyed reading it. I can see the patch set fixes gdb.base/frame-view.exp on aarch64, I've seen this failure before the patch which does not occur after applying it. frame #0 baz (z1=hahaha, ../../binutils-gdb/gdb/value.c:4056: internal-error: v alue_fetch_lazy_register: Assertion `next_frame != NULL' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. ----- Backtrace ----- FAIL: gdb.base/frame-view.exp: with_pretty_printer=true: frame (GDB internal error) Resyncing due to internal error. 0x5a6aaf gdb_internal_backtrace_1 ../../binutils-gdb/gdb/bt-utils.c:122