From: Simon Marchi <simon.marchi@polymtl.ca>
To: Zoran Zaric <zoran.zaric@amd.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH v3 01/17] Replace the symbol needs evaluator with a parser
Date: Mon, 7 Jun 2021 17:48:14 -0400 [thread overview]
Message-ID: <395847ba-8138-018f-f618-528d1a22e9fa@polymtl.ca> (raw)
In-Reply-To: <20210528154648.60881-2-zoran.zaric@amd.com>
Hi Zoran,
I left some comments, but to go a bit faster I implemented the changes
in a fixup patch that I pasted below. If you are fine with the changes,
you can just integrate them in your patch.
> diff --git a/gdb/dwarf2/loc.c b/gdb/dwarf2/loc.c
> index e816f926696..2fa68ff0426 100644
> --- a/gdb/dwarf2/loc.c
> +++ b/gdb/dwarf2/loc.c
> @@ -2736,151 +2736,430 @@ dwarf2_compile_property_to_c (string_file *stream,
> data, data + size, per_cu, per_objfile);
> }
>
> -\f
> -/* Helper functions and baton for dwarf2_loc_desc_get_symbol_read_needs. */
> +/* Compute the correct symbol_needs_kind value for the location
> + expression in EXPR. */
>
> -class symbol_needs_eval_context : public dwarf_expr_context
> +static enum symbol_needs_kind
> +dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
> + dwarf2_per_cu_data *per_cu,
> + dwarf2_per_objfile *per_objfile,
> + bfd_endian byte_order,
> + int addr_size,
> + int ref_addr_size,
> + int depth = 0)
> {
> -public:
> - symbol_needs_eval_context (dwarf2_per_objfile *per_objfile)
> - : dwarf_expr_context (per_objfile)
> - {}
> + const gdb_byte *expr_end = expr.data () + expr.size ();
> + enum symbol_needs_kind symbol_needs = SYMBOL_NEEDS_NONE;
> + const int max_depth = 256;
> + std::vector <const gdb_byte *> next_op;
> + std::set <const gdb_byte *> visited_op;
"next_op" and "visited_op" should be plural (next_ops and
visited_ops). Although I'd renamed next_ops to ops_to_visit, I find
that a bit clearer.
Please add comments for the important variables here to say what they
are for (at least the set and the vector).
> - enum symbol_needs_kind needs;
> - struct dwarf2_per_cu_data *per_cu;
> + if (depth > max_depth)
> + error (_("DWARF-2 expression error: Loop detected (%d)."), depth);
At first I didn't undersand why this depth checking is needed. Given we
have the visited_ops set, we are sure to converge and not be stuck in an
infinite loop, even if there are loops in the expression. However,
after reading the code, I now understand that this depth check is for
expressions that call another expression with DW_OP_call*. So if for
example expression A calls expression B, which calls expression A, we
wouldn't catch it without this depth check. Is that right? If so,
maybe add a comment to explain this?
As for the error message, I don't think it's useful to print the depth.
It will just appear as "Loop detected (257)" to the user, which just
looks like a random number to them. It would be nice to print some
context, like what was being evaluated, or at least which objfile /
compilation unit contains the problematic expression(s), but that
doesn't seem easy.
> - /* Reads from registers do require a frame. */
> - CORE_ADDR read_addr_from_reg (int regnum) override
> - {
> - needs = SYMBOL_NEEDS_FRAME;
> - return 1;
> - }
> + depth++;
>
> - /* "get_reg_value" callback: Reads from registers do require a
> - frame. */
> + next_op.push_back (expr.data ());
>
> - struct value *get_reg_value (struct type *type, int regnum) override
> - {
> - needs = SYMBOL_NEEDS_FRAME;
> - return value_zero (type, not_lval);
> - }
> + while (!next_op.empty ())
> + {
> + const gdb_byte *op_ptr = next_op.back ();
I'd suggest popping the item out of next_ops and inserting it in
visited_ops right here, because we know we are unconditionally visiting
it now. And then, insert an op in next_ops only if it's not in
visited_ops. This way, we know there are only unvisited ops in
next_ops, and each op goes through next_ops exactly once.
>
> - /* Reads from memory do not require a frame. */
> - void read_mem (gdb_byte *buf, CORE_ADDR addr, size_t len) override
> - {
> - memset (buf, 0, len);
> - }
> + if (op_ptr >= expr_end
> + || visited_op.find(op_ptr) != visited_op.end())
Missing spaces.
> + {
> + next_op.pop_back ();
> + continue;
> + }
>
> - /* Frame-relative accesses do require a frame. */
> - void get_frame_base (const gdb_byte **start, size_t *length) override
> - {
> - static gdb_byte lit0 = DW_OP_lit0;
> + enum dwarf_location_atom op
> + = (enum dwarf_location_atom) *op_ptr++;
> + uint64_t uoffset;
> + int64_t offset;
>
> - *start = &lit0;
> - *length = 1;
> + /* The DWARF expression might have a bug causing an infinite
> + loop. In that case, quitting is the only way out. */
> + QUIT;
>
> - needs = SYMBOL_NEEDS_FRAME;
> - }
> + switch (op)
> + {
> + case DW_OP_lit0:
> + case DW_OP_lit1:
> + case DW_OP_lit2:
> + case DW_OP_lit3:
> + case DW_OP_lit4:
> + case DW_OP_lit5:
> + case DW_OP_lit6:
> + case DW_OP_lit7:
> + case DW_OP_lit8:
> + case DW_OP_lit9:
> + case DW_OP_lit10:
> + case DW_OP_lit11:
> + case DW_OP_lit12:
> + case DW_OP_lit13:
> + case DW_OP_lit14:
> + case DW_OP_lit15:
> + case DW_OP_lit16:
> + case DW_OP_lit17:
> + case DW_OP_lit18:
> + case DW_OP_lit19:
> + case DW_OP_lit20:
> + case DW_OP_lit21:
> + case DW_OP_lit22:
> + case DW_OP_lit23:
> + case DW_OP_lit24:
> + case DW_OP_lit25:
> + case DW_OP_lit26:
> + case DW_OP_lit27:
> + case DW_OP_lit28:
> + case DW_OP_lit29:
> + case DW_OP_lit30:
> + case DW_OP_lit31:
> + case DW_OP_stack_value:
At first, I thought that when encountering DW_OP_stack_value, we should
not push the following op to the list of ops to visit, because the spec
says:
The DW_OP_stack_value operation terminates the expression.
That sounds to me like when encountering this, you stop evaluating and
whatever is on the stack is the value. But there are cases where a
DW_OP_stack_value is in the middle of the expression:
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.dwarf2/var-access.exp;h=7ccec20db6d877c2defe980bbf4acfe874249624;hb=HEAD#l240
So when I tried to implement that, it broke this test.
> + case DW_OP_call2:
> + {
> + cu_offset cu_off
> + = (cu_offset) extract_unsigned_integer (op_ptr, 2, byte_order);
> + op_ptr += 2;
> +
> + auto get_frame_pc = [&symbol_needs] ()
> + {
> + symbol_needs = SYMBOL_NEEDS_FRAME;
> + return 0;
> + };
> +
> + struct dwarf2_locexpr_baton baton
> + = dwarf2_fetch_die_loc_cu_off (cu_off, per_cu,
> + per_objfile,
> + get_frame_pc);
>
> - ctx.eval (data, size);
> + /* If SYMBOL_NEEDS_FRAME is returned from the previous call,
> + we dont have to check the baton content. */
> + if (symbol_needs != SYMBOL_NEEDS_FRAME)
> + {
> + gdbarch *arch = baton.per_objfile->objfile->arch ();
> + gdb::array_view<const gdb_byte> sub_expr (baton.data,
> + baton.size);
> + symbol_needs
> + = dwarf2_get_symbol_read_needs (sub_expr,
> + baton.per_cu,
> + baton.per_objfile,
> + gdbarch_byte_order (arch),
> + baton.per_cu->addr_size (),
> + baton.per_cu->ref_addr_size (),
> + depth);
> + }
> + break;
> + }
> +
> + case DW_OP_call4:
Since these two cases look alike, I'd suggest merging them.
I forgot to mention above, but I'd also suggest declaring uoffset and
offset in the specific cases that need it. And you can use
safe_skip_leb128 to avoid much of their uses.
And the fixup patch:
From 5ed60b518e01a5ac7ec4e96176cb110078df2ff1 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@polymtl.ca>
Date: Mon, 7 Jun 2021 17:05:20 -0400
Subject: [PATCH] Patch 1 proposed fixups
Change-Id: Id6f57ebc96cd74b95465532ce2048b2c28de8642
---
gdb/dwarf2/loc.c | 178 ++++++++++++++++++++++++-----------------------
1 file changed, 91 insertions(+), 87 deletions(-)
diff --git a/gdb/dwarf2/loc.c b/gdb/dwarf2/loc.c
index 9f84766d1d2f..802b4457cec1 100644
--- a/gdb/dwarf2/loc.c
+++ b/gdb/dwarf2/loc.c
@@ -2772,45 +2772,76 @@ dwarf2_compile_property_to_c (string_file *stream,
}
/* Compute the correct symbol_needs_kind value for the location
- expression in EXPR. */
+ expression in EXPR.
-static enum symbol_needs_kind
+ Implemented by traversing the logical control flow graph of the
+ expression. */
+
+static symbol_needs_kind
dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
- dwarf2_per_cu_data *per_cu,
- dwarf2_per_objfile *per_objfile,
- bfd_endian byte_order,
- int addr_size,
- int ref_addr_size,
- int depth = 0)
+ dwarf2_per_cu_data *per_cu,
+ dwarf2_per_objfile *per_objfile,
+ bfd_endian byte_order,
+ int addr_size,
+ int ref_addr_size,
+ int depth = 0)
{
- const gdb_byte *expr_end = expr.data () + expr.size ();
enum symbol_needs_kind symbol_needs = SYMBOL_NEEDS_NONE;
- const int max_depth = 256;
- std::vector <const gdb_byte *> next_op;
- std::set <const gdb_byte *> visited_op;
+ /* If the expression is empty, we have nothing to do. */
+ if (expr.empty ())
+ return symbol_needs;
+
+ const gdb_byte *expr_end = &expr[0] + expr.size ();
+
+ /* List of operations to visit. Operations in this list are not visited yet,
+ so are not in VISITED_OPS (and vice-versa). */
+ std::vector<const gdb_byte *> ops_to_visit;
+
+ /* Operations already visited. */
+ std::unordered_set<const gdb_byte *> visited_ops;
+
+ /* Insert OP in OPS_TO_VISIT if it is within the expression's range and
+ hasn't been visited yet. */
+ auto insert_in_ops_to_visit
+ = [expr_end, &visited_ops, &ops_to_visit] (const gdb_byte *op_ptr)
+ {
+ if (op_ptr >= expr_end)
+ return;
+
+ if (visited_ops.find (op_ptr) != visited_ops.end ())
+ return;
+
+ ops_to_visit.push_back (op_ptr);
+ };
+
+ /* Expressions can invoke other expressions with DW_OP_call*. Protect against
+ a loop of calls. */
+ const int max_depth = 256;
if (depth > max_depth)
- error (_("DWARF-2 expression error: Loop detected (%d)."), depth);
+ error (_("DWARF-2 expression error: Loop detected."));
depth++;
- next_op.push_back (expr.data ());
+ /* Initialize the to-visit list with the first operation. */
+ insert_in_ops_to_visit (&expr[0]);
- while (!next_op.empty ())
+ while (!ops_to_visit.empty ())
{
- const gdb_byte *op_ptr = next_op.back ();
+ /* Pop one op to visit, mark it as visited. */
+ const gdb_byte *op_ptr = ops_to_visit.back ();
+ ops_to_visit.pop_back ();
+ gdb_assert (visited_ops.find (op_ptr) == visited_ops.end ());
+ visited_ops.insert (op_ptr);
- if (op_ptr >= expr_end
- || visited_op.find(op_ptr) != visited_op.end())
- {
- next_op.pop_back ();
- continue;
- }
+ dwarf_location_atom op = (dwarf_location_atom) *op_ptr;
- enum dwarf_location_atom op
- = (enum dwarf_location_atom) *op_ptr++;
- uint64_t uoffset;
- int64_t offset;
+ /* Most operations have a single possible following operation (they are
+ not conditional branches). The code below updates OP_PTR to point to
+ that following operation, which is pushed back to OPS_TO_VISIT, if
+ needed, at the bottom. Here, leave OP_PTR pointing just after the
+ operand. */
+ op_ptr++;
/* The DWARF expression might have a bug causing an infinite
loop. In that case, quitting is the only way out. */
@@ -2898,22 +2929,22 @@ dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
case DW_OP_constu:
case DW_OP_plus_uconst:
case DW_OP_piece:
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
break;
case DW_OP_consts:
- op_ptr = safe_read_sleb128 (op_ptr, expr_end, &offset);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
break;
case DW_OP_bit_piece:
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
break;
case DW_OP_deref_type:
case DW_OP_GNU_deref_type:
op_ptr++;
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
break;
case DW_OP_addr:
@@ -3017,14 +3048,17 @@ dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
break;
case DW_OP_implicit_value:
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
- op_ptr += uoffset;
+ {
+ uint64_t uoffset;
+ op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
+ op_ptr += uoffset;
+ }
break;
case DW_OP_implicit_pointer:
case DW_OP_GNU_implicit_pointer:
op_ptr += ref_addr_size;
- op_ptr = safe_read_sleb128 (op_ptr, expr_end, &offset);
+ op_ptr = safe_skip_leb128 (op_ptr, expr_end);
break;
case DW_OP_deref_size:
@@ -3033,62 +3067,31 @@ dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
break;
case DW_OP_skip:
- offset = extract_signed_integer (op_ptr, 2, byte_order);
- op_ptr += 2;
- op_ptr += offset;
+ {
+ LONGEST offset = extract_signed_integer (op_ptr, 2, byte_order);
+ op_ptr += 2;
+ op_ptr += offset;
+ }
break;
case DW_OP_bra:
- visited_op.insert (next_op.back ());
- next_op.pop_back ();
-
- offset = extract_signed_integer (op_ptr, 2, byte_order);
- op_ptr += 2;
- next_op.push_back (op_ptr + offset);
- next_op.push_back (op_ptr);
- continue;
-
- case DW_OP_call2:
{
- cu_offset cu_off
- = (cu_offset) extract_unsigned_integer (op_ptr, 2, byte_order);
+ /* This is the only operation that pushes two operations in the
+ to-visit list, so handle it all here. */
+ LONGEST offset = extract_signed_integer (op_ptr, 2, byte_order);
op_ptr += 2;
-
- auto get_frame_pc = [&symbol_needs] ()
- {
- symbol_needs = SYMBOL_NEEDS_FRAME;
- return 0;
- };
-
- struct dwarf2_locexpr_baton baton
- = dwarf2_fetch_die_loc_cu_off (cu_off, per_cu,
- per_objfile,
- get_frame_pc);
-
- /* If SYMBOL_NEEDS_FRAME is returned from the previous call,
- we dont have to check the baton content. */
- if (symbol_needs != SYMBOL_NEEDS_FRAME)
- {
- gdbarch *arch = baton.per_objfile->objfile->arch ();
- gdb::array_view<const gdb_byte> sub_expr (baton.data,
- baton.size);
- symbol_needs
- = dwarf2_get_symbol_read_needs (sub_expr,
- baton.per_cu,
- baton.per_objfile,
- gdbarch_byte_order (arch),
- baton.per_cu->addr_size (),
- baton.per_cu->ref_addr_size (),
- depth);
- }
- break;
+ insert_in_ops_to_visit (op_ptr + offset);
+ insert_in_ops_to_visit (op_ptr);
+ continue;
}
+ case DW_OP_call2:
case DW_OP_call4:
{
+ unsigned int len = op == DW_OP_call2 ? 2 : 4;
cu_offset cu_off
- = (cu_offset) extract_unsigned_integer (op_ptr, 4, byte_order);
- op_ptr += 4;
+ = (cu_offset) extract_unsigned_integer (op_ptr, len, byte_order);
+ op_ptr += len;
auto get_frame_pc = [&symbol_needs] ()
{
@@ -3175,10 +3178,13 @@ dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
case DW_OP_const_type:
case DW_OP_GNU_const_type:
- op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
- offset = *op_ptr++;
- op_ptr += offset;
- break;
+ {
+ uint64_t uoffset;
+ op_ptr = safe_read_uleb128 (op_ptr, expr_end, &uoffset);
+ gdb_byte offset = *op_ptr++;
+ op_ptr += offset;
+ break;
+ }
default:
error (_("Unhandled DWARF expression opcode 0x%x"), op);
@@ -3189,9 +3195,7 @@ dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
if (symbol_needs == SYMBOL_NEEDS_FRAME)
break;
- visited_op.insert (next_op.back ());
- next_op.pop_back ();
- next_op.push_back (op_ptr);
+ insert_in_ops_to_visit (op_ptr);
}
return symbol_needs;
--
2.31.1
next prev parent reply other threads:[~2021-06-07 21:48 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-28 15:46 [PATCH v3 00/17] DWARF expression evaluator design cleanup Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 01/17] Replace the symbol needs evaluator with a parser Zoran Zaric
2021-06-07 21:48 ` Simon Marchi [this message]
2021-06-11 17:22 ` Zaric, Zoran (Zare)
2021-05-28 15:46 ` [PATCH v3 02/17] Cleanup of the dwarf_expr_context constructor Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 03/17] Move frame context info to dwarf_expr_context Zoran Zaric
2021-06-09 0:45 ` Simon Marchi
2021-06-11 17:16 ` Zaric, Zoran (Zare)
2021-05-28 15:46 ` [PATCH v3 04/17] Remove get_frame_cfa from dwarf_expr_context Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 05/17] Move compilation unit info to dwarf_expr_context Zoran Zaric
2021-06-09 0:50 ` Simon Marchi
2021-06-11 17:19 ` Zaric, Zoran (Zare)
2021-05-28 15:46 ` [PATCH v3 06/17] Move dwarf_call " Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 07/17] Move get_object_address " Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 08/17] Move read_mem " Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 09/17] Move push_dwarf_reg_entry_value to expr.c Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 10/17] Inline get_reg_value method of dwarf_expr_context Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 11/17] Remove empty frame and full evaluators Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 12/17] Merge evaluate_for_locexpr_baton evaluator Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 13/17] Move piece_closure and its support to expr.c Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 14/17] Make value_copy also copy the stack data member Zoran Zaric
2021-05-28 15:46 ` [PATCH v3 15/17] Make DWARF evaluator return a single struct value Zoran Zaric
2021-08-11 17:00 ` Tom Tromey
2021-08-12 18:03 ` Tom Tromey
2021-08-13 9:57 ` Zoran Zaric
2021-08-13 16:59 ` Tom Tromey
2021-08-13 17:57 ` Zoran Zaric
2021-08-16 15:37 ` Tom Tromey
2021-08-16 16:05 ` Zoran Zaric
2021-08-16 17:32 ` Tom Tromey
2021-05-28 15:46 ` [PATCH v3 16/17] Simplify dwarf_expr_context class interface Zoran Zaric
2021-06-09 1:01 ` Simon Marchi
2021-06-11 17:20 ` Zaric, Zoran (Zare)
2021-05-28 15:46 ` [PATCH v3 17/17] Add as_lval argument to expression evaluator Zoran Zaric
2021-06-09 1:04 ` [PATCH v3 00/17] DWARF expression evaluator design cleanup Simon Marchi
2021-08-05 16:38 ` Zaric, Zoran (Zare)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=395847ba-8138-018f-f618-528d1a22e9fa@polymtl.ca \
--to=simon.marchi@polymtl.ca \
--cc=gdb-patches@sourceware.org \
--cc=zoran.zaric@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).