Re: [PATCH 01/43] Replace the symbol needs evaluator with a parser

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Simon Marchi <simon.marchi@polymtl.ca>
To: Zoran Zaric <Zoran.Zaric@amd.com>, gdb-patches@sourceware.org
Subject: Re: [PATCH 01/43] Replace the symbol needs evaluator with a parser
Date: Mon, 26 Apr 2021 21:20:18 -0400	[thread overview]
Message-ID: <def652ba-0c4c-614e-e4b1-507288cd11e2@polymtl.ca> (raw)
In-Reply-To: <20210301144620.103016-2-Zoran.Zaric@amd.com>



On 2021-03-01 9:45 a.m., Zoran Zaric via Gdb-patches wrote:
> This patch addresses a design problem with the symbol_needs_eval_context
> class. It exposes the problem by introducing two new testsuite test
> cases.
> 
> To explain the issue, I first need to explain the dwarf_expr_context
> class that the symbol_needs_eval_context class derives from.
> 
> The intention behind the dwarf_expr_context class is to commonize the
> DWARF expression evaluation mechanism for different evaluation
> contexts. Currently in gdb, the evaluation context can contain some or
> all of the following information: architecture, object file, frame and
> compilation unit.
> 
> Depending on the information needed to evaluate a given expression,
> there are currently three distinct DWARF expression evaluators:
> 
>  - Frame: designed to evaluate an expression in the context of a call
>    frame information (dwarf_expr_executor class). This evaluator doesn’t
>    need a compilation unit information.
> 
>  - Location description: designed to evaluate an expression in the
>    context of a source level information (dwarf_evaluate_loc_desc
>    class). This evaluator expects all information needed for the
>    evaluation of the given expression to be present.
> 
>  - Symbol needs: designed to answer a question about the parts of the
>    context information required to evaluate a DWARF expression behind a
>    given symbol (symbol_needs_eval_context class). This evaluator
>    doesn’t need a frame information.
> 
> The functional difference between the symbol needs evaluator and the
> others is that this evaluator is not meant to interact with the actual
> target. Instead, it is supposed to check which parts of the context
> information are needed for the given DWARF expression to be evaluated by
> the location description evaluator.
> 
> The idea is to take advantage of the existing dwarf_expr_context
> evaluation mechanism and to fake all required interactions with the
> actual target, by returning back dummy values. The evaluation result is
> returned as one of three possible values, based on operations found in a
> given expression:
> 
> - SYMBOL_NEEDS_NONE,
> - SYMBOL_NEEDS_REGISTERS and
> - SYMBOL_NEEDS_FRAME.
> 
> The problem here is that faking results of target interactions can yield
> an incorrect evaluation result.
> 
> For example, if we have a conditional DWARF expression, where the
> condition depends on a value read from an actual target, and the true
> branch of the condition requires a frame information to be evaluated,
> while the false branch doesn’t, fake target reads could conclude that a
> frame information is not needed, where in fact it is. This wrong
> information would then cause the expression to be actually evaluated (by
> the location description evaluator) with a missing frame information.
> This would then crash the debugger.
> 
> The gdb.dwarf2/symbol_needs_eval_fail.exp test introduces this
> scenario, with the following DWARF expression:
> 
>                    DW_OP_addr $some_variable
>                    DW_OP_deref
>                    DW_OP_bra 4  # conditional jump to DW_OP_bregx
>                    DW_OP_lit0
>                    DW_OP_skip 3  # jump to DW_OP_stack_value
>                    DW_OP_bregx $dwarf_regnum 0
>                    DW_OP_stack_value
> 
> This expression describes a case where some variable dictates the
> location of another variable. Depending on a value of some_variable, the
> variable whose location is described by this expression is either read
> from a register or it is defined as a constant value 0. In both cases,
> the value will be returned as an implicit location description on the
> DWARF stack.
> 
> Currently, when the symbol needs evaluator fakes a memory read from the
> address behind the some_variable variable, the constant value 0 is used
> as the value of the variable A, and the check returns the
> SYMBOL_NEEDS_NONE result.
> 
> This is clearly a wrong result and it causes the debugger to crash.
> 
> The scenario might sound strange to some people, but it comes from a
> SIMD/SIMT architecture where $some_variable is an execution mask.  In
> any case, it is a valid DWARF expression, and GDB shouldn't crash while
> evaluating it. Also, a similar example could be made based on a
> condition of the frame base value, where if that value is concluded to
> be 0, the variable location could be defaulted to a TLS based memory
> address.
> 
> The gdb.dwarf2/symbol_needs_eval_timeout.exp test introduces a second
> scenario. This scenario is a bit more abstract due to the DWARF
> assembler lacking the CFI support, but it exposes a different
> manifestation of the same problem. Like in the previous scenario, the
> DWARF expression used in the test is valid:
> 
>                        DW_OP_lit1
>                        DW_OP_addr $some_variable
>                        DW_OP_deref
>                        DW_OP_skip 4  # jump to DW_OP_fbreg
>                        DW_OP_drop
>                        DW_OP_fbreg 0
>                        DW_OP_dup
>                        DW_OP_lit0
>                        DW_OP_eq
>                        DW_OP_bra -9  # conditional jump to DW_OP_drop
>                        DW_OP_stack_value
> 
> Similarly to the previous scenario, the location of a variable A is an
> implicit location description with a constant value that depends on a
> value held by a global variable. The difference from the previous case
> is that DWARF expression contains a loop instead of just one branch. The
> end condition of that loop depends on the expectation that a frame base
> value is never zero. Currently, the act of faking the target reads will
> cause the symbol needs evaluator to get stuck in an infinite loop.
> 
> Somebody could argue that we could change the fake reads to return
> something else, but that would only hide the real problem.
> 
> The general impression seems to be that the desired design is to have
> one class that deals with parsing of the DWARF expression, while there
> are virtual methods that deal with specifics of some operations.
> 
> Using an evaluator mechanism here doesn’t seem to be correct, because
> the act of evaluation relies on accessing the data from the actual
> target with the possibility of skipping the evaluation of some parts of
> the expression.
> 
> To better explain the proposed solution for the issue, I first need to
> explain a couple more details behind the current design:
> 
> There are multiple places in gdb that handle DWARF expression parsing
> for different purposes. Some are in charge of converting the expression
> to some other internal representation (decode_location_expression,
> disassemble_dwarf_expression and dwarf2_compile_expr_to_ax), some are
> analysing the expression for specific information
> (compute_stack_depth_worker) and some are in charge of evaluating the
> expression in a given context (dwarf_expr_context::execute_stack_op
> and decode_locdesc).
> 
> The problem is that all those functions have a similar (large) switch
> statement that handles each DWARF expression operation. The result of
> this is a code duplication and harder maintenance.
> 
> As a step into the right direction to solve this problem (at least for
> the purpose of a DWARF expression evaluation) the expression parsing was
> commonized inside of an evaluator base class (dwarf_expr_context). This
> makes sense for all derived classes, except for the symbol needs
> evaluator (symbol_needs_eval_context) class.
> 
> As described previously the problem with this evaluator is that if the
> evaluator is not allowed to access the actual target, it is not really
> evaluating.
> 
> Instead, the desired function of a symbol needs evaluator seems to fall
> more into expression analysis category. This means that a more natural
> fit for this evaluator is to be a symbol needs analysis, similar to the
> existing compute_stack_depth_worker analysis.
> 
> Another problem is that using a heavyweight mechanism of an evaluator
> to do an expression analysis seems to be an unneeded overhead. It also
> requires a more complicated design of the parent class to support fake
> target reads.
> 
> The reality is that the whole symbol_needs_eval_context class can be
> replaced with a lightweight recursive analysis function, that will give
> more correct result without compromising the design of the
> dwarf_expr_context class.
> 
> The downside of this approach is adding of one more similar switch
> statement, but at least this way the new symbol needs analysis will be
> a lightweight mechnism and it will provide a correct result for any
> given DWARF expression.
> 
> A more desired long term design would be to have one class that deals
> with parsing of the DWARF expression, while there would be a virtual
> methods that deal with specifics of some DWARF operations. Then that
> class would be used as a base for all DWARF expression parsing mentioned
> at the beginning.
> 
> This however, requires a far bigger changes that are out of the scope
> of this patch series.
> 
> The new analysis requires the DWARF location description for the
> argc argument of the niam function to change in the assembly file
> gdb.python/amd64-py-framefilter-invalidarg.S. Originally, expression
> ended with a 0 value byte, which was never reached by the symbol needs
> evaluator, because it was detecting a stack underflow when evaluating
> the operation before. The new approach does not simulate a DWARF
> stack anymore, so the 0 value byte needs to be removed because it
> makes the DWARF expression invalid.
> 
> Some concerns were raised that a linear scan of the expression byte
> stream would have issues if a DWARF producer would try to hide some
> non DWARF related data after a control flow operation. Although the
> testsuite doesn't show this case, it is a valid concern, so one of
> the later patches in this series will address it by switching back to
> the then redesigned DWARF expression evaluator.

While re-reading your exchange with Tom, I was under the impression that
traversing the "control flow graph" of the expression, visiting each
node only once, would be a good solution.  It would avoid the infinite
loop problem, the "two branches" problem, and even the corner cases
where you have garbage in the middle of the expression, or if the
expression jumps in the middle of an instruction to re-use the operand
of an instruction as an instruction.

Patch 39 changes this back to an evaluator, so I'm not sure if this is
what you implemented or something else, I'll see when I get there.

> -class symbol_needs_eval_context : public dwarf_expr_context
> +static enum symbol_needs_kind
> +dwarf2_get_symbol_read_needs (gdb::array_view<const gdb_byte> expr,
> +			     dwarf2_per_cu_data *per_cu,
> +			     dwarf2_per_objfile *per_objfile,
> +			     bfd_endian byte_order,
> +			     int addr_size,
> +			     int ref_addr_size,
> +			     int depth = 0)

The wrapped lines above are missing one column of indent.

Simon

next prev parent reply	other threads:[~2021-04-27  1:20 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-01 14:45 [PATCH 00/43 V2] Allow location description on the DWARF stack Zoran Zaric
2021-03-01 14:45 ` [PATCH 01/43] Replace the symbol needs evaluator with a parser Zoran Zaric
2021-04-27  1:20   ` Simon Marchi [this message]
2021-04-28 10:17     ` Zoran Zaric
2021-04-28 14:08       ` Simon Marchi
2021-04-28 15:02         ` Zoran Zaric
2021-04-28 15:31         ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 02/43] Cleanup of the dwarf_expr_context constructor Zoran Zaric
2021-04-27  1:23   ` Simon Marchi
2021-04-28 10:19     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 03/43] Move frame context info to dwarf_expr_context Zoran Zaric
2021-04-27  2:19   ` Simon Marchi
2021-04-28 10:51     ` Zoran Zaric
2021-04-28 14:14       ` Simon Marchi
2021-04-28 15:55         ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 04/43] Remove get_frame_cfa from dwarf_expr_context Zoran Zaric
2021-03-01 14:45 ` [PATCH 05/43] Move compilation unit info to dwarf_expr_context Zoran Zaric
2021-04-27  2:58   ` Simon Marchi
2021-04-28 11:28     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 06/43] Move dwarf_call " Zoran Zaric
2021-03-01 14:45 ` [PATCH 07/43] Move get_object_address " Zoran Zaric
2021-04-27  3:12   ` Simon Marchi
2021-04-28 11:34     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 08/43] Move read_mem " Zoran Zaric
2021-03-01 14:45 ` [PATCH 09/43] Move push_dwarf_reg_entry_value to expr.c Zoran Zaric
2021-04-27  3:56   ` Simon Marchi
2021-04-28 11:36     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 10/43] Inline get_reg_value method of dwarf_expr_context Zoran Zaric
2021-03-01 14:45 ` [PATCH 11/43] Remove empty frame and full evaluators Zoran Zaric
2021-03-01 14:45 ` [PATCH 12/43] Merge evaluate_for_locexpr_baton evaluator Zoran Zaric
2021-04-28  1:33   ` Simon Marchi
2021-04-28 11:39     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 13/43] Move piece_closure and its support to expr.c Zoran Zaric
2021-04-28  1:56   ` Simon Marchi
2021-04-28 11:40     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 14/43] Make value_copy also copy the stack data member Zoran Zaric
2021-04-28  2:01   ` Simon Marchi
2021-04-28 11:43     ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 15/43] Make DWARF evaluator return a single struct value Zoran Zaric
2021-04-28  2:21   ` Simon Marchi
2021-04-28 11:47     ` Zoran Zaric
2021-04-28 14:24       ` Simon Marchi
2021-03-01 14:45 ` [PATCH 16/43] Simplify dwarf_expr_context class interface Zoran Zaric
2021-04-28  2:45   ` Simon Marchi
2021-04-28 13:15     ` Zoran Zaric
2021-04-28 14:41       ` Simon Marchi
2021-04-28 15:39         ` Zoran Zaric
2021-04-28 19:19           ` Simon Marchi
2021-04-29 15:49       ` Simon Marchi
2021-04-29 15:55         ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 17/43] Add as_lval argument to expression evaluator Zoran Zaric
2021-04-28  3:04   ` Simon Marchi
2021-04-28 13:16     ` Zoran Zaric
2021-04-28  3:30   ` Simon Marchi
2021-03-01 14:45 ` [PATCH 18/43] Add new register access interface to expr.c Zoran Zaric
2021-03-08 23:52   ` Lancelot SIX
2021-04-28  3:25   ` Simon Marchi
2021-04-28 13:29     ` Zoran Zaric
2021-04-28 14:48       ` Simon Marchi
2021-04-28 15:42         ` Zoran Zaric
2021-03-01 14:45 ` [PATCH 19/43] Add new memory " Zoran Zaric
2021-04-30 21:24   ` Simon Marchi
2021-03-01 14:45 ` [PATCH 20/43] Add new classes that model DWARF stack element Zoran Zaric
2021-03-01 14:45 ` [PATCH 21/43] Add to_location method to DWARF entry classes Zoran Zaric
2021-03-01 14:45 ` [PATCH 22/43] Add to_value " Zoran Zaric
2021-03-01 14:46 ` [PATCH 23/43] Add read method to location description classes Zoran Zaric
2021-03-01 14:46 ` [PATCH 24/43] Add write " Zoran Zaric
2021-03-01 14:46 ` [PATCH 25/43] Add deref " Zoran Zaric
2021-03-01 14:46 ` [PATCH 26/43] Add read_from_gdb_value method to dwarf_location Zoran Zaric
2021-03-01 14:46 ` [PATCH 27/43] Add write_to_gdb_value " Zoran Zaric
2021-03-01 14:46 ` [PATCH 28/43] Add is_implicit_ptr_at " Zoran Zaric
2021-03-01 14:46 ` [PATCH 29/43] Add indirect_implicit_ptr to dwarf_location class Zoran Zaric
2021-03-01 14:46 ` [PATCH 30/43] Add new computed struct value callback interface Zoran Zaric
2021-03-01 14:46 ` [PATCH 31/43] Add to_gdb_value method to DWARF entry class Zoran Zaric
2021-03-01 14:46 ` [PATCH 32/43] Change DWARF stack to use new dwarf_entry classes Zoran Zaric
2021-03-01 14:46 ` [PATCH 33/43] Remove old computed struct value callbacks Zoran Zaric
2021-03-01 14:46 ` [PATCH 34/43] Comments cleanup between expr.h and expr.c Zoran Zaric
2021-03-01 14:46 ` [PATCH 35/43] Remove dwarf_expr_context from expr.h interface Zoran Zaric
2021-03-01 14:46 ` [PATCH 36/43] Move read_addr_from_reg function to frame.c Zoran Zaric
2021-03-01 14:46 ` [PATCH 37/43] Add frame info check to DW_OP_reg operations Zoran Zaric
2021-03-01 14:46 ` [PATCH 38/43] Remove DWARF expression composition check Zoran Zaric
2021-03-01 14:46 ` [PATCH 39/43] Change back the symbol needs to use the evaluator Zoran Zaric
2021-03-01 14:46 ` [PATCH 40/43] Add support for any location description in CFI Zoran Zaric
2021-03-01 14:46 ` [PATCH 41/43] Add DWARF operations for byte and bit offset Zoran Zaric
2021-03-01 14:46 ` [PATCH 42/43] Add support for DW_OP_LLVM_undefined operation Zoran Zaric
2021-03-01 14:46 ` [PATCH 43/43] Add support for nested composite locations Zoran Zaric

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=def652ba-0c4c-614e-e4b1-507288cd11e2@polymtl.ca \
    --to=simon.marchi@polymtl.ca \
    --cc=Zoran.Zaric@amd.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).