public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PING] [PATCH] gdb/linespec: Improve sliding breakpoints
@ 2022-07-27 13:11 Six, Lancelot
  2022-08-23 15:01 ` [PING^2] " Lancelot SIX
  0 siblings, 1 reply; 2+ messages in thread
From: Six, Lancelot @ 2022-07-27 13:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: lsix, Six, Lancelot

[Public]

Hi

I was wondering if anyone has comments and or opinions about the change proposed in this patch.

Best,
Lancelot.

> -----Original Message-----
> From: Six, Lancelot <Lancelot.Six@amd.com>
> Sent: 17 June 2022 17:08
> To: gdb-patches@sourceware.org
> Cc: lsix@lancelotsix.com; Six, Lancelot <Lancelot.Six@amd.com>
> Subject: [PATCH] gdb/linespec: Improve sliding breakpoints
> 
> Hi,
> 
> Here is proposition to slightly update the breakpoint sliding algorithm
> in GDB.
> 
> This patch must be applied on top of
> https://sourceware.org/pipermail/gdb-patches/2022-April/187686.html
> which was submitted by Simon Marchi a couple of months.  This needs
> small updates as the BLOCK_ENTRY_PC and SYMBOL_BLOCK_VALUE are no
> longer used.  As a consequence
> 
>   BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (sym))
> 
> should be replaced with
> 
>   sym->value_block ()->entry_pc ()
> 
> This patch have been regression tested on x86_64-Linux.  All comments
> and feedbacks are welcome.
> 
> Best,
> Lancelot.
> 
> ---
> 
> The algorithm used to place a breakpoint by line approximately goes as
> follow:
> 
> - Look for all the PCs which maps directly to the desired line
> - If any PC is found, place a breakpoint at each found PC.
> - If no PC maps to the desired line:
>   - Search for the lower line number in the line table which is higher
>     than the desired line, and has some object code associated to it.
>   - List all PCs which maps to this line number.
>   - Place a breakpoint at each PC in the list above.
> 
> There are situations where this heuristic is not sufficient.  If we
> consider the following c++ code:
> 
>     12 template<typename T>
>     13 T func (T a, T b)
>     14 {
>     15   /* Break here.  */
>     16   if constexpr (std::is_integral_t<T>)
>     17     return a;
>     18   else
>     19     return b;
>     20 }
> 
> Given the current algorithm, if the user tries to set a breakpoint at
> line 15, here is what happens:
> 
> - Search PCs mapping to L15, there is none ;
> - Search for the first line number above L16 with code, this is L17 ;
> - List all PCs mapping to L17 and a breakpoints at insert a breakpoint at
>   each ;
> 
> Now lets suppose that this templated function has been instantiated both
> for an integral type (int) and non-integral type (float).  If we follow
> the algorithm, no breakpoint will ever be inserted in func<float>,
> while there will be a breakpoint in func<int>.
> 
> I came across a less obvious variation of this case where an inlined
> function had different optimization opportunities for different instances.
> As a consequence, some instances had the first actual line of code
> different than the other, making the breakpoint be missed and the user
> confused.  To add to the confusion, the instances which have code mapped
> to the lowest line above the user specified linespec only appeared in a
> shared library conditionally loaded, making the breakpoint stop depend on
> whether or not the said shared library was loaded or not.
> 
> This patch is a proposition to adapt the algorithm in order to cover the
> scenario exposed above.  To do so, this patch ends up scanning
> independently for the best candidate in the various blocks composing
> the line table structures (a block being a succession of
> linetable_entries terminated by an entry with line == 0).
> 
> The line table for the example program above looks something like:
> 
>     (gdb) maintenance info line-table missed
>     symtab: /[...]/missed_pb.cc ((struct symtab *) 0x6210000f0500)
>     linetable: ((struct linetable *) 0x6210001a1220):
>     INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END
>     ...
>     17     END    0x00000000000011d3 Y
>     18     13     0x00000000000011d3 Y                    <-- func<float>
>     19     19     0x00000000000011e5 Y
>     20     20     0x00000000000011ea Y
>     21     END    0x00000000000011ec Y
>     22     13     0x00000000000011ec Y                    <-- func<int>
>     23     17     0x00000000000011fa Y
>     24     20     0x00000000000011fd Y
>     25     END    0x00000000000011ff Y
> 
> The first block listed (starting at 0x11d3) is func<float>, and the
> second starting at 0x11ec is func<int>.
> 
> The proposed algorithm analyzes each block independently.  The first step
> is to decide if a block contains the line the user is interested in (a.k.a
> the target).  In our example, this is L15.  A block is considered to match
> if it contains at least one instruction mapped to a line before the target
> and at least one instruction mapped to a line after the target.  In other
> words this means that will select all blocks where:
> 
> - there is at least one instruction mapped to a line L such as L <= target
> - there is at least one instruction mapped to a line M such as M >= target
> 
> For each block which satisfies this condition, we select the lowest line
> number above our target, and select all instructions mapping to this line
> number (there might be multiple).  In our example the func<float> block
> satisfies the condition (we have line 13 and 19), so we insert breakpoints
> at the PCs mapped to L19.  Similarly, the func<int> block is selected
> (thanks to L13 and L17), and we insert breakpoints at all PCs mapped to
> L17.
> 
> The core of this heuristic is implemented in find_best_le_for_symtab_line.
> 
> There are however some cases which are still problematic, where the
> first line after the target is the first line of the block we are
> looking for.  In such situation the algorithm would find nothing where
> we would want to select 0x11d3 and 0x11ec (both mapping to L13).  In
> order to solve this, the algorithm first looks for the lowest line
> number following the user specified target as this is done before this
> patch (here the first line with associated code following L12 which is
> L13).  Once this is figured out, this new target (L13) is passed to the
> algorithm above (find_best_le_for_symtab_line).
> 
> This is safe to do because we know that there is no line between the user
> target (included) and this new target.  Therefore, whatever
> find_best_le_for_symtab_line returns, it must map to lines numbers above
> (inclusive) the new target.
> 
> Finally, the testsuite showed an interesting regression.  Let's suppose
> we have:
> 
>     1 struct Foo { Foo () {} };
>     2 Foo f1;
>     3 void bar ()
>     4 {
>     5   /* break here.  */
>     6   do_something ();
>     7 }
>     8 Foo f2;
> 
> Inspecting the linetable, we have a code block containing L2 and L8 (so
> over line 5).  If a user was to insert a breakpoint somewhere in L5, the
> algorithm shown above would also insert a breakpoint at L8.  To solve
> this, this patch uses the fact that L2 and L8 are, from dwarf's
> perspective in a artificial block which corresponds to the static
> initialization.  So if among multiple candidates, some are part of a
> virtual function and some are not, we only keep the ones in the non
> virtual function.
> 
> This proposed new algorithm will not solve all possible cases.  For
> example, if the func function is inlined multiple times in the same block
> for different template parameters, the lookup will process all of those as
> part of the same scope.  Solving such situation would require heavier
> works.  I think that this change, while not covering every scenarios is
> still a step in the right direction.
> 
> All feedbacks and thoughts are welcome.
> 
> Regression tested on x86_64 GNU/Linux.
> 
> Change-Id: I8291f9e1a3522934a0f418dfc565cebb86cff730
> ---
>  gdb/linespec.c                                | 93 ++++++++++++++++++-
>  gdb/symtab.c                                  | 45 +++++++++
>  gdb/symtab.h                                  | 14 +++
>  .../gdb.linespec/breakpoint-sliding.cc        | 70 ++++++++++++++
>  .../gdb.linespec/breakpoint-sliding.exp       | 82 ++++++++++++++++
>  5 files changed, 301 insertions(+), 3 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>  create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> 
> diff --git a/gdb/linespec.c b/gdb/linespec.c
> index da9e35c5035..1e8679cd027 100644
> --- a/gdb/linespec.c
> +++ b/gdb/linespec.c
> @@ -2092,9 +2092,42 @@ create_sals_line_offset (struct linespec_state
> *self,
>        if (intermediate_results.empty () && best_entry != NULL)
>  	{
>  	  was_exact = false;
> -	  intermediate_results = decode_digits_ordinary (self, ls,
> -							 best_entry->line,
> -							 &best_entry);
> +	  /* We did not find an exact match for the line we are looking for.
> +	     As a consequence, we are now looking for the next best line.
> +	     However, the next best line does not have to be the same across
> +	     the entire program.  If we consider the following code:
> +
> +	       12 template<typename T>
> +	       13 T select (T a, T b)
> +	       14 {
> +	       15   // Break here.
> +	       16   if constexpr (std::is_integral_t<T>)
> +	       17     return a;
> +	       18   else
> +	       19     return b;
> +	       20 }
> +
> +	     For a function like the above, the next best line is L17 if T is
> +	     an integral type, but should be L19 if T is not.  As a
> +	     consequence we try to search for the next best line independently
> +	     in each code block (a code block being a contiguous section of a
> +	     line table terminated by an entry with line == 0).  */
> +	  for (const auto &elt : ls->file_symtabs)
> +	    {
> +	      program_space *pspace = elt->compunit ()->objfile ()->pspace;
> +	      for (const linetable_entry *le
> +		   : find_best_le_for_symtab_line (elt->linetable (),
> +						   best_entry->line))
> +		{
> +		  symtab_and_line sal;
> +		  sal.pspace = pspace;
> +		  sal.symtab = elt;
> +		  sal.line = le->line;
> +		  sal.explicit_line = true;
> +		  sal.pc = le->pc;
> +		  intermediate_results.push_back (std::move (sal));
> +		}
> +	    }
>  	}
> 
>        /* For optimized code, the compiler can scatter one source line
> @@ -2110,6 +2143,25 @@ create_sals_line_offset (struct linespec_state
> *self,
>        gdb::def_vector<int> filter (intermediate_results.size ());
>        gdb::def_vector<const block *> blocks (intermediate_results.size ());
> 
> +      /* Artificial code blocks can produce false positives.  Consider
> +
> +	   1 struct Foo { Foo () {} };
> +	   2 Foo f1;
> +	   3 void bar ()
> +	   4 {
> +	   5   // break here.
> +	   6   do_something ();
> +	   7 }
> +	   8 Foo f2;
> +
> +	 In this example, if the user asked for a breakpoint in L5, the above
> +	 algorithm will select L6 and L8.  L6 is selected because the we have
> +	 an artificial block (marked by DW_AT_artificial) generated to cover
> +	 the static initializations and this block covers code from line 2 to
> +	 line 8.  To avoid this false positive, reject candidates in artificial
> +	 blocks if we also have candidates in non artificial blocks.  */
> +      bool any_nonartificial = false;
> +
>        for (i = 0; i < intermediate_results.size (); ++i)
>  	{
>  	  set_current_program_space (intermediate_results[i].pspace);
> @@ -2117,6 +2169,13 @@ create_sals_line_offset (struct linespec_state
> *self,
>  	  filter[i] = 1;
>  	  blocks[i] = block_for_pc_sect (intermediate_results[i].pc,
>  					 intermediate_results[i].section);
> +
> +	  if (!any_nonartificial && blocks[i] != nullptr)
> +	    {
> +	      struct symbol *func = block_containing_function (blocks[i]);
> +	      any_nonartificial
> +		= (func != nullptr && !func->is_artificial ());
> +	    }
>  	}
> 
>        for (i = 0; i < intermediate_results.size (); ++i)
> @@ -2171,6 +2230,34 @@ create_sals_line_offset (struct linespec_state
> *self,
>  		&& val.line < sym->line ())
>  	      continue;
> 
> +	    /* If we did not find an exact match, the above heuristic might
> +	       find some false positives.
> +
> +	       Lets consider (with Foo having a non trivial ctor):
> +
> +		    9 struct Foo { Foo () {} };
> +		   10 Foo f1;
> +		   11 int bar ()
> +		   12 {
> +		   13   // Break here.
> +		   14   return 42;
> +		   15 }
> +		   16 Foo f2;
> +
> +		When the user tries to insert a breakpoint it L13, the
> +		algorithm sees a code path from L10 to L16 (in static
> +		initialization), and will consider L16 as a candidate.
> +		However the user clearly intended to break in bar.
> Therefore,
> +		if we see that there is at least 1 breakpoint in a non
> +		artificial function (bar), ignore the hit in the static
> +		initialization.
> +	       */
> +	    if (!was_exact
> +		&& any_nonartificial
> +		&& sym != nullptr
> +		&& sym->is_artificial ())
> +	      continue;
> +
>  	    if (self->funfirstline)
>  	      skip_prologue_sal (sal);
> 
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 03aa2a96b87..1b03e6fce8e 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -3541,6 +3541,51 @@ find_pcs_for_symtab_line (struct symtab
> *symtab, int line,
>    return result;
>  }
> 
> +/* See symtab.h.  */
> +
> +std::vector<linetable_entry *>
> +find_best_le_for_symtab_line (struct linetable *lt, int line)
> +{
> +  std::vector<linetable_entry *> res;
> +
> +  if (lt == nullptr || line < 0)
> +    return res;
> +
> +  /* Tells if in the current block we found at least one PC mapping to a line
> +     before LINE.  */
> +  bool found_preceding = false;
> +  const int nitems = lt->nitems;
> +  std::vector<linetable_entry *> best_candidates;
> +
> +  for (int i = 0; i < nitems; ++i)
> +    {
> +      struct linetable_entry &item = lt->item[i];
> +      found_preceding
> +	= found_preceding || (item.line > 0 && item.line <= line);
> +
> +      if (item.is_stmt && item.line >= line
> +	  && (best_candidates.empty ()
> +	      || item.line <= best_candidates.front ()->line))
> +	{
> +	  if (!best_candidates.empty ()
> +	      && item.line < best_candidates.front ()->line)
> +	    best_candidates.clear ();
> +	  best_candidates.push_back (&item);
> +	}
> +      else if (item.line == 0)
> +	{
> +	  if (found_preceding)
> +	    res.insert (res.end (), best_candidates.begin (),
> +			best_candidates.end ());
> +
> +	  found_preceding = false;
> +	  best_candidates.clear ();
> +	}
> +    }
> +
> +  return res;
> +}
> +
> 
> 
> 
>  /* Set the PC value for a given source file and line number and return true.
>     Returns false for invalid line number (and sets the PC to 0).
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index ac902a4cbbe..94bea02cb43 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -2624,6 +2624,20 @@ void iterate_over_symtabs (const char *name,
>  std::vector<CORE_ADDR> find_pcs_for_symtab_line
>      (struct symtab *symtab, int line, struct linetable_entry **best_entry);
> 
> +/* Given a linetable, look for entries which can map to line LINE.
> +
> +   The result is the set of entries which:
> +   1) Are part of a block (i.e. a sequence of entries terminated by an entry
> +      with line == 0) containing at least one statement entry with line above
> +      or equal to LINE, and at least one statement entry with line below or
> +      equal to LINE.
> +   2) Are statement entries (is_stmt == 1)
> +   3) Have line equal to the lowest statement line number greater or equal to
> +      LINE within the block they are part of.  */
> +
> +std::vector<linetable_entry *> find_best_le_for_symtab_line
> +    (struct linetable *lt, int line);
> +
>  /* Prototype for callbacks for LA_ITERATE_OVER_SYMBOLS.  The callback
>     is called once per matching symbol SYM.  The callback should return
>     true to indicate that LA_ITERATE_OVER_SYMBOLS should continue
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> new file mode 100644
> index 00000000000..3ae6b563cdf
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> @@ -0,0 +1,70 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#if __cplusplus >= 201703L
> +#include <type_traits>
> +
> +template<typename T>
> +T
> +foo (T a, T b)
> +{
> +  T ret;
> +  /* In foo break here.  */
> +  if constexpr (std::is_integral_v<T>)
> +    ret = a; /* break in foo<int>.  */
> +  else
> +    ret = b; /* break in foo<NonIntegral>.  */
> +
> +  return ret;
> +}
> +
> +struct NonIntegral
> +{
> +  int m;
> +};
> +
> +#endif
> +
> +struct St
> +{
> +  St ()
> +  { /* Empty.  */ }
> +};
> +
> +St st1;
> +
> +static int
> +between_static_init ()
> +{
> +  /* between_static_init break here.  */
> +  return 42; /* break in between_static_init.  */
> +}
> +
> +St st2;
> +
> +int main ()
> +{
> +
> +#if __cplusplus >= 201703L
> +  foo<int> (1, 2);
> +  foo<NonIntegral> ({ 1 }, { 2 });
> +#endif
> +
> +  between_static_init ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> new file mode 100644
> index 00000000000..9b79c419e9b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> @@ -0,0 +1,82 @@
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This testcase exercises breakpoint sliding when the user tries to set
> +# a breakpoint at a line number which has no code associated to it.  GDB
> +# should then place breakpoint at the first line following the one given
> +# by the user in each code block composing the program space, given that
> +# the code block "covers" the line asked by the user.
> +
> +if { [skip_cplus_tests] } {
> +  return -1
> +}
> +
> +standard_testfile .cc
> +
> +# Part of the test require c++17.  If the used compiler does not support
> +# it, fallback only doing part of the tests.
> +set skip_cpp17 0
> +
> +if {[prepare_for_testing "failed to prepare with c++17" $testfile \
> +			 $srcfile {debug c++ additional_flags=-std=c++17}]} {
> +    set skip_cpp17 1
> +    if {[prepare_for_testing "failed to prepare" $testfile $srcfile \
> +			     {debug c++}]} {
> +	return -1
> +    }
> +}
> +
> +proc_with_prefix test_sliding_per_block {} {
> +    if {$::skip_cpp17} {
> +	return
> +    }
> +
> +    set loc [gdb_get_line_number "In foo break here."]
> +    gdb_test "break $loc" \
> +	"Breakpoint $::decimal.* \\(2 locations\\)" \
> +	"slide to if constexpr"
> +
> +    set loc_in_int_instance [gdb_get_line_number "break in foo<int>"]
> +    set loc_in_struct_instance [gdb_get_line_number "break in
> foo<NonIntegral>"]
> +    set seen_int_pb 0
> +    set seen_struct_bp 0
> +    gdb_test_multiple "info breakpoint" "info breakpoint" {
> +	-re "in foo<int>\\(int, int\\) at
> \[^\n\r\]*$::srcfile:$loc_in_int_instance" {
> +	    set seen_int_pb 1
> +	    verbose -log "found int"
> +	    exp_continue
> +	}
> +	-re "in foo<NonIntegral>\\(NonIntegral, NonIntegral\\) at
> \[^\n\r\]*$::srcfile:$loc_in_struct_instance" {
> +	    set seen_struct_bp 1
> +	    verbose -log "found struct"
> +	    exp_continue
> +	}
> +	-re -wrap "" {
> +	    gdb_assert {$seen_int_pb && $seen_struct_bp} $gdb_test_name
> +	}
> +    }
> +}
> +
> +proc_with_prefix test_ignore_artificial_block {} {
> +    set loc  [gdb_get_line_number "between_static_init break here"]
> +    set expected_loc [gdb_get_line_number "break in between_static_init."]
> +    gdb_test "break $loc" \
> +	"Breakpoint $::decimal at $::hex: file .*/$::srcfile, line
> $expected_loc\\." \
> +	"ignore static block"
> +}
> +
> +clean_restart $::binfile
> +test_sliding_per_block
> +test_ignore_artificial_block
> 
> base-commit: 5fb28d2607a8325559b44a5dc0c8760236c81218
> prerequisite-patch-id: 695c74160156ff2c66b3a88462be22a36c32f141
> --
> 2.25.1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-08-23 15:01 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27 13:11 [PING] [PATCH] gdb/linespec: Improve sliding breakpoints Six, Lancelot
2022-08-23 15:01 ` [PING^2] " Lancelot SIX

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).