public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PING] [PATCH] gdb/linespec: Improve sliding breakpoints
@ 2022-07-27 13:11 Six, Lancelot
  2022-08-23 15:01 ` [PING^2] " Lancelot SIX
  0 siblings, 1 reply; 2+ messages in thread
From: Six, Lancelot @ 2022-07-27 13:11 UTC (permalink / raw)
  To: gdb-patches; +Cc: lsix, Six, Lancelot

[Public]

Hi

I was wondering if anyone has comments and or opinions about the change proposed in this patch.

Best,
Lancelot.

> -----Original Message-----
> From: Six, Lancelot <Lancelot.Six@amd.com>
> Sent: 17 June 2022 17:08
> To: gdb-patches@sourceware.org
> Cc: lsix@lancelotsix.com; Six, Lancelot <Lancelot.Six@amd.com>
> Subject: [PATCH] gdb/linespec: Improve sliding breakpoints
> 
> Hi,
> 
> Here is proposition to slightly update the breakpoint sliding algorithm
> in GDB.
> 
> This patch must be applied on top of
> https://sourceware.org/pipermail/gdb-patches/2022-April/187686.html
> which was submitted by Simon Marchi a couple of months.  This needs
> small updates as the BLOCK_ENTRY_PC and SYMBOL_BLOCK_VALUE are no
> longer used.  As a consequence
> 
>   BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (sym))
> 
> should be replaced with
> 
>   sym->value_block ()->entry_pc ()
> 
> This patch have been regression tested on x86_64-Linux.  All comments
> and feedbacks are welcome.
> 
> Best,
> Lancelot.
> 
> ---
> 
> The algorithm used to place a breakpoint by line approximately goes as
> follow:
> 
> - Look for all the PCs which maps directly to the desired line
> - If any PC is found, place a breakpoint at each found PC.
> - If no PC maps to the desired line:
>   - Search for the lower line number in the line table which is higher
>     than the desired line, and has some object code associated to it.
>   - List all PCs which maps to this line number.
>   - Place a breakpoint at each PC in the list above.
> 
> There are situations where this heuristic is not sufficient.  If we
> consider the following c++ code:
> 
>     12 template<typename T>
>     13 T func (T a, T b)
>     14 {
>     15   /* Break here.  */
>     16   if constexpr (std::is_integral_t<T>)
>     17     return a;
>     18   else
>     19     return b;
>     20 }
> 
> Given the current algorithm, if the user tries to set a breakpoint at
> line 15, here is what happens:
> 
> - Search PCs mapping to L15, there is none ;
> - Search for the first line number above L16 with code, this is L17 ;
> - List all PCs mapping to L17 and a breakpoints at insert a breakpoint at
>   each ;
> 
> Now lets suppose that this templated function has been instantiated both
> for an integral type (int) and non-integral type (float).  If we follow
> the algorithm, no breakpoint will ever be inserted in func<float>,
> while there will be a breakpoint in func<int>.
> 
> I came across a less obvious variation of this case where an inlined
> function had different optimization opportunities for different instances.
> As a consequence, some instances had the first actual line of code
> different than the other, making the breakpoint be missed and the user
> confused.  To add to the confusion, the instances which have code mapped
> to the lowest line above the user specified linespec only appeared in a
> shared library conditionally loaded, making the breakpoint stop depend on
> whether or not the said shared library was loaded or not.
> 
> This patch is a proposition to adapt the algorithm in order to cover the
> scenario exposed above.  To do so, this patch ends up scanning
> independently for the best candidate in the various blocks composing
> the line table structures (a block being a succession of
> linetable_entries terminated by an entry with line == 0).
> 
> The line table for the example program above looks something like:
> 
>     (gdb) maintenance info line-table missed
>     symtab: /[...]/missed_pb.cc ((struct symtab *) 0x6210000f0500)
>     linetable: ((struct linetable *) 0x6210001a1220):
>     INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END
>     ...
>     17     END    0x00000000000011d3 Y
>     18     13     0x00000000000011d3 Y                    <-- func<float>
>     19     19     0x00000000000011e5 Y
>     20     20     0x00000000000011ea Y
>     21     END    0x00000000000011ec Y
>     22     13     0x00000000000011ec Y                    <-- func<int>
>     23     17     0x00000000000011fa Y
>     24     20     0x00000000000011fd Y
>     25     END    0x00000000000011ff Y
> 
> The first block listed (starting at 0x11d3) is func<float>, and the
> second starting at 0x11ec is func<int>.
> 
> The proposed algorithm analyzes each block independently.  The first step
> is to decide if a block contains the line the user is interested in (a.k.a
> the target).  In our example, this is L15.  A block is considered to match
> if it contains at least one instruction mapped to a line before the target
> and at least one instruction mapped to a line after the target.  In other
> words this means that will select all blocks where:
> 
> - there is at least one instruction mapped to a line L such as L <= target
> - there is at least one instruction mapped to a line M such as M >= target
> 
> For each block which satisfies this condition, we select the lowest line
> number above our target, and select all instructions mapping to this line
> number (there might be multiple).  In our example the func<float> block
> satisfies the condition (we have line 13 and 19), so we insert breakpoints
> at the PCs mapped to L19.  Similarly, the func<int> block is selected
> (thanks to L13 and L17), and we insert breakpoints at all PCs mapped to
> L17.
> 
> The core of this heuristic is implemented in find_best_le_for_symtab_line.
> 
> There are however some cases which are still problematic, where the
> first line after the target is the first line of the block we are
> looking for.  In such situation the algorithm would find nothing where
> we would want to select 0x11d3 and 0x11ec (both mapping to L13).  In
> order to solve this, the algorithm first looks for the lowest line
> number following the user specified target as this is done before this
> patch (here the first line with associated code following L12 which is
> L13).  Once this is figured out, this new target (L13) is passed to the
> algorithm above (find_best_le_for_symtab_line).
> 
> This is safe to do because we know that there is no line between the user
> target (included) and this new target.  Therefore, whatever
> find_best_le_for_symtab_line returns, it must map to lines numbers above
> (inclusive) the new target.
> 
> Finally, the testsuite showed an interesting regression.  Let's suppose
> we have:
> 
>     1 struct Foo { Foo () {} };
>     2 Foo f1;
>     3 void bar ()
>     4 {
>     5   /* break here.  */
>     6   do_something ();
>     7 }
>     8 Foo f2;
> 
> Inspecting the linetable, we have a code block containing L2 and L8 (so
> over line 5).  If a user was to insert a breakpoint somewhere in L5, the
> algorithm shown above would also insert a breakpoint at L8.  To solve
> this, this patch uses the fact that L2 and L8 are, from dwarf's
> perspective in a artificial block which corresponds to the static
> initialization.  So if among multiple candidates, some are part of a
> virtual function and some are not, we only keep the ones in the non
> virtual function.
> 
> This proposed new algorithm will not solve all possible cases.  For
> example, if the func function is inlined multiple times in the same block
> for different template parameters, the lookup will process all of those as
> part of the same scope.  Solving such situation would require heavier
> works.  I think that this change, while not covering every scenarios is
> still a step in the right direction.
> 
> All feedbacks and thoughts are welcome.
> 
> Regression tested on x86_64 GNU/Linux.
> 
> Change-Id: I8291f9e1a3522934a0f418dfc565cebb86cff730
> ---
>  gdb/linespec.c                                | 93 ++++++++++++++++++-
>  gdb/symtab.c                                  | 45 +++++++++
>  gdb/symtab.h                                  | 14 +++
>  .../gdb.linespec/breakpoint-sliding.cc        | 70 ++++++++++++++
>  .../gdb.linespec/breakpoint-sliding.exp       | 82 ++++++++++++++++
>  5 files changed, 301 insertions(+), 3 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>  create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> 
> diff --git a/gdb/linespec.c b/gdb/linespec.c
> index da9e35c5035..1e8679cd027 100644
> --- a/gdb/linespec.c
> +++ b/gdb/linespec.c
> @@ -2092,9 +2092,42 @@ create_sals_line_offset (struct linespec_state
> *self,
>        if (intermediate_results.empty () && best_entry != NULL)
>  	{
>  	  was_exact = false;
> -	  intermediate_results = decode_digits_ordinary (self, ls,
> -							 best_entry->line,
> -							 &best_entry);
> +	  /* We did not find an exact match for the line we are looking for.
> +	     As a consequence, we are now looking for the next best line.
> +	     However, the next best line does not have to be the same across
> +	     the entire program.  If we consider the following code:
> +
> +	       12 template<typename T>
> +	       13 T select (T a, T b)
> +	       14 {
> +	       15   // Break here.
> +	       16   if constexpr (std::is_integral_t<T>)
> +	       17     return a;
> +	       18   else
> +	       19     return b;
> +	       20 }
> +
> +	     For a function like the above, the next best line is L17 if T is
> +	     an integral type, but should be L19 if T is not.  As a
> +	     consequence we try to search for the next best line independently
> +	     in each code block (a code block being a contiguous section of a
> +	     line table terminated by an entry with line == 0).  */
> +	  for (const auto &elt : ls->file_symtabs)
> +	    {
> +	      program_space *pspace = elt->compunit ()->objfile ()->pspace;
> +	      for (const linetable_entry *le
> +		   : find_best_le_for_symtab_line (elt->linetable (),
> +						   best_entry->line))
> +		{
> +		  symtab_and_line sal;
> +		  sal.pspace = pspace;
> +		  sal.symtab = elt;
> +		  sal.line = le->line;
> +		  sal.explicit_line = true;
> +		  sal.pc = le->pc;
> +		  intermediate_results.push_back (std::move (sal));
> +		}
> +	    }
>  	}
> 
>        /* For optimized code, the compiler can scatter one source line
> @@ -2110,6 +2143,25 @@ create_sals_line_offset (struct linespec_state
> *self,
>        gdb::def_vector<int> filter (intermediate_results.size ());
>        gdb::def_vector<const block *> blocks (intermediate_results.size ());
> 
> +      /* Artificial code blocks can produce false positives.  Consider
> +
> +	   1 struct Foo { Foo () {} };
> +	   2 Foo f1;
> +	   3 void bar ()
> +	   4 {
> +	   5   // break here.
> +	   6   do_something ();
> +	   7 }
> +	   8 Foo f2;
> +
> +	 In this example, if the user asked for a breakpoint in L5, the above
> +	 algorithm will select L6 and L8.  L6 is selected because the we have
> +	 an artificial block (marked by DW_AT_artificial) generated to cover
> +	 the static initializations and this block covers code from line 2 to
> +	 line 8.  To avoid this false positive, reject candidates in artificial
> +	 blocks if we also have candidates in non artificial blocks.  */
> +      bool any_nonartificial = false;
> +
>        for (i = 0; i < intermediate_results.size (); ++i)
>  	{
>  	  set_current_program_space (intermediate_results[i].pspace);
> @@ -2117,6 +2169,13 @@ create_sals_line_offset (struct linespec_state
> *self,
>  	  filter[i] = 1;
>  	  blocks[i] = block_for_pc_sect (intermediate_results[i].pc,
>  					 intermediate_results[i].section);
> +
> +	  if (!any_nonartificial && blocks[i] != nullptr)
> +	    {
> +	      struct symbol *func = block_containing_function (blocks[i]);
> +	      any_nonartificial
> +		= (func != nullptr && !func->is_artificial ());
> +	    }
>  	}
> 
>        for (i = 0; i < intermediate_results.size (); ++i)
> @@ -2171,6 +2230,34 @@ create_sals_line_offset (struct linespec_state
> *self,
>  		&& val.line < sym->line ())
>  	      continue;
> 
> +	    /* If we did not find an exact match, the above heuristic might
> +	       find some false positives.
> +
> +	       Lets consider (with Foo having a non trivial ctor):
> +
> +		    9 struct Foo { Foo () {} };
> +		   10 Foo f1;
> +		   11 int bar ()
> +		   12 {
> +		   13   // Break here.
> +		   14   return 42;
> +		   15 }
> +		   16 Foo f2;
> +
> +		When the user tries to insert a breakpoint it L13, the
> +		algorithm sees a code path from L10 to L16 (in static
> +		initialization), and will consider L16 as a candidate.
> +		However the user clearly intended to break in bar.
> Therefore,
> +		if we see that there is at least 1 breakpoint in a non
> +		artificial function (bar), ignore the hit in the static
> +		initialization.
> +	       */
> +	    if (!was_exact
> +		&& any_nonartificial
> +		&& sym != nullptr
> +		&& sym->is_artificial ())
> +	      continue;
> +
>  	    if (self->funfirstline)
>  	      skip_prologue_sal (sal);
> 
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index 03aa2a96b87..1b03e6fce8e 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -3541,6 +3541,51 @@ find_pcs_for_symtab_line (struct symtab
> *symtab, int line,
>    return result;
>  }
> 
> +/* See symtab.h.  */
> +
> +std::vector<linetable_entry *>
> +find_best_le_for_symtab_line (struct linetable *lt, int line)
> +{
> +  std::vector<linetable_entry *> res;
> +
> +  if (lt == nullptr || line < 0)
> +    return res;
> +
> +  /* Tells if in the current block we found at least one PC mapping to a line
> +     before LINE.  */
> +  bool found_preceding = false;
> +  const int nitems = lt->nitems;
> +  std::vector<linetable_entry *> best_candidates;
> +
> +  for (int i = 0; i < nitems; ++i)
> +    {
> +      struct linetable_entry &item = lt->item[i];
> +      found_preceding
> +	= found_preceding || (item.line > 0 && item.line <= line);
> +
> +      if (item.is_stmt && item.line >= line
> +	  && (best_candidates.empty ()
> +	      || item.line <= best_candidates.front ()->line))
> +	{
> +	  if (!best_candidates.empty ()
> +	      && item.line < best_candidates.front ()->line)
> +	    best_candidates.clear ();
> +	  best_candidates.push_back (&item);
> +	}
> +      else if (item.line == 0)
> +	{
> +	  if (found_preceding)
> +	    res.insert (res.end (), best_candidates.begin (),
> +			best_candidates.end ());
> +
> +	  found_preceding = false;
> +	  best_candidates.clear ();
> +	}
> +    }
> +
> +  return res;
> +}
> +
> 
> 
> 
>  /* Set the PC value for a given source file and line number and return true.
>     Returns false for invalid line number (and sets the PC to 0).
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index ac902a4cbbe..94bea02cb43 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -2624,6 +2624,20 @@ void iterate_over_symtabs (const char *name,
>  std::vector<CORE_ADDR> find_pcs_for_symtab_line
>      (struct symtab *symtab, int line, struct linetable_entry **best_entry);
> 
> +/* Given a linetable, look for entries which can map to line LINE.
> +
> +   The result is the set of entries which:
> +   1) Are part of a block (i.e. a sequence of entries terminated by an entry
> +      with line == 0) containing at least one statement entry with line above
> +      or equal to LINE, and at least one statement entry with line below or
> +      equal to LINE.
> +   2) Are statement entries (is_stmt == 1)
> +   3) Have line equal to the lowest statement line number greater or equal to
> +      LINE within the block they are part of.  */
> +
> +std::vector<linetable_entry *> find_best_le_for_symtab_line
> +    (struct linetable *lt, int line);
> +
>  /* Prototype for callbacks for LA_ITERATE_OVER_SYMBOLS.  The callback
>     is called once per matching symbol SYM.  The callback should return
>     true to indicate that LA_ITERATE_OVER_SYMBOLS should continue
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> new file mode 100644
> index 00000000000..3ae6b563cdf
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
> @@ -0,0 +1,70 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#if __cplusplus >= 201703L
> +#include <type_traits>
> +
> +template<typename T>
> +T
> +foo (T a, T b)
> +{
> +  T ret;
> +  /* In foo break here.  */
> +  if constexpr (std::is_integral_v<T>)
> +    ret = a; /* break in foo<int>.  */
> +  else
> +    ret = b; /* break in foo<NonIntegral>.  */
> +
> +  return ret;
> +}
> +
> +struct NonIntegral
> +{
> +  int m;
> +};
> +
> +#endif
> +
> +struct St
> +{
> +  St ()
> +  { /* Empty.  */ }
> +};
> +
> +St st1;
> +
> +static int
> +between_static_init ()
> +{
> +  /* between_static_init break here.  */
> +  return 42; /* break in between_static_init.  */
> +}
> +
> +St st2;
> +
> +int main ()
> +{
> +
> +#if __cplusplus >= 201703L
> +  foo<int> (1, 2);
> +  foo<NonIntegral> ({ 1 }, { 2 });
> +#endif
> +
> +  between_static_init ();
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> new file mode 100644
> index 00000000000..9b79c419e9b
> --- /dev/null
> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
> @@ -0,0 +1,82 @@
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# This testcase exercises breakpoint sliding when the user tries to set
> +# a breakpoint at a line number which has no code associated to it.  GDB
> +# should then place breakpoint at the first line following the one given
> +# by the user in each code block composing the program space, given that
> +# the code block "covers" the line asked by the user.
> +
> +if { [skip_cplus_tests] } {
> +  return -1
> +}
> +
> +standard_testfile .cc
> +
> +# Part of the test require c++17.  If the used compiler does not support
> +# it, fallback only doing part of the tests.
> +set skip_cpp17 0
> +
> +if {[prepare_for_testing "failed to prepare with c++17" $testfile \
> +			 $srcfile {debug c++ additional_flags=-std=c++17}]} {
> +    set skip_cpp17 1
> +    if {[prepare_for_testing "failed to prepare" $testfile $srcfile \
> +			     {debug c++}]} {
> +	return -1
> +    }
> +}
> +
> +proc_with_prefix test_sliding_per_block {} {
> +    if {$::skip_cpp17} {
> +	return
> +    }
> +
> +    set loc [gdb_get_line_number "In foo break here."]
> +    gdb_test "break $loc" \
> +	"Breakpoint $::decimal.* \\(2 locations\\)" \
> +	"slide to if constexpr"
> +
> +    set loc_in_int_instance [gdb_get_line_number "break in foo<int>"]
> +    set loc_in_struct_instance [gdb_get_line_number "break in
> foo<NonIntegral>"]
> +    set seen_int_pb 0
> +    set seen_struct_bp 0
> +    gdb_test_multiple "info breakpoint" "info breakpoint" {
> +	-re "in foo<int>\\(int, int\\) at
> \[^\n\r\]*$::srcfile:$loc_in_int_instance" {
> +	    set seen_int_pb 1
> +	    verbose -log "found int"
> +	    exp_continue
> +	}
> +	-re "in foo<NonIntegral>\\(NonIntegral, NonIntegral\\) at
> \[^\n\r\]*$::srcfile:$loc_in_struct_instance" {
> +	    set seen_struct_bp 1
> +	    verbose -log "found struct"
> +	    exp_continue
> +	}
> +	-re -wrap "" {
> +	    gdb_assert {$seen_int_pb && $seen_struct_bp} $gdb_test_name
> +	}
> +    }
> +}
> +
> +proc_with_prefix test_ignore_artificial_block {} {
> +    set loc  [gdb_get_line_number "between_static_init break here"]
> +    set expected_loc [gdb_get_line_number "break in between_static_init."]
> +    gdb_test "break $loc" \
> +	"Breakpoint $::decimal at $::hex: file .*/$::srcfile, line
> $expected_loc\\." \
> +	"ignore static block"
> +}
> +
> +clean_restart $::binfile
> +test_sliding_per_block
> +test_ignore_artificial_block
> 
> base-commit: 5fb28d2607a8325559b44a5dc0c8760236c81218
> prerequisite-patch-id: 695c74160156ff2c66b3a88462be22a36c32f141
> --
> 2.25.1

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [PING^2] [PATCH] gdb/linespec: Improve sliding breakpoints
  2022-07-27 13:11 [PING] [PATCH] gdb/linespec: Improve sliding breakpoints Six, Lancelot
@ 2022-08-23 15:01 ` Lancelot SIX
  0 siblings, 0 replies; 2+ messages in thread
From: Lancelot SIX @ 2022-08-23 15:01 UTC (permalink / raw)
  To: gdb-patches; +Cc: lsix

Kindly pinging for thoughts and feedbacks.

Best,
Lancelot.

On 27/07/2022 14:11, Six, Lancelot wrote:
> [Public]
> 
> Hi
> 
> I was wondering if anyone has comments and or opinions about the change proposed in this patch.
> 
> Best,
> Lancelot.
> 
>> -----Original Message-----
>> From: Six, Lancelot <Lancelot.Six@amd.com>
>> Sent: 17 June 2022 17:08
>> To: gdb-patches@sourceware.org
>> Cc: lsix@lancelotsix.com; Six, Lancelot <Lancelot.Six@amd.com>
>> Subject: [PATCH] gdb/linespec: Improve sliding breakpoints
>>
>> Hi,
>>
>> Here is proposition to slightly update the breakpoint sliding algorithm
>> in GDB.
>>
>> This patch must be applied on top of
>> https://sourceware.org/pipermail/gdb-patches/2022-April/187686.html
>> which was submitted by Simon Marchi a couple of months.  This needs
>> small updates as the BLOCK_ENTRY_PC and SYMBOL_BLOCK_VALUE are no
>> longer used.  As a consequence
>>
>>    BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (sym))
>>
>> should be replaced with
>>
>>    sym->value_block ()->entry_pc ()
>>
>> This patch have been regression tested on x86_64-Linux.  All comments
>> and feedbacks are welcome.
>>
>> Best,
>> Lancelot.
>>
>> ---
>>
>> The algorithm used to place a breakpoint by line approximately goes as
>> follow:
>>
>> - Look for all the PCs which maps directly to the desired line
>> - If any PC is found, place a breakpoint at each found PC.
>> - If no PC maps to the desired line:
>>    - Search for the lower line number in the line table which is higher
>>      than the desired line, and has some object code associated to it.
>>    - List all PCs which maps to this line number.
>>    - Place a breakpoint at each PC in the list above.
>>
>> There are situations where this heuristic is not sufficient.  If we
>> consider the following c++ code:
>>
>>      12 template<typename T>
>>      13 T func (T a, T b)
>>      14 {
>>      15   /* Break here.  */
>>      16   if constexpr (std::is_integral_t<T>)
>>      17     return a;
>>      18   else
>>      19     return b;
>>      20 }
>>
>> Given the current algorithm, if the user tries to set a breakpoint at
>> line 15, here is what happens:
>>
>> - Search PCs mapping to L15, there is none ;
>> - Search for the first line number above L16 with code, this is L17 ;
>> - List all PCs mapping to L17 and a breakpoints at insert a breakpoint at
>>    each ;
>>
>> Now lets suppose that this templated function has been instantiated both
>> for an integral type (int) and non-integral type (float).  If we follow
>> the algorithm, no breakpoint will ever be inserted in func<float>,
>> while there will be a breakpoint in func<int>.
>>
>> I came across a less obvious variation of this case where an inlined
>> function had different optimization opportunities for different instances.
>> As a consequence, some instances had the first actual line of code
>> different than the other, making the breakpoint be missed and the user
>> confused.  To add to the confusion, the instances which have code mapped
>> to the lowest line above the user specified linespec only appeared in a
>> shared library conditionally loaded, making the breakpoint stop depend on
>> whether or not the said shared library was loaded or not.
>>
>> This patch is a proposition to adapt the algorithm in order to cover the
>> scenario exposed above.  To do so, this patch ends up scanning
>> independently for the best candidate in the various blocks composing
>> the line table structures (a block being a succession of
>> linetable_entries terminated by an entry with line == 0).
>>
>> The line table for the example program above looks something like:
>>
>>      (gdb) maintenance info line-table missed
>>      symtab: /[...]/missed_pb.cc ((struct symtab *) 0x6210000f0500)
>>      linetable: ((struct linetable *) 0x6210001a1220):
>>      INDEX  LINE   ADDRESS            IS-STMT PROLOGUE-END
>>      ...
>>      17     END    0x00000000000011d3 Y
>>      18     13     0x00000000000011d3 Y                    <-- func<float>
>>      19     19     0x00000000000011e5 Y
>>      20     20     0x00000000000011ea Y
>>      21     END    0x00000000000011ec Y
>>      22     13     0x00000000000011ec Y                    <-- func<int>
>>      23     17     0x00000000000011fa Y
>>      24     20     0x00000000000011fd Y
>>      25     END    0x00000000000011ff Y
>>
>> The first block listed (starting at 0x11d3) is func<float>, and the
>> second starting at 0x11ec is func<int>.
>>
>> The proposed algorithm analyzes each block independently.  The first step
>> is to decide if a block contains the line the user is interested in (a.k.a
>> the target).  In our example, this is L15.  A block is considered to match
>> if it contains at least one instruction mapped to a line before the target
>> and at least one instruction mapped to a line after the target.  In other
>> words this means that will select all blocks where:
>>
>> - there is at least one instruction mapped to a line L such as L <= target
>> - there is at least one instruction mapped to a line M such as M >= target
>>
>> For each block which satisfies this condition, we select the lowest line
>> number above our target, and select all instructions mapping to this line
>> number (there might be multiple).  In our example the func<float> block
>> satisfies the condition (we have line 13 and 19), so we insert breakpoints
>> at the PCs mapped to L19.  Similarly, the func<int> block is selected
>> (thanks to L13 and L17), and we insert breakpoints at all PCs mapped to
>> L17.
>>
>> The core of this heuristic is implemented in find_best_le_for_symtab_line.
>>
>> There are however some cases which are still problematic, where the
>> first line after the target is the first line of the block we are
>> looking for.  In such situation the algorithm would find nothing where
>> we would want to select 0x11d3 and 0x11ec (both mapping to L13).  In
>> order to solve this, the algorithm first looks for the lowest line
>> number following the user specified target as this is done before this
>> patch (here the first line with associated code following L12 which is
>> L13).  Once this is figured out, this new target (L13) is passed to the
>> algorithm above (find_best_le_for_symtab_line).
>>
>> This is safe to do because we know that there is no line between the user
>> target (included) and this new target.  Therefore, whatever
>> find_best_le_for_symtab_line returns, it must map to lines numbers above
>> (inclusive) the new target.
>>
>> Finally, the testsuite showed an interesting regression.  Let's suppose
>> we have:
>>
>>      1 struct Foo { Foo () {} };
>>      2 Foo f1;
>>      3 void bar ()
>>      4 {
>>      5   /* break here.  */
>>      6   do_something ();
>>      7 }
>>      8 Foo f2;
>>
>> Inspecting the linetable, we have a code block containing L2 and L8 (so
>> over line 5).  If a user was to insert a breakpoint somewhere in L5, the
>> algorithm shown above would also insert a breakpoint at L8.  To solve
>> this, this patch uses the fact that L2 and L8 are, from dwarf's
>> perspective in a artificial block which corresponds to the static
>> initialization.  So if among multiple candidates, some are part of a
>> virtual function and some are not, we only keep the ones in the non
>> virtual function.
>>
>> This proposed new algorithm will not solve all possible cases.  For
>> example, if the func function is inlined multiple times in the same block
>> for different template parameters, the lookup will process all of those as
>> part of the same scope.  Solving such situation would require heavier
>> works.  I think that this change, while not covering every scenarios is
>> still a step in the right direction.
>>
>> All feedbacks and thoughts are welcome.
>>
>> Regression tested on x86_64 GNU/Linux.
>>
>> Change-Id: I8291f9e1a3522934a0f418dfc565cebb86cff730
>> ---
>>   gdb/linespec.c                                | 93 ++++++++++++++++++-
>>   gdb/symtab.c                                  | 45 +++++++++
>>   gdb/symtab.h                                  | 14 +++
>>   .../gdb.linespec/breakpoint-sliding.cc        | 70 ++++++++++++++
>>   .../gdb.linespec/breakpoint-sliding.exp       | 82 ++++++++++++++++
>>   5 files changed, 301 insertions(+), 3 deletions(-)
>>   create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>>   create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>>
>> diff --git a/gdb/linespec.c b/gdb/linespec.c
>> index da9e35c5035..1e8679cd027 100644
>> --- a/gdb/linespec.c
>> +++ b/gdb/linespec.c
>> @@ -2092,9 +2092,42 @@ create_sals_line_offset (struct linespec_state
>> *self,
>>         if (intermediate_results.empty () && best_entry != NULL)
>>   	{
>>   	  was_exact = false;
>> -	  intermediate_results = decode_digits_ordinary (self, ls,
>> -							 best_entry->line,
>> -							 &best_entry);
>> +	  /* We did not find an exact match for the line we are looking for.
>> +	     As a consequence, we are now looking for the next best line.
>> +	     However, the next best line does not have to be the same across
>> +	     the entire program.  If we consider the following code:
>> +
>> +	       12 template<typename T>
>> +	       13 T select (T a, T b)
>> +	       14 {
>> +	       15   // Break here.
>> +	       16   if constexpr (std::is_integral_t<T>)
>> +	       17     return a;
>> +	       18   else
>> +	       19     return b;
>> +	       20 }
>> +
>> +	     For a function like the above, the next best line is L17 if T is
>> +	     an integral type, but should be L19 if T is not.  As a
>> +	     consequence we try to search for the next best line independently
>> +	     in each code block (a code block being a contiguous section of a
>> +	     line table terminated by an entry with line == 0).  */
>> +	  for (const auto &elt : ls->file_symtabs)
>> +	    {
>> +	      program_space *pspace = elt->compunit ()->objfile ()->pspace;
>> +	      for (const linetable_entry *le
>> +		   : find_best_le_for_symtab_line (elt->linetable (),
>> +						   best_entry->line))
>> +		{
>> +		  symtab_and_line sal;
>> +		  sal.pspace = pspace;
>> +		  sal.symtab = elt;
>> +		  sal.line = le->line;
>> +		  sal.explicit_line = true;
>> +		  sal.pc = le->pc;
>> +		  intermediate_results.push_back (std::move (sal));
>> +		}
>> +	    }
>>   	}
>>
>>         /* For optimized code, the compiler can scatter one source line
>> @@ -2110,6 +2143,25 @@ create_sals_line_offset (struct linespec_state
>> *self,
>>         gdb::def_vector<int> filter (intermediate_results.size ());
>>         gdb::def_vector<const block *> blocks (intermediate_results.size ());
>>
>> +      /* Artificial code blocks can produce false positives.  Consider
>> +
>> +	   1 struct Foo { Foo () {} };
>> +	   2 Foo f1;
>> +	   3 void bar ()
>> +	   4 {
>> +	   5   // break here.
>> +	   6   do_something ();
>> +	   7 }
>> +	   8 Foo f2;
>> +
>> +	 In this example, if the user asked for a breakpoint in L5, the above
>> +	 algorithm will select L6 and L8.  L6 is selected because the we have
>> +	 an artificial block (marked by DW_AT_artificial) generated to cover
>> +	 the static initializations and this block covers code from line 2 to
>> +	 line 8.  To avoid this false positive, reject candidates in artificial
>> +	 blocks if we also have candidates in non artificial blocks.  */
>> +      bool any_nonartificial = false;
>> +
>>         for (i = 0; i < intermediate_results.size (); ++i)
>>   	{
>>   	  set_current_program_space (intermediate_results[i].pspace);
>> @@ -2117,6 +2169,13 @@ create_sals_line_offset (struct linespec_state
>> *self,
>>   	  filter[i] = 1;
>>   	  blocks[i] = block_for_pc_sect (intermediate_results[i].pc,
>>   					 intermediate_results[i].section);
>> +
>> +	  if (!any_nonartificial && blocks[i] != nullptr)
>> +	    {
>> +	      struct symbol *func = block_containing_function (blocks[i]);
>> +	      any_nonartificial
>> +		= (func != nullptr && !func->is_artificial ());
>> +	    }
>>   	}
>>
>>         for (i = 0; i < intermediate_results.size (); ++i)
>> @@ -2171,6 +2230,34 @@ create_sals_line_offset (struct linespec_state
>> *self,
>>   		&& val.line < sym->line ())
>>   	      continue;
>>
>> +	    /* If we did not find an exact match, the above heuristic might
>> +	       find some false positives.
>> +
>> +	       Lets consider (with Foo having a non trivial ctor):
>> +
>> +		    9 struct Foo { Foo () {} };
>> +		   10 Foo f1;
>> +		   11 int bar ()
>> +		   12 {
>> +		   13   // Break here.
>> +		   14   return 42;
>> +		   15 }
>> +		   16 Foo f2;
>> +
>> +		When the user tries to insert a breakpoint it L13, the
>> +		algorithm sees a code path from L10 to L16 (in static
>> +		initialization), and will consider L16 as a candidate.
>> +		However the user clearly intended to break in bar.
>> Therefore,
>> +		if we see that there is at least 1 breakpoint in a non
>> +		artificial function (bar), ignore the hit in the static
>> +		initialization.
>> +	       */
>> +	    if (!was_exact
>> +		&& any_nonartificial
>> +		&& sym != nullptr
>> +		&& sym->is_artificial ())
>> +	      continue;
>> +
>>   	    if (self->funfirstline)
>>   	      skip_prologue_sal (sal);
>>
>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>> index 03aa2a96b87..1b03e6fce8e 100644
>> --- a/gdb/symtab.c
>> +++ b/gdb/symtab.c
>> @@ -3541,6 +3541,51 @@ find_pcs_for_symtab_line (struct symtab
>> *symtab, int line,
>>     return result;
>>   }
>>
>> +/* See symtab.h.  */
>> +
>> +std::vector<linetable_entry *>
>> +find_best_le_for_symtab_line (struct linetable *lt, int line)
>> +{
>> +  std::vector<linetable_entry *> res;
>> +
>> +  if (lt == nullptr || line < 0)
>> +    return res;
>> +
>> +  /* Tells if in the current block we found at least one PC mapping to a line
>> +     before LINE.  */
>> +  bool found_preceding = false;
>> +  const int nitems = lt->nitems;
>> +  std::vector<linetable_entry *> best_candidates;
>> +
>> +  for (int i = 0; i < nitems; ++i)
>> +    {
>> +      struct linetable_entry &item = lt->item[i];
>> +      found_preceding
>> +	= found_preceding || (item.line > 0 && item.line <= line);
>> +
>> +      if (item.is_stmt && item.line >= line
>> +	  && (best_candidates.empty ()
>> +	      || item.line <= best_candidates.front ()->line))
>> +	{
>> +	  if (!best_candidates.empty ()
>> +	      && item.line < best_candidates.front ()->line)
>> +	    best_candidates.clear ();
>> +	  best_candidates.push_back (&item);
>> +	}
>> +      else if (item.line == 0)
>> +	{
>> +	  if (found_preceding)
>> +	    res.insert (res.end (), best_candidates.begin (),
>> +			best_candidates.end ());
>> +
>> +	  found_preceding = false;
>> +	  best_candidates.clear ();
>> +	}
>> +    }
>> +
>> +  return res;
>> +}
>> +
>>
>>
>>
>>   /* Set the PC value for a given source file and line number and return true.
>>      Returns false for invalid line number (and sets the PC to 0).
>> diff --git a/gdb/symtab.h b/gdb/symtab.h
>> index ac902a4cbbe..94bea02cb43 100644
>> --- a/gdb/symtab.h
>> +++ b/gdb/symtab.h
>> @@ -2624,6 +2624,20 @@ void iterate_over_symtabs (const char *name,
>>   std::vector<CORE_ADDR> find_pcs_for_symtab_line
>>       (struct symtab *symtab, int line, struct linetable_entry **best_entry);
>>
>> +/* Given a linetable, look for entries which can map to line LINE.
>> +
>> +   The result is the set of entries which:
>> +   1) Are part of a block (i.e. a sequence of entries terminated by an entry
>> +      with line == 0) containing at least one statement entry with line above
>> +      or equal to LINE, and at least one statement entry with line below or
>> +      equal to LINE.
>> +   2) Are statement entries (is_stmt == 1)
>> +   3) Have line equal to the lowest statement line number greater or equal to
>> +      LINE within the block they are part of.  */
>> +
>> +std::vector<linetable_entry *> find_best_le_for_symtab_line
>> +    (struct linetable *lt, int line);
>> +
>>   /* Prototype for callbacks for LA_ITERATE_OVER_SYMBOLS.  The callback
>>      is called once per matching symbol SYM.  The callback should return
>>      true to indicate that LA_ITERATE_OVER_SYMBOLS should continue
>> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> new file mode 100644
>> index 00000000000..3ae6b563cdf
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> @@ -0,0 +1,70 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> +   Copyright (C) 2022 Free Software Foundation, Inc.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; either version 3 of the License, or
>> +   (at your option) any later version.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>> +
>> +#if __cplusplus >= 201703L
>> +#include <type_traits>
>> +
>> +template<typename T>
>> +T
>> +foo (T a, T b)
>> +{
>> +  T ret;
>> +  /* In foo break here.  */
>> +  if constexpr (std::is_integral_v<T>)
>> +    ret = a; /* break in foo<int>.  */
>> +  else
>> +    ret = b; /* break in foo<NonIntegral>.  */
>> +
>> +  return ret;
>> +}
>> +
>> +struct NonIntegral
>> +{
>> +  int m;
>> +};
>> +
>> +#endif
>> +
>> +struct St
>> +{
>> +  St ()
>> +  { /* Empty.  */ }
>> +};
>> +
>> +St st1;
>> +
>> +static int
>> +between_static_init ()
>> +{
>> +  /* between_static_init break here.  */
>> +  return 42; /* break in between_static_init.  */
>> +}
>> +
>> +St st2;
>> +
>> +int main ()
>> +{
>> +
>> +#if __cplusplus >= 201703L
>> +  foo<int> (1, 2);
>> +  foo<NonIntegral> ({ 1 }, { 2 });
>> +#endif
>> +
>> +  between_static_init ();
>> +
>> +  return 0;
>> +}
>> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> new file mode 100644
>> index 00000000000..9b79c419e9b
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> @@ -0,0 +1,82 @@
>> +# Copyright (C) 2022 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> +
>> +# This testcase exercises breakpoint sliding when the user tries to set
>> +# a breakpoint at a line number which has no code associated to it.  GDB
>> +# should then place breakpoint at the first line following the one given
>> +# by the user in each code block composing the program space, given that
>> +# the code block "covers" the line asked by the user.
>> +
>> +if { [skip_cplus_tests] } {
>> +  return -1
>> +}
>> +
>> +standard_testfile .cc
>> +
>> +# Part of the test require c++17.  If the used compiler does not support
>> +# it, fallback only doing part of the tests.
>> +set skip_cpp17 0
>> +
>> +if {[prepare_for_testing "failed to prepare with c++17" $testfile \
>> +			 $srcfile {debug c++ additional_flags=-std=c++17}]} {
>> +    set skip_cpp17 1
>> +    if {[prepare_for_testing "failed to prepare" $testfile $srcfile \
>> +			     {debug c++}]} {
>> +	return -1
>> +    }
>> +}
>> +
>> +proc_with_prefix test_sliding_per_block {} {
>> +    if {$::skip_cpp17} {
>> +	return
>> +    }
>> +
>> +    set loc [gdb_get_line_number "In foo break here."]
>> +    gdb_test "break $loc" \
>> +	"Breakpoint $::decimal.* \\(2 locations\\)" \
>> +	"slide to if constexpr"
>> +
>> +    set loc_in_int_instance [gdb_get_line_number "break in foo<int>"]
>> +    set loc_in_struct_instance [gdb_get_line_number "break in
>> foo<NonIntegral>"]
>> +    set seen_int_pb 0
>> +    set seen_struct_bp 0
>> +    gdb_test_multiple "info breakpoint" "info breakpoint" {
>> +	-re "in foo<int>\\(int, int\\) at
>> \[^\n\r\]*$::srcfile:$loc_in_int_instance" {
>> +	    set seen_int_pb 1
>> +	    verbose -log "found int"
>> +	    exp_continue
>> +	}
>> +	-re "in foo<NonIntegral>\\(NonIntegral, NonIntegral\\) at
>> \[^\n\r\]*$::srcfile:$loc_in_struct_instance" {
>> +	    set seen_struct_bp 1
>> +	    verbose -log "found struct"
>> +	    exp_continue
>> +	}
>> +	-re -wrap "" {
>> +	    gdb_assert {$seen_int_pb && $seen_struct_bp} $gdb_test_name
>> +	}
>> +    }
>> +}
>> +
>> +proc_with_prefix test_ignore_artificial_block {} {
>> +    set loc  [gdb_get_line_number "between_static_init break here"]
>> +    set expected_loc [gdb_get_line_number "break in between_static_init."]
>> +    gdb_test "break $loc" \
>> +	"Breakpoint $::decimal at $::hex: file .*/$::srcfile, line
>> $expected_loc\\." \
>> +	"ignore static block"
>> +}
>> +
>> +clean_restart $::binfile
>> +test_sliding_per_block
>> +test_ignore_artificial_block
>>
>> base-commit: 5fb28d2607a8325559b44a5dc0c8760236c81218
>> prerequisite-patch-id: 695c74160156ff2c66b3a88462be22a36c32f141
>> --
>> 2.25.1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-08-23 15:01 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27 13:11 [PING] [PATCH] gdb/linespec: Improve sliding breakpoints Six, Lancelot
2022-08-23 15:01 ` [PING^2] " Lancelot SIX

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).