From: Lancelot SIX <Lancelot.Six@amd.com>
To: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Cc: "lsix@lancelotsix.com" <lsix@lancelotsix.com>
Subject: [PING^2] [PATCH] gdb/linespec: Improve sliding breakpoints
Date: Tue, 23 Aug 2022 16:01:05 +0100 [thread overview]
Message-ID: <aa7c8306-a2ff-1753-57ef-0ca88bb38b2a@amd.com> (raw)
In-Reply-To: <DM4PR12MB574582AEF39E7C92D94DE57B83979@DM4PR12MB5745.namprd12.prod.outlook.com>
Kindly pinging for thoughts and feedbacks.
Best,
Lancelot.
On 27/07/2022 14:11, Six, Lancelot wrote:
> [Public]
>
> Hi
>
> I was wondering if anyone has comments and or opinions about the change proposed in this patch.
>
> Best,
> Lancelot.
>
>> -----Original Message-----
>> From: Six, Lancelot <Lancelot.Six@amd.com>
>> Sent: 17 June 2022 17:08
>> To: gdb-patches@sourceware.org
>> Cc: lsix@lancelotsix.com; Six, Lancelot <Lancelot.Six@amd.com>
>> Subject: [PATCH] gdb/linespec: Improve sliding breakpoints
>>
>> Hi,
>>
>> Here is proposition to slightly update the breakpoint sliding algorithm
>> in GDB.
>>
>> This patch must be applied on top of
>> https://sourceware.org/pipermail/gdb-patches/2022-April/187686.html
>> which was submitted by Simon Marchi a couple of months. This needs
>> small updates as the BLOCK_ENTRY_PC and SYMBOL_BLOCK_VALUE are no
>> longer used. As a consequence
>>
>> BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (sym))
>>
>> should be replaced with
>>
>> sym->value_block ()->entry_pc ()
>>
>> This patch have been regression tested on x86_64-Linux. All comments
>> and feedbacks are welcome.
>>
>> Best,
>> Lancelot.
>>
>> ---
>>
>> The algorithm used to place a breakpoint by line approximately goes as
>> follow:
>>
>> - Look for all the PCs which maps directly to the desired line
>> - If any PC is found, place a breakpoint at each found PC.
>> - If no PC maps to the desired line:
>> - Search for the lower line number in the line table which is higher
>> than the desired line, and has some object code associated to it.
>> - List all PCs which maps to this line number.
>> - Place a breakpoint at each PC in the list above.
>>
>> There are situations where this heuristic is not sufficient. If we
>> consider the following c++ code:
>>
>> 12 template<typename T>
>> 13 T func (T a, T b)
>> 14 {
>> 15 /* Break here. */
>> 16 if constexpr (std::is_integral_t<T>)
>> 17 return a;
>> 18 else
>> 19 return b;
>> 20 }
>>
>> Given the current algorithm, if the user tries to set a breakpoint at
>> line 15, here is what happens:
>>
>> - Search PCs mapping to L15, there is none ;
>> - Search for the first line number above L16 with code, this is L17 ;
>> - List all PCs mapping to L17 and a breakpoints at insert a breakpoint at
>> each ;
>>
>> Now lets suppose that this templated function has been instantiated both
>> for an integral type (int) and non-integral type (float). If we follow
>> the algorithm, no breakpoint will ever be inserted in func<float>,
>> while there will be a breakpoint in func<int>.
>>
>> I came across a less obvious variation of this case where an inlined
>> function had different optimization opportunities for different instances.
>> As a consequence, some instances had the first actual line of code
>> different than the other, making the breakpoint be missed and the user
>> confused. To add to the confusion, the instances which have code mapped
>> to the lowest line above the user specified linespec only appeared in a
>> shared library conditionally loaded, making the breakpoint stop depend on
>> whether or not the said shared library was loaded or not.
>>
>> This patch is a proposition to adapt the algorithm in order to cover the
>> scenario exposed above. To do so, this patch ends up scanning
>> independently for the best candidate in the various blocks composing
>> the line table structures (a block being a succession of
>> linetable_entries terminated by an entry with line == 0).
>>
>> The line table for the example program above looks something like:
>>
>> (gdb) maintenance info line-table missed
>> symtab: /[...]/missed_pb.cc ((struct symtab *) 0x6210000f0500)
>> linetable: ((struct linetable *) 0x6210001a1220):
>> INDEX LINE ADDRESS IS-STMT PROLOGUE-END
>> ...
>> 17 END 0x00000000000011d3 Y
>> 18 13 0x00000000000011d3 Y <-- func<float>
>> 19 19 0x00000000000011e5 Y
>> 20 20 0x00000000000011ea Y
>> 21 END 0x00000000000011ec Y
>> 22 13 0x00000000000011ec Y <-- func<int>
>> 23 17 0x00000000000011fa Y
>> 24 20 0x00000000000011fd Y
>> 25 END 0x00000000000011ff Y
>>
>> The first block listed (starting at 0x11d3) is func<float>, and the
>> second starting at 0x11ec is func<int>.
>>
>> The proposed algorithm analyzes each block independently. The first step
>> is to decide if a block contains the line the user is interested in (a.k.a
>> the target). In our example, this is L15. A block is considered to match
>> if it contains at least one instruction mapped to a line before the target
>> and at least one instruction mapped to a line after the target. In other
>> words this means that will select all blocks where:
>>
>> - there is at least one instruction mapped to a line L such as L <= target
>> - there is at least one instruction mapped to a line M such as M >= target
>>
>> For each block which satisfies this condition, we select the lowest line
>> number above our target, and select all instructions mapping to this line
>> number (there might be multiple). In our example the func<float> block
>> satisfies the condition (we have line 13 and 19), so we insert breakpoints
>> at the PCs mapped to L19. Similarly, the func<int> block is selected
>> (thanks to L13 and L17), and we insert breakpoints at all PCs mapped to
>> L17.
>>
>> The core of this heuristic is implemented in find_best_le_for_symtab_line.
>>
>> There are however some cases which are still problematic, where the
>> first line after the target is the first line of the block we are
>> looking for. In such situation the algorithm would find nothing where
>> we would want to select 0x11d3 and 0x11ec (both mapping to L13). In
>> order to solve this, the algorithm first looks for the lowest line
>> number following the user specified target as this is done before this
>> patch (here the first line with associated code following L12 which is
>> L13). Once this is figured out, this new target (L13) is passed to the
>> algorithm above (find_best_le_for_symtab_line).
>>
>> This is safe to do because we know that there is no line between the user
>> target (included) and this new target. Therefore, whatever
>> find_best_le_for_symtab_line returns, it must map to lines numbers above
>> (inclusive) the new target.
>>
>> Finally, the testsuite showed an interesting regression. Let's suppose
>> we have:
>>
>> 1 struct Foo { Foo () {} };
>> 2 Foo f1;
>> 3 void bar ()
>> 4 {
>> 5 /* break here. */
>> 6 do_something ();
>> 7 }
>> 8 Foo f2;
>>
>> Inspecting the linetable, we have a code block containing L2 and L8 (so
>> over line 5). If a user was to insert a breakpoint somewhere in L5, the
>> algorithm shown above would also insert a breakpoint at L8. To solve
>> this, this patch uses the fact that L2 and L8 are, from dwarf's
>> perspective in a artificial block which corresponds to the static
>> initialization. So if among multiple candidates, some are part of a
>> virtual function and some are not, we only keep the ones in the non
>> virtual function.
>>
>> This proposed new algorithm will not solve all possible cases. For
>> example, if the func function is inlined multiple times in the same block
>> for different template parameters, the lookup will process all of those as
>> part of the same scope. Solving such situation would require heavier
>> works. I think that this change, while not covering every scenarios is
>> still a step in the right direction.
>>
>> All feedbacks and thoughts are welcome.
>>
>> Regression tested on x86_64 GNU/Linux.
>>
>> Change-Id: I8291f9e1a3522934a0f418dfc565cebb86cff730
>> ---
>> gdb/linespec.c | 93 ++++++++++++++++++-
>> gdb/symtab.c | 45 +++++++++
>> gdb/symtab.h | 14 +++
>> .../gdb.linespec/breakpoint-sliding.cc | 70 ++++++++++++++
>> .../gdb.linespec/breakpoint-sliding.exp | 82 ++++++++++++++++
>> 5 files changed, 301 insertions(+), 3 deletions(-)
>> create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> create mode 100644 gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>>
>> diff --git a/gdb/linespec.c b/gdb/linespec.c
>> index da9e35c5035..1e8679cd027 100644
>> --- a/gdb/linespec.c
>> +++ b/gdb/linespec.c
>> @@ -2092,9 +2092,42 @@ create_sals_line_offset (struct linespec_state
>> *self,
>> if (intermediate_results.empty () && best_entry != NULL)
>> {
>> was_exact = false;
>> - intermediate_results = decode_digits_ordinary (self, ls,
>> - best_entry->line,
>> - &best_entry);
>> + /* We did not find an exact match for the line we are looking for.
>> + As a consequence, we are now looking for the next best line.
>> + However, the next best line does not have to be the same across
>> + the entire program. If we consider the following code:
>> +
>> + 12 template<typename T>
>> + 13 T select (T a, T b)
>> + 14 {
>> + 15 // Break here.
>> + 16 if constexpr (std::is_integral_t<T>)
>> + 17 return a;
>> + 18 else
>> + 19 return b;
>> + 20 }
>> +
>> + For a function like the above, the next best line is L17 if T is
>> + an integral type, but should be L19 if T is not. As a
>> + consequence we try to search for the next best line independently
>> + in each code block (a code block being a contiguous section of a
>> + line table terminated by an entry with line == 0). */
>> + for (const auto &elt : ls->file_symtabs)
>> + {
>> + program_space *pspace = elt->compunit ()->objfile ()->pspace;
>> + for (const linetable_entry *le
>> + : find_best_le_for_symtab_line (elt->linetable (),
>> + best_entry->line))
>> + {
>> + symtab_and_line sal;
>> + sal.pspace = pspace;
>> + sal.symtab = elt;
>> + sal.line = le->line;
>> + sal.explicit_line = true;
>> + sal.pc = le->pc;
>> + intermediate_results.push_back (std::move (sal));
>> + }
>> + }
>> }
>>
>> /* For optimized code, the compiler can scatter one source line
>> @@ -2110,6 +2143,25 @@ create_sals_line_offset (struct linespec_state
>> *self,
>> gdb::def_vector<int> filter (intermediate_results.size ());
>> gdb::def_vector<const block *> blocks (intermediate_results.size ());
>>
>> + /* Artificial code blocks can produce false positives. Consider
>> +
>> + 1 struct Foo { Foo () {} };
>> + 2 Foo f1;
>> + 3 void bar ()
>> + 4 {
>> + 5 // break here.
>> + 6 do_something ();
>> + 7 }
>> + 8 Foo f2;
>> +
>> + In this example, if the user asked for a breakpoint in L5, the above
>> + algorithm will select L6 and L8. L6 is selected because the we have
>> + an artificial block (marked by DW_AT_artificial) generated to cover
>> + the static initializations and this block covers code from line 2 to
>> + line 8. To avoid this false positive, reject candidates in artificial
>> + blocks if we also have candidates in non artificial blocks. */
>> + bool any_nonartificial = false;
>> +
>> for (i = 0; i < intermediate_results.size (); ++i)
>> {
>> set_current_program_space (intermediate_results[i].pspace);
>> @@ -2117,6 +2169,13 @@ create_sals_line_offset (struct linespec_state
>> *self,
>> filter[i] = 1;
>> blocks[i] = block_for_pc_sect (intermediate_results[i].pc,
>> intermediate_results[i].section);
>> +
>> + if (!any_nonartificial && blocks[i] != nullptr)
>> + {
>> + struct symbol *func = block_containing_function (blocks[i]);
>> + any_nonartificial
>> + = (func != nullptr && !func->is_artificial ());
>> + }
>> }
>>
>> for (i = 0; i < intermediate_results.size (); ++i)
>> @@ -2171,6 +2230,34 @@ create_sals_line_offset (struct linespec_state
>> *self,
>> && val.line < sym->line ())
>> continue;
>>
>> + /* If we did not find an exact match, the above heuristic might
>> + find some false positives.
>> +
>> + Lets consider (with Foo having a non trivial ctor):
>> +
>> + 9 struct Foo { Foo () {} };
>> + 10 Foo f1;
>> + 11 int bar ()
>> + 12 {
>> + 13 // Break here.
>> + 14 return 42;
>> + 15 }
>> + 16 Foo f2;
>> +
>> + When the user tries to insert a breakpoint it L13, the
>> + algorithm sees a code path from L10 to L16 (in static
>> + initialization), and will consider L16 as a candidate.
>> + However the user clearly intended to break in bar.
>> Therefore,
>> + if we see that there is at least 1 breakpoint in a non
>> + artificial function (bar), ignore the hit in the static
>> + initialization.
>> + */
>> + if (!was_exact
>> + && any_nonartificial
>> + && sym != nullptr
>> + && sym->is_artificial ())
>> + continue;
>> +
>> if (self->funfirstline)
>> skip_prologue_sal (sal);
>>
>> diff --git a/gdb/symtab.c b/gdb/symtab.c
>> index 03aa2a96b87..1b03e6fce8e 100644
>> --- a/gdb/symtab.c
>> +++ b/gdb/symtab.c
>> @@ -3541,6 +3541,51 @@ find_pcs_for_symtab_line (struct symtab
>> *symtab, int line,
>> return result;
>> }
>>
>> +/* See symtab.h. */
>> +
>> +std::vector<linetable_entry *>
>> +find_best_le_for_symtab_line (struct linetable *lt, int line)
>> +{
>> + std::vector<linetable_entry *> res;
>> +
>> + if (lt == nullptr || line < 0)
>> + return res;
>> +
>> + /* Tells if in the current block we found at least one PC mapping to a line
>> + before LINE. */
>> + bool found_preceding = false;
>> + const int nitems = lt->nitems;
>> + std::vector<linetable_entry *> best_candidates;
>> +
>> + for (int i = 0; i < nitems; ++i)
>> + {
>> + struct linetable_entry &item = lt->item[i];
>> + found_preceding
>> + = found_preceding || (item.line > 0 && item.line <= line);
>> +
>> + if (item.is_stmt && item.line >= line
>> + && (best_candidates.empty ()
>> + || item.line <= best_candidates.front ()->line))
>> + {
>> + if (!best_candidates.empty ()
>> + && item.line < best_candidates.front ()->line)
>> + best_candidates.clear ();
>> + best_candidates.push_back (&item);
>> + }
>> + else if (item.line == 0)
>> + {
>> + if (found_preceding)
>> + res.insert (res.end (), best_candidates.begin (),
>> + best_candidates.end ());
>> +
>> + found_preceding = false;
>> + best_candidates.clear ();
>> + }
>> + }
>> +
>> + return res;
>> +}
>> +
>>
>>
>>
>> /* Set the PC value for a given source file and line number and return true.
>> Returns false for invalid line number (and sets the PC to 0).
>> diff --git a/gdb/symtab.h b/gdb/symtab.h
>> index ac902a4cbbe..94bea02cb43 100644
>> --- a/gdb/symtab.h
>> +++ b/gdb/symtab.h
>> @@ -2624,6 +2624,20 @@ void iterate_over_symtabs (const char *name,
>> std::vector<CORE_ADDR> find_pcs_for_symtab_line
>> (struct symtab *symtab, int line, struct linetable_entry **best_entry);
>>
>> +/* Given a linetable, look for entries which can map to line LINE.
>> +
>> + The result is the set of entries which:
>> + 1) Are part of a block (i.e. a sequence of entries terminated by an entry
>> + with line == 0) containing at least one statement entry with line above
>> + or equal to LINE, and at least one statement entry with line below or
>> + equal to LINE.
>> + 2) Are statement entries (is_stmt == 1)
>> + 3) Have line equal to the lowest statement line number greater or equal to
>> + LINE within the block they are part of. */
>> +
>> +std::vector<linetable_entry *> find_best_le_for_symtab_line
>> + (struct linetable *lt, int line);
>> +
>> /* Prototype for callbacks for LA_ITERATE_OVER_SYMBOLS. The callback
>> is called once per matching symbol SYM. The callback should return
>> true to indicate that LA_ITERATE_OVER_SYMBOLS should continue
>> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> new file mode 100644
>> index 00000000000..3ae6b563cdf
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.cc
>> @@ -0,0 +1,70 @@
>> +/* This testcase is part of GDB, the GNU debugger.
>> +
>> + Copyright (C) 2022 Free Software Foundation, Inc.
>> +
>> + This program is free software; you can redistribute it and/or modify
>> + it under the terms of the GNU General Public License as published by
>> + the Free Software Foundation; either version 3 of the License, or
>> + (at your option) any later version.
>> +
>> + This program is distributed in the hope that it will be useful,
>> + but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + GNU General Public License for more details.
>> +
>> + You should have received a copy of the GNU General Public License
>> + along with this program. If not, see <http://www.gnu.org/licenses/>. */
>> +
>> +#if __cplusplus >= 201703L
>> +#include <type_traits>
>> +
>> +template<typename T>
>> +T
>> +foo (T a, T b)
>> +{
>> + T ret;
>> + /* In foo break here. */
>> + if constexpr (std::is_integral_v<T>)
>> + ret = a; /* break in foo<int>. */
>> + else
>> + ret = b; /* break in foo<NonIntegral>. */
>> +
>> + return ret;
>> +}
>> +
>> +struct NonIntegral
>> +{
>> + int m;
>> +};
>> +
>> +#endif
>> +
>> +struct St
>> +{
>> + St ()
>> + { /* Empty. */ }
>> +};
>> +
>> +St st1;
>> +
>> +static int
>> +between_static_init ()
>> +{
>> + /* between_static_init break here. */
>> + return 42; /* break in between_static_init. */
>> +}
>> +
>> +St st2;
>> +
>> +int main ()
>> +{
>> +
>> +#if __cplusplus >= 201703L
>> + foo<int> (1, 2);
>> + foo<NonIntegral> ({ 1 }, { 2 });
>> +#endif
>> +
>> + between_static_init ();
>> +
>> + return 0;
>> +}
>> diff --git a/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> new file mode 100644
>> index 00000000000..9b79c419e9b
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.linespec/breakpoint-sliding.exp
>> @@ -0,0 +1,82 @@
>> +# Copyright (C) 2022 Free Software Foundation, Inc.
>> +
>> +# This program is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3 of the License, or
>> +# (at your option) any later version.
>> +#
>> +# This program is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
>> +
>> +# This testcase exercises breakpoint sliding when the user tries to set
>> +# a breakpoint at a line number which has no code associated to it. GDB
>> +# should then place breakpoint at the first line following the one given
>> +# by the user in each code block composing the program space, given that
>> +# the code block "covers" the line asked by the user.
>> +
>> +if { [skip_cplus_tests] } {
>> + return -1
>> +}
>> +
>> +standard_testfile .cc
>> +
>> +# Part of the test require c++17. If the used compiler does not support
>> +# it, fallback only doing part of the tests.
>> +set skip_cpp17 0
>> +
>> +if {[prepare_for_testing "failed to prepare with c++17" $testfile \
>> + $srcfile {debug c++ additional_flags=-std=c++17}]} {
>> + set skip_cpp17 1
>> + if {[prepare_for_testing "failed to prepare" $testfile $srcfile \
>> + {debug c++}]} {
>> + return -1
>> + }
>> +}
>> +
>> +proc_with_prefix test_sliding_per_block {} {
>> + if {$::skip_cpp17} {
>> + return
>> + }
>> +
>> + set loc [gdb_get_line_number "In foo break here."]
>> + gdb_test "break $loc" \
>> + "Breakpoint $::decimal.* \\(2 locations\\)" \
>> + "slide to if constexpr"
>> +
>> + set loc_in_int_instance [gdb_get_line_number "break in foo<int>"]
>> + set loc_in_struct_instance [gdb_get_line_number "break in
>> foo<NonIntegral>"]
>> + set seen_int_pb 0
>> + set seen_struct_bp 0
>> + gdb_test_multiple "info breakpoint" "info breakpoint" {
>> + -re "in foo<int>\\(int, int\\) at
>> \[^\n\r\]*$::srcfile:$loc_in_int_instance" {
>> + set seen_int_pb 1
>> + verbose -log "found int"
>> + exp_continue
>> + }
>> + -re "in foo<NonIntegral>\\(NonIntegral, NonIntegral\\) at
>> \[^\n\r\]*$::srcfile:$loc_in_struct_instance" {
>> + set seen_struct_bp 1
>> + verbose -log "found struct"
>> + exp_continue
>> + }
>> + -re -wrap "" {
>> + gdb_assert {$seen_int_pb && $seen_struct_bp} $gdb_test_name
>> + }
>> + }
>> +}
>> +
>> +proc_with_prefix test_ignore_artificial_block {} {
>> + set loc [gdb_get_line_number "between_static_init break here"]
>> + set expected_loc [gdb_get_line_number "break in between_static_init."]
>> + gdb_test "break $loc" \
>> + "Breakpoint $::decimal at $::hex: file .*/$::srcfile, line
>> $expected_loc\\." \
>> + "ignore static block"
>> +}
>> +
>> +clean_restart $::binfile
>> +test_sliding_per_block
>> +test_ignore_artificial_block
>>
>> base-commit: 5fb28d2607a8325559b44a5dc0c8760236c81218
>> prerequisite-patch-id: 695c74160156ff2c66b3a88462be22a36c32f141
>> --
>> 2.25.1
prev parent reply other threads:[~2022-08-23 15:01 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-27 13:11 [PING] " Six, Lancelot
2022-08-23 15:01 ` Lancelot SIX [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aa7c8306-a2ff-1753-57ef-0ca88bb38b2a@amd.com \
--to=lancelot.six@amd.com \
--cc=gdb-patches@sourceware.org \
--cc=lsix@lancelotsix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).