From: Andrew Burgess <aburgess@redhat.com>
To: Zheng <linuxmaker@163.com>
Cc: Tom Tromey <tom@tromey.com>,
Zheng Zhan Liang via Gdb-patches <gdb-patches@sourceware.org>,
Zheng Zhan <zzlossdev@163.com>
Subject: Re:Re:Re:Re: [PATCH] I'm debugging https://github.com/helix-editor/helix.git@63dcaae1b9083396fb3faaef9eaa2421f7e48fb9, which is a editor implemented with rust lang. When I type gdb command below: (gdb) b pars gdb dumped. I got: m_match = 0x7fffd8173cc7 "parse::h3bbecc5bbd82b347" m_ignored_ranges = { first = 0x7fffd8173cbb "<impl str>::parse::h3bbecc5bbd82b347", second = 0x7fffd8173cc5 "::parse::h3bbecc5bbd82b347" }
Date: Fri, 13 Jan 2023 14:46:55 +0000 [thread overview]
Message-ID: <87358eh4ps.fsf@redhat.com> (raw)
In-Reply-To: <61496177.3025.185a996955a.Coremail.linuxmaker@163.com>
Zheng <linuxmaker@163.com> writes:
> At 2023-01-13 03:06:02, "Andrew Burgess" <aburgess@redhat.com> wrote:
>>Zheng <linuxmaker@163.com> writes:
>>
>>> At 2023-01-06 23:53:48, "Andrew Burgess" <aburgess@redhat.com> wrote:
>>>>linuxmaker via Gdb-patches <gdb-patches@sourceware.org> writes:
>>>>
>>>>> At 2023-01-06 04:24:42, "Tom Tromey" <tom@tromey.com> wrote:
>>>>>>>>>>> "Zheng" == Zheng Zhan Liang via Gdb-patches <gdb-patches@sourceware.org> writes:
>>>>>>
>>>>>>Zheng> From: Zheng Zhan <zzlossdev@163.com>
>>>>>>
>>>>>>Hi. Thank you for the patch.
>>>>>>
>>>>>>The text all seemed to end up in the Subject line. Probably you need
>>>>>>another newline after the first line of the commit message.
>>>>>>
>>>>>>Zheng> --- a/gdb/completer.h
>>>>>>Zheng> +++ b/gdb/completer.h
>>>>>>Zheng> @@ -163,8 +163,11 @@ class completion_match_for_lcd
>>>>>>Zheng> const char *prev = m_match;
>>>>>>Zheng> for (const auto &range : m_ignored_ranges)
>>>>>>Zheng> {
>>>>>>Zheng> - m_finished_storage.append (prev, range.first);
>>>>>>Zheng> - prev = range.second;
>>>>>>Zheng> + if (prev < range.first)
>>>>>>Zheng> + {
>>>>>>Zheng> + m_finished_storage.append (prev, range.first);
>>>>>>Zheng> + prev = range.second;
>>>>>>Zheng> + }
>>>>>>
>>>>>>Is there any way to construct a test case for this?
>>>>> seems pretty common in rust. you can test it like below:
>>>>> fn main() {
>>>>> let four: u8 = "4".parse().unwrap();
>>>>> println!("{}", four);
>>>>> }
>>>>> (gdb) b pars[TAB]
>>>>
>>>>I haven't reviewed the fix. But in case this makes it easier for
>>> I've done a little more digging. And It looks like `strncmp_iw_with_mode` cached
>>> the last `mark_ignored_range` setting, then crap the current one.
>>> I paste a newer patch at the bottom, and it seems work for me.
>>
>>Thanks, I was also looking at the same area of code, wondering if that
>>would be a better place for the fix.
>>
>>I'd actually like to propose that the fix be moved one frame up the
>>stack to cp_symbol_name_matches_1, my reasons are explained in the patch
>>below.
>>
>>Additionally, I've now managed to reproduce the crash when completing a
>>C++ symbol, which means we have a much more reliable test - it doesn't
>>depend on symbols being pulled in from the rust runtime, which might
>>change in the future.
>>
>>The patch below is my proposal for what we should merge.
>>
>>All feedback welcome.
>>
>>Thanks,
>>Andrew
>>
>>--
>>
>>commit 467065c7a25a6fc87fab6c73ac6d7d509aae81b6
>>Author: Andrew Burgess <aburgess@redhat.com>
>>Date: Fri Jan 6 15:50:26 2023 +0000
>>
>> gdb: fix crash during command completion
>>
>> In some cases GDB will fail when attempting to complete a command that
>> involves a rust symbol, the failure can manifest as a crash.
>>
>> The problem is caused by the completion_match_for_lcd object being
>> left containing invalid data during calls to cp_symbol_name_matches_1.
>>
>> The first question to address is why we are calling a C++ support
>> function when handling a rust symbol. That's due to GDB's auto
>> language detection for msymbols, in some cases GDB can't tell if a
>> symbol is a rust symbol, or a C++ symbol.
>>
>> The test application contains symbols for functions which are
>> statically linked in from various rust support libraries. There's no
>> DWARF for these symbols, so all GDB has is the msymbols built from the
>> ELF symbol table.
>>
>> Here's the problematic symbol that leads to our crash:
>>
>> mangled: _ZN4core3str21_$LT$impl$u20$str$GT$5parse17h5111d2d6a50d22bdE
>> demangled: core::str::<impl str>::parse
>>
>> As an msymbol this is initially created with language auto, then GDB
>> eventually calls symbol_find_demangled_name, which loops over all
>> languages calling language_defn::sniff_from_mangled_name, the first
>> language that can demangle the symbol gets assigned as the language
>> for that symbol.
>>
>> Unfortunately, there's overlap in the mangled symbol names,
>> some (legacy) rust symbols can be demangled as both rust and C++, see
>> cplus_demangle in libiberty/cplus-dem.c where this is mentioned.
>>
>> And so, because we check the C++ language before we check for rust,
>> then the msymbol is (incorrectly) given the C++ language.
>>
>> Now it's true that is some cases we might be able to figure out that a
>> demangled symbol is not actually a valid C++ symbol, for example, in
>> our case, the construct '::<impl str>::' is not, I believe, valid in a
>> C++ symbol, we could look for ':<' and '>:' and refuse to accept this
>> as a C++ symbol.
>>
>> However, I'm not sure it is always possible to tell that a demangled
>> symbol is rust or C++, so, I think, we have to accept that some times
>> we will get this language detection wrong.
>>
>> If we accept that we can't fix the symbol language detection 100% of
>> the time, then we should make sure that GDB doesn't crash when it gets
>> the language wrong, that is what this commit addresses.
>>
>> In our test case the user tries to complete a symbol name like this:
>>
>> (gdb) complete break pars
>>
>> This results in GDB trying to find all symbols that match 'pars',
>> eventually we consider our problematic symbol, and we end up with a
>> call stack that looks like this:
>>
>> #0 0x0000000000f3c6bd in strncmp_iw_with_mode
>> #1 0x0000000000706d8d in cp_symbol_name_matches_1
>> #2 0x0000000000706fa4 in cp_symbol_name_matches
>> #3 0x0000000000df3c45 in compare_symbol_name
>> #4 0x0000000000df3c91 in completion_list_add_name
>> #5 0x0000000000df3f1d in completion_list_add_msymbol
>> #6 0x0000000000df4c94 in default_collect_symbol_completion_matches_break_on
>> #7 0x0000000000658c08 in language_defn::collect_symbol_completion_matches
>> #8 0x0000000000df54c9 in collect_symbol_completion_matches
>> #9 0x00000000009d98fb in linespec_complete_function
>> #10 0x00000000009d99f0 in complete_linespec_component
>> #11 0x00000000009da200 in linespec_complete
>> #12 0x00000000006e4132 in complete_address_and_linespec_locations
>> #13 0x00000000006e4ac3 in location_completer
>>
>> In cp_symbol_name_matches_1 we enter a loop, this loop repeatedly
>> tries to match the demangled problematic symbol name against the user
>> supplied text ('pars'). Each time around the loop another component
>> of the symbol name is stripped off, thus, we check 'pars' against
>> these options:
>>
>> core::str::<impl str>::parse
>> str::<impl str>::parse
>> <impl str>::parse
>> parse
>>
>> As soon as we get a match the cp_symbol_name_matches_1 exits its loop
>> and returns. In our case, when we're looking for 'pars', the match
>> occurs on the last iteration of the loop, when we are comparing to
>> 'parse'.
>>
>> Now the problem here is that cp_symbol_name_matches_1 uses the
>> strncmp_iw_with_mode, and inside strncmp_iw_with_mode we allow for
>> skipping over template parameters. This allows GDB to match the
>> symbol name 'foo<int>(int,int)' if the user supplies 'foo(int,'.
>> Inside strncmp_iw_with_mode GDB will record any template arguments
>> that it has skipped over inside the completion_match_for_lcd object
>> that is passed in as an argument.
>>
>> And so, when GDB tries to match against '<impl str>::parse', the first
>> thing it sees is '<impl str>', GDB assumes this is a template argument
>> and records this as a skipped region within the
>> completion_match_for_lcd object. After '<impl str>' GDB sees a ':'
>> character, which doesn't match with the 'pars' the user supplied, so
>> strncmp_iw_with_mode returns a value indicating a non-match. GDB then
>> removes the '<impl str>' component from the symbol name and tries
>> again, this time comparing to 'parse', which does match.
>>
>> Having found a match, then in cp_symbol_name_matches_1 we record the
>> match string, and the full symbol name within the
>> completion_match_result object, and return.
>>
>> The problem here is that the skipped region, the '<impl str>' that we
>> recorded in the penultimate loop iteration was never discarded, its
>> still there in our returned result.
>>
>> If we look at what the pointers held in the completion_match_result
>> that cp_symbol_name_matches_1 returns, this is what we see:
>>
>> core::str::<impl str>::parse
>> | \________/ |
>> | | '--- completion match string
>> | '---skip range
>> '--- full symbol name
>>
>> When GDB calls completion_match_for_lcd::finish, GDB tries to create a
>> string using the completion match string (parse), but excluding the
>> skip range, as the stored skip range is before the start of the
>> completion match string, then GDB tries to do some weird string
>> creation, which will cause GDB to crash.
>>
>> The reason we don't often see this problem in C++ is that for C++
>> symbols there is always some non-template text before the template
>> argument. This non-template text means GDB is likely to either match
>> the symbol, or reject the symbol without storing a skip range.
>>
>> However, notice, I did say, we don't often see this problem. Once I
>> understood the issue, I was able to reproduce the crash using a pure
>> C++ example:
>>
>> template<typename S>
>> struct foo
>> {
>> template<typename T>
>> foo (int p1, T a)
>> {
>> s = 0;
>> }
>>
>> S s;
>> };
>>
>> int
>> main ()
>> {
>> foo<int> obj (2.3, 0);
>> return 0;
>> }
>>
>> Then in GDB:
>>
>> (gdb) complete break foo(int
>>
>> The problem here is that the C++ symbol for the constructor looks like
>> this:
>>
>> foo<int>::foo<double>(int, double)
>>
>> When GDB enters cp_symbol_name_matches_1 the symbols it examines are:
>>
>> foo<int>::foo<double>(int, double)
>> foo<double>(int, double)
>>
>> The first iteration of the loop will match the 'foo', then add the
>> '<int>' template argument will be added as a skip range. When GDB
>> find the ':' after the '<int>' the first iteration of the loop fails
>> to match, GDB removes the 'foo<int>::' component, and starts the
>> second iteration of the loop.
>>
>> Again, GDB matches the 'foo', and now adds '<double>' as a skip
>> region. After that the '(int' successfully matches, and so the second
>> iteration of the loop succeeds, but, once again we left the '<int>' in
>> place as a skip region, even though this occurs before the start of
>> our match string, and this will cause GDB to crash.
>>
>> This problem was reported to the mailing list, and a solution
>> discussed in this thread:
>>
>> https://sourceware.org/pipermail/gdb-patches/2023-January/195166.html
>>
>> The solution proposed here is similar to one proposed by the original
>> bug reported, but implemented in a different location within GDB.
>> Instead of placing the fix in strncmp_iw_with_mode, I place the fix in
>> cp_symbol_name_matches_1. I believe this is a better location as it
>> is this function that implements the loop, and it is this loop, which
>> repeatedly calls strncmp_iw_with_mode, that should be resetting the
>> result object state (I believe).
>>
>> What I have done is add an assert to strncmp_iw_with_mode that the
>> incoming result object is empty.
>>
>> I've also added some other asserts in related code, in
>> completion_match_for_lcd::mark_ignored_range, I make some basic
>> assertions about the incoming range pointers, and in
>> completion_match_for_lcd::finish I also make some assertions about how
>> the skip ranges relate to the match pointer.
>>
>> There's two new tests. The original rust example that was used in the
>> initial bug report, and a C++ test. The rust example depends on which
>> symbols are pulled in from the rust libraries, so it is possible that,
>> at some future date, the problematic symbol will disappear from this
>> test program. The C++ test should be more reliable, as this only
>> depends on symbols from within the C++ source code.
>>
>> Co-Authored-By: Zheng Zhan <zzlossdev@163.com>
>>
>>diff --git a/gdb/completer.h b/gdb/completer.h
>>index 8b4ad8ec4d4..67d2fbf9d38 100644
>>--- a/gdb/completer.h
>>+++ b/gdb/completer.h
>>@@ -145,7 +145,12 @@ class completion_match_for_lcd
>>
>> /* Mark the range between [BEGIN, END) as ignored. */
>> void mark_ignored_range (const char *begin, const char *end)
>>- { m_ignored_ranges.emplace_back (begin, end); }
>>+ {
>>+ gdb_assert (begin < end);
>>+ gdb_assert (m_ignored_ranges.empty ()
>>+ || m_ignored_ranges.back ().second < begin);
>>+ m_ignored_ranges.emplace_back (begin, end);
>>+ }
>>
>> /* Get the resulting LCD, after a successful match. If there are
>> ignored ranges, then this builds a new string with the ignored
>>@@ -160,9 +165,14 @@ class completion_match_for_lcd
>> {
>> m_finished_storage.clear ();
>>
>>+ gdb_assert (m_ignored_ranges.back ().second
>>+ <= (m_match + strlen (m_match)));
>>+
>> const char *prev = m_match;
>> for (const auto &range : m_ignored_ranges)
>> {
>>+ gdb_assert (prev < range.first);
>>+ gdb_assert (range.second > range.first);
>> m_finished_storage.append (prev, range.first);
>> prev = range.second;
>> }
>>@@ -179,6 +189,13 @@ class completion_match_for_lcd
>> m_ignored_ranges.clear ();
>> }
>>
>>+ /* Return true if this object has had no match data set since its
>>+ creation, or the last call to clear. */
>>+ bool empty () const
>>+ {
>>+ return m_match == nullptr && m_ignored_ranges.empty ();
>>+ }
>>+
>> private:
>> /* The completion match result for LCD. This is usually either a
>> pointer into to a substring within a symbol's name, or to the
>>diff --git a/gdb/cp-support.c b/gdb/cp-support.c
>>index b7ed2101b78..fb5758ff914 100644
>>--- a/gdb/cp-support.c
>>+++ b/gdb/cp-support.c
>>@@ -1773,6 +1773,8 @@ cp_symbol_name_matches_1 (const char *symbol_search_name,
>> completion_match_for_lcd *match_for_lcd
>> = (comp_match_res != NULL ? &comp_match_res->match_for_lcd : NULL);
>>
>>+ gdb_assert (match_for_lcd == nullptr || match_for_lcd->empty ());
>>+
>> while (true)
>> {
>> if (strncmp_iw_with_mode (sname, lookup_name, lookup_name_len,
>>@@ -1806,6 +1808,11 @@ cp_symbol_name_matches_1 (const char *symbol_search_name,
>> return true;
>> }
>>
>>+ /* Clear match_for_lcd so the next strncmp_iw_with_mode call starts
>>+ from scratch. */
>>+ if (match_for_lcd != nullptr)
>>+ match_for_lcd->clear ();
> Well done. I have a question here, what if the `match_for_lcd` comes from the final call of
> `strncmp_iw_with_mode` and what's it contain matters?
Within cp_symbol_name_matches_1, if the strnmcp_iw_with_mode call finds
a match then we always return before the new call to ::clear. We only
call clear if strnmcp_iw_with_mode fails to find a match and we then
move onto the next part of the match.
> Currently, I find two more references to
> `strnmcp_iw_with_mode`. Are we going to fix each one outsidely?
I don't believe that these calls need fixing. The problem with
cp_symbol_name_matches_1 is that the strnmcp_iw_with_mode call is in a
loop, we effectively have multiple calls to cp_symbol_name_matches_1,
and we only want the results from the last successful call.
But in case this ever changes, I added an assert to
strnmcp_iw_with_mode, this will trigger if we ever call that function
and the completion_match_for_lcd already has some results.
Thanks,
Andrew
>
> Thanks,
> Zheng
>>+
>> unsigned int len = cp_find_first_component (sname);
>>
>> if (sname[len] == '\0')
>>diff --git a/gdb/testsuite/gdb.cp/cpcompletion.exp b/gdb/testsuite/gdb.cp/cpcompletion.exp
>>index 931f376a23d..48692dbc1f8 100644
>>--- a/gdb/testsuite/gdb.cp/cpcompletion.exp
>>+++ b/gdb/testsuite/gdb.cp/cpcompletion.exp
>>@@ -135,3 +135,5 @@ with_test_prefix "expression with namespace" {
>> # Add a disambiguating character and we get a unique completion.
>> test_gdb_complete_unique "p Test_NS::f" "p Test_NS::foo"
>> }
>>+
>>+test_gdb_complete_unique "break baz(int" "break baz(int, double)"
>>diff --git a/gdb/testsuite/gdb.cp/pr9594.cc b/gdb/testsuite/gdb.cp/pr9594.cc
>>index 54ddaafc0ca..b854d810b11 100644
>>--- a/gdb/testsuite/gdb.cp/pr9594.cc
>>+++ b/gdb/testsuite/gdb.cp/pr9594.cc
>>@@ -52,8 +52,31 @@ int qux;
>>
>> } /* namespace Test_NS */
>>
>>+/* The important thing with class baz is that both the class and the
>>+ constructor must have a template argument, we need the symbol to look
>>+ like:
>>+
>>+ baz<TYPE_1>::baz<TYPE_2>(int,....whatever...)
>>+
>>+ It doesn't really matter if TYPE_1 and TYPE_2 are the same or different,
>>+ but we create them differently in this test as it makes debugging GDB
>>+ slightly easier. */
>>+
>>+template<typename S>
>>+struct baz
>>+{
>>+ template<typename T>
>>+ baz (int p1, T a)
>>+ {
>>+ s = 0;
>>+ }
>>+
>>+ S s;
>>+};
>>+
>> int main ()
>> {
>>+ baz<int> obj (2.3, 0.1);
>> // Anonymous struct with method.
>> struct {
>> int get() { return 5; }
>>diff --git a/gdb/testsuite/gdb.rust/completion.exp b/gdb/testsuite/gdb.rust/completion.exp
>>new file mode 100644
>>index 00000000000..00923db14a6
>>--- /dev/null
>>+++ b/gdb/testsuite/gdb.rust/completion.exp
>>@@ -0,0 +1,34 @@
>>+# Copyright (C) 2023 Free Software Foundation, Inc.
>>+
>>+# This program is free software; you can redistribute it and/or modify
>>+# it under the terms of the GNU General Public License as published by
>>+# the Free Software Foundation; either version 3 of the License, or
>>+# (at your option) any later version.
>>+#
>>+# This program is distributed in the hope that it will be useful,
>>+# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>+# GNU General Public License for more details.
>>+#
>>+# You should have received a copy of the GNU General Public License
>>+# along with this program. If not, see <http://www.gnu.org/licenses/>.
>>+
>>+# Test a completion case that was causing GDB to crash.
>>+
>>+load_lib rust-support.exp
>>+if {[skip_rust_tests]} {
>>+ return
>>+}
>>+
>>+standard_testfile .rs
>>+if {[prepare_for_testing "failed to prepare" $testfile $srcfile {debug rust}]} {
>>+ return -1
>>+}
>>+
>>+set line [gdb_get_line_number "set breakpoint here"]
>>+if {![runto ${srcfile}:$line]} {
>>+ untested "could not run to breakpoint"
>>+ return -1
>>+}
>>+
>>+gdb_test "complete break pars" ".*"
>>diff --git a/gdb/testsuite/gdb.rust/completion.rs b/gdb/testsuite/gdb.rust/completion.rs
>>new file mode 100644
>>index 00000000000..8396e3f4086
>>--- /dev/null
>>+++ b/gdb/testsuite/gdb.rust/completion.rs
>>@@ -0,0 +1,19 @@
>>+// Copyright (C) 2023 Free Software Foundation, Inc.
>>+
>>+// This program is free software; you can redistribute it and/or modify
>>+// it under the terms of the GNU General Public License as published by
>>+// the Free Software Foundation; either version 3 of the License, or
>>+// (at your option) any later version.
>>+//
>>+// This program is distributed in the hope that it will be useful,
>>+// but WITHOUT ANY WARRANTY; without even the implied warranty of
>>+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>+// GNU General Public License for more details.
>>+//
>>+// You should have received a copy of the GNU General Public License
>>+// along with this program. If not, see <http://www.gnu.org/licenses/>.
>>+
>>+fn main() {
>>+ let four: u8 = "4".parse().unwrap();
>>+ println!("{}", four); // set breakpoint here
>>+}
>>diff --git a/gdb/utils.c b/gdb/utils.c
>>index 734c5bf7f70..ab14e186821 100644
>>--- a/gdb/utils.c
>>+++ b/gdb/utils.c
>>@@ -2139,6 +2139,8 @@ strncmp_iw_with_mode (const char *string1, const char *string2,
>> || language == language_rust
>> || language == language_fortran);
>>
>>+ gdb_assert (match_for_lcd == nullptr || match_for_lcd->empty ());
>>+
>> while (1)
>> {
>> if (skip_spaces
next prev parent reply other threads:[~2023-01-13 14:47 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-02 12:43 Zheng Zhan Liang
2023-01-05 20:24 ` Tom Tromey
2023-01-06 3:01 ` linuxmaker
2023-01-06 15:53 ` Andrew Burgess
2023-01-07 8:58 ` Zheng
2023-01-12 19:06 ` Andrew Burgess
2023-01-13 5:24 ` Zheng
2023-01-13 14:46 ` Andrew Burgess [this message]
2023-03-20 13:56 ` Tom Tromey
2023-03-20 16:08 ` Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87358eh4ps.fsf@redhat.com \
--to=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=linuxmaker@163.com \
--cc=tom@tromey.com \
--cc=zzlossdev@163.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).