From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 131043858D1E for ; Mon, 30 Jan 2023 15:13:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 131043858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675091599; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Fn811c6rxmtfbnhJiO3+We8u6nSdpqo2UFLyxy/hjmA=; b=FCVzDfAxeiRNKKqkViEZqf79cgki8YxT/+/m3av/Zzz60Q9HYLTXHAVsLaE/vSQeD2OqAg U7iQx4lMbIcLRst00PtyP760gKc4/5JysNn4PcoA9H1lguWKz7rmxSJ5zxCeN33vZ5/18t OpkopdMsvPkgzHx/DFbypVCNXEgRqlo= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-30-OzY60_mUMaub5mmqqaNLiQ-1; Mon, 30 Jan 2023 10:13:18 -0500 X-MC-Unique: OzY60_mUMaub5mmqqaNLiQ-1 Received: by mail-qt1-f199.google.com with SMTP id c18-20020ac84e12000000b003b5d38f1d29so5047801qtw.0 for ; Mon, 30 Jan 2023 07:13:18 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=mime-version:message-id:date:references:in-reply-to:subject:to:from :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Fn811c6rxmtfbnhJiO3+We8u6nSdpqo2UFLyxy/hjmA=; b=RRpHeuFgKGcPDMGp4533phIVH4CGX9xVg/42vN7iQefPf2ITnFw8oah/c6ILxaNIki Dco3Ydt/rPlbHayqLvf0r5LRGlNWemvJCjV3yF9IaGx8zzxKZRao9/ZcnlPZx9GLduRZ 67SWWiIOoYck8krKCTNAH6/XhsI+YkcHVWIqrW3DR/MJtbKQwyJG7rFK2ESPWMUf5RSL /2uKdXjvgY9iJ60Afe1Jf5WnXpFv94v4Vwr8NQUCwR22LlRHVe6wG2g1CdtWat7BftU8 V5jGY/Ke7JTIeiF+H27M6YJmvNyzyzlxAfv1oWIcIxD477QWehCI/prluLbG9sey3xj0 VjBw== X-Gm-Message-State: AO0yUKXNzl6wf/AUus8OYv6EPY+PkzA0gOu9Jfc++YkMLO2kFj3oi14Y R3YJBIdjedovodPQ8sgNKZOUiVlUXAvfLffJk0v029p5H4U5vBhn1d/O2U+XH0KinqL2of/4ssE bDr8GBIESsMqIVLBH8DwEwTvVEcKhE86cxrzKn5wZ4+Eq86lYAMNhSPmS7e4PwfqU9ffeigvG+g == X-Received: by 2002:a05:622a:4c8:b0:3b7:ec87:8154 with SMTP id q8-20020a05622a04c800b003b7ec878154mr14008663qtx.44.1675091597811; Mon, 30 Jan 2023 07:13:17 -0800 (PST) X-Google-Smtp-Source: AK7set+XUFlkOFM7mcO897+dB1CZVuDthCTdMObF9mhPgSj3i5jWJtv1ldMcpyYgZOEV33U09zmaZg== X-Received: by 2002:a05:622a:4c8:b0:3b7:ec87:8154 with SMTP id q8-20020a05622a04c800b003b7ec878154mr14008615qtx.44.1675091597275; Mon, 30 Jan 2023 07:13:17 -0800 (PST) Received: from localhost (95.72.115.87.dyn.plus.net. [87.115.72.95]) by smtp.gmail.com with ESMTPSA id o12-20020a05622a044c00b003a82ca4e81csm8193570qtx.80.2023.01.30.07.13.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Jan 2023 07:13:16 -0800 (PST) From: Andrew Burgess To: gdb-patches@sourceware.org Subject: Re: [PATCHv2] gdb/c++: fix handling of breakpoints on @plt symbols In-Reply-To: <29e560fd9c94874f0839924fa25557a7e8418ad3.1674215225.git.aburgess@redhat.com> References: <756de5175770d79b3f0642fa3035ef348388bda4.1671544509.git.aburgess@redhat.com> <29e560fd9c94874f0839924fa25557a7e8418ad3.1674215225.git.aburgess@redhat.com> Date: Mon, 30 Jan 2023 15:13:15 +0000 Message-ID: <87fsbsukc4.fsf@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,WEIRD_PORT autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Ping. Andrew Burgess writes: > Changes since v1: > > - Rebased to current HEAD of master, > > - Updated test to use 'require allow_shlib_tests', and wrapped a few > long lines. Also Copyright date ranges have been updated for 2023. > > --- > > This commit should fix PR gdb/20091, PR gdb/17201, and PR gdb/17071. > Additionally, PR gdb/17199 relates to this area of code, but is more > of a request to refactor some parts of GDB, this commit does not > address that request, but it is probably worth reading that PR when > looking at this commit. > > When the current language is C++, and the user places a breakpoint on > a function in a shared library, GDB will currently find two locations > for the breakpoint, one location will be within the function itself as > we would expect, but the other location will be within the PLT table > for the call to the named function. Consider this session: > > $ gdb -q /tmp/breakpoint-shlib-func > Reading symbols from /tmp/breakpoint-shlib-func... > (gdb) start > Temporary breakpoint 1 at 0x40112e: file /tmp/breakpoint-shlib-func.cc, line 20. > Starting program: /tmp/breakpoint-shlib-func > > Temporary breakpoint 1, main () at /tmp/breakpoint-shlib-func.cc:20 > 20 int answer = foo (); > (gdb) break foo > Breakpoint 2 at 0x401030 (2 locations) > (gdb) info breakpoints > Num Type Disp Enb Address What > 2 breakpoint keep y > 2.1 y 0x0000000000401030 > 2.2 y 0x00007ffff7fc50fd in foo() at /tmp/breakpoint-shlib-func-lib.cc:20 > > This is not the expected behaviour. If we compile the same test using > a C compiler then we see this: > > (gdb) break foo > Breakpoint 2 at 0x7ffff7fc50fd: file /tmp/breakpoint-shlib-func-c-lib.c, line 20. > (gdb) info breakpoints > Num Type Disp Enb Address What > 2 breakpoint keep y 0x00007ffff7fc50fd in foo at /tmp/breakpoint-shlib-func-c-lib.c:20 > > Here's what's happening. When GDB parses the symbols in the main > executable and the shared library we see a number of different symbols > for foo, and use these to create entries in GDB's msymbol table: > > - In the main executable we see a symbol 'foo@plt' that points at > the plt entry for foo, from this we add two entries into GDB's > msymbol table, one called 'foo@plt' which points at the plt entry > and has type mst_text, then we create a second symbol, this time > called 'foo' with type mst_solib_trampoline which also points at > the plt entry, > > - Then, when the shared library is loaded we see another symbol > called 'foo', this one points at the actual implementation in the > shared library. This time GDB creates a msymbol called 'foo' with > type mst_text that points at the implementation. > > This means that GDB creates 3 msymbols to represent the 2 symbols > found in the executable and shared library. > > When the user creates a breakpoint on 'foo' GDB eventually ends up in > search_minsyms_for_name (linespec.c), this function then calls > iterate_over_minimal_symbols passing in the name we are looking for > wrapped in a lookup_name_info object. > > In iterate_over_minimal_symbols we iterate over two hash tables (using > the name we're looking for as the hash key), first we walk the hash > table of symbol linkage names, then we walk the hash table of > demangled symbol names. > > When the language is C++ the symbols for 'foo' will all have been > mangled, as a result, in this case, the iteration of the linkage name > hash table will find no matching results. > > However, when we walk the demangled hash table we do find some > results. In order to match symbol names, GDB obtains a symbol name > matching function by calling the get_symbol_name_matcher method on the > language_defn class. For C++, in this case, the matching function we > use is cp_fq_symbol_name_matches, which delegates the work to > strncmp_iw_with_mode with mode strncmp_iw_mode::MATCH_PARAMS and > language set to language_cplus. > > The strncmp_iw_mode::MATCH_PARAMS mode means that strncmp_iw_mode will > skip any parameters in the demangled symbol name when checking for a > match, e.g. 'foo' will match the demangled name 'foo()'. The way this > is done is that the strings are matched character by character, but, > once the string we are looking for ('foo' here) is exhausted, if we > are looking at '(' then we consider the match a success. > > Lets consider the 3 symbols GDB created. If the function declaration > is 'void foo ()' then from the main executable we added symbols > '_Z3foov@plt' and '_Z3foov', while from the shared library we added > another symbol call '_Z3foov'. When these are demangled they become > 'foo()@plt', 'foo', and 'foo' respectively. > > Now, the '_Z3foov' symbol from the main executable has the type > mst_solib_trampoline, and in search_minsyms_for_name, we search for > any symbols of type mst_solib_trampoline and filter these out of the > results. > > However, the '_Z3foov@plt' symbol (from the main executable), and the > '_Z3foov' symbol (from the shared library) both have type mst_text. > > During the demangled name matching, due to the use of MATCH_PARAMS > mode, we stop the comparison as soon as we hit a '(' in the demangled > name. And so, '_Z3foov@plt', which demangles to 'foo()@plt' matches > 'foo', and '_Z3foov', which demangles to 'foo()' also matches 'foo'. > > By contrast, for C, there are no demangled hash table entries to be > iterated over (in iterate_over_minimal_symbols), we only consider the > linkage name symbols which are 'foo@plt' and 'foo'. The plain 'foo' > symbol obviously matches when we are looking for 'foo', but in this > case the 'foo@plt' will not match due to the '@plt' suffix. > > And so, when the user asks for a breakpoint in 'foo', and the language > is C, search_minsyms_for_name, returns a single msymbol, the mst_text > symbol for foo in the shared library, while, when the language is C++, > we get two results, '_Z3foov' for the shared library function, and > '_Z3foov@plt' for the plt entry in the main executable. > > I propose to fix this in strncmp_iw_with_mode. When the mode is > MATCH_PARAMS, instead of stopping at a '(' and assuming the match is a > success, GDB will instead search forward for the matching, closing, > ')', effectively skipping the parameter list, and then resume > matching. Thus, when comparing 'foo' to 'foo()@plt' GDB will > effectively compare against 'foo@plt' (skipping the parameter list), > and the match will fail, just as it does when the language is C. > > There is one slight complication, which is revealed by the test > gdb.linespec/cpcompletion.exp, when searching for the symbol of a > const member function, the demangled symbol will have 'const' at the > end of its name, e.g.: > > struct_with_const_overload::const_overload_fn() const > > Previously, the matching would stop at the '(' character, but after my > change the whole '()' is skipped, and the match resumes. As a result, > the 'const' modifier results in a failure to match, when previously > GDB would have found a match. > > To work around this issue, in strncmp_iw_with_mode, for language C++ > and mode MATCH_PARAMS, I explicitly allow a trailing 'const' to be > skipped. > > With these changes in place I now see GDB correctly setting a > breakpoint only at the implementation of 'foo' in the shared library. > > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=20091 > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17201 > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17071 > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=17199 > --- > .../gdb.cp/breakpoint-shlib-func-lib.cc | 21 +++++ > gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc | 22 ++++++ > .../gdb.cp/breakpoint-shlib-func.exp | 78 +++++++++++++++++++ > gdb/utils.c | 26 ++++++- > 4 files changed, 146 insertions(+), 1 deletion(-) > create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc > create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc > create mode 100644 gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp > > diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc b/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc > new file mode 100644 > index 00000000000..7219f7c5a23 > --- /dev/null > +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func-lib.cc > @@ -0,0 +1,21 @@ > +/* Copyright 2022-2023 Free Software Foundation, Inc. > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see . */ > + > +extern int foo (); > + > +int > +foo () > +{ > + return 0; > +} > diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc > new file mode 100644 > index 00000000000..a86d06560d4 > --- /dev/null > +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.cc > @@ -0,0 +1,22 @@ > +/* Copyright 2022-2023 Free Software Foundation, Inc. > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see . */ > + > +extern int foo (); > + > +int > +main () > +{ > + int answer = foo (); > + return answer; > +} > diff --git a/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp > new file mode 100644 > index 00000000000..5d5c87d0b51 > --- /dev/null > +++ b/gdb/testsuite/gdb.cp/breakpoint-shlib-func.exp > @@ -0,0 +1,78 @@ > +# Copyright 2022-2023 Free Software Foundation, Inc. > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > + > +# Places a breakpoint on a function in a shared library before the > +# inferior has started. GDB will place the breakpoint on the @plt > +# symbol in the main executable. > +# > +# When the inferior is started GDB will re-evaluate the breakpoint > +# location and move the breakpoint to the function implementation in > +# the shared library. > +# > +# Then, with the inferior started, delete all breakpoints, and > +# re-create the breakpoint on the shared library function, GDB should > +# place a single breakpoint on the function implementation in the > +# shared library. > + > +require allow_shlib_tests > + > +standard_testfile .cc -lib.cc > + > +set libobj [standard_output_file libfoo.so] > +if {[build_executable "build shared library" $libobj $srcfile2 \ > + {debug c++ shlib}] != 0} { > + return -1 > +} > + > +if {[prepare_for_testing "failed to prepare" $testfile $srcfile \ > + [list debug c++ shlib=$libobj]]} { > + return -1 > +} > + > +# Place the breakpoint before the shared library has been loaded, the > +# breakpoint should be placed on the @plt symbol. > +gdb_test "break foo" "Breakpoint $decimal at $hex" > +gdb_test "info breakpoints" "" > + > +# This is used as an override for delete_breakpoints when we don't > +# want functions in gdb.exp to delete breakpoints behind the scenes > +# for us. > +proc do_not_delete_breakpoints {} { > + # Just do nothing. > +} > + > +# Runto main, but don't delete all the breakpoints. > +with_override delete_breakpoints do_not_delete_breakpoints { > + if {![runto_main]} { > + return -1 > + } > +} > + > +# The breakpoint should now be showing in `foo` for real. > +gdb_test "info breakpoints" \ > + "\r\n$decimal\\s+\[^\r\n\]+ in foo\\(\\) at \[^\r\n\]+\r\n.*" \ > + "check breakpoints after starting the inferior" > + > +# Now we can delete the breakpoints. > +delete_breakpoints > + > +# And recreate the foo breakpoint, we should only get one location, > +# the actual location. > +gdb_test "break foo" "Breakpoint $decimal at \[^\r\n\]+" \ > + "recreate foo breakpoint" > + > +# Check the breakpoint was recreated correctly. > +gdb_test "info breakpoints" \ > + "\r\n$decimal\\s+\[^\r\n\]+ in foo\\(\\) at \[^\r\n\]+" \ > + "check breakpoints after recreation" > diff --git a/gdb/utils.c b/gdb/utils.c > index 734c5bf7f70..e64fec941f1 100644 > --- a/gdb/utils.c > +++ b/gdb/utils.c > @@ -2397,7 +2397,31 @@ strncmp_iw_with_mode (const char *string1, const char *string2, > return 0; > } > else > - return (*string1 != '\0' && *string1 != '('); > + { > + if (*string1 == '(') > + { > + int p_count = 0; > + > + do > + { > + if (*string1 == '(') > + ++p_count; > + else if (*string1 == ')') > + --p_count; > + ++string1; > + } > + while (*string1 != '\0' && p_count > 0); > + > + /* There maybe things like 'const' after the parameters, > + which we do want to ignore. However, if there's an '@' > + then this likely indicates something like '@plt' which we > + should not ignore. */ > + return *string1 == '@'; > + } > + > + return *string1 == '\0' ? 0 : 1; > + } > + > } > else > return 1; > > base-commit: 7d02a94c8f74613edb0cdde11982b22eaaa9affe > -- > 2.25.4