Looks good to me! On Thu, Dec 15, 2022 at 6:44 PM Andrew Burgess wrote: > > Asaf pointed out to me (off list) that v4 fails for some cases that the > original patch supported, so here's a v5. Specifically, if the dlopen > call is passed a non-canonical path then GDB will fail to correctly spot > that the file being opened is within /proc/self, and will still try to > open a file within GDB's /proc/self by mistake. > > Changes since v4: > > - filter_proc_self_filenames now makes use of gdb_realpath_keepfile to > (partially) canonicalize the path, this should fix the non-canonical > path issues, > > - extended the test to include the non-canonical path case. > > Changes since v3: > > - Fixed NULL/nullptr issues Lancelot pointed out, > > - Updated the test case to take account of non-probes based shared > library event handling. > > --- > > commit 65c1f020184e632a6de1775c00de7e0d7a97def3 > Author: Asaf Fisher via Gdb-patches > Date: Fri Oct 21 17:42:05 2022 +0000 > > gdb: handle loading shared libraries from /proc/self/fd/ > > Bug PR gdb/29586 describes a situations where a shared library is > created in memory, then written to a memory mapped file. The memory > mapped file will show up as a file descriptor within /proc/self/fd/, > and this path is then used with dlopen in order to call functions > within the in-memory shared library. > > When attempting to debug this GDB hangs. The problem is that, GDB > stops at the shared-library event, reads the shared library path from > the inferior, which is "/proc/self/fd/", and then GDB attempts to > open this file. > > Unfortunately, this means GDB tries to open a file within GDB's > /proc/self/fd/ directory, not within the inferior's directory. In the > case of our hang it turns out that the file descriptor that is opened > is a pipe, and GDB hangs trying to read from the pipe. > > However, the behaviour is really just undefined, depending on which > file descriptor the inferior tries to open, GDB will open, or fail to > open, random files within its /proc/self/fd directory. > > The solution proposed in this commit is to hook into solib_find_1, and > spot when GDB is looking for any file in /proc/self/, if this is the > case, then the filename is rewritten as /proc/, where is > the process-id of the current inferior. > > One complexity that we need to consider is how to handle non-canonical > paths, for example, if the user opens /proc/../proc/self/fd/ then > GDB also needs to be able to spot this. > > My solution is to call gdb_realpath_keepfile, this canonicalizes > everything except the filename (the '/proc/../proc/self/fd' part in > the above example). The problem this then introduces is that the > 'self' is replaces with the pid of GDB itself. I therefore check, > after canonicalization, for a prefix of /proc/, and if > this is the case, I replace this with /proc/. > > There is one remaining issue, which I consider very unlikely, and I > don't propose to fix. What if the inferior actually did try to dlopen > a file from /proc/? If this happened then GDB will not be > able to distinguish this from the above case and would still replace > the pid of GDB with the pid of the inferior. However, this seems > really unlikely - why would the inferior be trying to load parts of > GDB? So I propose to ignore this case until someone can come up with > an actual scenario where that's useful. > > The test passes for the unix, native-gdbserver, and > native-extended-gdbserver targets. When testing with either of the > gdbserver targets, the test is run using the default empty sysroot, > and also using the 'target:' sysroot. > > Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29586 > > Co-authored-by: Andrew Burgess > > diff --git a/gdb/solib.c b/gdb/solib.c > index 59fd866b652..a627d217c85 100644 > --- a/gdb/solib.c > +++ b/gdb/solib.c > @@ -50,6 +50,7 @@ > #include "gdb_bfd.h" > #include "gdbsupport/filestuff.h" > #include "gdbsupport/scoped_fd.h" > +#include "gdbsupport/pathstuff.h" > #include "debuginfod-support.h" > #include "source.h" > #include "cli/cli-style.h" > @@ -79,6 +80,54 @@ show_solib_search_path (struct ui_file *file, int > from_tty, > # define DOS_BASED_FILE_SYSTEM 0 > #endif > > +/* Fix references to files in /proc/self/fd/ when opening a shared > library. > + > + SO_NAME is the name of the shared library being loaded. This function > + returns a possibly modified name which should be used as the path to > the > + shared library. > + > + If SO_NAME starts with /proc/self, then the returned name will be > + modified to start with /proc/PID where 'PID' is the pid of the current > + inferior. */ > + > +static gdb::unique_xmalloc_ptr > +filter_proc_self_filenames (gdb::unique_xmalloc_ptr so_name) > +{ > + /* In order to figure out if SO_NAME points at a file in /proc/self we > + need to canonicalize the path. However, the whole point here is that > + if SO_NAME does point at /proc/self then GDB will think this is its > + self, while the inferior actually means its /proc/self. A result of > + this is that the file being referenced within the inferior's > + /proc/self, might not exist within GDB's /proc/self, a call to > + gdb_realpath would then fail. > + > + To avoid this problem we call gdb_realpath_keepfile, this only tries > + to canonicalize the basename of SO_NAME, and leaves the final > filename > + untouched. */ > + so_name > + = make_unique_xstrdup (gdb_realpath_keepfile (so_name.get ()).c_str > ()); > + > + /* A result of the above canonicalization is that /proc/self will have > + been replaced with /proc/, so that's what we need to > check > + for. Of course, it could be the case that the inferior really did > try > + to dlopen a file within GDB's /proc/ directory, in which case > + we're going to do the wrong thing here, but that seems far less > likely > + than calling dlopen on a file within the inferior's own directory. > */ > + std::string prefix = string_printf ("/proc/%ld", (long) getpid ()); > + > + /* Is the canonical path inside GDB's /proc/ directory? */ > + if (!startswith (so_name.get (), prefix.c_str ())) > + return so_name; > + > + /* Get the part of the path after /proc/. For example given > + '/proc/123/fd' we find the '/fd' part. */ > + const char *tail = so_name.get () + strlen (prefix.c_str ()); > + > + /* Build a replacement path within the inferiors directory. */ > + int inferior_pid = inferior_ptid.pid (); > + return xstrprintf ("/proc/%d%s", inferior_pid, tail); > +} > + > /* Return the full pathname of a binary file (the main executable or a > shared library file), or NULL if not found. If FD is non-NULL, *FD > is set to either -1 or an open file handle for the binary file. > @@ -168,6 +217,9 @@ solib_find_1 (const char *in_pathname, int *fd, bool > is_solib) > } > } > > + temp_pathname.reset (xstrdup (in_pathname)); > + temp_pathname = filter_proc_self_filenames (std::move (temp_pathname)); > + > /* Note, we're interested in IS_TARGET_ABSOLUTE_PATH, not > IS_ABSOLUTE_PATH. The latter is for host paths only, while > IN_PATHNAME is a target path. For example, if we're supposed to > @@ -180,9 +232,7 @@ solib_find_1 (const char *in_pathname, int *fd, bool > is_solib) > 3rd attempt, c:/foo/bar.dll ==> /sysroot/foo/bar.dll > */ > > - if (!IS_TARGET_ABSOLUTE_PATH (fskind, in_pathname) || sysroot == NULL) > - temp_pathname.reset (xstrdup (in_pathname)); > - else > + if (IS_TARGET_ABSOLUTE_PATH (fskind, in_pathname) && sysroot != nullptr) > { > bool need_dir_separator; > > @@ -209,7 +259,7 @@ solib_find_1 (const char *in_pathname, int *fd, bool > is_solib) > /* Cat the prefixed pathname together. */ > temp_pathname.reset (concat (sysroot, > need_dir_separator ? SLASH_STRING : "", > - in_pathname, (char *) NULL)); > + temp_pathname.get (), nullptr)); > } > > /* Handle files to be accessed via the target. */ > diff --git a/gdb/testsuite/gdb.base/solib-proc-self.cc > b/gdb/testsuite/gdb.base/solib-proc-self.cc > new file mode 100644 > index 00000000000..aea397a66f9 > --- /dev/null > +++ b/gdb/testsuite/gdb.base/solib-proc-self.cc > @@ -0,0 +1,89 @@ > +/* This testcase is part of GDB, the GNU debugger. > + > + Copyright 2022 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see . > */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#ifdef __WIN32__ > +#include > +#define dlopen(name, mode) LoadLibrary (name) > +#define dlclose(handle) FreeLibrary (handle) > +#define dlerror() "an error occurred" > +#else > +#include > +#endif > + > +/* Combine PREFIX and NUM into a single string, joined with a '/', and > + return the string. */ > + > +std::string > +make_library_path (const char *prefix, int num) > +{ > + std::stringstream stream; > + stream << prefix << "/" << num; > + return stream.str (); > +} > + > +/* Call dlopen on the library pointed to by FILENAME. */ > + > +void > +open_library (const std::string &filename) > +{ > + /* Call dlopen on FILENAME. */ > + void *handle = dlopen (filename.c_str (), RTLD_LAZY); > + if (handle == nullptr) > + abort (); > + > + /* It worked. */ > + dlclose (handle); > +} > + > +int > +main () > +{ > + /* Read the shared libraries contents into a buffer. */ > + std::ifstream read_so_file = std::ifstream (SHLIB_NAME); > + read_so_file.seekg (0, std::ios::end); > + std::streamsize size = read_so_file.tellg (); > + read_so_file.seekg (0, std::ios::beg); > + std::vector buffer (size); > + if (!read_so_file.read (buffer.data (), size)) > + abort (); > + > + /* Create a memory mapped file, then write the shared library to that > + new memory mapped file. */ > + int mem_fd = memfd_create ("test", 0); > + write (mem_fd, buffer.data (), buffer.size ()); > + > + /* Generate a canonical /proc/self/fd/[num] path for the memory mapped > + file, and call dlopen on it. */ > + std::string filename > + = make_library_path ("/proc/self/fd", mem_fd); /* break-here */ > + open_library (filename); > + > + /* Now generate a new, non-canonical filename, and call dlopen on it. > */ > + filename = make_library_path ("/proc/../proc/self/fd", mem_fd); > + open_library (filename); > + > + return 0; > +} > diff --git a/gdb/testsuite/gdb.base/solib-proc-self.exp > b/gdb/testsuite/gdb.base/solib-proc-self.exp > new file mode 100644 > index 00000000000..01e7eecfb16 > --- /dev/null > +++ b/gdb/testsuite/gdb.base/solib-proc-self.exp > @@ -0,0 +1,200 @@ > +# Copyright 2022 Free Software Foundation, Inc. > + > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see . > */ > + > +# Test connecting and disconnecting at shared library events. > + > +if {[skip_shlib_tests]} { > + untested "could not run to main" > + return 0 > +} > + > +standard_testfile .cc > + > +# Reuse an existing library, we don't care about the library contents > +# for this test. > +set libfile so-disc-shr > +set libsrc "${srcdir}/${subdir}/${libfile}.c" > +set libname "${libfile}.so" > +set libobj [standard_output_file ${libname}] > + > +# Compile the shared library. > +if { [gdb_compile_shlib $libsrc $libobj {debug}] != ""} { > + return -1 > +} > + > +# Compile the test executable. > +if [ build_executable "failed to prepare" $testfile $srcfile \ > + [list shlib_load debug c++ > additional_flags=-DSHLIB_NAME="${libobj}"]] { > + return -1 > +} > + > +proc validate_library_load {inferior_pid memfd} { > + > + # Turn on the solib-events so we can see that gdb resolves > + # everything correctly. > + gdb_test_no_output "set stop-on-solib-events 1" > + > + # Now continue forward until the solib event is detected, and > + # check that the loaded library is found through the /proc/PID/fd > + # rather than /proc/self/fd. > + # > + # We need to handle the possibility of the interesting event > + # showing the first, or second time we stop, as depending on which > + # mechanism GDB is using to handle the shared library events (the > + # newer probes based interface, or the old non-probes interface), > + # the library will be reported at the first or second stop. > + set saw_expected_event false > + set saw_no_event_stop false > + gdb_test_multiple "continue" "continue to solib evnt" { > + -re "^continue\r\n" { > + exp_continue > + } > + > + -re "^Continuing\\.\r\n" { > + exp_continue > + } > + > + -early -re "Stopped due to shared library event \\(no libraries > added or removed\\)\r\n" { > + # This non-interesting event shows up first when using the > + # probes based mechanism for dealing with shared library > + # events. > + # > + # We set a flag here, and, once the prompt has appeared, > + # we send another continue, the next event will contain > + # the information we want. > + set saw_no_event_stop true > + exp_continue > + } > + > + -re "Stopped due to shared library event:\r\n Inferior loaded > (?:target:)?/proc/${inferior_pid}/fd/$memfd\r\n" { > + # This event, which includes the information we are > + # looking for, occurs first when using the non-probes > + # based mechanism for handling shared library events, and > + # occurs second when using the probes mechanism. > + # > + # Either way, record here that we say the output we expect. > + set saw_expected_event true > + exp_continue > + } > + > + -re "$::gdb_prompt $" { > + if {$saw_no_event_stop} { > + set saw_no_event_stop false > + send_gdb "continue\n" > + exp_continue > + } else { > + gdb_assert {$saw_expected_event} $gdb_test_name > + } > + } > + } > + > + # Turn off solib events. We're only interested in validating the > + # loads for now. > + gdb_test_no_output "set stop-on-solib-events 0" > +} > + > +# Start GDB and run to the point where the test program tries to dlopen a > file > +# from within /proc/self/fd/. Catch the shared library event and check > that > +# we actually try to load a file from /proc//fd/. > +# > +# If SYSROOT is not the empty string, then this is set as the value of > GDB's > +# sysroot immediately after starting GDB. The only value that is > (currently) > +# supported, other than the empty string, is 'target:'. > +proc do_test { {sysroot ""} } { > + clean_restart $::binfile > + > + if {$sysroot != ""} { > + gdb_test_no_output "set sysroot ${sysroot}" > + } > + > + gdb_load_shlib $::libobj > + > + if ![runto_main] then { > + return 0 > + } > + > + # Get inferior's PID for later. > + set inferior_pid [get_inferior_pid] > + > + # Run to the 'break-here' marker. > + gdb_breakpoint [gdb_get_line_number "break-here"] > + gdb_continue_to_breakpoint "break-here" ".* break-here .*" > + > + set memfd "" > + gdb_test_multiple "p mem_fd" "Get file descriptor" { > + -re -wrap "\\\$$::decimal = (\[^\r\n\]*)" { > + set memfd $expect_out(1,string) > + pass $gdb_test_name > + } > + } > + > + # The first call to open_library (in the test program) is done > + # with a canonical path. > + with_test_prefix "canonical path" { > + gdb_breakpoint "open_library" > + gdb_continue_to_breakpoint "open_library" > + validate_library_load $inferior_pid $memfd > + } > + > + # The second call to open_library (in the test program) is done > + # with a non-canonical path, however, GDB should resolve this to a > + # canonical path for display to the user, so the output we see > + # should be unchanged. > + with_test_prefix "non-canonical path" { > + gdb_breakpoint "open_library" > + gdb_continue_to_breakpoint "open_library" > + validate_library_load $inferior_pid $memfd > + } > +} > + > +# First run of the test. > +do_test > + > +# Possible second run of the test. If we are using a remote target then > we > +# should consider setting the sysroot to 'target:' and re-running the > test. > +if {[target_info exists gdb_protocol] > + && ([target_info gdb_protocol] == "remote" > + || [target_info gdb_protocol] == "extended-remote")} { > + # GDB will already be running after the first call to do_test, so we > can > + # take a peek at the current sysroot setting, and decide if we should > + # repeat the test with a different setting. > + > + set new_sysroot "" > + gdb_test_multiple "show sysroot" "" { > + -wrap -re "The current system root is \"\"\\." { > + pass $gdb_test_name > + > + # Repeat the test with 'target:' sysroot. > + set new_sysroot "target:" > + } > + -wrap -re "The current system root is \"target:\"\\." { > + pass $gdb_test_name > + > + # Nothing else to do, we already tested with target: sysroot. > + } > + -re "$gdb_prompt $" { > + pass $gdb_test_name > + > + # If already testing with any other sysroot, we probably should > + # not try to adjust things, so don't do any further testing. > + } > + } > + > + with_test_prefix "sysroot $new_sysroot" { > + if { $new_sysroot != "" } { > + do_test $new_sysroot > + } > + } > +} > >