From: Asaf Fisher <asaffisher.dev@gmail.com>
To: Andrew Burgess <aburgess@redhat.com>
Cc: gdb-patches@sourceware.org, Dr Lancelot SIX <lsix@lancelotsix.com>
Subject: Re: [PATCHv5] gdb: handle loading shared libraries from /proc/self/fd/
Date: Fri, 16 Dec 2022 18:59:23 +0200 [thread overview]
Message-ID: <CAHmEStvxHNrhAFP=soHU8qRQOPGCt4xaa_2dZJTEzZh+uJcUtQ@mail.gmail.com> (raw)
In-Reply-To: <87r0x0zke6.fsf@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 19478 bytes --]
Looks good to me!
On Thu, Dec 15, 2022 at 6:44 PM Andrew Burgess <aburgess@redhat.com> wrote:
>
> Asaf pointed out to me (off list) that v4 fails for some cases that the
> original patch supported, so here's a v5. Specifically, if the dlopen
> call is passed a non-canonical path then GDB will fail to correctly spot
> that the file being opened is within /proc/self, and will still try to
> open a file within GDB's /proc/self by mistake.
>
> Changes since v4:
>
> - filter_proc_self_filenames now makes use of gdb_realpath_keepfile to
> (partially) canonicalize the path, this should fix the non-canonical
> path issues,
>
> - extended the test to include the non-canonical path case.
>
> Changes since v3:
>
> - Fixed NULL/nullptr issues Lancelot pointed out,
>
> - Updated the test case to take account of non-probes based shared
> library event handling.
>
> ---
>
> commit 65c1f020184e632a6de1775c00de7e0d7a97def3
> Author: Asaf Fisher via Gdb-patches <gdb-patches@sourceware.org>
> Date: Fri Oct 21 17:42:05 2022 +0000
>
> gdb: handle loading shared libraries from /proc/self/fd/
>
> Bug PR gdb/29586 describes a situations where a shared library is
> created in memory, then written to a memory mapped file. The memory
> mapped file will show up as a file descriptor within /proc/self/fd/,
> and this path is then used with dlopen in order to call functions
> within the in-memory shared library.
>
> When attempting to debug this GDB hangs. The problem is that, GDB
> stops at the shared-library event, reads the shared library path from
> the inferior, which is "/proc/self/fd/<NUM>", and then GDB attempts to
> open this file.
>
> Unfortunately, this means GDB tries to open a file within GDB's
> /proc/self/fd/ directory, not within the inferior's directory. In the
> case of our hang it turns out that the file descriptor that is opened
> is a pipe, and GDB hangs trying to read from the pipe.
>
> However, the behaviour is really just undefined, depending on which
> file descriptor the inferior tries to open, GDB will open, or fail to
> open, random files within its /proc/self/fd directory.
>
> The solution proposed in this commit is to hook into solib_find_1, and
> spot when GDB is looking for any file in /proc/self/, if this is the
> case, then the filename is rewritten as /proc/<PID>, where <PID> is
> the process-id of the current inferior.
>
> One complexity that we need to consider is how to handle non-canonical
> paths, for example, if the user opens /proc/../proc/self/fd/<NUM> then
> GDB also needs to be able to spot this.
>
> My solution is to call gdb_realpath_keepfile, this canonicalizes
> everything except the filename (the '/proc/../proc/self/fd' part in
> the above example). The problem this then introduces is that the
> 'self' is replaces with the pid of GDB itself. I therefore check,
> after canonicalization, for a prefix of /proc/<PID OF GDB>, and if
> this is the case, I replace this with /proc/<PID OF INFERIOR>.
>
> There is one remaining issue, which I consider very unlikely, and I
> don't propose to fix. What if the inferior actually did try to dlopen
> a file from /proc/<PID OF GDB>? If this happened then GDB will not be
> able to distinguish this from the above case and would still replace
> the pid of GDB with the pid of the inferior. However, this seems
> really unlikely - why would the inferior be trying to load parts of
> GDB? So I propose to ignore this case until someone can come up with
> an actual scenario where that's useful.
>
> The test passes for the unix, native-gdbserver, and
> native-extended-gdbserver targets. When testing with either of the
> gdbserver targets, the test is run using the default empty sysroot,
> and also using the 'target:' sysroot.
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29586
>
> Co-authored-by: Andrew Burgess <aburgess@redhat.com>
>
> diff --git a/gdb/solib.c b/gdb/solib.c
> index 59fd866b652..a627d217c85 100644
> --- a/gdb/solib.c
> +++ b/gdb/solib.c
> @@ -50,6 +50,7 @@
> #include "gdb_bfd.h"
> #include "gdbsupport/filestuff.h"
> #include "gdbsupport/scoped_fd.h"
> +#include "gdbsupport/pathstuff.h"
> #include "debuginfod-support.h"
> #include "source.h"
> #include "cli/cli-style.h"
> @@ -79,6 +80,54 @@ show_solib_search_path (struct ui_file *file, int
> from_tty,
> # define DOS_BASED_FILE_SYSTEM 0
> #endif
>
> +/* Fix references to files in /proc/self/fd/ when opening a shared
> library.
> +
> + SO_NAME is the name of the shared library being loaded. This function
> + returns a possibly modified name which should be used as the path to
> the
> + shared library.
> +
> + If SO_NAME starts with /proc/self, then the returned name will be
> + modified to start with /proc/PID where 'PID' is the pid of the current
> + inferior. */
> +
> +static gdb::unique_xmalloc_ptr<char>
> +filter_proc_self_filenames (gdb::unique_xmalloc_ptr<char> so_name)
> +{
> + /* In order to figure out if SO_NAME points at a file in /proc/self we
> + need to canonicalize the path. However, the whole point here is that
> + if SO_NAME does point at /proc/self then GDB will think this is its
> + self, while the inferior actually means its /proc/self. A result of
> + this is that the file being referenced within the inferior's
> + /proc/self, might not exist within GDB's /proc/self, a call to
> + gdb_realpath would then fail.
> +
> + To avoid this problem we call gdb_realpath_keepfile, this only tries
> + to canonicalize the basename of SO_NAME, and leaves the final
> filename
> + untouched. */
> + so_name
> + = make_unique_xstrdup (gdb_realpath_keepfile (so_name.get ()).c_str
> ());
> +
> + /* A result of the above canonicalization is that /proc/self will have
> + been replaced with /proc/<pid of gdb>, so that's what we need to
> check
> + for. Of course, it could be the case that the inferior really did
> try
> + to dlopen a file within GDB's /proc/<pid> directory, in which case
> + we're going to do the wrong thing here, but that seems far less
> likely
> + than calling dlopen on a file within the inferior's own directory.
> */
> + std::string prefix = string_printf ("/proc/%ld", (long) getpid ());
> +
> + /* Is the canonical path inside GDB's /proc/<pid> directory? */
> + if (!startswith (so_name.get (), prefix.c_str ()))
> + return so_name;
> +
> + /* Get the part of the path after /proc/<pid>. For example given
> + '/proc/123/fd' we find the '/fd' part. */
> + const char *tail = so_name.get () + strlen (prefix.c_str ());
> +
> + /* Build a replacement path within the inferiors directory. */
> + int inferior_pid = inferior_ptid.pid ();
> + return xstrprintf ("/proc/%d%s", inferior_pid, tail);
> +}
> +
> /* Return the full pathname of a binary file (the main executable or a
> shared library file), or NULL if not found. If FD is non-NULL, *FD
> is set to either -1 or an open file handle for the binary file.
> @@ -168,6 +217,9 @@ solib_find_1 (const char *in_pathname, int *fd, bool
> is_solib)
> }
> }
>
> + temp_pathname.reset (xstrdup (in_pathname));
> + temp_pathname = filter_proc_self_filenames (std::move (temp_pathname));
> +
> /* Note, we're interested in IS_TARGET_ABSOLUTE_PATH, not
> IS_ABSOLUTE_PATH. The latter is for host paths only, while
> IN_PATHNAME is a target path. For example, if we're supposed to
> @@ -180,9 +232,7 @@ solib_find_1 (const char *in_pathname, int *fd, bool
> is_solib)
> 3rd attempt, c:/foo/bar.dll ==> /sysroot/foo/bar.dll
> */
>
> - if (!IS_TARGET_ABSOLUTE_PATH (fskind, in_pathname) || sysroot == NULL)
> - temp_pathname.reset (xstrdup (in_pathname));
> - else
> + if (IS_TARGET_ABSOLUTE_PATH (fskind, in_pathname) && sysroot != nullptr)
> {
> bool need_dir_separator;
>
> @@ -209,7 +259,7 @@ solib_find_1 (const char *in_pathname, int *fd, bool
> is_solib)
> /* Cat the prefixed pathname together. */
> temp_pathname.reset (concat (sysroot,
> need_dir_separator ? SLASH_STRING : "",
> - in_pathname, (char *) NULL));
> + temp_pathname.get (), nullptr));
> }
>
> /* Handle files to be accessed via the target. */
> diff --git a/gdb/testsuite/gdb.base/solib-proc-self.cc
> b/gdb/testsuite/gdb.base/solib-proc-self.cc
> new file mode 100644
> index 00000000000..aea397a66f9
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-proc-self.cc
> @@ -0,0 +1,89 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> + Copyright 2022 Free Software Foundation, Inc.
> +
> + This program is free software; you can redistribute it and/or modify
> + it under the terms of the GNU General Public License as published by
> + the Free Software Foundation; either version 3 of the License, or
> + (at your option) any later version.
> +
> + This program is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + GNU General Public License for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
> +
> +#include <sys/mman.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <iostream>
> +#include <fstream>
> +#include <sstream>
> +#include <vector>
> +#include <unistd.h>
> +
> +#ifdef __WIN32__
> +#include <windows.h>
> +#define dlopen(name, mode) LoadLibrary (name)
> +#define dlclose(handle) FreeLibrary (handle)
> +#define dlerror() "an error occurred"
> +#else
> +#include <dlfcn.h>
> +#endif
> +
> +/* Combine PREFIX and NUM into a single string, joined with a '/', and
> + return the string. */
> +
> +std::string
> +make_library_path (const char *prefix, int num)
> +{
> + std::stringstream stream;
> + stream << prefix << "/" << num;
> + return stream.str ();
> +}
> +
> +/* Call dlopen on the library pointed to by FILENAME. */
> +
> +void
> +open_library (const std::string &filename)
> +{
> + /* Call dlopen on FILENAME. */
> + void *handle = dlopen (filename.c_str (), RTLD_LAZY);
> + if (handle == nullptr)
> + abort ();
> +
> + /* It worked. */
> + dlclose (handle);
> +}
> +
> +int
> +main ()
> +{
> + /* Read the shared libraries contents into a buffer. */
> + std::ifstream read_so_file = std::ifstream (SHLIB_NAME);
> + read_so_file.seekg (0, std::ios::end);
> + std::streamsize size = read_so_file.tellg ();
> + read_so_file.seekg (0, std::ios::beg);
> + std::vector<char> buffer (size);
> + if (!read_so_file.read (buffer.data (), size))
> + abort ();
> +
> + /* Create a memory mapped file, then write the shared library to that
> + new memory mapped file. */
> + int mem_fd = memfd_create ("test", 0);
> + write (mem_fd, buffer.data (), buffer.size ());
> +
> + /* Generate a canonical /proc/self/fd/[num] path for the memory mapped
> + file, and call dlopen on it. */
> + std::string filename
> + = make_library_path ("/proc/self/fd", mem_fd); /* break-here */
> + open_library (filename);
> +
> + /* Now generate a new, non-canonical filename, and call dlopen on it.
> */
> + filename = make_library_path ("/proc/../proc/self/fd", mem_fd);
> + open_library (filename);
> +
> + return 0;
> +}
> diff --git a/gdb/testsuite/gdb.base/solib-proc-self.exp
> b/gdb/testsuite/gdb.base/solib-proc-self.exp
> new file mode 100644
> index 00000000000..01e7eecfb16
> --- /dev/null
> +++ b/gdb/testsuite/gdb.base/solib-proc-self.exp
> @@ -0,0 +1,200 @@
> +# Copyright 2022 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
> +
> +# Test connecting and disconnecting at shared library events.
> +
> +if {[skip_shlib_tests]} {
> + untested "could not run to main"
> + return 0
> +}
> +
> +standard_testfile .cc
> +
> +# Reuse an existing library, we don't care about the library contents
> +# for this test.
> +set libfile so-disc-shr
> +set libsrc "${srcdir}/${subdir}/${libfile}.c"
> +set libname "${libfile}.so"
> +set libobj [standard_output_file ${libname}]
> +
> +# Compile the shared library.
> +if { [gdb_compile_shlib $libsrc $libobj {debug}] != ""} {
> + return -1
> +}
> +
> +# Compile the test executable.
> +if [ build_executable "failed to prepare" $testfile $srcfile \
> + [list shlib_load debug c++
> additional_flags=-DSHLIB_NAME="${libobj}"]] {
> + return -1
> +}
> +
> +proc validate_library_load {inferior_pid memfd} {
> +
> + # Turn on the solib-events so we can see that gdb resolves
> + # everything correctly.
> + gdb_test_no_output "set stop-on-solib-events 1"
> +
> + # Now continue forward until the solib event is detected, and
> + # check that the loaded library is found through the /proc/PID/fd
> + # rather than /proc/self/fd.
> + #
> + # We need to handle the possibility of the interesting event
> + # showing the first, or second time we stop, as depending on which
> + # mechanism GDB is using to handle the shared library events (the
> + # newer probes based interface, or the old non-probes interface),
> + # the library will be reported at the first or second stop.
> + set saw_expected_event false
> + set saw_no_event_stop false
> + gdb_test_multiple "continue" "continue to solib evnt" {
> + -re "^continue\r\n" {
> + exp_continue
> + }
> +
> + -re "^Continuing\\.\r\n" {
> + exp_continue
> + }
> +
> + -early -re "Stopped due to shared library event \\(no libraries
> added or removed\\)\r\n" {
> + # This non-interesting event shows up first when using the
> + # probes based mechanism for dealing with shared library
> + # events.
> + #
> + # We set a flag here, and, once the prompt has appeared,
> + # we send another continue, the next event will contain
> + # the information we want.
> + set saw_no_event_stop true
> + exp_continue
> + }
> +
> + -re "Stopped due to shared library event:\r\n Inferior loaded
> (?:target:)?/proc/${inferior_pid}/fd/$memfd\r\n" {
> + # This event, which includes the information we are
> + # looking for, occurs first when using the non-probes
> + # based mechanism for handling shared library events, and
> + # occurs second when using the probes mechanism.
> + #
> + # Either way, record here that we say the output we expect.
> + set saw_expected_event true
> + exp_continue
> + }
> +
> + -re "$::gdb_prompt $" {
> + if {$saw_no_event_stop} {
> + set saw_no_event_stop false
> + send_gdb "continue\n"
> + exp_continue
> + } else {
> + gdb_assert {$saw_expected_event} $gdb_test_name
> + }
> + }
> + }
> +
> + # Turn off solib events. We're only interested in validating the
> + # loads for now.
> + gdb_test_no_output "set stop-on-solib-events 0"
> +}
> +
> +# Start GDB and run to the point where the test program tries to dlopen a
> file
> +# from within /proc/self/fd/. Catch the shared library event and check
> that
> +# we actually try to load a file from /proc/<INFERIOR-PID>/fd/.
> +#
> +# If SYSROOT is not the empty string, then this is set as the value of
> GDB's
> +# sysroot immediately after starting GDB. The only value that is
> (currently)
> +# supported, other than the empty string, is 'target:'.
> +proc do_test { {sysroot ""} } {
> + clean_restart $::binfile
> +
> + if {$sysroot != ""} {
> + gdb_test_no_output "set sysroot ${sysroot}"
> + }
> +
> + gdb_load_shlib $::libobj
> +
> + if ![runto_main] then {
> + return 0
> + }
> +
> + # Get inferior's PID for later.
> + set inferior_pid [get_inferior_pid]
> +
> + # Run to the 'break-here' marker.
> + gdb_breakpoint [gdb_get_line_number "break-here"]
> + gdb_continue_to_breakpoint "break-here" ".* break-here .*"
> +
> + set memfd ""
> + gdb_test_multiple "p mem_fd" "Get file descriptor" {
> + -re -wrap "\\\$$::decimal = (\[^\r\n\]*)" {
> + set memfd $expect_out(1,string)
> + pass $gdb_test_name
> + }
> + }
> +
> + # The first call to open_library (in the test program) is done
> + # with a canonical path.
> + with_test_prefix "canonical path" {
> + gdb_breakpoint "open_library"
> + gdb_continue_to_breakpoint "open_library"
> + validate_library_load $inferior_pid $memfd
> + }
> +
> + # The second call to open_library (in the test program) is done
> + # with a non-canonical path, however, GDB should resolve this to a
> + # canonical path for display to the user, so the output we see
> + # should be unchanged.
> + with_test_prefix "non-canonical path" {
> + gdb_breakpoint "open_library"
> + gdb_continue_to_breakpoint "open_library"
> + validate_library_load $inferior_pid $memfd
> + }
> +}
> +
> +# First run of the test.
> +do_test
> +
> +# Possible second run of the test. If we are using a remote target then
> we
> +# should consider setting the sysroot to 'target:' and re-running the
> test.
> +if {[target_info exists gdb_protocol]
> + && ([target_info gdb_protocol] == "remote"
> + || [target_info gdb_protocol] == "extended-remote")} {
> + # GDB will already be running after the first call to do_test, so we
> can
> + # take a peek at the current sysroot setting, and decide if we should
> + # repeat the test with a different setting.
> +
> + set new_sysroot ""
> + gdb_test_multiple "show sysroot" "" {
> + -wrap -re "The current system root is \"\"\\." {
> + pass $gdb_test_name
> +
> + # Repeat the test with 'target:' sysroot.
> + set new_sysroot "target:"
> + }
> + -wrap -re "The current system root is \"target:\"\\." {
> + pass $gdb_test_name
> +
> + # Nothing else to do, we already tested with target: sysroot.
> + }
> + -re "$gdb_prompt $" {
> + pass $gdb_test_name
> +
> + # If already testing with any other sysroot, we probably should
> + # not try to adjust things, so don't do any further testing.
> + }
> + }
> +
> + with_test_prefix "sysroot $new_sysroot" {
> + if { $new_sysroot != "" } {
> + do_test $new_sysroot
> + }
> + }
> +}
>
>
next prev parent reply other threads:[~2022-12-16 16:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-21 17:42 [PATCH v3 1/2] Add test to check GDB handles dlopen of /proc/self/fd/[num] correctly Asaf Fisher
2022-10-21 17:42 ` [PATCH v3 2/2] Make GDB resolve dlopen of memory mapped shared libraries Asaf Fisher
2022-11-10 19:37 ` Andrew Burgess
2022-11-11 12:35 ` Dr Lancelot SIX
2022-11-11 12:47 ` Asaf Fisher
2022-11-21 11:55 ` Andrew Burgess
2022-11-21 17:56 ` [PATCHv4] gdb: handle loading shared libraries from /proc/self/fd/ Andrew Burgess
2022-12-14 11:51 ` Andrew Burgess
2022-12-15 16:44 ` [PATCHv5] " Andrew Burgess
2022-12-16 16:59 ` Asaf Fisher [this message]
2023-01-20 12:33 ` [PATCHv6] " Andrew Burgess
2023-01-25 13:30 ` Pedro Alves
2022-10-24 10:45 ` [PATCH v3 1/2] Add test to check GDB handles dlopen of /proc/self/fd/[num] correctly Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHmEStvxHNrhAFP=soHU8qRQOPGCt4xaa_2dZJTEzZh+uJcUtQ@mail.gmail.com' \
--to=asaffisher.dev@gmail.com \
--cc=aburgess@redhat.com \
--cc=gdb-patches@sourceware.org \
--cc=lsix@lancelotsix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).