From: Pedro Alves <pedro@palves.net>
To: Andrew Burgess <aburgess@redhat.com>, Eli Zaretskii <eliz@gnu.org>
Cc: tom@tromey.com, gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdb: avoid '//' in filenames when searching for debuginfo
Date: Fri, 14 Jun 2024 18:48:40 +0100 [thread overview]
Message-ID: <d28a7ed9-444f-4685-9567-15810395e598@palves.net> (raw)
In-Reply-To: <87ed8z8xfc.fsf@redhat.com>
On 2024-06-14 14:29, Andrew Burgess wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> From: Andrew Burgess <aburgess@redhat.com>
>>> Cc: gdb-patches@sourceware.org
>>> Date: Fri, 14 Jun 2024 09:58:09 +0100
>>>
>>> Tom Tromey <tom@tromey.com> writes:
>>>
>>>>>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
>>>>
>>>> Andrew> The new combine_paths function is similar to the existing path_join
>>>> Andrew> function (see gdbsupport/pathstuff.h), but unlike path_join the new
>>>> Andrew> function doesn't require that the first path be absolute.
>>>>
>>>> From what I can see, path_join doesn't have this requirement.
>>>> The docs say "The first element can be absolute or relative" and the
>>>> code seems to agree.
>>>
>>> You're absolutely right. And I see what's happened. Somewhere along
>>> the line the logic has gotten reversed in my head. The problem isn't
>>> that the first _must_ be absolute with path_join, it's that everything
>>> else _must_not_ be absolute, see these lines in path_join:
>>>
>>> if (i > 0)
>>> gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));
>>>
>>> And for what we're doing here we, for example, join the path to the file
>>> (which is absolute) onto the debug directory, which might, or might not,
>>> be absolute, e.g. these lines in find_separate_debug_file:
>>>
>>> /* If the file is in the sysroot, try using its base path in
>>> the global debugfile directory. */
>>> debugfile = path_join (target_prefix_str, debugdir.get (),
>>> base_path, debuglink);
>>>
>>> I've initially understood the problem, written the code, then come back
>>> and added comments and a commit message, and gotten mixed up somewhere.
>>> Sorry about that.
>>>
>>> Anyway, here's an updated patch. I've change one comment in symfile.c,
>>> and I've also updated the commit message, other than that the patch is
>>> unchanged.
>>>
>>> Oh, and just because I was doubting myself, I did try using path_join,
>>> and as predicted the test (sysroot-debug-lookup.exp) causes GDB to
>>> trigger an assertion as we are appending an absolute path.
>>
>> I suggest to test this with the first argument empty and the second
>> starting with two slashes: those should be left intact, at least on
>> MS-Windows (due to UNCs).
>
> Sorry, I'm feel foggy today... what is 'this' in this paragraph.
>
> So, I went and read up what UNCs means, I believe it's filenames like:
>
> \\SERVER\foo\bar\woof
>
> And I guess, if they can start with forward slashes then maybe this is
> valid too?
>
> //SERVER/foo/bar/woof
>
> I'm pretty sure that's not currently going to work, at least in some
> places we'll try to do:
>
> /debug-directory//SERVER/foo/bar/woof
>
> which probably isn't going to help. Right now the new combine_path is
> going to mush this into '/debug-directory/SERVER/foo/bar/woof' dropping
> one of the '/' characters. I can change this if the double slash is in
> fact useful in the middle of the filename, but I couldn't find anything
> immediately that suggests it would be.
>
>> Also, how do we make sure none of the arguments except the first one
>> is an absolute file name? Or if we cannot ensure that, how do we
>> handle the case where some non-first argument is an absolute file
>> name?
>
> In path_join (gdbsupport/pathstuff.cc)? We do this:
>
> if (i > 0)
> gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));
>
> Where 'i' is the index into the array of filenames to combine, for the
> first item we don't do any checks. For the rest the filename needs to
> either be empty, or not absolute.
>
> For the new combine_paths (gdb/symfile.c)? We don't do any checks,
> that's the point. Paths after the first one might be absolute, and we
> still want to join them together.
>
>>
>> Btw, the GNU Coding Standards frown on using "path" for anything but
>> PATH-style directory lists. The ones here should be called "file
>> name" of "file_name" or "fname" or "file" or "directory", but not
>> "path".
>
> I've updated some of the uses of path to filename in the patch below.
> Mostly in the commit message. Within GDB's code base the use of path to
> mean filename or directory name is pretty much ubiquitous
> unfortunately. Still, I've tried to switch where I think it makes
> sense.
>
> OOI, what should we use when we don't know if something is a filename or
> directory name? e.g. a function that joins such things (whatever they
> are called) together? i.e. path_join might be used to join together two
> directory names. Or a directory name and a filename. If we call the
> arguments 'filename' then that seems confusing, a directory name is just
> as valid. And similarly, using 'directory name' seems wrong too. This,
> I think, is where 'path' starts to creep in. Plus you have libc
> functions like 'realpath' which use 'path' in the same way....
>
> Anyway. An updated patch is below.
>
> Thanks,
> Andrew
>
> ---
>
> commit 74235cbf9db8f364ff8ed46898c34ccc6f38c5b9
> Author: Andrew Burgess <aburgess@redhat.com>
> Date: Wed Jun 12 15:24:46 2024 +0100
>
> gdb: avoid '//' in filenames when searching for debuginfo
>
> I spotted that the gdb.base/sysroot-debug-lookup.exp test that I added
> recently actually had a KPASS when run with the
> native-extended-gdbserver board. This was an oversight when adding
> the test.
>
> The failures in this test, when using the 'unix' board, are logged as
> bug PR gdb/31804. The problem appears to be caused by the use of the
> child_path function in find_separate_debug_file.
>
> What happens on the 'unix' board is that the file is specified to GDB
> with a target: prefix, however GDB spots that the target filesystem is
> local to GDB and so opens the file without a target: prefix. When we
> call into find_separate_debug_file the DIR and CANON_DIR arguments,
> which are computed from the objfile_name() no longer have a target:
> prefix.
>
> However, in this test if the file was opened with a target: prefix,
> then the sysroot also has a target: prefix. When child_path is called
> it looks for a common prefix between CANON_DIR (from the objfile_name)
> and the sysroot. However, the sysroot still has the target: prefix,
> which means the child_path() call fails and returns nullptr.
>
> What happens in the native-extended-gdbserver case is that GDB doesn't
> see the target filesystem as local. Now the filename retains the
> target: prefix, which means that in the child_path() call both the
> sysroot and the CANON_DIR have a target: prefix, and so the
> child_path() call succeeds. This allows GDB to progress, try some
> additional paths, and then find the debug information.
>
> So, this commit changes gdb.base/sysroot-debug-lookup.exp to expect
> the test to succeed when using the native-extended-gdbserver protocol.
>
> This leaves one KFAIL when using the native-extended-gdbserver board,
> we find the debug information but (apparently) find it in the wrong
> file. What's happening is that when GDB builds the filename for the
> debug information we end up with a '//' string as a directory
> separator, the test regexp only expects a single separator.
>
> Instead of just fixing the test regexp, I've added a new function
> combine_paths which I then use for building the debug info filenames.
> This function takes care of ensuring only a single '/' is added when
> joining filenames together. With this done I now have no KFAIL when
> using the native-extended-gdbserver board.
>
> The new combine_paths function is similar to the existing path_join
> function (see gdbsupport/pathstuff.h), but unlike path_join the new
> function doesn't require every filename after the first be a relative
> filename. For now I've left the new function in symfile.c, it can
> always be moved if this proves to be more widely useful.
>
> After this commit we still have 2 KFAIL when not using the
> native-extended-gdbserver board.
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31804
>
> diff --git a/gdb/symfile.c b/gdb/symfile.c
> index 5a03def91c6..d8120aa00e0 100644
> --- a/gdb/symfile.c
> +++ b/gdb/symfile.c
> @@ -1344,6 +1344,48 @@ show_debug_file_directory (struct ui_file *file, int from_tty,
> #define DEBUG_SUBDIRECTORY ".debug"
> #endif
>
> +/* This is like path_join, but does not have the requirement that every
> + filename after the first be a relative filename. All PATHS are joined
> + together and a directory separator is added if needed between joined
> + elements.
> +
> + This allows us to do things like, e.g. append the absolute filename of a
> + file onto the debug file directory name. */
> +
> +template<typename ...Args>
> +static std::string
> +combine_paths (Args... paths)
> +{
> + /* It doesn't make sense to join less than two filenames. */
> + static_assert (sizeof... (Args) >= 2);
> +
> + std::array<const char *, sizeof... (Args)> path_array
> + { paths... };
> +
> + std::string ret;
> +
> + for (int i = 0; i < path_array.size (); ++i)
> + {
> + const char *path = path_array[i];
> +
> + if (!ret.empty ())
> + {
> + /* If RET doesn't end in a separator then add one now. */
> + if (!IS_DIR_SEPARATOR (ret.back ()))
> + ret += '/';
> +
> + /* Now that RET ends in a separator ignore any at the start of
> + PATH. */
> + while (IS_DIR_SEPARATOR (path[0]))
> + ++path;
> + }
> +
> + ret.append (path);
> + }
> +
> + return ret;
> +}
path_join does the bulk of the work in an array_view overload, which avoids
code bloat from having multiple instances of the same code for different
arguments. I think this would benefit from doing the same.
next prev parent reply other threads:[~2024-06-14 17:48 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-13 9:56 [PATCH 0/2] find debug by debuglink minor fixes Andrew Burgess
2024-06-13 9:56 ` [PATCH 1/2] gdb: avoid '//' in filenames when searching for debuginfo Andrew Burgess
2024-06-13 16:46 ` Tom Tromey
2024-06-14 8:58 ` Andrew Burgess
2024-06-14 11:06 ` Eli Zaretskii
2024-06-14 13:29 ` Andrew Burgess
2024-06-14 14:18 ` Eli Zaretskii
2024-06-15 10:24 ` Andrew Burgess
2024-06-15 12:45 ` Eli Zaretskii
2024-06-15 15:28 ` Andrew Burgess
2024-06-15 15:33 ` Andrew Burgess
2024-06-15 16:11 ` Eli Zaretskii
2024-06-14 17:48 ` Pedro Alves [this message]
2024-06-15 9:54 ` Andrew Burgess
2024-06-14 14:23 ` Tom Tromey
2024-06-15 9:53 ` Andrew Burgess
2024-06-13 9:56 ` [PATCH 2/2] gdb: fix a target: prefix issue in find_separate_debug_file Andrew Burgess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d28a7ed9-444f-4685-9567-15810395e598@palves.net \
--to=pedro@palves.net \
--cc=aburgess@redhat.com \
--cc=eliz@gnu.org \
--cc=gdb-patches@sourceware.org \
--cc=tom@tromey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).