public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Andrew Burgess <aburgess@redhat.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: tom@tromey.com, gdb-patches@sourceware.org
Subject: Re: [PATCH 1/2] gdb: avoid '//' in filenames when searching for debuginfo
Date: Fri, 14 Jun 2024 14:29:43 +0100	[thread overview]
Message-ID: <87ed8z8xfc.fsf@redhat.com> (raw)
In-Reply-To: <86o783ye9b.fsf@gnu.org>

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andrew Burgess <aburgess@redhat.com>
>> Cc: gdb-patches@sourceware.org
>> Date: Fri, 14 Jun 2024 09:58:09 +0100
>> 
>> Tom Tromey <tom@tromey.com> writes:
>> 
>> >>>>>> "Andrew" == Andrew Burgess <aburgess@redhat.com> writes:
>> >
>> > Andrew> The new combine_paths function is similar to the existing path_join
>> > Andrew> function (see gdbsupport/pathstuff.h), but unlike path_join the new
>> > Andrew> function doesn't require that the first path be absolute.
>> >
>> > From what I can see, path_join doesn't have this requirement.
>> > The docs say "The first element can be absolute or relative" and the
>> > code seems to agree.
>> 
>> You're absolutely right.  And I see what's happened.  Somewhere along
>> the line the logic has gotten reversed in my head.  The problem isn't
>> that the first _must_ be absolute with path_join, it's that everything
>> else _must_not_ be absolute, see these lines in path_join:
>> 
>>       if (i > 0)
>> 	gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));
>> 
>> And for what we're doing here we, for example, join the path to the file
>> (which is absolute) onto the debug directory, which might, or might not,
>> be absolute, e.g. these lines in find_separate_debug_file:
>> 
>>   /* If the file is in the sysroot, try using its base path in
>>      the global debugfile directory.  */
>>   debugfile = path_join (target_prefix_str, debugdir.get (),
>> 			 base_path, debuglink);
>> 
>> I've initially understood the problem, written the code, then come back
>> and added comments and a commit message, and gotten mixed up somewhere.
>> Sorry about that.
>> 
>> Anyway, here's an updated patch.  I've change one comment in symfile.c,
>> and I've also updated the commit message, other than that the patch is
>> unchanged.
>> 
>> Oh, and just because I was doubting myself, I did try using path_join,
>> and as predicted the test (sysroot-debug-lookup.exp) causes GDB to
>> trigger an assertion as we are appending an absolute path.
>
> I suggest to test this with the first argument empty and the second
> starting with two slashes: those should be left intact, at least on
> MS-Windows (due to UNCs).

Sorry, I'm feel foggy today... what is 'this' in this paragraph.

So, I went and read up what UNCs means, I believe it's filenames like:

  \\SERVER\foo\bar\woof

And I guess, if they can start with forward slashes then maybe this is
valid too?

  //SERVER/foo/bar/woof

I'm pretty sure that's not currently going to work, at least in some
places we'll try to do:

 /debug-directory//SERVER/foo/bar/woof

which probably isn't going to help.  Right now the new combine_path is
going to mush this into '/debug-directory/SERVER/foo/bar/woof' dropping
one of the '/' characters.  I can change this if the double slash is in
fact useful in the middle of the filename, but I couldn't find anything
immediately that suggests it would be.

> Also, how do we make sure none of the arguments except the first one
> is an absolute file name?  Or if we cannot ensure that, how do we
> handle the case where some non-first argument is an absolute file
> name?

In path_join (gdbsupport/pathstuff.cc)?  We do this:

      if (i > 0)
	gdb_assert (strlen (path) == 0 || !IS_ABSOLUTE_PATH (path));

Where 'i' is the index into the array of filenames to combine, for the
first item we don't do any checks.  For the rest the filename needs to
either be empty, or not absolute.

For the new combine_paths (gdb/symfile.c)?  We don't do any checks,
that's the point.  Paths after the first one might be absolute, and we
still want to join them together.

>
> Btw, the GNU Coding Standards frown on using "path" for anything but
> PATH-style directory lists.  The ones here should be called "file
> name" of "file_name" or "fname" or "file" or "directory", but not
> "path".

I've updated some of the uses of path to filename in the patch below.
Mostly in the commit message.  Within GDB's code base the use of path to
mean filename or directory name is pretty much ubiquitous
unfortunately.  Still, I've tried to switch where I think it makes
sense.

OOI, what should we use when we don't know if something is a filename or
directory name?  e.g. a function that joins such things (whatever they
are called) together?  i.e. path_join might be used to join together two
directory names.  Or a directory name and a filename.  If we call the
arguments 'filename' then that seems confusing, a directory name is just
as valid.  And similarly, using 'directory name' seems wrong too.  This,
I think, is where 'path' starts to creep in.  Plus you have libc
functions like 'realpath' which use 'path' in the same way....

Anyway.  An updated patch is below.

Thanks,
Andrew

---

commit 74235cbf9db8f364ff8ed46898c34ccc6f38c5b9
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Wed Jun 12 15:24:46 2024 +0100

    gdb: avoid '//' in filenames when searching for debuginfo
    
    I spotted that the gdb.base/sysroot-debug-lookup.exp test that I added
    recently actually had a KPASS when run with the
    native-extended-gdbserver board.  This was an oversight when adding
    the test.
    
    The failures in this test, when using the 'unix' board, are logged as
    bug PR gdb/31804.  The problem appears to be caused by the use of the
    child_path function in find_separate_debug_file.
    
    What happens on the 'unix' board is that the file is specified to GDB
    with a target: prefix, however GDB spots that the target filesystem is
    local to GDB and so opens the file without a target: prefix.  When we
    call into find_separate_debug_file the DIR and CANON_DIR arguments,
    which are computed from the objfile_name() no longer have a target:
    prefix.
    
    However, in this test if the file was opened with a target: prefix,
    then the sysroot also has a target: prefix.  When child_path is called
    it looks for a common prefix between CANON_DIR (from the objfile_name)
    and the sysroot.  However, the sysroot still has the target: prefix,
    which means the child_path() call fails and returns nullptr.
    
    What happens in the native-extended-gdbserver case is that GDB doesn't
    see the target filesystem as local.  Now the filename retains the
    target: prefix, which means that in the child_path() call both the
    sysroot and the CANON_DIR have a target: prefix, and so the
    child_path() call succeeds.  This allows GDB to progress, try some
    additional paths, and then find the debug information.
    
    So, this commit changes gdb.base/sysroot-debug-lookup.exp to expect
    the test to succeed when using the native-extended-gdbserver protocol.
    
    This leaves one KFAIL when using the native-extended-gdbserver board,
    we find the debug information but (apparently) find it in the wrong
    file.  What's happening is that when GDB builds the filename for the
    debug information we end up with a '//' string as a directory
    separator, the test regexp only expects a single separator.
    
    Instead of just fixing the test regexp, I've added a new function
    combine_paths which I then use for building the debug info filenames.
    This function takes care of ensuring only a single '/' is added when
    joining filenames together.  With this done I now have no KFAIL when
    using the native-extended-gdbserver board.
    
    The new combine_paths function is similar to the existing path_join
    function (see gdbsupport/pathstuff.h), but unlike path_join the new
    function doesn't require every filename after the first be a relative
    filename.  For now I've left the new function in symfile.c, it can
    always be moved if this proves to be more widely useful.
    
    After this commit we still have 2 KFAIL when not using the
    native-extended-gdbserver board.
    
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31804

diff --git a/gdb/symfile.c b/gdb/symfile.c
index 5a03def91c6..d8120aa00e0 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -1344,6 +1344,48 @@ show_debug_file_directory (struct ui_file *file, int from_tty,
 #define DEBUG_SUBDIRECTORY ".debug"
 #endif
 
+/* This is like path_join, but does not have the requirement that every
+   filename after the first be a relative filename.  All PATHS are joined
+   together and a directory separator is added if needed between joined
+   elements.
+
+   This allows us to do things like, e.g. append the absolute filename of a
+   file onto the debug file directory name.  */
+
+template<typename ...Args>
+static std::string
+combine_paths (Args... paths)
+{
+  /* It doesn't make sense to join less than two filenames.  */
+  static_assert (sizeof... (Args) >= 2);
+
+  std::array<const char *, sizeof... (Args)> path_array
+    { paths... };
+
+  std::string ret;
+
+  for (int i = 0; i < path_array.size (); ++i)
+    {
+      const char *path = path_array[i];
+
+      if (!ret.empty ())
+	{
+	  /* If RET doesn't end in a separator then add one now.  */
+	  if (!IS_DIR_SEPARATOR (ret.back ()))
+	    ret += '/';
+
+	  /* Now that RET ends in a separator ignore any at the start of
+	     PATH.  */
+	  while (IS_DIR_SEPARATOR (path[0]))
+	    ++path;
+	}
+
+      ret.append (path);
+    }
+
+  return ret;
+}
+
 /* Find a separate debuginfo file for OBJFILE, using DIR as the directory
    where the original file resides (may not be the same as
    dirname(objfile->name) due to symlinks), and DEBUGLINK as the file we are
@@ -1370,17 +1412,13 @@ find_separate_debug_file (const char *dir,
      objfile_name (objfile));
 
   /* First try in the same directory as the original file.  */
-  std::string debugfile = dir;
-  debugfile += debuglink;
+  std::string debugfile = combine_paths (dir, debuglink);
 
   if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
     return debugfile;
 
   /* Then try in the subdirectory named DEBUG_SUBDIRECTORY.  */
-  debugfile = dir;
-  debugfile += DEBUG_SUBDIRECTORY;
-  debugfile += "/";
-  debugfile += debuglink;
+  debugfile = combine_paths (dir, DEBUG_SUBDIRECTORY, debuglink);
 
   if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
     return debugfile;
@@ -1393,6 +1431,7 @@ find_separate_debug_file (const char *dir,
   bool target_prefix = is_target_filename (dir);
   const char *dir_notarget
     = target_prefix ? dir + strlen (TARGET_SYSROOT_PREFIX) : dir;
+  const char *target_prefix_str = target_prefix ? TARGET_SYSROOT_PREFIX : "";
   std::vector<gdb::unique_xmalloc_ptr<char>> debugdir_vec
     = dirnames_to_char_ptr_vec (debug_file_directory.c_str ());
   gdb::unique_xmalloc_ptr<char> canon_sysroot
@@ -1421,12 +1460,8 @@ find_separate_debug_file (const char *dir,
 
   for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
     {
-      debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
-      debugfile += debugdir;
-      debugfile += "/";
-      debugfile += drive;
-      debugfile += dir_notarget;
-      debugfile += debuglink;
+      debugfile = combine_paths (target_prefix_str, debugdir.get (),
+				 drive.c_str (), dir_notarget, debuglink);
 
       if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
 	return debugfile;
@@ -1443,12 +1478,8 @@ find_separate_debug_file (const char *dir,
 	{
 	  /* If the file is in the sysroot, try using its base path in
 	     the global debugfile directory.  */
-	  debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
-	  debugfile += debugdir;
-	  debugfile += "/";
-	  debugfile += base_path;
-	  debugfile += "/";
-	  debugfile += debuglink;
+	  debugfile = combine_paths (target_prefix_str, debugdir.get (),
+				     base_path, debuglink);
 
 	  if (separate_debug_file_exists (debugfile, crc32, objfile, warnings))
 	    return debugfile;
@@ -1461,21 +1492,18 @@ find_separate_debug_file (const char *dir,
 	     same result as above.  */
 	  if (gdb_sysroot != TARGET_SYSROOT_PREFIX)
 	    {
-	      debugfile = target_prefix ? TARGET_SYSROOT_PREFIX : "";
+	      std::string root;
 	      if (is_target_filename (gdb_sysroot))
 		{
-		  std::string root
-		    = gdb_sysroot.substr (strlen (TARGET_SYSROOT_PREFIX));
+		  root = gdb_sysroot.substr (strlen (TARGET_SYSROOT_PREFIX));
 		  gdb_assert (!root.empty ());
-		  debugfile += root;
 		}
 	      else
-		debugfile += gdb_sysroot;
-	      debugfile += debugdir;
-	      debugfile += "/";
-	      debugfile += base_path;
-	      debugfile += "/";
-	      debugfile += debuglink;
+		root = gdb_sysroot;
+
+	      debugfile = combine_paths (target_prefix_str, root.c_str (),
+					 debugdir.get (), base_path,
+					 debuglink);
 
 	      if (separate_debug_file_exists (debugfile, crc32, objfile,
 					      warnings))
diff --git a/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp b/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
index 5f17315c027..30d492e0d47 100644
--- a/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
+++ b/gdb/testsuite/gdb.base/sysroot-debug-lookup.exp
@@ -183,13 +183,15 @@ proc_with_prefix lookup_via_debuglink {} {
 	#
 	# Bug PR gdb/30866 seems to be the (or a) relevant bug for
 	# this problem.
-	if { $sysroot_prefix ne "" } {
+	if { $sysroot_prefix ne ""
+	     && [target_info gdb_protocol] ne "extended-remote" } {
 	    setup_kfail "*-*-*" 31804
 	}
 	gdb_assert { $::gdb_file_cmd_debug_info eq "debug" } \
 	    "ensure debug information was found"
 
-	if { $sysroot_prefix ne "" } {
+	if { $sysroot_prefix ne ""
+	     && [target_info gdb_protocol] ne "extended-remote"  } {
 	    setup_kfail "*-*-*" 31804
 	}
 	set re [string_to_regexp "Reading symbols from ${sysroot_prefix}$debug_symlink..."]


  reply	other threads:[~2024-06-14 13:29 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-13  9:56 [PATCH 0/2] find debug by debuglink minor fixes Andrew Burgess
2024-06-13  9:56 ` [PATCH 1/2] gdb: avoid '//' in filenames when searching for debuginfo Andrew Burgess
2024-06-13 16:46   ` Tom Tromey
2024-06-14  8:58     ` Andrew Burgess
2024-06-14 11:06       ` Eli Zaretskii
2024-06-14 13:29         ` Andrew Burgess [this message]
2024-06-14 14:18           ` Eli Zaretskii
2024-06-15 10:24             ` Andrew Burgess
2024-06-15 12:45               ` Eli Zaretskii
2024-06-15 15:28                 ` Andrew Burgess
2024-06-15 15:33                   ` Andrew Burgess
2024-06-15 16:11                   ` Eli Zaretskii
2024-06-14 17:48           ` Pedro Alves
2024-06-15  9:54             ` Andrew Burgess
2024-06-14 14:23       ` Tom Tromey
2024-06-15  9:53         ` Andrew Burgess
2024-06-13  9:56 ` [PATCH 2/2] gdb: fix a target: prefix issue in find_separate_debug_file Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ed8z8xfc.fsf@redhat.com \
    --to=aburgess@redhat.com \
    --cc=eliz@gnu.org \
    --cc=gdb-patches@sourceware.org \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).