public inbox for gdb@sourceware.org
 help / color / mirror / Atom feed
* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-04 19:52 Aleksandar Ristovski
  2008-01-04 20:30 ` Doug Evans
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-04 19:52 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Ryan Mansfield, Joel Brobecker

> 
> Yes, this is something I've been meaning to fix for ages, but never
> had time.  We need versions of IS_ABSOLUTE_PATH, FILENAME_CMP, and the
> other related macros which handle DOS pathnames regardless of the host
> system.  We should give them different names, though, so that both are
> available (sometimes we need to manipulate host paths).
>

Submitted a patch:
http://sourceware.org/ml/gdb-patches/2008-01/msg00062.html

 
> It would help a lot if you could write test cases.  We can use
> gdb.dwarf2/ to construct arbitrary DWARF-2 contents, with relative or
> absolute paths as needed.  If you have trouble writing the test cases,
> I can do it if you send me a couple of example binaries, and the
> before and after expected results.
 
It is a good idea to have test cases. I, unfortunately, know very little
about writing tests, and would really appreciate if you wrote the tests.

Attached is a binary that illustrates the problem I talked about, but to
summarize and break-down the problem into independent pieces I will list
test cases I can think of. We need combinations of DW_AT_name,
DW_AT_comp_dir, and directory table (in .debug_line section).  I am not 100%
sure how are you going to test this, but my testcase is to set a breakpoint
using file name, i.e.
(gdb) break main.c:11
Or something like that. I expect testcase 2 to fail. Test 1 and 3 should be
o.k. When testing Na) versions of testcases (DOS paths, linux host), I would
expect all of them to fail.

1. Currently covered case:
DW_AT_name = /foo/bar/main.c
No DW_AT_comp_dir or DW_AT_comp_dir equal to DW_AT_name
The Directory Table:
(empty)
The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c

2. Case where DW_AT_name is absolute path, DW_AT_comp_dir is given as a
different absolute path and line info is given relative to comp_dir:
 
DW_AT_name = /foo/bar/main.c
DW_AT_comp_dir = /foo/bar/Debug
The Directory Table:
 ..

The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c

3. DW_AT_name relative to DW_AT_comp_dir

DW_AT_name = ../main.c
DW_AT_comp_dir = /foo/bar/Debug

The Directory Table:
 ..

The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c


========= Cross-compiled on windows case =============
The following testcases are basically the same as above, except now we use
DOS style paths. The tests are meaningful only when testing on POSIX host.
Otherwise they default to the above tests.

1a. Currently covered case:
DW_AT_name = C:/foo/bar/main.c
No DW_AT_comp_dir or DW_AT_comp_dir equal to DW_AT_name
The Directory Table:
(empty)
The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c

2a. Case where DW_AT_name is absolute path, DW_AT_comp_dir is given as a
different absolute path and line info is given relative to comp_dir:
 
DW_AT_name = C:/foo/bar/main.c
DW_AT_comp_dir = C:/foo/bar/Debug
The Directory Table:
 ..

The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c

3a. DW_AT_name relative to DW_AT_comp_dir

DW_AT_name = ../main.c
DW_AT_comp_dir = C:/foo/bar/Debug

The Directory Table:
 ..

The File Name Table:
 Entry	Dir	Time	Size	Name
 1		1	0	0	main.c



Thank you,

Aleksandar Ristovski
QNX Software Systems

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 19:52 gdb_realpath: dealing with ./ and ../ Aleksandar Ristovski
@ 2008-01-04 20:30 ` Doug Evans
  0 siblings, 0 replies; 25+ messages in thread
From: Doug Evans @ 2008-01-04 20:30 UTC (permalink / raw)
  To: Aleksandar Ristovski
  Cc: Daniel Jacobowitz, gdb, Ryan Mansfield, Joel Brobecker

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

[fwiw ...]

I wonder if things would be cleaner if the path argument to
start_subfile (name, dirname) was processed the same as each subfile's
path (subfile->name, subfile->dirname).  Otherwise it seems like a
potential bug source (not necessarily now, but down the road).  There
(apparently) have been edits to this area of the code in the past that
have introduced bugs.

i.e. how about writing a utility to process name,dirname and using
that utility for both the argument to start_subfile and each subfile
in the comparison loop?  This patch is untested, I'm just providing a
definitive example of the suggestion.

Another question: In the case where dirname is prepended to name,
should the whole thing be passed to rewrite_source_path, instead of
just dirname?

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rewrite-path-2.patch --]
[-- Type: text/x-patch; name=rewrite-path-2.patch, Size: 3610 bytes --]

Index: buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.59
diff -u -p -c -r1.59 buildsym.c
*** buildsym.c	1 Jan 2008 22:53:09 -0000	1.59
--- buildsym.c	4 Jan 2008 20:22:00 -0000
*************** static struct obstack pending_addrmap_ob
*** 86,91 ****
--- 87,93 ----
  static int pending_addrmap_interesting;
  
  \f
+ static char *rewrite_subfile_path (char *name, char* dirname);
  static int compare_line_numbers (const void *ln1p, const void *ln2p);
  \f
  
*************** void
*** 583,617 ****
  start_subfile (char *name, char *dirname)
  {
    struct subfile *subfile;
  
    /* See if this subfile is already known as a subfile of the current
       main source file.  */
  
    for (subfile = subfiles; subfile; subfile = subfile->next)
      {
!       char *subfile_name;
! 
!       /* If NAME is an absolute path, and this subfile is not, then
! 	 attempt to create an absolute path to compare.  */
!       if (IS_ABSOLUTE_PATH (name)
! 	  && !IS_ABSOLUTE_PATH (subfile->name)
! 	  && subfile->dirname != NULL)
! 	subfile_name = concat (subfile->dirname, SLASH_STRING,
! 			       subfile->name, NULL);
!       else
! 	subfile_name = subfile->name;
  
!       if (FILENAME_CMP (subfile_name, name) == 0)
  	{
  	  current_subfile = subfile;
! 	  if (subfile_name != subfile->name)
! 	    xfree (subfile_name);
  	  return;
  	}
!       if (subfile_name != subfile->name)
! 	xfree (subfile_name);
      }
  
    /* This subfile is not known.  Add an entry for it. Make an entry
       for this subfile in the list of all subfiles of the current main
       source file.  */
--- 599,630 ----
  start_subfile (char *name, char *dirname)
  {
    struct subfile *subfile;
+   /* Rewritten NAME, typically an absolute path, that is used to detect
+      identical subfiles.  */
+   char *comparable_name;
+ 
+   comparable_name = rewrite_subfile_path (name, dirname);
  
    /* See if this subfile is already known as a subfile of the current
       main source file.  */
  
    for (subfile = subfiles; subfile; subfile = subfile->next)
      {
!       char *subfile_name = rewrite_subfile_path (subfile->name,
! 						 subfile->dirname);
  
!       if (FILENAME_CMP (subfile_name, comparable_name) == 0)
  	{
  	  current_subfile = subfile;
! 	  xfree (comparable_name);
! 	  xfree (subfile_name);
  	  return;
  	}
!       xfree (subfile_name);
      }
  
+   xfree (comparable_name);
+ 
    /* This subfile is not known.  Add an entry for it. Make an entry
       for this subfile in the list of all subfiles of the current main
       source file.  */
*************** start_subfile (char *name, char *dirname
*** 681,686 ****
--- 694,727 ----
      }
  }
  
+ /* Subroutine of start_subfile to simplify it.
+    Convert NAME, DIRNAME to a form that can be used to watch for
+    identical subfiles.
+    Space for the result is malloc'd, caller must free.  */
+ 
+ static char *
+ rewrite_subfile_path (char *name, char *dirname)
+ {
+   char *p = name;
+   char *rwname;
+ 
+   if (! IS_ABSOLUTE_PATH (name)
+       && dirname != NULL)
+     p = concat (dirname, SLASH_STRING, name, NULL);
+ 
+   rwname = rewrite_source_path (p);
+ 
+   if (rwname != NULL)
+     {
+       if (p != name)
+ 	xfree (p);
+       return rwname;
+     }
+   if (p != name)
+     return p;
+   return xstrdup (name);
+ }
+ 
  /* For stabs readers, the first N_SO symbol is assumed to be the
     source file name, and the subfile struct is initialized using that
     assumption.  If another N_SO symbol is later seen, immediately

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-08  5:46       ` Joel Brobecker
@ 2008-01-08 19:54         ` Doug Evans
  0 siblings, 0 replies; 25+ messages in thread
From: Doug Evans @ 2008-01-08 19:54 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: Aleksandar Ristovski, gdb, Ryan Mansfield

On Jan 7, 2008 9:45 PM, Joel Brobecker <brobecker@adacore.com> wrote:
> I have thought this over some more, and it looks like
> indeed manual rewriting of the paths will be needed if we want to
> be able to handle all the situations currently discussed.

I think this point is still open for discussion, i.e. do we want to
try to handle "all" the situations currently discussed.  One can solve
the 99.9% case without resorting to path rewriting, one just needs to
know that one particular path is the path of the main source file.  If
one can know this (e.g. by an extra argument to start_subfile, or
because the debug info readers have enough context to know it), then
one can first do a full path comparison and if that fails and the file
is the main file then do a basename comparison.

In order to make progress I think we need to decide on whether we are
going to try heuristics (like path normalization) to handle the 0.1%
case.  The 0.1% case being multiple files with the same base name as
the main file.  [For completeness sake, there may be more cases so the
math might not add up.]  And if so we also need to decide whether
heuristics (like path normalization) will be applied to all files, or
just the main file.  The proposed patch applies normalization to all
files, the dwarf patch applies it to just files that are potentially
the main file and then only if there are multiple files with the same
basename as the main file.

Of course, we also need to decide on whether we're going to handle
this problem at all. :-)  Producers of debug info should follow rules
more strictly, but I think consumers of debug info should be a bit lax
if it's easy to do and reasonably useful.  [One can still flag
violations with complaints so producers get fixed - that's one thing
that should be added to any patch that goes in.  The dwarf spec is
silent on this issue but I gather the intent is that paths must be
consistent - this may get fixed in dwarf v4.]

> Maybe I shouldn't have talked about complexity as this may be just me
> needing time to understand the purpose of your patch. So I withdraw
> that comment.

Also remember that that patch also adds some simplification: All three
call sites of dwarf2_start_subfile currently process fe->dir_index.
One could move that into dwarf2_start_subfile of course, but there is
still the issue of knowing a file is the main file.

> My suggestion has two ideas behind it:
>
>     I reallly think that the wrapper around start_subfile that adjusts
>     NAME and DIRNAME so that NAME is always a basename is a good step
>     forward, because it reduces the number of combinations we need to
>     handle during the matching phase later on.  We don't have to handle
>     the case where NAME is a full path, or a relative path of a basename.

Or one could instead store basename and fullname in struct subfile
(where "fullname" is the complete path that's available).  This way
the comparison loop in start_subfile doesn't have to keep prepending
directories, and can go back to the simple loop it use to be.  [I
still have an open question regarding why DIRNAME isn't prepended to
NAME before going into the comparison loop:
http://sourceware.org/ml/gdb-patches/2008-01/msg00091.html]

> I think that the reorganization will not be necessary.  I'd like to
> make the subfile layer work in a way that the debug info reader would
> only have to pass the name and dirname as-is, and be confident that
> it'll just work.

If one wants to handle this, and one wants to handle it in
start_subfile, then I think start_subfile needs to know when it is
passed the main file so that it can punt to heuristics only in this
case.  start_symtab is presumably a good place to tell start_subfile
"this is the main file".  Either that or maybe use last_source_file.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-08 19:21 Aleksandar Ristovski
  0 siblings, 0 replies; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-08 19:21 UTC (permalink / raw)
  To: Mark Kettenis; +Cc: brobecker, dje, gdb, Ryan Mansfield

> 
> > If we can confirm for sure that "normalize_path" is safe, I think the
> idea
> > is good (please read my comment about normalize_path here:
> > http://sourceware.org/ml/gdb-patches/2008-01/msg00138.html)
> >
> > Example
> > DW_AT_name=../main.cc
> > DW_AT_comp_dir=/foo/bar/obj
> > The Directory Table:
> >   ..
> > The File Name Table:$
> >   Entry>Dir>~~~~Time>~~~Size>~~~Name$
> >   1>~~~~1>~~~~~~0>~~~~~~0>~~~~~~main.cc$
> >
> > Now we get rid of the comp_dir information (this is what effectively
> > happens):
> >
> > NAME=main.cc
> > DIR=/foo/bar
> >
> > And there is no info about /foo/bar/obj.
> >
> >
> > I would think that this is safe.
> 
> Unfortunately it isn't.  If /foo/bar/obj is a symlink to /bar/foo/obj,
> then ../main.cc actually refers to /bar/foo/main.cc, and not
> /foo/bar/main.cc.  So just "normalizing" names is defenitely unsafe.
> 
> The big question here is, whether a compiler will actually set
> DW_AT_comp_dir to /foo/bar/obj in that case.  One can argue that the
> compilation directory really is /bar/foo/obj in that case and that
> DW_AT_comp_dir should be set to /bar/foo/obj.

Therefore, this concludes this trail of thoughts and this topic should
probably be considered closed.

Any further discussion related to the issue mentioned at the beginning of
the topic should be continued here:
http://sourceware.org/ml/gdb-patches/2008-01/msg00148.html



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-08 16:12 Aleksandar Ristovski
@ 2008-01-08 16:40 ` Mark Kettenis
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Kettenis @ 2008-01-08 16:40 UTC (permalink / raw)
  To: ARistovski; +Cc: brobecker, dje, gdb, RMansfield

> From: Aleksandar Ristovski <ARistovski@qnx.com>
> Date: Tue, 8 Jan 2008 11:11:33 -0500

Dropping in late, so excuse me if I'm saying things that already have
been said.  But I really want to make sure that any changes are no
going to hurt people that develop free software on POSIX systems where
they do have all source files available.

> If we can confirm for sure that "normalize_path" is safe, I think the idea
> is good (please read my comment about normalize_path here:
> http://sourceware.org/ml/gdb-patches/2008-01/msg00138.html)
> 
> Example
> DW_AT_name=../main.cc
> DW_AT_comp_dir=/foo/bar/obj
> The Directory Table:
>   ..
> The File Name Table:$
>   Entry>Dir>~~~~Time>~~~Size>~~~Name$
>   1>~~~~1>~~~~~~0>~~~~~~0>~~~~~~main.cc$
> 
> Now we get rid of the comp_dir information (this is what effectively
> happens):
> 
> NAME=main.cc
> DIR=/foo/bar
> 
> And there is no info about /foo/bar/obj.
> 
> 
> I would think that this is safe.

Unfortunately it isn't.  If /foo/bar/obj is a symlink to /bar/foo/obj,
then ../main.cc actually refers to /bar/foo/main.cc, and not
/foo/bar/main.cc.  So just "normalizing" names is defenitely unsafe.

The big question here is, whether a compiler will actually set
DW_AT_comp_dir to /foo/bar/obj in that case.  One can argue that the
compilation directory really is /bar/foo/obj in that case and that
DW_AT_comp_dir should be set to /bar/foo/obj.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-08 16:12 Aleksandar Ristovski
  2008-01-08 16:40 ` Mark Kettenis
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-08 16:12 UTC (permalink / raw)
  To: Joel Brobecker, Doug Evans; +Cc: gdb, Ryan Mansfield

> 
> Maybe I shouldn't have talked about complexity as this may be just me
> needing time to understand the purpose of your patch. So I withdraw
> that comment.

Wherever the fix is put, it will be of approx. the same complexity as it
solves the same problem in a similar way.

> 
> > Handling the issue in each debug info reader is an important
> > consideration and may or may not be a problem.  One would need to
> > examine to what extent the issue exists in the other readers, and to
> > what extent start_subfile can solve it and still be debug-format
> > independent without being more complex.
> 
> My suggestion has two ideas behind it:
> 
>     I reallly think that the wrapper around start_subfile that adjusts
>     NAME and DIRNAME so that NAME is always a basename is a good step
>     forward, because it reduces the number of combinations we need to
>     handle during the matching phase later on.  We don't have to handle
>     the case where NAME is a full path, or a relative path of a basename.

This would effectively make keeping DW_AT_comp_dir value redundant. Not sure
if this would create any issues (I don't see any).

> 
>     Second, I still believe at this point that the problem should be
>     handled outside of the debug info reader.  I know that at least
>     stabs should have the same issues as DWARF. It would be very nice
>     to have the problems fixed in both cases without having to duplicate
>     the algorithms we're developing.

> 
> > One could rewrite that patch to keep dwarf2_start_subfile, but one
> > would have to pass an additional parameter so it would know if it's
> > dealing with the main source file.  The patch as submitted just
> > reorganizes things so that other solutions(/heuristics) are possible
> > without major reworking of the code (the patch itself had to do some
> > reworking, but once that's done it's done (in the problem space being
> > discussed)).  Plus it simplifies all call sites to
> > dwarf2_start_subfile/start_subfile.  Previously, each call site had to
> > process fe->dir_index, and there are three of them.
> 
> I think that the reorganization will not be necessary.  I'd like to
> make the subfile layer work in a way that the debug info reader would
> only have to pass the name and dirname as-is, and be confident that
> it'll just work.

If we can confirm for sure that "normalize_path" is safe, I think the idea
is good (please read my comment about normalize_path here:
http://sourceware.org/ml/gdb-patches/2008-01/msg00138.html)

Example
DW_AT_name=../main.cc
DW_AT_comp_dir=/foo/bar/obj
The Directory Table:
  ..
The File Name Table:$
  Entry>Dir>~~~~Time>~~~Size>~~~Name$
  1>~~~~1>~~~~~~0>~~~~~~0>~~~~~~main.cc$

Now we get rid of the comp_dir information (this is what effectively
happens):

NAME=main.cc
DIR=/foo/bar

And there is no info about /foo/bar/obj.


I would think that this is safe.


Thanks,

Aleksandar

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-07 17:00     ` Doug Evans
@ 2008-01-08  5:46       ` Joel Brobecker
  2008-01-08 19:54         ` Doug Evans
  0 siblings, 1 reply; 25+ messages in thread
From: Joel Brobecker @ 2008-01-08  5:46 UTC (permalink / raw)
  To: Doug Evans; +Cc: Aleksandar Ristovski, gdb, Ryan Mansfield

[-- Attachment #1: Type: text/plain, Size: 3047 bytes --]

Doug,

> > Does this sound like it would work?
> 
> It seems like directory names need to be included in the file name
> comparison by start_subfile to some extent, otherwise headers with the
> same basename will match each other.

Yes, it very much looks like I've been pretty naive in my first
proposal :). I have thought this over some more, and it looks like
indeed manual rewriting of the paths will be needed if we want to
be able to handle all the situations currently discussed. So here is
a new proposal that adds the dirname matching using a new function that
will do the dirname matching first unmodified, and then "normalized".

Now, back to your comment regarding changing the DWARF reader:

> To what extent is the dwarf patch more complex because it handles the
> possibility of multiple matches with the basename of the main source
> file.  Assume there's only one file with the basename of the main file
> name and things become a lot simpler.  See
> http://sourceware.org/ml/gdb-patches/2008-01/msg00090.html.

Maybe I shouldn't have talked about complexity as this may be just me
needing time to understand the purpose of your patch. So I withdraw
that comment.

> Handling the issue in each debug info reader is an important
> consideration and may or may not be a problem.  One would need to
> examine to what extent the issue exists in the other readers, and to
> what extent start_subfile can solve it and still be debug-format
> independent without being more complex.

My suggestion has two ideas behind it:

    I reallly think that the wrapper around start_subfile that adjusts
    NAME and DIRNAME so that NAME is always a basename is a good step
    forward, because it reduces the number of combinations we need to
    handle during the matching phase later on.  We don't have to handle
    the case where NAME is a full path, or a relative path of a basename.
    
    Second, I still believe at this point that the problem should be
    handled outside of the debug info reader.  I know that at least
    stabs should have the same issues as DWARF. It would be very nice
    to have the problems fixed in both cases without having to duplicate
    the algorithms we're developing.

> One could rewrite that patch to keep dwarf2_start_subfile, but one
> would have to pass an additional parameter so it would know if it's
> dealing with the main source file.  The patch as submitted just
> reorganizes things so that other solutions(/heuristics) are possible
> without major reworking of the code (the patch itself had to do some
> reworking, but once that's done it's done (in the problem space being
> discussed)).  Plus it simplifies all call sites to
> dwarf2_start_subfile/start_subfile.  Previously, each call site had to
> process fe->dir_index, and there are three of them.

I think that the reorganization will not be necessary.  I'd like to
make the subfile layer work in a way that the debug info reader would
only have to pass the name and dirname as-is, and be confident that
it'll just work.

-- 
Joel

[-- Attachment #2: buildsym.c.diff --]
[-- Type: text/plain, Size: 4628 bytes --]

Index: buildsym.c
===================================================================
--- buildsym.c	(revision 44)
+++ buildsym.c	(working copy)
@@ -574,13 +574,32 @@ make_blockvector (struct objfile *objfil
   return (blockvector);
 }
 \f
+
+static int
+paths_likely_equal_p (const char *path1, const char *path2)
+{
+  /* The purpose of this function is to handle situations where
+     the path contains ".", or ".." or double-directory separators.
+     Before doing the comparison, we manually resolve the paths in
+     so that ".", ".." and double directory separators no longer
+     appear.  This will allow us to performe a pure string comparison
+     to determine whether two paths are identical or not.  */
+
+  /* If the paths are already identical, then obviously they are equal.  */
+  if (FILENAME_CMP (path1, path2) == 0)
+    return 1;
+
+  /* Normalize path1 and path2, and redo the comparison.  */
+  [...]
+}
+
 /* Start recording information about source code that came from an
    included (or otherwise merged-in) source file with a different
    name.  NAME is the name of the file (cannot be NULL), DIRNAME is
    the directory in which the file was compiled (or NULL if not known).  */
 
 void
-start_subfile (char *name, char *dirname)
+start_subfile_1 (char *name, char *dirname)
 {
   struct subfile *subfile;
 
@@ -589,27 +608,24 @@ start_subfile (char *name, char *dirname
 
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
-      char *subfile_name;
-
-      /* If NAME is an absolute path, and this subfile is not, then
-	 attempt to create an absolute path to compare.  */
-      if (IS_ABSOLUTE_PATH (name)
-	  && !IS_ABSOLUTE_PATH (subfile->name)
-	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
-			       subfile->name, NULL);
-      else
-	subfile_name = subfile->name;
-
-      if (FILENAME_CMP (subfile_name, name) == 0)
-	{
+      /* First, verify that the basename matches.  */
+      if (FILENAME_CMP (subfile->name, name) != 0)
+        continue;
+
+      /* Next, verify that the directory name matches too.  Some programs
+         may include two files with the same name but in a different
+         directory.
+         
+         If either DIRNAME or SUBFILE->DIRNAME is NULL, then we cannot
+         do the verification.  In that case, we cannot do the verification,
+         but the safest bet is that the files are identical.  */
+      if (dirname == NULL
+          || subfile->dirname == NULL
+          || paths_likely_equal_p (dirname, subfile->dirname))
+        {
 	  current_subfile = subfile;
-	  if (subfile_name != subfile->name)
-	    xfree (subfile_name);
 	  return;
-	}
-      if (subfile_name != subfile->name)
-	xfree (subfile_name);
+        }
     }
 
   /* This subfile is not known.  Add an entry for it. Make an entry
@@ -681,6 +697,53 @@ start_subfile (char *name, char *dirname
     }
 }
 
+/* This function is a wrapper around start_subfile_1 which rewrites
+   a bit NAME and DIRNAME to make sure that we are always consistent
+   in how we create subfiles: Turn NAME into a basename, and adjust
+   DIRNAME accordingly.  */
+
+void
+start_subfile (char *name, char *dirname)
+{
+  char *new_name, new_dir_name;
+
+  new_name = lbasename (name);
+
+  /* The simple usual case: NAME is the name of a file without any path
+     (in other words, it's already a basename).  In that case, we use
+     DIR_NAME as is.  */
+  if (FILENAME_CMP (new_name, name) == 0)
+    new_dir_name = dir_name;
+  
+  /* Another simple case: NAME is an absolute PATH.  In that case,
+     we can ignore the compilation dir, and use the path inside NAME
+     as the dir name.  */
+  else if (IS_ABSOLUTE_PATH (name))
+    new_dir_name = ldirname (name);
+
+  /* Otherwise, NAME is a relative PATH, so we extract the relative
+     path from name, and append it at the end of DIRNAME.  IF DIRNAME
+     is NULL, then we use that relative path as the DIRNAME.  */
+  else
+    {
+      char *name_relative_path = ldirname (name);
+
+      if (dirname != NULL)
+        {
+          new_dir_name = concat (dirname, SLASH_STRING, name_relative_path,
+                                 NULL);
+          xfree (name_relative_path);
+        }
+      else
+        new_dir_name = name_relative_path;
+    }
+
+  start_subfile_1 (new_name, new_dir_name);
+
+  xfree (new_name);
+  xfree (new_dir_name);
+}
+
 /* For stabs readers, the first N_SO symbol is assumed to be the
    source file name, and the subfile struct is initialized using that
    assumption.  If another N_SO symbol is later seen, immediately

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-07 14:33   ` Joel Brobecker
@ 2008-01-07 17:00     ` Doug Evans
  2008-01-08  5:46       ` Joel Brobecker
  0 siblings, 1 reply; 25+ messages in thread
From: Doug Evans @ 2008-01-07 17:00 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: Aleksandar Ristovski, gdb, Ryan Mansfield

On Jan 7, 2008 6:32 AM, Joel Brobecker <brobecker@adacore.com> wrote:
> The reason why I agree that the changes should be done inside
> start-subfiles is that we don't want to have to handle this sort of
> problem with every debug info reader.  This was, the handling is
> centralized. It also looks a lot simpler than the patches I have seen
> that touch the DWARF reader.

To what extent is the dwarf patch more complex because it handles the
possibility of multiple matches with the basename of the main source
file.  Assume there's only one file with the basename of the main file
name and things become a lot simpler.  See
http://sourceware.org/ml/gdb-patches/2008-01/msg00090.html.
One could rewrite that patch to keep dwarf2_start_subfile, but one
would have to pass an additional parameter so it would know if it's
dealing with the main source file.  The patch as submitted just
reorganizes things so that other solutions(/heuristics) are possible
without major reworking of the code (the patch itself had to do some
reworking, but once that's done it's done (in the problem space being
discussed)).  Plus it simplifies all call sites to
dwarf2_start_subfile/start_subfile.  Previously, each call site had to
process fe->dir_index, and there are three of them.

Handling the issue in each debug info reader is an important
consideration and may or may not be a problem.  One would need to
examine to what extent the issue exists in the other readers, and to
what extent start_subfile can solve it and still be debug-format
independent without being more complex.

> The one thing that this might break, however, is when the user
> provides a relative path in the break command:
>
>    (gdb) break bar/main.c
>
> As a matter of fact, this is already broken if DW_AT_name is main.c
> and not bar/main.c. So I'm pretty sure that it'll break it more.
> But the good news is that it would be easy to fix it: Modify the
> matching to concat the dirname and basename and do the match with
> that.
>
> Does this sound like it would work?

It seems like directory names need to be included in the file name
comparison by start_subfile to some extent, otherwise headers with the
same basename will match each other.

e.g.

a/foo.h:
int afoo () { return 0; }

b/foo.h:
int bfoo () { return 0; }

foo.c:
#include "a/foo.h"
#include "b/foo.h"
int main () { return afoo () + bfoo (); }

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-03 17:13 ` Daniel Jacobowitz
@ 2008-01-07 14:33   ` Joel Brobecker
  2008-01-07 17:00     ` Doug Evans
  0 siblings, 1 reply; 25+ messages in thread
From: Joel Brobecker @ 2008-01-07 14:33 UTC (permalink / raw)
  To: Aleksandar Ristovski, gdb, Ryan Mansfield

[-- Attachment #1: Type: text/plain, Size: 2827 bytes --]

> An alternative would be to do some canonicalization - not
> gdb_realpath, which accesses the filesystem, but just string
> manipulation - on the subfile names iff nothing matches the main
> file.  You could remove the ".." there.

Aleksandar liked this option best, but I think I liked your first
suggestion more:

> > Would this not introduce a bit of guessing? Consider a case where there are
> > several files with the same basename but different paths (possibly specified
> > using relative path elements, i.e. different pathnames like in my case). In
> > this case none of the files with the same base name would perfectly fit and
> > picking the first one will likely not give the correct answer.
> 
> It's a guess, but a good one.  99.9% of the time, a C file does not
> include another C file with the same basename.  If the compiler is
> going to generate debug info which refers to the same file by two
> different names, I don't see what else we can do.

I looked at the whole discussion and at the patches, and they seem
pretty simple in the idea, but pretty complex in the implementation.

If I undertand the problem correctly, the problem is matching our
main.cc entry in the linetable back to the main unit subfile where
the name&comp_unit don't exactly match character for character. Right?

Going further with the idea that 99.9% of the time, one file does not
depend on another file with the same name, then how about doing the
matching using basenames?  As a starting point, I propose the following
idea: Modify start_subfile so that:

  1. file name is reduced to a basename only
     If there is any path information inside name (such as ".." or
     "/foo/bar/"), it is appended to the dir name. As a special case,
     if name was an absolute path, then we ignore the dirname and
     use the dirname from the filename only.

  2. subfile matching is done using the basename only.

I am attaching a patch that illustrates roughly what I have in mind
(not compiled, not tested).  The great advantage (if it works :),
is that no rewriting is necessary.

The reason why I agree that the changes should be done inside
start-subfiles is that we don't want to have to handle this sort of
problem with every debug info reader.  This was, the handling is
centralized. It also looks a lot simpler than the patches I have seen
that touch the DWARF reader.

The one thing that this might break, however, is when the user
provides a relative path in the break command:

   (gdb) break bar/main.c

As a matter of fact, this is already broken if DW_AT_name is main.c
and not bar/main.c. So I'm pretty sure that it'll break it more.
But the good news is that it would be easy to fix it: Modify the
matching to concat the dirname and basename and do the match with
that.

Does this sound like it would work?

-- 
Joel

[-- Attachment #2: buildsym.c.diff --]
[-- Type: text/plain, Size: 3078 bytes --]

Index: buildsym.c
===================================================================
--- buildsym.c	(revision 44)
+++ buildsym.c	(working copy)
@@ -580,7 +580,7 @@ make_blockvector (struct objfile *objfil
    the directory in which the file was compiled (or NULL if not known).  */
 
 void
-start_subfile (char *name, char *dirname)
+start_subfile_1 (char *name, char *dirname)
 {
   struct subfile *subfile;
 
@@ -589,27 +589,11 @@ start_subfile (char *name, char *dirname
 
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
-      char *subfile_name;
-
-      /* If NAME is an absolute path, and this subfile is not, then
-	 attempt to create an absolute path to compare.  */
-      if (IS_ABSOLUTE_PATH (name)
-	  && !IS_ABSOLUTE_PATH (subfile->name)
-	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
-			       subfile->name, NULL);
-      else
-	subfile_name = subfile->name;
-
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      if (FILENAME_CMP (subfile->name, name) == 0)
 	{
 	  current_subfile = subfile;
-	  if (subfile_name != subfile->name)
-	    xfree (subfile_name);
 	  return;
 	}
-      if (subfile_name != subfile->name)
-	xfree (subfile_name);
     }
 
   /* This subfile is not known.  Add an entry for it. Make an entry
@@ -681,6 +665,53 @@ start_subfile (char *name, char *dirname
     }
 }
 
+/* This function is a wrapper around start_subfile_1 which rewrites
+   a bit NAME and DIRNAME to make sure that we are always consistent
+   in how we create subfiles: Turn NAME into a basename, and adjust
+   DIRNAME accordingly.  */
+
+void
+start_subfile (char *name, char *dirname)
+{
+  char *new_name, new_dir_name;
+
+  new_name = lbasename (name);
+
+  /* The simple usual case: NAME is the name of a file without any path
+     (in other words, it's already a basename).  In that case, we use
+     DIR_NAME as is.  */
+  if (FILENAME_CMP (new_name, name) == 0)
+    new_dir_name = dir_name;
+  
+  /* Another simple case: NAME is an absolute PATH.  In that case,
+     we can ignore the compilation dir, and use the path inside NAME
+     as the dir name.  */
+  else if (IS_ABSOLUTE_PATH (name))
+    new_dir_name = ldirname (name);
+
+  /* Otherwise, NAME is a relative PATH, so we extract the relative
+     path from name, and append it at the end of DIRNAME.  IF DIRNAME
+     is NULL, then we use that relative path as the DIRNAME.  */
+  else
+    {
+      char *name_relative_path = ldirname (name);
+
+      if (dirname != NULL)
+        {
+          new_dir_name = concat (dirname, SLASH_STRING, name_relative_path,
+                                 NULL);
+          xfree (name_relative_path);
+        }
+      else
+        new_dir_name = name_relative_path;
+    }
+
+  start_subfile_1 (new_name, new_dir_name);
+
+  xfree (new_name);
+  xfree (new_dir_name);
+}
+
 /* For stabs readers, the first N_SO symbol is assumed to be the
    source file name, and the subfile struct is initialized using that
    assumption.  If another N_SO symbol is later seen, immediately

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 21:48   ` Daniel Jacobowitz
@ 2008-01-04 22:23     ` Doug Evans
  0 siblings, 0 replies; 25+ messages in thread
From: Doug Evans @ 2008-01-04 22:23 UTC (permalink / raw)
  To: Aleksandar Ristovski, gdb, Ryan Mansfield

On Jan 4, 2008 1:47 PM, Daniel Jacobowitz <drow@false.org> wrote:
> On Fri, Jan 04, 2008 at 01:39:51PM -0800, Doug Evans wrote:
> > The use of #line in the testcases in the appended patch is
> > questionable, but as a data point the testcases fail without the patch
> > and pass with the patch.  These versions of the testcases have been
> > amended to use substitute-path.
>
> This happens during symbol reading, right?  That means substitute-path
> has to be set before GDB loads any files; another reason why I don't
> think it is appropriate to use it while building symtabs.

Not to mention requiring getting the right value for substitute-path
being a bit sadistic ...

> I'll look at the rest of this thread later.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-04 22:09 Aleksandar Ristovski
  0 siblings, 0 replies; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-04 22:09 UTC (permalink / raw)
  To: Daniel Jacobowitz, Doug Evans; +Cc: gdb, Ryan Mansfield

> 
> On Fri, Jan 04, 2008 at 01:39:51PM -0800, Doug Evans wrote:
> > The use of #line in the testcases in the appended patch is
> > questionable, but as a data point the testcases fail without the patch
> > and pass with the patch.  These versions of the testcases have been
> > amended to use substitute-path.
@Doug:
Without trying-out your patch, by just looking at it, I don't think it
solves testcase number 2 from my previous post:
http://sourceware.org/ml/gdb/2008-01/msg00024.html 
This test case was my initial issue I wanted to solve, and then other issues
come up.

So my original problem was not related to using substitute-path (the
inconsistency of substitute-path should probably be discussed in a separate
thread), but in having two pathnames for the same physical path. One comes
from
DW_AT_name
And is a full file name, e.g. /foo/bar/main.cc
Second is constructed from .debug_line information, by concatenating
directory (which is "..") and file name ("main.cc") giving "../main.cc"
which is then appended to DW_AT_comp_dir resulting in:
/foo/bar/Debug/../main.cc

Now in start_subfile we are comparing:
/foo/bar/main.cc for which subfile structure had been constructed already
and
/foo/bar/Debug/../main.cc
Which should yield to finding the already constructed structure, but since
FILENAME_CMP does fairly simple file comparison, it fails to match it and
then goes on to create new subfile structure which is wrong. 

After that, another lookup is done and the first subfile is found (using
/foo/bar/main.cc) but due to no line information associated with it, gdb
fails to find info about that line and displays the error message to the
user (something like "no line NN in main.cc").


> 
> This happens during symbol reading, right?  That means substitute-path
> has to be set before GDB loads any files; another reason why I don't
> think it is appropriate to use it while building symtabs.
> 
> I'll look at the rest of this thread later.


Part of the solution to the testcase 2 is to add opposite case (we are
checking only IS_ABSOLUTE_PATH (name)... but we do not cover the case where
NAME is relative (and existing subfile->name is absolute, which is exactly
my case).

Maybe the other part of the solution to testcase 2 should really be in
rewriting FILENAME_CMP to try other alternatives if simple comparison fails.
It could remove '..' path elements and it would be safe if we are dealing
with paths coming from the binary.

> 
> --
> Daniel Jacobowitz
> CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 21:40 ` Doug Evans
@ 2008-01-04 21:48   ` Daniel Jacobowitz
  2008-01-04 22:23     ` Doug Evans
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-04 21:48 UTC (permalink / raw)
  To: Doug Evans; +Cc: Aleksandar Ristovski, gdb, Ryan Mansfield

On Fri, Jan 04, 2008 at 01:39:51PM -0800, Doug Evans wrote:
> The use of #line in the testcases in the appended patch is
> questionable, but as a data point the testcases fail without the patch
> and pass with the patch.  These versions of the testcases have been
> amended to use substitute-path.

This happens during symbol reading, right?  That means substitute-path
has to be set before GDB loads any files; another reason why I don't
think it is appropriate to use it while building symtabs.

I'll look at the rest of this thread later.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 17:04 Aleksandar Ristovski
  2008-01-04 17:42 ` Daniel Jacobowitz
@ 2008-01-04 21:40 ` Doug Evans
  2008-01-04 21:48   ` Daniel Jacobowitz
  1 sibling, 1 reply; 25+ messages in thread
From: Doug Evans @ 2008-01-04 21:40 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: Daniel Jacobowitz, gdb, Ryan Mansfield

[-- Attachment #1: Type: text/plain, Size: 604 bytes --]

The use of #line in the testcases in the appended patch is
questionable, but as a data point the testcases fail without the patch
and pass with the patch.  These versions of the testcases have been
amended to use substitute-path.

ref: pr 2360
http://sourceware.org/ml/gdb/2007-11/msg00138.html
http://sourceware.org/ml/gdb-patches/2007-11/msg00314.html

Maybe #line + substitute-path fixup isn't supported.
Again, this patch is just a data point.

Note: This patch contains the rewrite_subfile_path suggestion of my
previous email.  I don't know whether it helps or hurts the case
you're trying to fix.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: rewrite-path-3.patch --]
[-- Type: text/x-patch; name=rewrite-path-3.patch, Size: 12986 bytes --]

Index: buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.59
diff -u -p -u -p -r1.59 buildsym.c
--- buildsym.c	1 Jan 2008 22:53:09 -0000	1.59
+++ buildsym.c	4 Jan 2008 21:26:11 -0000
@@ -44,6 +44,7 @@
 #include "cp-support.h"
 #include "dictionary.h"
 #include "addrmap.h"
+#include "source.h"
 
 /* Ask buildsym.h to define the vars it normally declares `extern'.  */
 #define	EXTERN
@@ -86,6 +87,7 @@ static struct obstack pending_addrmap_ob
 static int pending_addrmap_interesting;
 
 \f
+static char *rewrite_subfile_path (char *name, char* dirname);
 static int compare_line_numbers (const void *ln1p, const void *ln2p);
 \f
 
@@ -574,6 +576,20 @@ make_blockvector (struct objfile *objfil
   return (blockvector);
 }
 \f
+
+/* Workaround for IS_ABSOLUTE_PATH issue. When gdb is configured for
+   *ix, and we are debugging binary cross-compiled on windows, the original
+   IS_ABSOLUTE_PATH will always return 0. This is not good.  */
+#undef IS_ABSOLUTE_PATH
+#undef IS_DIR_SEPARATOR
+
+#define IS_DIR_SEPARATOR(c)	((c) == '/' || (c) == '\\')
+/* Note that IS_ABSOLUTE_PATH accepts d:foo as well, although it is
+   only semi-absolute.  This is because the users of IS_ABSOLUTE_PATH
+   want to know whether to prepend the current working directory to
+   a file name, which should not be done with a name like d:foo.  */
+#define IS_ABSOLUTE_PATH(f)	(IS_DIR_SEPARATOR((f)[0]) || (((f)[0]) && ((f)[1] == ':')))
+
 /* Start recording information about source code that came from an
    included (or otherwise merged-in) source file with a different
    name.  NAME is the name of the file (cannot be NULL), DIRNAME is
@@ -583,35 +599,32 @@ void
 start_subfile (char *name, char *dirname)
 {
   struct subfile *subfile;
+  /* Rewritten NAME, typically an absolute path, that is used to detect
+     identical subfiles.  */
+  char *comparable_name;
+
+  comparable_name = rewrite_subfile_path (name, dirname);
 
   /* See if this subfile is already known as a subfile of the current
      main source file.  */
 
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
-      char *subfile_name;
-
-      /* If NAME is an absolute path, and this subfile is not, then
-	 attempt to create an absolute path to compare.  */
-      if (IS_ABSOLUTE_PATH (name)
-	  && !IS_ABSOLUTE_PATH (subfile->name)
-	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
-			       subfile->name, NULL);
-      else
-	subfile_name = subfile->name;
+      char *subfile_name = rewrite_subfile_path (subfile->name,
+						 subfile->dirname);
 
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      if (FILENAME_CMP (subfile_name, comparable_name) == 0)
 	{
 	  current_subfile = subfile;
-	  if (subfile_name != subfile->name)
-	    xfree (subfile_name);
+	  xfree (comparable_name);
+	  xfree (subfile_name);
 	  return;
 	}
-      if (subfile_name != subfile->name)
-	xfree (subfile_name);
+      xfree (subfile_name);
     }
 
+  xfree (comparable_name);
+
   /* This subfile is not known.  Add an entry for it. Make an entry
      for this subfile in the list of all subfiles of the current main
      source file.  */
@@ -681,6 +694,34 @@ start_subfile (char *name, char *dirname
     }
 }
 
+/* Subroutine of start_subfile to simplify it.
+   Convert NAME, DIRNAME to a form that can be used to watch for
+   identical subfiles.
+   Space for the result is malloc'd, caller must free.  */
+
+static char *
+rewrite_subfile_path (char *name, char *dirname)
+{
+  char *p = name;
+  char *rwname;
+
+  if (! IS_ABSOLUTE_PATH (name)
+      && dirname != NULL)
+    p = concat (dirname, SLASH_STRING, name, NULL);
+
+  rwname = rewrite_source_path (p);
+
+  if (rwname != NULL)
+    {
+      if (p != name)
+	xfree (p);
+      return rwname;
+    }
+  if (p != name)
+    return p;
+  return xstrdup (name);
+}
+
 /* For stabs readers, the first N_SO symbol is assumed to be the
    source file name, and the subfile struct is initialized using that
    assumption.  If another N_SO symbol is later seen, immediately
Index: source.c
===================================================================
RCS file: /cvs/src/src/gdb/source.c,v
retrieving revision 1.83
diff -u -p -u -p -r1.83 source.c
--- source.c	1 Jan 2008 22:53:13 -0000	1.83
+++ source.c	4 Jan 2008 21:26:11 -0000
@@ -895,10 +895,12 @@ get_substitute_path_rule (const char *pa
    Return NULL if no substitution rule was specified by the user,
    or if no rule applied to the given PATH.  */
    
-static char *
+char *
 rewrite_source_path (const char *path)
 {
-  const struct substitute_path_rule *rule = get_substitute_path_rule (path);
+  const struct substitute_path_rule *rule = (path != NULL) ? 
+					get_substitute_path_rule (path):
+					NULL;
   char *new_path;
   int from_len;
   
@@ -999,10 +1001,11 @@ find_and_open_source (struct objfile *ob
 	  strcat (path + len, source_path + len + cdir_len);	/* After $cdir */
 	}
     }
-  else
+  
+  if (IS_ABSOLUTE_PATH (filename))
     {
-      /* If dirname is NULL, chances are the path is embedded in
-         the filename.  Try the source path substitution on it.  */
+      /* If filename is absolute path, try the source path 
+	 substitution on it.  */
       char *rewritten_filename = rewrite_source_path (filename);
 
       if (rewritten_filename != NULL)
Index: source.h
===================================================================
RCS file: /cvs/src/src/gdb/source.h,v
retrieving revision 1.9
diff -u -p -u -p -r1.9 source.h
--- source.h	1 Jan 2008 22:53:13 -0000	1.9
+++ source.h	4 Jan 2008 21:26:11 -0000
@@ -66,4 +66,14 @@ extern struct symtab_and_line set_curren
 
 /* Reset any information stored about a default file and line to print. */
 extern void clear_current_source_symtab_and_line (void);
+
+/* If the user specified a source path substitution rule that applies
+   to PATH, then apply it and return the new path.  This new path must
+   be deallocated afterwards.  
+   
+   Return NULL if no substitution rule was specified by the user,
+   or if no rule applied to the given PATH.  */
+   
+extern char *rewrite_source_path (const char *path);
+
 #endif
Index: testsuite/gdb.base/hashline1.exp
===================================================================
RCS file: testsuite/gdb.base/hashline1.exp
diff -N testsuite/gdb.base/hashline1.exp
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gdb.base/hashline1.exp	4 Jan 2008 21:26:18 -0000
@@ -0,0 +1,60 @@
+# Copyright 2007 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the gdb testsuite.
+
+# Test loading of line number information with absolute path in #line, bug 2360.
+
+if $tracelevel then {
+    strace $tracelevel
+}
+
+set prms_id 2360
+set bug_id 0
+
+# srcfile is in objdir because we need to machine generate it in order
+# to get correct the correct path in the #line directive.
+
+set testfile "hashline1"
+set srcfile "${testfile}.c"
+set binfile "${objdir}/${subdir}/${testfile}"
+
+set fd [open ${objdir}/${subdir}/${srcfile} w]
+puts $fd "#line 2 \"[pwd]/${subdir}/${srcfile}\""
+puts $fd "int main () { return 0; } /* set breakpoint here */"
+close $fd
+
+# The choice of path name for the source file is important in order to
+# trigger the bug.  Using ${objdir}/${subdir}/${srcfile} here won't trigger
+# the bug.
+if { [gdb_compile "./${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
+    untested hashline1.exp
+    return -1
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "set breakpoint here" ${objdir}/${subdir}/${srcfile}]
+
+# Try to set a breakpoint on the specified file location.
+
+gdb_test "set substitute-path [pwd]/. [pwd]" "" ""
+
+gdb_test "break $srcfile:$bp_location" \
+    "Breakpoint.*at.* file .*$srcfile, line.*" \
+    "set breakpoint"
Index: testsuite/gdb.base/hashline2.exp
===================================================================
RCS file: testsuite/gdb.base/hashline2.exp
diff -N testsuite/gdb.base/hashline2.exp
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gdb.base/hashline2.exp	4 Jan 2008 21:26:18 -0000
@@ -0,0 +1,58 @@
+# Copyright 2007 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the gdb testsuite.
+
+# Test loading of line number information with an absolute path with extra
+# /'s in #line, bug 2360.
+
+if $tracelevel then {
+    strace $tracelevel
+}
+
+set prms_id 2360
+set bug_id 0
+
+# srcfile is in objdir because we need to machine generate it in order
+# to get correct the correct path in the #line directive.
+
+set testfile "hashline2"
+set srcfile "${testfile}.c"
+set binfile "${objdir}/${subdir}/${testfile}"
+
+set fd [open ${objdir}/${subdir}/${srcfile} w]
+puts $fd "#line 2 \"///[pwd]/${subdir}/${srcfile}\""
+puts $fd "int main () { return 0; } /* set breakpoint here */"
+close $fd
+
+if { [gdb_compile "${objdir}/${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
+    untested hashline1.exp
+    return -1
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "set breakpoint here" ${objdir}/${subdir}/${srcfile}]
+
+# Try to set a breakpoint on the specified file location.
+
+gdb_test "set substitute-path ///[pwd] [pwd]" "" ""
+
+gdb_test "break $srcfile:$bp_location" \
+    "Breakpoint.*at.* file .*$srcfile, line.*" \
+    "set breakpoint"
Index: testsuite/gdb.base/hashline3.exp
===================================================================
RCS file: testsuite/gdb.base/hashline3.exp
diff -N testsuite/gdb.base/hashline3.exp
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ testsuite/gdb.base/hashline3.exp	4 Jan 2008 21:26:18 -0000
@@ -0,0 +1,57 @@
+# Copyright 2007 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This file is part of the gdb testsuite.
+
+# Test loading of line number information with relative path in #line, bug 2360.
+
+if $tracelevel then {
+    strace $tracelevel
+}
+
+set prms_id 2360
+set bug_id 0
+
+# srcfile is in objdir because we need to machine generate it in order
+# to get correct the correct path in the #line directive.
+
+set testfile "hashline3"
+set srcfile "${testfile}.c"
+set binfile "${objdir}/${subdir}/${testfile}"
+
+set fd [open ${objdir}/${subdir}/${srcfile} w]
+puts $fd "#line 2 \"./${subdir}/${srcfile}\""
+puts $fd "int main () { return 0; } /* set breakpoint here */"
+close $fd
+
+if { [gdb_compile "${objdir}/${subdir}/${srcfile}" "${binfile}" executable {debug}] != "" } {
+    untested hashline1.exp
+    return -1
+}
+
+gdb_exit
+gdb_start
+gdb_reinitialize_dir $srcdir/$subdir
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "set breakpoint here" ${objdir}/${subdir}/${srcfile}]
+
+# Try to set a breakpoint on the specified file location.
+
+gdb_test "set substitute-path ${objdir}/. ${objdir}" "" ""
+
+gdb_test "break $srcfile:$bp_location" \
+    "Breakpoint.*at.* file .*$srcfile, line.*" \
+    "set breakpoint"

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-04 20:16 Aleksandar Ristovski
  0 siblings, 0 replies; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-04 20:16 UTC (permalink / raw)
  To: gdb

> 
> Attached is a binary that illustrates the problem I talked about
I couldn't send the attached binary to the mailing list... the mail bounced
back. Sorry about that.

--
Aleksandar Ristovski
QNX Software Systems

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 17:42 ` Daniel Jacobowitz
@ 2008-01-04 18:25   ` Joel Brobecker
  0 siblings, 0 replies; 25+ messages in thread
From: Joel Brobecker @ 2008-01-04 18:25 UTC (permalink / raw)
  To: Aleksandar Ristovski, gdb, Ryan Mansfield

> I always have trouble understanding changes to this part of GDB.
> It would help a lot if you could write test cases.

Me too, and I agree! This part of the code can be pretty obscure,
and I know that there are some cases that we don't cover. Until we
hit real-life cases that we don't cover yet, I think that it allowed
us to keep the implementation a little simpler. That's just a guess
as I have only contributed to the code, not really written it.

> Or you might ask Joel or Eli to look at your patches.

I skipped the first few emails but, I'll catch up. I don't mind
acting as a second pair of eyes.

-- 
Joel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-04 17:04 Aleksandar Ristovski
@ 2008-01-04 17:42 ` Daniel Jacobowitz
  2008-01-04 18:25   ` Joel Brobecker
  2008-01-04 21:40 ` Doug Evans
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-04 17:42 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb, Ryan Mansfield

On Fri, Jan 04, 2008 at 11:38:44AM -0500, Aleksandar Ristovski wrote:
> BUT...
> This does not solve all my problems. In fact, when debugging on linux a
> binary cross-compiled on windows, it won't work:
> 
> IS_ABSOLUTE_PATH macro will differ depending on the configured gdb host. The
> problem with this is when one is debugging binary cross-compiled on windows,
> on a unix host, IS_ABSOLUTE_PATH will return 0 on "C:.." style paths which
> is not correct. Looking at the code, in many places we concatenate directory
> and file name if IS_ABSOLUTE_PATH returns zero so in case of cross-compiled
> binary I suspect there will be many issues.

Yes, this is something I've been meaning to fix for ages, but never
had time.  We need versions of IS_ABSOLUTE_PATH, FILENAME_CMP, and the
other related macros which handle DOS pathnames regardless of the host
system.  We should give them different names, though, so that both are
available (sometimes we need to manipulate host paths).

> > > 2. Problem two - one physical file is specified with two pathnames.
> > 
> > The specific case that's important here is associating the main file
> > from a dwarf2 CU DIE with the right lines from .debug_line.  When we
> > go to create that subfile we haven't created the other subfiles yet.
> > So what we need, I think, is to handle this in dwarf2read.c by walking
> > through the filename / directory table.
> 
> dwarf2read.c calls start_subfile or start_symtab which in turn calls
> start_subfile. I think we should let our readers deal with "compiled" paths
> (i.e. as found in the binary), without conversion. In start_subfile,
> however, we can do something with paths. Currently, we do not try to
> substitute-path on paths coming from the binary, even though they are
> clearly our source paths and should be dealt with the same as with any other
> source path. 

Substitute-path is designed to convert paths coming from the binary
into paths for the host system.  I don't understand why you're calling
it this early; it doesn't matter if these files exist on the host
system or not.

I always have trouble understanding changes to this part of GDB.
It would help a lot if you could write test cases.  We can use
gdb.dwarf2/ to construct arbitrary DWARF-2 contents, with relative or
absolute paths as needed.  If you have trouble writing the test cases,
I can do it if you send me a couple of example binaries, and the
before and after expected results.

Or you might ask Joel or Eli to look at your patches.

> Note2: for some reason, my post to gdb-patches didn't go through even after
> two attempts. There I submitted only source.c change dealing with rewriting
> source path in case when NAME is an absolute path and at the same time
> DIRNAME is not NULL (which can happen).

Please try using text for patches, instead of binary attachments.
ChangeLog entries would be nice, too.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-04 17:04 Aleksandar Ristovski
  2008-01-04 17:42 ` Daniel Jacobowitz
  2008-01-04 21:40 ` Doug Evans
  0 siblings, 2 replies; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-04 17:04 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Ryan Mansfield

[-- Attachment #1: Type: text/plain, Size: 2491 bytes --]

> On Thu, Jan 03, 2008 at 01:25:55PM -0500, Aleksandar Ristovski wrote:
> > 1. Problem one - relative NAME and absolute subfile->name case not
> covered:
> 
> I think you're right, but I can't tell for sure.

I am pretty sure this is real problem as this is what I am seeing with our
case (the instrumentation I posted proves it). This is easy and I think
straight forward to fix. (please see the attachement buildsym.c.patch1).

BUT...
This does not solve all my problems. In fact, when debugging on linux a
binary cross-compiled on windows, it won't work:

IS_ABSOLUTE_PATH macro will differ depending on the configured gdb host. The
problem with this is when one is debugging binary cross-compiled on windows,
on a unix host, IS_ABSOLUTE_PATH will return 0 on "C:.." style paths which
is not correct. Looking at the code, in many places we concatenate directory
and file name if IS_ABSOLUTE_PATH returns zero so in case of cross-compiled
binary I suspect there will be many issues.

> 
> > 2. Problem two - one physical file is specified with two pathnames.
> 
> The specific case that's important here is associating the main file
> from a dwarf2 CU DIE with the right lines from .debug_line.  When we
> go to create that subfile we haven't created the other subfiles yet.
> So what we need, I think, is to handle this in dwarf2read.c by walking
> through the filename / directory table.

dwarf2read.c calls start_subfile or start_symtab which in turn calls
start_subfile. I think we should let our readers deal with "compiled" paths
(i.e. as found in the binary), without conversion. In start_subfile,
however, we can do something with paths. Currently, we do not try to
substitute-path on paths coming from the binary, even though they are
clearly our source paths and should be dealt with the same as with any other
source path. 

My attempt would be patch2 attached (start_subfile.patch). Please take a
look. This should solve both problems.

Note: IS_ABSOLUTE_PATH issue will still exist since we deal with compiled
paths throughout gdb code. Wherever IS_ABSOLUTE_PATH is used on a windows
style path (and gdb was configured for *ix) it will return 0 leading to
problems.

Note2: for some reason, my post to gdb-patches didn't go through even after
two attempts. There I submitted only source.c change dealing with rewriting
source path in case when NAME is an absolute path and at the same time
DIRNAME is not NULL (which can happen).
 

Aleksandar Ristovski
QNX Software Systems


[-- Attachment #2: buildsym.c.patch1 --]
[-- Type: application/octet-stream, Size: 1745 bytes --]

Index: gdb/buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.59
diff -u -3 -p -r1.59 buildsym.c
--- gdb/buildsym.c	1 Jan 2008 22:53:09 -0000	1.59
+++ gdb/buildsym.c	4 Jan 2008 14:41:44 -0000
@@ -583,6 +583,13 @@ void
 start_subfile (char *name, char *dirname)
 {
   struct subfile *subfile;
+  char *absname; /* Absolute NAME.  */
+
+  if (!IS_ABSOLUTE_PATH (name) 
+      && dirname != NULL)
+    absname = concat (dirname, SLASH_STRING, name, NULL);
+  else
+    absname = name;
 
   /* See if this subfile is already known as a subfile of the current
      main source file.  */
@@ -590,6 +597,7 @@ start_subfile (char *name, char *dirname
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
       char *subfile_name;
+      char *aname;
 
       /* If NAME is an absolute path, and this subfile is not, then
 	 attempt to create an absolute path to compare.  */
@@ -601,7 +609,13 @@ start_subfile (char *name, char *dirname
       else
 	subfile_name = subfile->name;
 
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      if (!IS_ABSOLUTE_PATH (name)
+	  && IS_ABSOLUTE_PATH (subfile->name))
+	aname = absname;
+      else
+	aname = name;
+
+      if (FILENAME_CMP (subfile_name, aname) == 0)
 	{
 	  current_subfile = subfile;
 	  if (subfile_name != subfile->name)
@@ -612,6 +626,9 @@ start_subfile (char *name, char *dirname
 	xfree (subfile_name);
     }
 
+  if (absname != name)
+    xfree (absname);
+
   /* This subfile is not known.  Add an entry for it. Make an entry
      for this subfile in the list of all subfiles of the current main
      source file.  */

[-- Attachment #3: start_subfile.patch --]
[-- Type: application/octet-stream, Size: 5970 bytes --]

Index: gdb/buildsym.c
===================================================================
RCS file: /cvs/src/src/gdb/buildsym.c,v
retrieving revision 1.59
diff -u -3 -p -r1.59 buildsym.c
--- gdb/buildsym.c	1 Jan 2008 22:53:09 -0000	1.59
+++ gdb/buildsym.c	4 Jan 2008 16:28:40 -0000
@@ -44,6 +44,7 @@
 #include "cp-support.h"
 #include "dictionary.h"
 #include "addrmap.h"
+#include "source.h"
 
 /* Ask buildsym.h to define the vars it normally declares `extern'.  */
 #define	EXTERN
@@ -574,15 +575,57 @@ make_blockvector (struct objfile *objfil
   return (blockvector);
 }
 \f
+
+/* Workaround for IS_ABSOLUTE_PATH issue. When gdb is configured for
+   *ix, and we are debugging binary cross-compiled on windows, the original
+   IS_ABSOLUTE_PATH will always return 0. This is not good.  */
+#undef IS_ABSOLUTE_PATH
+#undef IS_DIR_SEPARATOR
+
+#define IS_DIR_SEPARATOR(c)	((c) == '/' || (c) == '\\')
+/* Note that IS_ABSOLUTE_PATH accepts d:foo as well, although it is
+   only semi-absolute.  This is because the users of IS_ABSOLUTE_PATH
+   want to know whether to prepend the current working directory to
+   a file name, which should not be done with a name like d:foo.  */
+#define IS_ABSOLUTE_PATH(f)	(IS_DIR_SEPARATOR((f)[0]) || (((f)[0]) && ((f)[1] == ':')))
+
 /* Start recording information about source code that came from an
    included (or otherwise merged-in) source file with a different
    name.  NAME is the name of the file (cannot be NULL), DIRNAME is
    the directory in which the file was compiled (or NULL if not known).  */
 
 void
-start_subfile (char *name, char *dirname)
+start_subfile (char *name, char *dirname) 
 {
   struct subfile *subfile;
+  char *absname; /* Rewritten absolute NAME.  */
+
+  if (IS_ABSOLUTE_PATH (name))
+    {
+      char *rwname;
+      rwname = rewrite_source_path (name);
+      if (rwname != NULL)
+	{
+	  make_cleanup (xfree, rwname);
+	  name = rwname;
+	}
+      absname = name;
+    }
+  else
+    {
+      if (dirname != NULL)
+	{
+	  char *rwdirname = rewrite_source_path (dirname);
+	  if (rwdirname == NULL)
+	    rwdirname = dirname;
+	  absname = concat (rwdirname, SLASH_STRING, name, NULL);
+	  if (rwdirname != dirname)
+	    xfree (rwdirname);
+	  make_cleanup (xfree, absname);
+	}
+      else
+	absname = name;
+    }
 
   /* See if this subfile is already known as a subfile of the current
      main source file.  */
@@ -590,18 +633,36 @@ start_subfile (char *name, char *dirname
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
       char *subfile_name;
+      char *aname;
 
       /* If NAME is an absolute path, and this subfile is not, then
 	 attempt to create an absolute path to compare.  */
       if (IS_ABSOLUTE_PATH (name)
 	  && !IS_ABSOLUTE_PATH (subfile->name)
 	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
+	{
+	  char *rwsubfile_dirname = rewrite_source_path (subfile->dirname);
+	  if (rwsubfile_dirname == NULL)
+	    rwsubfile_dirname = subfile->dirname;
+	  subfile_name = concat (rwsubfile_dirname, SLASH_STRING,
 			       subfile->name, NULL);
+	  if (rwsubfile_dirname != subfile->dirname)
+	    xfree (rwsubfile_dirname);
+	}
+      else
+	{
+	  subfile_name = rewrite_source_path (subfile->name);
+	  if (subfile_name == NULL)
+	    subfile_name = subfile->name;
+	}
+
+      if (!IS_ABSOLUTE_PATH (name)
+	  && IS_ABSOLUTE_PATH (subfile->name))
+	aname = absname;
       else
-	subfile_name = subfile->name;
+	aname = name;
 
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      if (FILENAME_CMP (subfile_name, aname) == 0)
 	{
 	  current_subfile = subfile;
 	  if (subfile_name != subfile->name)
Index: gdb/source.c
===================================================================
RCS file: /cvs/src/src/gdb/source.c,v
retrieving revision 1.83
diff -u -3 -p -r1.83 source.c
--- gdb/source.c	1 Jan 2008 22:53:13 -0000	1.83
+++ gdb/source.c	4 Jan 2008 16:28:40 -0000
@@ -895,10 +895,12 @@ get_substitute_path_rule (const char *pa
    Return NULL if no substitution rule was specified by the user,
    or if no rule applied to the given PATH.  */
    
-static char *
+char *
 rewrite_source_path (const char *path)
 {
-  const struct substitute_path_rule *rule = get_substitute_path_rule (path);
+  const struct substitute_path_rule *rule = (path != NULL) ? 
+					get_substitute_path_rule (path):
+					NULL;
   char *new_path;
   int from_len;
   
@@ -999,10 +1001,11 @@ find_and_open_source (struct objfile *ob
 	  strcat (path + len, source_path + len + cdir_len);	/* After $cdir */
 	}
     }
-  else
+  
+  if (IS_ABSOLUTE_PATH (filename))
     {
-      /* If dirname is NULL, chances are the path is embedded in
-         the filename.  Try the source path substitution on it.  */
+      /* If filename is absolute path, try the source path 
+	 substitution on it.  */
       char *rewritten_filename = rewrite_source_path (filename);
 
       if (rewritten_filename != NULL)
Index: gdb/source.h
===================================================================
RCS file: /cvs/src/src/gdb/source.h,v
retrieving revision 1.9
diff -u -3 -p -r1.9 source.h
--- gdb/source.h	1 Jan 2008 22:53:13 -0000	1.9
+++ gdb/source.h	4 Jan 2008 16:28:40 -0000
@@ -66,4 +66,14 @@ extern struct symtab_and_line set_curren
 
 /* Reset any information stored about a default file and line to print. */
 extern void clear_current_source_symtab_and_line (void);
+
+/* If the user specified a source path substitution rule that applies
+   to PATH, then apply it and return the new path.  This new path must
+   be deallocated afterwards.  
+   
+   Return NULL if no substitution rule was specified by the user,
+   or if no rule applied to the given PATH.  */
+   
+extern char *rewrite_source_path (const char *path);
+
 #endif

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-03 18:30 Aleksandar Ristovski
@ 2008-01-04 12:52 ` Daniel Jacobowitz
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-04 12:52 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb, Ryan Mansfield

On Thu, Jan 03, 2008 at 01:25:55PM -0500, Aleksandar Ristovski wrote:
> 1. Problem one - relative NAME and absolute subfile->name case not covered:

I think you're right, but I can't tell for sure.

> 2. Problem two - one physical file is specified with two pathnames.
> 
> This is the case I was talking about. To fix it, I would try a combination
> of guessing/normalizing, by inserting code for "second-best" match using
> normalized path (in the same loop) if FILENAME_CMP fails to match and then
> if no perfect match is found, use the second-best match. Not sure what
> happens if there are more than one second-best matches (you say it's going
> to be very rare)... probably at least issue a warning and behave as if no
> match was found? Or resort to getting the first/last second-best match? On a
> second thought, if we go with the second-best match we will never get into a
> situation where two created subfile-s give more two second-best matches...
> it will always be either perfect match or the second-best... this should
> work.

The specific case that's important here is associating the main file
from a dwarf2 CU DIE with the right lines from .debug_line.  When we
go to create that subfile we haven't created the other subfiles yet.
So what we need, I think, is to handle this in dwarf2read.c by walking
through the filename / directory table.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-03 18:30 Aleksandar Ristovski
  2008-01-04 12:52 ` Daniel Jacobowitz
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-03 18:30 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Ryan Mansfield

> An alternative would be to do some canonicalization - not
> gdb_realpath, which accesses the filesystem, but just string
> manipulation - on the subfile names iff nothing matches the main
> file.  You could remove the ".." there.

Allright, I will try with the alternative. 

I see two problems:

1. Problem one - relative NAME and absolute subfile->name case not covered:

In function start_subfile (buildsym.c:approx 554) we completely miss one
case: when NAME is relative path but subfile->name is absolute.

I put some instrumentation to illustrate this problem. The list below says
what is happening and specifies name, directory:

start_subfile: c:/testManagedCC/main.cc, c:/testManagedCC/Debug
creating subfile: c:/testManagedCC/main.cc, c:/testManagedCC/Debug
start_subfile: ../main.cc, c:/testManagedCC/Debug
comparing: c:/testManagedCC/main.cc, ../main.cc
creating subfile: ../main.cc, c:/testManagedCC/Debug
start_subfile: ../main.cc, c:/testManagedCC/Debug
comparing: ../main.cc, ../main.cc

Then it goes on to compare ../main.cc for each line in linetable.

Note that we created two subfiles for physically one file.


2. Problem two - one physical file is specified with two pathnames.

This is the case I was talking about. To fix it, I would try a combination
of guessing/normalizing, by inserting code for "second-best" match using
normalized path (in the same loop) if FILENAME_CMP fails to match and then
if no perfect match is found, use the second-best match. Not sure what
happens if there are more than one second-best matches (you say it's going
to be very rare)... probably at least issue a warning and behave as if no
match was found? Or resort to getting the first/last second-best match? On a
second thought, if we go with the second-best match we will never get into a
situation where two created subfile-s give more two second-best matches...
it will always be either perfect match or the second-best... this should
work.

Comments?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-03 17:07 Aleksandar Ristovski
@ 2008-01-03 17:13 ` Daniel Jacobowitz
  2008-01-07 14:33   ` Joel Brobecker
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-03 17:13 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb, Ryan Mansfield

On Thu, Jan 03, 2008 at 12:01:16PM -0500, Aleksandar Ristovski wrote:
> > We should not call realpath on files which are not known to exist on
> > the system running GDB.  If we do that somewhere, it's a bug.  
> 
> I am not sure it is a bug. At the time of gdb_realpath call, gdb can not
> know whether the path exists or not, it takes the path information from
> debug info. Second, it is perfectly normal situation where sources are not
> available at all, and paths mentioned in debug_info do not physically exist
> on the system.

Paths from debug info are adjusted before we treat them as local
paths.  e.g. set substitute-path.

> > Well, that's why.  If DW_AT_name was main.cc, things would have
> > worked.  That's what GCC generates for me.  You have debug info which
> > refers to the same file using two different pathnames.
> > 
> > Perhaps we can forcibly associate the compilation unit with the first
> > entry in the file name table, if they have the same basename and no
> > other file in the line table matches the CU better.
> 
> Would this not introduce a bit of guessing? Consider a case where there are
> several files with the same basename but different paths (possibly specified
> using relative path elements, i.e. different pathnames like in my case). In
> this case none of the files with the same base name would perfectly fit and
> picking the first one will likely not give the correct answer.

It's a guess, but a good one.  99.9% of the time, a C file does not
include another C file with the same basename.  If the compiler is
going to generate debug info which refers to the same file by two
different names, I don't see what else we can do.

An alternative would be to do some canonicalization - not
gdb_realpath, which accesses the filesystem, but just string
manipulation - on the subfile names iff nothing matches the main
file.  You could remove the ".." there.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-03 17:07 Aleksandar Ristovski
  2008-01-03 17:13 ` Daniel Jacobowitz
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-03 17:07 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Ryan Mansfield



> -----Original Message-----
> From: Daniel Jacobowitz [mailto:drow@false.org]
> Sent: January 3, 2008 11:47 AM
> To: Aleksandar Ristovski
> Cc: gdb@sourceware.org; Ryan Mansfield
> Subject: Re: gdb_realpath: dealing with ./ and ../
> 
> 
> We should not call realpath on files which are not known to exist on
> the system running GDB.  If we do that somewhere, it's a bug.  

I am not sure it is a bug. At the time of gdb_realpath call, gdb can not
know whether the path exists or not, it takes the path information from
debug info. Second, it is perfectly normal situation where sources are not
available at all, and paths mentioned in debug_info do not physically exist
on the system.

>Your
> patch added several calls of that sort.

No, my patch only checked if realpath returns NULL in which case instead of
returning filename as-is, it normalizes it. 

I also added gdb_realpath call in create_subfile to create canonical
pathname for a new subfile, and to canonicalize path we are looking for.

But I do not claim this is a perfect solution.

> 
> > > > When our cross-compiler generates binary, it stores relative path in
> > > > .debug_line section (relative to compilation dir), i.e. '..'.
> > >
> > > What's in .debug_info?  Also, what version of GDB?
> >
> >
> > In my case:
> >      DW_AT_name        :
> > c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/main.cc>~~~~~$
> >      DW_AT_comp_dir    :
> > c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/Debug>~~~~~
> 
> Well, that's why.  If DW_AT_name was main.cc, things would have
> worked.  That's what GCC generates for me.  You have debug info which
> refers to the same file using two different pathnames.
> 
> Perhaps we can forcibly associate the compilation unit with the first
> entry in the file name table, if they have the same basename and no
> other file in the line table matches the CU better.

Would this not introduce a bit of guessing? Consider a case where there are
several files with the same basename but different paths (possibly specified
using relative path elements, i.e. different pathnames like in my case). In
this case none of the files with the same base name would perfectly fit and
picking the first one will likely not give the correct answer.

> 
> --
> Daniel Jacobowitz
> CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-03 16:39 Aleksandar Ristovski
@ 2008-01-03 16:52 ` Daniel Jacobowitz
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-03 16:52 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb, Ryan Mansfield

On Thu, Jan 03, 2008 at 11:33:34AM -0500, Aleksandar Ristovski wrote:
> On linux, when realpath works all is fine and it will take care of the
> symbolic links. However, the problem starts when paths do not really exist
> and realpath fails. A simple example is my cross-compiled binary built on
> windows, being debugged on linux. In this case, when realpath fails, I would
> like to 'compact' or 'normalize' the path by resolving relative path
> elements (and current path elements) thus forming canonical path, whithout
> resolving symlinks (which can not be resolved since they do not exist).

We should not call realpath on files which are not known to exist on
the system running GDB.  If we do that somewhere, it's a bug.  Your
patch added several calls of that sort.

> > > When our cross-compiler generates binary, it stores relative path in
> > > .debug_line section (relative to compilation dir), i.e. '..'.
> > 
> > What's in .debug_info?  Also, what version of GDB?
> 
> 
> In my case:
>      DW_AT_name        :
> c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/main.cc>~~~~~$
>      DW_AT_comp_dir    :
> c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/Debug>~~~~~

Well, that's why.  If DW_AT_name was main.cc, things would have
worked.  That's what GCC generates for me.  You have debug info which
refers to the same file using two different pathnames.

Perhaps we can forcibly associate the compilation unit with the first
entry in the file name table, if they have the same basename and no
other file in the line table matches the CU better.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: gdb_realpath: dealing with ./ and ../
@ 2008-01-03 16:39 Aleksandar Ristovski
  2008-01-03 16:52 ` Daniel Jacobowitz
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-03 16:39 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb, Ryan Mansfield


> -----Original Message-----
> From: Daniel Jacobowitz [mailto:drow@false.org]
> Sent: January 3, 2008 10:56 AM
> To: Aleksandar Ristovski
> Cc: gdb@sourceware.org; Ryan Mansfield
> Subject: Re: gdb_realpath: dealing with ./ and ../
> 
> On Thu, Jan 03, 2008 at 10:20:19AM -0500, Aleksandar Ristovski wrote:
> > Hello,
> >
> > First a question, to give an idea what I am talking about and then
> detailed
> > explanation.
> >
> > Question: Should gdb_realpath deal with './' and '../' path elements and
> > compact them along with 'canonicalization' it already does?
> 
> The problem with this idea is that removing ../ is not reliably
> correct.  On Unix, symlinks allow foo/../bar and bar to be different

On linux, when realpath works all is fine and it will take care of the
symbolic links. However, the problem starts when paths do not really exist
and realpath fails. A simple example is my cross-compiled binary built on
windows, being debugged on linux. In this case, when realpath fails, I would
like to 'compact' or 'normalize' the path by resolving relative path
elements (and current path elements) thus forming canonical path, whithout
resolving symlinks (which can not be resolved since they do not exist).

Additional problem is on windows, where realpath doesn't work and
gdb_realpath usually simply returns the input parameter. (my proposed patch
doesn't solve this... it is not complete yet).

> directories.  We should only canonicalize paths to local files, not to
> files mentioned in debug information.
> 
> gdb_realpath shouldn't need any changes to handle ..; it works from
> the local filesystem and constructs a real canonical path.  I see that
> you're on Windows.  gdb_realpath may not handle Windows correctly;
> libiberty's lrealpath does and I don't know why we still have both.

I don't think lrealpath would work in case the real path doesn't exist.

> 
> > When our cross-compiler generates binary, it stores relative path in
> > .debug_line section (relative to compilation dir), i.e. '..'.
> 
> What's in .debug_info?  Also, what version of GDB?


In my case:
     DW_AT_name        :
c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/main.cc>~~~~~$
     DW_AT_comp_dir    :
c:/QNXTau/eclipse/ide-4.5-workspace/testManagedCC/Debug>~~~~~

GDB version 6.7.

> 
> I have:
> 
>  <0><154>: Abbrev Number: 1 (DW_TAG_compile_unit)
>   <189>     DW_AT_name        : ../main.c
>   <193>     DW_AT_comp_dir    : /home/drow/z/baz
> 
>  The Directory Table:
>   ..
> 
>  The File Name Table:
>   Entry Dir     Time    Size    Name
>   1     1       0       0       main.c
> 
> So everything constructs the same /home/drow/z/baz/../main.c.
> 
> --
> Daniel Jacobowitz
> CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: gdb_realpath: dealing with ./ and ../
  2008-01-03 15:25 Aleksandar Ristovski
@ 2008-01-03 16:00 ` Daniel Jacobowitz
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Jacobowitz @ 2008-01-03 16:00 UTC (permalink / raw)
  To: Aleksandar Ristovski; +Cc: gdb, Ryan Mansfield

On Thu, Jan 03, 2008 at 10:20:19AM -0500, Aleksandar Ristovski wrote:
> Hello,
> 
> First a question, to give an idea what I am talking about and then detailed
> explanation.
> 
> Question: Should gdb_realpath deal with './' and '../' path elements and
> compact them along with 'canonicalization' it already does?

The problem with this idea is that removing ../ is not reliably
correct.  On Unix, symlinks allow foo/../bar and bar to be different
directories.  We should only canonicalize paths to local files, not to
files mentioned in debug information.

gdb_realpath shouldn't need any changes to handle ..; it works from
the local filesystem and constructs a real canonical path.  I see that
you're on Windows.  gdb_realpath may not handle Windows correctly;
libiberty's lrealpath does and I don't know why we still have both.

> When our cross-compiler generates binary, it stores relative path in
> .debug_line section (relative to compilation dir), i.e. '..'.

What's in .debug_info?  Also, what version of GDB?

> It loads line table, finds '..', constructs absolute path using compilation
> directory and creates "C:/foo/bar/Debug/../main.cc" and then compares this
> (using FILENAME_CMP macro) to existing subfile-s and fails to find it (file
> buildsym.c, function start_subfile).

I have:

 <0><154>: Abbrev Number: 1 (DW_TAG_compile_unit)
  <189>     DW_AT_name        : ../main.c
  <193>     DW_AT_comp_dir    : /home/drow/z/baz

 The Directory Table:
  ..

 The File Name Table:
  Entry Dir     Time    Size    Name
  1     1       0       0       main.c

So everything constructs the same /home/drow/z/baz/../main.c.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 25+ messages in thread

* gdb_realpath: dealing with ./ and ../
@ 2008-01-03 15:25 Aleksandar Ristovski
  2008-01-03 16:00 ` Daniel Jacobowitz
  0 siblings, 1 reply; 25+ messages in thread
From: Aleksandar Ristovski @ 2008-01-03 15:25 UTC (permalink / raw)
  To: gdb; +Cc: Ryan Mansfield

Hello,

First a question, to give an idea what I am talking about and then detailed
explanation.

Question: Should gdb_realpath deal with './' and '../' path elements and
compact them along with 'canonicalization' it already does?

Details:
Our binary was created from one compilation unit:
C:/foo/bar/main.cc
Our compilation directory is:
C:/foo/bar/Debug

When our cross-compiler generates binary, it stores relative path in
.debug_line section (relative to compilation dir), i.e. '..'.

readelf -wl output gives this:
...

  Opcode 6 has 0 args
  Opcode 7 has 0 args
  Opcode 8 has 0 args
  Opcode 9 has 1 args

 The Directory Table:
  ..

 The File Name Table:
  Entry	Dir	Time	Size	Name
  1	1	0	0	main.cc

...

GDB internally gets confused by this: it first creates subfile using
filename:  'C:/foo/bar/main.cc'
But then, when breakpoint is set:
(gdb) b main.cc:11
It loads line table, finds '..', constructs absolute path using compilation
directory and creates "C:/foo/bar/Debug/../main.cc" and then compares this
(using FILENAME_CMP macro) to existing subfile-s and fails to find it (file
buildsym.c, function start_subfile).

It looks like storing relative path in .debug_line is correct (is it?) but
gdb_realpath fails to compact paths containing '..' path elements. 


Question: Should gdb_realpath deal with './' and '../' path elements and
compact them along with 'canonicalization' it already does? Alternatively,
should FILENAME_CMP do more to be smarter about comparing two paths?

Thank you,

Aleksandar Ristovski
QNX Software Systems


Patch that illustrates a solution, providing gdb_realpath indeed needs to
deal with relative path elements.

Index: gdb/utils.c
===================================================================
--- gdb/utils.c	(revision 69)
+++ gdb/utils.c	(working copy)
@@ -2867,6 +2867,109 @@
   return addr;
 }
 
+
+/* Normalize_path does lightweight path clean-up. It removes './' 
+ elements and resolves '../' elements by removing previous entry if any.
+ If FILENAME starts with '../', then '../' does not get removed.  
+
+ Returned value should be freed with xfree.
+
+ Examples:
+ ../main.c   -->    ../main.c
+ ./main.c    -->    main.c
+ /main.c     -->    /main.c
+ /foo/./bar/././main.c  -->   /foo/bar/main.c
+ C:/Temp/Debug/../main.c  -->  C:/Temp/main.c
+  */
+
+char *
+normalize_path (const char *filename)
+{
+  char *p;
+  char *pi;
+  int len;
+# if defined (PATH_MAX)
+  char buf[PATH_MAX];
+#  define USE_REALPATH
+# elif defined (MAXPATHLEN)
+  char buf[MAXPATHLEN];
+# endif
+
+ gdb_assert (filename != NULL);
+
+  strncpy (buf, filename, sizeof (buf));
+  buf[sizeof (buf)] = '\0';
+
+  p = buf;
+
+  while ((pi = strstr (p, "./")))
+    {
+      if (pi == p) /* FILENAME starts with './'. Remove it.  */
+	  p += 2;
+      else
+	break;
+    }
+
+  if (p != buf)
+      strncpy (buf, filename + (p - buf), sizeof (buf));
+ 
+  len = strlen (buf);
+
+  /* Remove all double '//' except the leading occurence.  */
+  p = buf + 1;
+  while (p < buf + len)
+    {
+      if (p[0] == '/' && p[1] == '/')
+	{
+	  memmove (p, p+1, buf + len - p);
+	  len--;
+	}
+      p++;
+    }
+
+  /* Replace all other occurences of '/./' with '/'.  */
+  p = buf;
+  while ((pi = strstr (p, "/./")))
+    {
+      p = pi + 3;
+      memmove (pi, p, buf + len  - p);
+      len -= 3;
+      memset (buf + len, 0, 3);
+    }
+
+  /* Remove trailing '/.'.  */
+  while (buf[len-2] == '/' && buf[len-1] == '.')
+    {
+      len -= 2;
+      buf[len] = '\0';
+      buf[len+1] = '\0';
+    }
+
+  /* Deal with '../' sequences.  */
+  p = buf + 1; /* In an odd case that path begins with '/../' we don't want
+		to know.  */
+  while ((pi = strstr (p, "/..")))
+    {
+      p = pi;
+      /* Reverse find '/'.  */
+      pi = p - 1;
+      while (pi > buf && *pi != '/')
+	pi--;
+
+      if (pi != p)
+	{
+	  p += 3;
+	  memmove (pi, p, buf + len - p);
+	  len -= (p - pi);
+	  memset (buf + len, 0, p - pi);
+	}
+      else
+	p++;
+    }
+
+  return xstrdup (buf);
+}
+
 char *
 gdb_realpath (const char *filename)
 {
@@ -2886,7 +2989,9 @@
 # if defined (USE_REALPATH)
     const char *rp = realpath (filename, buf);
     if (rp == NULL)
-      rp = filename;
+      {
+	return normalize_path (filename);
+      }
     return xstrdup (rp);
 # endif
   }
Index: gdb/buildsym.c
===================================================================
--- gdb/buildsym.c	(revision 69)
+++ gdb/buildsym.c	(working copy)
@@ -548,26 +548,46 @@
   for (subfile = subfiles; subfile; subfile = subfile->next)
     {
       char *subfile_name;
+      char *full_name;
 
       /* If NAME is an absolute path, and this subfile is not, then
 	 attempt to create an absolute path to compare.  */
       if (IS_ABSOLUTE_PATH (name)
 	  && !IS_ABSOLUTE_PATH (subfile->name)
 	  && subfile->dirname != NULL)
-	subfile_name = concat (subfile->dirname, SLASH_STRING,
+	{
+	  char *path = concat (subfile->dirname, SLASH_STRING,
 			       subfile->name, NULL);
+	  subfile_name = gdb_realpath (path);
+	  xfree (path);
+	}
       else
 	subfile_name = subfile->name;
 
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      /* If NAME is not an absolute path, try to make it an 
+	 absolute path so we compare apples with apples.  */
+      if (!IS_ABSOLUTE_PATH (name) && dirname != NULL)
+	{
+	  char *path = concat (dirname, SLASH_STRING, name, NULL);
+	  full_name = gdb_realpath (path);
+	  xfree (path);
+	}
+      else
+	full_name = name;
+
+      if (FILENAME_CMP (subfile_name, full_name) == 0)
 	{
 	  current_subfile = subfile;
 	  if (subfile_name != subfile->name)
 	    xfree (subfile_name);
+	  if (full_name != name)
+	    xfree (full_name);
 	  return;
 	}
       if (subfile_name != subfile->name)
 	xfree (subfile_name);
+      if (full_name != name)
+	xfree (full_name);
     }
 
   /* This subfile is not known.  Add an entry for it. Make an entry

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2008-01-08 19:54 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-01-04 19:52 gdb_realpath: dealing with ./ and ../ Aleksandar Ristovski
2008-01-04 20:30 ` Doug Evans
  -- strict thread matches above, loose matches on Subject: below --
2008-01-08 19:21 Aleksandar Ristovski
2008-01-08 16:12 Aleksandar Ristovski
2008-01-08 16:40 ` Mark Kettenis
2008-01-04 22:09 Aleksandar Ristovski
2008-01-04 20:16 Aleksandar Ristovski
2008-01-04 17:04 Aleksandar Ristovski
2008-01-04 17:42 ` Daniel Jacobowitz
2008-01-04 18:25   ` Joel Brobecker
2008-01-04 21:40 ` Doug Evans
2008-01-04 21:48   ` Daniel Jacobowitz
2008-01-04 22:23     ` Doug Evans
2008-01-03 18:30 Aleksandar Ristovski
2008-01-04 12:52 ` Daniel Jacobowitz
2008-01-03 17:07 Aleksandar Ristovski
2008-01-03 17:13 ` Daniel Jacobowitz
2008-01-07 14:33   ` Joel Brobecker
2008-01-07 17:00     ` Doug Evans
2008-01-08  5:46       ` Joel Brobecker
2008-01-08 19:54         ` Doug Evans
2008-01-03 16:39 Aleksandar Ristovski
2008-01-03 16:52 ` Daniel Jacobowitz
2008-01-03 15:25 Aleksandar Ristovski
2008-01-03 16:00 ` Daniel Jacobowitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).