public inbox for gdb-cvs@sourceware.org
help / color / mirror / Atom feed
* [binutils-gdb] gdb: add "id" fields to identify symtabs and subfiles
@ 2022-07-30  0:55 Simon Marchi
  0 siblings, 0 replies; only message in thread
From: Simon Marchi @ 2022-07-30  0:55 UTC (permalink / raw)
  To: gdb-cvs

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f71ad5556cf5ad3bb938d6ba1b5eca1e79d8d740

commit f71ad5556cf5ad3bb938d6ba1b5eca1e79d8d740
Author: Simon Marchi <simon.marchi@efficios.com>
Date:   Thu Jul 28 12:34:47 2022 -0400

    gdb: add "id" fields to identify symtabs and subfiles
    
    Printing macros defined in the main source file doesn't work reliably
    using various toolchains, especially when DWARF 5 is used.  For example,
    using the binaries produced by either of these commands:
    
        $ gcc --version
        gcc (GCC) 11.2.0
        $ ld --version
        GNU ld (GNU Binutils) 2.38
        $ gcc test.c -g3 -gdwarf-5
    
        $ clang --version
        clang version 13.0.1
        $ clang test.c -gdwarf-5 -fdebug-macro
    
    I get:
    
        $ ./gdb -nx -q --data-directory=data-directory a.out
        (gdb) start
        Temporary breakpoint 1 at 0x111d: file test.c, line 6.
        Starting program: /home/simark/build/binutils-gdb-one-target/gdb/a.out
    
        Temporary breakpoint 1, main () at test.c:6
        6         return ZERO;
        (gdb) p ZERO
        No symbol "ZERO" in current context.
    
    When starting to investigate this (taking the gcc-compiled binary as an
    example), we see that GDB fails to look up the appropriate macro scope
    when evaluating the expression.  While stopped in
    macro_lookup_inclusion:
    
        (top-gdb) p name
        $1 = 0x62100011a980 "test.c"
        (top-gdb) p source.filename
        $2 = 0x62100011a9a0 "/home/simark/build/binutils-gdb-one-target/gdb/test.c"
    
    `source` is the macro_source_file that we would expect GDB to find.
    `name` comes from the symtab::filename field of the symtab we are
    stopped in.  GDB doesn't find the appropriate macro_source_file because
    the name of the macro_source_file doesn't match exactly the name of the
    symtab.
    
    The name of the main symtab comes from the compilation unit's
    DW_AT_name, passed to the buildsym_compunit's constructor:
    
      https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/read.c#L10627-10630
    
    The contents of DW_AT_name, in this case, is "test.c".  It is typically
    (what I witnessed all compilers do) the same string that was passed to
    the compiler on the command-line.
    
    The name of the macro_source_file comes from the line number program
    header's file table, from the call to the line_header::file_file_name
    method:
    
      https://gitlab.com/gnutools/binutils-gdb/-/blob/4815d6125ec580cc02a1094d61b8c9d1cc83c0a1/gdb/dwarf2/macro.c#L54-65
    
    line_header::file_file_name prepends the directory path that the file
    entry refers to, in the file table (if the file name is not already
    absolute).  In this case, the file name is "test.c", appended to the
    directory "/home/simark/build/binutils-gdb-one-target/gdb".
    
    Because the symtab's name is not created the same way as the
    macro_source_file's name is created, we get this mismatch.  GDB fails to
    find the appropriate macro scope for the symtab, and we can't print
    macros when stopped in that symtab.
    
    To make this work, we must ensure that paths produced in these two ways
    end up identical.  This can be tricky because of the different ways a
    path can be passed to the compiler by the user.
    
    Another thing to consider is that while the main symtab's name (or
    subfile, before it becomes a symtab) is created using DW_AT_name, the
    main symtab is also referred to using its entry in the line table
    header's file table, when processing the line table.  We must therefore
    ensure that the same name is produced in both cases, so that a call to
    "start_subfile" for the main subfile will correctly find the
    already-created subfile, created by buildsym_compunit's constructor.  If
    we fail to do that, things still often work, because of a fallback: the
    watch_main_source_file_lossage method.  This method determines that if
    the main subfile has no symbols but there exists another subfile with
    the same basename (e.g. "test.c") that does have symbols, it's probably
    because there was some filename mismatch.  So it replaces the main
    subfile with that other subfile.  I think that heuristic is useful as a
    last effort to work around any bug or bad debug info, but I don't think
    we should design things such as to rely on it.  It's a heuristic, it can
    get things wrong.  So in my search for a fix, it is important that given
    some good debug info, we don't end up relying on that for things to
    work.
    
    A first attempt at fixing this was to try to prepend the compilation
    directory here or not prepend it there.  In practice, because of all the
    possible combinations of debug info the compilers produce, it was not
    possible to get something that would produce reliable, consistent paths.
    
    Another attempt at fixing this was to make both macro_source_file
    objects and symtab objects use the most complete form of path possible.
    That means to prepend directories at least until we get an absolute
    path.  In theory, we should end up with the same path in all cases.
    This generally worked, but because it changed the symtab names, it
    resulted in user-visible changes (for example, paths to source files in
    Breakpoint hit messages becoming always absolute).  I didn't find this
    very good, first because there is a "set filename-display" setting that
    lets the user control how they want the paths to be displayed, and that
    would suddenly make this setting completely ineffective (although even
    today, it is a bit dependent on the debug info).  Second, it would
    require a good amount of testsuite tweaks to make tests accept these
    suddenly absolute paths.
    
    This new patch is a slight variation of that: it adds a new field called
    "filename_for_id" in struct symtab and struct subfile, next to the
    existing filename field. The goal is to separate the internal ids used
    for finding objects from the names used for presentation.  This field is
    used for identifying subfiles, symtabs and macro_source_files
    internally.  For DWARF symtabs, this new field is meant to contain the
    "most complete possible" path, as discussed above.  So for a given file,
    it must always be in the same form, everywhere.  The existing
    symtab::filename field remains the one used for printing to the user, so
    there shouldn't be any change in how paths are printed.
    
    Changes in the core symtab files are:
    
     - Add "name_for_id" and "filename_for_id" fields to "struct subfile"
       and "struct symtab", next to existing "name" and "filename" fields.
     - Make buildsym_compunit::buildsym_compunit and
       buildsym_compunit::start_subfile accept a "name_for_id" parameter
       next to the existing "name" ones.
     - Make buildsym_compunit::start_subfile use "name_for_id" for looking
       up existing subfiles.  This is the key thing for making calls
       to start_subfile for the main source file look up the existing
       subfile successfully, and avoid relying on
       watch_main_source_file_lossage.
     - Make sal_macro_scope pass "filename_for_id", rather than "filename",
       to macro_lookup_inclusion.  This is the key thing to making the
       lookup work and macro printing work.
    
    Changes in the DWARF files are:
    
     - Make line_header::file_file_name return the "most complete possible"
       name.  The only pre-existing user of this method is the macro code,
       to give the macro_source_file objects their name.  And we now want
       them to have this "most complete possible" name, which will match the
       corresponding symtab's "filename_for_id".
     - Make dwarf2_cu::start_compunit_symtab pass the "most complete
       possible" name for the main symtab's "filename_for_id".  In this
       context, where the info comes from the compilation unit's DW_AT_name
       / DW_AT_comp_dir, it means prepending DW_AT_comp_dir to DW_AT_name if
       DW_AT_name is not already absolute.
     - Change dwarf2_start_subfile to build a name_for_id for the subfile
       being started.  The simplest way is to re-use
       line_header::file_file_name, since the callers always have a
       file_entry handy.  This ensures that it will get the exact same path
       representation as the macro code does, for the same file (since it
       also uses line_header::file_file_name).
     - Update calls to allocate_symtab to pass the "name_for_id" from the
       subfile.
    
    Tests exercising all this are added by the following patch.
    
    Of all the cases I tried, the only one I found that ends up relying on
    watch_main_source_file_lossage is the following one:
    
        $ clang --version
        clang version 13.0.1
        Target: x86_64-pc-linux-gnu
        Thread model: posix
        InstalledDir: /usr/bin
        $ clang  ./test.c -g3 -O0 -gdwarf-4
        $ ./gdb -nx --data-directory=data-directory -q -readnow -iex "set debug symtab-create 1"  a.out
        ...
        [symtab-create] start_subfile: name = test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/test.c
        [symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
        [symtab-create] start_subfile: name = ./test.c, name_for_id = /home/simark/build/binutils-gdb-one-target/gdb/./test.c
        [symtab-create] start_subfile: found existing symtab with name_for_id /home/simark/build/binutils-gdb-one-target/gdb/./test.c (/home/simark/build/binutils-gdb-one-target/gdb/./test.c)
        [symtab-create] watch_main_source_file_lossage: using subfile ./test.c as the main subfile
    
    As we can see, there are two forms used for "test.c", one with a "." and
    one without.  This comes from the fact that the compilation unit DIE
    contains:
    
        DW_AT_name ("test.c")
        DW_AT_comp_dir ("/home/simark/build/binutils-gdb-one-target/gdb")
    
    without a ".", and the line table for that file contains:
    
        include_directories[  1] = "."
        file_names[  1]:
                   name: "test.c"
              dir_index: 1
    
    When assembling the filename from that entry, we get a ".".
    
    It is a bit unexpected that the main filename resulting from the line
    table header does not match exactly the name in the compilation unit.
    For instance, gcc uses "./test.c" for the DW_AT_name, which gives
    identical paths in the compilation unit and in the line table header.
    
    Similarly, with DWARF 5:
    
        $ clang  ./test.c -g3 -O0 -gdwarf-5
    
    clang create two entries that refer to the same file but are of in a different
    form.
    
        include_directories[  0] = "/home/simark/build/binutils-gdb-one-target/gdb"
        include_directories[  1] = "."
        file_names[  0]:
                   name: "test.c"
              dir_index: 0
        file_names[  1]:
                   name: "test.c"
              dir_index: 1
    
    The first file name produces a path without a "." while the second does.
    This is not caught by watch_main_source_file_lossage, because of
    dwarf_decode_lines that creates a symtab for each file entry in the line
    table.  It therefore appears as "non-empty" to
    watch_main_source_file_lossage.  This results in two symtabs:
    
        (gdb) maintenance info symtabs
        { objfile /home/simark/build/binutils-gdb-one-target/gdb/a.out ((struct objfile *) 0x613000005d00)
          { ((struct compunit_symtab *) 0x62100011aca0)
            debugformat DWARF 5
            producer clang version 13.0.1
            name test.c
            dirname /home/simark/build/binutils-gdb-one-target/gdb
            blockvector ((struct blockvector *) 0x621000129ec0)
            user ((struct compunit_symtab *) (null))
                { symtab test.c ((struct symtab *) 0x62100011ad20)
                  fullname (null)
                  linetable ((struct linetable *) 0x0)
                }
                { symtab ./test.c ((struct symtab *) 0x62100011ad60)
                  fullname (null)
                  linetable ((struct linetable *) 0x621000129ef0)
                }
          }
        }
    
    I am not sure what is the consequence of this, but this is also what
    happens before my patch, so I think its acceptable to leave it as-is.
    
    To handle these two cases nicely, I think we will need a function that
    removes the unnecessary "." from path names, something that can be done
    later.
    
    Finally, I made a change in find_file_and_directory is necessary to
    avoid breaking test
    
        gdb.dwarf2/dw2-compdir-oldgcc.exp: info source gcc42
    
    Without that change, we would get:
    
        (gdb) info source
        Current source file is /dir/d/dw2-compdir-oldgcc42.S
        Compilation directory is /dir/d
    
    whereas the expected result is:
    
        (gdb) info source
        Current source file is dw2-compdir-oldgcc42.S
        Compilation directory is /dir/d
    
    This test was added here:
    
      https://sourceware.org/pipermail/gdb-patches/2012-November/098144.html
    
    Long story short, GCC <= 4.2 apparently had a bug where it would
    generate a DW_AT_name with a full path ("/dir/d/dw2-compdir-oldgcc42.S")
    and no DW_AT_comp_dir.  The line table has one entry with filename
    "dw2-compdir-oldgcc42.S", which refers to directory 0.  Directory 0
    normally refers to the compilation unit's comp dir, but it is
    non-existent in this case.
    
    This caused some symtab lookup problems, and to work around them, some
    workaround was added, which today reads as:
    
        if (res.get_comp_dir () == nullptr
            && producer_is_gcc_lt_4_3 (cu)
            && res.get_name () != nullptr
            && IS_ABSOLUTE_PATH (res.get_name ()))
          res.set_comp_dir (ldirname (res.get_name ()));
    
    Source: https://gitlab.com/gnutools/binutils-gdb/-/blob/6577f365ebdee7dda71cb996efa29d3714cbccd0/gdb/dwarf2/read.c#L9428-9432
    
    It extracts an artificial DW_AT_comp_dir from DW_AT_name, if there is no
    DW_AT_comp_dir and DW_AT_name is absolute.
    
    Prior to my patch, a subfile would get created with filename
    "/dir/d/dw2-compdir-oldgcc42.S", from DW_AT_name, and another would get
    created with filename "dw2-compdir-oldgcc42.S" from the line table's
    file table.  Then watch_main_source_file_lossage would kick in and merge
    them, keeping only the "dw2-compdir-oldgcc42.S" one:
    
        [symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S
        [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
        [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S
        [symtab-create] start_subfile: found existing symtab with name dw2-compdir-oldgcc42.S (dw2-compdir-oldgcc42.S)
        [symtab-create] watch_main_source_file_lossage: using subfile dw2-compdir-oldgcc42.S as the main subfile
    
    And so "info source" would show "dw2-compdir-oldgcc42.S" as the
    filename.
    
    With my patch applied, but without the change in
    find_file_and_directory, both DW_AT_name and the line table would try to
    start a subfile with the same filename_for_id, and there was no need for
    watch_main_source_file_lossage - which is what we want:
    
    [symtab-create] start_subfile: name = /dir/d/dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
    [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
    [symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
    [symtab-create] start_subfile: name = dw2-compdir-oldgcc42.S, name_for_id = /dir/d/dw2-compdir-oldgcc42.S
    [symtab-create] start_subfile: found existing symtab with name_for_id /dir/d/dw2-compdir-oldgcc42.S (/dir/d/dw2-compdir-oldgcc42.S)
    
    But since the one with name == "/dir/d/dw2-compdir-oldgcc42.S", coming
    from DW_AT_name, gets created first, it wins, and the symtab ends up
    with "/dir/d/dw2-compdir-oldgcc42.S" as the name, "info source" shows
    "/dir/d/dw2-compdir-oldgcc42.S" and the test breaks.
    
    This is not wrong per-se, after all DW_AT_name is
    "/dir/d/dw2-compdir-oldgcc42.S", so it wouldn't be wrong to report the
    current source file as "/dir/d/dw2-compdir-oldgcc42.S".  If you compile
    a file passing "/an/absolute/path.c", DW_AT_name typically contains (at
    least with GCC) "/an/absolute/path.c" and GDB tells you that the source
    file is "/an/absolute/path.c".  But we can also keep the existing
    behavior fairly easily with a little change in find_file_and_directory.
    When extracting an artificial DW_AT_comp_dir from DW_AT_name, we now
    modify the name to just keep the file part.  The result is coherent with
    what compilers do when you compile a file by just passing its filename
    ("gcc path.c -g"):
    
          DW_AT_name        ("path.c")
          DW_AT_comp_dir    ("/home/simark/build/binutils-gdb-one-target/gdb")
    
    With this change, filename_for_id is still the full name,
    "/dir/d/dw2-compdir-oldgcc42.S", but the filename of the subfile /
    symtab (what ends up shown by "info source") is just
    "dw2-compdir-oldgcc42.S", and that makes the test happy.
    
    Change-Id: I8b5cc4bb3052afdb172ee815c051187290566307

Diff:
---
 gdb/buildsym.c           | 36 ++++++++++++++++--------------
 gdb/buildsym.h           | 39 +++++++++++++++++++++++++++++---
 gdb/dwarf2/cu.c          | 17 +++++++++++++-
 gdb/dwarf2/line-header.c | 18 ++++++++++-----
 gdb/dwarf2/line-header.h |  6 +++--
 gdb/dwarf2/read.c        | 58 ++++++++++++++++++++++++++----------------------
 gdb/macroscope.c         |  2 +-
 gdb/symfile.c            |  4 +++-
 gdb/symfile.h            | 11 ++++++++-
 gdb/symtab.h             | 14 +++++++++++-
 gdb/xcoffread.c          |  1 +
 11 files changed, 148 insertions(+), 58 deletions(-)

diff --git a/gdb/buildsym.c b/gdb/buildsym.c
index fae3d5ef4ce..67d88a94ba9 100644
--- a/gdb/buildsym.c
+++ b/gdb/buildsym.c
@@ -52,6 +52,7 @@ struct pending_block
 buildsym_compunit::buildsym_compunit (struct objfile *objfile_,
 				      const char *name,
 				      const char *comp_dir_,
+				      const char *name_for_id,
 				      enum language language_,
 				      CORE_ADDR last_addr)
   : m_objfile (objfile_),
@@ -70,7 +71,7 @@ buildsym_compunit::buildsym_compunit (struct objfile *objfile_,
      It can happen that the debug info provides a different path to NAME than
      DIRNAME,NAME.  We cope with this in watch_main_source_file_lossage but
      that only works if the main_subfile doesn't have a symtab yet.  */
-  start_subfile (name);
+  start_subfile (name, name_for_id);
   /* Save this so that we don't have to go looking for it at the end
      of the subfiles list.  */
   m_main_subfile = m_current_subfile;
@@ -483,40 +484,38 @@ buildsym_compunit::make_blockvector ()
 
   return (blockvector);
 }
-\f
-/* Start recording information about source code that came from an
-   included (or otherwise merged-in) source file with a different
-   name.  NAME is the name of the file (cannot be NULL).  */
+
+/* See buildsym.h.  */
 
 void
-buildsym_compunit::start_subfile (const char *name)
+buildsym_compunit::start_subfile (const char *name, const char *name_for_id)
 {
   /* See if this subfile is already registered.  */
 
-  symtab_create_debug_printf ("name = %s", name);
+  symtab_create_debug_printf ("name = %s, name_for_id = %s", name, name_for_id);
 
   for (subfile *subfile = m_subfiles; subfile; subfile = subfile->next)
     {
       std::string subfile_name_holder;
-      const char *subfile_name;
+      const char *subfile_name_for_id;
 
       /* If NAME is an absolute path, and this subfile is not, then
 	 attempt to create an absolute path to compare.  */
-      if (IS_ABSOLUTE_PATH (name)
-	  && !IS_ABSOLUTE_PATH (subfile->name)
+      if (IS_ABSOLUTE_PATH (name_for_id)
+	  && !IS_ABSOLUTE_PATH (subfile->name_for_id)
 	  && !m_comp_dir.empty ())
 	{
 	  subfile_name_holder = path_join (m_comp_dir.c_str (),
-					   subfile->name.c_str ());
-	  subfile_name = subfile_name_holder.c_str ();
+					   subfile->name_for_id.c_str ());
+	  subfile_name_for_id = subfile_name_holder.c_str ();
 	}
       else
-	subfile_name = subfile->name.c_str ();
+	subfile_name_for_id = subfile->name_for_id.c_str ();
 
-      if (FILENAME_CMP (subfile_name, name) == 0)
+      if (FILENAME_CMP (subfile_name_for_id, name_for_id) == 0)
 	{
-	  symtab_create_debug_printf ("found existing symtab with name %s (%s)",
-				      subfile->name.c_str (), subfile_name);
+	  symtab_create_debug_printf ("found existing symtab with name_for_id %s (%s)",
+				      subfile->name_for_id.c_str (), subfile_name_for_id);
 	  m_current_subfile = subfile;
 	  return;
 	}
@@ -526,6 +525,7 @@ buildsym_compunit::start_subfile (const char *name)
 
   subfile_up subfile (new struct subfile);
   subfile->name = name;
+  subfile->name_for_id = name_for_id;
 
   m_current_subfile = subfile.get ();
 
@@ -595,6 +595,7 @@ buildsym_compunit::patch_subfile_names (struct subfile *subfile,
     {
       m_comp_dir = std::move (subfile->name);
       subfile->name = name;
+      subfile->name_for_id = name;
       set_last_source_file (name);
 
       /* Default the source language to whatever can be deduced from
@@ -932,7 +933,8 @@ buildsym_compunit::end_compunit_symtab_with_blockvector
 
       /* Allocate a symbol table if necessary.  */
       if (subfile->symtab == NULL)
-	subfile->symtab = allocate_symtab (cu, subfile->name.c_str ());
+	subfile->symtab = allocate_symtab (cu, subfile->name.c_str (),
+					   subfile->name_for_id.c_str ());
 
       struct symtab *symtab = subfile->symtab;
 
diff --git a/gdb/buildsym.h b/gdb/buildsym.h
index 2ad00336269..a42317aded8 100644
--- a/gdb/buildsym.h
+++ b/gdb/buildsym.h
@@ -55,6 +55,12 @@ struct subfile
 
   struct subfile *next = nullptr;
   std::string name;
+
+  /* This field is analoguous in function to symtab::filename_for_id.
+
+     It is used to look up existing subfiles in calls to start_subfile.  */
+  std::string name_for_id;
+
   std::vector<linetable_entry> line_vector_entries;
   enum language language = language_unknown;
   struct symtab *symtab = nullptr;
@@ -135,12 +141,24 @@ struct buildsym_compunit
 {
   /* Start recording information about a primary source file (IOW, not an
      included source file).
+
      COMP_DIR is the directory in which the compilation unit was compiled
-     (or NULL if not known).  */
+     (or NULL if not known).
+
+     NAME and NAME_FOR_ID have the same purpose as for the start_subfile
+     method.  */
+
+  buildsym_compunit (struct objfile *objfile_, const char *name,
+		     const char *comp_dir_, const char *name_for_id,
+		     enum language language_, CORE_ADDR last_addr);
+
+  /* Same as above, but passes NAME for NAME_FOR_ID.  */
 
   buildsym_compunit (struct objfile *objfile_, const char *name,
 		     const char *comp_dir_, enum language language_,
-		     CORE_ADDR last_addr);
+		     CORE_ADDR last_addr)
+    : buildsym_compunit (objfile_, name, comp_dir_, name, language_, last_addr)
+  {}
 
   /* Reopen an existing compunit_symtab so that additional symbols can
      be added to it.  Arguments are as for the main constructor.  CUST
@@ -198,7 +216,22 @@ struct buildsym_compunit
   void record_block_range (struct block *block,
 			   CORE_ADDR start, CORE_ADDR end_inclusive);
 
-  void start_subfile (const char *name);
+  /* Start recording information about source code that comes from a source
+     file.  This sets the current subfile, creating it if necessary.
+
+     NAME is the user-visible name of the subfile.
+
+     NAME_FOR_ID is a name that must be stable between the different calls to
+     start_subfile referring to the same file (it is used for looking up
+     existing subfiles).  It can be equal to NAME if NAME follows that rule.  */
+  void start_subfile (const char *name, const char *name_for_id);
+
+  /* Same as above, but passes NAME for NAME_FOR_ID.  */
+
+  void start_subfile (const char *name)
+  {
+    return start_subfile (name, name);
+  }
 
   void patch_subfile_names (struct subfile *subfile, const char *name);
 
diff --git a/gdb/dwarf2/cu.c b/gdb/dwarf2/cu.c
index d40dd094b09..fe95d7ea87b 100644
--- a/gdb/dwarf2/cu.c
+++ b/gdb/dwarf2/cu.c
@@ -21,6 +21,8 @@
 #include "dwarf2/cu.h"
 #include "dwarf2/read.h"
 #include "objfiles.h"
+#include "filenames.h"
+#include "gdbsupport/pathstuff.h"
 
 /* Initialize dwarf2_cu to read PER_CU, in the context of PER_OBJFILE.  */
 
@@ -60,9 +62,22 @@ dwarf2_cu::start_compunit_symtab (const char *name, const char *comp_dir,
 {
   gdb_assert (m_builder == nullptr);
 
+  std::string name_for_id_holder;
+  const char *name_for_id = name;
+
+  /* Prepend the compilation directory to the filename if needed (if not
+     absolute already) to get the "name for id" for our main symtab.  The name
+     for the main file coming from the line table header will be generated using
+     the same logic, so will hopefully match what we pass here.  */
+  if (!IS_ABSOLUTE_PATH (name) && comp_dir != nullptr)
+    {
+      name_for_id_holder = path_join (comp_dir, name);
+      name_for_id = name_for_id_holder.c_str ();
+    }
+
   m_builder.reset (new struct buildsym_compunit
 		   (this->per_objfile->objfile,
-		    name, comp_dir, lang (), low_pc));
+		    name, comp_dir, name_for_id, lang (), low_pc));
 
   list_in_scope = get_builder ()->get_file_symbols ();
 
diff --git a/gdb/dwarf2/line-header.c b/gdb/dwarf2/line-header.c
index a1bd39df7ed..73db3765c09 100644
--- a/gdb/dwarf2/line-header.c
+++ b/gdb/dwarf2/line-header.c
@@ -62,14 +62,22 @@ line_header::file_file_name (const file_entry &fe) const
 {
   gdb_assert (is_valid_file_index (fe.index));
 
-  if (IS_ABSOLUTE_PATH (fe.name))
-    return fe.name;
+  std::string ret = fe.name;
+
+  if (IS_ABSOLUTE_PATH (ret))
+    return ret;
 
   const char *dir = fe.include_dir (this);
-  if (dir == nullptr)
-    return fe.name;
+  if (dir != nullptr)
+    ret = path_join (dir, ret);
+
+  if (IS_ABSOLUTE_PATH (ret))
+    return ret;
+
+  if (m_comp_dir != nullptr)
+    ret = path_join (m_comp_dir, ret);
 
-  return path_join (dir, fe.name);
+  return ret;
 }
 
 static void
diff --git a/gdb/dwarf2/line-header.h b/gdb/dwarf2/line-header.h
index e6b52f7520e..e7a63a37918 100644
--- a/gdb/dwarf2/line-header.h
+++ b/gdb/dwarf2/line-header.h
@@ -171,8 +171,10 @@ struct line_header
      header.  These point into dwarf2_per_objfile->line_buffer.  */
   const gdb_byte *statement_program_start {}, *statement_program_end {};
 
-  /* Return file name relative to the compilation directory of file
-     FE in this object's file name table.  */
+  /* Return the most "complete" file name for FILE possible.
+
+     This means prepending the directory and compilation directory, as needed,
+     until we get an absolute path.  */
   std::string file_file_name (const file_entry &fe) const;
 
   /* Return the compilation directory of the compilation unit in the context of
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index b6d9f5a8aa1..ae0b5d7a773 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -956,8 +956,8 @@ static void dwarf_decode_lines (struct line_header *,
 				struct dwarf2_cu *,
 				CORE_ADDR, int decode_mapping);
 
-static void dwarf2_start_subfile (struct dwarf2_cu *, const char *,
-				  const char *);
+static void dwarf2_start_subfile (dwarf2_cu *cu, const file_entry &fe,
+				  const line_header &lh);
 
 static struct symbol *new_symbol (struct die_info *, struct type *,
 				  struct dwarf2_cu *, struct symbol * = NULL);
@@ -9439,7 +9439,10 @@ find_file_and_directory (struct die_info *die, struct dwarf2_cu *cu)
       && producer_is_gcc_lt_4_3 (cu)
       && res.get_name () != nullptr
       && IS_ABSOLUTE_PATH (res.get_name ()))
-    res.set_comp_dir (ldirname (res.get_name ()));
+    {
+      res.set_comp_dir (ldirname (res.get_name ()));
+      res.set_name (make_unique_xstrdup (lbasename (res.get_name ())));
+    }
 
   cu->per_cu->fnd.reset (new file_and_directory (std::move (res)));
   return *cu->per_cu->fnd;
@@ -9700,18 +9703,20 @@ dwarf2_cu::setup_type_unit_groups (struct die_info *die)
       for (i = 0; i < file_names.size (); ++i)
 	{
 	  file_entry &fe = file_names[i];
-	  dwarf2_start_subfile (this, fe.name,
-				fe.include_dir (line_header));
+	  dwarf2_start_subfile (this, fe, *line_header);
 	  buildsym_compunit *b = get_builder ();
-	  if (b->get_current_subfile ()->symtab == NULL)
+	  subfile *sf = b->get_current_subfile ();
+
+	  if (sf->symtab == nullptr)
 	    {
 	      /* NOTE: start_subfile will recognize when it's been
 		 passed a file it has already seen.  So we can't
 		 assume there's a simple mapping from
 		 cu->line_header->file_names to subfiles, plus
 		 cu->line_header->file_names may contain dups.  */
-	      const char *name = b->get_current_subfile ()->name.c_str ();
-	      b->get_current_subfile ()->symtab = allocate_symtab (cust, name);
+	      const char *name = sf->name.c_str ();
+	      const char *name_for_id = sf->name_for_id.c_str ();
+	      sf->symtab = allocate_symtab (cust, name, name_for_id);
 	    }
 
 	  fe.symtab = b->get_current_subfile ()->symtab;
@@ -20021,11 +20026,9 @@ lnp_state_machine::handle_set_file (file_name_index file)
     dwarf2_debug_line_missing_file_complaint ();
   else
     {
-      const char *dir = fe->include_dir (m_line_header);
-
       m_last_subfile = m_cu->get_builder ()->get_current_subfile ();
       m_line_has_non_zero_discriminator = m_discriminator != 0;
-      dwarf2_start_subfile (m_cu, fe->name, dir);
+      dwarf2_start_subfile (m_cu, *fe, *m_line_header);
     }
 }
 
@@ -20308,7 +20311,7 @@ dwarf_decode_lines_1 (struct line_header *lh, struct dwarf2_cu *cu,
       const file_entry *fe = state_machine.current_file ();
 
       if (fe != NULL)
-	dwarf2_start_subfile (cu, fe->name, fe->include_dir (lh));
+	dwarf2_start_subfile (cu, *fe, *lh);
 
       /* Decode the table.  */
       while (line_ptr < line_end && !end_sequence)
@@ -20520,14 +20523,14 @@ dwarf_decode_lines (struct line_header *lh, struct dwarf2_cu *cu,
 
   for (auto &fe : lh->file_names ())
     {
-      dwarf2_start_subfile (cu, fe.name, fe.include_dir (lh));
-      if (builder->get_current_subfile ()->symtab == NULL)
-	{
-	  builder->get_current_subfile ()->symtab
-	    = allocate_symtab (cust,
-			       builder->get_current_subfile ()->name.c_str ());
-	}
-      fe.symtab = builder->get_current_subfile ()->symtab;
+      dwarf2_start_subfile (cu, fe, *lh);
+      subfile *sf = builder->get_current_subfile ();
+
+      if (sf->symtab == nullptr)
+	sf->symtab = allocate_symtab (cust, sf->name.c_str (),
+				      sf->name_for_id.c_str ());
+
+      fe.symtab = sf->symtab;
     }
 }
 
@@ -20555,10 +20558,12 @@ dwarf_decode_lines (struct line_header *lh, struct dwarf2_cu *cu,
    subfile's name.  */
 
 static void
-dwarf2_start_subfile (struct dwarf2_cu *cu, const char *filename,
-		      const char *dirname)
+dwarf2_start_subfile (dwarf2_cu *cu, const file_entry &fe,
+		      const line_header &lh)
 {
-  std::string copy;
+  std::string filename_holder;
+  const char *filename = fe.name;
+  const char *dirname = lh.include_dir_at (fe.d_index);
 
   /* In order not to lose the line information directory,
      we concatenate it to the filename when it makes sense.
@@ -20569,11 +20574,12 @@ dwarf2_start_subfile (struct dwarf2_cu *cu, const char *filename,
 
   if (!IS_ABSOLUTE_PATH (filename) && dirname != NULL)
     {
-      copy = path_join (dirname, filename);
-      filename = copy.c_str ();
+      filename_holder = path_join (dirname, filename);
+      filename = filename_holder.c_str ();
     }
 
-  cu->get_builder ()->start_subfile (filename);
+  std::string filename_for_id = lh.file_file_name (fe);
+  cu->get_builder ()->start_subfile (filename, filename_for_id.c_str ());
 }
 
 static void
diff --git a/gdb/macroscope.c b/gdb/macroscope.c
index 93f561acccd..fe7f10e296a 100644
--- a/gdb/macroscope.c
+++ b/gdb/macroscope.c
@@ -51,7 +51,7 @@ sal_macro_scope (struct symtab_and_line sal)
   gdb::unique_xmalloc_ptr<struct macro_scope> ms (XNEW (struct macro_scope));
 
   main_file = macro_main (cust->macro_table ());
-  inclusion = macro_lookup_inclusion (main_file, sal.symtab->filename);
+  inclusion = macro_lookup_inclusion (main_file, sal.symtab->filename_for_id);
 
   if (inclusion)
     {
diff --git a/gdb/symfile.c b/gdb/symfile.c
index 27e95718327..aea8c762ee6 100644
--- a/gdb/symfile.c
+++ b/gdb/symfile.c
@@ -2772,13 +2772,15 @@ deduce_language_from_filename (const char *filename)
    CUST is from the result of allocate_compunit_symtab.  */
 
 struct symtab *
-allocate_symtab (struct compunit_symtab *cust, const char *filename)
+allocate_symtab (struct compunit_symtab *cust, const char *filename,
+		 const char *filename_for_id)
 {
   struct objfile *objfile = cust->objfile ();
   struct symtab *symtab
     = OBSTACK_ZALLOC (&objfile->objfile_obstack, struct symtab);
 
   symtab->filename = objfile->intern (filename);
+  symtab->filename_for_id = objfile->intern (filename_for_id);
   symtab->fullname = NULL;
   symtab->set_language (deduce_language_from_filename (filename));
 
diff --git a/gdb/symfile.h b/gdb/symfile.h
index f44dfa060a0..5216e85c4e3 100644
--- a/gdb/symfile.h
+++ b/gdb/symfile.h
@@ -201,9 +201,18 @@ extern symfile_segment_data_up default_symfile_segments (bfd *abfd);
 extern bfd_byte *default_symfile_relocate (struct objfile *objfile,
 					   asection *sectp, bfd_byte *buf);
 
-extern struct symtab *allocate_symtab (struct compunit_symtab *, const char *)
+extern struct symtab *allocate_symtab
+  (struct compunit_symtab *cust, const char *filename, const char *id)
   ATTRIBUTE_NONNULL (1);
 
+/* Same as the above, but passes FILENAME for ID.  */
+
+static inline struct symtab *
+allocate_symtab (struct compunit_symtab *cust, const char *filename)
+{
+  return allocate_symtab (cust, filename, filename);
+}
+
 extern struct compunit_symtab *allocate_compunit_symtab (struct objfile *,
 							 const char *)
   ATTRIBUTE_NONNULL (1);
diff --git a/gdb/symtab.h b/gdb/symtab.h
index 4bc86645e2c..89d7a183ff3 100644
--- a/gdb/symtab.h
+++ b/gdb/symtab.h
@@ -1644,10 +1644,22 @@ struct symtab
 
   struct linetable *m_linetable;
 
-  /* Name of this source file.  This pointer is never NULL.  */
+  /* Name of this source file, in a form appropriate to print to the user.
+
+     This pointer is never nullptr.  */
 
   const char *filename;
 
+  /* Filename for this source file, used as an identifier to link with
+     related objects such as associated macro_source_file objects.  It must
+     therefore match the name of any macro_source_file object created for this
+     source file.  The value can be the same as FILENAME if it is known to
+     follow that rule, or another form of the same file name, this is up to
+     the specific debug info reader.
+
+     This pointer is never nullptr.*/
+  const char *filename_for_id;
+
   /* Language of this source file.  */
 
   enum language m_language;
diff --git a/gdb/xcoffread.c b/gdb/xcoffread.c
index bcb667b29e7..6be0a7c4464 100644
--- a/gdb/xcoffread.c
+++ b/gdb/xcoffread.c
@@ -714,6 +714,7 @@ process_linenos (CORE_ADDR start, CORE_ADDR end)
 	  }
 	  struct subfile *current_subfile = get_current_subfile ();
 	  current_subfile->name = inclTable[ii].name;
+	  current_subfile->name_for_id = inclTable[ii].name;
 #endif
 
 	  start_subfile (pop_subfile ());


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-07-30  0:55 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-30  0:55 [binutils-gdb] gdb: add "id" fields to identify symtabs and subfiles Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).