public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Will Hawkins <hawkinsw@obs.cr>
To: gdb-patches@sourceware.org
Subject: Re: [PATCH v3] gdb: Support embedded source in DWARF
Date: Fri, 5 Apr 2024 12:18:23 -0400	[thread overview]
Message-ID: <CADx9qWhO5hAiA-wUAYcS0=OHGKFNas+P61PJ_qsXpYpkmMTOUQ@mail.gmail.com> (raw)
In-Reply-To: <20240405161302.573912-1-hawkinsw@obs.cr>

While there are still some areas that I need to address from Tom's
feedback, I took a stab at updating the source code access through the
cache and wanted to get your initial thoughts. See comments below.

On Fri, Apr 5, 2024 at 12:13 PM Will Hawkins <hawkinsw@obs.cr> wrote:
>
> While DW_LNCT_source is not yet finalized in the DWARF standard
> (https://dwarfstd.org/issues/180201.1.html), LLVM does emit it.
>
> This patch adds support for it in gdb.
>
> Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
> ---
>
> Notes:
>         v2 -> v3
>           - Address feedback from v2.
>           - Fixed up a bug where non-CU embedded files were incorrectly
>             marked as embedded.
>           - Revamped source access through the cache to avoid copies.
>
>         v1 -> v2
>           - Address feedback from original PR
>           - Add support for maintenance commands to see embedded source status
>           - Prevent access to the filesystem for symtabs with embedded source
>           - Add additional unit tests
>
>  gdb/dwarf2/file-and-dir.h                    | 23 ++++-
>  gdb/dwarf2/line-header.c                     | 22 +++--
>  gdb/dwarf2/line-header.h                     |  9 +-
>  gdb/dwarf2/read.c                            | 83 +++++++++++++++---
>  gdb/extension.c                              |  7 +-
>  gdb/extension.h                              |  9 +-
>  gdb/source-cache.c                           | 87 ++++++++++++-------
>  gdb/source-cache.h                           | 12 +--
>  gdb/source.c                                 | 74 +++++++++++-----
>  gdb/source.h                                 |  4 +
>  gdb/symmisc.c                                | 17 ++--
>  gdb/symtab.h                                 |  5 ++
>  gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c   | 24 ++++++
>  gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp | 88 ++++++++++++++++++++
>  gdb/testsuite/lib/dwarf.exp                  | 19 +++--
>  include/dwarf2.h                             |  5 ++
>  16 files changed, 390 insertions(+), 98 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
>  create mode 100644 gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
>
> diff --git a/gdb/dwarf2/file-and-dir.h b/gdb/dwarf2/file-and-dir.h
> index a5b1d8a3a21..84a7367e973 100644
> --- a/gdb/dwarf2/file-and-dir.h
> +++ b/gdb/dwarf2/file-and-dir.h
> @@ -95,7 +95,14 @@ struct file_and_directory
>    const char *get_fullname ()
>    {
>      if (m_fullname == nullptr)
> -      m_fullname = find_source_or_rewrite (get_name (), get_comp_dir ());
> +      {
> +       if (m_is_embedded)
> +         m_fullname = make_unique_xstrdup (embedded_fullname (
> +                                                              get_name (),
> +                                                              get_comp_dir ()));
> +       else
> +         m_fullname = find_source_or_rewrite (get_name (), get_comp_dir ());
> +      }
>      return m_fullname.get ();
>    }
>
> @@ -105,6 +112,17 @@ struct file_and_directory
>      m_fullname.reset ();
>    }
>
> +  /* Set whether the file's source is embedded in the dwarf.  */
> +  void set_embedded (bool is_embedded)
> +  {
> +    m_is_embedded = is_embedded;
> +  }
> +
> +  /* Return true if the file's source is embedded in the dwarf.  */
> +  bool is_embedded () const
> +  {
> +    return m_is_embedded;
> +  }
>  private:
>
>    /* The filename.  */
> @@ -124,6 +142,9 @@ struct file_and_directory
>
>    /* The full name.  */
>    gdb::unique_xmalloc_ptr<char> m_fullname;
> +
> +  /* Whether the file's source is embedded in the dwarf.  */
> +  bool m_is_embedded = false;
>  };
>
>  #endif /* GDB_DWARF2_FILE_AND_DIR_H */
> diff --git a/gdb/dwarf2/line-header.c b/gdb/dwarf2/line-header.c
> index a3ca49b64f5..3bc707e999e 100644
> --- a/gdb/dwarf2/line-header.c
> +++ b/gdb/dwarf2/line-header.c
> @@ -45,6 +45,7 @@ line_header::add_include_dir (const char *include_dir)
>  void
>  line_header::add_file_name (const char *name,
>                             dir_index d_index,
> +                           const char *source,
>                             unsigned int mod_time,
>                             unsigned int length)
>  {
> @@ -54,7 +55,7 @@ line_header::add_file_name (const char *name,
>    if (dwarf_line_debug >= 2)
>      gdb_printf (gdb_stdlog, "Adding file %d: %s\n", index, name);
>
> -  m_file_names.emplace_back (name, index, d_index, mod_time, length);
> +  m_file_names.emplace_back (name, index, d_index, source, mod_time, length);
>  }
>
>  std::string
> @@ -125,6 +126,7 @@ read_formatted_entries (dwarf2_per_objfile *per_objfile, bfd *abfd,
>                         void (*callback) (struct line_header *lh,
>                                           const char *name,
>                                           dir_index d_index,
> +                                         const char *source,
>                                           unsigned int mod_time,
>                                           unsigned int length))
>  {
> @@ -239,13 +241,17 @@ read_formatted_entries (dwarf2_per_objfile *per_objfile, bfd *abfd,
>               break;
>             case DW_LNCT_MD5:
>               break;
> +           case DW_LNCT_LLVM_SOURCE:
> +             if (string.has_value () && (*string)[0] != '\0')
> +               fe.source = *string;
> +             break;
>             default:
>               complaint (_("Unknown format content type %s"),
>                          pulongest (content_type));
>             }
>         }
>
> -      callback (lh, fe.name, fe.d_index, fe.mod_time, fe.length);
> +      callback (lh, fe.name, fe.d_index, fe.source, fe.mod_time, fe.length);
>      }
>
>    *bufp = buf;
> @@ -368,8 +374,8 @@ dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
>        read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (),
>                               offset_size,
>                               [] (struct line_header *header, const char *name,
> -                                 dir_index d_index, unsigned int mod_time,
> -                                 unsigned int length)
> +                                 dir_index d_index, const char *source,
> +                                 unsigned int mod_time, unsigned int length)
>         {
>           header->add_include_dir (name);
>         });
> @@ -378,10 +384,10 @@ dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
>        read_formatted_entries (per_objfile, abfd, &line_ptr, lh.get (),
>                               offset_size,
>                               [] (struct line_header *header, const char *name,
> -                                 dir_index d_index, unsigned int mod_time,
> -                                 unsigned int length)
> +                                 dir_index d_index, const char *source,
> +                                 unsigned int mod_time, unsigned int length)
>         {
> -         header->add_file_name (name, d_index, mod_time, length);
> +         header->add_file_name (name, d_index, source, mod_time, length);
>         });
>      }
>    else
> @@ -408,7 +414,7 @@ dwarf_decode_line_header  (sect_offset sect_off, bool is_dwz,
>           length = read_unsigned_leb128 (abfd, line_ptr, &bytes_read);
>           line_ptr += bytes_read;
>
> -         lh->add_file_name (cur_file, d_index, mod_time, length);
> +         lh->add_file_name (cur_file, d_index, nullptr, mod_time, length);
>         }
>        line_ptr += bytes_read;
>      }
> diff --git a/gdb/dwarf2/line-header.h b/gdb/dwarf2/line-header.h
> index c068dff70a3..abc95f3ee87 100644
> --- a/gdb/dwarf2/line-header.h
> +++ b/gdb/dwarf2/line-header.h
> @@ -35,9 +35,10 @@ struct file_entry
>    file_entry () = default;
>
>    file_entry (const char *name_, file_name_index index_, dir_index d_index_,
> -             unsigned int mod_time_, unsigned int length_)
> +             const char *source_, unsigned int mod_time_, unsigned int length_)
>      : name (name_),
>        index (index_),
> +      source (source_),
>        d_index (d_index_),
>        mod_time (mod_time_),
>        length (length_)
> @@ -54,6 +55,10 @@ struct file_entry
>    /* The index of this file in the file table.  */
>    file_name_index index {};
>
> +  /* The file's contents (if not null).  Note this is an observing pointer.
> +     The memory is owned by debug_line_buffer.  */
> +  const char *source {};
> +
>    /* The directory index (1-based).  */
>    dir_index d_index {};
>
> @@ -88,7 +93,7 @@ struct line_header
>    void add_include_dir (const char *include_dir);
>
>    /* Add an entry to the file name table.  */
> -  void add_file_name (const char *name, dir_index d_index,
> +  void add_file_name (const char *name, dir_index d_index, const char *source,
>                       unsigned int mod_time, unsigned int length);
>
>    /* Return the include dir at INDEX (0-based in DWARF 5 and 1-based before).
> diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
> index 7442094874c..e35ff019acf 100644
> --- a/gdb/dwarf2/read.c
> +++ b/gdb/dwarf2/read.c
> @@ -1635,6 +1635,10 @@ struct quick_file_names
>    /* The file names from the line table after being run through
>       gdb_realpath.  These are computed lazily.  */
>    const char **real_names;
> +
> +  /* Whether or not the file names refer to sources embedded
> +     in the dwarf.  */
> +  const bool *embeddeds;
>  };
>
>  /* With OBJF_READNOW, the DWARF reader expands all CUs immediately.
> @@ -1908,27 +1912,39 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
>    if (slot != nullptr)
>      *slot = qfn;
>
> +
> +  bool cu_file_embedded = false;
> +  std::vector<bool> embeddedv;
> +
>    std::vector<const char *> include_names;
>    if (lh != nullptr)
>      {
>        for (const auto &entry : lh->file_names ())
>         {
>           std::string name_holder;
> -         const char *include_name =
> -           compute_include_file_name (lh.get (), entry, fnd, name_holder);
> +         const char *include_name
> +           = compute_include_file_name (lh.get (), entry, fnd, name_holder);
>           if (include_name != nullptr)
>             {
>               include_name = per_objfile->objfile->intern (include_name);
>               include_names.push_back (include_name);
> +             embeddedv.push_back (entry.source != nullptr);
>             }
> +         else if (entry.source != nullptr)
> +           {
> +             /* We have an embedded source for the CU.  */
> +             gdb_assert (offset == 1);
> +             cu_file_embedded = true;
> +           }
> +
>         }
>      }
>
>    qfn->num_file_names = offset + include_names.size ();
>    qfn->comp_dir = fnd.intern_comp_dir (per_objfile->objfile);
> -  qfn->file_names =
> -    XOBNEWVEC (&per_objfile->per_bfd->obstack, const char *,
> -              qfn->num_file_names);
> +  qfn->file_names
> +    = XOBNEWVEC (&per_objfile->per_bfd->obstack, const char *,
> +                qfn->num_file_names);
>    if (offset != 0)
>      qfn->file_names[0] = per_objfile->objfile->intern (fnd.get_name ());
>
> @@ -1936,7 +1952,16 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
>      memcpy (&qfn->file_names[offset], include_names.data (),
>             include_names.size () * sizeof (const char *));
>
> -  qfn->real_names = NULL;
> +  bool *embeddeds
> +    = XOBNEWVEC (&per_objfile->per_bfd->obstack, bool,
> +                qfn->num_file_names);
> +  if (cu_file_embedded)
> +    embeddeds[0] = true;
> +  for (size_t i = 0; i < embeddedv.size (); i++)
> +    embeddeds[offset + i] = embeddedv[i];
> +  qfn->embeddeds = embeddeds;
> +
> +  qfn->real_names = nullptr;
>
>    lh_cu->file_names = qfn;
>  }
> @@ -1980,7 +2005,11 @@ dw2_get_real_path (dwarf2_per_objfile *per_objfile,
>         dirname = qfn->comp_dir;
>
>        gdb::unique_xmalloc_ptr<char> fullname;
> -      fullname = find_source_or_rewrite (qfn->file_names[index], dirname);
> +
> +      if (qfn->embeddeds[index])
> +       fullname.reset (embedded_fullname (dirname, qfn->file_names[index]));
> +      else
> +       fullname = find_source_or_rewrite (qfn->file_names[index], dirname);
>
>        qfn->real_names[index] = fullname.release ();
>      }
> @@ -7311,6 +7340,35 @@ find_file_and_directory (struct die_info *die, struct dwarf2_cu *cu)
>    file_and_directory res (dwarf2_string_attr (die, DW_AT_name, cu),
>                           dwarf2_string_attr (die, DW_AT_comp_dir, cu));
>
> +  /* Because the line header may tell us information about the CU
> +     filename (e.g., whether it is embedded) which will affect other
> +     calculations, we have to read that information here.  */
> +  line_header *lh = cu->line_header;
> +  struct attribute *attr = dwarf2_attr (die, DW_AT_stmt_list, cu);
> +  if (lh == nullptr && attr != nullptr && attr->form_is_unsigned ())
> +    {
> +      sect_offset line_offset = (sect_offset) attr->as_unsigned ();
> +      line_header_up lhu = dwarf_decode_line_header (line_offset, cu,
> +                                    res.get_comp_dir ());
> +      if (lhu != nullptr)
> +         lh = lhu.release();
> +    }
> +
> +  if (lh != nullptr)
> +    {
> +      for (const auto &entry : lh->file_names ())
> +       {
> +         if (entry.source == nullptr)
> +           continue;
> +
> +         std::string name_holder;
> +         const char *include_name =
> +           compute_include_file_name (lh, entry, res, name_holder);
> +         if (include_name == nullptr)
> +           res.set_embedded (true);
> +       }
> +    }
> +
>    if (res.get_comp_dir () == nullptr
>        && producer_is_gcc_lt_4_3 (cu)
>        && res.get_name () != nullptr
> @@ -18448,7 +18506,7 @@ dwarf_decode_lines_1 (struct line_header *lh, struct dwarf2_cu *cu,
>                     length =
>                       read_unsigned_leb128 (abfd, line_ptr, &bytes_read);
>                     line_ptr += bytes_read;
> -                   lh->add_file_name (cur_file, dindex, mod_time, length);
> +                   lh->add_file_name (cur_file, dindex, nullptr, mod_time, length);
>                   }
>                   break;
>                 case DW_LNE_set_discriminator:
> @@ -18603,9 +18661,12 @@ dwarf_decode_lines (struct line_header *lh, struct dwarf2_cu *cu,
>        subfile *sf = builder->get_current_subfile ();
>
>        if (sf->symtab == nullptr)
> -       sf->symtab = allocate_symtab (cust, sf->name.c_str (),
> -                                     sf->name_for_id.c_str ());
> -
> +       {
> +         sf->symtab = allocate_symtab (cust, sf->name.c_str (),
> +                                       sf->name_for_id.c_str ());
> +         if (fe.source)
> +           sf->symtab->source = fe.source;
> +       }
>        fe.symtab = sf->symtab;
>      }
>  }
> diff --git a/gdb/extension.c b/gdb/extension.c
> index 9db8b53a087..f17991a3751 100644
> --- a/gdb/extension.c
> +++ b/gdb/extension.c
> @@ -991,16 +991,19 @@ xmethod_worker::get_result_type (value *object, gdb::array_view<value *> args)
>  /* See extension.h.  */
>
>  std::optional<std::string>
> -ext_lang_colorize (const std::string &filename, const std::string &contents)
> +ext_lang_colorize (const std::string &filename, const std::string_view contents)
>  {
>    std::optional<std::string> result;
>
> +  /* We avoided copies as long as possible. The external colorization API
> +     requires a std::string.  */
> +  std::string contents_storage = std::string (contents);
>    for (const struct extension_language_defn *extlang : extension_languages)
>      {
>        if (extlang->ops == nullptr
>           || extlang->ops->colorize == nullptr)
>         continue;
> -      result = extlang->ops->colorize (filename, contents);
> +      result = extlang->ops->colorize (filename, contents_storage);
>        if (result.has_value ())
>         return result;
>      }
> diff --git a/gdb/extension.h b/gdb/extension.h
> index 5260bcbde00..e5a5333b7c3 100644
> --- a/gdb/extension.h
> +++ b/gdb/extension.h
> @@ -319,12 +319,13 @@ extern void get_matching_xmethod_workers
>     std::vector<xmethod_worker_up> *workers);
>
>  /* Try to colorize some source code.  FILENAME is the name of the file
> -   holding the code.  CONTENTS is the source code itself.  This will
> -   either a colorized (using ANSI terminal escapes) version of the
> -   source code, or an empty value if colorizing could not be done.  */
> +   holding the code.  CONTENTS is a view of the source code itself.
> +   This will either generate a colorized (using ANSI terminal escapes)
> +   version of the source code, or an empty value if colorizing could not
> +   be done.  */
>
>  extern std::optional<std::string> ext_lang_colorize
> -  (const std::string &filename, const std::string &contents);
> +  (const std::string &filename, const std::string_view contents);
>
>  /* Try to colorize a single line of disassembler output, CONTENT for
>     GDBARCH.  This will return either a colorized (using ANSI terminal
> diff --git a/gdb/source-cache.c b/gdb/source-cache.c
> index 8b5bd84d19a..f5168921f99 100644
> --- a/gdb/source-cache.c
> +++ b/gdb/source-cache.c
> @@ -93,32 +93,45 @@ set_use_gnu_source_highlight_enabled (const char *ignore_args,
>
>  /* See source-cache.h.  */
>
> -std::string
> +std::string_view
>  source_cache::get_plain_source_lines (struct symtab *s,
> -                                     const std::string &fullname)
> +                                     const std::string &fullname,
> +                                     std::string &source_storage)
>  {
> -  scoped_fd desc (open_source_file (s));
> -  if (desc.get () < 0)
> -    perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
> +  std::string_view lines;
> +  if (!s->source)
> +    {
> +
> +      scoped_fd desc (open_source_file (s));
> +      if (desc.get () < 0)
> +       perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
>
> -  struct stat st;
> -  if (fstat (desc.get (), &st) < 0)
> -    perror_with_name (symtab_to_filename_for_display (s));
> +      struct stat st;
> +      if (fstat (desc.get (), &st) < 0)
> +       perror_with_name (symtab_to_filename_for_display (s));
>
> -  std::string lines;
> -  lines.resize (st.st_size);
> -  if (myread (desc.get (), &lines[0], lines.size ()) < 0)
> -    perror_with_name (symtab_to_filename_for_display (s));
> +      source_storage.resize (st.st_size);
> +      if (myread (desc.get (),
> +                 &source_storage[0],
> +                 source_storage.size ()) < 0)
> +       perror_with_name (symtab_to_filename_for_display (s));
>
> -  time_t mtime = 0;
> -  if (s->compunit ()->objfile () != NULL
> -      && s->compunit ()->objfile ()->obfd != NULL)
> -    mtime = s->compunit ()->objfile ()->mtime;
> -  else if (current_program_space->exec_bfd ())
> -    mtime = current_program_space->ebfd_mtime;
> +      time_t mtime = 0;
> +      if (s->compunit ()->objfile () != nullptr
> +         && s->compunit ()->objfile ()->obfd != nullptr)
> +       mtime = s->compunit ()->objfile ()->mtime;
> +      else if (current_program_space->exec_bfd ())
> +       mtime = current_program_space->ebfd_mtime;
>
> -  if (mtime && mtime < st.st_mtime)
> -    warning (_("Source file is more recent than executable."));
> +      if (mtime && mtime < st.st_mtime)
> +       warning (_("Source file is more recent than executable."));
> +
> +      lines = source_storage;
> +    }
> +  else
> +    {
> +      lines = s->source;
> +    }
>

Tom mentioned lifting this check to the caller (source_cache::ensure)
but I thought it made more sense to leave it here because of the code
that calculates the offsets. If that is not reason enough to leave it
here, I am glad to take another whack at refactoring!


>    std::vector<off_t> offsets;
>    offsets.push_back (0);
> @@ -200,9 +213,10 @@ get_language_name (enum language lang)
>     succeeded.  */
>
>  static bool
> -try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
> +try_source_highlight (std::string_view &contents ATTRIBUTE_UNUSED,
>                       enum language lang ATTRIBUTE_UNUSED,
> -                     const std::string &fullname ATTRIBUTE_UNUSED)
> +                     const std::string &fullname ATTRIBUTE_UNUSED,
> +                     std::string &contents_storage)
>  {


My general approach was to use the '_storage' paradigm that I have
seen throughout the codebase. Where I can use a std::string_view, I
did. Where I ultimately needed some storage, I use those parameters.

As above, please let me know if this pattern looks okay! If it does, I
will continue to finalize based on the feedback that I received in the
previous review.

Thank you again for the welcome to the project!
Will


>  #ifdef HAVE_SOURCE_HIGHLIGHT
>    if (!use_gnu_source_highlight)
> @@ -240,10 +254,14 @@ try_source_highlight (std::string &contents ATTRIBUTE_UNUSED,
>           lang_name = detected_lang.c_str ();
>         }
>
> -      std::istringstream input (contents);
> +      /* We waited as long as possible but now we need a string.  */
> +      contents_storage = contents;
> +      std::istringstream input (contents_storage);
>        std::ostringstream output;
>        highlighter->highlight (input, output, lang_name, fullname);
> -      contents = std::move (output).str ();
> +      /* Use the given storage for the contents and set the
> +        view appropriately.  */
> +      contents = contents_storage = std::move (output).str ();
>        styled = true;
>      }
>    catch (...)
> @@ -275,13 +293,16 @@ static void gnu_source_highlight_test ()
>         "}\n");
>    const std::string fullname = "test.c";
>    std::string styled_prog;
> +  std::string_view styled_prog_view;
>
>    bool res = false;
>    bool saw_exception = false;
>    styled_prog = prog;
> +  styled_prog_view = styled_prog;
>    try
>      {
> -      res = try_source_highlight (styled_prog, language_c, fullname);
> +      res = try_source_highlight (styled_prog_view, language_c, fullname,
> +                                 styled_prog);
>      }
>    catch (...)
>      {
> @@ -324,10 +345,12 @@ source_cache::ensure (struct symtab *s)
>         }
>      }
>
> -  std::string contents;
> +  std::string_view source_contents;
> +  std::string source_contents_storage;
>    try
>      {
> -      contents = get_plain_source_lines (s, fullname);
> +      source_contents = get_plain_source_lines (s, fullname,
> +                                               source_contents_storage);
>      }
>    catch (const gdb_exception_error &e)
>      {
> @@ -339,15 +362,17 @@ source_cache::ensure (struct symtab *s)
>        && m_no_styling_files.count (fullname) == 0)
>      {
>        bool already_styled
> -       = try_source_highlight (contents, s->language (), fullname);
> +       = try_source_highlight (source_contents, s->language (), fullname,
> +                               source_contents_storage);
>
>        if (!already_styled)
>         {
>           std::optional<std::string> ext_contents;
> -         ext_contents = ext_lang_colorize (fullname, contents);
> +         ext_contents = ext_lang_colorize (fullname, source_contents);
>           if (ext_contents.has_value ())
>             {
> -             contents = std::move (*ext_contents);
> +             source_contents = source_contents_storage
> +               = std::move (*ext_contents);
>               already_styled = true;
>             }
>         }
> @@ -369,7 +394,7 @@ source_cache::ensure (struct symtab *s)
>         }
>      }
>
> -  source_text result = { std::move (fullname), std::move (contents) };
> +  source_text result = { std::move (fullname), std::string (source_contents) };
>    m_source_map.push_back (std::move (result));
>
>    if (m_source_map.size () > MAX_ENTRIES)
> diff --git a/gdb/source-cache.h b/gdb/source-cache.h
> index d4cb7d00ae8..84996a6a5a1 100644
> --- a/gdb/source-cache.h
> +++ b/gdb/source-cache.h
> @@ -80,11 +80,13 @@ class source_cache
>      std::string contents;
>    };
>
> -  /* A helper function for get_source_lines reads a source file.
> -     Returns the contents of the file; or throws an exception on
> -     error.  This also updates m_offset_cache.  */
> -  std::string get_plain_source_lines (struct symtab *s,
> -                                     const std::string &fullname);
> +  /* A helper function for get_source_lines that reads a source file.
> +     Returns a view of the contents of the file using SOURCE_STORAGE
> +     as necessary (i.e., if the source is stored on disk);  or throws
> +     an exception on error.  This also updates m_offset_cache.  */
> +  std::string_view get_plain_source_lines (struct symtab *s,
> +                                          const std::string &fullname,
> +                                          std::string &source_storage);
>
>    /* A helper function that the data for the given symtab is entered
>       into both caches.  Returns false on error.  */
> diff --git a/gdb/source.c b/gdb/source.c
> index bbeb4154258..8bfaffe43a8 100644
> --- a/gdb/source.c
> +++ b/gdb/source.c
> @@ -684,9 +684,11 @@ info_source_command (const char *ignore, int from_tty)
>
>    cust = s->compunit ();
>    gdb_printf (_("Current source file is %s\n"), s->filename);
> -  if (s->compunit ()->dirname () != NULL)
> +  if (s->compunit ()->dirname () != nullptr)
>      gdb_printf (_("Compilation directory is %s\n"), s->compunit ()->dirname ());
> -  if (s->fullname)
> +  if (s->source != nullptr)
> +    gdb_printf (_("With embedded source.\n"));
> +  else if (s->fullname)
>      gdb_printf (_("Located in %s\n"), s->fullname);
>    const std::vector<off_t> *offsets;
>    if (g_source_cache.get_line_charpos (s, &offsets))
> @@ -960,6 +962,19 @@ source_full_path_of (const char *filename,
>    return 1;
>  }
>
> +/* See source.h.  */
> +
> +char *
> +embedded_fullname (const char *dirname, const char *filename)
> +{
> +  if (dirname != nullptr)
> +    {
> +      return concat (dirname, SLASH_STRING, filename, (char *) nullptr);
> +    }
> +
> +  return xstrdup (filename);
> +}
> +
>  /* Return non-zero if RULE matches PATH, that is if the rule can be
>     applied to PATH.  */
>
> @@ -1237,27 +1252,35 @@ symtab_to_fullname (struct symtab *s)
>    /* Use cached copy if we have it.
>       We rely on forget_cached_source_info being called appropriately
>       to handle cases like the file being moved.  */
> -  if (s->fullname == NULL)
> +  if (s->fullname == nullptr)
>      {
> -      scoped_fd fd = open_source_file (s);
> -
> -      if (fd.get () < 0)
> +      if (s->source)
> +         s->fullname = embedded_fullname (s->compunit ()->dirname (),
> +                                          s->filename);
> +      else
>         {
> -         gdb::unique_xmalloc_ptr<char> fullname;
> +         scoped_fd fd = open_source_file (s);
>
> -         /* rewrite_source_path would be applied by find_and_open_source, we
> -            should report the pathname where GDB tried to find the file.  */
> +         if (fd.get () < 0)
> +           {
> +             gdb::unique_xmalloc_ptr<char> fullname;
>
> -         if (s->compunit ()->dirname () == nullptr
> -             || IS_ABSOLUTE_PATH (s->filename))
> -           fullname.reset (xstrdup (s->filename));
> -         else
> -           fullname.reset (concat (s->compunit ()->dirname (), SLASH_STRING,
> -                                   s->filename, (char *) NULL));
> +             /* rewrite_source_path would be applied by find_and_open_source,
> +                we should report the pathname where GDB tried to find the
> +                file.  */
>
> -         s->fullname = rewrite_source_path (fullname.get ()).release ();
> -         if (s->fullname == NULL)
> -           s->fullname = fullname.release ();
> +             if (s->compunit ()->dirname () == nullptr
> +                 || IS_ABSOLUTE_PATH (s->filename))
> +               fullname.reset (xstrdup (s->filename));
> +             else
> +               fullname.reset (concat (s->compunit ()->dirname (),
> +                                       SLASH_STRING, s->filename,
> +                                       (char *) nullptr));
> +
> +             s->fullname = rewrite_source_path (fullname.get ()).release ();
> +             if (s->fullname == nullptr)
> +               s->fullname = fullname.release ();
> +           }
>         }
>      }
>
> @@ -1317,12 +1340,17 @@ print_source_lines_base (struct symtab *s, int line, int stopline,
>        else
>         {
>           last_source_visited = s;
> -         scoped_fd desc = open_source_file (s);
> -         last_source_error = desc.get () < 0;
> -         if (last_source_error)
> +         /* Do not attempt to open a source file for a symtab
> +            with an embedded source.  */
> +         if (!s->source)
>             {
> -             noprint = true;
> -             errcode = -desc.get ();
> +             scoped_fd desc = open_source_file (s);
> +             last_source_error = desc.get () < 0;
> +             if (last_source_error)
> +               {
> +                 noprint = true;
> +                 errcode = -desc.get ();
> +               }
>             }
>         }
>      }
> diff --git a/gdb/source.h b/gdb/source.h
> index 144ee48f722..f27caf1a9b5 100644
> --- a/gdb/source.h
> +++ b/gdb/source.h
> @@ -216,4 +216,8 @@ extern void forget_cached_source_info (void);
>     need to would make things slower than necessary.  */
>  extern void select_source_symtab ();
>
> +/* Compute the fullname for a source file whose source is embedded
> +   in the dwarf file.  */
> +extern char *embedded_fullname (const char *dirname,
> +                               const char *filename);
>  #endif
> diff --git a/gdb/symmisc.c b/gdb/symmisc.c
> index 49b9674f77a..f5092de2659 100644
> --- a/gdb/symmisc.c
> +++ b/gdb/symmisc.c
> @@ -810,9 +810,11 @@ maintenance_info_symtabs (const char *regexp, int from_tty)
>                     gdb_printf ("((struct symtab *) %s)\n",
>                                 host_address_to_string (symtab));
>                     gdb_printf ("\t  fullname %s\n",
> -                               symtab->fullname != NULL
> +                               symtab->fullname != nullptr
>                                 ? symtab->fullname
>                                 : "(null)");
> +                   if (symtab->source != nullptr)
> +                     gdb_printf ("\t  source embedded in DWARF\n");
>                     gdb_printf ("\t  "
>                                 "linetable ((struct linetable *) %s)\n",
>                                 host_address_to_string
> @@ -955,15 +957,20 @@ maintenance_print_one_line_table (struct symtab *symtab, void *data)
>    gdb_printf (_("compunit_symtab: %s ((struct compunit_symtab *) %s)\n"),
>               symtab->compunit ()->name,
>               host_address_to_string (symtab->compunit ()));
> +  styled_string_s *styled_symtab_fullname = nullptr;
> +  if (symtab->source)
> +    styled_symtab_fullname = styled_string (metadata_style.style (),
> +                                           _("Source embedded in DWARF"));
> +  else
> +    styled_symtab_fullname = styled_string (file_name_style.style (),
> +                                           symtab_to_fullname (symtab));
>    gdb_printf (_("symtab: %ps ((struct symtab *) %s)\n"),
> -             styled_string (file_name_style.style (),
> -                            symtab_to_fullname (symtab)),
> +             styled_symtab_fullname,
>               host_address_to_string (symtab));
>    linetable = symtab->linetable ();
>    gdb_printf (_("linetable: ((struct linetable *) %s):\n"),
>               host_address_to_string (linetable));
> -
> -  if (linetable == NULL)
> +  if (linetable == nullptr)
>      gdb_printf (_("No line table.\n"));
>    else if (linetable->nitems <= 0)
>      gdb_printf (_("Line table has no lines.\n"));
> diff --git a/gdb/symtab.h b/gdb/symtab.h
> index bf9a3cfb79f..90c2f202390 100644
> --- a/gdb/symtab.h
> +++ b/gdb/symtab.h
> @@ -1755,6 +1755,11 @@ struct symtab
>
>    const char *filename;
>
> +  /* When the contents of the source file were/are embedded with the
> +     debugging info, this pointer will refer to that source. Can be nullptr
> +     if the source for this source file is on disk.  */
> +  const char *source;
> +
>    /* Filename for this source file, used as an identifier to link with
>       related objects such as associated macro_source_file objects.  It must
>       therefore match the name of any macro_source_file object created for this
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
> new file mode 100644
> index 00000000000..86cdb9655b7
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.c
> @@ -0,0 +1,24 @@
> +/* Copyright 2024 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +
> +int
> +main (void)
> +{                                                      /* main prologue */
> +  asm ("main_label: .global main_label");
> +  int m = 42;                                          /* main assign m */
> +  asm ("main_end: .global main_end");                  /* main end */
> +  return m;
> +}
> diff --git a/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
> new file mode 100644
> index 00000000000..73a06bd7b9c
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/dw2-lnct-source.exp
> @@ -0,0 +1,88 @@
> +# Copyright 2024 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +# Check that GDB can honor LNCT_llvm_SOURCE.
> +
> +load_lib dwarf.exp
> +
> +# This test can only be run on targets which support DWARF-2 and use gas.
> +require dwarf2_support
> +
> +standard_testfile .c .S
> +
> +set asm_file [standard_output_file $srcfile2]
> +
> +set fp [open "${srcdir}/${subdir}/${srcfile}" r]
> +set srcfile_data [read $fp]
> +close $fp
> +
> +Dwarf::assemble $asm_file {
> +    global srcdir subdir srcfile srcfile2 srcfile_data
> +    declare_labels lines_label
> +
> +    get_func_info main
> +
> +    cu {} {
> +       compile_unit {
> +           {language @DW_LANG_C}
> +           {name missing-file.c}
> +           {stmt_list ${lines_label} DW_FORM_sec_offset}
> +       } {
> +           subprogram {
> +               {external 1 flag}
> +               {name main}
> +               {low_pc $main_start addr}
> +               {high_pc "$main_start + $main_len" addr}
> +           }
> +       }
> +    }
> +
> +    lines {version 5} lines_label {
> +       set diridx [include_dir "${srcdir}/${subdir}"]
> +       file_name "missing-file.c" $diridx "${srcfile_data}"
> +
> +       program {
> +           DW_LNS_set_file $diridx
> +           DW_LNE_set_address $main_start
> +           line [gdb_get_line_number "main prologue"]
> +           DW_LNS_copy
> +
> +           DW_LNE_set_address main_label
> +           line [gdb_get_line_number "main assign m"]
> +           DW_LNS_copy
> +
> +           DW_LNE_set_address main_end
> +           line [gdb_get_line_number "main end"]
> +           DW_LNS_copy
> +
> +           DW_LNE_end_sequence
> +       }
> +    }
> +}
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} \
> +         [list $srcfile $asm_file] {nodebug}] } {
> +    return -1
> +}
> +
> +if ![runto_main] {
> +    return -1
> +}
> +
> +set assign_m_line [gdb_get_line_number "main assign m"]
> +gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*"
> +gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in DWARF.*"
> +gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source embedded in DWARF.*"
> +gdb_test "info source" ".*With embedded source.*"
> diff --git a/gdb/testsuite/lib/dwarf.exp b/gdb/testsuite/lib/dwarf.exp
> index d085f835f07..b1f09eab3d3 100644
> --- a/gdb/testsuite/lib/dwarf.exp
> +++ b/gdb/testsuite/lib/dwarf.exp
> @@ -2423,9 +2423,9 @@ namespace eval Dwarf {
>         # Add a file name entry to the line table header's file names table.
>         #
>         # Return the index by which this entry can be referred to.
> -       proc file_name {filename diridx} {
> +       proc file_name {filename diridx { source "" } } {
>             variable _line_file_names
> -           lappend _line_file_names $filename $diridx
> +           lappend _line_file_names $filename $diridx $source
>
>             if { $Dwarf::_line_unit_version >= 5 } {
>                 return [expr [llength $_line_file_names] - 1]
> @@ -2481,7 +2481,7 @@ namespace eval Dwarf {
>                     }
>                 }
>
> -               _op .byte 2 "file_name_entry_format_count"
> +               _op .byte 3 "file_name_entry_format_count"
>                 _op .uleb128 1 \
>                     "file_name_entry_format (content type code: DW_LNCT_path)"
>                 switch $_line_string_form {
> @@ -2494,15 +2494,21 @@ namespace eval Dwarf {
>                             "directory_entry_format (form: DW_FORM_line_strp)"
>                     }
>                 }
> +
>                 _op .uleb128 2 \
>                     "file_name_entry_format (content type code: DW_LNCT_directory_index)"
>                 _op .uleb128 0x0f \
>                     "file_name_entry_format (form: DW_FORM_udata)"
>
> -               set nr_files [expr [llength $_line_file_names] / 2]
> +               _op .uleb128 0x2001 \
> +                   "file_name_entry_format (content type code: DW_LNCT_LLVM_SOURCE)"
> +               _op .uleb128 0x08 \
> +                   "file_name_entry_format (form: DW_FORM_string)"
> +
> +               set nr_files [expr [llength $_line_file_names] / 3]
>                 _op .byte $nr_files "file_names_count"
>
> -               foreach { filename diridx } $_line_file_names {
> +               foreach { filename diridx source } $_line_file_names {
>                     switch $_line_string_form {
>                         string {
>                             _op .ascii [_quote $filename]
> @@ -2517,6 +2523,7 @@ namespace eval Dwarf {
>                         }
>                     }
>                     _op .uleb128 $diridx
> +                   _op .ascii [_quote [string map { "\"" "\\\"" "\n" "\\n" } $source]]
>                 }
>             } else {
>                 foreach dirname $_line_include_dirs {
> @@ -2525,7 +2532,7 @@ namespace eval Dwarf {
>
>                 _op .byte 0 "Terminator (include_directories)"
>
> -               foreach { filename diridx } $_line_file_names {
> +               foreach { filename diridx source } $_line_file_names {
>                     _op .ascii [_quote $filename]
>                     _op .sleb128 $diridx
>                     _op .sleb128 0 "mtime"
> diff --git a/include/dwarf2.h b/include/dwarf2.h
> index b3d3731ee83..3823c041bab 100644
> --- a/include/dwarf2.h
> +++ b/include/dwarf2.h
> @@ -289,6 +289,11 @@ enum dwarf_line_number_content_type
>      DW_LNCT_size = 0x4,
>      DW_LNCT_MD5 = 0x5,
>      DW_LNCT_lo_user = 0x2000,
> +    /* LLVM has implemented DW_LNCT_source (see
> +       https://dwarfstd.org/issues/180201.1.html) as DW_LNCT_LLVM_SOURCE as
> +       a vendor extension until the DWARF standard is updated (see
> +       https://github.com/llvm/llvm-project/blob/08bb121835be432ac52372f92845950628ce9a4a/llvm/include/llvm/BinaryFormat/Dwarf.def#L1080 .  */
> +    DW_LNCT_LLVM_SOURCE = 0x2001,
>      DW_LNCT_hi_user = 0x3fff
>    };
>
> --
> 2.44.0
>

      reply	other threads:[~2024-04-05 16:18 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-05 16:13 Will Hawkins
2024-04-05 16:18 ` Will Hawkins [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADx9qWhO5hAiA-wUAYcS0=OHGKFNas+P61PJ_qsXpYpkmMTOUQ@mail.gmail.com' \
    --to=hawkinsw@obs.cr \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).