public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom Tromey <tom@tromey.com>
To: Will Hawkins <hawkinsw@obs.cr>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH v2] gdb: Support embedded source in DWARF
Date: Wed, 03 Apr 2024 14:34:53 -0600	[thread overview]
Message-ID: <87edbm6v82.fsf@tromey.com> (raw)
In-Reply-To: <20240403165713.452036-1-hawkinsw@obs.cr> (Will Hawkins's message of "Wed, 3 Apr 2024 12:57:10 -0400")

>>>>> "Will" == Will Hawkins <hawkinsw@obs.cr> writes:

Will> While DW_LNCT_source is not yet finalized in the DWARF standard
Will> (https://dwarfstd.org/issues/180201.1.html), LLVM does emit it.

Will> This patch adds support for it in gdb.

Will> Tested on x86_64-redhat-linux.

Will> Signed-off-by: Will Hawkins <hawkinsw@obs.cr>

Thanks for the patch.
It looks basically reasonable to me but I found a few oddities and then
also some of the normal stylistic nits.

Will> +      m_is_embedded (false)

It's better to use inline initialization, like:

Will> +  /* Whether the file's source is embedded in the dwarf.  */
Will> +  bool m_is_embedded;

   bool m_is_embedded = false;

Will> +	    case DW_LNCT_LLVM_SOURCE:
Will> +	      if (string.has_value () && strlen (*string))
Will> +		fe.source = *string;
Will> +	      break;

If the source can be large then it seems bad to spend time computing the
strlen, when just checking if the first byte is '\0' gives the same
answer in constant time.

Will> +  /* Whether or not the file names refer to sources embedded
Will> +     in the dwarf.  */
Will> +  bool *embeddeds;

Probably should be 'const bool *' since IIUC this is set up once and
then never written to again.

Will> +	      if (entry.source != NULL)
Will> +		embeddeds.push_back (false);
Will> +	      else
Will> +		embeddeds.push_back(true);

embeddeds.push_back (entry.source == nullptr);

Will> +  qfn->embeddeds =
Will> +    XOBNEWVEC (&per_objfile->per_bfd->obstack, bool,

gdb puts '=' on the continuation line.

Will> +	fullname.reset (embedded_fullname (dirname, qfn->file_names[index]));

embedded_fullname should return a unique_xmalloc_ptr so that this spot
can be an assignment.  Maybe one spot needs a '.release ()' then.

It's better to return these self-managing objects, since it makes leaks
less likely.

Will> +  /* Because the line header may tell us information about the CU
Will> +     filename (e.g., whether it is embedded) which will affect other
Will> +     calculations, we have to read that information here.  */
Will> +  line_header_up lh;
Will> +  lh.reset (cu->line_header);

This will wind up deleting cu->line_header, which I assume is not
intended.  For one thing, dwarf2_cu::line_header seems to have a kind of
complicated ownership situation.

In gdb the "_up" suffix on a type means "unique pointer"; i.e., 'lh'
will delete the memory when its destructor is run.

Will> +	  const char *include_name =
Will> +	    compute_include_file_name (lh.get (), entry, res, name_holder);

'=' on 2nd line.

Will> +	  if (!include_name )

gdb tends to spell this out, like: include_name == nullptr

Will> +	{
Will> +	  sf->symtab = allocate_symtab (cust, sf->name.c_str (),
Will> +					sf->name_for_id.c_str ());
Will> +	  if (fe.source)
Will> +	    sf->symtab->source = cu->per_objfile->objfile->intern (fe.source);

You don't want to intern the source here.  That makes a copy.  If the
source is in one of the debug sections, those are mapped during DWARF
reading and not unmapped until the per-BFD object is destroyed.  Objects
like this can safely be referred to by other symbol table data
structure; there are a lot of cases of this in gdb.

Will>  source_cache::get_plain_source_lines (struct symtab *s,
Will>  				      const std::string &fullname)
Will>  {
Will> -  scoped_fd desc (open_source_file (s));
Will> -  if (desc.get () < 0)
Will> -    perror_with_name (symtab_to_filename_for_display (s), -desc.get ());
Will> +  std::string lines;

[...]

Will> +  else
Will> +    {
Will> +      lines = s->source;
Will> +    }

gdb doesn't use braces when there's just a single statemnt.

However, I suspect you will want a different factoring here.  This line
makes a copy of the entire source text -- but a copy isn't needed here.

Currently, get_plain_source_lines is written this way because it is
reading from the filesystem, so the data has to go somewhere.

Instead, what if the caller (source_cache::ensure) did the checking
here?  Then it could make a string_view for the text -- either the
saved source or the string returned by get_plain_source_lines.

Then try_source_highlight and ext_lang_colorize could be changed to
accept a string view instead.  (I'm not 100% sure this works, maybe just
passing a 'const char *' is better, it depends on if this code assumes
\0-termination.)

source_cache::source_text will also need some kind of update, but that
doesn't seem so bad either.

This approach should avoid copying as much as possible.  It's maybe not
possible to completely avoid it (I didn't look to see what the Pygments
call does); but at least we can avoid it where it's not needed.

Will> +  if (s->source != NULL)

We're slowly switching to nullptr.  There's a few cases of this.

Will> diff --git a/gdb/symtab.h b/gdb/symtab.h
Will> index bf9a3cfb79f..1f871ee0a74 100644
Will> --- a/gdb/symtab.h
Will> +++ b/gdb/symtab.h
Will> @@ -1755,6 +1755,8 @@ struct symtab
 
Will>    const char *filename;
 
Will> +  const char *source;

This should have a comment.  Older code may sometimes be missing one,
but we try to add one for anything new.

Will> +# Copyright 2022-2024 Free Software Foundation, Inc.

Probably should just say 2024.

Will> +set assign_m_line [gdb_get_line_number "main assign m"]
Will> +gdb_test "frame" ".*main \\\(\\\) at \[^\r\n\]*:$assign_m_line\r\n.*"
Will> +gdb_test "maintenance info symtabs missing-file.c" ".*source embedded in DWARF.*"
Will> +gdb_test "maintenance info line-table missing-file.c" ".*symtab: Source embedded in DWARF.*"
Will> +gdb_test "info source" ".*With embedded source.*"

Probably some of these lines need to be broken up.

Also I wonder if the output here is voluminous.  That can cause problems
unless you use gdb_test_multiple.

Will> diff --git a/include/dwarf2.h b/include/dwarf2.h
Will> index b3d3731ee83..bf8bdcd608f 100644
Will> --- a/include/dwarf2.h
Will> +++ b/include/dwarf2.h
Will> @@ -289,6 +289,7 @@ enum dwarf_line_number_content_type
Will>      DW_LNCT_size = 0x4,
Will>      DW_LNCT_MD5 = 0x5,
Will>      DW_LNCT_lo_user = 0x2000,
Will> +    DW_LNCT_LLVM_SOURCE = 0x2001,

I think this needs some kind of comment as well.  Maybe a link to
whatever spec or docs there are.  Look in dwarf2.{h,def} for "http" to
see some examples.

Tom

  reply	other threads:[~2024-04-03 20:34 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-03 16:57 Will Hawkins
2024-04-03 20:34 ` Tom Tromey [this message]
2024-04-03 21:20   ` Will Hawkins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87edbm6v82.fsf@tromey.com \
    --to=tom@tromey.com \
    --cc=gdb-patches@sourceware.org \
    --cc=hawkinsw@obs.cr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).