From: "dblaikie at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug debug/99178] Emit .debug_names
Date: Wed, 10 Jan 2024 18:51:05 +0000
Message-ID: <bug-99178-4-VJBIXa9QZx@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99178-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99178

--- Comment #6 from David Blaikie <dblaikie at gmail dot com> ---
(In reply to Tom Tromey from comment #5)
> (In reply to David Blaikie from comment #4)
> 
> I don't remember filing this bug.  At the time maybe I thought it
> would be worthwhile to have "end to end" .debug_names generation,
> that is, to try to have the index and also not slow down
> compilation or link times too much.  Not sure how I feel about it now.

That's certainly what's been possible with .debug_gnu_pubnames/types +
-Wl,--gdb-index today. It'd be nice to have that same workflow, but in a more
portable form.

> > It'd be great to get GCC/GDB folks take on the name tables - get some
> > practical experience with their contents, file any bugs about missing
> > elements, etc. It's possible they're leaning too heavily towards lldb's
> > style of name lookup since they derived from an existing apple
> > implementation there & it'd be good to generalize them where needed.
> 
> gdb has long done the wrong thing with .debug_names.
> https://sourceware.org/bugzilla/show_bug.cgi?id=24820

Ah, thanks for the link - I followed up there with some context/thoughts.

> I don't really know how/why that happened.  However, I wrote patches to
> fix it:
> 
> https://sourceware.org/pipermail/gdb-patches/2023-December/204949.html
> 
> This version of gdb will look at the augmentation string and only
> allow certain indexes to be used.  This is done to avoid known bugs --
> mainly coming from the "old" (current) gdb, but also clang has some
> issues (from memory, it doesn't include parent info).

Ideally that'd be detected by looking at the abbreviation table, rather than
the augmentation string - if parent info is necessary for a usage of the table,
that'd be a generic way to check for it & ensure the unusable indexes are
ignored while not ignoring usable ones.
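
To make the abbreviation-table idea concrete, here is a rough sketch in Python.
It assumes the raw bytes of a .debug_names abbreviation table are already in
hand (locating them via the unit header is not shown), and it uses the DWARF 5
encoding DW_IDX_parent = 4. The function names are made up for illustration,
and the "every abbreviation must carry DW_IDX_parent" policy is just one
possible rule, not what gdb or any other consumer actually does.

```python
DW_IDX_parent = 4  # DWARF 5 name index attribute encoding


def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def parse_abbrevs(data):
    """Return {abbrev_code: (tag, [(idx_attr, form), ...])}.

    Each declaration is: code, tag, then (index attribute, form) pairs
    ending with a 0,0 pair. An abbrev code of 0 ends the table.
    """
    abbrevs = {}
    pos = 0
    while True:
        code, pos = read_uleb128(data, pos)
        if code == 0:
            return abbrevs
        tag, pos = read_uleb128(data, pos)
        attrs = []
        while True:
            idx, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if idx == 0 and form == 0:
                break
            attrs.append((idx, form))
        abbrevs[code] = (tag, attrs)


def index_has_parent_info(abbrev_bytes):
    """Hypothetical policy: usable only if every abbrev has DW_IDX_parent."""
    abbrevs = parse_abbrevs(abbrev_bytes)
    return bool(abbrevs) and all(
        any(idx == DW_IDX_parent for idx, _form in attrs)
        for _tag, attrs in abbrevs.values()
    )
```

A consumer could run a check like this once per index and fall back to its own
scanner when it fails, independent of which producer wrote the table.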

> Also, gdb relies on its extensions (see below).

Ah, but yeah, if you need extensions, then positive augmentation string
checking seems likely necessary.

(though this starts to feel like websites checking browser versions,
unfortunately :/ )

> When writing the new scanner, I found a few bugs in DWARF related to
> which names appear in the index.  I don't recall offhand what these are,
> and I didn't file them due to the late unpleasantness (sorry).  

No worries - and totally understandable. If they happen to come to mind at any
point, I'd love to hear about them.

> They could probably be dug up by reading the scanner and comparing to the spec.
> 
> gdb will also emit some extensions.  You can see these in the docs patch:
> 
> https://sourceware.org/pipermail/gdb-patches/2023-December/204947.html

Awesome - appreciate the documentation!

> Generally the goal of these is to avoid having to do any DIE reading
> in order to reject a lookup.  Note that this means gdb relies on
> template parameters being in the name -- something we discussed in
> gdb bugzilla a bit.

Yeah, I'd love to figure out how to deal with that better, but don't have
immediate suggestions.

Any sense of how bad the performance is if names without template parameters
(strawman: this could be communicated via another flag on the entry in the
index) did require DWARF parsing to check template parameters? Is that
something that'd be an option? (especially with a flag in the entry, then it'd
only be a runtime cost to those using this naming mechanism - as much as I'd
like to move to that mechanism being normal/the default, perhaps this would be
a safe transition path)
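
To sketch the strawman in Python: everything below is hypothetical, including
the `simplified` flag, the `IndexEntry` layout, and the `read_full_name`
callback standing in for "parse the DIE and rebuild the name with its template
parameters". The point is only that entries without the flag never pay for
DIE reading.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class IndexEntry:
    name: str          # name as stored in the index
    die_offset: int    # where the DIE lives
    simplified: bool   # hypothetical flag: template parameters omitted


def strip_template_args(name: str) -> str:
    """Naively drop one trailing '<...>' group ("vec<int>" -> "vec").

    Does not try to handle names like operator< or operator>.
    """
    if not name.endswith(">"):
        return name
    depth = 0
    for i in range(len(name) - 1, -1, -1):
        if name[i] == ">":
            depth += 1
        elif name[i] == "<":
            depth -= 1
            if depth == 0:
                return name[:i]
    return name


def lookup(entries: Iterable[IndexEntry], wanted: str,
           read_full_name: Callable[[int], str]) -> List[IndexEntry]:
    """Match full names directly; consult DWARF only for flagged entries."""
    base = strip_template_args(wanted)
    matches = []
    for entry in entries:
        if not entry.simplified:
            if entry.name == wanted:
                matches.append(entry)
        elif entry.name == base and read_full_name(entry.die_offset) == wanted:
            matches.append(entry)
    return matches
```

The runtime cost then scales with the number of flagged entries sharing a base
name, which is the number I'd be curious to see measured.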

But I guess Google's probably the only one super interested in the simplified
template names at the moment (& we're mostly investing in lldb these days), so
might be unlikely anyone would be signing up to do that work in gdb.

> With these patches, gdb will not generate or use the hash table.
> This is explained here:
> 
> https://sourceware.org/pipermail/gdb-patches/2023-December/204963.html

Oh, that's got some good details/answers some of my questions - thanks!

> I consider this hash table to be essentially useless in general, due to
> the name canonicalization problem -- while DWARF says that writers
> should use the system demangling style, (1) this style varies across
> systems, so it can't truly be relied on; and (2) at least GCC and one
> other compiler don't actually follow this part of the spec anyway.

Hmm, I missed a step here - perhaps you can help me understand. Maybe,
ultimately, I agree with you here - I've pushed back on the lldb folks relying
on character identical name lookup in the index due to the problems you've
described (there's no real canonical demangling) - but where does DWARF say
that writers should "use the system demangling style"?

> It's important to note, though, that even if the hash was somehow
> useful, GDB probably still would not use it -- a sorted list of names
> is needed for completion and performs reasonably well for other
> lookups, so a hash table is just overhead, IMO.

Oh, that makes loads of sense - yeah, beats me how lldb deals with tab
completion using the hash table... maybe it builds some other side table or
something. That's something I've wondered about for a while and it's good to
know how GDB deals with this, and why its index looks different.

Does that mean you want the entries in the table to be sorted? Do you emit them
that way and then, based on augmentation string, rely on them being sorted? Or
do a quick scan at startup and build a sorted list in memory regardless of the
order in .debug_names? (.debug_names entry list isn't suited to random access,
is it? The records aren't all the same size so I don't think you could binary
search through them)
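
For the "scan at startup and build a sorted list in memory" option, this is
roughly what I have in mind, sketched in Python. It assumes the names and DIE
offsets have already been pulled out of the index in whatever order they were
stored; `build_sorted_names` and `complete` are invented names, and I'm not
claiming this is what GDB does.

```python
import bisect


def build_sorted_names(entries):
    """Sort (name, die_offset) pairs once, regardless of on-disk order."""
    return sorted(entries)


def complete(sorted_entries, prefix):
    """Return the distinct names starting with prefix, in sorted order."""
    # A 1-tuple sorts before any (prefix, offset) pair, so this finds the
    # first entry whose name is >= prefix.
    start = bisect.bisect_left(sorted_entries, (prefix,))
    names = []
    for name, _offset in sorted_entries[start:]:
        if not name.startswith(prefix):
            break
        if not names or names[-1] != name:
            names.append(name)
    return names
```

With that in place, tab completion is a binary search plus a short linear walk,
and exact lookup is the same search with the full name as the prefix.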


> The only other thing I can think of offhand is that the reliance on
> .debug_str means that gdb may have to augment the string table when
> DW_FORM_string is in use.  This is also caused by the "(anonymous namespace)"
> special case.

And Split DWARF, I guess? The strings wouldn't otherwise be in the executable's
.debug_str if not for the index - they'd only appear in the dwo/dwp
.debug_str.dwo sections.
