public inbox for gdb-prs@sourceware.org
From: "dblaikie at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: gdb-prs@sourceware.org
Subject: [Bug symtab/24820] .debug_names has incorrect contents
Date: Wed, 10 Jan 2024 18:37:30 +0000
Message-ID: <bug-24820-4717-PfpFPLFETr@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-24820-4717@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=24820

--- Comment #11 from David Blaikie <dblaikie at gmail dot com> ---
Hey Tom - catching up on this bug & giving whatever context I have from DWARF
and Clang.

It'd be great to get this working for Clang/GCC/GDB/LLDB to have a portable
index that provides good performance for at least both GDB and LLDB, and maybe
other consumers - but two would at least be a good start.

If there are bugs in clang, we're interested in fixing them.
If there are bugs/unclear things in the DWARF spec, we're interested in fixing
them.

> However, it is required for .debug_names to work properly.

I don't think that's true - at least it doesn't seem to be true for lldb. Maybe
this dovetails into:

> gdb needs to know the CU language but this isn't in the tables. gdb can emit it as an extension of course.  I suppose we could read the first DIE of every CU -- but that's precisely the kind of work we want to avoid.

Yeah, that's what I'd recommend for the language and address ranges - reading
the first DIE of each CU. My experience is that's still quite cheap and quite a
different proposition compared to "read all the DIEs and discover all the names
from scratch".
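
To make "read only the first DIE" concrete, here is a minimal sketch of pulling
DW_AT_language out of a CU's top-level DIE. It assumes 32-bit DWARFv5,
little-endian data, and only fixed-size data forms; the helper names
(read_uleb128, find_abbrev, cu_language) and the synthetic bytes are mine for
illustration, not anything gdb or lldb actually uses.

```python
import struct

# Constants from the DWARF v5 specification.
DW_AT_language = 0x13
DW_FORM_data1 = 0x0B
DW_FORM_data2 = 0x05
# Illustrative subset only: a real reader must handle every form.
FIXED_FORM_SIZES = {DW_FORM_data1: 1, DW_FORM_data2: 2}


def read_uleb128(data, pos):
    """Decode one ULEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def find_abbrev(abbrev, pos, wanted_code):
    """Return (tag, [(attr, form), ...]) for one abbreviation code."""
    while True:
        code, pos = read_uleb128(abbrev, pos)
        if code == 0:  # end of this abbreviation table
            raise KeyError(wanted_code)
        tag, pos = read_uleb128(abbrev, pos)
        pos += 1  # skip the one-byte has-children flag
        specs = []
        while True:
            attr, pos = read_uleb128(abbrev, pos)
            form, pos = read_uleb128(abbrev, pos)
            if attr == 0 and form == 0:
                break
            specs.append((attr, form))
        if code == wanted_code:
            return tag, specs


def cu_language(info, abbrev, cu_offset=0):
    """Read DW_AT_language from the first DIE of the CU at cu_offset."""
    _length, version, _unit_type, _addr_size, abbrev_off = struct.unpack_from(
        "<IHBBI", info, cu_offset)
    if version != 5:
        raise ValueError("this sketch only handles DWARF v5 unit headers")
    pos = cu_offset + 12  # the v5 header is 12 bytes in 32-bit DWARF
    code, pos = read_uleb128(info, pos)
    _tag, specs = find_abbrev(abbrev, abbrev_off, code)
    for attr, form in specs:
        size = FIXED_FORM_SIZES[form]  # KeyError on forms not covered here
        value = int.from_bytes(info[pos:pos + size], "little")
        pos += size
        if attr == DW_AT_language:
            return value
    return None


# Synthetic one-CU input: abbrev 1 = DW_TAG_compile_unit with a single
# DW_AT_language attribute encoded as DW_FORM_data2.
abbrev = bytes([0x01, 0x11, 0x01, 0x13, 0x05, 0x00, 0x00, 0x00])
die = bytes([0x01, 0x04, 0x00])  # abbrev code 1, DW_LANG_C_plus_plus (0x0004)
body = struct.pack("<HBBI", 5, 0x01, 8, 0) + die
info = struct.pack("<I", len(body)) + body
print(cu_language(info, abbrev))
```

The point is only that the work per CU is one header, one abbreviation lookup,
and a handful of attribute reads, with no walk over the DIE tree.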

(I wouldn't be averse to a new DWARFv5+ version of "aranges" that points to the
rnglist the CU points to anyway - it'd add some size overhead compared to
discovering the ranges by reading the first DIE, but lookups would be slightly
faster. The current aranges, though, are really verbose and totally duplicate
the rnglist for the CU - Clang hasn't emitted aranges by default for a decade
or so because the size/perf tradeoff wasn't great.)

> Oh, finally, the hash table is useless to gdb, so I think gdb should simply not emit it.

That surprises me - does .gdb_index not contain some kind of fast lookup? Or is
it a flat list that gdb loads on startup into a hash table?

Could gdb benefit from using the hash table, or are there some architectural
reasons that's infeasible?
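
For reference, a rough sketch of the kind of lookup the .debug_names hash table
is meant to enable. The hash is the DJB function the DWARFv5 spec defines for
the name index; the ASCII lowercasing here is only an approximation of the
spec's case folding, and the in-memory layout (build_index, lookup, Python
lists instead of section offsets) is a toy of my own, not the on-disk format.

```python
def djb_hash(name):
    """32-bit DJB hash over the (approximately) case-folded name."""
    h = 5381
    for byte in name.lower().encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def build_index(names, bucket_count):
    """Toy index: names grouped by bucket, buckets hold 1-based start slots."""
    unique = list(dict.fromkeys(names))
    ordered = sorted(unique, key=lambda n: djb_hash(n) % bucket_count)
    hashes = [djb_hash(n) for n in ordered]
    buckets = [0] * bucket_count  # 0 means "empty bucket"
    for i, h in enumerate(hashes):
        b = h % bucket_count
        if buckets[b] == 0:
            buckets[b] = i + 1
    return buckets, hashes, ordered


def lookup(index, name):
    """Return the slot of name in the index, or None if it is absent."""
    buckets, hashes, ordered = index
    count = len(buckets)
    h = djb_hash(name)
    i = buckets[h % count]
    if i == 0:
        return None
    i -= 1
    # Walk the run of entries that share this bucket.
    while i < len(hashes) and hashes[i] % count == h % count:
        if hashes[i] == h and ordered[i] == name:
            return i
        i += 1
    return None


index = build_index(["main", "foo", "Foo", "_Z3barv"], bucket_count=4)
slot = lookup(index, "main")
```

A consumer that maps the section can answer a single-name query by touching
one bucket and a short run of hashes, without loading every name up front.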

> Well, maybe this isn't really fair, the missing parentage is maybe just a clang bug, not a bug in the spec.

The spec doesn't mandate the parent references - but the Apple LLDB folks are
in the process of adding the parent links to Clang's output. (They're moving to
DWARFv5, so they're finding all the regressions in .debug_names, compared to
the original .apple_names, that were introduced in the standardization
process.) See https://github.com/llvm/llvm-project/pull/77457 and related
work/discussions.

> gdb will have to mark some entries as linkage names.  Otherwise there's no way to tell.

Is it not enough to check the mangling prefix? If it starts with `_Z` or
another mangling prefix, it's a mangled name; otherwise it's the name? (Unless
you need to differentiate in the other direction - that a name isn't the
linkage name - but flagging linkage names wouldn't be enough for that: if you
have DW_AT_name "foo", you couldn't tell whether that's a C linkage function
where the name matches the linkage name, or a C++ linkage function that's
overloaded/etc and has some other linkage name.)
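
A minimal sketch of the prefix check I have in mind. The prefix list is
illustrative rather than exhaustive (`_Z` for Itanium C++, plus `_R` for Rust
v0 and `?` for MSVC as examples of "another mangling prefix"), and
looks_mangled is a made-up name:

```python
# Illustrative, non-exhaustive list of mangling prefixes.
MANGLING_PREFIXES = ("_Z", "_R", "?")


def looks_mangled(name):
    """Guess whether an index entry is a linkage (mangled) name."""
    return name.startswith(MANGLING_PREFIXES)


entries = ["_Z3foov", "foo", "_ZN2ns3barEi", "main"]
linkage_names = [n for n in entries if looks_mangled(n)]
```

Note this only classifies in one direction: "foo" comes back as not mangled
whether it's a C function whose linkage name is also "foo" or the DW_AT_name of
a C++ function with a separate linkage name.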

Interested in understanding that more.

> It's kind of sad that DWARF invented a new format and then didn't provide enough guidance for tools to be compatible at all.

Happy to hear more about what guidance is missing and to clarify things. If
some things that the spec says are optional are basically necessities, we could
make them mandatory in future DWARF versions, etc.

> In the case of libxul, the combined size of its .debug_names, .debug_str and .debug_aranges is about 72% larger than the size of its .gdb_index (690 MB vs. 400 MB).  At least for this library we are much better off sticking with .gdb_index.

Yeah, that's something that's concerned me a bit too - including the .o
representation of .debug_names (at Google I've been mostly concerned with .o
file sizes - the aggregate size of the objects needed by the linker in some
large Google programs is quite problematic)... so, again, interested to know
more about whether there are ways we could improve the format to be more dense
or terse.
