From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id DB60C3858C53; Wed, 10 Jan 2024 18:51:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DB60C3858C53
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1704912667; bh=ynxzVZL8kpByypvkq410p2e6XyOECVdk4ui7+V8ELas=; h=From:To:Subject:Date:In-Reply-To:References:From; b=Vgf6wCWpFYhaxisSmVGRGWunoL28SiW54NcFAfjhZOMy/cRxSR9y8aIdCkBJCudQM hgotVVlhK+5ptRpo9iW7MJEXeQE7cflMRfNDB/ekxtwZ4TejNMedVfhsAbxTiXL3tf IlDdcz/W7SQffjg4xogAbB3WO4JoI+7G0BPgctYg=
From: "dblaikie at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug debug/99178] Emit .debug_names
Date: Wed, 10 Jan 2024 18:51:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: debug
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: dblaikie at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99178

--- Comment #6 from David Blaikie ---
(In reply to Tom Tromey from comment #5)
> (In reply to David Blaikie from comment #4)
>
> I don't remember filing this bug. At the time maybe I thought it
> would be worthwhile to have "end to end" .debug_names generation,
> that is, to try to have the index and also not slow down
> compilation or link times too much. Not sure how I feel about it now.

Certainly what's been possible with .debug_gnu_pubnames/types + -Wl,--gdb-index
today.
It'd be nice to have that same workflow, but in a more portable form.

> > It'd be great to get GCC/GDB folks take on the name tables - get some
> > practical experience with their contents, file any bugs about missing
> > elements, etc. It's possible they're leaning too heavily towards lldb's
> > style of name lookup since they derived from an existing apple
> > implementation there & it'd be good to generalize them where needed.
>
> gdb has long done the wrong thing with .debug_names.
> https://sourceware.org/bugzilla/show_bug.cgi?id=24820

Ah, thanks for the link - I followed up there with some context/thoughts.

> I don't really know how/why that happened. However, I wrote patches to
> fix it:
>
> https://sourceware.org/pipermail/gdb-patches/2023-December/204949.html
>
> This version of gdb will look at the augmentation string and only
> allow certain indexes to be used. This is done to avoid known bugs --
> mainly coming from the "old" (current) gdb, but also clang has some
> issues (from memory, it doesn't include parent info).

Ideally that'd be detected by looking at the abbreviation table, rather than
the augmentation string - if parent info is necessary for a usage of the table,
that'd be a generic way to check for it & ensure the unusable indexes are
ignored while not ignoring usable ones.

> Also, gdb relies on its extensions (see below).

Ah, but yeah, if you need extensions, then positive augmentation string
checking seems likely necessary. (though this starts to feel like websites
checking browser versions, unfortunately :/ )

> When writing the new scanner, I found a few bugs in DWARF related to
> which names appear in the index. I don't recall offhand what these are,
> and I didn't file them due to the late unpleasantness (sorry).

No worries - and totally understandable. If they happen to come to mind at any
point, I'd love to hear about them.

> They could probably be dug up by reading the scanner and comparing to the
> spec.
>
> gdb will also emit some extensions. You can see these in the docs patch:
>
> https://sourceware.org/pipermail/gdb-patches/2023-December/204947.html

Awesome - appreciate the documentation!

> Generally the goal of these is to avoid having to do any DIE reading
> in order to reject a lookup. Note that this means gdb relies on
> template parameters being in the name -- something we discussed in
> gdb bugzilla a bit.

Yeah, I'd love to figure out how to deal with that better, but don't have
immediate suggestions. Any sense of how bad the performance is if names without
template parameters (strawman: this could be communicated via another flag on
the entry in the index) did require DWARF parsing to check template parameters?
Is that something that'd be an option? (especially with a flag in the entry,
then it'd only be a runtime cost to those using this naming mechanism - as much
as I'd like to move to that mechanism being normal/the default, perhaps this
would be a safe transition path)

But I guess Google's probably the only one super interested in the simplified
template names at the moment (& we're mostly investing in lldb these days), so
might be unlikely anyone would be signing up to do that work in gdb.

> With these patches, gdb will not generate or use the hash table.
> This is explained here:
>
> https://sourceware.org/pipermail/gdb-patches/2023-December/204963.html

Oh, that's got some good details/answers some of my questions - thanks!

> I consider this hash table to be essentially useless in general, due to
> the name canonicalization problem -- while DWARF says that writers should
> use the system demangling style, (1) this style varies across systems, so
> it can't truly be relied on; and (2) at least GCC and one other compiler
> don't actually follow this part of the spec anyway.

Hmm, I missed a step here - perhaps you can help me understand.
Maybe, ultimately, I agree with you here - I've pushed back on the lldb folks
relying on character identical name lookup in the index due to the problems
you've described (there's no real canonical demangling) - but where does DWARF
say that writers should "use the system demangling style"?

> It's important to note, though, that even if the hash was somehow useful,
> GDB probably still would not use it -- a sorted list of names is needed
> for completion and performs reasonably well for other lookups, so a hash
> table is just overhead, IMO.

Oh, that makes loads of sense - yeah, beats me how lldb deals with tab
completion using the hash table... maybe it builds some other side table or
something. That's something I've wondered about for a while and it's good to
know how GDB deals with this, and why its index looks different.

Does that mean you want the entries in the table to be sorted? Do you emit them
that way and then, based on augmentation string, rely on them being sorted? Or
do a quick scan at startup and build a sorted list in memory regardless of the
order in .debug_names? (.debug_names entry list isn't suited to random access,
is it? The records aren't all the same size so I don't think you could binary
search through them)

> The only other thing I can think of offhand is that the reliance on
> .debug_str means that gdb may have to augment the string table when
> DW_FORM_string is in use. This is also caused by the "(anonymous namespace)"
> special case.

And Split DWARF, I guess? The strings wouldn't otherwise be in the executable's
.debug_str if not for the index - they'd only appear in the dwo/dwp
.debug_str.dwo sections.
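
For illustration only, here is a rough sketch of the "quick scan at startup and
build a sorted list in memory" option asked about above, and of why a sorted
list serves both completion and exact lookup. It is written in Python purely
for brevity and is not taken from gdb. The function names and the sample names
are hypothetical, and it assumes the names have already been pulled out of the
index by one linear pass, in whatever order the index stores them.

```python
from bisect import bisect_left


def build_sorted_names(scanned_names):
    """Sort and de-duplicate names collected by one linear scan of an index.

    Assumption: the caller has already walked the variable-size entries in
    order; nothing here reads .debug_names itself.
    """
    return sorted(set(scanned_names))


def complete(sorted_names, prefix):
    """Return every name starting with prefix.

    In a sorted list all names sharing a prefix are adjacent, so one binary
    search finds the first candidate and a short forward walk finds the rest.
    """
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches


def contains(sorted_names, name):
    """Exact lookup using the same sorted list, with no hash table."""
    i = bisect_left(sorted_names, name)
    return i < len(sorted_names) and sorted_names[i] == name


# Hypothetical names, in the arbitrary order a scan might yield them.
scanned = [
    "std::vector<int>",
    "main",
    "std::string",
    "(anonymous namespace)::helper",
    "std::vector<char>",
    "main",
]
names = build_sorted_names(scanned)
print(complete(names, "std::vec"))
print(contains(names, "main"))
```

The trade-off this is meant to show: the sort costs one pass at startup, after
which prefix completion and exact lookup share a single structure, whereas a
hash table only helps the exact-match case.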