From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 155703860772; Wed, 10 Jan 2024 20:31:39 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 155703860772
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1704918699;
	bh=FyYS5SpkZa50OUwwlQ14XCl9Y0DwCpPYuJZvp2metHQ=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=IQfWogOy3GYAlTF+LLUyIV/m+pG7qA9v9yXtbrDGN/3wPz6c8OVCeIzDTiqp5hn1W
	 KP8gDxR0qdpAB5l7MJ/efuaVd4g4EJDh61EQnaUk/yu6rcip5vJyZRy3ep6XdhhWVo
	 u7bOP9vvy+geij0gG36oWrs38GTzyF6BIsBwwRIk=
From: "tromey at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug debug/99178] Emit .debug_names
Date: Wed, 10 Jan 2024 20:31:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: debug
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: tromey at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-99178-4-tQWwg4bEMj@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99178-4@http.gcc.gnu.org/bugzilla/>
References: <bug-99178-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99178
--- Comment #7 from Tom Tromey <tromey at gcc dot gnu.org> ---
(In reply to David Blaikie from comment #6)

> Ideally that'd be detected by looking at the abbreviation table, rather than
> the augmentation string - if parent info is necessary for a usage of the
> table, that'd be a generic way to check for it & ensure the unusable indexes
> are ignored while not ignoring usable ones.

Good idea.
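
To make the abbreviation-table check concrete, here is a rough sketch, not
what gdb or any other reader actually does.  It walks the raw bytes of a
.debug_names abbreviation table and records which abbreviations carry
DW_IDX_parent (value 4 in DWARF 5).  The function names are hypothetical,
and the "every abbreviation must have a parent attribute" policy in
index_is_usable is only an assumption; a real reader might be more lenient.

```python
# Sketch only: helper names are hypothetical, not taken from gdb or lldb.
DW_IDX_parent = 4  # index attribute code defined by DWARF 5


def read_uleb128(data, pos):
    """Decode one ULEB128 value from data at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def abbrevs_with_parent(abbrev_table):
    """Map each abbreviation code to True if it lists DW_IDX_parent."""
    found = {}
    pos = 0
    while pos < len(abbrev_table):
        code, pos = read_uleb128(abbrev_table, pos)
        if code == 0:  # an abbreviation code of 0 ends the table
            break
        _tag, pos = read_uleb128(abbrev_table, pos)
        has_parent = False
        while True:
            idx, pos = read_uleb128(abbrev_table, pos)
            form, pos = read_uleb128(abbrev_table, pos)
            if idx == 0 and form == 0:  # a (0, 0) pair ends the attribute list
                break
            if idx == DW_IDX_parent:
                has_parent = True
        found[code] = has_parent
    return found


def index_is_usable(abbrev_table):
    """Assumed policy: usable only if every abbreviation has DW_IDX_parent."""
    found = abbrevs_with_parent(abbrev_table)
    return bool(found) and all(found.values())
```

A check like this depends only on what the index actually contains, so it
would not reject a usable index just because of who produced it.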

> Ah, but yeah, if you need extensions, then positive augmentation string
> checking seems likely necessary.

It's possible we could accept indexes without the extensions and sort of
limp along or have bugs.  Not sure offhand.

> (though this starts to feel like websites checking browser versions,
> unfortunately :/ )

https://github.com/rust-lang/rust/issues/41252#issuecomment-293676579

> Any sense of how bad the performance is if names without template parameters
> (strawman: this could be communicated via another flag on the entry in the
> index) did require DWARF parsing to check template parameters?

It can be bad -- gdb actually has a related bug right now:

https://sourceware.org/bugzilla/show_bug.cgi?id=30520

(There's a few possible dups of this.)

Maybe it would be possible to do some kind of 2-phase expansion.
But we already have 2 DWARF scanners and this would be adding a 3rd...

> Hmm, I missed a step here - perhaps you can help me understand. Maybe,
> ultimately, I agree with you here - I've pushed back on the lldb folks
> relying on character identical name lookup in the index due to the problems
> you've described (there's no real canonical demangling) - but where does
> DWARF say that writers should "use the system demangling style"?

https://wiki.dwarfstd.org/Best_Practices.md

See also bug#94845.

> Oh, that makes loads of sense - yeah, beats me how lldb deals with tab
> completion using the hash table... maybe it builds some other side table or
> something. That's something I've wondered about for a while and it's good to
> know how GDB deals with this, and why its index looks different.

FWIW you can't really go by what gdb does today: its .debug_names is just
super wrong, and .gdb_index is a hash table but explicitly relies on the
name canonicalization that gdb does.  Which itself is unsafe (like what if
that changes between versions) but we didn't really think through all the
problems back then.

> Does that mean you want the entries in the table to be sorted? Do you emit
> them that way and then, based on augmentation string, rely on them being
> sorted? Or do a quick scan at startup and build a sorted list in memory
> regardless of the order in .debug_names? (.debug_names entry list isn't
> suited to random access, is it? The records aren't all the same size so I
> don't think you could binary search through them)

The new reader just reads the entries and creates the same internal data
structures that the new DWARF scanner creates.  It handles sorting,
canonicalization, etc, during the scan.  This work is done in a worker
thread to hide it from the user (although I think it's reasonably quick
anyway).
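
As a very rough illustration of the general shape (this is not gdb's code,
and the class and method names are made up), the idea is: read the
variable-size entries once, build a sorted in-memory table on a worker
thread, and only block if a lookup arrives before the table is ready.
Canonicalization is left out of the sketch entirely.

```python
import bisect
import threading


class NameTable:
    """Hypothetical sorted name table built in the background."""

    def __init__(self, raw_names):
        self._sorted = []
        self._ready = threading.Event()
        # Copy the input so the worker does not share state with the caller.
        self._worker = threading.Thread(
            target=self._build, args=(list(raw_names),)
        )
        self._worker.start()

    def _build(self, raw_names):
        # Entries arrive in arbitrary order; dedupe and sort them once here.
        self._sorted = sorted(set(raw_names))
        self._ready.set()

    def complete(self, prefix):
        """Return all names starting with prefix, in sorted order."""
        self._ready.wait()  # block only if the worker has not finished yet
        start = bisect.bisect_left(self._sorted, prefix)
        matches = []
        for name in self._sorted[start:]:
            if not name.startswith(prefix):
                break
            matches.append(name)
        return matches
```

With a sorted table, prefix completion is a binary search plus a short
linear walk, which a plain hash table cannot offer.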

> > The only other thing I can think of offhand is that the reliance on
> > .debug_str means that gdb may have to augment the string table when
> > DW_FORM_string is in use.  This is also caused by the "(anonymous namespace)"
> > special case.
> 
> And Split DWARF, I guess? The strings wouldn't otherwise be in the
> executables .debug_str if not for the index - they'd only appear in the
> dwo/dwp .debug_str.dwo sections.

Yeah, I haven't really looked at this too much.