From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <archer-return-2464-listarch-archer=sourceware.org@sourceware.org>
Received: (qmail 16929 invoked by alias); 10 Aug 2011 20:12:48 -0000
Mailing-List: contact archer-help@sourceware.org; run by ezmlm
Sender: <archer@sourceware.org>
Precedence: bulk
List-Post: <mailto:archer@sourceware.org>
List-Help: <mailto:archer-help@sourceware.org>
List-Subscribe: <mailto:archer-subscribe@sourceware.org>
List-Id: <archer.sourceware.org>
Received: (qmail 16910 invoked by uid 22791); 10 Aug 2011 20:12:44 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0
	tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW
X-Spam-Check-By: sourceware.org
MIME-Version: 1.0
In-Reply-To: <CAHACq4qhGTg7ShS-eoMHq1oDmoBj=GC=Bx06knVq-+XTUN4ohw@mail.gmail.com>
References: <CAHACq4oHymXfBOqzJfyzQXNyW5PYkN6g65X6x1rMU+YmJybZmQ@mail.gmail.com>
	<m3d3gkx8ar.fsf@fleche.redhat.com>
	<CAHACq4ofyavZBt4y65OfYSoeW--HeEAT=sz86urLC0MXBB0hnA@mail.gmail.com>
	<m3d3gjwuho.fsf@fleche.redhat.com>
	<CAHACq4poC0QkGowdHkg0_Y1FRmyXTZZDeoBYrUf91nxz9SJtQw@mail.gmail.com>
	<CAHACq4qhGTg7ShS-eoMHq1oDmoBj=GC=Bx06knVq-+XTUN4ohw@mail.gmail.com>
Date: Wed, 10 Aug 2011 20:12:00 -0000
Message-ID: <CAN9gPaGC=QxGhEdgcYxt35qdBvQ-BEfUy67ioYnG7=hKoegLYQ@mail.gmail.com>
Subject: Re: Generating gdb index at link time
From: Daniel Jacobowitz <drow@false.org>
To: Cary Coutant <ccoutant@google.com>
Cc: Tom Tromey <tromey@redhat.com>, Project Archer <archer@sourceware.org>, 
	Dodji Seketeli <dodji@redhat.com>, Sterling Augustine <saugustine@google.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
X-SW-Source: 2011-q3/txt/msg00002.txt.bz2

On Fri, Aug 5, 2011 at 4:30 PM, Cary Coutant <ccoutant@google.com> wrote:
>> Cary> Any comments, objections, advice, ...?
>>
>> I think the biggest difficulty is in C++ name canonicalization.
>>
>> My understanding (Jan and Keith are the experts here) is that DW_AT_name
>> does not always agree with the demangler; but one also cannot always
>> rely on the demangler because not all entities are given a
>> DW_linkage_name (perhaps fixed in newer GCC versions?  Dodji would
>> know, or anyway it is in bugzilla).
>>
>> We have a second canonicalization step (a bunch of code in
>> cp-name-parser.y) that we run on the demangled names.  I'm not 100% sure
>> this is still necessary.  I think this code is reasonably
>> self-contained; but maybe slow.
>>
>> I think it is worth considering changes to the index, even radical ones,
>> if it would make your solution better.  The index is purely an ad hoc
>> invention and should, IMO, be considered as mutable as any other piece.

You may all know most of this already, but just in case, here's a bit
of history.

The purpose of name canonicalization is to accept different spellings
of the same C++ name and be able to reliably and efficiently look up
the symbol for the canonical spelling of the name.  It is still
necessary, even with GCC.
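
As a rough illustration of the idea (a toy sketch only: it handles
spacing and nothing else, it is not what cp-name-parser.y does, and
the names toy_canonicalize, symbols and lookup are made up here):

```python
import re

def toy_canonicalize(name: str) -> str:
    """Toy spacing-only normalizer; NOT GDB's cp-name-parser.y logic."""
    # Collapse runs of whitespace so "unsigned   int" keeps one space.
    name = re.sub(r"\s+", " ", name.strip())
    # Drop whitespace next to punctuation: "< int >" becomes "<int>".
    return re.sub(r"\s*([<>,()*&:])\s*", r"\1", name)

# Hypothetical symbol table, keyed by the canonical spelling.
symbols = {toy_canonicalize("ns::f(std::vector<int>, unsigned int)"): 0x1000}

def lookup(spelling: str):
    """Canonicalize the user's spelling, then do one exact-match lookup."""
    return symbols.get(toy_canonicalize(spelling))
```

With this, lookup("ns::f( std::vector< int >,unsigned   int )") finds
the same entry as the spelling used to build the table.  The real
problem is harder because of the non-cosmetic differences, such as
typedef expansion, which a spacing pass cannot address.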

Tom is right; DW_AT_name cannot be trusted.  That's true across
multiple compilers, not just GCC, but GCC is a particularly egregious
offender.  There are some cosmetic differences, like spacing, and some
more significant differences like whether typedefs are expanded.

It is definitely slow.  I spent a long time speeding it up, but it's
still a significant chunk of startup time (a couple percent? don't
remember).

It's really important that the index use the same canonicalization as
GDB.  If it doesn't, we will fail to look up symbols where there's a
difference.  It would be nice to have some robust tests for this;
maybe a flag where GDB checks that all names in the index are
canonical, so we can run the testsuite that way?  That makes me a
little nervous about skew between GDB and Gold.
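
The kind of check I have in mind could look roughly like this (a
sketch under assumptions: gdb_like and linker_like are hypothetical
stand-ins for the two canonicalizers, not real GDB or Gold code, and
the index is modelled as a plain list of names):

```python
import re

def gdb_like(name: str) -> str:
    """Hypothetical stand-in for the consumer's canonicalizer."""
    name = re.sub(r"\s+", " ", name.strip())
    return re.sub(r"\s*([<>,()*&:])\s*", r"\1", name)

def linker_like(name: str) -> str:
    """Hypothetical stand-in for a producer that spaces commas differently."""
    return gdb_like(name).replace(",", ", ")

def non_canonical_names(index_names, canonicalize):
    """Return index entries that change when re-canonicalized."""
    return [n for n in index_names if canonicalize(n) != n]

raw = ["f(int,char)", "g(int)"]
index = [linker_like(n) for n in raw]           # index written by the producer
skewed = non_canonical_names(index, gdb_like)   # entries the consumer would miss
```

Here skewed contains "f(int, char)": the consumer canonicalizes a
query to "f(int,char)", which is not a key in the index, so the lookup
fails even though the symbol is present.  An index built with the same
function on both sides yields an empty list.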

Not all entities have a linkage name because there are entities which
don't appear in the output.  Types, for instance.  Plus the abstract
copy of the two constructor versions; that's a historic trouble spot.

-- 
Thanks,
Daniel