From: Daniel Jacobowitz
To: Cary Coutant
Cc: Tom Tromey, Project Archer, Dodji Seketeli, Sterling Augustine
Date: Wed, 10 Aug 2011 20:12:00 -0000
Subject: Re: Generating gdb index at link time

On Fri, Aug 5, 2011 at 4:30 PM, Cary Coutant wrote:
>> Cary> Any comments, objections, advice, ...?
>>
>> I think the biggest difficulty is in C++ name canonicalization.
>>
>> My understanding (Jan and Keith are the experts here) is that DW_AT_name
>> does not always agree with the demangler; but one also cannot always
>> rely on the demangler because not all entities are given a
>> DW_linkage_name (perhaps fixed in newer GCC versions?  Dodji would
>> know, or anyway it is in bugzilla).
>>
>> We have a second canonicalization step (a bunch of code in
>> cp-name-parser.y) that we run on the demangled names.  I'm not 100% sure
>> this is still necessary.  I think this code is reasonably
>> self-contained; but maybe slow.
>>
>> I think it is worth considering changes to the index, even radical ones,
>> if it would make your solution better.  The index is purely an ad hoc
>> invention and should, IMO, be considered as mutable as any other piece.

You may all know most of this already, but just in case, here's a bit
of history.
The purpose of name canonicalization is to accept different spellings
of the same C++ name and be able to reliably and efficiently look up
the symbol for the canonical spelling of the name. It is still
necessary, even with GCC.

Tom is right; DW_AT_name cannot be trusted. That's true across
multiple compilers, not just GCC, but GCC is a particularly egregious
offender. There are some cosmetic differences, like spacing, and some
more significant differences, like whether typedefs are expanded.

It is definitely slow. I spent a long time speeding it up, but it's
still a significant chunk of startup time (a couple percent? don't
remember).

It's really important that the index use the same canonicalization as
GDB. If it doesn't, we will fail to look up symbols where there's a
difference. It would be nice to have some robust tests for this;
maybe a flag where GDB checks that all names in the index are
canonical, so we can run the testsuite that way? That makes me a
little nervous about skew between GDB and Gold.

Not all entities have a linkage name because there are entities which
don't appear in the output. Types, for instance. Plus the abstract
copy of the two constructor versions; that's a historic trouble spot.

-- 
Thanks,
Daniel