Mailing-List: contact archer-help@sourceware.org; run by ezmlm
From: Roland McGrath
To: Tom Tromey
Cc: Project Archer, Jakub Jelinek
Subject: Re: Fedora 14 debug proposal
In-Reply-To: Tom Tromey's message of Tuesday, 15 June 2010 13:45:53 -0600
References: <20100613104010.6D1174077C@magilla.sf.frob.com>
Message-Id: <20100615222623.1452440736@magilla.sf.frob.com>
Date: Tue, 15 Jun 2010 22:26:00 -0000

> Roland> It should be, yes. I don't see any reason that .debug_types and
> Roland> DW_FORM_ref_sig8 need to survive final linking. The normal reference
> Roland> forms are more efficient for consumers to use.
>
> Why is that? I looked at the gdb code here and nothing really stood out.

ref_sig8 is a key to match in searching through type units. (Presumably a
hash table lookup among already-interned units, interning more linearly as
needed.)

The ref forms are direct pointers into the file. In the case of ref_addr
(the case for any actual sharing/compression), a consumer needs to figure
out which CU it's in and intern that CU (i.e. track at least its header
details, the total of "interning" that libdw does), which is a similar
search and on-demand interning (in libdw this one is a tree-based search to
match the file-offset bounds of the CU). For a consumer like GDB that
interns at the DIE level, it's presumably a similar lookup (hash table or
btree or whatever) keyed on the file offset to match a DIE previously
interned.
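To make the two lookups concrete, here is a minimal Python sketch. It is an
illustration only, not the code of libdw or GDB: the class UnitTable and all
of its method names are hypothetical, and a sorted list with bisect stands in
for the tree-based search mentioned above.

```python
import bisect


class UnitTable:
    """Hypothetical consumer-side tables for resolving DWARF references."""

    def __init__(self):
        self.by_signature = {}  # 8-byte type signature -> interned type unit
        self.cu_starts = []     # sorted start offsets of interned CUs
        self.cu_by_start = {}   # start offset -> (end offset, CU label)

    def add_type_unit(self, signature, unit):
        self.by_signature[signature] = unit

    def add_cu(self, start, end, label):
        bisect.insort(self.cu_starts, start)
        self.cu_by_start[start] = (end, label)

    def resolve_sig8(self, signature):
        # ref_sig8: the signature is only a key, so the consumer must
        # search its table of type units (here a hash lookup).
        return self.by_signature.get(signature)

    def resolve_ref_addr(self, offset):
        # ref_addr: the offset points straight into the file, but the
        # consumer still has to find the CU whose [start, end) bounds
        # contain it before it can interpret the DIE there.
        i = bisect.bisect_right(self.cu_starts, offset) - 1
        if i < 0:
            return None
        start = self.cu_starts[i]
        end, label = self.cu_by_start[start]
        if offset >= end:
            return None
        return label, offset - start
```

Either way the consumer performs one keyed search per reference, which is
why the difference may not amount to much in practice.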
So it is simpler in theory but perhaps a wash in practice.

What might be more important is the space savings. ref_sig8 itself uses
twice the space of ref_addr. But beyond that, each referent must get its
own type unit, with space for the unit header, plus duplicates of the
containing DIE structure (levels of namespace, class, etc.). In contrast,
optimal direct compression needs only as many unit headers (for the
partial_unit or compile_unit) as there are distinct sets of sharing
references. A shared partial_unit contains many referent DIEs nested in the
single copy of the containing DIE structure, since references to
foo::bar::baz::type1 and foo::bar::baz::type2::innertype3, etc., are all
just direct pointers into different subtrees of the same larger tree.

Anyway, the proof will be in the putative pudding. When we have compression
working and libdw capable of handling ref_sig8, then it will be fairly
straightforward to try preserving type units and ref_sig8's as they are
(along with partial_unit-based compression of everything else) and compare
that to morphing everything into direct references and compressing that way.

Thanks,
Roland
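The space argument above can be put into rough numbers. The sketch below is
a back-of-the-envelope model, not a measurement. It assumes 32-bit DWARF
version 4, where ref_sig8 is 8 bytes and ref_addr is 4 bytes. It counts only
references, unit headers, and the duplicated containing DIEs, and it ignores
the type DIEs themselves and any DW_TAG_imported_unit entries. The function
names and the wrapper_bytes parameter are hypothetical.

```python
# Field sizes for 32-bit DWARF version 4 (an assumption of this sketch).
REF_SIG8_SIZE = 8  # DW_FORM_ref_sig8: 64-bit type signature
REF_ADDR_SIZE = 4  # DW_FORM_ref_addr: 32-bit section offset

# unit_length + version + debug_abbrev_offset + address_size
PLAIN_UNIT_HEADER = 4 + 2 + 4 + 1
# ... plus type_signature + type_offset for a type unit
TYPE_UNIT_HEADER = PLAIN_UNIT_HEADER + 8 + 4


def type_unit_overhead(n_refs, n_types, wrapper_bytes):
    """Each referent type gets its own unit: one header plus one copy of
    the containing DIEs (namespace, class, ...), wrapper_bytes long."""
    return n_refs * REF_SIG8_SIZE + n_types * (TYPE_UNIT_HEADER + wrapper_bytes)


def partial_unit_overhead(n_refs, n_sharing_sets, wrapper_bytes):
    """One unit per distinct set of sharing references; the containing
    DIEs appear once per unit, however many types are nested inside."""
    return n_refs * REF_ADDR_SIZE + n_sharing_sets * (
        PLAIN_UNIT_HEADER + wrapper_bytes
    )
```

For instance, with 100 references to 10 types that fall into 2 sharing sets
and 6 bytes of containing DIEs, the model gives 1090 bytes of overhead for
type units against 434 bytes for partial units. Real figures depend on the
actual DIE encodings, which is what the proposed comparison would measure.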