From: Tom Tromey
To: Simon Marchi
Cc: Tom Tromey, gdb-patches@sourceware.org
Subject: Re: [PATCH v2 00/32] Rewrite the DWARF "partial" reader
Date: Wed, 10 Nov 2021 12:56:00 -0700
Message-ID: <87bl2rhjjj.fsf@tromey.com>
References: <20211104180907.2360627-1-tom@tromey.com>
In-Reply-To: (Simon Marchi's message of "Mon, 8 Nov 2021 12:41:24 -0500")

Simon> What I think I understand is that with the new indexer, we gather
Simon> essentially the same information that we gather today with partial
Simon> symbols (just the information needed for doing lookups and expanding
Simon> symtabs), but store it in a data structure that is better designed,
Simon> letting us do more in parallel.  Does that sound right?  If not, can
Simon> you explain what's the main difference?

Yeah, the biggest gains are due to parallelism.  The single largest
user-visible change comes from pushing the necessary post-processing
into a background thread.
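
To illustrate the general idea of deferring post-processing to a
background thread, here is a minimal C++ sketch.  It is not taken from
the patch series: the names index_entry and background_index are
hypothetical, and sorting the entries by name merely stands in for
whatever finalization the real reader performs.

```cpp
// Illustrative sketch only; not GDB's actual code.  All names here
// (index_entry, background_index) are hypothetical.
#include <algorithm>
#include <cassert>
#include <future>
#include <string>
#include <utility>
#include <vector>

// One scanned name and the offset of the DIE it came from.
struct index_entry
{
  std::string name;
  unsigned die_offset;
};

class background_index
{
public:
  // Take the unsorted entries produced by the scan and start sorting
  // them on another thread.  The constructor returns immediately.
  explicit background_index (std::vector<index_entry> entries)
    : m_sorted (std::async (std::launch::async,
                            [] (std::vector<index_entry> v)
                            {
                              std::sort (v.begin (), v.end (),
                                         [] (const index_entry &a,
                                             const index_entry &b)
                                         { return a.name < b.name; });
                              return v;
                            },
                            std::move (entries)).share ())
  {
  }

  // Look up NAME.  A lookup blocks only if the background sort has
  // not finished yet; afterwards the result is already available.
  const index_entry *lookup (const std::string &name) const
  {
    assert (m_sorted.valid ());
    const std::vector<index_entry> &v = m_sorted.get ();
    auto it = std::lower_bound (v.begin (), v.end (), name,
                                [] (const index_entry &e,
                                    const std::string &n)
                                { return e.name < n; });
    if (it == v.end () || it->name != name)
      return nullptr;
    return &*it;
  }

private:
  std::shared_future<std::vector<index_entry>> m_sorted;
};
```

In this sketch the caller constructs a background_index right after
scanning and goes back to other work; only the first lookup that
arrives before the sort completes has to wait.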

However there are various smaller gains in the series as well:

* Abbrevs are analyzed statically, so we can skip many more DIEs
  without examining their contents.

* The partial DIE cache is eliminated.

* Name canonicalization is done just on the name in the DIE, rather
  than constructing a full name and then canonicalizing the entire
  thing.

* The abbrev cache may also speed up DWARF scanning in some scenarios.

Simon> What sounds nice with partial symtabs is that they are re-used by
Simon> different debug formats, so each format doesn't need to implement its
Simon> own data structures to manage symtab lookups and expansion.

True, but this is also a drawback, because psymtabs weren't really
designed for DWARF.  I tried many times to speed up the existing reader,
and couldn't make it work...

Also, another way to look at this is that the new approach shares more
code with the existing DWARF index readers.  It also provides the
possibility of making the .debug_names writer work correctly (currently
it is far from what the standard requires).

Simon> How much is that new DWARF indexer really DWARF-specific (the part
Simon> that parses the DWARF obviously is, but the part that holds names
Simon> and stuff)?  Could it one day be used by other debug formats?

Not readily, because AFAIK the other debug formats aren't hierarchical
in nature.  Except for Ada (which as always works in its own way), the
new reader uses the hierarchical nature of DWARF to simplify the
resulting data structure.  (For Ada, this same thing is done, but in a
post-processing step.)

Tom