From: Tom Tromey
To: Pedro Alves
Cc: Tom Tromey, gdb-patches@sourceware.org
Subject: Re: [PATCH] Fix method naming bug in new DWARF indexer
Date: Fri, 29 Apr 2022 13:47:52 -0600
Message-ID: <87a6c3llhz.fsf@tromey.com>
In-Reply-To: <5ec4d8e1-9c16-c73a-ec8b-7802b498ba9b@palves.net> (Pedro Alves's message of "Tue, 26 Apr 2022 19:40:22 +0100")

Pedro> Hmm, it actually fixes most of the performance for me. With that
Pedro> new patch (and I guess the linkage names patch made a difference
Pedro> too), I see roughly the same startup time in new gdb vs old gdb
Pedro> when using the same index, and either an index generated by old
Pedro> gdb, or by new gdb.

I'm going to check it in.

Pedro> [369902] intrusive_list >::begin:
Pedro>     9 [global, function]
Pedro>     255 [global, function]

Pedro> while in the new indexer, for the same function, we have:
[...]

Pedro> What was the logic that the old writer used to come up with the
Pedro> seemingly right number of function entries?

It looks like the old reader checked the value of DW_AT_inline:

-      if (pdi->has_pc_info || pdi->has_range_info
-          || (!pdi->is_external && pdi->may_be_inlined))
-        {
-          if (!pdi->is_declaration)
-            /* Ignore subprogram DIEs that do not have a name, they are
-               illegal.  Do not emit a complaint at this point, we will
-               do so when we convert this psymtab into a symtab.  */
-            if (pdi->name (cu))
-              add_partial_symbol (pdi, cu);
-        }

and

-        case DW_AT_inline:
-          {
-            LONGEST value = attr.constant_value (-1);
-            if (value == DW_INL_inlined
-                || value == DW_INL_declared_inlined)
-              may_be_inlined = 1;
-          }
-          break;

Pedro> For the time being, I'll continue using indexes, because as soon
Pedro> as you need to run to a breakpoint from the command line, which I
Pedro> do all the time, then having an index still beats the new
Pedro> scanner.

Yeah :(

What happens here is that gdb does the finalization step in the
background. Interactively, this helps make it seem faster. However,
it's not really faster, it is just deferring some work.

You can see it with an invocation like:

    gdb -q -batch -iex 'set debug timestamp 1' -iex 'set debug dwarf-read 1' -ex start ./gdb

Here for gdb-with-an-index I see:

    0.820423 [dwarf-read] dwarf2_initialize_objfile: found gdb index from file
    1.699175 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb.idx ...

But for ordinary gdb:

    1.310033 [dwarf-read] dwarf2_build_psymtabs_hard: Done building psymtabs of /tmp/gdb
    4.666649 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb ...

This is disappointing of course. It might be possible to reduce this
time at the cost of a bit more complexity in the lookup code and
perhaps a bit more memory use. For example, right now finalization
merges the results from all the readers into a single entry table --
but maybe the table could remain sharded instead.
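
To make the "remain sharded" idea a little more concrete, here is a
rough sketch. It is not gdb code: the entry type, sort_shard and lookup
are all hypothetical names, and a real entry carries much more than a
name and an offset. It only illustrates the trade-off: each reader's
table is sorted independently, no merge happens, and every lookup has
to probe every shard.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one index entry; gdb's real entry type
// is different and carries more data.
struct entry
{
  std::string name;
  int die_offset;
};

// One shard per reader.  Each shard is sorted on its own.
using shard = std::vector<entry>;

// Sort a single shard by name.  Stable, so entries with equal names
// keep the order in which the reader produced them.
static void
sort_shard (shard &s)
{
  std::stable_sort (s.begin (), s.end (),
		    [] (const entry &a, const entry &b)
		    { return a.name < b.name; });
}

// Look up NAME in every shard without ever merging the shards.
// This is the extra lookup complexity: one binary search per shard
// instead of one binary search over a single merged table.
static std::vector<entry>
lookup (const std::vector<shard> &shards, const std::string &name)
{
  std::vector<entry> result;
  for (const shard &s : shards)
    {
      auto it = std::lower_bound (s.begin (), s.end (), name,
				  [] (const entry &e, const std::string &n)
				  { return e.name < n; });
      for (; it != s.end () && it->name == name; ++it)
	result.push_back (*it);
    }
  return result;
}
```

With N shards a lookup costs N binary searches, so whether this wins
depends on how many lookups happen relative to the cost of the merge
that was skipped.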
Another possibility might be to do the work in worker threads and,
instead of sharding the result, pre-sort the vectors and use a sorted
merge operation to combine them.

I'll file some bugs about these things. I'm starting to lose track of
what I still need to fix up.

Tom
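
For reference, a minimal sketch of the pre-sort-then-merge alternative
mentioned above. Again this is not gdb code: name_vector, presort and
merge_sorted are hypothetical names, the entries are reduced to plain
strings, and the threading is left out. The sort in presort is the
part that each worker thread would do on its own vector; only the
merge would remain for finalization.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-reader result.  Real entries hold more than a name.
using name_vector = std::vector<std::string>;

// Stand-in for the per-worker step: sort each vector independently.
// In the scheme described above each worker would sort its own
// vector; here it is done sequentially to keep the sketch simple.
static void
presort (std::vector<name_vector> &parts)
{
  for (name_vector &part : parts)
    std::sort (part.begin (), part.end ());
}

// Combine already-sorted vectors with repeated two-way sorted merges.
// Each std::merge is linear in the sizes of its two inputs.
static name_vector
merge_sorted (const std::vector<name_vector> &parts)
{
  name_vector result;
  for (const name_vector &part : parts)
    {
      name_vector merged;
      merged.reserve (result.size () + part.size ());
      std::merge (result.begin (), result.end (),
		  part.begin (), part.end (),
		  std::back_inserter (merged));
      result = std::move (merged);
    }
  return result;
}
```

Merging the vectors one after another re-copies the accumulated result
each time; with many readers a pairwise (tree-shaped) or heap-based
k-way merge would do less copying.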