From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tom Tromey <tromey@adacore.com>
To: Pedro Alves <pedro@palves.net>
Cc: Tom Tromey <tromey@adacore.com>,  gdb-patches@sourceware.org
Subject: Re: [PATCH] Fix method naming bug in new DWARF indexer
References: <20220421163831.2582161-1-tromey@adacore.com>
 <02551264-3beb-0348-07da-e61dbf9681c8@palves.net>
 <87mtg9ow5w.fsf@tromey.com>
 <5ec4d8e1-9c16-c73a-ec8b-7802b498ba9b@palves.net>
X-Attribution: Tom
Date: Fri, 29 Apr 2022 13:47:52 -0600
In-Reply-To: <5ec4d8e1-9c16-c73a-ec8b-7802b498ba9b@palves.net> (Pedro Alves's
 message of "Tue, 26 Apr 2022 19:40:22 +0100")
Message-ID: <87a6c3llhz.fsf@tromey.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>

Pedro> Hmm, it actually fixes most of the performance for me.  With that
Pedro> new patch (and I guess the linkage names patch made a difference
Pedro> too), I see roughly the same startup time in new gdb vs old gdb
Pedro> when using the same index, and either an index generated by old
Pedro> gdb, or by new gdb.

I'm going to check it in.

Pedro>  [369902] intrusive_list<inferior, intrusive_base_node<inferior> >::begin:
Pedro>           9 [global, function]
Pedro>           255 [global, function]
Pedro> while in the new indexer, for the same function, we have:
[...]

Pedro> What was the logic that the old writer used to come up with the
Pedro> seemingly right number of function entries?

It looks like the old reader checked the value of DW_AT_inline:

-      if (pdi->has_pc_info || pdi->has_range_info
-	  || (!pdi->is_external && pdi->may_be_inlined))
-	{
-	  if (!pdi->is_declaration)
-	    /* Ignore subprogram DIEs that do not have a name, they are
-	       illegal.  Do not emit a complaint at this point, we will
-	       do so when we convert this psymtab into a symtab.  */
-	    if (pdi->name (cu))
-	      add_partial_symbol (pdi, cu);
-	}

and

-	case DW_AT_inline:
-	  {
-	    LONGEST value = attr.constant_value (-1);
-	    if (value == DW_INL_inlined
-		|| value == DW_INL_declared_inlined)
-	      may_be_inlined = 1;
-	  }
-	  break;
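
Restated as a standalone predicate, the two removed hunks above amount
to roughly the following.  This is only a sketch: subprogram_die and
old_reader_would_index are made-up names standing in for the few
partial_die_info fields involved, and the DW_INL_* values are the ones
from the DWARF standard.

```cpp
#include <cassert>
#include <cstdint>

// Values of DW_AT_inline as defined by the DWARF standard.
enum dw_inl : std::int64_t
{
  DW_INL_not_inlined = 0,
  DW_INL_inlined = 1,
  DW_INL_declared_not_inlined = 2,
  DW_INL_declared_inlined = 3
};

// Hypothetical stand-in for the partial_die_info fields used by the
// old check; not the real gdb type.
struct subprogram_die
{
  bool has_pc_info = false;
  bool has_range_info = false;
  bool is_external = false;
  bool is_declaration = false;
  bool has_name = false;
  // -1 means "no DW_AT_inline", mirroring attr.constant_value (-1).
  std::int64_t inline_value = -1;
};

// Mirrors the removed DW_AT_inline case.
bool
may_be_inlined (const subprogram_die &die)
{
  return (die.inline_value == DW_INL_inlined
          || die.inline_value == DW_INL_declared_inlined);
}

// Mirrors the removed condition around add_partial_symbol.
bool
old_reader_would_index (const subprogram_die &die)
{
  if (!(die.has_pc_info || die.has_range_info
        || (!die.is_external && may_be_inlined (die))))
    return false;
  // Declarations and nameless subprograms were skipped.
  return !die.is_declaration && die.has_name;
}
```

So a DIE with no PC or range information only got an entry if it was
non-external and DW_AT_inline said it was inlined.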

Pedro> For the time being, I'll continue using indexes, because as soon
Pedro> as you need to run to a breakpoint from the command line, which I
Pedro> do all the time, then having an index still beats the new
Pedro> scanner.

Yeah :(

What happens here is that gdb does the finalization step in the
background.  Interactively, this helps make it seem faster.  However,
it's not really faster; it is just deferring some work.

You can see it with an invocation like:

   gdb -q -batch -iex 'set debug timestamp 1' -iex 'set debug dwarf-read 1' -ex start ./gdb

Here for gdb-with-an-index I see:

    0.820423 [dwarf-read] dwarf2_initialize_objfile: found gdb index from file
    1.699175 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb.idx ...

But for ordinary gdb:

    1.310033 [dwarf-read] dwarf2_build_psymtabs_hard: Done building psymtabs of /tmp/gdb
    4.666649 [dwarf-read] process_queue: Expanding one or more symtabs of objfile /tmp/gdb ...

This is disappointing of course: the gap between the two log lines is
about 0.9s with the index but about 3.4s without it.  It might be
possible to reduce this time at the cost of a bit more complexity in
the lookup code and perhaps
a bit more memory use.  For example, right now finalization merges the
results from all the readers into a single entry table -- but maybe the
table could remain sharded instead.  Another possibility might be to do
the work in worker threads and, instead of sharding the result, pre-sort
the vectors and use a sorted merge operation to combine them.
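
To make the two ideas concrete, here is a rough sketch of each.  None
of this is gdb code: entry, shard, lookup_sharded and merge_sorted are
hypothetical names, and an entry is reduced to a name plus a CU number.
It assumes each reader hands back a vector already sorted by name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical index entry: just a name and the CU it came from.
struct entry
{
  std::string name;
  int cu;
};

using shard = std::vector<entry>;

static bool
by_name (const entry &a, const entry &b)
{
  return a.name < b.name;
}

// Idea 1: leave the table sharded.  No merge step at all, but every
// lookup has to binary-search each shard.
std::vector<entry>
lookup_sharded (const std::vector<shard> &shards, const std::string &name)
{
  std::vector<entry> result;
  for (const shard &s : shards)
    {
      assert (std::is_sorted (s.begin (), s.end (), by_name));
      auto range = std::equal_range (s.begin (), s.end (),
                                     entry {name, 0}, by_name);
      result.insert (result.end (), range.first, range.second);
    }
  return result;
}

// Idea 2: shards are pre-sorted (say, in the worker threads), so the
// final step is a merge rather than a full sort.  Merging one shard
// at a time keeps the sketch short; a real k-way merge would do less
// copying.
shard
merge_sorted (const std::vector<shard> &shards)
{
  shard merged;
  for (const shard &s : shards)
    {
      assert (std::is_sorted (s.begin (), s.end (), by_name));
      shard next;
      next.reserve (merged.size () + s.size ());
      std::merge (merged.begin (), merged.end (), s.begin (), s.end (),
                  std::back_inserter (next), by_name);
      merged = std::move (next);
    }
  return merged;
}
```

The first trades lookup time (one search per shard) for no
finalization; the second keeps a single table but moves the sorting
out of the serial part.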

I'll file some bugs about these things.
I'm starting to lose track of what I still need to fix up.

Tom