public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Tom de Vries <tdevries@suse.de>
To: Tom Tromey <tom@tromey.com>
Cc: gdb-patches@sourceware.org
Subject: Re: [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
Date: Wed, 13 Dec 2023 10:58:14 +0100	[thread overview]
Message-ID: <ec1eaff5-b237-450f-98fb-2f19ec505b74@suse.de> (raw)
In-Reply-To: <87edfrckcq.fsf@tromey.com>

On 12/12/23 20:05, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> [...]
> 
> Thank you for this series and the write-up.
> You've done a great job on this.
> 
> Tom> IV. ALTERNATIVE APPROACH
> 
> Tom> I suppose it could be a matter of taste whether the current internal
> Tom> representation is:
> Tom> - a smart solution, or
> Tom> - incorrect representation of the dwarf.
> 
> I'm not sure I totally understood this section, but the basic idea
> behind this code -- and I think these decisions predate the cooked-index
> rewrite -- was to try to exploit dwz compression by also using less
> memory inside gdb.
> 

Makes sense to me.

> Technically, I think, in DWARF an import can appear at any level

That's also my understanding.

> and gdb
> should probably just be re-reading a bunch of DIEs over and over.

Yes, though we could make a distinction between top-level imports (as 
produced by dwz and gcc lto) and otherwise, and handle the top-level 
imports efficiently.

> However, this seemed super expensive back when I wrote the dwz support,
> and since dwz is the only known compressor and it doesn't do some of the
> weird stuff, it looked better to try to implement this optimization.
> 
> Shrug, maybe that was a mistake.  If every import caused a re-scan of
> the PU, it would mean that only dwz users would pay for this feature.
> Just maybe the price would be very high.
> 

The solution I tried to propose here is a middle ground:
- it fixes the PR we're trying to fix in this series, and
- it exploits the dwz compression inside gdb, to avoid the high price
   you mention.

The basic idea is:
- let the first import trigger a read of a PU,
- in the cooked_index entries generated for the PU, set the per_cu to
   the PU (instead of to the importing CU as we do now),
- keep track of importers, allowing to list all importers CUs of a given
   PU,
- when doing expansion for a cooked_index entry, if the per_cu is a PU,
   expand all importer CUs.

There are two notes here:
- there is the risk that this gets us too many expanded symtabs,
   so when using the cooked index we should know whether we're interested
   in expanding all matching symtabs or just one.
- imports can happen from CUs with different languages, so we'll have to
   handle that somehow.  I'm not sure btw whether we're getting that
   correct either atm.

> Tom> - I'm not sure if other maintainers are supportive of this approach.
> 
> I probably just don't understand.
> 

Ok, just let me know if there's anything I can do to clarify things.

> Tom> And this is comparing the patch series with the base version:
> Tom> ...
> Tom> real: mean: 687.00    (100%) mean: 931.00    (135.52%)
> Tom> user: mean: 2204.00   (100%) mean: 2938.00   (133.30%)
> Tom> sys : mean: 102.00    (100%) mean: 137.00    (134.31%)
> Tom> mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%)
> Tom> ...
> 
> Tom> In summary, the overall result is ~36% more real time and ~21% more memory.
> 
> Pretty heavy.  However I was curious about something:
> 
> Tom> The cumulative results of individual patches (leaving out the test-case
> Tom> patches) are:
> 
> Tom> [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
> 
> Tom> real: mean: 690.00    (100%) mean: 835.00    (121.01%)
> Tom> user: mean: 2186.00   (100%) mean: 2693.00   (123.19%)
> Tom> sys : mean: 106.00    (100%) mean: 130.00    (122.64%)
> Tom> mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%)
> 
> ... the big performance jump appears here -- but is it really needed?
> 
> IIUC this is done because we want to detect inlined functions in C++.
> However, in theory those should also be emitted at the top level with
> DW_AT_inline, with a value of either DW_INL_inlined or
> DW_INL_declared_inlined.
> 

In the gdb.cp/breakpoint-locs.exp cc-with-dwz case we have:
...
  <1><14>: Abbrev Number: 34 (DW_TAG_namespace)
     <15>   DW_AT_name        : N1
     <18>   DW_AT_sibling     : <0x2e>
  <2><19>: Abbrev Number: 32 (DW_TAG_class_type)
     <1a>   DW_AT_name        : C1
     <1d>   DW_AT_byte_size   : 1
     <1e>   DW_AT_decl_file   : 2
     <1f>   DW_AT_decl_line   : 20
  <3><20>: Abbrev Number: 35 (DW_TAG_subprogram)
     <21>   DW_AT_external    : 1
     <21>   DW_AT_name        : baz
     <25>   DW_AT_decl_file   : 2
     <26>   DW_AT_decl_line   : 23
     <27>   DW_AT_linkage_name: (indirect string, offset: 0x1d3): 
_ZN2N12C13bazEv
     <2b>   DW_AT_accessibility: 1       (public)
     <2c>   DW_AT_declaration : 1
  <3><2c>: Abbrev Number: 0
  <2><2d>: Abbrev Number: 0
  <1><2e>: Abbrev Number: 37 (DW_TAG_subprogram)
     <2f>   DW_AT_specification: <0x20>
     <30>   DW_AT_inline      : 3        (declared as inline and inlined)
     <31>   DW_AT_sibling     : <0x39>
...
and I think what you're describing matches most closely DIE 0x2e (though 
I haven't found DW_INL_inlined or DW_INL_declared_inlined in the dwarf 
produced for the test-case, also not with gcc 12 and -gdwarf-5).

> Would having entries for these be enough to provoke the desired CU
> expansion?  If so it may be just a matter of marking some more abbrevs
> as "interesting".
> 

The problem is that the DIE 0x2e has been moved to a PU, and as 
consequence in the current implementation will cause expansion of only a 
single CU.

> Actually, for the simple test case I tried, I already see it in the
> index.  So I wonder what's going on there.
> 

I wonder if you can reproduce the FAIL I'm seeing? [ Btw, it's not 100% 
reproducible, so I usually run it in a $(seq 1 100) loop. ]

> Anyway if we could drop this patch then it seems like the overall cost
> would be alright.
> 
> Tom> Ideally, the impact of a patch series implementing dwz support on a non-dwz
> Tom> test-case is none.
> 
> Tom> But fixing dwz support requires tracking more data, and there's no way of
> Tom> knowing in advance whether the additional data will be used or not.
> 
> Yeah.  I loathe this part of DWARF but I have come around to accept that
> we just have to handle it.
> 
> Tom> Of course this can be accommodated by optimistically assuming that the data is
> Tom> unnecessary, and when it turns out it was actually needed, partially or
> Tom> completely restart indexing.  My suspicion is that this approach itself is
> Tom> going to be complex, so I think it's best avoided.
> 
> I agree.

Thanks for the review.

- Tom


  reply	other threads:[~2023-12-13  9:57 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-12 17:32 Tom de Vries
2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
2023-12-12 18:28   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage Tom de Vries
2023-12-12 18:31   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent Tom de Vries
2023-12-12 18:34   ` Tom Tromey
2023-12-13  8:25     ` Tom de Vries
2023-12-13 20:11       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage Tom de Vries
2023-12-12 18:39   ` Tom Tromey
2023-12-13  8:46     ` Tom de Vries
2023-12-13 20:16       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case Tom de Vries
2023-12-12 19:27   ` Tom Tromey
2023-12-13 10:35     ` Tom de Vries
2023-12-13 20:19       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp Tom de Vries
2023-12-12 19:28   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp Tom de Vries
2023-12-12 19:29   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 08/13] [gdb/symtab] Keep track of processed DIEs in shard Tom de Vries
2023-12-12 17:32 ` [PATCH v2 09/13] [gdb/symtab] Resolve deferred entries, intra-shard case Tom de Vries
2023-12-12 17:32 ` [PATCH v2 10/13] [gdb/symtab] Don't defer backward refs, inter-cu " Tom de Vries
2023-12-12 17:32 ` [PATCH v2 11/13] [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index Tom de Vries
2023-12-12 17:32 ` [PATCH v2 12/13] [gdb/symtab] Keep track of all parents " Tom de Vries
2023-12-12 17:32 ` [PATCH v2 13/13] [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the " Tom de Vries
2023-12-12 19:05 ` [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom Tromey
2023-12-13  9:58   ` Tom de Vries [this message]
2023-12-13 20:14     ` Tom Tromey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec1eaff5-b237-450f-98fb-2f19ec505b74@suse.de \
    --to=tdevries@suse.de \
    --cc=gdb-patches@sourceware.org \
    --cc=tom@tromey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).