public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
@ 2023-12-12 17:32 Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
                   ` (13 more replies)
  0 siblings, 14 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

I. INTRODUCTION

With target board unix, test-case gdb.cp/breakpoint-locs.exp passes reliably.

But when running with target board cc-with-dwz, we sometimes have 2 breakpoint
locations:
...
(gdb) break N1::C1::baz^M
Breakpoint 1 at 0x4004db: N1::C1::baz. (2 locations)^M
(gdb) PASS: gdb.cp/breakpoint-locs.exp: break N1::C1::baz
...
and sometimes only 1:
...
(gdb) break N1::C1::baz^M
Breakpoint 1 at 0x4004db: file breakpoint-locs.h, line 23.^M
(gdb) FAIL: gdb.cp/breakpoint-locs.exp: break N1::C1::baz
...

By using "maint set worker-threads 0" we can make the test-case fail reliably:
...
$ gdb -q -batch \
  -iex "maint set worker-threads 0" \
  outputs/gdb.cp/breakpoint-locs/breakpoint-locs \
  -ex "break N1::C1::baz"
...

And by adding -readnow, we can make it pass reliably.

This is PR symtab/30728, a gdb 12 -> 13 regression.  This series fixes it.

II. ANALYSIS

In order for the breakpoint to have two breakpoint locations, the symtabs for
both CUs breakpoint-locs.cc and breakpoint-locs-2.cc need to be expanded
(which explains why adding -readnow fixes it).  Let's call these CU1 and CU2.

CU1 is always expanded during the call:
...
cp_canonicalize_string_no_typedefs (string=string@entry=0x2fbbf90 "N1::C1::baz")
...
due to being the first CU to contain N1.

Then CU2 is expanded or not, depending on which per_cu (CU1 or CU2) this
cooked_index entry refers to (not shown in the dump atm):
...
   [12] ((cooked_index_entry *) 0x7f1200002f00)
    name:       baz
    canonical:  baz
    qualified:  N1::C1::baz
    DWARF tag:  DW_TAG_subprogram
    flags:      0x0 []
    DIE offset: 0x2e
    parent:     ((cooked_index_entry *) 0x7f1200002ed0) [C1]
...

The DIE offset 0x2e corresponds to this DIE in a PU, imported both from CU1
and CU2:
...
 <0><b>: Abbrev Number: 34 (DW_TAG_partial_unit)
 <1><14>: Abbrev Number: 37 (DW_TAG_namespace)
    <15>   DW_AT_name        : N1
 <2><19>: Abbrev Number: 36 (DW_TAG_class_type)
    <1a>   DW_AT_name        : C1
 <3><20>: Abbrev Number: 32 (DW_TAG_subprogram)
    <21>   DW_AT_external    : 1
    <21>   DW_AT_name        : baz
    <27>   DW_AT_linkage_name: _ZN2N12C13bazEv
    <2b>   DW_AT_accessibility: 1       (public)
    <2c>   DW_AT_declaration : 1
 <1><2e>: Abbrev Number: 35 (DW_TAG_subprogram)
    <2f>   DW_AT_specification: <0x20>
    <30>   DW_AT_inline      : 3        (declared as inline and inlined)
...

[ In the target board unix case, there's no PU and both CU1 and CU2 contain a
copy of 0x2e, and consequently both CUs are expanded. ]

Which per_cu the cooked_index entry refers to is decided by this import race in
cooked_indexer::ensure_cu_exists:
...
  /* When scanning, we only want to visit a given CU a single time.
     Doing this check here avoids self-imports as well.  */
  if (for_scanning)
    {
      bool nope = false;
      if (!per_cu->scanned.compare_exchange_strong (nope, true))
        return nullptr;
    }
...
[ Note that in the "maint set worker-threads 0" case, CU1 is guaranteed to win
the import race. ]

With gdb 12 (the partial symtab case) there was a similar problem, and we
relied on the DW_TAG_inlined_subroutine DIEs for the necessary expansion, see
commit f9b5d5ea18a ("[gdb/symtab] Fix missing breakpoint location for inlined
function").

The CU1 DW_TAG_inlined_subroutine DIE is at 0x148:
...
 <0><ee>: Abbrev Number: 16 (DW_TAG_compile_unit)
    <f3>   DW_AT_language    : 4        (C++)
    <f4>   DW_AT_name        : gdb.cp/breakpoint-locs.cc
 <1><13b>: Abbrev Number: 25 (DW_TAG_subprogram)
    <13c>   DW_AT_specification: <0x115>
    <13d>   DW_AT_low_pc      : 0x4004d7
    <145>   DW_AT_high_pc     : 16
    <146>   DW_AT_frame_base  : 1 byte block: 9c        (DW_OP_call_frame_cfa)
    <148>   DW_AT_GNU_all_call_sites: 1
 <2><148>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
    <149>   DW_AT_abstract_origin: <0x2e>
    <14d>   DW_AT_low_pc      : 0x4004db
    <155>   DW_AT_high_pc     : 9
    <156>   DW_AT_call_file   : 1
    <157>   DW_AT_call_line   : 22
...
and for CU2 there's a similar one at 0x1cf.

There are no corresponding DW_TAG_inlined_subroutine cooked_index entries,
because the DIEs are skipped during indexing.

So, we can fix this by adding DW_TAG_inlined_subroutine cooked_index entries.

III. FIX

It's easy to stop skipping the DW_TAG_inlined_subroutine cooked_index entries
during indexing.  However, that gives us cooked_index entries with incorrect
parents:
...
    [17] ((cooked_index_entry *) 0x2e68e40)
    name:       baz
    canonical:  baz
    qualified:  N1::foo::baz
    DWARF tag:  DW_TAG_inlined_subroutine
    flags:      0x0 []
    DIE offset: 0x148
    parent:     ((cooked_index_entry *) 0x2e68de0) [foo]

    [19] ((cooked_index_entry *) 0x2e68f90)
    name:       baz
    canonical:  baz
    qualified:  N1::bar::baz
    DWARF tag:  DW_TAG_inlined_subroutine
    flags:      0x0 []
    DIE offset: 0x1cf
    parent:     ((cooked_index_entry *) 0x2e68f30) [bar]
...

That is, the qualified name should be N1::C1::baz in both cases.  The C1 can
be found via the DW_AT_abstract_origin, but instead the DIE parents N1::foo
and N1::bar are used.

Again, we can fix this easily, by forcing to use DW_AT_abstract_origin, but
then we have a missing parent for the DIE at 0x1cf:
...
    [17] ((cooked_index_entry *) 0x4854140)
    name:       baz
    canonical:  baz
    qualified:  N1::C1::baz
    DWARF tag:  DW_TAG_inlined_subroutine
    flags:      0x0 []
    DIE offset: 0x148
    parent:     ((cooked_index_entry *) 0x4853f60) [C1]

    [19] ((cooked_index_entry *) 0x4854290)
    name:       baz
    canonical:  baz
    qualified:  baz
    DWARF tag:  DW_TAG_inlined_subroutine
    flags:      0x0 []
    DIE offset: 0x1cf
    parent:     ((cooked_index_entry *) 0)
...

The data structures that matter for tracking parent relations in cooked_index
entries are:
- m_die_range_map (currently known parent relations), and
- m_deferred_entries (deferred cooked_index entries due to unknown parent).

These only have the scope of a parsing a single CU (including PUs for which
the CU won the import race).

Since CU1 wins the import race, it has the information available to get the right
parent.

But that's not the case for CU2.

So in order to get the corrent parent for DIE 0x1cf, tracking parent relations
in the cooked index need to done at the inter-CU scope, implying also
inter-shard scope.

I've filed PR symtab/30846 "[gdb/symtab] incorrect parent for forward spec
(inter-cu case)" for part of the problem, and the series includes two
test-cases that specifically check for this type of problem:
- gdb.dwarf2/backward-spec-inter-cu.exp
- gdb.dwarf2/forward-spec-inter-cu.exp

IV. ALTERNATIVE APPROACH

An alternative way of approaching the problem is to observe that for the
target board unix case we have two cooked_index entries with per_cu CU1 and
CU2:
...
    [18] ((cooked_index_entry *) 0x3631250)
    name:       baz
    canonical:  baz
    qualified:  N1::C1::baz
    DWARF tag:  DW_TAG_subprogram
    flags:      0x0 []
    DIE offset: 0x173
    parent:     ((cooked_index_entry *) 0x3631130) [C1]

[20] ((cooked_index_entry *) 0x3631490)
    name:       baz
    canonical:  baz
    qualified:  N1::C1::baz
    DWARF tag:  DW_TAG_subprogram
    flags:      0x0 []
    DIE offset: 0x25a
    parent:     ((cooked_index_entry *) 0x3631370) [C1]
...
but for the dwz case we have a only a single entry:
...
   [12] ((cooked_index_entry *) 0x7f1200002f00)
    name:       baz
    canonical:  baz
    qualified:  N1::C1::baz
    DWARF tag:  DW_TAG_subprogram
    flags:      0x0 []
    DIE offset: 0x2e
    parent:     ((cooked_index_entry *) 0x7f1200002ed0) [C1]
...
with per_cu either CU1 or CU2.

If in the dwz case we'd have two of those, one with per_cu CU1 and one with
per_cu CU2, then the problem would also be solved.

Put differently, in the target board unix case we have two cooked index
entries, one with per_cu CU1, and one with per_cu CU2 because that's a
one-on-one reflection of what the dwarf looks like.

But in the dwz case, factoring out the DIE into a PU and importing it from
from both CUs does not alter the logical contents of the dwarf, so it would
make complete sense to have two entries in this case as well.

Of course, we'd want to exploit the dwz compression also inside gdb, and
consequently some scheme of a single cooked_index entry with the per_cu being
the actual PU and then mapping it to the complete set of CU importers would be
more optimal.  [ Note that in such a scheme, we'd have to be able to
differentiate between expand-one-CU and expand-all-CUs to prevent
expanding all importing CUs in the expand-one-CU case. ]

But the current internal representation is in fact corresponding to dwz dropping
the import for all but one CU (well, for the cooked index, not for the
expanded symtabs).

I think that the applied rationale is that if you're looking for a type in
the symtabs, the first one will do, so it doesn't make sense to keep track of
more than one.  This however does not work if you're looking for all the
occurrences of a type in the symtabs, as is the case here.

I suppose it could be a matter of taste whether the current internal
representation is:
- a smart solution, or
- incorrect representation of the dwarf.

The fact is that:
- if we choose the former, then correct dwz handling requires extra care (as
  set out in FIX), and
- if we choose the latter and fix it, then the dwz and non-dwz case are by
  design handled in the same way.

I'm not pursuing this alternative approach in this patch series for now, for
the reasons that:
- this solution was not chosen for gdb 12 either (though the patch went in
  without comments),
- because I'm hoping that it's a bit easier and safer to backport to gdb 13,
  and
- I'm not sure if other maintainers are supportive of this approach.

V. PATCH SERIES STRUCTURE

The series consists of several parts.

- patches that do some refactoring:
  - [gdb/symtab] Refactor condition in scan_attributes
  - [gdb/symtab] Factor out m_die_range_map usage
  - [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  - [gdb/symtab] Factor out m_deferred_entries usage
- patches that fix resolving deferred entries and add corresponding
  test-cases:
  - [gdb/symtab] Resolve deferred entries, inter-shard case
  - [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  - [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
- patches that add two optimizations to the parent calculation
  (one preparation patch, and two optimization patches):
  - [gdb/symtab] Keep track of processed DIEs in shard
  - [gdb/symtab] Resolve deferred entries, intra-shard case
  - [gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
- patches that fix PR symtab/30728 (logically one patch, but split up to be
  able to assess performance impact more precisely):
  - [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
  - [gdb/symtab] Keep track of all parents for cooked index
  - [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index

The fix for PR symtab/30728 consists of adding entries to the cooked index for
DW_TAG_inlined_subroutine DIEs, with correct parent entries.

Tested on x86_64-linux, with target boards unix, cc-with-dwz and
cc-with-dwz-m.

I've split out the two added test-cases into separate patches, for better
patch readability.  Unfortunately the patch "[gdb/symtab] Resolve deferred
entries, inter-shard case" is still somewhat large due to moving code about.

VI. PERFORMANCE

I've also done a performance experiment.  The command being run is:
...
$ gdb -q -batch /usr/lib/debug/usr/bin/gdb-12.1-150400.15.9.1.x86_64.debug
...
The .debug file doesn't contain any partial units, so this excercises the
non-dwz scenario.  The command is run 10 times, and the mean value is used.
I also added a 5 second sleep before each run to prevent thermal throttling
from messing up the measurements.

Furthermore, the processor on which I run the experiment is an i7-1250U, with
4 performance cores and 8 efficiency cores, so I run the experiment using
"taskset -c 0-3" to make sure only performance cores are used.

The script I use compares one gdb version to another, so the first column is
100% by definition, using the same base version, though a different run each time.

This is comparing the base version with itself, which gives an idea about
accuracy:
...
real: mean: 687.00    (100%) mean: 691.00    (100.58%)
user: mean: 2196.00   (100%) mean: 2196.00   (100.00%)
sys : mean: 108.00    (100%) mean: 106.00     (98.15%)
mem : mean: 345220.40 (100%) mean: 346112.40 (100.26%)
...

The mem measure is the "Maximum resident set size of the process during its
lifetime, in Kbytes".  The real/user/sys times are in miliseconds.

And this is comparing the patch series with the base version:
...
real: mean: 687.00    (100%) mean: 931.00    (135.52%)
user: mean: 2204.00   (100%) mean: 2938.00   (133.30%)
sys : mean: 102.00    (100%) mean: 137.00    (134.31%)
mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%)
...

In summary, the overall result is ~36% more real time and ~21% more memory.

The cumulative results of individual patches (leaving out the test-case
patches) are:
...
[gdb/symtab] Refactor condition in scan_attributes

real: mean: 690.00    (100%) mean: 691.00   (100.14%)
user: mean: 2201.00   (100%) mean: 2218.00  (100.77%)
sys : mean: 102.00    (100%) mean: 93.00     (91.18%)
mem : mean: 344846.80 (100%) mean: 344196.40 (99.81%)

[gdb/symtab] Factor out m_die_range_map usage

real: mean: 689.00    (100%) mean: 696.00    (101.02%)
user: mean: 2214.00   (100%) mean: 2212.00    (99.91%)
sys : mean: 98.00     (100%) mean: 91.00      (92.86%)
mem : mean: 345784.00 (100%) mean: 346644.00 (100.25%)

[gdb/symtab] Handle nullptr parent in parent_map::set_parent

real: mean: 688.00    (100%) mean: 690.00    (100.29%)
user: mean: 2215.00   (100%) mean: 2223.00   (100.36%)
sys : mean: 99.00     (100%) mean: 94.00      (94.95%)
mem : mean: 344220.40 (100%) mean: 345640.80 (100.41%)

[gdb/symtab] Factor out m_deferred_entries usage

real: mean: 693.00    (100%) mean: 689.00    (99.42%)
user: mean: 2191.00   (100%) mean: 2201.00  (100.46%)
sys : mean: 103.00    (100%) mean: 101.00    (98.06%)
mem : mean: 346022.00 (100%) mean: 345167.60 (99.75%)

[gdb/symtab] Resolve deferred entries, inter-shard case

real: mean: 688.00    (100%) mean: 711.00    (103.34%)
user: mean: 2204.00   (100%) mean: 2265.00   (102.77%)
sys : mean: 103.00    (100%) mean: 119.00    (115.53%)
mem : mean: 344649.60 (100%) mean: 375080.00 (108.83%)

[gdb/symtab] Keep track of processed DIEs in shard

real: mean: 684.00    (100%) mean: 728.00    (106.43%)
user: mean: 2208.00   (100%) mean: 2301.00   (104.21%)
sys : mean: 96.00     (100%) mean: 124.00    (129.17%)
mem : mean: 344005.60 (100%) mean: 374647.20 (108.91%)

[gdb/symtab] Resolve deferred entries, intra-shard case

real: mean: 693.00    (100%) mean: 724.00    (104.47%)
user: mean: 2213.00   (100%) mean: 2301.00   (103.98%)
sys : mean: 103.00    (100%) mean: 114.00    (110.68%)
mem : mean: 344722.80 (100%) mean: 375021.20 (108.79%)

[gdb/symtab] Don't defer backward refs, inter-cu intra-shard case

real: mean: 686.00    (100%) mean: 725.00    (105.69%)
user: mean: 2199.00   (100%) mean: 2299.00   (104.55%)
sys : mean: 106.00    (100%) mean: 119.00    (112.26%)
mem : mean: 344597.20 (100%) mean: 374937.20 (108.80%)

[gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index

real: mean: 690.00    (100%) mean: 835.00    (121.01%)
user: mean: 2186.00   (100%) mean: 2693.00   (123.19%)
sys : mean: 106.00    (100%) mean: 130.00    (122.64%)
mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%)

[gdb/symtab] Keep track of all parents for cooked index

real: mean: 684.00    (100%) mean: 903.00    (132.02%)
user: mean: 2240.00   (100%) mean: 2921.00   (130.40%)
sys : mean: 93.00     (100%) mean: 141.00    (151.61%)
mem : mean: 346353.20 (100%) mean: 411096.00 (118.69%)

[gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index

real: mean: 690.00    (100%) mean: 930.00    (134.78%)
user: mean: 2227.00   (100%) mean: 2960.00   (132.91%)
sys : mean: 96.00     (100%) mean: 138.00    (143.75%)
mem : mean: 343766.00 (100%) mean: 417973.60 (121.59%)
...

Ideally, the impact of a patch series implementing dwz support on a non-dwz
test-case is none.

But fixing dwz support requires tracking more data, and there's no way of
knowing in advance whether the additional data will be used or not.

Of course this can be accommodated by optimistically assuming that the data is
unnecessary, and when it turns out it was actually needed, partially or
completely restart indexing.  My suspicion is that this approach itself is
going to be complex, so I think it's best avoided.

An optimization that can be added on top of the current approach is to
opportunistically use data from other shards if already available.

An extension of this would be to wait until data from another shard is
available, which would void the need to handle inter-shard dependencies after
the parallel_for_each, but I think there's a risk for deadlock there.  The
imports as generated by dwz are a fairly predictable pattern, but there are
other producers of imports, for instance GCC LTO.

VII. TAGS

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30728
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846

VIII. SUBMISSION HISTORY

v2:
- added refactoring patch at the start
- dropped:
  - [gdb/symtab] Check effect in parent_map::set_parent
    (review comments, concern about speed)
  - [gdb/symtab] Add debug_handle_deferred_entries
    (doesn't produce readable output for worker-threads > 0)
  - [gdb/symtab] Add parent_map::dump
    (only used by dropped patch)
- fixed performance regression in "[gdb/symtab] Keep track of processed DIEs
  in shard"
- split up last patch of v1 into 3 patches to be able to assess performance
  impact more precisely
- use std::unique_ptr for m_die_range_map, m_deferred_entries and
  m_die_range_map_valid
- rewrote the last patch to work for any DIE, not just
  DW_TAG_inlined_subroutine DIEs.
- dropped the why narrative from the individual log messages, which proved
  difficult to maintain, focusing on the what instead, leaving the why for the
  cover letter.

v1:
- https://sourceware.org/pipermail/gdb-patches/2023-October/202882.html

rfc:
- https://sourceware.org/pipermail/gdb-patches/2023-September/202443.html.

Tom de Vries (13):
  [gdb/symtab] Refactor condition in scan_attributes
  [gdb/symtab] Factor out m_die_range_map usage
  [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  [gdb/symtab] Factor out m_deferred_entries usage
  [gdb/symtab] Resolve deferred entries, inter-shard case
  [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
  [gdb/symtab] Keep track of processed DIEs in shard
  [gdb/symtab] Resolve deferred entries, intra-shard case
  [gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
  [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
  [gdb/symtab] Keep track of all parents for cooked index
  [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index

 gdb/dwarf2/cooked-index.c                     | 113 +++++++++
 gdb/dwarf2/cooked-index.h                     | 157 ++++++++++++-
 gdb/dwarf2/read.c                             | 222 +++++++++++++-----
 .../gdb.dwarf2/backward-spec-inter-cu.exp     | 103 ++++++++
 .../gdb.dwarf2/forward-spec-inter-cu.exp      | 103 ++++++++
 5 files changed, 642 insertions(+), 56 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp
 create mode 100644 gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp


base-commit: 14f2724f80b156928b1a0b0f9733350558e35e63
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 18:28   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage Tom de Vries
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

In scan_attributes there's code:
...
	  if (new_reader->cu == reader->cu
	      && new_info_ptr > watermark_ptr
	      && *parent_entry == nullptr)
	    ...
	  else if (*parent_entry == nullptr)
	    ...
...
that uses the "*parent_entry == nullptr" condition twice.

Make this somewhat more readable by factoring out the condition:
...
	  if (*parent_entry == nullptr)
	    {
	      if (new_reader->cu == reader->cu
		  && new_info_ptr > watermark_ptr)
		...
	      else
		...
	    }
...

This also allows us to factor out "form_addr (origin_offset, origin_is_dwz)".

Tested on x86_64-linux.
---
 gdb/dwarf2/read.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 37cabe52ecc..a6700ba3b40 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16170,15 +16170,17 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu,
 	  const gdb_byte *new_info_ptr = (new_reader->buffer
 					  + to_underlying (origin_offset));
 
-	  if (new_reader->cu == reader->cu
-	      && new_info_ptr > watermark_ptr
-	      && *parent_entry == nullptr)
-	    *maybe_defer = form_addr (origin_offset, origin_is_dwz);
-	  else if (*parent_entry == nullptr)
+	  if (*parent_entry == nullptr)
 	    {
-	      CORE_ADDR lookup = form_addr (origin_offset, origin_is_dwz);
-	      void *obj = m_die_range_map.find (lookup);
-	      *parent_entry = static_cast <cooked_index_entry *> (obj);
+	      CORE_ADDR addr = form_addr (origin_offset, origin_is_dwz);
+	      if (new_reader->cu == reader->cu
+		  && new_info_ptr > watermark_ptr)
+		*maybe_defer = addr;
+	      else
+		{
+		  void *obj = m_die_range_map.find (addr);
+		  *parent_entry = static_cast <cooked_index_entry *> (obj);
+		}
 	    }
 
 	  unsigned int bytes_read;
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 18:31   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent Tom de Vries
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Factor out usage of cooked_indexer::m_die_range_map into new class parent_map
with member functions find_parent and set_parent, and static member function
form_addr.

Tested on x86_64-linux.
---
 gdb/dwarf2/cooked-index.h | 32 +++++++++++++++++++++++++++
 gdb/dwarf2/read.c         | 46 ++++++++++++++++++++-------------------
 2 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index 5675ea68bb8..fa274b37c20 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -239,6 +239,38 @@ struct cooked_index_entry : public allocate_on_obstack
 		    bool for_name) const;
 };
 
+class parent_map
+{
+public:
+  /* A helper function to turn a section offset into an address that
+     can be used in a parent_map.  */
+  static CORE_ADDR form_addr (sect_offset offset, bool is_dwz)
+  {
+    CORE_ADDR value = to_underlying (offset);
+    if (is_dwz)
+      value |= ((CORE_ADDR) 1) << (8 * sizeof (CORE_ADDR) - 1);
+    return value;
+  }
+
+  /* Find the parent of DIE LOOKUP.  */
+  const cooked_index_entry *find_parent (CORE_ADDR lookup) const
+  {
+    const void *obj = m_parent_map.find (lookup);
+    return static_cast<const cooked_index_entry *> (obj);
+  }
+
+  /* Set the parent of DIES in range [START, END] to PARENT_ENTRY.  */
+  void set_parent (CORE_ADDR start, CORE_ADDR end,
+		   const cooked_index_entry *parent_entry)
+  {
+    m_parent_map.set_empty (start, end, (void *)parent_entry);
+  }
+
+private:
+  /* An addrmap that maps from section offsets to cooked_index_entry *.  */
+  addrmap_mutable m_parent_map;
+};
+
 class cooked_index;
 
 /* An index of interesting DIEs.  This is "cooked", in contrast to a
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index a6700ba3b40..806cc5902b3 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4541,16 +4541,6 @@ class cooked_indexer
 
 private:
 
-  /* A helper function to turn a section offset into an address that
-     can be used in an addrmap.  */
-  CORE_ADDR form_addr (sect_offset offset, bool is_dwz)
-  {
-    CORE_ADDR value = to_underlying (offset);
-    if (is_dwz)
-      value |= ((CORE_ADDR) 1) << (8 * sizeof (CORE_ADDR) - 1);
-    return value;
-  }
-
   /* A helper function to scan the PC bounds of READER and record them
      in the storage's addrmap.  */
   void check_bounds (cutu_reader *reader);
@@ -4618,7 +4608,20 @@ class cooked_indexer
   /* An addrmap that maps from section offsets (see the form_addr
      method) to newly-created entries.  See m_deferred_entries to
      understand this.  */
-  addrmap_mutable m_die_range_map;
+  parent_map m_die_range_map;
+
+  /* Find the parent of DIE LOOKUP.  */
+  const cooked_index_entry *find_parent (CORE_ADDR lookup) const
+  {
+    return m_die_range_map.find_parent (lookup);
+  }
+
+  /* Set the parent of DIES in range [START, END] to PARENT_ENTRY.  */
+  void set_parent (CORE_ADDR start, CORE_ADDR end,
+		   const cooked_index_entry *parent_entry)
+  {
+    m_die_range_map.set_parent (start, end, parent_entry);
+  }
 
   /* A single deferred entry.  */
   struct deferred_entry
@@ -16172,15 +16175,13 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu,
 
 	  if (*parent_entry == nullptr)
 	    {
-	      CORE_ADDR addr = form_addr (origin_offset, origin_is_dwz);
+	      CORE_ADDR addr
+		= parent_map::form_addr (origin_offset, origin_is_dwz);
 	      if (new_reader->cu == reader->cu
 		  && new_info_ptr > watermark_ptr)
 		*maybe_defer = addr;
 	      else
-		{
-		  void *obj = m_die_range_map.find (addr);
-		  *parent_entry = static_cast <cooked_index_entry *> (obj);
-		}
+		*parent_entry = find_parent (addr);
 	    }
 
 	  unsigned int bytes_read;
@@ -16298,11 +16299,13 @@ cooked_indexer::recurse (cutu_reader *reader,
     {
       /* Both start and end are inclusive, so use both "+ 1" and "- 1" to
 	 limit the range to the children of parent_entry.  */
-      CORE_ADDR start = form_addr (parent_entry->die_offset + 1,
-				   reader->cu->per_cu->is_dwz);
-      CORE_ADDR end = form_addr (sect_offset (info_ptr - 1 - reader->buffer),
+      CORE_ADDR start
+	= parent_map::form_addr (parent_entry->die_offset + 1,
+				 reader->cu->per_cu->is_dwz);
+      CORE_ADDR end
+	= parent_map::form_addr (sect_offset (info_ptr - 1 - reader->buffer),
 				 reader->cu->per_cu->is_dwz);
-      m_die_range_map.set_empty (start, end, (void *) parent_entry);
+      set_parent (start, end, parent_entry);
     }
 
   return info_ptr;
@@ -16475,8 +16478,7 @@ cooked_indexer::make_index (cutu_reader *reader)
 
   for (const auto &entry : m_deferred_entries)
     {
-      void *obj = m_die_range_map.find (entry.spec_offset);
-      cooked_index_entry *parent = static_cast<cooked_index_entry *> (obj);
+      const cooked_index_entry *parent = find_parent (entry.spec_offset);
       m_index_storage->add (entry.die_offset, entry.tag, entry.flags,
 			    entry.name, parent, m_per_cu);
     }
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 18:34   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage Tom de Vries
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Set_parent uses m_die_range_map.set_empty, which doesn't allow
parent_entry == nullptr.

So it may be necessary to guard calls to set_parent with
"if (parent_entry != nullptr)".

Fix this by handling the parent_entry == nullptr case in set_parent.

Tested on x86_64-linux.
---
 gdb/dwarf2/cooked-index.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index fa274b37c20..3c43717fc33 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -263,7 +263,9 @@ class parent_map
   void set_parent (CORE_ADDR start, CORE_ADDR end,
 		   const cooked_index_entry *parent_entry)
   {
-    m_parent_map.set_empty (start, end, (void *)parent_entry);
+    /* Calling set_empty with nullptr is currently not allowed.  */
+    if (parent_entry != nullptr)
+      m_parent_map.set_empty (start, end, (void *)parent_entry);
   }
 
 private:
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (2 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 18:39   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case Tom de Vries
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Factor out usage of cooked_indexer::m_deferred_entries in new member
functions defer_entry, handle_deferred_entries and resolve_deferred_entry.

Tested on x86_64-linux.
---
 gdb/dwarf2/read.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 806cc5902b3..cf71d4c55ce 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4641,6 +4641,32 @@ class cooked_indexer
      we'll know the containing context of all the DIEs that we might
      have scanned.  This vector stores these deferred entries.  */
   std::vector<deferred_entry> m_deferred_entries;
+
+  /* Defer creating a cooked_index_entry corresponding to deferred_entry DE.  */
+  void defer_entry (const deferred_entry &de)
+  {
+    m_deferred_entries.push_back (de);
+  }
+
+  /* Create a cooked_index_entry corresponding to deferred_entry DE with
+     parent PARENT_ENTRY.  */
+  const cooked_index_entry *resolve_deferred_entry
+    (const deferred_entry &de, const cooked_index_entry *parent_entry)
+  {
+    return m_index_storage->add (de.die_offset, de.tag, de.flags, de.name,
+				 parent_entry, m_per_cu);
+  }
+
+  /* Create cooked_index_entries for the deferred entries.  */
+  void handle_deferred_entries ()
+  {
+    for (const auto &entry : m_deferred_entries)
+      {
+	const cooked_index_entry *parent_entry
+	  = find_parent (entry.spec_offset);
+	resolve_deferred_entry (entry, parent_entry);
+      }
+  }
 };
 
 /* Subroutine of dwarf2_build_psymtabs_hard to simplify it.
@@ -16371,7 +16397,7 @@ cooked_indexer::index_dies (cutu_reader *reader,
       if (name != nullptr)
 	{
 	  if (defer != 0)
-	    m_deferred_entries.push_back ({
+	    defer_entry ({
 		this_die, name, defer, abbrev->tag, flags
 	      });
 	  else
@@ -16476,12 +16502,7 @@ cooked_indexer::make_index (cutu_reader *reader)
     return;
   index_dies (reader, reader->info_ptr, nullptr, false);
 
-  for (const auto &entry : m_deferred_entries)
-    {
-      const cooked_index_entry *parent = find_parent (entry.spec_offset);
-      m_index_storage->add (entry.die_offset, entry.tag, entry.flags,
-			    entry.name, parent, m_per_cu);
-    }
+  handle_deferred_entries ();
 }
 
 /* An implementation of quick_symbol_functions for the cooked DWARF
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (3 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 19:27   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp Tom de Vries
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Generically solve the case of inter-CU dependencies, including inter-shard
dependencies:
- mark deferred entries in new data structure m_deferred,
- return &parent_map::deferred in find_parent for deferred entries,
- defer all intra-CU dependencies that depend on deferred entries,
- defer all inter-CU dependencies (note that two subsequent patches implement
  optimizations to deal with this more optimally),
- move m_die_range_map and m_deferred_dies to cooked_index_shard, and
- move handle_deferred_dies to the cooked_index, where it is called in the
  constructor, and update it to handle the intra-shard case.

Handling units from the .debug_info section alongside units from the
.debug_types section requires us to extend form_addr to take is_debug_types
into account.

Tested on x86_64-linux.

PR symtab/30846
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846
---
 gdb/dwarf2/cooked-index.c |  77 +++++++++++++++++++++++
 gdb/dwarf2/cooked-index.h | 105 ++++++++++++++++++++++++++++++-
 gdb/dwarf2/read.c         | 128 ++++++++++++++++++++++----------------
 3 files changed, 254 insertions(+), 56 deletions(-)

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index ba77f9cb373..ba37d4a820c 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -228,6 +228,12 @@ cooked_index_entry::write_scope (struct obstack *storage,
 
 /* See cooked-index.h.  */
 
+cooked_index_entry parent_map::deferred((sect_offset)0, (dwarf_tag)0,
+					(cooked_index_flag)0, nullptr,
+					nullptr, nullptr);
+
+/* See cooked-index.h.  */
+
 const cooked_index_entry *
 cooked_index_shard::add (sect_offset die_offset, enum dwarf_tag tag,
 			 cooked_index_flag flags, const char *name,
@@ -450,6 +456,8 @@ cooked_index_shard::wait (bool allow_quit) const
 cooked_index::cooked_index (vec_type &&vec)
   : m_vector (std::move (vec))
 {
+  handle_deferred_entries ();
+
   for (auto &idx : m_vector)
     idx->finalize ();
 
@@ -648,6 +656,75 @@ cooked_index::maybe_write_index (dwarf2_per_bfd *per_bfd,
   global_index_cache.store (per_bfd, ctx);
 }
 
+/* See cooked-index.h.  */
+
+const cooked_index_entry *
+cooked_index_shard::resolve_deferred_entry
+  (const deferred_entry &de, const cooked_index_entry *parent_entry)
+{
+  reset_parent_deferred (parent_map::form_addr (de.die_offset, de.per_cu_2->is_dwz,
+						de.per_cu_2->is_debug_types));
+  return add (de.die_offset, de.tag, de.flags, de.name,
+	      parent_entry, de.per_cu);
+}
+
+/* See cooked-index.h.  */
+
+const cooked_index_entry *
+cooked_index::find_parent_deferred_entry
+  (const cooked_index_shard::deferred_entry &entry) const
+{
+  const cooked_index_entry *parent_entry = nullptr;
+  for (auto &parent_map_shard : m_vector)
+    {
+      auto res = parent_map_shard->find_parent (entry.spec_offset);
+      if (res != nullptr)
+	{
+	  parent_entry = res;
+	  break;
+	}
+    }
+
+  return parent_entry;
+}
+
+/* See cooked-index.h.  */
+
+void
+cooked_index::handle_deferred_entries ()
+{
+  bool changed;
+  bool deferred;
+  do
+    {
+      deferred = false;
+      changed = false;
+      for (auto &shard : m_vector)
+	for (auto it = shard->m_deferred_entries->begin ();
+	     it != shard->m_deferred_entries->end (); )
+	  {
+	    const cooked_index_entry *parent_entry
+	      = find_parent_deferred_entry (*it);
+	    if (parent_entry == &parent_map::deferred)
+	      {
+		deferred = true;
+		it++;
+		continue;
+	      }
+	    shard->resolve_deferred_entry (*it, parent_entry);
+	    it = shard->m_deferred_entries->erase (it);
+	    changed = true;
+	  }
+    }
+  while (changed && deferred);
+
+  for (auto &shard : m_vector)
+    {
+      shard->m_die_range_map.reset (nullptr);
+      shard->m_deferred_entries.reset (nullptr);
+    }
+}
+
 /* Wait for all the index cache entries to be written before gdb
    exits.  */
 static void
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index 3c43717fc33..de54a788c42 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -34,6 +34,7 @@
 #include "dwarf2/mapped-index.h"
 #include "dwarf2/tag.h"
 #include "gdbsupport/range-chain.h"
+#include <unordered_set>
 
 struct dwarf2_per_cu_data;
 struct dwarf2_per_bfd;
@@ -242,19 +243,29 @@ struct cooked_index_entry : public allocate_on_obstack
 class parent_map
 {
 public:
+  /* A dummy cooked_index_entry to mark that computing the parent has been
+     deferred.  */
+  static cooked_index_entry deferred;
+
   /* A helper function to turn a section offset into an address that
      can be used in a parent_map.  */
-  static CORE_ADDR form_addr (sect_offset offset, bool is_dwz)
+  static CORE_ADDR form_addr (sect_offset offset, bool is_dwz,
+			      bool is_debug_types)
   {
     CORE_ADDR value = to_underlying (offset);
     if (is_dwz)
       value |= ((CORE_ADDR) 1) << (8 * sizeof (CORE_ADDR) - 1);
+    if (is_debug_types)
+      value |= ((CORE_ADDR) 1) << (8 * sizeof (CORE_ADDR) - 2);
     return value;
   }
 
   /* Find the parent of DIE LOOKUP.  */
   const cooked_index_entry *find_parent (CORE_ADDR lookup) const
   {
+    if (m_deferred.find (lookup) != m_deferred.end ())
+      return &parent_map::deferred;
+
     const void *obj = m_parent_map.find (lookup);
     return static_cast<const cooked_index_entry *> (obj);
   }
@@ -265,12 +276,28 @@ class parent_map
   {
     /* Calling set_empty with nullptr is currently not allowed.  */
     if (parent_entry != nullptr)
-      m_parent_map.set_empty (start, end, (void *)parent_entry);
+      {
+	gdb_assert (parent_entry != &parent_map::deferred);
+	m_parent_map.set_empty (start, end, (void *)parent_entry);
+      }
+  }
+
+  void set_parent_deferred (CORE_ADDR addr)
+  {
+    m_deferred.emplace (addr);
+  }
+
+  void reset_parent_deferred (CORE_ADDR addr)
+  {
+    m_deferred.erase (addr);
   }
 
 private:
   /* An addrmap that maps from section offsets to cooked_index_entry *.  */
   addrmap_mutable m_parent_map;
+
+  /* DIEs that are deffered.  */
+  std::unordered_set<CORE_ADDR> m_deferred;
 };
 
 class cooked_index;
@@ -285,7 +312,12 @@ class cooked_index;
 class cooked_index_shard
 {
 public:
-  cooked_index_shard () = default;
+  cooked_index_shard ()
+  {
+    m_die_range_map.reset (new parent_map);
+    m_deferred_entries.reset (new std::vector<deferred_entry>);
+  }
+
   DISABLE_COPY_AND_ASSIGN (cooked_index_shard);
 
   /* Create a new cooked_index_entry and register it with this object.
@@ -329,6 +361,52 @@ class cooked_index_shard
      for completion, will be returned.  */
   range find (const std::string &name, bool completing) const;
 
+  /* Find the parent of DIE LOOKUP.  */
+  const cooked_index_entry *
+  find_parent (CORE_ADDR lookup) const
+  {
+    return m_die_range_map->find_parent (lookup);
+  }
+
+  /* Set the parent of DIES in range [START, END] to PARENT_ENTRY.  */
+  void set_parent (CORE_ADDR start, CORE_ADDR end,
+		   const cooked_index_entry *parent_entry)
+  {
+    m_die_range_map->set_parent (start, end, parent_entry);
+  }
+
+  void set_parent_deferred (CORE_ADDR addr)
+  {
+    m_die_range_map->set_parent_deferred (addr);
+  }
+
+  void reset_parent_deferred (CORE_ADDR addr)
+  {
+    m_die_range_map->reset_parent_deferred (addr);
+  }
+
+  /* A single deferred entry.  See m_deferred_entries.  */
+  struct deferred_entry
+  {
+    sect_offset die_offset;
+    const char *name;
+    CORE_ADDR spec_offset;
+    dwarf_tag tag;
+    cooked_index_flag flags;
+    dwarf2_per_cu_data *per_cu;
+    dwarf2_per_cu_data *per_cu_2;
+  };
+
+  /* Defer creating a cooked_index_entry corresponding to DEFERRED.  */
+  void defer_entry (deferred_entry de)
+  {
+    m_deferred_entries->push_back (de);
+  }
+
+  /* Variant of add that takes a deferred_entry as parameter.  */
+  const cooked_index_entry *resolve_deferred_entry
+    (const deferred_entry &entry, const cooked_index_entry *parent_entry);
+
 private:
 
   /* Return the entry that is believed to represent the program's
@@ -386,6 +464,20 @@ class cooked_index_shard
      that the 'get' method is never called on this future, only
      'wait'.  */
   gdb::future<void> m_future;
+
+  /* An addrmap that maps from section offsets (see the form_addr
+     method) to newly-created entries.  See m_deferred_entries to
+     understand this.  */
+  std::unique_ptr<parent_map> m_die_range_map;
+
+  /* The generated DWARF can sometimes have the declaration for a
+     method in a class (or perhaps namespace) scope, with the
+     definition appearing outside this scope... just one of the many
+     bad things about DWARF.  In order to handle this situation, we
+     defer certain entries until the end of scanning, at which point
+     we'll know the containing context of all the DIEs that we might
+     have scanned.  This vector stores these deferred entries.  */
+  std::unique_ptr<std::vector<deferred_entry>> m_deferred_entries;
 };
 
 /* The main index of DIEs.  The parallel DIE indexers create
@@ -469,6 +561,13 @@ class cooked_index : public dwarf_scanner_base
 
 private:
 
+  /* Find the parent corresponding to deferred entry ENTRY.  */
+  const cooked_index_entry *find_parent_deferred_entry
+    (const cooked_index_shard::deferred_entry &entry) const;
+
+  /* Create cooked_index_entries for the deferred entries.  */
+  void handle_deferred_entries ();
+
   /* Maybe write the index to the index cache.  */
   void maybe_write_index (dwarf2_per_bfd *per_bfd,
 			  const index_cache_store_context &);
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index cf71d4c55ce..ec58125499c 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4491,6 +4491,31 @@ class cooked_index_storage
     return &m_addrmap;
   }
 
+  /* Find the parent of DIE LOOKUP.  */
+  const cooked_index_entry *find_parent (CORE_ADDR lookup)
+  {
+    return m_index->find_parent (lookup);
+  }
+
+  /* Set the parent of DIES in range [START, END] to PARENT_ENTRY.  */
+  void set_parent (CORE_ADDR start, CORE_ADDR end,
+		   const cooked_index_entry *parent_entry)
+  {
+    m_index->set_parent (start, end, parent_entry);
+  }
+
+  /* Set the parent of DIE at ADDR as deferred.  */
+  void set_parent_deferred (CORE_ADDR addr)
+  {
+    m_index->set_parent_deferred (addr);
+  }
+
+  /* Defer creating a cooked_index_entry corresponding to deferred_entry DE.  */
+  void defer_entry (cooked_index_shard::deferred_entry de)
+  {
+    m_index->defer_entry (de);
+  }
+
 private:
 
   /* Hash function for a cutu_reader.  */
@@ -4613,59 +4638,26 @@ class cooked_indexer
   /* Find the parent of DIE LOOKUP.  */
   const cooked_index_entry *find_parent (CORE_ADDR lookup) const
   {
-    return m_die_range_map.find_parent (lookup);
+    return m_index_storage->find_parent (lookup);
   }
 
   /* Set the parent of DIES in range [START, END] to PARENT_ENTRY.  */
   void set_parent (CORE_ADDR start, CORE_ADDR end,
 		   const cooked_index_entry *parent_entry)
   {
-    m_die_range_map.set_parent (start, end, parent_entry);
+    m_index_storage->set_parent (start, end, parent_entry);
   }
 
-  /* A single deferred entry.  */
-  struct deferred_entry
-  {
-    sect_offset die_offset;
-    const char *name;
-    CORE_ADDR spec_offset;
-    dwarf_tag tag;
-    cooked_index_flag flags;
-  };
-
-  /* The generated DWARF can sometimes have the declaration for a
-     method in a class (or perhaps namespace) scope, with the
-     definition appearing outside this scope... just one of the many
-     bad things about DWARF.  In order to handle this situation, we
-     defer certain entries until the end of scanning, at which point
-     we'll know the containing context of all the DIEs that we might
-     have scanned.  This vector stores these deferred entries.  */
-  std::vector<deferred_entry> m_deferred_entries;
-
-  /* Defer creating a cooked_index_entry corresponding to deferred_entry DE.  */
-  void defer_entry (const deferred_entry &de)
+  /* Set the parent of DIE at ADDR as deferred.  */
+  void set_parent_deferred (CORE_ADDR addr)
   {
-    m_deferred_entries.push_back (de);
+    m_index_storage->set_parent_deferred (addr);
   }
 
-  /* Create a cooked_index_entry corresponding to deferred_entry DE with
-     parent PARENT_ENTRY.  */
-  const cooked_index_entry *resolve_deferred_entry
-    (const deferred_entry &de, const cooked_index_entry *parent_entry)
-  {
-    return m_index_storage->add (de.die_offset, de.tag, de.flags, de.name,
-				 parent_entry, m_per_cu);
-  }
-
-  /* Create cooked_index_entries for the deferred entries.  */
-  void handle_deferred_entries ()
+  /* Defer creating a cooked_index_entry corresponding to deferred_entry DE.  */
+  void defer_entry (const cooked_index_shard::deferred_entry &de)
   {
-    for (const auto &entry : m_deferred_entries)
-      {
-	const cooked_index_entry *parent_entry
-	  = find_parent (entry.spec_offset);
-	resolve_deferred_entry (entry, parent_entry);
-      }
+    m_index_storage->defer_entry (de);
   }
 };
 
@@ -16201,13 +16193,36 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu,
 
 	  if (*parent_entry == nullptr)
 	    {
+	      gdb_assert (reader->cu->per_cu->is_debug_types
+			  == new_reader->cu->per_cu->is_debug_types);
 	      CORE_ADDR addr
-		= parent_map::form_addr (origin_offset, origin_is_dwz);
-	      if (new_reader->cu == reader->cu
-		  && new_info_ptr > watermark_ptr)
-		*maybe_defer = addr;
+		= parent_map::form_addr (origin_offset, origin_is_dwz,
+					 reader->cu->per_cu->is_debug_types);
+	      if (new_reader->cu == reader->cu)
+		{
+		  /* Intra-CU case.  */
+		  if (new_info_ptr > watermark_ptr)
+		    {
+		      /* Defer because origin is not read yet.  */
+		      *maybe_defer = addr;
+		    }
+		  else
+		    {
+		      auto tmp = find_parent (addr);
+		      if (tmp == &parent_map::deferred)
+			{
+			  /* Defer because origin is deferred.  */
+			  *maybe_defer = addr;
+			}
+		      else
+			*parent_entry = tmp;
+		    }
+		}
 	      else
-		*parent_entry = find_parent (addr);
+		{
+		  /* Inter-CU case.  */
+		  *maybe_defer = addr;
+		}
 	    }
 
 	  unsigned int bytes_read;
@@ -16327,10 +16342,12 @@ cooked_indexer::recurse (cutu_reader *reader,
 	 limit the range to the children of parent_entry.  */
       CORE_ADDR start
 	= parent_map::form_addr (parent_entry->die_offset + 1,
-				 reader->cu->per_cu->is_dwz);
+				 reader->cu->per_cu->is_dwz,
+				 reader->cu->per_cu->is_debug_types);
       CORE_ADDR end
 	= parent_map::form_addr (sect_offset (info_ptr - 1 - reader->buffer),
-				 reader->cu->per_cu->is_dwz);
+				 reader->cu->per_cu->is_dwz,
+				 reader->cu->per_cu->is_debug_types);
       set_parent (start, end, parent_entry);
     }
 
@@ -16397,9 +16414,16 @@ cooked_indexer::index_dies (cutu_reader *reader,
       if (name != nullptr)
 	{
 	  if (defer != 0)
-	    defer_entry ({
-		this_die, name, defer, abbrev->tag, flags
-	      });
+	    {
+	      CORE_ADDR addr
+		= parent_map::form_addr (this_die, reader->cu->per_cu->is_dwz,
+					 reader->cu->per_cu->is_debug_types);
+	      set_parent_deferred (addr);
+	      defer_entry ({
+		  this_die, name, defer, abbrev->tag, flags, m_per_cu,
+		  reader->cu->per_cu
+		});
+	    }
 	  else
 	    this_entry = m_index_storage->add (this_die, abbrev->tag, flags,
 					       name, this_parent_entry,
@@ -16501,8 +16525,6 @@ cooked_indexer::make_index (cutu_reader *reader)
   if (!reader->comp_unit_die->has_children)
     return;
   index_dies (reader, reader->info_ptr, nullptr, false);
-
-  handle_deferred_entries ();
 }
 
 /* An implementation of quick_symbol_functions for the cooked DWARF
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (4 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 19:28   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp Tom de Vries
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Add a regression test for PR symtab/30846.

Tested on x86_64-linux.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846
---
 .../gdb.dwarf2/forward-spec-inter-cu.exp      | 103 ++++++++++++++++++
 1 file changed, 103 insertions(+)
 create mode 100644 gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp

diff --git a/gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp b/gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp
new file mode 100644
index 00000000000..d8367b0a162
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp
@@ -0,0 +1,103 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check that the DWARF reader works with a a DW_AT_specification that
+# refers to a later DIE.  Inter-cu variant of forward-spec.exp.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+require dwarf2_support
+
+standard_testfile main.c -debug.S
+
+# Set up the DWARF for the test.
+set asm_file [standard_output_file $srcfile2]
+Dwarf::assemble $asm_file {
+    global srcfile
+
+    declare_labels spec
+
+    cu {} {
+	DW_TAG_compile_unit {
+	    {DW_AT_language @DW_LANG_C_plus_plus}
+	} {
+	    # The new indexer has special code to compute the full
+	    # name of an object that uses a specification that appears
+	    # later in the DWARF.
+	    DW_TAG_variable {
+		{DW_AT_specification %$spec}
+		{DW_AT_location {
+		    DW_OP_const1u 23
+		    DW_OP_stack_value
+		} SPECIAL_expr}
+	    }
+	}
+    }
+
+    cu {} {
+	DW_TAG_compile_unit {
+	    {DW_AT_language @DW_LANG_C_plus_plus}
+	} {
+	    declare_labels myint
+
+	    myint: DW_TAG_base_type {
+		{DW_AT_byte_size 4 DW_FORM_sdata}
+		{DW_AT_encoding  @DW_ATE_signed}
+		{DW_AT_name      myint}
+	    }
+
+	    DW_TAG_namespace {
+		{DW_AT_name ns}
+	    } {
+		spec: DW_TAG_variable {
+		    {DW_AT_name v}
+		    {DW_AT_type :$myint}
+		    {DW_AT_declaration 1 DW_FORM_flag_present}
+		}
+	    }
+	}
+    }
+}
+
+if {[build_executable "failed to build executable" ${testfile} \
+	 [list $srcfile $asm_file] {nodebug}]} {
+    return -1
+}
+
+set eol "\r\n"
+set ws "\[ \t\]"
+
+set worker_threads_list {}
+
+# Exercises the intra-shard case.
+lappend worker_threads_list 0
+
+# Might exercise the inter-shard case.
+lappend worker_threads_list default
+
+foreach_with_prefix worker_threads $worker_threads_list {
+
+    clean_restart
+
+    if { $worker_threads != "default" } {
+	gdb_test_no_output "maint set worker-threads $worker_threads"
+    }
+
+    gdb_load $binfile
+
+    gdb_test "maint print objfiles" "$eol$ws+qualified:$ws+ns::v$eol.*" \
+	"v has parent ns"
+}
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (5 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 19:29   ` Tom Tromey
  2023-12-12 17:32 ` [PATCH v2 08/13] [gdb/symtab] Keep track of processed DIEs in shard Tom de Vries
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Add another regression test for PR symtab/30846.

Tested on x86_64-linux.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846
---
 .../gdb.dwarf2/backward-spec-inter-cu.exp     | 103 ++++++++++++++++++
 1 file changed, 103 insertions(+)
 create mode 100644 gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp

diff --git a/gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp b/gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp
new file mode 100644
index 00000000000..59b3db50dbb
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp
@@ -0,0 +1,103 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Check that the DWARF reader works with a a DW_AT_specification that
+# refers to an earlier DIE.  Inter-cu variant of forward-spec.exp.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+require dwarf2_support
+
+standard_testfile main.c -debug.S
+
+# Set up the DWARF for the test.
+set asm_file [standard_output_file $srcfile2]
+Dwarf::assemble $asm_file {
+    global srcfile
+
+    declare_labels spec
+
+    cu {} {
+	DW_TAG_compile_unit {
+	    {DW_AT_language @DW_LANG_C_plus_plus}
+	} {
+	    declare_labels myint
+
+	    myint: DW_TAG_base_type {
+		{DW_AT_byte_size 4 DW_FORM_sdata}
+		{DW_AT_encoding  @DW_ATE_signed}
+		{DW_AT_name      myint}
+	    }
+
+	    DW_TAG_namespace {
+		{DW_AT_name ns}
+	    } {
+		spec: DW_TAG_variable {
+		    {DW_AT_name v}
+		    {DW_AT_type :$myint}
+		    {DW_AT_declaration 1 DW_FORM_flag_present}
+		}
+	    }
+	}
+    }
+
+    cu {} {
+	DW_TAG_compile_unit {
+	    {DW_AT_language @DW_LANG_C_plus_plus}
+	} {
+	    # The new indexer has special code to compute the full
+	    # name of an object that uses a specification that appears
+	    # later in the DWARF.
+	    DW_TAG_variable {
+		{DW_AT_specification %$spec}
+		{DW_AT_location {
+		    DW_OP_const1u 23
+		    DW_OP_stack_value
+		} SPECIAL_expr}
+	    }
+	}
+    }
+}
+
+if {[build_executable "failed to build executable" ${testfile} \
+	 [list $srcfile $asm_file] {nodebug}]} {
+    return -1
+}
+
+set eol "\r\n"
+set ws "\[ \t\]"
+
+set worker_threads_list {}
+
+# Exercises the intra-shard case.
+lappend worker_threads_list 0
+
+# Might exercise the inter-shard case.
+lappend worker_threads_list default
+
+foreach_with_prefix worker_threads $worker_threads_list {
+
+    clean_restart
+
+    if { $worker_threads != "default" } {
+	gdb_test_no_output "maint set worker-threads $worker_threads"
+    }
+
+    gdb_load $binfile
+
+    gdb_test "maint print objfiles" "$eol$ws+qualified:$ws+ns::v$eol.*" \
+	"v has parent ns"
+}
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 08/13] [gdb/symtab] Keep track of processed DIEs in shard
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (6 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 09/13] [gdb/symtab] Resolve deferred entries, intra-shard case Tom de Vries
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

For optimizations in two following patches, we keep track in each shard which
DIEs have been processed.

Tested on x86_64-linux.
---
 gdb/dwarf2/cooked-index.c |  1 +
 gdb/dwarf2/cooked-index.h | 15 +++++++++++++++
 gdb/dwarf2/read.c         | 24 ++++++++++++++++++++++++
 3 files changed, 40 insertions(+)

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index ba37d4a820c..1368636d4b3 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -722,6 +722,7 @@ cooked_index::handle_deferred_entries ()
     {
       shard->m_die_range_map.reset (nullptr);
       shard->m_deferred_entries.reset (nullptr);
+      shard->m_die_range_map_valid.reset (nullptr);
     }
 }
 
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index de54a788c42..9b233e0f344 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -316,6 +316,7 @@ class cooked_index_shard
   {
     m_die_range_map.reset (new parent_map);
     m_deferred_entries.reset (new std::vector<deferred_entry>);
+    m_die_range_map_valid.reset (new addrmap_mutable);
   }
 
   DISABLE_COPY_AND_ASSIGN (cooked_index_shard);
@@ -407,6 +408,18 @@ class cooked_index_shard
   const cooked_index_entry *resolve_deferred_entry
     (const deferred_entry &entry, const cooked_index_entry *parent_entry);
 
+  /* Mark parents in range [START, END] as valid .  */
+  void set_parent_valid (CORE_ADDR start, CORE_ADDR end)
+  {
+    m_die_range_map_valid->set_empty (start, end, (void *) 1);
+  }
+
+  /* Return true if find_parents can be relied upon.  */
+  bool parent_valid (CORE_ADDR addr)
+  {
+    return m_die_range_map_valid->find (addr) != nullptr;
+  }
+
 private:
 
   /* Return the entry that is believed to represent the program's
@@ -470,6 +483,8 @@ class cooked_index_shard
      understand this.  */
   std::unique_ptr<parent_map> m_die_range_map;
 
+  std::unique_ptr<addrmap> m_die_range_map_valid;
+
   /* The generated DWARF can sometimes have the declaration for a
      method in a class (or perhaps namespace) scope, with the
      definition appearing outside this scope... just one of the many
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index ec58125499c..e8d5f0a1a9c 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4516,6 +4516,12 @@ class cooked_index_storage
     m_index->defer_entry (de);
   }
 
+  /* Mark parents in range [START, END] as valid .  */
+  void set_parent_valid (CORE_ADDR start, CORE_ADDR end)
+  {
+    m_index->set_parent_valid (start, end);
+  }
+
 private:
 
   /* Hash function for a cutu_reader.  */
@@ -4659,6 +4665,11 @@ class cooked_indexer
   {
     m_index_storage->defer_entry (de);
   }
+
+  void set_parent_valid (CORE_ADDR start, CORE_ADDR end)
+  {
+    m_index_storage->set_parent_valid (start, end);
+  }
 };
 
 /* Subroutine of dwarf2_build_psymtabs_hard to simplify it.
@@ -16364,6 +16375,11 @@ cooked_indexer::index_dies (cutu_reader *reader,
 			     + to_underlying (reader->cu->header.sect_off)
 			     + reader->cu->header.get_length_with_initial ());
 
+  const CORE_ADDR start_cu
+    = parent_map::form_addr (sect_offset (info_ptr - reader->buffer),
+			     reader->cu->per_cu->is_dwz,
+			     reader->cu->per_cu->is_debug_types);
+
   while (info_ptr < end_ptr)
     {
       sect_offset this_die = (sect_offset) (info_ptr - reader->buffer);
@@ -16514,6 +16530,14 @@ cooked_indexer::index_dies (cutu_reader *reader,
 	}
     }
 
+  {
+    CORE_ADDR end_prev_die
+      = parent_map::form_addr (sect_offset (info_ptr - reader->buffer - 1),
+			       reader->cu->per_cu->is_dwz,
+			       reader->cu->per_cu->is_debug_types);
+    set_parent_valid (start_cu, end_prev_die);
+  }
+
   return info_ptr;
 }
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 09/13] [gdb/symtab] Resolve deferred entries, intra-shard case
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (7 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 08/13] [gdb/symtab] Keep track of processed DIEs in shard Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 10/13] [gdb/symtab] Don't defer backward refs, inter-cu " Tom de Vries
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

In patch "[gdb/symtab] Resolve deferred entries, inter-shard case" we've
solved the generic case of handling deferred entries.

Now add an optimization that handles deferred entries with an intra-shard
dependency in the parallel for.

Tested on x86_64-linux.
---
 gdb/dwarf2/cooked-index.c | 35 +++++++++++++++++++++++++++++++++++
 gdb/dwarf2/cooked-index.h |  7 +++++++
 gdb/dwarf2/read.c         | 10 ++++++++++
 3 files changed, 52 insertions(+)

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index 1368636d4b3..04c69de78d9 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -456,6 +456,7 @@ cooked_index_shard::wait (bool allow_quit) const
 cooked_index::cooked_index (vec_type &&vec)
   : m_vector (std::move (vec))
 {
+  /* Handle deferred entries, inter-cu case.  */
   handle_deferred_entries ();
 
   for (auto &idx : m_vector)
@@ -658,6 +659,40 @@ cooked_index::maybe_write_index (dwarf2_per_bfd *per_bfd,
 
 /* See cooked-index.h.  */
 
+const cooked_index_entry *
+cooked_index_shard::find_parent_deferred_entry
+  (const cooked_index_shard::deferred_entry &entry) const
+{
+  return find_parent (entry.spec_offset);
+}
+
+/* See cooked-index.h.  */
+
+void
+cooked_index_shard::handle_deferred_entries ()
+{
+  for (auto it = m_deferred_entries->begin (); it != m_deferred_entries->end (); )
+    {
+      const deferred_entry & deferred_entry = *it;
+      if (!parent_valid (deferred_entry.spec_offset))
+	{
+	  it++;
+	  continue;
+	}
+      const cooked_index_entry *parent_entry
+	= find_parent_deferred_entry (deferred_entry);
+      if (parent_entry == &parent_map::deferred)
+	{
+	  it++;
+	  continue;
+	}
+      resolve_deferred_entry (deferred_entry, parent_entry);
+      it = m_deferred_entries->erase (it);
+    }
+}
+
+/* See cooked-index.h.  */
+
 const cooked_index_entry *
 cooked_index_shard::resolve_deferred_entry
   (const deferred_entry &de, const cooked_index_entry *parent_entry)
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index 9b233e0f344..a2b6492d924 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -408,6 +408,13 @@ class cooked_index_shard
   const cooked_index_entry *resolve_deferred_entry
     (const deferred_entry &entry, const cooked_index_entry *parent_entry);
 
+  /* Find the parent entry for deferred_entry ENTRY.  */
+  const cooked_index_entry *find_parent_deferred_entry
+    (const cooked_index_shard::deferred_entry &entry) const;
+
+  /* Create cooked_index_entries for the deferred entries.  */
+  void handle_deferred_entries ();
+
   /* Mark parents in range [START, END] as valid .  */
   void set_parent_valid (CORE_ADDR start, CORE_ADDR end)
   {
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index e8d5f0a1a9c..3d52e908152 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4516,6 +4516,12 @@ class cooked_index_storage
     m_index->defer_entry (de);
   }
 
+  /* Handle deferred entries, intra-cu case.  */
+  void handle_deferred_entries ()
+  {
+    m_index->handle_deferred_entries ();
+  }
+
   /* Mark parents in range [START, END] as valid .  */
   void set_parent_valid (CORE_ADDR start, CORE_ADDR end)
   {
@@ -5002,6 +5008,10 @@ dwarf2_build_psymtabs_hard (dwarf2_per_objfile *per_objfile)
 		errors.push_back (std::move (except));
 	      }
 	  }
+
+	/* Handle deferred entries, intra-cu case.  */
+	thread_storage.handle_deferred_entries ();
+
 	return result_type (thread_storage.release (), std::move (errors));
       }, task_size);
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 10/13] [gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (8 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 09/13] [gdb/symtab] Resolve deferred entries, intra-shard case Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 11/13] [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index Tom de Vries
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

In patch "[gdb/symtab] Resolve deferred entries, inter-shard case" we've
solved the generic case of handling deferred entries.

Add an optimization that doesn't defer inter-CU intra-shard dependencies that
are present in the shard's parent map.

Tested on x86_64-linux.
---
 gdb/dwarf2/read.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 3d52e908152..610c751818e 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -4528,6 +4528,12 @@ class cooked_index_storage
     m_index->set_parent_valid (start, end);
   }
 
+  /* Return true if find_parents can be relied upon.  */
+  bool parent_valid (CORE_ADDR addr)
+  {
+    return m_index->parent_valid (addr);
+  }
+
 private:
 
   /* Hash function for a cutu_reader.  */
@@ -4676,6 +4682,12 @@ class cooked_indexer
   {
     m_index_storage->set_parent_valid (start, end);
   }
+
+  /* Return true if find_parents can be relied upon.  */
+  bool parent_valid (CORE_ADDR addr)
+  {
+    return m_index_storage->parent_valid (addr);
+  }
 };
 
 /* Subroutine of dwarf2_build_psymtabs_hard to simplify it.
@@ -16242,7 +16254,22 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu,
 	      else
 		{
 		  /* Inter-CU case.  */
-		  *maybe_defer = addr;
+		  if (parent_valid (addr))
+		    {
+		      auto tmp = find_parent (addr);
+		      if (tmp == &parent_map::deferred)
+			{
+			  /* Defer because origin is deferred.  */
+			  *maybe_defer = addr;
+			}
+		      else
+			*parent_entry = tmp;
+		    }
+		  else
+		    {
+		      /* Defer because origin is in other shard.  */
+		      *maybe_defer = addr;
+		    }
 		}
 	    }
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 11/13] [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (9 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 10/13] [gdb/symtab] Don't defer backward refs, inter-cu " Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 12/13] [gdb/symtab] Keep track of all parents " Tom de Vries
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Recurse into c++ DW_TAG_subprogram DIEs for cooked index, to add
DW_TAG_inlined_subroutine entries.

Tested on x86_64-linux.
---
 gdb/dwarf2/read.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 610c751818e..e999a6a5c59 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16541,7 +16541,8 @@ cooked_indexer::index_dies (cutu_reader *reader,
 
 	    case DW_TAG_subprogram:
 	      if ((m_language == language_fortran
-		   || m_language == language_ada)
+		   || m_language == language_ada
+		   || m_language == language_cplus)
 		  && this_entry != nullptr)
 		{
 		  info_ptr = recurse (reader, info_ptr, this_entry, true);
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 12/13] [gdb/symtab] Keep track of all parents for cooked index
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (10 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 11/13] [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 17:32 ` [PATCH v2 13/13] [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the " Tom de Vries
  2023-12-12 19:05 ` [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom Tromey
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

Keep track of all parents for cooked index.

Tested on x86_64-linux.
---
 gdb/dwarf2/read.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index e999a6a5c59..652bcda3704 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16478,9 +16478,15 @@ cooked_indexer::index_dies (cutu_reader *reader,
 		});
 	    }
 	  else
-	    this_entry = m_index_storage->add (this_die, abbrev->tag, flags,
-					       name, this_parent_entry,
-					       m_per_cu);
+	    {
+	      CORE_ADDR addr
+		= parent_map::form_addr (this_die, reader->cu->per_cu->is_dwz,
+					 reader->cu->per_cu->is_debug_types);
+	      set_parent (addr, addr, this_parent_entry);
+	      this_entry = m_index_storage->add (this_die, abbrev->tag, flags,
+						 name, this_parent_entry,
+						 m_per_cu);
+	    }
 	}
 
       if (linkage_name != nullptr)
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 13/13] [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (11 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 12/13] [gdb/symtab] Keep track of all parents " Tom de Vries
@ 2023-12-12 17:32 ` Tom de Vries
  2023-12-12 19:05 ` [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom Tromey
  13 siblings, 0 replies; 30+ messages in thread
From: Tom de Vries @ 2023-12-12 17:32 UTC (permalink / raw)
  To: gdb-patches

We get incorrect qualified names in the cooked index for
DW_TAG_inlined_subroutine DIEs with abstract origin, due to the fact that the
DIE parent is used instead of the abstract origin.

Fix this by preferring the abstract origin parent, if available.

Tested on x86_64-linux.

PR symtab/30728
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30728
---
 gdb/dwarf2/read.c | 67 ++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 35 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 652bcda3704..7537cb6b7f2 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -16224,52 +16224,49 @@ cooked_indexer::scan_attributes (dwarf2_per_cu_data *scanning_per_cu,
 	  const gdb_byte *new_info_ptr = (new_reader->buffer
 					  + to_underlying (origin_offset));
 
-	  if (*parent_entry == nullptr)
+	  gdb_assert (reader->cu->per_cu->is_debug_types
+		      == new_reader->cu->per_cu->is_debug_types);
+	  CORE_ADDR addr
+	    = parent_map::form_addr (origin_offset, origin_is_dwz,
+				     reader->cu->per_cu->is_debug_types);
+	  if (new_reader->cu == reader->cu)
 	    {
-	      gdb_assert (reader->cu->per_cu->is_debug_types
-			  == new_reader->cu->per_cu->is_debug_types);
-	      CORE_ADDR addr
-		= parent_map::form_addr (origin_offset, origin_is_dwz,
-					 reader->cu->per_cu->is_debug_types);
-	      if (new_reader->cu == reader->cu)
+	      /* Intra-CU case.  */
+	      if (new_info_ptr > watermark_ptr)
 		{
-		  /* Intra-CU case.  */
-		  if (new_info_ptr > watermark_ptr)
-		    {
-		      /* Defer because origin is not read yet.  */
-		      *maybe_defer = addr;
-		    }
-		  else
-		    {
-		      auto tmp = find_parent (addr);
-		      if (tmp == &parent_map::deferred)
-			{
-			  /* Defer because origin is deferred.  */
-			  *maybe_defer = addr;
-			}
-		      else
-			*parent_entry = tmp;
-		    }
+		  /* Defer because origin is not read yet.  */
+		  *maybe_defer = addr;
 		}
 	      else
 		{
-		  /* Inter-CU case.  */
-		  if (parent_valid (addr))
+		  auto tmp = find_parent (addr);
+		  if (tmp == &parent_map::deferred)
 		    {
-		      auto tmp = find_parent (addr);
-		      if (tmp == &parent_map::deferred)
-			{
-			  /* Defer because origin is deferred.  */
-			  *maybe_defer = addr;
-			}
-		      else
-			*parent_entry = tmp;
+		      /* Defer because origin is deferred.  */
+		      *maybe_defer = addr;
 		    }
 		  else
+		    *parent_entry = tmp;
+		}
+	    }
+	  else
+	    {
+	      /* Inter-CU case.  */
+	      if (parent_valid (addr))
+		{
+		  auto tmp = find_parent (addr);
+		  if (tmp == &parent_map::deferred)
 		    {
-		      /* Defer because origin is in other shard.  */
+		      /* Defer because origin is deferred.  */
 		      *maybe_defer = addr;
 		    }
+		  else
+		    *parent_entry = tmp;
+		}
+	      else
+		{
+		  /* Defer because origin is in other shard.  */
+		  *maybe_defer = addr;
 		}
 	    }
 
-- 
2.35.3


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes
  2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
@ 2023-12-12 18:28   ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 18:28 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Make this somewhat more readable by factoring out the condition:
Tom> ...
Tom> 	  if (*parent_entry == nullptr)
Tom> 	    {
Tom> 	      if (new_reader->cu == reader->cu
Tom> 		  && new_info_ptr > watermark_ptr)
Tom> 		...
Tom> 	      else
Tom> 		...
Tom> 	    }
Tom> ...

Ok.
Approved-By: Tom Tromey <tom@tromey.com>

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage
  2023-12-12 17:32 ` [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage Tom de Vries
@ 2023-12-12 18:31   ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 18:31 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Factor out usage of cooked_indexer::m_die_range_map into new class parent_map
Tom> with member functions find_parent and set_parent, and static member function
Tom> form_addr.
 
Tom> +class parent_map
Tom> +{

I think the class should have an introductory comment explaining what
it's for.  Ok with this change.
Approved-By: Tom Tromey <tom@tromey.com>

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  2023-12-12 17:32 ` [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent Tom de Vries
@ 2023-12-12 18:34   ` Tom Tromey
  2023-12-13  8:25     ` Tom de Vries
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 18:34 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Set_parent uses m_die_range_map.set_empty, which doesn't allow
Tom> parent_entry == nullptr.

Tom> So it may be necessary to guard calls to set_parent with
Tom> "if (parent_entry != nullptr)".

Tom> Fix this by handling the parent_entry == nullptr case in set_parent.

It seems like this must be a programming error somewhere?
Currently the only caller is guarded:

  if (parent_entry != nullptr)
...
      m_die_range_map.set_empty (start, end, (void *) parent_entry);

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage
  2023-12-12 17:32 ` [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage Tom de Vries
@ 2023-12-12 18:39   ` Tom Tromey
  2023-12-13  8:46     ` Tom de Vries
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 18:39 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Factor out usage of cooked_indexer::m_deferred_entries in new member
Tom> functions defer_entry, handle_deferred_entries and resolve_deferred_entry.

I don't mind this, but when reading through the whole series, it seems
like the code has to now iterate over a lot of the entries trying to fix
up the parentage later.

I wonder then about just sticking this info directly into the
cooked_index_entry object, and then doing fixups directly on these in
the shard.

That is, instead of keeping separate "deferred" entries, just making
ordinary entries.  cooked_index_entry::parent_entry could be a union
holding either the parent (if known) or a CORE_ADDR; and then there
could be a new flag in cooked_index_flag_enum indicating which one is in
use.

Then I think parent_map::deferrred also would not be needed.

I'm still kind of mid-reading so my apologies if this doesn't really
make sense.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
  2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
                   ` (12 preceding siblings ...)
  2023-12-12 17:32 ` [PATCH v2 13/13] [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the " Tom de Vries
@ 2023-12-12 19:05 ` Tom Tromey
  2023-12-13  9:58   ` Tom de Vries
  13 siblings, 1 reply; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 19:05 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

[...]

Thank you for this series and the write-up.
You've done a great job on this.

Tom> IV. ALTERNATIVE APPROACH

Tom> I suppose it could be a matter of taste whether the current internal
Tom> representation is:
Tom> - a smart solution, or
Tom> - incorrect representation of the dwarf.

I'm not sure I totally understood this section, but the basic idea
behind this code -- and I think these decisions predate the cooked-index
rewrite -- was to try to exploit dwz compression by also using less
memory inside gdb.

Technically, I think, in DWARF an import can appear at any level and gdb
should probably just be re-reading a bunch of DIEs over and over.

However, this seemed super expensive back when I wrote the dwz support,
and since dwz is the only known compressor and it doesn't do some of the
weird stuff, it looked better to try to implement this optimization.

Shrug, maybe that was a mistake.  If every import caused a re-scan of
the PU, it would mean that only dwz users would pay for this feature.
Just maybe the price would be very high.

Tom> - I'm not sure if other maintainers are supportive of this approach.

I probably just don't understand.

Tom> And this is comparing the patch series with the base version:
Tom> ...
Tom> real: mean: 687.00    (100%) mean: 931.00    (135.52%)
Tom> user: mean: 2204.00   (100%) mean: 2938.00   (133.30%)
Tom> sys : mean: 102.00    (100%) mean: 137.00    (134.31%)
Tom> mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%)
Tom> ...

Tom> In summary, the overall result is ~36% more real time and ~21% more memory.

Pretty heavy.  However I was curious about something:

Tom> The cumulative results of individual patches (leaving out the test-case
Tom> patches) are:

Tom> [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index

Tom> real: mean: 690.00    (100%) mean: 835.00    (121.01%)
Tom> user: mean: 2186.00   (100%) mean: 2693.00   (123.19%)
Tom> sys : mean: 106.00    (100%) mean: 130.00    (122.64%)
Tom> mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%)

... the big performance jump appears here -- but is it really needed?

IIUC this is done because we want to detect inlined functions in C++.
However, in theory those should also be emitted at the top level with
DW_AT_inline, with a value of either DW_INL_inlined or
DW_INL_declared_inlined.

Would having entries for these be enough to provoke the desired CU
expansion?  If so it may be just a matter of marking some more abbrevs
as "interesting".

Actually, for the simple test case I tried, I already see it in the
index.  So I wonder what's going on there.

Anyway if we could drop this patch then it seems like the overall cost
would be alright.

Tom> Ideally, the impact of a patch series implementing dwz support on a non-dwz
Tom> test-case is none.

Tom> But fixing dwz support requires tracking more data, and there's no way of
Tom> knowing in advance whether the additional data will be used or not.

Yeah.  I loathe this part of DWARF but I have come around to accept that
we just have to handle it.

Tom> Of course this can be accommodated by optimistically assuming that the data is
Tom> unnecessary, and when it turns out it was actually needed, partially or
Tom> completely restart indexing.  My suspicion is that this approach itself is
Tom> going to be complex, so I think it's best avoided.

I agree.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case
  2023-12-12 17:32 ` [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case Tom de Vries
@ 2023-12-12 19:27   ` Tom Tromey
  2023-12-13 10:35     ` Tom de Vries
  0 siblings, 1 reply; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 19:27 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> +cooked_index_entry parent_map::deferred((sect_offset)0, (dwarf_tag)0,
Tom> +					(cooked_index_flag)0, nullptr,
Tom> +					nullptr, nullptr);

Several missing spaces.

However see the other note.  Maybe this isn't needed.

Tom>  cooked_index::cooked_index (vec_type &&vec)
Tom>    : m_vector (std::move (vec))
Tom>  {
Tom> +  handle_deferred_entries ();
Tom> +
Tom>    for (auto &idx : m_vector)
Tom>       idx->finalize ();

Here, parent resolution is single-threaded.  However, I think it can
probably be handled entirely in finalize -- and therefore distributed to
worker threads.  However, some changes are needed.

Tom> +const cooked_index_entry *
Tom> +cooked_index::find_parent_deferred_entry
Tom> +  (const cooked_index_shard::deferred_entry &entry) const
Tom> +{
Tom> +  const cooked_index_entry *parent_entry = nullptr;
Tom> +  for (auto &parent_map_shard : m_vector)
Tom> +    {
Tom> +      auto res = parent_map_shard->find_parent (entry.spec_offset);

The main issue with threading is that addrmaps are not thread-safe.
They are implemented as splay trees and these restructure themselves
during lookups.

However, I'm not sure we need splay trees.  Instead, it seems to me that
the relevant parts of the DIE tree can be handled with a sorted vector
that maps DIE ranges to entries.

Then, parent resolution can be done by binary search, and this can be
done by each shard in parallel -- and in particular, in do_finalize,
because that is already visiting every entry anyway.

Tom> +void
Tom> +cooked_index::handle_deferred_entries ()
Tom> +{
Tom> +  bool changed;
Tom> +  bool deferred;
Tom> +  do
Tom> +    {
Tom> +      deferred = false;
Tom> +      changed = false;
Tom> +      for (auto &shard : m_vector)
Tom> +	for (auto it = shard->m_deferred_entries->begin ();
Tom> +	     it != shard->m_deferred_entries->end (); )
Tom> +	  {
Tom> +	    const cooked_index_entry *parent_entry
Tom> +	      = find_parent_deferred_entry (*it);
Tom> +	    if (parent_entry == &parent_map::deferred)
Tom> +	      {
Tom> +		deferred = true;
Tom> +		it++;
Tom> +		continue;
Tom> +	      }
Tom> +	    shard->resolve_deferred_entry (*it, parent_entry);
Tom> +	    it = shard->m_deferred_entries->erase (it);
Tom> +	    changed = true;

I don't really understand this method.

What is the scenario leading to the need for 'deferred' and multiple
iterations?

Tom> +  for (auto &shard : m_vector)
Tom> +    {
Tom> +      shard->m_die_range_map.reset (nullptr);
Tom> +      shard->m_deferred_entries.reset (nullptr);
Tom> +    }
Tom> +}

In the background reading series, there's a new cooked_index_worker that
holds data that is needed before scanning is complete.  These things
could be put there instead.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  2023-12-12 17:32 ` [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp Tom de Vries
@ 2023-12-12 19:28   ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 19:28 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Add a regression test for PR symtab/30846.
Tom> Tested on x86_64-linux.

Tom> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846

Looks good, thank you.
Approved-By: Tom Tromey <tom@tromey.com>

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
  2023-12-12 17:32 ` [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp Tom de Vries
@ 2023-12-12 19:29   ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-12 19:29 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Add another regression test for PR symtab/30846.
Tom> Tested on x86_64-linux.

Tom> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846

Tom> +	    DW_TAG_namespace {
Tom> +		{DW_AT_name ns}
Tom> +	    } {
Tom> +		spec: DW_TAG_variable {
Tom> +		    {DW_AT_name v}
Tom> +		    {DW_AT_type :$myint}
Tom> +		    {DW_AT_declaration 1 DW_FORM_flag_present}
Tom> +		}

...

Tom> +	    # The new indexer has special code to compute the full
Tom> +	    # name of an object that uses a specification that appears
Tom> +	    # later in the DWARF.
Tom> +	    DW_TAG_variable {
Tom> +		{DW_AT_specification %$spec}

I think the comment is wrong here.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  2023-12-12 18:34   ` Tom Tromey
@ 2023-12-13  8:25     ` Tom de Vries
  2023-12-13 20:11       ` Tom Tromey
  0 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-13  8:25 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 12/12/23 19:34, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> Tom> Set_parent uses m_die_range_map.set_empty, which doesn't allow
> Tom> parent_entry == nullptr.
> 
> Tom> So it may be necessary to guard calls to set_parent with
> Tom> "if (parent_entry != nullptr)".
> 
> Tom> Fix this by handling the parent_entry == nullptr case in set_parent.
> 
> It seems like this must be a programming error somewhere?
> Currently the only caller is guarded:
> 
>    if (parent_entry != nullptr)
> ...
>        m_die_range_map.set_empty (start, end, (void *) parent_entry);

Yes, and I've left that in place because I couldn't convince myself that 
this wouldn't introduce a performance regression, but perhaps it doesn't 
matter and we should drop the check there.

I'm not sure why you say programming error.  I know using 
addrmap::set_empty (..., nullptr) is a programming error.

This patch attempts to make sure that using parent_map::set_parent (..., 
nullptr) is not a programming error.

Anyway, this is needed for a set_parent added by "[gdb/symtab] Keep 
track of all parents for cooked index".

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage
  2023-12-12 18:39   ` Tom Tromey
@ 2023-12-13  8:46     ` Tom de Vries
  2023-12-13 20:16       ` Tom Tromey
  0 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-13  8:46 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 12/12/23 19:39, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> Tom> Factor out usage of cooked_indexer::m_deferred_entries in new member
> Tom> functions defer_entry, handle_deferred_entries and resolve_deferred_entry.
> 
> I don't mind this, but when reading through the whole series, it seems
> like the code has to now iterate over a lot of the entries trying to fix
> up the parentage later.
> 
> I wonder then about just sticking this info directly into the
> cooked_index_entry object, and then doing fixups directly on these in
> the shard.
> 
> That is, instead of keeping separate "deferred" entries, just making
> ordinary entries.  cooked_index_entry::parent_entry could be a union
> holding either the parent (if known) or a CORE_ADDR; and then there
> could be a new flag in cooked_index_flag_enum indicating which one is in
> use.
> 
> Then I think parent_map::deferrred also would not be needed.
> 
> I'm still kind of mid-reading so my apologies if this doesn't really
> make sense.

I've also considered something like this: rather than deferring creating 
entries, creating them immediately with some marker that work is left to 
be done.  My initial though there was to use parent_map::deferred as 
parent, and then keep lists of:
...
struct {
   CORE_ADDR get_parent_from_here;
   cooked_index_entry **patch_parent_here;
};
...

The union+flag approach would also work but doesn't offer a way to 
iterate over them quickly, which might matter for large shards.  Though 
we could do a side-table approach I suppose.

We could choose to not care about such lists (in which case you'd need 
the union+flag solution) and resolve these on demand, but then that'll 
have to take care of cycle detection, so atm I'm not convinced that's a 
good idea.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
  2023-12-12 19:05 ` [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom Tromey
@ 2023-12-13  9:58   ` Tom de Vries
  2023-12-13 20:14     ` Tom Tromey
  0 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-13  9:58 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 12/12/23 20:05, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> [...]
> 
> Thank you for this series and the write-up.
> You've done a great job on this.
> 
> Tom> IV. ALTERNATIVE APPROACH
> 
> Tom> I suppose it could be a matter of taste whether the current internal
> Tom> representation is:
> Tom> - a smart solution, or
> Tom> - incorrect representation of the dwarf.
> 
> I'm not sure I totally understood this section, but the basic idea
> behind this code -- and I think these decisions predate the cooked-index
> rewrite -- was to try to exploit dwz compression by also using less
> memory inside gdb.
> 

Makes sense to me.

> Technically, I think, in DWARF an import can appear at any level

That's also my understanding.

> and gdb
> should probably just be re-reading a bunch of DIEs over and over.

Yes, though we could make a distinction between top-level imports (as 
produced by dwz and gcc lto) and otherwise, and handle the top-level 
imports efficiently.

> However, this seemed super expensive back when I wrote the dwz support,
> and since dwz is the only known compressor and it doesn't do some of the
> weird stuff, it looked better to try to implement this optimization.
> 
> Shrug, maybe that was a mistake.  If every import caused a re-scan of
> the PU, it would mean that only dwz users would pay for this feature.
> Just maybe the price would be very high.
> 

The solution I tried to propose here is a middle ground:
- it fixes the PR we're trying to fix in this series, and
- it exploits the dwz compression inside gdb, to avoid the high price
   you mention.

The basic idea is:
- let the first import trigger a read of a PU,
- in the cooked_index entries generated for the PU, set the per_cu to
   the PU (instead of to the importing CU as we do now),
- keep track of importers, allowing to list all importers CUs of a given
   PU,
- when doing expansion for a cooked_index entry, if the per_cu is a PU,
   expand all importer CUs.

There are two notes here:
- there is the risk that this gets us too many expanded symtabs,
   so when using the cooked index we should know whether we're interested
   in expanding all matching symtabs or just one.
- imports can happen from CUs with different languages, so we'll have to
   handle that somehow.  I'm not sure btw whether we're getting that
   correct either atm.

> Tom> - I'm not sure if other maintainers are supportive of this approach.
> 
> I probably just don't understand.
> 

Ok, just let me know if there's anything I can do to clarify things.

> Tom> And this is comparing the patch series with the base version:
> Tom> ...
> Tom> real: mean: 687.00    (100%) mean: 931.00    (135.52%)
> Tom> user: mean: 2204.00   (100%) mean: 2938.00   (133.30%)
> Tom> sys : mean: 102.00    (100%) mean: 137.00    (134.31%)
> Tom> mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%)
> Tom> ...
> 
> Tom> In summary, the overall result is ~36% more real time and ~21% more memory.
> 
> Pretty heavy.  However I was curious about something:
> 
> Tom> The cumulative results of individual patches (leaving out the test-case
> Tom> patches) are:
> 
> Tom> [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
> 
> Tom> real: mean: 690.00    (100%) mean: 835.00    (121.01%)
> Tom> user: mean: 2186.00   (100%) mean: 2693.00   (123.19%)
> Tom> sys : mean: 106.00    (100%) mean: 130.00    (122.64%)
> Tom> mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%)
> 
> ... the big performance jump appears here -- but is it really needed?
> 
> IIUC this is done because we want to detect inlined functions in C++.
> However, in theory those should also be emitted at the top level with
> DW_AT_inline, with a value of either DW_INL_inlined or
> DW_INL_declared_inlined.
> 

In the gdb.cp/breakpoint-locs.exp cc-with-dwz case we have:
...
  <1><14>: Abbrev Number: 34 (DW_TAG_namespace)
     <15>   DW_AT_name        : N1
     <18>   DW_AT_sibling     : <0x2e>
  <2><19>: Abbrev Number: 32 (DW_TAG_class_type)
     <1a>   DW_AT_name        : C1
     <1d>   DW_AT_byte_size   : 1
     <1e>   DW_AT_decl_file   : 2
     <1f>   DW_AT_decl_line   : 20
  <3><20>: Abbrev Number: 35 (DW_TAG_subprogram)
     <21>   DW_AT_external    : 1
     <21>   DW_AT_name        : baz
     <25>   DW_AT_decl_file   : 2
     <26>   DW_AT_decl_line   : 23
     <27>   DW_AT_linkage_name: (indirect string, offset: 0x1d3): 
_ZN2N12C13bazEv
     <2b>   DW_AT_accessibility: 1       (public)
     <2c>   DW_AT_declaration : 1
  <3><2c>: Abbrev Number: 0
  <2><2d>: Abbrev Number: 0
  <1><2e>: Abbrev Number: 37 (DW_TAG_subprogram)
     <2f>   DW_AT_specification: <0x20>
     <30>   DW_AT_inline      : 3        (declared as inline and inlined)
     <31>   DW_AT_sibling     : <0x39>
...
and I think what you're describing matches most closely DIE 0x2e (though 
I haven't found DW_INL_inlined or DW_INL_declared_inlined in the dwarf 
produced for the test-case, also not with gcc 12 and -gdwarf-5).

> Would having entries for these be enough to provoke the desired CU
> expansion?  If so it may be just a matter of marking some more abbrevs
> as "interesting".
> 

The problem is that the DIE 0x2e has been moved to a PU, and as 
consequence in the current implementation will cause expansion of only a 
single CU.

> Actually, for the simple test case I tried, I already see it in the
> index.  So I wonder what's going on there.
> 

I wonder if you can reproduce the FAIL I'm seeing? [ Btw, it's not 100% 
reproducible, so I usually run it in a $(seq 1 100) loop. ]

> Anyway if we could drop this patch then it seems like the overall cost
> would be alright.
> 
> Tom> Ideally, the impact of a patch series implementing dwz support on a non-dwz
> Tom> test-case is none.
> 
> Tom> But fixing dwz support requires tracking more data, and there's no way of
> Tom> knowing in advance whether the additional data will be used or not.
> 
> Yeah.  I loathe this part of DWARF but I have come around to accept that
> we just have to handle it.
> 
> Tom> Of course this can be accommodated by optimistically assuming that the data is
> Tom> unnecessary, and when it turns out it was actually needed, partially or
> Tom> completely restart indexing.  My suspicion is that this approach itself is
> Tom> going to be complex, so I think it's best avoided.
> 
> I agree.

Thanks for the review.

- Tom


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case
  2023-12-12 19:27   ` Tom Tromey
@ 2023-12-13 10:35     ` Tom de Vries
  2023-12-13 20:19       ` Tom Tromey
  0 siblings, 1 reply; 30+ messages in thread
From: Tom de Vries @ 2023-12-13 10:35 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 12/12/23 20:27, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
> 
> Tom> +cooked_index_entry parent_map::deferred((sect_offset)0, (dwarf_tag)0,
> Tom> +					(cooked_index_flag)0, nullptr,
> Tom> +					nullptr, nullptr);
> 
> Several missing spaces.
> > However see the other note.  Maybe this isn't needed.
> 
> Tom>  cooked_index::cooked_index (vec_type &&vec)
> Tom>    : m_vector (std::move (vec))
> Tom>  {
> Tom> +  handle_deferred_entries ();
> Tom> +
> Tom>    for (auto &idx : m_vector)
> Tom>       idx->finalize ();
> 
> Here, parent resolution is single-threaded.  However, I think it can
> probably be handled entirely in finalize -- and therefore distributed to
> worker threads.  However, some changes are needed.
> 
> Tom> +const cooked_index_entry *
> Tom> +cooked_index::find_parent_deferred_entry
> Tom> +  (const cooked_index_shard::deferred_entry &entry) const
> Tom> +{
> Tom> +  const cooked_index_entry *parent_entry = nullptr;
> Tom> +  for (auto &parent_map_shard : m_vector)
> Tom> +    {
> Tom> +      auto res = parent_map_shard->find_parent (entry.spec_offset);
> 
> The main issue with threading is that addrmaps are not thread-safe.
> They are implemented as splay trees and these restructure themselves
> during lookups.
> 
> However, I'm not sure we need splay trees.  Instead, it seems to me that
> the relevant parts of the DIE tree can be handled with a sorted vector
> that maps DIE ranges to entries.
> 
> Then, parent resolution can be done by binary search, and this can be
> done by each shard in parallel -- and in particular, in do_finalize,
> because that is already visiting every entry anyway.
> 

That'll still have to take care of inter-shard dependencies somehow.

> Tom> +void
> Tom> +cooked_index::handle_deferred_entries ()
> Tom> +{
> Tom> +  bool changed;
> Tom> +  bool deferred;
> Tom> +  do
> Tom> +    {
> Tom> +      deferred = false;
> Tom> +      changed = false;
> Tom> +      for (auto &shard : m_vector)
> Tom> +	for (auto it = shard->m_deferred_entries->begin ();
> Tom> +	     it != shard->m_deferred_entries->end (); )
> Tom> +	  {
> Tom> +	    const cooked_index_entry *parent_entry
> Tom> +	      = find_parent_deferred_entry (*it);
> Tom> +	    if (parent_entry == &parent_map::deferred)
> Tom> +	      {
> Tom> +		deferred = true;
> Tom> +		it++;
> Tom> +		continue;
> Tom> +	      }
> Tom> +	    shard->resolve_deferred_entry (*it, parent_entry);
> Tom> +	    it = shard->m_deferred_entries->erase (it);
> Tom> +	    changed = true;
> 
> I don't really understand this method.
> 
> What is the scenario leading to the need for 'deferred' and multiple
> iterations?
> 

Investigated by adding a "gdb_assert (false)" at "deferred = true".

For instance, this triggers with 
/usr/lib/debug/usr/lib64/gcc/x86_64-suse-linux/7/gnat1-7.5.0+r278197-150000.4.35.1.x86_64.debug.

There's a DIE:
...
  <2><d952b5>: Abbrev Number: 46 (DW_TAG_inlined_subroutine)
     <d952b6>   DW_AT_abstract_origin: <0x56f99>
     <d952ba>   DW_AT_low_pc      : 0xfefc33
     <d952c2>   DW_AT_high_pc     : 18
     <d952c3>   DW_AT_call_file   : 1
     <d952c4>   DW_AT_call_line   : 1811
     <d952c6>   DW_AT_sibling     : <0xd952d4>
...
whose parent is the parent of DIE 0x56f99:
...
  <1><56f99>: Abbrev Number: 88 (DW_TAG_subprogram)
     <56f9a>   DW_AT_specification: <0x5520e>
     <56f9e>   DW_AT_object_pointer: <0x56fa1>
     <56f9f>   DW_AT_inline      : 3     (declared as inline and inlined)
     <56fa0>   DW_AT_object_pointer: <0x56fa1>
...
whose parent is the parent of DIE 0x5520e:
...
  <2><5520e>: Abbrev Number: 58 (DW_TAG_subprogram)
     <5520f>   DW_AT_external    : 1
     <5520f>   DW_AT_name        : (indirect string, offset: 0x15762e): 
operator[]
     <55213>   DW_AT_decl_file   : 3
     <55214>   DW_AT_decl_line   : 1217
     <55216>   DW_AT_linkage_name: (indirect string, offset: 0x3dd5b3): 
_ZN3vecIPKc7va_heap6vl_ptrEixEj
     <5521a>   DW_AT_type        : <0x15e233>
     <5521e>   DW_AT_declaration : 1
     <5521e>   DW_AT_object_pointer: <0x55222>
     <55220>   DW_AT_sibling     : <0x5522b>
...

So, two DIEs are deferred, and resolving them can happen only in one 
specific order.  If the loop encounters them in the wrong order, it 
skips the first one in the first iteration, and handles it in the second 
one.

> Tom> +  for (auto &shard : m_vector)
> Tom> +    {
> Tom> +      shard->m_die_range_map.reset (nullptr);
> Tom> +      shard->m_deferred_entries.reset (nullptr);
> Tom> +    }
> Tom> +}
> 
> In the background reading series, there's a new cooked_index_worker that
> holds data that is needed before scanning is complete.  These things
> could be put there instead.

Ack.

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  2023-12-13  8:25     ` Tom de Vries
@ 2023-12-13 20:11       ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-13 20:11 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> I'm not sure why you say programming error.  I know using
Tom> addrmap::set_empty (..., nullptr) is a programming error.

I just mean that, if it happens, it seems like it must be a bug
elsewhere.  But it doesn't matter much I suppose.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
  2023-12-13  9:58   ` Tom de Vries
@ 2023-12-13 20:14     ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-13 20:14 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

>> However, in theory those should also be emitted at the top level with
>> DW_AT_inline, with a value of either DW_INL_inlined or
>> DW_INL_declared_inlined.

Tom> In the gdb.cp/breakpoint-locs.exp cc-with-dwz case we have:

Tom>     <30>   DW_AT_inline      : 3        (declared as inline and inlined)

Tom> and I think what you're describing matches most closely DIE 0x2e
Tom> (though I haven't found DW_INL_inlined or DW_INL_declared_inlined in
Tom> the dwarf produced for the test-case, also not with gcc 12 and
Tom> -gdwarf-5).

readelf / objdump don't print the symbolic name here, but that's
actually what you're seeing.  From dwarf2.h:

    /* Inline attribute.  */
    enum dwarf_inline_attribute
      {
        DW_INL_not_inlined = 0,
        DW_INL_inlined = 1,
        DW_INL_declared_not_inlined = 2,
        DW_INL_declared_inlined = 3
      };

>> Would having entries for these be enough to provoke the desired CU
>> expansion?  If so it may be just a matter of marking some more abbrevs
>> as "interesting".

Tom> The problem is that the DIE 0x2e has been moved to a PU, and as
Tom> consequence in the current implementation will cause expansion of only
Tom> a single CU.

Ok, I see.  Won't this fail with .gdb_index as well then?  Because IIRC
that attributes symbols from a PU to the canonical includer.

Anyway it seems better to me to record inclusions more precisely and
then expand more often (depending on the lookup) than to spend a lot of
time recursing into C++ subroutines.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage
  2023-12-13  8:46     ` Tom de Vries
@ 2023-12-13 20:16       ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-13 20:16 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> I've also considered something like this: rather than deferring
Tom> creating entries, creating them immediately with some marker that work
Tom> is left to be done.

Also worth noting, in many cases no extra work will need to be done.
In these cases the parent entry for a DIE range can be recorded directly.

Tom> The union+flag approach would also work but doesn't offer a way to
Tom> iterate over them quickly, which might matter for large shards.
Tom> Though we could do a side-table approach I suppose.

Instead of an addrmap I'm proposing a sequence of vectors, and replacing
splay-tree search with a binary search.  So the side thing is still
needed (it's these vectors), but the idea is that now the processing can
be parallelized; in fact stuck into the existing finalize stage.

Tom> We could choose to not care about such lists (in which case you'd need
Tom> the union+flag solution) and resolve these on demand, but then that'll
Tom> have to take care of cycle detection, so atm I'm not convinced that's
Tom> a good idea.

Doing it on demand means keeping the parentage information around
forever, or alternately re-reading abbrevs and then scanning DIEs at
lookup time.  I don't think we should do either of these.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case
  2023-12-13 10:35     ` Tom de Vries
@ 2023-12-13 20:19       ` Tom Tromey
  0 siblings, 0 replies; 30+ messages in thread
From: Tom Tromey @ 2023-12-13 20:19 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

>> However, I'm not sure we need splay trees.  Instead, it seems to me
>> that
>> the relevant parts of the DIE tree can be handled with a sorted vector
>> that maps DIE ranges to entries.
>> Then, parent resolution can be done by binary search, and this can
>> be
>> done by each shard in parallel -- and in particular, in do_finalize,
>> because that is already visiting every entry anyway.

Tom> That'll still have to take care of inter-shard dependencies somehow.

Yes.  After scanning is done, instead of addrmaps we would have these
vectors that map DIE ranges to parent entries (or sometimes to other DIE
offsets).

These vectors would be passed (by reference) to the finalize methods.

This would handle inter-shard references just fine while also be
parallelized.

Tom

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2023-12-13 20:19 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-12 17:32 [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom de Vries
2023-12-12 17:32 ` [PATCH v2 01/13] [gdb/symtab] Refactor condition in scan_attributes Tom de Vries
2023-12-12 18:28   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 02/13] [gdb/symtab] Factor out m_die_range_map usage Tom de Vries
2023-12-12 18:31   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 03/13] [gdb/symtab] Handle nullptr parent in parent_map::set_parent Tom de Vries
2023-12-12 18:34   ` Tom Tromey
2023-12-13  8:25     ` Tom de Vries
2023-12-13 20:11       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 04/13] [gdb/symtab] Factor out m_deferred_entries usage Tom de Vries
2023-12-12 18:39   ` Tom Tromey
2023-12-13  8:46     ` Tom de Vries
2023-12-13 20:16       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 05/13] [gdb/symtab] Resolve deferred entries, inter-shard case Tom de Vries
2023-12-12 19:27   ` Tom Tromey
2023-12-13 10:35     ` Tom de Vries
2023-12-13 20:19       ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 06/13] [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp Tom de Vries
2023-12-12 19:28   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 07/13] [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp Tom de Vries
2023-12-12 19:29   ` Tom Tromey
2023-12-12 17:32 ` [PATCH v2 08/13] [gdb/symtab] Keep track of processed DIEs in shard Tom de Vries
2023-12-12 17:32 ` [PATCH v2 09/13] [gdb/symtab] Resolve deferred entries, intra-shard case Tom de Vries
2023-12-12 17:32 ` [PATCH v2 10/13] [gdb/symtab] Don't defer backward refs, inter-cu " Tom de Vries
2023-12-12 17:32 ` [PATCH v2 11/13] [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index Tom de Vries
2023-12-12 17:32 ` [PATCH v2 12/13] [gdb/symtab] Keep track of all parents " Tom de Vries
2023-12-12 17:32 ` [PATCH v2 13/13] [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the " Tom de Vries
2023-12-12 19:05 ` [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp Tom Tromey
2023-12-13  9:58   ` Tom de Vries
2023-12-13 20:14     ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).