[PATCH] Finalize each cooked index separately

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] Finalize each cooked index separately
@ 2022-05-16 18:46 Tom Tromey
  2022-05-17  8:43 ` Lancelot SIX
  2022-05-26 13:48 ` Tom Tromey
  0 siblings, 2 replies; 4+ messages in thread
From: Tom Tromey @ 2022-05-16 18:46 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

After DWARF has been scanned, the cooked index code does a
"finalization" step in a worker thread.  This step combines all the
index entries into a single master list, canonicalizes C++ names, and
splits Ada names to synthesize package names.

While this step is run in the background, gdb will wait for the
results in some situations, and it turns out that this step can be
slow.  This is PR symtab/29105.

This can be sped up by parallelizing, at a small memory cost.  Now
each index is finalized on its own, in a worker thread.  The cost
comes from name canonicalization: if a given non-canonical name is
referred to by multiple indices, there will be N canonical copies (one
per index) rather than just one.

This requires changing the users of the index to iterate over multiple
results.  However, this is easily done by introducing a new "chained
range" class.

When run on gdb itself, the memory cost seems rather low -- on my
current machine, "maint space 1" reports no change due to the patch.

For performance testing, using "maint time 1" and "file" will not show
correct results.  That approach measures "time to next prompt", but
because the patch only affects background work, this shouldn't (and
doesn't) change.  Instead, a simple way to make gdb wait for the
results is to set a breakpoint.

Before:

    $ /bin/time -f%e ~/gdb/install/bin/gdb -nx -q -batch \
        -ex 'break main' /tmp/gdb
    Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28.
    2.00

After:

    $ /bin/time -f%e ./gdb/gdb -nx -q -batch \
        -ex 'break main' /tmp/gdb
    Breakpoint 1 at 0x43ec30: file ../../binutils-gdb/gdb/gdb.c, line 28.
    0.65

Regression tested on x86-64 Fedora 34.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29105
---
 gdb/dwarf2/cooked-index.c | 266 +++++++++++++++++++-------------------
 gdb/dwarf2/cooked-index.h |  87 ++++++++-----
 gdbsupport/range-chain.h  | 121 +++++++++++++++++
 3 files changed, 311 insertions(+), 163 deletions(-)
 create mode 100644 gdbsupport/range-chain.h

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index d8a1ab8d30e..9f2a6c521b8 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -112,78 +112,22 @@ cooked_index::add (sect_offset die_offset, enum dwarf_tag tag,
   return result;
 }
 
-cooked_index_vector::cooked_index_vector (vec_type &&vec)
-  : m_vector (std::move (vec)),
-    m_future (gdb::thread_pool::g_thread_pool->post_task
-	      ([this] ()
-	      {
-		finalize ();
-	      }))
-{
-}
-
 /* See cooked-index.h.  */
 
-dwarf2_per_cu_data *
-cooked_index_vector::lookup (CORE_ADDR addr)
+void
+cooked_index::finalize ()
 {
-  for (const auto &index : m_vector)
+  m_future = gdb::thread_pool::g_thread_pool->post_task ([this] ()
     {
-      dwarf2_per_cu_data *result = index->lookup (addr);
-      if (result != nullptr)
-	return result;
-    }
-  return nullptr;
-}
-
-/* See cooked-index.h.  */
-
-std::vector<addrmap *>
-cooked_index_vector::get_addrmaps ()
-{
-  std::vector<addrmap *> result;
-  for (const auto &index : m_vector)
-    result.push_back (index->m_addrmap);
-  return result;
-}
-
-/* See cooked-index.h.  */
-
-cooked_index_vector::range
-cooked_index_vector::find (gdb::string_view name, bool completing)
-{
-  m_future.wait ();
-
-  auto lower = std::lower_bound (m_entries.begin (), m_entries.end (),
-				 name,
-				 [=] (const cooked_index_entry *entry,
-				      const gdb::string_view &n)
-  {
-    int cmp = strncasecmp (entry->canonical, n.data (), n.length ());
-    if (cmp != 0 || completing)
-      return cmp < 0;
-    return strlen (entry->canonical) < n.length ();
-  });
-
-  auto upper = std::upper_bound (m_entries.begin (), m_entries.end (),
-				 name,
-				 [=] (const gdb::string_view &n,
-				      const cooked_index_entry *entry)
-  {
-    int cmp = strncasecmp (n.data (), entry->canonical, n.length ());
-    if (cmp != 0 || completing)
-      return cmp < 0;
-    return n.length () < strlen (entry->canonical);
-  });
-
-  return range (lower, upper);
+      do_finalize ();
+    });
 }
 
 /* See cooked-index.h.  */
 
 gdb::unique_xmalloc_ptr<char>
-cooked_index_vector::handle_gnat_encoded_entry (cooked_index_entry *entry,
-						htab_t gnat_entries)
+cooked_index::handle_gnat_encoded_entry (cooked_index_entry *entry,
+					 htab_t gnat_entries)
 {
   std::string canonical = ada_decode (entry->name, false, false);
   if (canonical.empty ())
@@ -206,11 +150,9 @@ cooked_index_vector::handle_gnat_encoded_entry (cooked_index_entry *entry,
 	{
 	  gdb::unique_xmalloc_ptr<char> new_name
 	    = make_unique_xstrndup (name.data (), name.length ());
-	  /* It doesn't matter which obstack we allocate this on, so
-	     we pick the most convenient one.  */
-	  last = m_vector[0]->create (entry->die_offset, DW_TAG_namespace,
-				      0, new_name.get (), parent,
-				      entry->per_cu);
+	  last = create (entry->die_offset, DW_TAG_namespace,
+			 0, new_name.get (), parent,
+			 entry->per_cu);
 	  last->canonical = last->name;
 	  m_names.push_back (std::move (new_name));
 	  *slot = last;
@@ -225,28 +167,8 @@ cooked_index_vector::handle_gnat_encoded_entry (cooked_index_entry *entry,
 
 /* See cooked-index.h.  */
 
-const cooked_index_entry *
-cooked_index_vector::get_main () const
-{
-  const cooked_index_entry *result = nullptr;
-
-  for (const auto &index : m_vector)
-    {
-      const cooked_index_entry *entry = index->get_main ();
-      if (result == nullptr
-	  || ((result->flags & IS_MAIN) == 0
-	      && entry != nullptr
-	      && (entry->flags & IS_MAIN) != 0))
-	result = entry;
-    }
-
-  return result;
-}
-
-/* See cooked-index.h.  */
-
 void
-cooked_index_vector::finalize ()
+cooked_index::do_finalize ()
 {
   auto hash_name_ptr = [] (const void *p)
     {
@@ -268,26 +190,38 @@ cooked_index_vector::finalize ()
   htab_up seen_names (htab_create_alloc (10, hash_name_ptr, eq_name_ptr,
 					 nullptr, xcalloc, xfree));
 
-  for (auto &index : m_vector)
-    {
-      htab_up gnat_entries (htab_create_alloc (10, hash_entry, eq_entry,
-					       nullptr, xcalloc, xfree));
+  htab_up gnat_entries (htab_create_alloc (10, hash_entry, eq_entry,
+					   nullptr, xcalloc, xfree));
 
-      std::vector<cooked_index_entry *> entries
-	= std::move (index->m_entries);
-      for (cooked_index_entry *entry : entries)
+  for (cooked_index_entry *entry : m_entries)
+    {
+      gdb_assert (entry->canonical == nullptr);
+      if ((entry->per_cu->lang != language_cplus
+	   && entry->per_cu->lang != language_ada)
+	  || (entry->flags & IS_LINKAGE) != 0)
+	entry->canonical = entry->name;
+      else
 	{
-	  gdb_assert (entry->canonical == nullptr);
-	  if ((entry->per_cu->lang != language_cplus
-	       && entry->per_cu->lang != language_ada)
-	      || (entry->flags & IS_LINKAGE) != 0)
-	    entry->canonical = entry->name;
+	  if (entry->per_cu->lang == language_ada)
+	    {
+	      gdb::unique_xmalloc_ptr<char> canon_name
+		= handle_gnat_encoded_entry (entry, gnat_entries.get ());
+	      if (canon_name == nullptr)
+		entry->canonical = entry->name;
+	      else
+		{
+		  entry->canonical = canon_name.get ();
+		  m_names.push_back (std::move (canon_name));
+		}
+	    }
 	  else
 	    {
-	      if (entry->per_cu->lang == language_ada)
+	      void **slot = htab_find_slot (seen_names.get (), entry,
+					    INSERT);
+	      if (*slot == nullptr)
 		{
 		  gdb::unique_xmalloc_ptr<char> canon_name
-		    = handle_gnat_encoded_entry (entry, gnat_entries.get ());
+		    = cp_canonicalize_string (entry->name);
 		  if (canon_name == nullptr)
 		    entry->canonical = entry->name;
 		  else
@@ -298,37 +232,12 @@ cooked_index_vector::finalize ()
 		}
 	      else
 		{
-		  void **slot = htab_find_slot (seen_names.get (), entry,
-						INSERT);
-		  if (*slot == nullptr)
-		    {
-		      gdb::unique_xmalloc_ptr<char> canon_name
-			= cp_canonicalize_string (entry->name);
-		      if (canon_name == nullptr)
-			entry->canonical = entry->name;
-		      else
-			{
-			  entry->canonical = canon_name.get ();
-			  m_names.push_back (std::move (canon_name));
-			}
-		    }
-		  else
-		    {
-		      const cooked_index_entry *other
-			= (const cooked_index_entry *) *slot;
-		      entry->canonical = other->canonical;
-		    }
+		  const cooked_index_entry *other
+		    = (const cooked_index_entry *) *slot;
+		  entry->canonical = other->canonical;
 		}
 	    }
 	}
-
-      if (m_entries.empty ())
-	m_entries = std::move (entries);
-      else
-	{
-	  m_entries.reserve (m_entries.size () + entries.size ());
-	  m_entries.insert (m_entries.end (), entries.begin (), entries.end ());
-	}
     }
 
   m_names.shrink_to_fit ();
@@ -339,3 +248,98 @@ cooked_index_vector::finalize ()
 	       return *a < *b;
 	     });
 }
+
+/* See cooked-index.h.  */
+
+cooked_index::range
+cooked_index::find (gdb::string_view name, bool completing)
+{
+  wait ();
+
+  auto lower = std::lower_bound (m_entries.begin (), m_entries.end (),
+				 name,
+				 [=] (const cooked_index_entry *entry,
+				      const gdb::string_view &n)
+  {
+    int cmp = strncasecmp (entry->canonical, n.data (), n.length ());
+    if (cmp != 0 || completing)
+      return cmp < 0;
+    return strlen (entry->canonical) < n.length ();
+  });
+
+  auto upper = std::upper_bound (m_entries.begin (), m_entries.end (),
+				 name,
+				 [=] (const gdb::string_view &n,
+				      const cooked_index_entry *entry)
+  {
+    int cmp = strncasecmp (n.data (), entry->canonical, n.length ());
+    if (cmp != 0 || completing)
+      return cmp < 0;
+    return n.length () < strlen (entry->canonical);
+  });
+
+  return range (lower, upper);
+}
+
+cooked_index_vector::cooked_index_vector (vec_type &&vec)
+  : m_vector (std::move (vec))
+{
+  for (auto &idx : m_vector)
+    idx->finalize ();
+}
+
+/* See cooked-index.h.  */
+
+dwarf2_per_cu_data *
+cooked_index_vector::lookup (CORE_ADDR addr)
+{
+  for (const auto &index : m_vector)
+    {
+      dwarf2_per_cu_data *result = index->lookup (addr);
+      if (result != nullptr)
+	return result;
+    }
+  return nullptr;
+}
+
+/* See cooked-index.h.  */
+
+std::vector<addrmap *>
+cooked_index_vector::get_addrmaps ()
+{
+  std::vector<addrmap *> result;
+  for (const auto &index : m_vector)
+    result.push_back (index->m_addrmap);
+  return result;
+}
+
+/* See cooked-index.h.  */
+
+cooked_index_vector::range
+cooked_index_vector::find (gdb::string_view name, bool completing)
+{
+  std::vector<cooked_index::range> result_range;
+  for (auto &entry : m_vector)
+    result_range.push_back (entry->find (name, completing));
+  return range (std::move (result_range));
+}
+
+/* See cooked-index.h.  */
+
+const cooked_index_entry *
+cooked_index_vector::get_main () const
+{
+  const cooked_index_entry *result = nullptr;
+
+  for (const auto &index : m_vector)
+    {
+      const cooked_index_entry *entry = index->get_main ();
+      if (result == nullptr
+	  || ((result->flags & IS_MAIN) == 0
+	      && entry != nullptr
+	      && (entry->flags & IS_MAIN) != 0))
+	result = entry;
+    }
+
+  return result;
+}
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index ad4de05ba52..3b83250ef0f 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -33,6 +33,7 @@
 #include "gdbsupport/thread-pool.h"
 #include "dwarf2/mapped-index.h"
 #include "dwarf2/tag.h"
+#include "gdbsupport/range-chain.h"
 
 struct dwarf2_per_cu_data;
 
@@ -190,8 +191,34 @@ class cooked_index
     m_addrmap = addrmap_create_fixed (map, &m_storage);
   }
 
+  /* Finalize the index.  This should be called a single time, when
+     the index has been fully populated.  It enters all the entries
+     into the internal table.  */
+  void finalize ();
+
+  /* Wait for this index's finalization to be complete.  */
+  void wait ()
+  {
+    m_future.wait ();
+  }
+
   friend class cooked_index_vector;
 
+  /* A simple range over part of m_entries.  */
+  typedef iterator_range<std::vector<cooked_index_entry *>::iterator> range;
+
+  /* Return a range of all the entries.  */
+  range all_entries ()
+  {
+    wait ();
+    return { m_entries.begin (), m_entries.end () };
+  }
+
+  /* Look up an entry by name.  Returns a range of all matching
+     results.  If COMPLETING is true, then a larger range, suitable
+     for completion, will be returned.  */
+  range find (gdb::string_view name, bool completing);
+
 private:
 
   /* Return the entry that is believed to represent the program's
@@ -223,6 +250,17 @@ class cooked_index
 						per_cu);
   }
 
+  /* GNAT only emits mangled ("encoded") names in the DWARF, and does
+     not emit the module structure.  However, we need this structure
+     to do lookups.  This function recreates that structure for an
+     existing entry.  It returns the base name (last element) of the
+     full decoded name.  */
+  gdb::unique_xmalloc_ptr<char> handle_gnat_encoded_entry
+       (cooked_index_entry *entry, htab_t gnat_entries);
+
+  /* A helper method that does the work of 'finalize'.  */
+  void do_finalize ();
+
   /* Storage for the entries.  */
   auto_obstack m_storage;
   /* List of all entries.  */
@@ -233,6 +271,12 @@ class cooked_index
   /* The addrmap.  This maps address ranges to dwarf2_per_cu_data
      objects.  */
   addrmap *m_addrmap = nullptr;
+  /* Storage for canonical names.  */
+  std::vector<gdb::unique_xmalloc_ptr<char>> m_names;
+  /* A future that tracks when the 'finalize' method is done.  Note
+     that the 'get' method is never called on this future, only
+     'wait'.  */
+  gdb::future<void> m_future;
 };
 
 /* The main index of DIEs.  The parallel DIE indexers create
@@ -253,17 +297,18 @@ class cooked_index_vector : public dwarf_scanner_base
 
   ~cooked_index_vector ()
   {
-    /* The 'finalize' method may be run in a different thread.  If
-       this object is destroyed before this completes, then the method
-       will end up writing to freed memory.  Waiting for this to
+    /* The 'finalize' methods may be run in a different thread.  If
+       this object is destroyed before these complete, then one will
+       end up writing to freed memory.  Waiting for finalization to
        complete avoids this problem; and the cost seems ignorable
        because creating and immediately destroying the debug info is a
        relatively rare thing to do.  */
-    m_future.wait ();
+    for (auto &item : m_vector)
+      item->wait ();
   }
 
-  /* A simple range over part of m_entries.  */
-  typedef iterator_range<std::vector<cooked_index_entry *>::iterator> range;
+  /* A range over a vector of subranges.  */
+  typedef range_chain<cooked_index::range> range;
 
   /* Look up an entry by name.  Returns a range of all matching
      results.  If COMPLETING is true, then a larger range, suitable
@@ -273,8 +318,10 @@ class cooked_index_vector : public dwarf_scanner_base
   /* Return a range of all the entries.  */
   range all_entries ()
   {
-    m_future.wait ();
-    return { m_entries.begin (), m_entries.end () };
+    std::vector<cooked_index::range> result_range;
+    for (auto &entry : m_vector)
+      result_range.push_back (entry->all_entries ());
+    return range (std::move (result_range));
   }
 
   /* Look up ADDR in the address map, and return either the
@@ -299,33 +346,9 @@ class cooked_index_vector : public dwarf_scanner_base
 
 private:
 
-  /* GNAT only emits mangled ("encoded") names in the DWARF, and does
-     not emit the module structure.  However, we need this structure
-     to do lookups.  This function recreates that structure for an
-     existing entry.  It returns the base name (last element) of the
-     full decoded name.  */
-  gdb::unique_xmalloc_ptr<char> handle_gnat_encoded_entry
-       (cooked_index_entry *entry, htab_t gnat_entries);
-
-  /* Finalize the index.  This should be called a single time, when
-     the index has been fully populated.  It enters all the entries
-     into the internal hash table.  */
-  void finalize ();
-
   /* The vector of cooked_index objects.  This is stored because the
      entries are stored on the obstacks in those objects.  */
   vec_type m_vector;
-
-  /* List of all entries.  This is sorted during finalization.  */
-  std::vector<cooked_index_entry *> m_entries;
-
-  /* Storage for canonical names.  */
-  std::vector<gdb::unique_xmalloc_ptr<char>> m_names;
-
-  /* A future that tracks when the 'finalize' method is done.  Note
-     that the 'get' method is never called on this future, only
-     'wait'.  */
-  gdb::future<void> m_future;
 };
 
 #endif /* GDB_DWARF2_COOKED_INDEX_H */
diff --git a/gdbsupport/range-chain.h b/gdbsupport/range-chain.h
new file mode 100644
index 00000000000..9bd6eb19480
--- /dev/null
+++ b/gdbsupport/range-chain.h
@@ -0,0 +1,121 @@
+/* A range adapter that wraps multiple ranges
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifndef GDBSUPPORT_RANGE_CHAIN_H
+#define GDBSUPPORT_RANGE_CHAIN_H
+
+/* A range adapter that presents a number of ranges as if it were a
+   single range.  That is, iterating over a range_chain will iterate
+   over each sub-range in order.  */
+template<typename Range>
+struct range_chain
+{
+  /* The type of the iterator that is created by this range.  */
+  class iterator
+  {
+  public:
+
+    iterator (std::vector<Range> &ranges, size_t idx)
+      : m_index (idx),
+	m_ranges (ranges)
+    {
+      skip_empty ();
+    }
+
+    bool operator== (const iterator &other) const
+    {
+      if (m_index != other.m_index || &m_ranges != &other.m_ranges)
+	return false;
+      if (m_current.has_value () != other.m_current.has_value ())
+	return false;
+      if (m_current.has_value ())
+	return *m_current == *other.m_current;
+      return true;
+    }
+
+    bool operator!= (const iterator &other) const
+    {
+      return !(*this == other);
+    }
+
+    iterator &operator++ ()
+    {
+      ++*m_current;
+      if (*m_current == m_ranges[m_index].end ())
+	{
+	  ++m_index;
+	  skip_empty ();
+	}
+      return *this;
+    }
+
+    typename Range::iterator::value_type operator* () const
+    {
+      return **m_current;
+    }
+
+  private:
+    /* Skip empty sub-ranges.  If this finds a valid sub-range,
+       m_current is updated to point to its start; otherwise,
+       m_current is reset.  */
+    void skip_empty ()
+    {
+      for (; m_index < m_ranges.size (); ++m_index)
+	{
+	  m_current = m_ranges[m_index].begin ();
+	  if (*m_current != m_ranges[m_index].end ())
+	    return;
+	}
+      m_current.reset ();
+    }
+
+    /* Index into the vector indicating where the current iterator
+       comes from.  */
+    size_t m_index;
+    /* The current iterator into one of the vector ranges.  If no
+       value then this (outer) iterator is at the end of the overall
+       range.  */
+    gdb::optional<typename Range::iterator> m_current;
+    /* Vector of ranges.  */
+    std::vector<Range> &m_ranges;
+  };
+
+  /* Create a new range_chain, transferring in a vector of
+     sub-ranges.  */
+  range_chain (std::vector<Range> &&ranges)
+    : m_ranges (ranges)
+  {
+  }
+
+  iterator begin ()
+  {
+    return iterator (m_ranges, 0);
+  }
+
+  iterator end ()
+  {
+    return iterator (m_ranges, m_ranges.size ());
+  }
+
+private:
+
+  /* The sub-ranges.  */
+  std::vector<Range> m_ranges;
+};
+
+#endif /* GDBSUPPORT_RANGE_CHAIN_H */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Finalize each cooked index separately
  2022-05-16 18:46 [PATCH] Finalize each cooked index separately Tom Tromey
@ 2022-05-17  8:43 ` Lancelot SIX
  2022-05-17 13:59   ` Tom Tromey
  2022-05-26 13:48 ` Tom Tromey
  1 sibling, 1 reply; 4+ messages in thread
From: Lancelot SIX @ 2022-05-17  8:43 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

Hi,

I have a couple of comments/suggestions inlined in the patch below.

> @@ -339,3 +248,98 @@ cooked_index_vector::finalize ()
>  	       return *a < *b;
>  	     });
>  }
> +
> +/* See cooked-index.h.  */
> +
> +cooked_index::range
> +cooked_index::find (gdb::string_view name, bool completing)
> +{
> +  wait ();
> +
> +  auto lower = std::lower_bound (m_entries.begin (), m_entries.end (),
> +				 name,
> +				 [=] (const cooked_index_entry *entry,
> +				      const gdb::string_view &n)
> +  {
> +    int cmp = strncasecmp (entry->canonical, n.data (), n.length ());
> +    if (cmp != 0 || completing)
> +      return cmp < 0;
> +    return strlen (entry->canonical) < n.length ();
> +  });
> +
> +  auto upper = std::upper_bound (m_entries.begin (), m_entries.end (),
> +				 name,
> +				 [=] (const gdb::string_view &n,
> +				      const cooked_index_entry *entry)
> +  {
> +    int cmp = strncasecmp (n.data (), entry->canonical, n.length ());
> +    if (cmp != 0 || completing)
> +      return cmp < 0;
> +    return n.length () < strlen (entry->canonical);
> +  });
> +
> +  return range (lower, upper);
> +}
> +
> +cooked_index_vector::cooked_index_vector (vec_type &&vec)
> +  : m_vector (std::move (vec))
> +{
> +  for (auto &idx : m_vector)
> +    idx->finalize ();
> +}
> +
> +/* See cooked-index.h.  */
> +
> +dwarf2_per_cu_data *
> +cooked_index_vector::lookup (CORE_ADDR addr)
> +{
> +  for (const auto &index : m_vector)
> +    {
> +      dwarf2_per_cu_data *result = index->lookup (addr);
> +      if (result != nullptr)
> +	return result;
> +    }
> +  return nullptr;
> +}
> +
> +/* See cooked-index.h.  */
> +
> +std::vector<addrmap *>
> +cooked_index_vector::get_addrmaps ()
> +{
> +  std::vector<addrmap *> result;
> +  for (const auto &index : m_vector)
> +    result.push_back (index->m_addrmap);
> +  return result;
> +}
> +
> +/* See cooked-index.h.  */
> +
> +cooked_index_vector::range
> +cooked_index_vector::find (gdb::string_view name, bool completing)
> +{
> +  std::vector<cooked_index::range> result_range;

Since the result_range's expected size is already known, it is possible
to pre-allocate the associated storage with

  result_range.reserve (m_vector.size);

> +  for (auto &entry : m_vector)
> +    result_range.push_back (entry->find (name, completing));
> +  return range (std::move (result_range));
> +}
> +
> +/* See cooked-index.h.  */
> +
> +const cooked_index_entry *
> +cooked_index_vector::get_main () const
> +{
> +  const cooked_index_entry *result = nullptr;
> +
> +  for (const auto &index : m_vector)
> +    {
> +      const cooked_index_entry *entry = index->get_main ();
> +      if (result == nullptr
> +	  || ((result->flags & IS_MAIN) == 0
> +	      && entry != nullptr
> +	      && (entry->flags & IS_MAIN) != 0))
> +	result = entry;
> +    }
> +
> +  return result;
> +}
> diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
> index ad4de05ba52..3b83250ef0f 100644
> --- a/gdb/dwarf2/cooked-index.h
> +++ b/gdb/dwarf2/cooked-index.h
> @@ -273,8 +318,10 @@ class cooked_index_vector : public dwarf_scanner_base
>    /* Return a range of all the entries.  */
>    range all_entries ()
>    {
> -    m_future.wait ();
> -    return { m_entries.begin (), m_entries.end () };
> +    std::vector<cooked_index::range> result_range;

Similar here, reserve can come in handy.

> +    for (auto &entry : m_vector)
> +      result_range.push_back (entry->all_entries ());
> +    return range (std::move (result_range));
>    }
>  
>    /* Look up ADDR in the address map, and return either the
> diff --git a/gdbsupport/range-chain.h b/gdbsupport/range-chain.h
> new file mode 100644
> index 00000000000..9bd6eb19480
> --- /dev/null
> +++ b/gdbsupport/range-chain.h
> @@ -0,0 +1,121 @@
> +/* A range adapter that wraps multiple ranges
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +   This file is part of GDB.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GDBSUPPORT_RANGE_CHAIN_H
> +#define GDBSUPPORT_RANGE_CHAIN_H
> +
> +/* A range adapter that presents a number of ranges as if it were a
> +   single range.  That is, iterating over a range_chain will iterate
> +   over each sub-range in order.  */
> +template<typename Range>
> +struct range_chain
> +{
> +  /* The type of the iterator that is created by this range.  */
> +  class iterator
> +  {
> +  public:
> +
> +    iterator (std::vector<Range> &ranges, size_t idx)
> +      : m_index (idx),
> +	m_ranges (ranges)
> +    {
> +      skip_empty ();
> +    }
> +
> +    bool operator== (const iterator &other) const
> +    {
> +      if (m_index != other.m_index || &m_ranges != &other.m_ranges)
> +	return false;
> +      if (m_current.has_value () != other.m_current.has_value ())
> +	return false;
> +      if (m_current.has_value ())
> +	return *m_current == *other.m_current;
> +      return true;
> +    }
> +
> +    bool operator!= (const iterator &other) const
> +    {
> +      return !(*this == other);
> +    }
> +
> +    iterator &operator++ ()
> +    {
> +      ++*m_current;
> +      if (*m_current == m_ranges[m_index].end ())
> +	{
> +	  ++m_index;
> +	  skip_empty ();
> +	}
> +      return *this;
> +    }
> +
> +    typename Range::iterator::value_type operator* () const
> +    {
> +      return **m_current;
> +    }
> +
> +  private:
> +    /* Skip empty sub-ranges.  If this finds a valid sub-range,
> +       m_current is updated to point to its start; otherwise,
> +       m_current is reset.  */
> +    void skip_empty ()
> +    {
> +      for (; m_index < m_ranges.size (); ++m_index)
> +	{
> +	  m_current = m_ranges[m_index].begin ();
> +	  if (*m_current != m_ranges[m_index].end ())
> +	    return;
> +	}
> +      m_current.reset ();
> +    }
> +
> +    /* Index into the vector indicating where the current iterator
> +       comes from.  */
> +    size_t m_index;
> +    /* The current iterator into one of the vector ranges.  If no
> +       value then this (outer) iterator is at the end of the overall
> +       range.  */
> +    gdb::optional<typename Range::iterator> m_current;
> +    /* Vector of ranges.  */
> +    std::vector<Range> &m_ranges;
> +  };
> +
> +  /* Create a new range_chain, transferring in a vector of
> +     sub-ranges.  */
> +  range_chain (std::vector<Range> &&ranges)
> +    : m_ranges (ranges)

Did you forget the std::move (ranges) here? Otherwise I think m_ranges
is initialized using a copy ctor instead of the intended move ctor.

Also, I am not certain range_chain should be restricted to only take
ownership of the vector or ranges.  This version should allow to take
lavlue or rvalue ref depending on the caller:

    template<typename Ranges>
    range_chain (Ranges &&ranges)
      : m_ranges (std::forward<Ranges> (ranges))
      { }

That being said, it is not needed in your patch and if someone needs
this it will be easy enough to add at that time.

> +  {
> +  }
> +
> +  iterator begin ()

I was wondering why the begin () and end () are not const.  Turns out
that they could become const if the following adjustments are done to
the iterator class:

- The range_chain::iterator::m_vector becomes a const member
- The range_chain::iterator::iterator ctor takes its vector parameter as
  a const ref.

I do not think there is case here for the iterator to change the
underlying vector, so making this const enforces it.

WDYT?

> +  {
> +    return iterator (m_ranges, 0);
> +  }
> +
> +  iterator end ()
> +  {
> +    return iterator (m_ranges, m_ranges.size ());
> +  }
> +
> +private:
> +
> +  /* The sub-ranges.  */
> +  std::vector<Range> m_ranges;
> +};
> +
> +#endif /* GDBSUPPORT_RANGE_CHAIN_H */
> -- 
> 2.34.1
> 

Best,
Lancelot.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Finalize each cooked index separately
  2022-05-17  8:43 ` Lancelot SIX
@ 2022-05-17 13:59   ` Tom Tromey
  0 siblings, 0 replies; 4+ messages in thread
From: Tom Tromey @ 2022-05-17 13:59 UTC (permalink / raw)
  To: Lancelot SIX; +Cc: Tom Tromey, gdb-patches

>>>>> "Lancelot" == Lancelot SIX <lsix@lancelotsix.com> writes:

Lancelot> I have a couple of comments/suggestions inlined in the patch below.

Thank you for the review.

>> +  std::vector<cooked_index::range> result_range;

Lancelot> Since the result_range's expected size is already known, it is possible
Lancelot> to pre-allocate the associated storage with

Lancelot>   result_range.reserve (m_vector.size);

I've made this change in both spots.

>> +  /* Create a new range_chain, transferring in a vector of
>> +     sub-ranges.  */
>> +  range_chain (std::vector<Range> &&ranges)
>> +    : m_ranges (ranges)

Lancelot> Did you forget the std::move (ranges) here? Otherwise I think m_ranges
Lancelot> is initialized using a copy ctor instead of the intended move ctor.

Yes, oops.

Lancelot>     template<typename Ranges>
Lancelot>     range_chain (Ranges &&ranges)
Lancelot>       : m_ranges (std::forward<Ranges> (ranges))
Lancelot>       { }

I did this too.

Lancelot> I was wondering why the begin () and end () are not const.  Turns out
Lancelot> that they could become const if the following adjustments are done to
Lancelot> the iterator class:

I made these const and updated the iterator.

Tom

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Finalize each cooked index separately
  2022-05-16 18:46 [PATCH] Finalize each cooked index separately Tom Tromey
  2022-05-17  8:43 ` Lancelot SIX
@ 2022-05-26 13:48 ` Tom Tromey
  1 sibling, 0 replies; 4+ messages in thread
From: Tom Tromey @ 2022-05-26 13:48 UTC (permalink / raw)
  To: Tom Tromey via Gdb-patches; +Cc: Tom Tromey

>>>>> "Tom" == Tom Tromey via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> This can be sped up by parallelizing, at a small memory cost.  Now
Tom> each index is finalized on its own, in a worker thread.

I'm checking this in now.

Tom

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-05-26 13:48 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-16 18:46 [PATCH] Finalize each cooked index separately Tom Tromey
2022-05-17  8:43 ` Lancelot SIX
2022-05-17 13:59   ` Tom Tromey
2022-05-26 13:48 ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).