[PATCH 6/6] .gdb_index prod perf regression: mapped_symtab now vector of values

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

From: Pedro Alves <palves@redhat.com>
To: gdb-patches@sourceware.org
Subject: [PATCH 6/6] .gdb_index prod perf regression: mapped_symtab now vector of values
Date: Mon, 12 Jun 2017 16:14:00 -0000	[thread overview]
Message-ID: <1497284051-13795-6-git-send-email-palves@redhat.com> (raw)
In-Reply-To: <1497284051-13795-1-git-send-email-palves@redhat.com>
In-Reply-To: <8efc0742-1014-4fe0-6948-f40a9c5c4975@redhat.com>

... instead of vector of pointers

There's no real reason for having mapped_symtab::data be a vector of
heap-allocated symtab_index_entries.  symtab_index_entries is not that
large, it's movable, and it's cheap to move.  Making the vector hold
values instead improves cache locality and eliminates many roundtrips
to the heap.

Using the same test as in the previous patch, against the same gdb
inferior, timing improves ~13% further:

  ~6.0s => ~5.2s (average of 5 runs).

Note that before the .gdb_index C++ifycation patch, we were at ~5.7s.
We're now consistenly better than before.

gdb/ChangeLog
2017-06-12  Pedro Alves  <palves@redhat.com>

	* dwarf2read.c (mapped_symtab::data): Now a vector of
	symtab_index_entry instead of vector of
	std::unique_ptr<symtab_index_entry>.  All users adjusted to check
	whether an element's name is NULL instead of checking whether the
	element itself is NULL.
	(find_slot): Change return type.  Adjust.
	(hash_expand, , add_index_entry, uniquify_cu_indices)
	(write_hash_table): Adjust.
---
 gdb/ChangeLog    | 11 +++++++++++
 gdb/dwarf2read.c | 56 ++++++++++++++++++++++++++++----------------------------
 2 files changed, 39 insertions(+), 28 deletions(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 9dbc059..82f972a 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,5 +1,16 @@
 2017-06-12  Pedro Alves  <palves@redhat.com>
 
+	* dwarf2read.c (mapped_symtab::data): Now a vector of
+	symtab_index_entry instead of vector of
+	std::unique_ptr<symtab_index_entry>.  All users adjusted to check
+	whether an element's name is NULL instead of checking whether the
+	element itself is NULL.
+	(find_slot): Change return type.  Adjust.
+	(hash_expand, , add_index_entry, uniquify_cu_indices)
+	(write_hash_table): Adjust.
+
+2017-06-12  Pedro Alves  <palves@redhat.com>
+
 	* dwarf2read.c (recursively_count_psymbols): New function.
 	(write_psymtabs_to_index): Call it to compute number of psyms and
 	pass estimate size of psyms_seen to unordered_set's ctor.
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index bff2fcb..3f872b7 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -23269,7 +23269,7 @@ struct mapped_symtab
   }
 
   offset_type n_elements = 0;
-  std::vector<std::unique_ptr<symtab_index_entry>> data;
+  std::vector<symtab_index_entry> data;
 };
 
 /* Find a slot in SYMTAB for the symbol NAME.  Returns a reference to
@@ -23278,7 +23278,7 @@ struct mapped_symtab
    Function is used only during write_hash_table so no index format backward
    compatibility is needed.  */
 
-static std::unique_ptr<symtab_index_entry> &
+static symtab_index_entry &
 find_slot (struct mapped_symtab *symtab, const char *name)
 {
   offset_type index, step, hash = mapped_index_string_hash (INT_MAX, name);
@@ -23288,7 +23288,8 @@ find_slot (struct mapped_symtab *symtab, const char *name)
 
   for (;;)
     {
-      if (!symtab->data[index] || !strcmp (name, symtab->data[index]->name))
+      if (symtab->data[index].name == NULL
+	  || strcmp (name, symtab->data[index].name) == 0)
 	return symtab->data[index];
       index = (index + step) & (symtab->data.size () - 1);
     }
@@ -23305,9 +23306,9 @@ hash_expand (struct mapped_symtab *symtab)
   symtab->data.resize (old_entries.size () * 2);
 
   for (auto &it : old_entries)
-    if (it != NULL)
+    if (it.name != NULL)
       {
-	auto &ref = find_slot (symtab, it->name);
+	auto &ref = find_slot (symtab, it.name);
 	ref = std::move (it);
       }
 }
@@ -23327,11 +23328,10 @@ add_index_entry (struct mapped_symtab *symtab, const char *name,
   if (4 * symtab->n_elements / 3 >= symtab->data.size ())
     hash_expand (symtab);
 
-  std::unique_ptr<symtab_index_entry> &slot = find_slot (symtab, name);
-  if (slot == NULL)
+  symtab_index_entry &slot = find_slot (symtab, name);
+  if (slot.name == NULL)
     {
-      slot.reset (new symtab_index_entry ());
-      slot->name = name;
+      slot.name = name;
       /* index_offset is set later.  */
     }
 
@@ -23347,7 +23347,7 @@ add_index_entry (struct mapped_symtab *symtab, const char *name,
      the last entry pushed), but a symbol could have multiple kinds in one CU.
      To keep things simple we don't worry about the duplication here and
      sort and uniqufy the list after we've processed all symbols.  */
-  slot->cu_indices.push_back (cu_index_and_attrs);
+  slot.cu_indices.push_back (cu_index_and_attrs);
 }
 
 /* Sort and remove duplicates of all symbols' cu_indices lists.  */
@@ -23355,11 +23355,11 @@ add_index_entry (struct mapped_symtab *symtab, const char *name,
 static void
 uniquify_cu_indices (struct mapped_symtab *symtab)
 {
-  for (const auto &entry : symtab->data)
+  for (auto &entry : symtab->data)
     {
-      if (entry && !entry->cu_indices.empty ())
+      if (entry.name != NULL && !entry.cu_indices.empty ())
 	{
-	  auto &cu_indices = entry->cu_indices;
+	  auto &cu_indices = entry.cu_indices;
 	  std::sort (cu_indices.begin (), cu_indices.end ());
 	  auto from = std::unique (cu_indices.begin (), cu_indices.end ());
 	  cu_indices.erase (from, cu_indices.end ());
@@ -23425,11 +23425,11 @@ write_hash_table (mapped_symtab *symtab, data_buf &output, data_buf &cpool)
 
     /* We add all the index vectors to the constant pool first, to
        ensure alignment is ok.  */
-    for (const std::unique_ptr<symtab_index_entry> &it : symtab->data)
+    for (symtab_index_entry &entry : symtab->data)
       {
-	if (it == NULL)
+	if (entry.name == NULL)
 	  continue;
-	gdb_assert (it->index_offset == 0);
+	gdb_assert (entry.index_offset == 0);
 
 	/* Finding before inserting is faster than always trying to
 	   insert, because inserting always allocates a node, does the
@@ -23437,34 +23437,34 @@ write_hash_table (mapped_symtab *symtab, data_buf &output, data_buf &cpool)
 	   already had the same key.  C++17 try_emplace will avoid
 	   this.  */
 	const auto found
-	  = symbol_hash_table.find (it->cu_indices);
+	  = symbol_hash_table.find (entry.cu_indices);
 	if (found != symbol_hash_table.end ())
 	  {
-	    it->index_offset = found->second;
+	    entry.index_offset = found->second;
 	    continue;
 	  }
 
-	symbol_hash_table.emplace (it->cu_indices, cpool.size ());
-	it->index_offset = cpool.size ();
-	cpool.append_data (MAYBE_SWAP (it->cu_indices.size ()));
-	for (const auto iter : it->cu_indices)
-	  cpool.append_data (MAYBE_SWAP (iter));
+	symbol_hash_table.emplace (entry.cu_indices, cpool.size ());
+	entry.index_offset = cpool.size ();
+	cpool.append_data (MAYBE_SWAP (entry.cu_indices.size ()));
+	for (const auto index : entry.cu_indices)
+	  cpool.append_data (MAYBE_SWAP (index));
       }
   }
 
   /* Now write out the hash table.  */
   std::unordered_map<c_str_view, offset_type, c_str_view_hasher> str_table;
-  for (const auto &it : symtab->data)
+  for (const auto &entry : symtab->data)
     {
       offset_type str_off, vec_off;
 
-      if (it != NULL)
+      if (entry.name != NULL)
 	{
-	  const auto insertpair = str_table.emplace (it->name, cpool.size ());
+	  const auto insertpair = str_table.emplace (entry.name, cpool.size ());
 	  if (insertpair.second)
-	    cpool.append_cstr0 (it->name);
+	    cpool.append_cstr0 (entry.name);
 	  str_off = insertpair.first->second;
-	  vec_off = it->index_offset;
+	  vec_off = entry.index_offset;
 	}
       else
 	{
-- 
2.5.5

next prev parent reply	other threads:[~2017-06-12 16:14 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-26 18:25 [PATCH 0/6] DWARF-5: .debug_names index Jan Kratochvil
2017-05-26 18:25 ` [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Jan Kratochvil
2017-06-12 16:08   ` [pushed] " Pedro Alves
2017-06-12 16:14     ` [PATCH 4/6] .gdb_index prod perf regression: find before insert in unordered_map Pedro Alves
2017-06-12 16:14     ` Pedro Alves [this message]
2017-06-12 16:14     ` [PATCH 1/6] Code cleanup: dwarf2read.c:uniquify_cu_indices: Use std::unique Pedro Alves
2017-06-12 16:14     ` [PATCH 2/6] Code cleanup: dwarf2read.c: Eliminate ::file_write Pedro Alves
2017-06-17 17:35       ` Jan Kratochvil
2017-06-19  9:26         ` Pedro Alves
2017-06-18 18:36       ` Regression: " Jan Kratochvil
2017-06-19  9:27         ` Pedro Alves
2017-06-19  9:39           ` Jan Kratochvil
2017-06-19  9:47             ` Pedro Alves
2017-06-19 10:03               ` Jan Kratochvil
2017-06-19 10:35                 ` Pedro Alves
2017-06-19 12:06                 ` Yao Qi
2017-06-19 12:26                   ` Jan Kratochvil
2017-06-12 16:14     ` [PATCH 3/6] Code cleanup: dwarf2read.c: Add data_buf::append_uint Pedro Alves
2017-06-12 16:18     ` [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Pedro Alves
2017-06-12 16:19     ` [PATCH 5/6] .gdb_index prod perf regression: Estimate size of psyms_seen Pedro Alves
2017-06-18 14:25     ` [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Jan Kratochvil
2017-06-18 15:12       ` Eli Zaretskii
2017-06-19 11:50         ` [pushed] .gdb_index writer: close the file before unlinking it (Re: [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer.) Pedro Alves
2017-06-18 16:50     ` [pushed] Re: [PATCH 1/6] Code cleanup: C++ify .gdb_index producer Jan Kratochvil
2017-05-26 18:25 ` [PATCH 2/6] cc-with-tweaks.sh: Use gdb-add-index.sh Jan Kratochvil
2017-05-26 18:26 ` [PATCH 4/6] Code cleanup: dwarf2_initialize_objfile return value Jan Kratochvil
2017-05-26 18:26 ` [PATCH 3/6] DWARF-5: .debug_names index producer Jan Kratochvil
2017-06-09  5:58   ` [PATCH 3.1/6] " Jan Kratochvil
2017-05-26 18:26 ` [PATCH 5/6] Refactor: Move some generic code out of .gdb_index code Jan Kratochvil
2017-05-26 18:26 ` [PATCH 6/6] DWARF-5: .debug_names index consumer Jan Kratochvil
2017-06-18 19:37 ` obsolete: [PATCH 0/6] DWARF-5: .debug_names index Jan Kratochvil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1497284051-13795-6-git-send-email-palves@redhat.com \
    --to=palves@redhat.com \
    --cc=gdb-patches@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).