public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Add a trie to map quickly from address range to compilation unit.
@ 2022-03-21  9:40 Steinar H. Gunderson
  2022-03-23 14:14 ` Nick Clifton
  0 siblings, 1 reply; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-21  9:40 UTC (permalink / raw)
  To: binutils; +Cc: sesse, Steinar H. Gunderson

When using perf to profile large binaries, _bfd_dwarf2_find_nearest_line()
becomes a hotspot, as perf wants to get line number information
(for inline-detection purposes) for each and every sample. In Chromium
in particular (the content_shell binary), this entails going through
475k address ranges, which takes a long time when done repeatedly.

Add a radix-256 trie over the address space to quickly map address to
compilation unit spaces; for content_shell, which is 1.6 GB when some
(but not full) debug information turned is on, we go from 6 ms to
0.006 ms (6 µs) for each lookup from address to compilation unit, a 1000x
speedup. There is still a slow part left in mapping to symbol name,
but this at least gets us halfway there.

There is a modest RAM increase of 180 MB in this binary (the existing
linked list over ranges uses about 10 MB, and the entire perf job uses
between 2–3 GB for a medium-size profile); for smaller binaries with few
ranges, there should be hardly any extra RAM usage at all.
---
 bfd/dwarf2.c | 442 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 390 insertions(+), 52 deletions(-)

diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c
index fdf071c36e9..5156ed98c16 100644
--- a/bfd/dwarf2.c
+++ b/bfd/dwarf2.c
@@ -82,6 +82,76 @@ struct adjusted_section
   bfd_vma adj_vma;
 };
 
+/* A trie to map quickly from address range to compilation unit.
+
+   This is a fairly standard radix-256 trie, used to quickly locate which
+   compilation unit any given address belongs to.  Given that each compilation
+   unit may register hundreds of very small and unaligned ranges (which may
+   potentially overlap, due to inlining and other concerns), and a large
+   program may end up containing hundreds of thousands of such ranges, we cannot
+   scan through them linearly without undue slowdown.
+
+   We use a hybrid trie to avoid memory explosion: There are two types of trie
+   nodes, leaves and interior nodes.  (Almost all nodes are leaves, so they
+   take up the bulk of the memory usage.) Leaves contain a simple array of
+   ranges (high/low address) and which compilation unit contains those ranges,
+   and when we get to a leaf, we scan through it linearly.  Interior nodes
+   contain pointers to 256 othr nodes, keyed by the next byte of the address.
+   So for a 64-bit address like 0x1234567abcd, we would start at the root and go
+   down child[0x00]->child[0x00]->child[0x01]->child[0x23]->child[0x45] etc.,
+   until we hit a leaf.  (Nodes are, in general, leaves until they exceed the
+   default allocation of 16 elements, at which point they are converted to
+   interior node if posisble.) This gives us near-constant lookup times;
+   the only thing that can be costly is if there are lots of overlapping ranges
+   within a single 256-byte segment of the binary, in which case we have to
+   scan through them all to find the best match.
+
+   For a binary with few ranges, we will in practice only have a single leaf node
+   at the root, containing a simple array.  Thus, the scheme is efficient for
+   both small and large binaries.
+ */
+
+/* Experiments have shown 16 to be a memory-efficient default leaf size.
+   The only case where a leaf will hold more memory than this, is at the
+   bottomost level (covering 256 bytes in the binary), where we'll expand
+   the leaf to be able to hold more ranges if needed.
+ */
+#define TRIE_LEAF_SIZE 16
+
+/* All trie_node pointers will really be trie_leaf or trie_interior,
+   but they have this common head.  */
+struct trie_node
+{
+  /* If zero, we are an interior node.
+     Otherwise, how many ranges we have room for in this leaf.  */
+  int num_room_in_leaf;
+};
+
+struct trie_leaf
+{
+  struct trie_node head;
+  int num_stored_in_leaf;
+  struct {
+    const struct comp_unit *unit;
+    bfd_vma low_pc, high_pc;
+  } ranges[TRIE_LEAF_SIZE];
+};
+
+struct trie_interior
+{
+  struct trie_node head;
+  struct trie_node *children[256];
+};
+
+static struct trie_node *alloc_trie_leaf (bfd *abfd)
+{
+  struct trie_leaf *leaf =
+    (struct trie_leaf *) bfd_zalloc (abfd, sizeof (struct trie_leaf));
+  if (leaf != NULL)
+    leaf->head.num_room_in_leaf = TRIE_LEAF_SIZE;
+  return (struct trie_node *) leaf;
+}
+
 struct dwarf2_debug_file
 {
   /* The actual bfd from which debug info was loaded.  Might be
@@ -139,6 +209,9 @@ struct dwarf2_debug_file
   /* A list of all previously read comp_units.  */
   struct comp_unit *all_comp_units;
 
+  /* A list of all previously read comp_units with no ranges (yet).  */
+  struct comp_unit *all_comp_units_without_ranges;
+
   /* Last comp unit in list above.  */
   struct comp_unit *last_comp_unit;
 
@@ -147,6 +220,9 @@ struct dwarf2_debug_file
 
   /* Hash table to map offsets to decoded abbrevs.  */
   htab_t abbrev_offsets;
+
+  /* Root of a trie to map addresses to compilation units.  */
+  struct trie_node *trie_root;
 };
 
 struct dwarf2_debug
@@ -220,6 +296,11 @@ struct comp_unit
   /* Chain the previously read compilation units.  */
   struct comp_unit *next_unit;
 
+  /* Chain the previously read compilation units that have no ranges yet.
+     We scan these separately when we have a trie over the ranges.
+     Unused if arange.high != 0. */
+  struct comp_unit *next_unit_without_ranges;
+
   /* Likewise, chain the compilation unit read after this one.
      The comp units are stored in reversed reading order.  */
   struct comp_unit *prev_unit;
@@ -296,6 +377,10 @@ struct comp_unit
 
   /* TRUE if symbols are cached in hash table for faster lookup by name.  */
   bool cached;
+
+  /* Used when iterating over trie leaves to know which units we have
+     already seen in this iteration.  */
+  bool mark;
 };
 
 /* This data structure holds the information of an abbrev.  */
@@ -1766,9 +1851,182 @@ concat_filename (struct line_info_table *table, unsigned int file)
   return strdup (filename);
 }
 
+#define VMA_BITS (8 * sizeof (bfd_vma))
+
+/* Check whether [low1, high1) can be combined with [low2, high2),
+   i.e., they touch or overlap.  */
+static bool ranges_overlap(bfd_vma low1, bfd_vma high1, bfd_vma low2, bfd_vma high2)
+{
+  if (low1 == low2 || high1 == high2)
+    return true;
+
+  /* Sort so that low1 is below low2. */
+  if (low1 > low2)
+    {
+      bfd_vma tmp;
+
+      tmp = low1;
+      low1 = low2;
+      low2 = tmp;
+
+      tmp = high1;
+      high1 = high2;
+      high2 = tmp;
+    }
+
+  /* We touch iff low2 == high1.
+     We overlap iff low2 is within [low1, high1). */
+  return (low2 <= high1);
+}
+
+/* Insert an address range in the trie mapping addresses to compilation units.
+   Will return the new trie node (usually the same as is being sent in, but
+   in case of a leaf-to-interior conversion, or expansion of a leaf, it may be
+   different), or NULL on failure.
+ */
+static struct trie_node *insert_arange_in_trie(bfd *abfd,
+					       struct trie_node *trie,
+					       bfd_vma trie_pc,
+					       int trie_pc_bits,
+					       const struct comp_unit *unit,
+					       bfd_vma low_pc,
+					       bfd_vma high_pc)
+{
+  bfd_vma clamped_low_pc, clamped_high_pc;
+  int ch, from_ch, to_ch;
+  bool is_full_leaf = false;
+
+  /* See if we can extend any of the existing ranges.  This merging
+     isn't perfect (if merging opens up the possibility of merging two existing
+     ranges, we won't find them), but it takes the majority ot the cases.  */
+  if (trie->num_room_in_leaf > 0)
+    {
+      struct trie_leaf *leaf = (struct trie_leaf *) trie;
+      int i;
+
+      for (i = 0; i < leaf->num_stored_in_leaf; ++i)
+	{
+	  if (leaf->ranges[i].unit == unit &&
+	      ranges_overlap(low_pc, high_pc,
+			     leaf->ranges[i].low_pc, leaf->ranges[i].high_pc))
+	    {
+	      if (low_pc < leaf->ranges[i].low_pc)
+		leaf->ranges[i].low_pc = low_pc;
+	      if (high_pc > leaf->ranges[i].high_pc)
+		leaf->ranges[i].high_pc = high_pc;
+	      return trie;
+	    }
+	}
+
+      is_full_leaf = leaf->num_stored_in_leaf == trie->num_room_in_leaf;
+    }
+
+  /* If we're a leaf with no more room and we're _not_ at the bottom,
+     convert to an interior node.  */
+  if (is_full_leaf && trie_pc_bits < (int)VMA_BITS)
+    {
+      const struct trie_leaf *leaf = (struct trie_leaf *) trie;
+      int i;
+
+      trie = (struct trie_node *) bfd_zalloc (abfd, sizeof (struct trie_interior));
+      if (!trie)
+	return NULL;
+      is_full_leaf = false;
+
+      /* TODO: If we wanted to save a little more memory at the cost of
+	 complexity, we could have reused the old leaf node as one of the
+	 children of the new interior node, instead of throwing it away.  */
+      for (i = 0; i < leaf->num_stored_in_leaf; ++i)
+        {
+	  if (!insert_arange_in_trie (abfd, trie, trie_pc, trie_pc_bits,
+				      leaf->ranges[i].unit, leaf->ranges[i].low_pc,
+				      leaf->ranges[i].high_pc))
+	    return NULL;
+	}
+    }
+
+  /* If we're a leaf with no more room and we _are_ at the bottom,
+     we have no choice but to just make it larger. */
+  if (is_full_leaf)
+    {
+      const struct trie_leaf *leaf = (struct trie_leaf *) trie;
+      int new_room_in_leaf = trie->num_room_in_leaf * 2;
+      struct trie_leaf *new_leaf;
+
+      new_leaf = (struct trie_leaf *) bfd_zalloc (abfd,
+	sizeof (struct trie_leaf) + (new_room_in_leaf - TRIE_LEAF_SIZE) * sizeof (leaf->ranges[0]));
+      new_leaf->head.num_room_in_leaf = new_room_in_leaf;
+      new_leaf->num_stored_in_leaf = leaf->num_stored_in_leaf;
+
+      memcpy (new_leaf->ranges,
+	      leaf->ranges,
+	      leaf->num_stored_in_leaf * sizeof (leaf->ranges[0]));
+      trie = (struct trie_node *) new_leaf;
+      is_full_leaf = false;
+
+      /* Now the insert below will go through.  */
+    }
+
+  /* If we're a leaf (now with room), we can just insert at the end.  */
+  if (trie->num_room_in_leaf > 0)
+    {
+      struct trie_leaf *leaf = (struct trie_leaf *) trie;
+
+      int i = leaf->num_stored_in_leaf++;
+      leaf->ranges[i].unit = unit;
+      leaf->ranges[i].low_pc = low_pc;
+      leaf->ranges[i].high_pc = high_pc;
+      return trie;
+    }
+
+  /* Now we are definitely an interior node, so recurse into all the relevant buckets.  */
+
+  /* Clamp the range to the current trie bucket.  */
+  clamped_low_pc = low_pc;
+  clamped_high_pc = high_pc;
+  if (trie_pc_bits > 0)
+    {
+      bfd_vma bucket_high_pc = trie_pc + ((bfd_vma)-1 >> trie_pc_bits);  /* Inclusive.  */
+      if (clamped_low_pc < trie_pc)
+	clamped_low_pc = trie_pc;
+      if (clamped_high_pc > bucket_high_pc)
+	clamped_high_pc = bucket_high_pc;
+    }
+
+  /* Insert the ranges in all buckets that it spans.  */
+  from_ch = (clamped_low_pc >> (VMA_BITS - trie_pc_bits - 8)) & 0xff;
+  to_ch = ((clamped_high_pc - 1) >> (VMA_BITS - trie_pc_bits - 8)) & 0xff;
+  for (ch = from_ch; ch <= to_ch; ++ch)
+    {
+      struct trie_interior *interior = (struct trie_interior *) trie;
+      struct trie_node *child = interior->children[ch];
+
+      if (child == NULL)
+        {
+	  child = alloc_trie_leaf (abfd);
+	  if (!child)
+	    return NULL;
+	}
+      child = insert_arange_in_trie (abfd,
+				     child,
+				     trie_pc + ((bfd_vma)ch << (VMA_BITS - trie_pc_bits - 8)),
+				     trie_pc_bits + 8,
+				     unit,
+				     low_pc,
+				     high_pc);
+      if (!child)
+	return NULL;
+
+      interior->children[ch] = child;
+    }
+
+    return trie;
+}
+
+
 static bool
 arange_add (const struct comp_unit *unit, struct arange *first_arange,
-	    bfd_vma low_pc, bfd_vma high_pc)
+	    struct trie_node **trie_root, bfd_vma low_pc, bfd_vma high_pc)
 {
   struct arange *arange;
 
@@ -1776,6 +2034,19 @@ arange_add (const struct comp_unit *unit, struct arange *first_arange,
   if (low_pc == high_pc)
     return true;
 
+  if (trie_root != NULL)
+    {
+      *trie_root = insert_arange_in_trie (unit->file->bfd_ptr,
+					  *trie_root,
+					  0,
+					  0,
+					  unit,
+					  low_pc,
+					  high_pc);
+      if (*trie_root == NULL)
+	return false;
+    }
+
   /* If the first arange is empty, use it.  */
   if (first_arange->high == 0)
     {
@@ -2410,7 +2681,7 @@ decode_line_info (struct comp_unit *unit)
 		    low_pc = address;
 		  if (address > high_pc)
 		    high_pc = address;
-		  if (!arange_add (unit, &unit->arange, low_pc, high_pc))
+		  if (!arange_add (unit, &unit->arange, &unit->file->trie_root, low_pc, high_pc))
 		    goto line_fail;
 		  break;
 		case DW_LNE_set_address:
@@ -3134,7 +3405,7 @@ find_abstract_instance (struct comp_unit *unit,
 
 static bool
 read_ranges (struct comp_unit *unit, struct arange *arange,
-	     bfd_uint64_t offset)
+	     struct trie_node **trie_root, bfd_uint64_t offset)
 {
   bfd_byte *ranges_ptr;
   bfd_byte *ranges_end;
@@ -3169,7 +3440,7 @@ read_ranges (struct comp_unit *unit, struct arange *arange,
 	base_address = high_pc;
       else
 	{
-	  if (!arange_add (unit, arange,
+	  if (!arange_add (unit, arange, trie_root,
 			   base_address + low_pc, base_address + high_pc))
 	    return false;
 	}
@@ -3179,7 +3450,7 @@ read_ranges (struct comp_unit *unit, struct arange *arange,
 
 static bool
 read_rnglists (struct comp_unit *unit, struct arange *arange,
-	       bfd_uint64_t offset)
+	       struct trie_node **trie_root, bfd_uint64_t offset)
 {
   bfd_byte *rngs_ptr;
   bfd_byte *rngs_end;
@@ -3253,19 +3524,19 @@ read_rnglists (struct comp_unit *unit, struct arange *arange,
 	  return false;
 	}
 
-      if (!arange_add (unit, arange, low_pc, high_pc))
+      if (!arange_add (unit, arange, trie_root, low_pc, high_pc))
 	return false;
     }
 }
 
 static bool
 read_rangelist (struct comp_unit *unit, struct arange *arange,
-		bfd_uint64_t offset)
+		struct trie_node **trie_root, bfd_uint64_t offset)
 {
   if (unit->version <= 4)
-    return read_ranges (unit, arange, offset);
+    return read_ranges (unit, arange, trie_root, offset);
   else
-    return read_rnglists (unit, arange, offset);
+    return read_rnglists (unit, arange, trie_root, offset);
 }
 
 static struct funcinfo *
@@ -3563,7 +3834,7 @@ scan_unit_for_symbols (struct comp_unit *unit)
 
 		case DW_AT_ranges:
 		  if (is_int_form (&attr)
-		      && !read_rangelist (unit, &func->arange, attr.u.val))
+		      && !read_rangelist (unit, &func->arange, NULL, attr.u.val))
 		    goto fail;
 		  break;
 
@@ -3679,7 +3950,7 @@ scan_unit_for_symbols (struct comp_unit *unit)
 
       if (func && high_pc != 0)
 	{
-	  if (!arange_add (unit, &func->arange, low_pc, high_pc))
+	  if (!arange_add (unit, &func->arange, NULL, low_pc, high_pc))
 	    goto fail;
 	}
     }
@@ -3874,7 +4145,7 @@ parse_comp_unit (struct dwarf2_debug *stash,
 
 	case DW_AT_ranges:
 	  if (is_int_form (&attr)
-	      && !read_rangelist (unit, &unit->arange, attr.u.val))
+	      && !read_rangelist (unit, &unit->arange, &unit->file->trie_root, attr.u.val))
 	    return NULL;
 	  break;
 
@@ -3916,7 +4187,7 @@ parse_comp_unit (struct dwarf2_debug *stash,
     high_pc += low_pc;
   if (high_pc != 0)
     {
-      if (!arange_add (unit, &unit->arange, low_pc, high_pc))
+      if (!arange_add (unit, &unit->arange, &unit->file->trie_root, low_pc, high_pc))
 	return NULL;
     }
 
@@ -3982,6 +4253,47 @@ comp_unit_find_nearest_line (struct comp_unit *unit,
 					    discriminator_ptr);
 }
 
+static void
+comp_unit_find_nearest_narrowest_line (struct comp_unit *unit,
+				       bfd_vma addr,
+				       const char **filename_ptr,
+				       struct funcinfo **function_ptr,
+				       unsigned int *linenumber_ptr,
+				       unsigned int *discriminator_ptr,
+				       bfd_vma *min_range)
+{
+  const char * local_filename = NULL;
+  struct funcinfo *local_function = NULL;
+  unsigned int local_linenumber = 0;
+  unsigned int local_discriminator = 0;
+  bfd_vma range;
+
+  range = comp_unit_find_nearest_line (unit, addr, &local_filename,
+				       &local_function, &local_linenumber,
+				       &local_discriminator);
+  if (range == 0)
+    return;
+
+  /* PRs 15935 15994: Bogus debug information may have provided us
+     with an erroneous match.  We attempt to counter this by
+     selecting the match that has the smallest address range
+     associated with it.  (We are assuming that corrupt debug info
+     will tend to result in extra large address ranges rather than
+     extra small ranges).  */
+  if (range <= *min_range)
+    {
+      if (filename_ptr && local_filename)
+        * filename_ptr = local_filename;
+      if (function_ptr && local_function)
+        * function_ptr = local_function;
+      if (discriminator_ptr && local_discriminator)
+        * discriminator_ptr = local_discriminator;
+      if (linenumber_ptr && local_linenumber)
+        * linenumber_ptr = local_linenumber;
+      *min_range = range;
+    }
+}
+
 /* Check to see if line info is already decoded in a comp_unit.
    If not, decode it.  Returns TRUE if no errors were encountered;
    FALSE otherwise.  */
@@ -4747,6 +5059,14 @@ _bfd_dwarf2_slurp_debug_info (bfd *abfd, bfd *debug_bfd,
   if (!stash->alt.abbrev_offsets)
     return false;
 
+  stash->f.trie_root = alloc_trie_leaf (abfd);
+  if (!stash->f.trie_root)
+    return false;
+
+  stash->alt.trie_root = alloc_trie_leaf (abfd);
+  if (!stash->alt.trie_root)
+    return false;
+
   *pinfo = stash;
 
   if (debug_bfd == NULL)
@@ -4918,6 +5238,12 @@ stash_comp_unit (struct dwarf2_debug *stash, struct dwarf2_debug_file *file)
 	  each->next_unit = file->all_comp_units;
 	  file->all_comp_units = each;
 
+          if (each->arange.high == 0)
+            {
+              each->next_unit_without_ranges = file->all_comp_units_without_ranges;
+              file->all_comp_units_without_ranges = each->next_unit_without_ranges;
+            }
+
 	  file->info_ptr += length;
 	  return each;
 	}
@@ -5161,49 +5487,61 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
   else
     {
       bfd_vma min_range = (bfd_vma) -1;
-      const char * local_filename = NULL;
-      struct funcinfo *local_function = NULL;
-      unsigned int local_linenumber = 0;
-      unsigned int local_discriminator = 0;
+      struct trie_node *trie = stash->f.trie_root;
+      int bits = VMA_BITS - 8;
+      struct comp_unit **prev_each;
 
-      for (each = stash->f.all_comp_units; each; each = each->next_unit)
+      /* Traverse interior nodes until we get to a leaf.  */
+      while (trie && trie->num_room_in_leaf == 0)
 	{
-	  bfd_vma range = (bfd_vma) -1;
-
-	  found = ((each->arange.high == 0
-		    || comp_unit_contains_address (each, addr))
-		   && (range = (comp_unit_find_nearest_line
-				(each, addr, &local_filename,
-				 &local_function, &local_linenumber,
-				 &local_discriminator))) != 0);
-	  if (found)
-	    {
-	      /* PRs 15935 15994: Bogus debug information may have provided us
-		 with an erroneous match.  We attempt to counter this by
-		 selecting the match that has the smallest address range
-		 associated with it.  (We are assuming that corrupt debug info
-		 will tend to result in extra large address ranges rather than
-		 extra small ranges).
-
-		 This does mean that we scan through all of the CUs associated
-		 with the bfd each time this function is called.  But this does
-		 have the benefit of producing consistent results every time the
-		 function is called.  */
-	      if (range <= min_range)
-		{
-		  if (filename_ptr && local_filename)
-		    * filename_ptr = local_filename;
-		  if (local_function)
-		    function = local_function;
-		  if (discriminator_ptr && local_discriminator)
-		    * discriminator_ptr = local_discriminator;
-		  if (local_linenumber)
-		    * linenumber_ptr = local_linenumber;
-		  min_range = range;
-		}
-	    }
+	  int ch = (addr >> bits) & 0xff;
+	  trie = ((struct trie_interior *) trie)->children[ch];
+	  bits -= 8;
 	}
 
+      if (trie)
+	{
+	  const struct trie_leaf *leaf = (struct trie_leaf *) trie;
+	  int i;
+
+	  for (i = 0; i < leaf->num_stored_in_leaf; ++i)
+	    {
+	      struct comp_unit *unit = (struct comp_unit *) leaf->ranges[i].unit;
+	      unit->mark = false;
+	    }
+
+	  for (i = 0; i < leaf->num_stored_in_leaf; ++i)
+	    {
+	      struct comp_unit *unit = (struct comp_unit *) leaf->ranges[i].unit;
+	      if (unit->mark ||
+	          addr < leaf->ranges[i].low_pc ||
+	          addr >= leaf->ranges[i].high_pc)
+	        continue;
+	      unit->mark = true;
+
+	      comp_unit_find_nearest_narrowest_line (unit, addr, filename_ptr,
+						     &function, discriminator_ptr,
+						     linenumber_ptr, &min_range);
+	   }
+	}
+
+      /* Also scan through all compilation units without any ranges,
+         taking them out of the list if they have acquired any since last time.  */
+      prev_each = &stash->f.all_comp_units_without_ranges;
+      for (each = *prev_each; each; each = each->next_unit_without_ranges)
+        {
+	  if (each->arange.high != 0)
+	    {
+	      *prev_each = each->next_unit_without_ranges;
+	      continue;
+	    }
+
+	  comp_unit_find_nearest_narrowest_line (each, addr, filename_ptr,
+						 &function, discriminator_ptr,
+						 linenumber_ptr, &min_range);
+	  prev_each = &each->next_unit_without_ranges;
+        }
+
       if (* linenumber_ptr)
 	{
 	  found = true;
-- 
2.35.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-21  9:40 [PATCH] Add a trie to map quickly from address range to compilation unit Steinar H. Gunderson
@ 2022-03-23 14:14 ` Nick Clifton
  2022-03-23 15:53   ` Steinar H. Gunderson
  2022-03-23 22:24   ` Steinar H. Gunderson
  0 siblings, 2 replies; 14+ messages in thread
From: Nick Clifton @ 2022-03-23 14:14 UTC (permalink / raw)
  To: Steinar H. Gunderson, binutils; +Cc: sesse

Hi Steinar,

> Add a radix-256 trie over the address space to quickly map address to
> compilation unit spaces; for content_shell, which is 1.6 GB when some
> (but not full) debug information turned is on, we go from 6 ms to
> 0.006 ms (6 µs) for each lookup from address to compilation unit, a 1000x
> speedup. There is still a slow part left in mapping to symbol name,
> but this at least gets us halfway there.

This patch appears to introduce some new failures into the various testsuites.
For example with a toolchain targeted at x86_64-pc-linux-gnu I see:

   GAS REGRESSION: DWARF2 debugging information 2
   GAS REGRESSION: DWARF2 debugging information 2 with SHF_COMPRESS
   GAS REGRESSION: 64bit DWARF2 debugging information 2
   GAS REGRESSION: 64bit DWARF2 debugging information 2 with SHF_COMPRESS
   LD REGRESSION:  Dump pr21978.so
   BIN REGRESSION: build-id-debuglink (grepping for source file name is disassembly output)

These all appear to be due to lack of line number information.  For example
the gas tests are failing because they are expecting to see a line like this:

   ./dw2-compress-2.c:6

But instead they are being given:

   ./dw2-compress-2.c:?

There are similar explanations for the other failures too.

Please could you look into this ?

Cheers
   Nick


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-23 14:14 ` Nick Clifton
@ 2022-03-23 15:53   ` Steinar H. Gunderson
  2022-03-23 22:24   ` Steinar H. Gunderson
  1 sibling, 0 replies; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-23 15:53 UTC (permalink / raw)
  To: Nick Clifton; +Cc: binutils, sesse

On Wed, Mar 23, 2022 at 02:14:31PM +0000, Nick Clifton wrote:
> This patch appears to introduce some new failures into the various testsuites.

Thanks, I'll have a look to see what's going on.

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-23 14:14 ` Nick Clifton
  2022-03-23 15:53   ` Steinar H. Gunderson
@ 2022-03-23 22:24   ` Steinar H. Gunderson
  2022-03-24  5:22     ` Alan Modra
  1 sibling, 1 reply; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-23 22:24 UTC (permalink / raw)
  To: Nick Clifton; +Cc: binutils, sesse

On Wed, Mar 23, 2022 at 02:14:31PM +0000, Nick Clifton wrote:
> This patch appears to introduce some new failures into the various testsuites.

I found the issue; I had swapped the linenumber_ptr and
discriminator_ptr in the two calls to
comp_unit_find_nearest_narrowest_line(). Fixing that makes the test
suite pass.

But I noticed something else that's probably not good in the existing
code; the “found” variable leaks out of the loop from the last iteration
(only). So if you find a match without a line number of the
second-to-last compilation unit but not in the last, found = false on
return, but if you find a similar match in the last compilation unit,
found = true.

I suppose this isn't intentional, but what is the intention? Should
there be a “found = false;” before the test on *linenumber_ptr?

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-23 22:24   ` Steinar H. Gunderson
@ 2022-03-24  5:22     ` Alan Modra
  2022-03-24  8:01       ` Steinar H. Gunderson
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Modra @ 2022-03-24  5:22 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: Nick Clifton, binutils, sesse

On Wed, Mar 23, 2022 at 11:24:20PM +0100, Steinar H. Gunderson via Binutils wrote:
> But I noticed something else that's probably not good in the existing
> code; the “found” variable leaks out of the loop from the last iteration
> (only). So if you find a match without a line number of the
> second-to-last compilation unit but not in the last, found = false on
> return, but if you find a similar match in the last compilation unit,
> found = true.
> 
> I suppose this isn't intentional, but what is the intention? Should
> there be a “found = false;” before the test on *linenumber_ptr?

Huh, I remember looking at this code a while ago and finding it
confusing.  I think the code would be clearer, and behave the same on
normal line number info with the following patch:

	* dwarf2.c (_bfd_dwarf2_find_nearest_line): Simplify setting of
	"found" in loop checking previous comp units with
	comp_unit_contains_address.

diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c
index 8b5ac6012e1..ca8403da6da 100644
--- a/bfd/dwarf2.c
+++ b/bfd/dwarf2.c
@@ -5195,16 +5195,16 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
 
       for (each = stash->f.all_comp_units; each; each = each->next_unit)
 	{
-	  bfd_vma range = (bfd_vma) -1;
-
-	  found = ((each->arange.high == 0
-		    || comp_unit_contains_address (each, addr))
-		   && (range = (comp_unit_find_nearest_line
-				(each, addr, &local_filename,
-				 &local_function, &local_linenumber,
-				 &local_discriminator))) != 0);
-	  if (found)
+	  bfd_vma range;
+
+	  if ((each->arange.high == 0
+	       || comp_unit_contains_address (each, addr))
+	      && (range = (comp_unit_find_nearest_line
+			   (each, addr, &local_filename,
+			    &local_function, &local_linenumber,
+			    &local_discriminator))) != 0)
 	    {
+	      found = true;
 	      /* PRs 15935 15994: Bogus debug information may have provided us
 		 with an erroneous match.  We attempt to counter this by
 		 selecting the match that has the smallest address range
@@ -5231,11 +5231,8 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
 	    }
 	}
 
-      if (* linenumber_ptr)
-	{
-	  found = true;
-	  goto done;
-	}
+      if (found)
+	goto done;
     }
 
   /* Read each remaining comp. units checking each as they are read.  */

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-24  5:22     ` Alan Modra
@ 2022-03-24  8:01       ` Steinar H. Gunderson
  2022-03-24 23:30         ` Alan Modra
  0 siblings, 1 reply; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-24  8:01 UTC (permalink / raw)
  To: Alan Modra; +Cc: Nick Clifton, binutils, sesse

On Thu, Mar 24, 2022 at 03:52:27PM +1030, Alan Modra wrote:
> Huh, I remember looking at this code a while ago and finding it
> confusing.  I think the code would be clearer, and behave the same on
> normal line number info with the following patch:

An interesting question is: Do you want to keep searching through
compilation units once you've found a match with a line number?
Should we go straight to “goto done” then?

Currently, it seems that even if we find a match with line numbers, we
keep searching for a more specific match (range <= min_range) in
different compilation units... but only among compilation units we've
already parsed debug information for. If we have a line number by then,
we stop searching; we only parse new units if we're line-number-less
or have no match at all.

And if we find a match, no matter how bad, in those new units, we accept
it and stop searching, no matter whether we have line numbers or if it's
a more specific match or not. (Is that a correct interpretation?) Your
patch changes the exit-after-already-parsed to also include a match
_without_ line numbers. It's a change, but I don't know if it's
intended. My patch, as a side effect, introduces the more-specific logic
into the second loop, where it wasn't before.

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-24  8:01       ` Steinar H. Gunderson
@ 2022-03-24 23:30         ` Alan Modra
  2022-03-25  0:01           ` Steinar H. Gunderson
  2022-03-28 10:19           ` Jan Beulich
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Modra @ 2022-03-24 23:30 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: Nick Clifton, binutils, sesse

On Thu, Mar 24, 2022 at 09:01:38AM +0100, Steinar H. Gunderson wrote:
> On Thu, Mar 24, 2022 at 03:52:27PM +1030, Alan Modra wrote:
> > Huh, I remember looking at this code a while ago and finding it
> > confusing.  I think the code would be clearer, and behave the same on
> > normal line number info with the following patch:
> 
> An interesting question is: Do you want to keep searching through
> compilation units once you've found a match with a line number?
> Should we go straight to “goto done” then?

This would be reverting commit 240d6706c6a2.  In
https://sourceware.org/bugzilla/show_bug.cgi?id=15935#c3 I came to the
conclusion that the pr15935 testcase had bogus debug info and closed
the bug as invalid.  The reporter apparently opened another bug,
https://sourceware.org/bugzilla/show_bug.cgi?id=15994 a month later
that Nick fixed by making _bfd_dwarf2_find_nearest_line do extra work.
Which of course is unnecessary with good debug info, but in many cases
we try to make binutils give the best result even with bad input.  I
don't know the details beyond that.  It might have been that the
compiler producing the bad debug info was one supported by RedHat.

Now we have pr28592 and others complaining that objdump or addr2line
have significantly slowed.  Given that pr15935 dates back to 2013, I
would presume that people have moved on from whatever broken compiler
produced bad line info, and that we should indeed revert commit
240d6706c6a2.  Nick?

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-24 23:30         ` Alan Modra
@ 2022-03-25  0:01           ` Steinar H. Gunderson
  2022-03-28 10:19           ` Jan Beulich
  1 sibling, 0 replies; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-25  0:01 UTC (permalink / raw)
  To: Alan Modra; +Cc: Nick Clifton, binutils, sesse

On Fri, Mar 25, 2022 at 10:00:29AM +1030, Alan Modra wrote:
> This would be reverting commit 240d6706c6a2.  In
> https://sourceware.org/bugzilla/show_bug.cgi?id=15935#c3 I came to the
> conclusion that the pr15935 testcase had bogus debug info and closed
> the bug as invalid.  The reporter apparently opened another bug,
> https://sourceware.org/bugzilla/show_bug.cgi?id=15994 a month later
> that Nick fixed by making _bfd_dwarf2_find_nearest_line do extra work.
> Which of course is unnecessary with good debug info, but in many cases
> we try to make binutils give the best result even with bad input.  I
> don't know the details beyond that.  It might have been that the
> compiler producing the bad debug info was one supported by RedHat.

Thanks for going through the history here. Note that my patch changes
the equation somewhat; as long as the debug info has been parsed for the
CU (I understand that this is done somewhat incrementally?), the cost of
looking through all CUs instead of stopping at the first one is nearly zero.
Generally, we pay a cost proportional to the number of ranges that
overlap with the given 256-byte region which should be very pretty few
in a well-behaved binary, and not a lot even in a really bad one.
(Well, that's not strictly true; if the number of such ranges is fewer
than 8, we could be testing up to 8 ranges. But it's still cheap.)

But of course, if we want to keep parsing debug info from new CUs,
that has a definite cost. But that cost is per-CU, not per lookup,
so if you're looking up lots of addresses, it's “only” startup cost.
Whether this matters depends on the use case, of course.

In any case, I think the decision here should be more clearly
communicated in the source :-) The existing comment saying that we keep
looking is very useful, but the decision to _not_ keep looking into
not-yet-parsed CUs (in some cases only!) is less clearly articulated.

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-24 23:30         ` Alan Modra
  2022-03-25  0:01           ` Steinar H. Gunderson
@ 2022-03-28 10:19           ` Jan Beulich
  2022-03-28 23:47             ` Alan Modra
  1 sibling, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-03-28 10:19 UTC (permalink / raw)
  To: Alan Modra, Nick Clifton; +Cc: sesse, binutils, Steinar H. Gunderson

On 25.03.2022 00:30, Alan Modra via Binutils wrote:
> On Thu, Mar 24, 2022 at 09:01:38AM +0100, Steinar H. Gunderson wrote:
>> On Thu, Mar 24, 2022 at 03:52:27PM +1030, Alan Modra wrote:
>>> Huh, I remember looking at this code a while ago and finding it
>>> confusing.  I think the code would be clearer, and behave the same on
>>> normal line number info with the following patch:
>>
>> An interesting question is: Do you want to keep searching through
>> compilation units once you've found a match with a line number?
>> Should we go straight to “goto done” then?
> 
> This would be reverting commit 240d6706c6a2.  In
> https://sourceware.org/bugzilla/show_bug.cgi?id=15935#c3 I came to the
> conclusion that the pr15935 testcase had bogus debug info and closed
> the bug as invalid.  The reporter apparently opened another bug,
> https://sourceware.org/bugzilla/show_bug.cgi?id=15994 a month later
> that Nick fixed by making _bfd_dwarf2_find_nearest_line do extra work.
> Which of course is unnecessary with good debug info, but in many cases
> we try to make binutils give the best result even with bad input.  I
> don't know the details beyond that.  It might have been that the
> compiler producing the bad debug info was one supported by RedHat.
> 
> Now we have pr28592 and others complaining that objdump or addr2line
> have significantly slowed.  Given that pr15935 dates back to 2013, I
> would presume that people have moved on from whatever broken compiler
> produced bad line info, and that we should indeed revert commit
> 240d6706c6a2.  Nick?

Since I ended up working on that function as well, I did notice another
potentially relevant aspect: The adjustment done back at the time was
only for the case of already processed CUs. The subsequent loop reading
any remaining ones doesn't similarly attempt to find a better match.
IOW the effects of that change can only have been partial anyway.

Jan


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-28 10:19           ` Jan Beulich
@ 2022-03-28 23:47             ` Alan Modra
  2022-03-29  6:07               ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Modra @ 2022-03-28 23:47 UTC (permalink / raw)
  To: Nick Clifton; +Cc: Jan Beulich, sesse, binutils, Steinar H. Gunderson

On Mon, Mar 28, 2022 at 12:19:52PM +0200, Jan Beulich wrote:
> On 25.03.2022 00:30, Alan Modra via Binutils wrote:
> > On Thu, Mar 24, 2022 at 09:01:38AM +0100, Steinar H. Gunderson wrote:
> >> On Thu, Mar 24, 2022 at 03:52:27PM +1030, Alan Modra wrote:
> >>> Huh, I remember looking at this code a while ago and finding it
> >>> confusing.  I think the code would be clearer, and behave the same on
> >>> normal line number info with the following patch:
> >>
> >> An interesting question is: Do you want to keep searching through
> >> compilation units once you've found a match with a line number?
> >> Should we go straight to “goto done” then?
> > 
> > This would be reverting commit 240d6706c6a2.  In
> > https://sourceware.org/bugzilla/show_bug.cgi?id=15935#c3 I came to the
> > conclusion that the pr15935 testcase had bogus debug info and closed
> > the bug as invalid.  The reporter apparently opened another bug,
> > https://sourceware.org/bugzilla/show_bug.cgi?id=15994 a month later
> > that Nick fixed by making _bfd_dwarf2_find_nearest_line do extra work.
> > Which of course is unnecessary with good debug info, but in many cases
> > we try to make binutils give the best result even with bad input.  I
> > don't know the details beyond that.  It might have been that the
> > compiler producing the bad debug info was one supported by RedHat.
> > 
> > Now we have pr28592 and others complaining that objdump or addr2line
> > have significantly slowed.  Given that pr15935 dates back to 2013, I
> > would presume that people have moved on from whatever broken compiler
> > produced bad line info, and that we should indeed revert commit
> > 240d6706c6a2.  Nick?
> 
> Since I ended up working on that function as well, I did notice another
> potentially relevant aspect: The adjustment done back at the time was
> only for the case of already processed CUs. The subsequent loop reading
> any remaining ones doesn't similarly attempt to find a better match.
> IOW the effects of that change can only have been partial anyway.

Yes, Steinar noticed that too.  Another thing I noticed is that prior
to Nick's original patch looking for the best line range match, we
used to return a function name result for addresses that matched a
DW_TAG_subprogram (and similar) address ramge info.  See the
comp_unit_find_nearest_line change in the following, a patch to revert
commit 240d6706c6a2.

	PR 28592
	PR 15994
	PR 15935
	* dwarf2.c (lookup_address_in_line_info_table): Return bool rather
	than a range.
	(comp_unit_find_nearest_line): Likewise.  Return true if function
	info found without line info.
	(_bfd_dwarf2_find_nearest_line): Revert range handling code.

diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c
index 8b5ac6012e1..4beebcd2835 100644
--- a/bfd/dwarf2.c
+++ b/bfd/dwarf2.c
@@ -2543,13 +2543,12 @@ decode_line_info (struct comp_unit *unit)
   return NULL;
 }
 
-/* If ADDR is within TABLE set the output parameters and return the
-   range of addresses covered by the entry used to fill them out.
-   Otherwise set * FILENAME_PTR to NULL and return 0.
+/* If ADDR is within TABLE set the output parameters and return TRUE,
+   otherwise set *FILENAME_PTR to NULL and return FALSE.
    The parameters FILENAME_PTR, LINENUMBER_PTR and DISCRIMINATOR_PTR
    are pointers to the objects to be filled in.  */
 
-static bfd_vma
+static bool
 lookup_address_in_line_info_table (struct line_info_table *table,
 				   bfd_vma addr,
 				   const char **filename_ptr,
@@ -2608,12 +2607,12 @@ lookup_address_in_line_info_table (struct line_info_table *table,
       *linenumber_ptr = info->line;
       if (discriminator_ptr)
 	*discriminator_ptr = info->discriminator;
-      return seq->last_line->address - seq->low_pc;
+      return true;
     }
 
  fail:
   *filename_ptr = NULL;
-  return 0;
+  return false;
 }
 
 /* Read in the .debug_ranges section for future reference.  */
@@ -4008,14 +4007,11 @@ comp_unit_contains_address (struct comp_unit *unit, bfd_vma addr)
 }
 
 /* If UNIT contains ADDR, set the output parameters to the values for
-   the line containing ADDR.  The output parameters, FILENAME_PTR,
-   FUNCTION_PTR, and LINENUMBER_PTR, are pointers to the objects
-   to be filled in.
-
-   Returns the range of addresses covered by the entry that was used
-   to fill in *LINENUMBER_PTR or 0 if it was not filled in.  */
+   the line containing ADDR and return TRUE.  Otherwise return FALSE.
+   The output parameters, FILENAME_PTR, FUNCTION_PTR, and
+   LINENUMBER_PTR, are pointers to the objects to be filled in.  */
 
-static bfd_vma
+static bool
 comp_unit_find_nearest_line (struct comp_unit *unit,
 			     bfd_vma addr,
 			     const char **filename_ptr,
@@ -4023,7 +4019,7 @@ comp_unit_find_nearest_line (struct comp_unit *unit,
 			     unsigned int *linenumber_ptr,
 			     unsigned int *discriminator_ptr)
 {
-  bool func_p;
+  bool line_p, func_p;
 
   if (!comp_unit_maybe_decode_line_info (unit))
     return false;
@@ -4033,10 +4029,11 @@ comp_unit_find_nearest_line (struct comp_unit *unit,
   if (func_p && (*function_ptr)->tag == DW_TAG_inlined_subroutine)
     unit->stash->inliner_chain = *function_ptr;
 
-  return lookup_address_in_line_info_table (unit->line_table, addr,
-					    filename_ptr,
-					    linenumber_ptr,
-					    discriminator_ptr);
+  line_p = lookup_address_in_line_info_table (unit->line_table, addr,
+					      filename_ptr,
+					      linenumber_ptr,
+					      discriminator_ptr);
+  return line_p || func_p;
 }
 
 /* Check to see if line info is already decoded in a comp_unit.
@@ -5187,54 +5184,17 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
     }
   else
     {
-      bfd_vma min_range = (bfd_vma) -1;
-      const char * local_filename = NULL;
-      struct funcinfo *local_function = NULL;
-      unsigned int local_linenumber = 0;
-      unsigned int local_discriminator = 0;
-
       for (each = stash->f.all_comp_units; each; each = each->next_unit)
 	{
-	  bfd_vma range = (bfd_vma) -1;
-
 	  found = ((each->arange.high == 0
 		    || comp_unit_contains_address (each, addr))
-		   && (range = (comp_unit_find_nearest_line
-				(each, addr, &local_filename,
-				 &local_function, &local_linenumber,
-				 &local_discriminator))) != 0);
+		   && comp_unit_find_nearest_line (each, addr,
+						   filename_ptr,
+						   &function,
+						   linenumber_ptr,
+						   discriminator_ptr));
 	  if (found)
-	    {
-	      /* PRs 15935 15994: Bogus debug information may have provided us
-		 with an erroneous match.  We attempt to counter this by
-		 selecting the match that has the smallest address range
-		 associated with it.  (We are assuming that corrupt debug info
-		 will tend to result in extra large address ranges rather than
-		 extra small ranges).
-
-		 This does mean that we scan through all of the CUs associated
-		 with the bfd each time this function is called.  But this does
-		 have the benefit of producing consistent results every time the
-		 function is called.  */
-	      if (range <= min_range)
-		{
-		  if (filename_ptr && local_filename)
-		    * filename_ptr = local_filename;
-		  if (local_function)
-		    function = local_function;
-		  if (discriminator_ptr && local_discriminator)
-		    * discriminator_ptr = local_discriminator;
-		  if (local_linenumber)
-		    * linenumber_ptr = local_linenumber;
-		  min_range = range;
-		}
-	    }
-	}
-
-      if (* linenumber_ptr)
-	{
-	  found = true;
-	  goto done;
+	    goto done;
 	}
     }
 
@@ -5259,7 +5219,7 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
 						 filename_ptr,
 						 &function,
 						 linenumber_ptr,
-						 discriminator_ptr) != 0);
+						 discriminator_ptr));
 
       if (found)
 	break;

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-28 23:47             ` Alan Modra
@ 2022-03-29  6:07               ` Jan Beulich
  2022-03-31  6:21                 ` Steinar H. Gunderson
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-03-29  6:07 UTC (permalink / raw)
  To: Alan Modra; +Cc: sesse, binutils, Steinar H. Gunderson, Nick Clifton

On 29.03.2022 01:47, Alan Modra wrote:
> On Mon, Mar 28, 2022 at 12:19:52PM +0200, Jan Beulich wrote:
>> On 25.03.2022 00:30, Alan Modra via Binutils wrote:
>>> On Thu, Mar 24, 2022 at 09:01:38AM +0100, Steinar H. Gunderson wrote:
>>>> On Thu, Mar 24, 2022 at 03:52:27PM +1030, Alan Modra wrote:
>>>>> Huh, I remember looking at this code a while ago and finding it
>>>>> confusing.  I think the code would be clearer, and behave the same on
>>>>> normal line number info with the following patch:
>>>>
>>>> An interesting question is: Do you want to keep searching through
>>>> compilation units once you've found a match with a line number?
>>>> Should we go straight to “goto done” then?
>>>
>>> This would be reverting commit 240d6706c6a2.  In
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=15935#c3 I came to the
>>> conclusion that the pr15935 testcase had bogus debug info and closed
>>> the bug as invalid.  The reporter apparently opened another bug,
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=15994 a month later
>>> that Nick fixed by making _bfd_dwarf2_find_nearest_line do extra work.
>>> Which of course is unnecessary with good debug info, but in many cases
>>> we try to make binutils give the best result even with bad input.  I
>>> don't know the details beyond that.  It might have been that the
>>> compiler producing the bad debug info was one supported by RedHat.
>>>
>>> Now we have pr28592 and others complaining that objdump or addr2line
>>> have significantly slowed.  Given that pr15935 dates back to 2013, I
>>> would presume that people have moved on from whatever broken compiler
>>> produced bad line info, and that we should indeed revert commit
>>> 240d6706c6a2.  Nick?
>>
>> Since I ended up working on that function as well, I did notice another
>> potentially relevant aspect: The adjustment done back at the time was
>> only for the case of already processed CUs. The subsequent loop reading
>> any remaining ones doesn't similarly attempt to find a better match.
>> IOW the effects of that change can only have been partial anyway.
> 
> Yes, Steinar noticed that too.  Another thing I noticed is that prior
> to Nick's original patch looking for the best line range match, we
> used to return a function name result for addresses that matched a
> DW_TAG_subprogram (and similar) address ramge info.  See the
> comp_unit_find_nearest_line change in the following, a patch to revert
> commit 240d6706c6a2.

Yeah, makes sense to me. I guess if bad input was to be worked around
again, a testcase covering the specific situation would want adding, so
it won't be lost what specifically is to be worked around.

Jan

> 	PR 28592
> 	PR 15994
> 	PR 15935
> 	* dwarf2.c (lookup_address_in_line_info_table): Return bool rather
> 	than a range.
> 	(comp_unit_find_nearest_line): Likewise.  Return true if function
> 	info found without line info.
> 	(_bfd_dwarf2_find_nearest_line): Revert range handling code.
> 
> diff --git a/bfd/dwarf2.c b/bfd/dwarf2.c
> index 8b5ac6012e1..4beebcd2835 100644
> --- a/bfd/dwarf2.c
> +++ b/bfd/dwarf2.c
> @@ -2543,13 +2543,12 @@ decode_line_info (struct comp_unit *unit)
>    return NULL;
>  }
>  
> -/* If ADDR is within TABLE set the output parameters and return the
> -   range of addresses covered by the entry used to fill them out.
> -   Otherwise set * FILENAME_PTR to NULL and return 0.
> +/* If ADDR is within TABLE set the output parameters and return TRUE,
> +   otherwise set *FILENAME_PTR to NULL and return FALSE.
>     The parameters FILENAME_PTR, LINENUMBER_PTR and DISCRIMINATOR_PTR
>     are pointers to the objects to be filled in.  */
>  
> -static bfd_vma
> +static bool
>  lookup_address_in_line_info_table (struct line_info_table *table,
>  				   bfd_vma addr,
>  				   const char **filename_ptr,
> @@ -2608,12 +2607,12 @@ lookup_address_in_line_info_table (struct line_info_table *table,
>        *linenumber_ptr = info->line;
>        if (discriminator_ptr)
>  	*discriminator_ptr = info->discriminator;
> -      return seq->last_line->address - seq->low_pc;
> +      return true;
>      }
>  
>   fail:
>    *filename_ptr = NULL;
> -  return 0;
> +  return false;
>  }
>  
>  /* Read in the .debug_ranges section for future reference.  */
> @@ -4008,14 +4007,11 @@ comp_unit_contains_address (struct comp_unit *unit, bfd_vma addr)
>  }
>  
>  /* If UNIT contains ADDR, set the output parameters to the values for
> -   the line containing ADDR.  The output parameters, FILENAME_PTR,
> -   FUNCTION_PTR, and LINENUMBER_PTR, are pointers to the objects
> -   to be filled in.
> -
> -   Returns the range of addresses covered by the entry that was used
> -   to fill in *LINENUMBER_PTR or 0 if it was not filled in.  */
> +   the line containing ADDR and return TRUE.  Otherwise return FALSE.
> +   The output parameters, FILENAME_PTR, FUNCTION_PTR, and
> +   LINENUMBER_PTR, are pointers to the objects to be filled in.  */
>  
> -static bfd_vma
> +static bool
>  comp_unit_find_nearest_line (struct comp_unit *unit,
>  			     bfd_vma addr,
>  			     const char **filename_ptr,
> @@ -4023,7 +4019,7 @@ comp_unit_find_nearest_line (struct comp_unit *unit,
>  			     unsigned int *linenumber_ptr,
>  			     unsigned int *discriminator_ptr)
>  {
> -  bool func_p;
> +  bool line_p, func_p;
>  
>    if (!comp_unit_maybe_decode_line_info (unit))
>      return false;
> @@ -4033,10 +4029,11 @@ comp_unit_find_nearest_line (struct comp_unit *unit,
>    if (func_p && (*function_ptr)->tag == DW_TAG_inlined_subroutine)
>      unit->stash->inliner_chain = *function_ptr;
>  
> -  return lookup_address_in_line_info_table (unit->line_table, addr,
> -					    filename_ptr,
> -					    linenumber_ptr,
> -					    discriminator_ptr);
> +  line_p = lookup_address_in_line_info_table (unit->line_table, addr,
> +					      filename_ptr,
> +					      linenumber_ptr,
> +					      discriminator_ptr);
> +  return line_p || func_p;
>  }
>  
>  /* Check to see if line info is already decoded in a comp_unit.
> @@ -5187,54 +5184,17 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
>      }
>    else
>      {
> -      bfd_vma min_range = (bfd_vma) -1;
> -      const char * local_filename = NULL;
> -      struct funcinfo *local_function = NULL;
> -      unsigned int local_linenumber = 0;
> -      unsigned int local_discriminator = 0;
> -
>        for (each = stash->f.all_comp_units; each; each = each->next_unit)
>  	{
> -	  bfd_vma range = (bfd_vma) -1;
> -
>  	  found = ((each->arange.high == 0
>  		    || comp_unit_contains_address (each, addr))
> -		   && (range = (comp_unit_find_nearest_line
> -				(each, addr, &local_filename,
> -				 &local_function, &local_linenumber,
> -				 &local_discriminator))) != 0);
> +		   && comp_unit_find_nearest_line (each, addr,
> +						   filename_ptr,
> +						   &function,
> +						   linenumber_ptr,
> +						   discriminator_ptr));
>  	  if (found)
> -	    {
> -	      /* PRs 15935 15994: Bogus debug information may have provided us
> -		 with an erroneous match.  We attempt to counter this by
> -		 selecting the match that has the smallest address range
> -		 associated with it.  (We are assuming that corrupt debug info
> -		 will tend to result in extra large address ranges rather than
> -		 extra small ranges).
> -
> -		 This does mean that we scan through all of the CUs associated
> -		 with the bfd each time this function is called.  But this does
> -		 have the benefit of producing consistent results every time the
> -		 function is called.  */
> -	      if (range <= min_range)
> -		{
> -		  if (filename_ptr && local_filename)
> -		    * filename_ptr = local_filename;
> -		  if (local_function)
> -		    function = local_function;
> -		  if (discriminator_ptr && local_discriminator)
> -		    * discriminator_ptr = local_discriminator;
> -		  if (local_linenumber)
> -		    * linenumber_ptr = local_linenumber;
> -		  min_range = range;
> -		}
> -	    }
> -	}
> -
> -      if (* linenumber_ptr)
> -	{
> -	  found = true;
> -	  goto done;
> +	    goto done;
>  	}
>      }
>  
> @@ -5259,7 +5219,7 @@ _bfd_dwarf2_find_nearest_line (bfd *abfd,
>  						 filename_ptr,
>  						 &function,
>  						 linenumber_ptr,
> -						 discriminator_ptr) != 0);
> +						 discriminator_ptr));
>  
>        if (found)
>  	break;
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-29  6:07               ` Jan Beulich
@ 2022-03-31  6:21                 ` Steinar H. Gunderson
  2022-04-03 11:39                   ` Alan Modra
  0 siblings, 1 reply; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-03-31  6:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Alan Modra, sesse, binutils, Nick Clifton

On Tue, Mar 29, 2022 at 08:07:07AM +0200, Jan Beulich wrote:
> Yeah, makes sense to me. I guess if bad input was to be worked around
> again, a testcase covering the specific situation would want adding, so
> it won't be lost what specifically is to be worked around.

I'm fine with just removing the range handling logic if that's what you
decide. :-) Let me know when/if a patch goes in, and I'll send a v3 that
is adjusted.

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-03-31  6:21                 ` Steinar H. Gunderson
@ 2022-04-03 11:39                   ` Alan Modra
  2022-04-04  7:29                     ` Steinar H. Gunderson
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Modra @ 2022-04-03 11:39 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: Jan Beulich, sesse, binutils, Nick Clifton

On Thu, Mar 31, 2022 at 08:21:48AM +0200, Steinar H. Gunderson wrote:
> On Tue, Mar 29, 2022 at 08:07:07AM +0200, Jan Beulich wrote:
> > Yeah, makes sense to me. I guess if bad input was to be worked around
> > again, a testcase covering the specific situation would want adding, so
> > it won't be lost what specifically is to be worked around.
> 
> I'm fine with just removing the range handling logic if that's what you
> decide. :-) Let me know when/if a patch goes in, and I'll send a v3 that
> is adjusted.

Let's do that.  I'll commit the patch I've already posted.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Add a trie to map quickly from address range to compilation unit.
  2022-04-03 11:39                   ` Alan Modra
@ 2022-04-04  7:29                     ` Steinar H. Gunderson
  0 siblings, 0 replies; 14+ messages in thread
From: Steinar H. Gunderson @ 2022-04-04  7:29 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jan Beulich, sesse, binutils, Nick Clifton

On Sun, Apr 03, 2022 at 09:09:58PM +0930, Alan Modra wrote:
>> I'm fine with just removing the range handling logic if that's what you
>> decide. :-) Let me know when/if a patch goes in, and I'll send a v3 that
>> is adjusted.
> Let's do that.  I'll commit the patch I've already posted.

I've rebased my patch; sending shortly. (It's less tested than the
previous versions, but should be simpler.) I'm not entirely sure whether
I still need the unit->mark part, though; would it perhaps now always
find something on the first unit that matches from the trie?

/* Steinar */

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-04-04  7:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21  9:40 [PATCH] Add a trie to map quickly from address range to compilation unit Steinar H. Gunderson
2022-03-23 14:14 ` Nick Clifton
2022-03-23 15:53   ` Steinar H. Gunderson
2022-03-23 22:24   ` Steinar H. Gunderson
2022-03-24  5:22     ` Alan Modra
2022-03-24  8:01       ` Steinar H. Gunderson
2022-03-24 23:30         ` Alan Modra
2022-03-25  0:01           ` Steinar H. Gunderson
2022-03-28 10:19           ` Jan Beulich
2022-03-28 23:47             ` Alan Modra
2022-03-29  6:07               ` Jan Beulich
2022-03-31  6:21                 ` Steinar H. Gunderson
2022-04-03 11:39                   ` Alan Modra
2022-04-04  7:29                     ` Steinar H. Gunderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).