public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Dodji Seketeli <dodji@redhat.com>
To: gprocida@google.com
Cc: libabigail@sourceware.org
Subject: [PATCH 1/2 / RFC] Bug 26646 - unexpected declaration-only types (part 2)
Date: Mon, 28 Feb 2022 10:34:10 +0100	[thread overview]
Message-ID: <87wnhfjpwt.fsf@redhat.com> (raw)

Hello Giuliano,

This patch addresses the comment
https://sourceware.org/bugzilla/show_bug.cgi?id=26646#c15 of the
reported problem.

The core of the problem seems to be that when compare_dies compares
two DW_TAG_{structure,union}_type, it "gives up" when the comparison
stack contains more than 5 such DIEs being compared.  Giving up means
that it considers the two DIEs to be equivalent if they have the same
size and kind, basically.  This is to avoid infinite loops or too long
loops that we've seen with artificial DWARF input generated by
fuzzers.

This patch fixes that by better using the "aggregates_being_compared"
data structure that is meant to prevent looping infinitely over
aggregate DIEs that might be recursively defined.  That data structure
now contains the DWARF offset pairs of the DIEs being compared, rather
than their "string representation".  Lookups in that data structure
should now be faster and more precise.

The artificial limit of the comparison stack not having more than 5
DIEs is now lifted.

	* src/abg-dwarf-reader.cc (struct dwarf_offset_pair_hash)
	(dwarf_offset_pair_set_type): Define new type.
	(die_offset, has_offset_pair, insert_offset_pair)
	(erase_offset_pair): Define new static helper functions.
	(compare_dies): Use a set of DWARF offsets for the
	'aggregates_being_compared' data structure, rather than a set of
	string representation of the DIEs.  Always look at the size of the
	types being compared first so that we can get out quickly if they
	differ.  For DIEs of DW_TAG_{union,struct}_type kind, don't limit
	the depth of the stack of DIEs being compared to 5; so we don't
	consider two types as being equal just because the depth of the
	stack being compared is 5 and the two types have the same size and
	are equal.  Hopefully things don't take too long.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 src/abg-dwarf-reader.cc | 138 +++++++++++++++++++++++++++++-----------
 1 file changed, 101 insertions(+), 37 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 53b5845d..05a7e4d6 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -161,7 +161,18 @@ typedef unordered_map<GElf_Addr, elf_symbol_sptr> addr_elf_symbol_sptr_map_type;
 /// Convenience typedef for a set of ELF addresses.
 typedef unordered_set<GElf_Addr> address_set_type;
 
-typedef unordered_set<interned_string, hash_interned_string> istring_set_type;
+/// A hasher for a pair of Dwarf_Off.  This is used as a hasher for
+/// the type @ref dwarf_offset_pair_set_type.
+struct dwarf_offset_pair_hash
+{
+  size_t
+  operator()(const std::pair<Dwarf_Off, Dwarf_Off>& p) const
+  {return abigail::hashing::combine_hashes(p.first, p.second);}
+};// end struct dwarf_offset_pair_hash
+
+typedef unordered_set<std::pair<Dwarf_Off,
+				Dwarf_Off>,
+		      dwarf_offset_pair_hash> dwarf_offset_pair_set_type;
 
 /// Convenience typedef for a shared pointer to an @ref address_set_type.
 typedef shared_ptr<address_set_type> address_set_sptr;
@@ -297,6 +308,12 @@ get_scope_die(const read_context&	ctxt,
 	      size_t			where_offset,
 	      Dwarf_Die&		scope_die);
 
+static Dwarf_Off
+die_offset(Dwarf_Die* die);
+
+static Dwarf_Off
+die_offset(const Dwarf_Die* die);
+
 static bool
 die_is_anonymous(const Dwarf_Die* die);
 
@@ -6230,6 +6247,24 @@ void
 set_do_log(read_context& ctxt, bool f)
 {ctxt.do_log(f);}
 
+/// Get the offset of a DIE
+///
+/// @param die the DIE to consider.
+///
+/// @return the offset of the DIE.
+static Dwarf_Off
+die_offset(Dwarf_Die* die)
+{return dwarf_dieoffset(die);}
+
+/// Get the offset of a DIE
+///
+/// @param die the DIE to consider.
+///
+/// @return the offset of the DIE.
+static Dwarf_Off
+die_offset(const Dwarf_Die* die)
+{return die_offset(const_cast<Dwarf_Die*>(die));}
+
 /// Test if a given DIE is anonymous
 ///
 /// @param die the DIE to consider.
@@ -10097,6 +10132,54 @@ fn_die_equal_by_linkage_name(const read_context &ctxt,
 	  && llinkage_name == rlinkage_name);
 }
 
+/// Test if the unordered pair of offset {p1,p2} is present in a set.
+///
+/// @param set the set of pairs of DWARF offsets to consider.
+///
+/// @param p1 the first value of the pair.
+///
+/// @param p2 the second value of the pair.
+///
+/// @return if the pair {p1,p2} or {p2,p1} is present in the set.
+static bool
+has_offset_pair(const dwarf_offset_pair_set_type& set,
+		Dwarf_Off p1, Dwarf_Off p2)
+{
+  std::pair<Dwarf_Off, Dwarf_Off> p(p1, p2), o(p2, p1);
+  if ((set.find(p) != set.end())
+      || (set.find(o) != set.end()))
+    return true;
+  return false;
+}
+
+/// Insert a new pair of offset into the set of pair.
+///
+/// @param set the set of pairs of DWARF offsets to consider.
+///
+/// @param p1 the first value of the pair.
+///
+/// @param p2 the second value of the pair.
+static void
+insert_offset_pair(dwarf_offset_pair_set_type& set, Dwarf_Off p1, Dwarf_Off p2)
+{set.insert(std::pair<Dwarf_Off, Dwarf_Off>(p1, p2));}
+
+/// Erase a pair of DWARF offset from a set of pairs.
+///
+/// The pair erased is non ordered.
+///
+/// @param set the set of pairs of DWARF offsets to consider.
+///
+/// @param p1 the first value of the pair.
+///
+/// @param p2 the second value of the pair.
+static void
+erase_offset_pair(dwarf_offset_pair_set_type& set, Dwarf_Off p1, Dwarf_Off p2)
+{
+  std::pair<Dwarf_Off, Dwarf_Off> p(p1, p2), o(p2, 1);
+  if (!set.erase(p))
+    set.erase(o);
+}
+
 /// Compare two DIEs emitted by a C compiler.
 ///
 /// @param ctxt the read context used to load the DWARF information.
@@ -10123,7 +10206,7 @@ fn_die_equal_by_linkage_name(const read_context &ctxt,
 static bool
 compare_dies(const read_context& ctxt,
 	     const Dwarf_Die *l, const Dwarf_Die *r,
-	     istring_set_type& aggregates_being_compared,
+	     dwarf_offset_pair_set_type& aggregates_being_compared,
 	     bool update_canonical_dies_on_the_fly)
 {
   ABG_ASSERT(l);
@@ -10268,23 +10351,15 @@ compare_dies(const read_context& ctxt,
     case DW_TAG_structure_type:
     case DW_TAG_union_type:
       {
-	interned_string ln = ctxt.get_die_pretty_type_representation(l, 0);
-	interned_string rn = ctxt.get_die_pretty_type_representation(r, 0);
-
-	if ((aggregates_being_compared.find(ln)
-	     != aggregates_being_compared.end())
-	    || (aggregates_being_compared.find(rn)
-		!= aggregates_being_compared.end()))
+	if (has_offset_pair(aggregates_being_compared,
+			    die_offset(l), die_offset(r)))
 	  result = true;
-	else if (!compare_as_decl_dies(l, r))
-	  result = false;
-	else if (!compare_as_type_dies(l, r))
+	else if (!compare_as_decl_dies(l, r) || !compare_as_type_dies(l, r))
 	  result = false;
 	else
 	  {
-	    aggregates_being_compared.insert(ln);
-	    aggregates_being_compared.insert(rn);
-
+	    insert_offset_pair(aggregates_being_compared,
+			       die_offset(l), die_offset(r));
 	    Dwarf_Die l_member, r_member;
 	    bool found_l_member, found_r_member;
 	    for (found_l_member = dwarf_child(const_cast<Dwarf_Die*>(l),
@@ -10316,8 +10391,8 @@ compare_dies(const read_context& ctxt,
 	    if (found_l_member != found_r_member)
 	      result = false;
 
-	    aggregates_being_compared.erase(ln);
-	    aggregates_being_compared.erase(rn);
+	    erase_offset_pair(aggregates_being_compared,
+			      die_offset(l), die_offset(r));
 	  }
       }
       break;
@@ -10402,10 +10477,8 @@ compare_dies(const read_context& ctxt,
 	interned_string ln = ctxt.get_die_pretty_type_representation(l, 0);
 	interned_string rn = ctxt.get_die_pretty_type_representation(r, 0);
 
-	if ((aggregates_being_compared.find(ln)
-	     != aggregates_being_compared.end())
-	    || (aggregates_being_compared.find(rn)
-		!= aggregates_being_compared.end()))
+	if (has_offset_pair(aggregates_being_compared, die_offset(l),
+			    die_offset(r)))
 	  {
 	    result = true;
 	    break;
@@ -10498,8 +10571,8 @@ compare_dies(const read_context& ctxt,
 	      }
 	  }
 
-	aggregates_being_compared.erase(ln);
-	aggregates_being_compared.erase(rn);
+	erase_offset_pair(aggregates_being_compared,
+			  die_offset(l), die_offset(r));
       }
       break;
 
@@ -10535,19 +10608,10 @@ compare_dies(const read_context& ctxt,
 	      Dwarf_Die l_type, r_type;
 	      ABG_ASSERT(die_die_attribute(l, DW_AT_type, l_type));
 	      ABG_ASSERT(die_die_attribute(r, DW_AT_type, r_type));
-	      if (aggregates_being_compared.size () < 5)
-		{
-		  if (!compare_dies(ctxt, &l_type, &r_type,
-				    aggregates_being_compared,
-				    update_canonical_dies_on_the_fly))
-		    result = false;
-		}
-	      else
-		{
-		  if (!compare_as_type_dies(&l_type, &r_type)
-		      ||!compare_as_decl_dies(&l_type, &r_type))
-		    return false;
-		}
+	      if (!compare_dies(ctxt, &l_type, &r_type,
+				aggregates_being_compared,
+				update_canonical_dies_on_the_fly))
+		result = false;
 	    }
 	}
       else
@@ -10661,7 +10725,7 @@ compare_dies(const read_context& ctxt,
 	     const Dwarf_Die *r,
 	     bool update_canonical_dies_on_the_fly)
 {
-  istring_set_type aggregates_being_compared;
+  dwarf_offset_pair_set_type aggregates_being_compared;
   return compare_dies(ctxt, l, r, aggregates_being_compared,
 		      update_canonical_dies_on_the_fly);
 }
-- 
2.35.0.rc2


-- 
		Dodji


             reply	other threads:[~2022-02-28  9:34 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-28  9:34 Dodji Seketeli [this message]
2022-03-01 17:45 ` Dodji Seketeli
2022-03-02  8:58   ` Giuliano Procida
2022-03-02 16:31     ` Dodji Seketeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wnhfjpwt.fsf@redhat.com \
    --to=dodji@redhat.com \
    --cc=gprocida@google.com \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).