public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Giuliano Procida <gprocida@google.com>
To: Dodji Seketeli <dodji@seketeli.org>
Cc: Dodji Seketeli via Libabigail <libabigail@sourceware.org>,
	Dodji Seketeli <dodji@redhat.com>
Subject: Re: [PATCH 1/2 / RFC] Bug 26646 - unexpected declaration-only types (part 2)
Date: Wed, 2 Mar 2022 08:58:05 +0000	[thread overview]
Message-ID: <CAGvU0HkOmdg7RO7a53QyGVrZznsiircTJcdcBEaubN5q-7hoNg@mail.gmail.com> (raw)
In-Reply-To: <87zgm9in1k.fsf@seketeli.org>

Answering on phone.

This looks fine. You might be able to shorten things a bit with make_pair
but that's just code style.

Giuliano.

On Tue, 1 Mar 2022, 17:46 Dodji Seketeli, <dodji@seketeli.org> wrote:

> Hey Giuliano,
>
> Thank you for reviewing the patch!
>
> Dodji Seketeli via Libabigail <libabigail@sourceware.org> a écrit:
>
> [...]
>
> >       * src/abg-dwarf-reader.cc (struct dwarf_offset_pair_hash)
> >       (dwarf_offset_pair_set_type): Define new type.
> >       (die_offset, has_offset_pair, insert_offset_pair)
> >       (erase_offset_pair): Define new static helper functions.
> >       (compare_dies): Use a set of DWARF offsets for the
> >       'aggregates_being_compared' data structure, rather than a set of
> >       string representation of the DIEs.  Always look at the size of the
> >       types being compared first so that we can get out quickly if they
> >       differ.  For DIEs of DW_TAG_{union,struct}_type kind, don't limit
> >       the depth of the stack of DIEs being compared to 5; so we don't
> >       consider two types as being equal just because the depth of the
> >       stack being compared is 5 and the two types have the same size and
> >       are equal.  Hopefully things don't take too long.
> >
>
> OK, you actually posted your comment as a bugzilla comment which is
> fine.  As I'd like the review to be found by people reading the patch
> post as well, I am replying it here.
>
> [...]
>
> gprocida at google dot com via Libabigail <libabigail@sourceware.org> a
> écrit:
>
> > Two (overlapping) things:
> >
> > 1. I wouldn't bother with inserting / testing / erasing the reverse
> > pair o(p2, p1)
> >
> > It might cost more than it ever saves. But it would take
> > instrumentation to see if it ever makes a difference.
> >
> > If you make this change then s/unordered pair/ordered pair/ in the doc
> > comments.
>
> OK, I am going with your inclination here.  I updated the patch
> accordingly.
>
> > 2. There is a typo.
> >
> > One of the reverse pairs is written as o(p2, 1).
>
> Good catch. Thanks.  Updated.
>
> > Otherwise, it looks good to me.
>
> Thanks.  I am posting an updated version of the patch below.
>
> From 3b67a367ce6b8a141f0cb29a130c002ca7b984eb Mon Sep 17 00:00:00 2001
> From: Dodji Seketeli <dodji@redhat.com>
> Date: Thu, 10 Feb 2022 17:43:38 +0100
> Subject: [PATCH 1/2] Bug 26646 - unexpected declaration-only types (part 2)
>
> This patch addresses the comment
> https://sourceware.org/bugzilla/show_bug.cgi?id=26646#c15 of the
> reported problem.
>
> The core of the problem seems to be that when compare_dies compares
> two DW_TAG_{structure,union}_type, it "gives up" when the comparison
> stack contains more than 5 such DIEs being compared.  Giving up means
> that it considers the two DIEs to be equivalent if they have the same
> size and kind, basically.  This is to avoid infinite loops or too long
> loops that we've seen with artificial DWARF input generated by
> fuzzers.
>
> This patch fixes that by better using the "aggregates_being_compared"
> data structure that is meant to prevent looping infinitely over
> aggregate DIEs that might be recursively defined.  That data structure
> now contains the DWARF offset pairs of the DIEs being compared, rather
> than their "string representation".  Lookups in that data structure
> should now be faster and more precise.
>
> The artificial limit of the comparison stack not having more than 5
> DIEs is now lifted.
>
>         * src/abg-dwarf-reader.cc (struct dwarf_offset_pair_hash)
>         (dwarf_offset_pair_set_type): Define new type.
>         (die_offset, has_offset_pair, insert_offset_pair)
>         (erase_offset_pair): Define new static helper functions.
>         (compare_dies): Use a set of DWARF offsets for the
>         'aggregates_being_compared' data structure, rather than a set of
>         string representation of the DIEs.  Always look at the size of the
>         types being compared first so that we can get out quickly if they
>         differ.  For DIEs of DW_TAG_{union,struct}_type kind, don't limit
>         the depth of the stack of DIEs being compared to 5; so we don't
>         consider two types as being equal just because the depth of the
>         stack being compared is 5 and the two types have the same size and
>         are equal.  Hopefully things don't take too long.
>
> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
> ---
>  src/abg-dwarf-reader.cc | 135 +++++++++++++++++++++++++++++-----------
>  1 file changed, 98 insertions(+), 37 deletions(-)
>
> diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
> index 53b5845d..18f49037 100644
> --- a/src/abg-dwarf-reader.cc
> +++ b/src/abg-dwarf-reader.cc
> @@ -161,7 +161,18 @@ typedef unordered_map<GElf_Addr, elf_symbol_sptr>
> addr_elf_symbol_sptr_map_type;
>  /// Convenience typedef for a set of ELF addresses.
>  typedef unordered_set<GElf_Addr> address_set_type;
>
> -typedef unordered_set<interned_string, hash_interned_string>
> istring_set_type;
> +/// A hasher for a pair of Dwarf_Off.  This is used as a hasher for
> +/// the type @ref dwarf_offset_pair_set_type.
> +struct dwarf_offset_pair_hash
> +{
> +  size_t
> +  operator()(const std::pair<Dwarf_Off, Dwarf_Off>& p) const
> +  {return abigail::hashing::combine_hashes(p.first, p.second);}
> +};// end struct dwarf_offset_pair_hash
> +
> +typedef unordered_set<std::pair<Dwarf_Off,
> +                               Dwarf_Off>,
> +                     dwarf_offset_pair_hash> dwarf_offset_pair_set_type;
>
>  /// Convenience typedef for a shared pointer to an @ref address_set_type.
>  typedef shared_ptr<address_set_type> address_set_sptr;
> @@ -297,6 +308,12 @@ get_scope_die(const read_context&  ctxt,
>               size_t                    where_offset,
>               Dwarf_Die&                scope_die);
>
> +static Dwarf_Off
> +die_offset(Dwarf_Die* die);
> +
> +static Dwarf_Off
> +die_offset(const Dwarf_Die* die);
> +
>  static bool
>  die_is_anonymous(const Dwarf_Die* die);
>
> @@ -6230,6 +6247,24 @@ void
>  set_do_log(read_context& ctxt, bool f)
>  {ctxt.do_log(f);}
>
> +/// Get the offset of a DIE
> +///
> +/// @param die the DIE to consider.
> +///
> +/// @return the offset of the DIE.
> +static Dwarf_Off
> +die_offset(Dwarf_Die* die)
> +{return dwarf_dieoffset(die);}
> +
> +/// Get the offset of a DIE
> +///
> +/// @param die the DIE to consider.
> +///
> +/// @return the offset of the DIE.
> +static Dwarf_Off
> +die_offset(const Dwarf_Die* die)
> +{return die_offset(const_cast<Dwarf_Die*>(die));}
> +
>  /// Test if a given DIE is anonymous
>  ///
>  /// @param die the DIE to consider.
> @@ -10097,6 +10132,51 @@ fn_die_equal_by_linkage_name(const read_context
> &ctxt,
>           && llinkage_name == rlinkage_name);
>  }
>
> +/// Test if the pair of offset {p1,p2} is present in a set.
> +///
> +/// @param set the set of pairs of DWARF offsets to consider.
> +///
> +/// @param p1 the first value of the pair.
> +///
> +/// @param p2 the second value of the pair.
> +///
> +/// @return if the pair {p1,p2} is present in the set.
> +static bool
> +has_offset_pair(const dwarf_offset_pair_set_type& set,
> +               Dwarf_Off p1, Dwarf_Off p2)
> +{
> +  std::pair<Dwarf_Off, Dwarf_Off> p(p1, p2);
> +  if (set.find(p) != set.end())
> +    return true;
> +  return false;
> +}
> +
> +/// Insert a new pair of offset into the set of pair.
> +///
> +/// @param set the set of pairs of DWARF offsets to consider.
> +///
> +/// @param p1 the first value of the pair.
> +///
> +/// @param p2 the second value of the pair.
> +static void
> +insert_offset_pair(dwarf_offset_pair_set_type& set, Dwarf_Off p1,
> Dwarf_Off p2)
> +{set.insert(std::pair<Dwarf_Off, Dwarf_Off>(p1, p2));}
> +
> +/// Erase a pair of DWARF offset from a set of pairs.
> +///
> +///
> +/// @param set the set of pairs of DWARF offsets to consider.
> +///
> +/// @param p1 the first value of the pair.
> +///
> +/// @param p2 the second value of the pair.
> +static void
> +erase_offset_pair(dwarf_offset_pair_set_type& set, Dwarf_Off p1,
> Dwarf_Off p2)
> +{
> +  std::pair<Dwarf_Off, Dwarf_Off> p(p1, p2);
> +  set.erase(p);
> +}
> +
>  /// Compare two DIEs emitted by a C compiler.
>  ///
>  /// @param ctxt the read context used to load the DWARF information.
> @@ -10123,7 +10203,7 @@ fn_die_equal_by_linkage_name(const read_context
> &ctxt,
>  static bool
>  compare_dies(const read_context& ctxt,
>              const Dwarf_Die *l, const Dwarf_Die *r,
> -            istring_set_type& aggregates_being_compared,
> +            dwarf_offset_pair_set_type& aggregates_being_compared,
>              bool update_canonical_dies_on_the_fly)
>  {
>    ABG_ASSERT(l);
> @@ -10268,23 +10348,15 @@ compare_dies(const read_context& ctxt,
>      case DW_TAG_structure_type:
>      case DW_TAG_union_type:
>        {
> -       interned_string ln = ctxt.get_die_pretty_type_representation(l, 0);
> -       interned_string rn = ctxt.get_die_pretty_type_representation(r, 0);
> -
> -       if ((aggregates_being_compared.find(ln)
> -            != aggregates_being_compared.end())
> -           || (aggregates_being_compared.find(rn)
> -               != aggregates_being_compared.end()))
> +       if (has_offset_pair(aggregates_being_compared,
> +                           die_offset(l), die_offset(r)))
>           result = true;
> -       else if (!compare_as_decl_dies(l, r))
> -         result = false;
> -       else if (!compare_as_type_dies(l, r))
> +       else if (!compare_as_decl_dies(l, r) || !compare_as_type_dies(l,
> r))
>           result = false;
>         else
>           {
> -           aggregates_being_compared.insert(ln);
> -           aggregates_being_compared.insert(rn);
> -
> +           insert_offset_pair(aggregates_being_compared,
> +                              die_offset(l), die_offset(r));
>             Dwarf_Die l_member, r_member;
>             bool found_l_member, found_r_member;
>             for (found_l_member = dwarf_child(const_cast<Dwarf_Die*>(l),
> @@ -10316,8 +10388,8 @@ compare_dies(const read_context& ctxt,
>             if (found_l_member != found_r_member)
>               result = false;
>
> -           aggregates_being_compared.erase(ln);
> -           aggregates_being_compared.erase(rn);
> +           erase_offset_pair(aggregates_being_compared,
> +                             die_offset(l), die_offset(r));
>           }
>        }
>        break;
> @@ -10402,10 +10474,8 @@ compare_dies(const read_context& ctxt,
>         interned_string ln = ctxt.get_die_pretty_type_representation(l, 0);
>         interned_string rn = ctxt.get_die_pretty_type_representation(r, 0);
>
> -       if ((aggregates_being_compared.find(ln)
> -            != aggregates_being_compared.end())
> -           || (aggregates_being_compared.find(rn)
> -               != aggregates_being_compared.end()))
> +       if (has_offset_pair(aggregates_being_compared, die_offset(l),
> +                           die_offset(r)))
>           {
>             result = true;
>             break;
> @@ -10498,8 +10568,8 @@ compare_dies(const read_context& ctxt,
>               }
>           }
>
> -       aggregates_being_compared.erase(ln);
> -       aggregates_being_compared.erase(rn);
> +       erase_offset_pair(aggregates_being_compared,
> +                         die_offset(l), die_offset(r));
>        }
>        break;
>
> @@ -10535,19 +10605,10 @@ compare_dies(const read_context& ctxt,
>               Dwarf_Die l_type, r_type;
>               ABG_ASSERT(die_die_attribute(l, DW_AT_type, l_type));
>               ABG_ASSERT(die_die_attribute(r, DW_AT_type, r_type));
> -             if (aggregates_being_compared.size () < 5)
> -               {
> -                 if (!compare_dies(ctxt, &l_type, &r_type,
> -                                   aggregates_being_compared,
> -                                   update_canonical_dies_on_the_fly))
> -                   result = false;
> -               }
> -             else
> -               {
> -                 if (!compare_as_type_dies(&l_type, &r_type)
> -                     ||!compare_as_decl_dies(&l_type, &r_type))
> -                   return false;
> -               }
> +             if (!compare_dies(ctxt, &l_type, &r_type,
> +                               aggregates_being_compared,
> +                               update_canonical_dies_on_the_fly))
> +               result = false;
>             }
>         }
>        else
> @@ -10661,7 +10722,7 @@ compare_dies(const read_context& ctxt,
>              const Dwarf_Die *r,
>              bool update_canonical_dies_on_the_fly)
>  {
> -  istring_set_type aggregates_being_compared;
> +  dwarf_offset_pair_set_type aggregates_being_compared;
>    return compare_dies(ctxt, l, r, aggregates_being_compared,
>                       update_canonical_dies_on_the_fly);
>  }
> --
> 2.35.0.rc2
>
>
> --
>                 Dodji
>

  reply	other threads:[~2022-03-02  9:28 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-28  9:34 Dodji Seketeli
2022-03-01 17:45 ` Dodji Seketeli
2022-03-02  8:58   ` Giuliano Procida [this message]
2022-03-02 16:31     ` Dodji Seketeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGvU0HkOmdg7RO7a53QyGVrZznsiircTJcdcBEaubN5q-7hoNg@mail.gmail.com \
    --to=gprocida@google.com \
    --cc=dodji@redhat.com \
    --cc=dodji@seketeli.org \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).