* [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 2/4] abg-writer: refactor recording of unemitted types Giuliano Procida
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
The insertion and look-up in get_id_for_type can be refactored. This
avoids an extra round of hashing and equality tests in the case of a
newly inserted type.
* src/abg-writer.cc (get_id_for_type): Fold together the
look-up and insertion logic by always attempting to insert a
dummy value.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 47 ++++++++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 21 deletions(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index 9f48dc92..d23f4fd6 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -421,30 +421,35 @@ public:
if (c == 0)
c = const_cast<type_base*>(t);
- type_ptr_map::const_iterator it = m_type_id_map.find(c);
- if (it != m_type_id_map.end())
- return it->second;
+ auto insertion = m_type_id_map.emplace(c, interned_string());
+ interned_string& id = insertion.first->second;
- switch (m_type_id_style)
- {
- case SEQUENCE_TYPE_ID_STYLE:
- {
- interned_string id = get_id_manager().get_id_with_prefix("type-id-");
- return m_type_id_map[c] = id;
- }
- case HASH_TYPE_ID_STYLE:
+ if (insertion.second)
+ switch (m_type_id_style)
{
- interned_string pretty = c->get_cached_pretty_representation(true);
- size_t hash = hashing::fnv_hash(pretty);
- while (!m_used_type_id_hashes.insert(hash).second)
- ++hash;
- std::ostringstream os;
- os << std::hex << std::setfill('0') << std::setw(8) << hash;
- return m_type_id_map[c] = c->get_environment()->intern(os.str());
+ case SEQUENCE_TYPE_ID_STYLE:
+ {
+ id = get_id_manager().get_id_with_prefix("type-id-");
+ break;
+ }
+ case HASH_TYPE_ID_STYLE:
+ {
+ interned_string pretty = c->get_cached_pretty_representation(true);
+ size_t hash = hashing::fnv_hash(pretty);
+ while (!m_used_type_id_hashes.insert(hash).second)
+ ++hash;
+ std::ostringstream os;
+ os << std::hex << std::setfill('0') << std::setw(8) << hash;
+ id = c->get_environment()->intern(os.str());
+ break;
+ }
+ default: {
+ ABG_ASSERT_NOT_REACHED;
+ break;
+ }
}
- }
- ABG_ASSERT_NOT_REACHED;
- return interned_string();
+
+ return id;
}
string
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/4] abg-writer: refactor recording of unemitted types
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 3/4] abg-writer: allow debugging of equality / hash issues Giuliano Procida
2021-09-10 11:23 ` [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
The code fragment
if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
referenced_types_to_emit.insert(*i);
appears 5 times. This commit inlines the insertion into the helper
function which is also renamed.
* src/abg-writer.cc (referenced_type_should_be_emitted):
Replaced by record_unemitted_type.
(record_unemitted_type): Replacement function that also takes
care of inserting the type into the set.
(write_translation_unit): Replace calls to
referenced_type_should_be_emitted and inserts with calls to
record_unemitted_type.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 35 +++++++++++++++--------------------
1 file changed, 15 insertions(+), 20 deletions(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index d23f4fd6..795ca685 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2248,10 +2248,12 @@ write_canonical_types_of_scope(const scope_decl &scope,
}
/// Test if a type referenced in a given translation unit should be
-/// emitted or not.
+/// emitted and record it if so.
///
/// This is a subroutine of @ref write_translation_unit.
///
+/// @param types the set of types to update.
+///
/// @param t the type to consider.
///
/// @param ctxt the write context to consider.
@@ -2260,20 +2262,18 @@ write_canonical_types_of_scope(const scope_decl &scope,
///
/// @param tu_is_last true if @p tu is the last translation unit being
/// emitted.
-///
-/// @return true iff @p t is to be emitted.
-static bool
-referenced_type_should_be_emitted(const type_base *t,
- const write_context& ctxt,
- const translation_unit& tu,
- bool tu_is_last)
+static void
+record_unemitted_type(type_ptr_set_type& types,
+ const type_base *t,
+ const write_context& ctxt,
+ const translation_unit& tu,
+ bool tu_is_last)
{
if ((tu_is_last || t->get_translation_unit()->get_absolute_path()
== tu.get_absolute_path())
&& !ctxt.type_is_emitted(t)
&& !ctxt.decl_only_type_is_emitted(t))
- return true;
- return false;
+ types.insert(t);
}
/// Serialize a translation unit to an output stream.
@@ -2389,22 +2389,19 @@ write_translation_unit(write_context& ctxt,
ctxt.get_referenced_types().begin();
i != ctxt.get_referenced_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (fn_type_ptr_set_type::const_iterator i =
ctxt.get_referenced_function_types().begin();
i != ctxt.get_referenced_function_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (type_ptr_set_type::const_iterator i =
ctxt.get_referenced_non_canonical_types().begin();
i != ctxt.get_referenced_non_canonical_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
// Ok, now let's emit the referenced type for good.
while (!referenced_types_to_emit.empty())
@@ -2459,15 +2456,13 @@ write_translation_unit(write_context& ctxt,
ctxt.get_referenced_types().begin();
i != ctxt.get_referenced_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (type_ptr_set_type::const_iterator i =
ctxt.get_referenced_non_canonical_types().begin();
i != ctxt.get_referenced_non_canonical_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
}
// Now handle all function types that were not only referenced by
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 3/4] abg-writer: allow debugging of equality / hash issues
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
2021-09-10 11:23 ` [PATCH 2/4] abg-writer: refactor recording of unemitted types Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
Inconsistenencies between equality and hash values can and have caused
hard-to-debug issues. There are quite a few unordered sets and maps in
the XML writer that use deep_ptr_eq_functor and type_hasher.
New ./configure option: --enable-debug-equality-hash-consistency
A simple way of tracking down potential issues is to check for
inconsistencies any time a type is looked up or inserted into a
container.
* src/abg-writer.cc (get_id_for_type): Add a debugging stanza
that, before a type is looked up or inserted, asserts that if
it is equal to an existing mapped type then their hash values
must be the same.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
configure.ac | 16 ++++++++
src/abg-writer.cc | 102 +++++++++++++++++++++++++++++++++++++---------
2 files changed, 98 insertions(+), 20 deletions(-)
diff --git a/configure.ac b/configure.ac
index 735cc9de..8870a7e7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -72,6 +72,12 @@ AC_ARG_ENABLE(debug-self-comparison,
ENABLE_DEBUG_SELF_COMPARISON=$enableval,
ENABLE_DEBUG_SELF_COMPARISON=no)
+AC_ARG_ENABLE(debug-equality-hash-consistency,
+ AS_HELP_STRING([--enable-debug-equality-hash-consistency=yes|no],
+ [enable debugging of equality / hash value consistency in unordered maps (default is no)]),
+ ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY=$enableval,
+ ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY=no)
+
AC_ARG_ENABLE(deb,
AS_HELP_STRING([--enable-deb=yes|no|auto],
[enable the support of deb in abipkgdiff (default is auto)]),
@@ -313,6 +319,16 @@ fi
AM_CONDITIONAL(ENABLE_DEBUG_SELF_COMPARISON, test x$ENABLE_DEBUG_SELF_COMPARISON = xyes)
+dnl enable the debugging of equality / hash value consistency in the abidw XML writer
+if test x$ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY = xyes; then
+ AC_DEFINE([WITH_DEBUG_EQUALITY_HASH_CONSISTENCY], 1, [compile equality / hash value consistency checks])
+ AC_MSG_NOTICE([support of debugging equality / hash value consistency is enabled])
+else
+ AC_MSG_NOTICE([support of debugging equality / hash value consistency is disabled])
+fi
+
+AM_CONDITIONAL(ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY, test x$ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY = xyes)
+
dnl Check for the dpkg program
if test x$ENABLE_DEB = xauto -o x$ENABLE_DEB = xyes; then
AC_CHECK_PROG(HAS_DPKG, dpkg, yes, no)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index 795ca685..cf664720 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -115,6 +115,41 @@ struct type_hasher
{return hash_type(t);}
}; // end struct type_hasher
+template <typename T>
+static void check_set_equality_hash_consistency(
+ const T& haystack, const typename T::key_type& needle)
+{
+#ifdef WITH_DEBUG_EQUALITY_HASH_CONSISTENCY
+ abigail::diff_utils::deep_ptr_eq_functor eq;
+ type_hasher hasher;
+ const auto needle_hash = hasher(needle);
+ for (const auto& key : haystack)
+ ABG_ASSERT(!eq(needle, key) || needle_hash == hasher(key));
+#else
+ (void) haystack;
+ (void) needle;
+#endif
+}
+
+template <typename T>
+static void check_map_equality_hash_consistency(
+ const T& haystack, const typename T::key_type& needle)
+{
+#ifdef WITH_DEBUG_EQUALITY_HASH_CONSISTENCY
+ abigail::diff_utils::deep_ptr_eq_functor eq;
+ type_hasher hasher;
+ const auto needle_hash = hasher(needle);
+ for (const auto& key_value : haystack)
+ {
+ const auto& key = key_value.first;
+ ABG_ASSERT(!eq(key, needle) || needle_hash == hasher(key));
+ }
+#else
+ (void) haystack;
+ (void) needle;
+#endif
+}
+
/// A convenience typedef for a map that associates a pointer to type
/// to a string. The pointer to type is hashed as fast as possible.
typedef unordered_map<type_base*,
@@ -421,6 +456,7 @@ public:
if (c == 0)
c = const_cast<type_base*>(t);
+ check_map_equality_hash_consistency(m_type_id_map, c);
auto insertion = m_type_id_map.emplace(c, interned_string());
interned_string& id = insertion.first->second;
@@ -544,14 +580,25 @@ public:
{
// If the type is a function type, record it in a dedicated data
// structure.
- if (function_type* f = is_function_type(t.get()))
- m_referenced_fn_types_set.insert(f);
- else if (!t->get_naked_canonical_type())
- // If the type doesn't have a canonical type, record it in a
- // dedicated data structure.
- m_referenced_non_canonical_types_set.insert(t.get());
+ type_base* t_ptr = t.get();
+ if (function_type* f = is_function_type(t_ptr))
+ {
+ check_set_equality_hash_consistency(m_referenced_fn_types_set, f);
+ m_referenced_fn_types_set.insert(f);
+ }
+ else if (!t_ptr->get_naked_canonical_type())
+ {
+ // If the type doesn't have a canonical type, record it in a
+ // dedicated data structure.
+ check_set_equality_hash_consistency(
+ m_referenced_non_canonical_types_set, t_ptr);
+ m_referenced_non_canonical_types_set.insert(t_ptr);
+ }
else
- m_referenced_types_set.insert(t.get());
+ {
+ check_set_equality_hash_consistency(m_referenced_types_set, t_ptr);
+ m_referenced_types_set.insert(t_ptr);
+ }
}
/// Test if a given type has been referenced by a pointer, a
@@ -564,15 +611,26 @@ public:
bool
type_is_referenced(const type_base_sptr& t)
{
- if (function_type *f = is_function_type(t.get()))
- return (m_referenced_fn_types_set.find(f)
- != m_referenced_fn_types_set.end());
- else if (!t->get_naked_canonical_type())
- return (m_referenced_non_canonical_types_set.find(t.get())
- != m_referenced_non_canonical_types_set.end());
+ type_base* t_ptr = t.get();
+ if (function_type *f = is_function_type(t_ptr))
+ {
+ check_set_equality_hash_consistency(m_referenced_fn_types_set, f);
+ return (m_referenced_fn_types_set.find(f)
+ != m_referenced_fn_types_set.end());
+ }
+ else if (!t_ptr->get_naked_canonical_type())
+ {
+ check_set_equality_hash_consistency(
+ m_referenced_non_canonical_types_set, t_ptr);
+ return (m_referenced_non_canonical_types_set.find(t_ptr)
+ != m_referenced_non_canonical_types_set.end());
+ }
else
- return m_referenced_types_set.find
- (t.get()) != m_referenced_types_set.end();
+ {
+ check_set_equality_hash_consistency(m_referenced_types_set, t_ptr);
+ return (m_referenced_types_set.find(t_ptr)
+ != m_referenced_types_set.end());
+ }
}
/// A comparison functor to compare pointers to @ref type_base.
@@ -727,6 +785,7 @@ public:
type_base *c = t->get_naked_canonical_type();
if (c == 0)
c = const_cast<type_base*>(t);
+ check_set_equality_hash_consistency(m_emitted_type_set, c);
m_emitted_type_set.insert(c);
}
@@ -739,6 +798,7 @@ public:
bool
type_is_emitted(const type_base *t) const
{
+ check_set_equality_hash_consistency(m_emitted_type_set, t);
return m_emitted_type_set.find(t) != m_emitted_type_set.end();
}
@@ -792,6 +852,7 @@ public:
{
class_or_union* cl = is_class_or_union_type(t);
ABG_ASSERT(cl && cl->get_is_declaration_only());
+ check_set_equality_hash_consistency(m_emitted_decl_only_set, t);
m_emitted_decl_only_set.insert(t);
}
@@ -814,10 +875,8 @@ public:
bool
decl_only_type_is_emitted(const type_base* t) const
{
- type_ptr_set_type::const_iterator i = m_emitted_decl_only_set.find(t);
- if (i == m_emitted_decl_only_set.end())
- return false;
- return true;
+ check_set_equality_hash_consistency(m_emitted_decl_only_set, t);
+ return m_emitted_decl_only_set.find(t) != m_emitted_decl_only_set.end();
}
/// Test if a declaration-only class has been emitted.
@@ -2273,7 +2332,10 @@ record_unemitted_type(type_ptr_set_type& types,
== tu.get_absolute_path())
&& !ctxt.type_is_emitted(t)
&& !ctxt.decl_only_type_is_emitted(t))
- types.insert(t);
+ {
+ check_set_equality_hash_consistency(types, t);
+ types.insert(t);
+ }
}
/// Serialize a translation unit to an output stream.
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
` (2 preceding siblings ...)
2021-09-10 11:23 ` [PATCH 3/4] abg-writer: allow debugging of equality / hash issues Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
Before writing out extra referenced types, the function
write_translation_unit does not check if this was an already emitted
declaration-only type.
This may actually be intentional, given the following comment.
// We allow several *declarations* of the same class in the corpus,
// but only one definition.
* src/abg-writer.cc (write_translation_unit): Also check if an
an extra referenced type is declaration-only and already
emitted.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index cf664720..abd5a45a 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2484,7 +2484,7 @@ write_translation_unit(write_context& ctxt,
// We handle types which have declarations *and* function
// types here.
type_base_sptr t(*i, noop_deleter());
- if (!ctxt.type_is_emitted(t))
+ if (!ctxt.type_is_emitted(t) && !ctxt.decl_only_type_is_emitted(t))
{
if (decl_base* d = get_type_declaration(*i))
{
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread