* [PATCH 0/4] Looking at equality and hashing
@ 2021-09-10 11:23 Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
These relate to PR28320.
The first two patches are essentially code movement to better isolate
where insertions and look-ups happen.
The third patch adds some optional consistency checks. These find one
issue in the test suite. Unfortunately, they do not pick up on issue
in PR28320 itself. Perhaps there is a canonical type pointer issue as
well.
The last patch seems like an omission. Please take a look.
Giuliano Procida (4):
abg-writer: get_id_for_type: refactor insertion and look-up
abg-writer: refactor recording of unemitted types
abg-writer: allow debugging of equality / hash issues
RFC: abg-writer: add a missing check for emitted declarations
configure.ac | 16 ++++
src/abg-writer.cc | 184 +++++++++++++++++++++++++++++++---------------
2 files changed, 139 insertions(+), 61 deletions(-)
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 2/4] abg-writer: refactor recording of unemitted types Giuliano Procida
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
The insertion and look-up in get_id_for_type can be refactored. This
avoids an extra round of hashing and equality tests in the case of a
newly inserted type.
* src/abg-writer.cc (get_id_for_type): Fold together the
look-up and insertion logic by always attempting to insert a
dummy value.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 47 ++++++++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 21 deletions(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index 9f48dc92..d23f4fd6 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -421,30 +421,35 @@ public:
if (c == 0)
c = const_cast<type_base*>(t);
- type_ptr_map::const_iterator it = m_type_id_map.find(c);
- if (it != m_type_id_map.end())
- return it->second;
+ auto insertion = m_type_id_map.emplace(c, interned_string());
+ interned_string& id = insertion.first->second;
- switch (m_type_id_style)
- {
- case SEQUENCE_TYPE_ID_STYLE:
- {
- interned_string id = get_id_manager().get_id_with_prefix("type-id-");
- return m_type_id_map[c] = id;
- }
- case HASH_TYPE_ID_STYLE:
+ if (insertion.second)
+ switch (m_type_id_style)
{
- interned_string pretty = c->get_cached_pretty_representation(true);
- size_t hash = hashing::fnv_hash(pretty);
- while (!m_used_type_id_hashes.insert(hash).second)
- ++hash;
- std::ostringstream os;
- os << std::hex << std::setfill('0') << std::setw(8) << hash;
- return m_type_id_map[c] = c->get_environment()->intern(os.str());
+ case SEQUENCE_TYPE_ID_STYLE:
+ {
+ id = get_id_manager().get_id_with_prefix("type-id-");
+ break;
+ }
+ case HASH_TYPE_ID_STYLE:
+ {
+ interned_string pretty = c->get_cached_pretty_representation(true);
+ size_t hash = hashing::fnv_hash(pretty);
+ while (!m_used_type_id_hashes.insert(hash).second)
+ ++hash;
+ std::ostringstream os;
+ os << std::hex << std::setfill('0') << std::setw(8) << hash;
+ id = c->get_environment()->intern(os.str());
+ break;
+ }
+ default: {
+ ABG_ASSERT_NOT_REACHED;
+ break;
+ }
}
- }
- ABG_ASSERT_NOT_REACHED;
- return interned_string();
+
+ return id;
}
string
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/4] abg-writer: refactor recording of unemitted types
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 3/4] abg-writer: allow debugging of equality / hash issues Giuliano Procida
2021-09-10 11:23 ` [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
The code fragment
if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
referenced_types_to_emit.insert(*i);
appears 5 times. This commit inlines the insertion into the helper
function which is also renamed.
* src/abg-writer.cc (referenced_type_should_be_emitted):
Replaced by record_unemitted_type.
(record_unemitted_type): Replacement function that also takes
care of inserting the type into the set.
(write_translation_unit): Replace calls to
referenced_type_should_be_emitted and inserts with calls to
record_unemitted_type.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 35 +++++++++++++++--------------------
1 file changed, 15 insertions(+), 20 deletions(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index d23f4fd6..795ca685 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2248,10 +2248,12 @@ write_canonical_types_of_scope(const scope_decl &scope,
}
/// Test if a type referenced in a given translation unit should be
-/// emitted or not.
+/// emitted and record it if so.
///
/// This is a subroutine of @ref write_translation_unit.
///
+/// @param types the set of types to update.
+///
/// @param t the type to consider.
///
/// @param ctxt the write context to consider.
@@ -2260,20 +2262,18 @@ write_canonical_types_of_scope(const scope_decl &scope,
///
/// @param tu_is_last true if @p tu is the last translation unit being
/// emitted.
-///
-/// @return true iff @p t is to be emitted.
-static bool
-referenced_type_should_be_emitted(const type_base *t,
- const write_context& ctxt,
- const translation_unit& tu,
- bool tu_is_last)
+static void
+record_unemitted_type(type_ptr_set_type& types,
+ const type_base *t,
+ const write_context& ctxt,
+ const translation_unit& tu,
+ bool tu_is_last)
{
if ((tu_is_last || t->get_translation_unit()->get_absolute_path()
== tu.get_absolute_path())
&& !ctxt.type_is_emitted(t)
&& !ctxt.decl_only_type_is_emitted(t))
- return true;
- return false;
+ types.insert(t);
}
/// Serialize a translation unit to an output stream.
@@ -2389,22 +2389,19 @@ write_translation_unit(write_context& ctxt,
ctxt.get_referenced_types().begin();
i != ctxt.get_referenced_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (fn_type_ptr_set_type::const_iterator i =
ctxt.get_referenced_function_types().begin();
i != ctxt.get_referenced_function_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (type_ptr_set_type::const_iterator i =
ctxt.get_referenced_non_canonical_types().begin();
i != ctxt.get_referenced_non_canonical_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
// Ok, now let's emit the referenced type for good.
while (!referenced_types_to_emit.empty())
@@ -2459,15 +2456,13 @@ write_translation_unit(write_context& ctxt,
ctxt.get_referenced_types().begin();
i != ctxt.get_referenced_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
for (type_ptr_set_type::const_iterator i =
ctxt.get_referenced_non_canonical_types().begin();
i != ctxt.get_referenced_non_canonical_types().end();
++i)
- if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
- referenced_types_to_emit.insert(*i);
+ record_unemitted_type(referenced_types_to_emit, *i, ctxt, tu, is_last);
}
// Now handle all function types that were not only referenced by
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 3/4] abg-writer: allow debugging of equality / hash issues
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
2021-09-10 11:23 ` [PATCH 2/4] abg-writer: refactor recording of unemitted types Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
2021-09-10 11:23 ` [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
Inconsistenencies between equality and hash values can and have caused
hard-to-debug issues. There are quite a few unordered sets and maps in
the XML writer that use deep_ptr_eq_functor and type_hasher.
New ./configure option: --enable-debug-equality-hash-consistency
A simple way of tracking down potential issues is to check for
inconsistencies any time a type is looked up or inserted into a
container.
* src/abg-writer.cc (get_id_for_type): Add a debugging stanza
that, before a type is looked up or inserted, asserts that if
it is equal to an existing mapped type then their hash values
must be the same.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
configure.ac | 16 ++++++++
src/abg-writer.cc | 102 +++++++++++++++++++++++++++++++++++++---------
2 files changed, 98 insertions(+), 20 deletions(-)
diff --git a/configure.ac b/configure.ac
index 735cc9de..8870a7e7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -72,6 +72,12 @@ AC_ARG_ENABLE(debug-self-comparison,
ENABLE_DEBUG_SELF_COMPARISON=$enableval,
ENABLE_DEBUG_SELF_COMPARISON=no)
+AC_ARG_ENABLE(debug-equality-hash-consistency,
+ AS_HELP_STRING([--enable-debug-equality-hash-consistency=yes|no],
+ [enable debugging of equality / hash value consistency in unordered maps (default is no)]),
+ ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY=$enableval,
+ ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY=no)
+
AC_ARG_ENABLE(deb,
AS_HELP_STRING([--enable-deb=yes|no|auto],
[enable the support of deb in abipkgdiff (default is auto)]),
@@ -313,6 +319,16 @@ fi
AM_CONDITIONAL(ENABLE_DEBUG_SELF_COMPARISON, test x$ENABLE_DEBUG_SELF_COMPARISON = xyes)
+dnl enable the debugging of equality / hash value consistency in the abidw XML writer
+if test x$ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY = xyes; then
+ AC_DEFINE([WITH_DEBUG_EQUALITY_HASH_CONSISTENCY], 1, [compile equality / hash value consistency checks])
+ AC_MSG_NOTICE([support of debugging equality / hash value consistency is enabled])
+else
+ AC_MSG_NOTICE([support of debugging equality / hash value consistency is disabled])
+fi
+
+AM_CONDITIONAL(ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY, test x$ENABLE_DEBUG_EQUALITY_HASH_CONSISTENCY = xyes)
+
dnl Check for the dpkg program
if test x$ENABLE_DEB = xauto -o x$ENABLE_DEB = xyes; then
AC_CHECK_PROG(HAS_DPKG, dpkg, yes, no)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index 795ca685..cf664720 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -115,6 +115,41 @@ struct type_hasher
{return hash_type(t);}
}; // end struct type_hasher
+template <typename T>
+static void check_set_equality_hash_consistency(
+ const T& haystack, const typename T::key_type& needle)
+{
+#ifdef WITH_DEBUG_EQUALITY_HASH_CONSISTENCY
+ abigail::diff_utils::deep_ptr_eq_functor eq;
+ type_hasher hasher;
+ const auto needle_hash = hasher(needle);
+ for (const auto& key : haystack)
+ ABG_ASSERT(!eq(needle, key) || needle_hash == hasher(key));
+#else
+ (void) haystack;
+ (void) needle;
+#endif
+}
+
+template <typename T>
+static void check_map_equality_hash_consistency(
+ const T& haystack, const typename T::key_type& needle)
+{
+#ifdef WITH_DEBUG_EQUALITY_HASH_CONSISTENCY
+ abigail::diff_utils::deep_ptr_eq_functor eq;
+ type_hasher hasher;
+ const auto needle_hash = hasher(needle);
+ for (const auto& key_value : haystack)
+ {
+ const auto& key = key_value.first;
+ ABG_ASSERT(!eq(key, needle) || needle_hash == hasher(key));
+ }
+#else
+ (void) haystack;
+ (void) needle;
+#endif
+}
+
/// A convenience typedef for a map that associates a pointer to type
/// to a string. The pointer to type is hashed as fast as possible.
typedef unordered_map<type_base*,
@@ -421,6 +456,7 @@ public:
if (c == 0)
c = const_cast<type_base*>(t);
+ check_map_equality_hash_consistency(m_type_id_map, c);
auto insertion = m_type_id_map.emplace(c, interned_string());
interned_string& id = insertion.first->second;
@@ -544,14 +580,25 @@ public:
{
// If the type is a function type, record it in a dedicated data
// structure.
- if (function_type* f = is_function_type(t.get()))
- m_referenced_fn_types_set.insert(f);
- else if (!t->get_naked_canonical_type())
- // If the type doesn't have a canonical type, record it in a
- // dedicated data structure.
- m_referenced_non_canonical_types_set.insert(t.get());
+ type_base* t_ptr = t.get();
+ if (function_type* f = is_function_type(t_ptr))
+ {
+ check_set_equality_hash_consistency(m_referenced_fn_types_set, f);
+ m_referenced_fn_types_set.insert(f);
+ }
+ else if (!t_ptr->get_naked_canonical_type())
+ {
+ // If the type doesn't have a canonical type, record it in a
+ // dedicated data structure.
+ check_set_equality_hash_consistency(
+ m_referenced_non_canonical_types_set, t_ptr);
+ m_referenced_non_canonical_types_set.insert(t_ptr);
+ }
else
- m_referenced_types_set.insert(t.get());
+ {
+ check_set_equality_hash_consistency(m_referenced_types_set, t_ptr);
+ m_referenced_types_set.insert(t_ptr);
+ }
}
/// Test if a given type has been referenced by a pointer, a
@@ -564,15 +611,26 @@ public:
bool
type_is_referenced(const type_base_sptr& t)
{
- if (function_type *f = is_function_type(t.get()))
- return (m_referenced_fn_types_set.find(f)
- != m_referenced_fn_types_set.end());
- else if (!t->get_naked_canonical_type())
- return (m_referenced_non_canonical_types_set.find(t.get())
- != m_referenced_non_canonical_types_set.end());
+ type_base* t_ptr = t.get();
+ if (function_type *f = is_function_type(t_ptr))
+ {
+ check_set_equality_hash_consistency(m_referenced_fn_types_set, f);
+ return (m_referenced_fn_types_set.find(f)
+ != m_referenced_fn_types_set.end());
+ }
+ else if (!t_ptr->get_naked_canonical_type())
+ {
+ check_set_equality_hash_consistency(
+ m_referenced_non_canonical_types_set, t_ptr);
+ return (m_referenced_non_canonical_types_set.find(t_ptr)
+ != m_referenced_non_canonical_types_set.end());
+ }
else
- return m_referenced_types_set.find
- (t.get()) != m_referenced_types_set.end();
+ {
+ check_set_equality_hash_consistency(m_referenced_types_set, t_ptr);
+ return (m_referenced_types_set.find(t_ptr)
+ != m_referenced_types_set.end());
+ }
}
/// A comparison functor to compare pointers to @ref type_base.
@@ -727,6 +785,7 @@ public:
type_base *c = t->get_naked_canonical_type();
if (c == 0)
c = const_cast<type_base*>(t);
+ check_set_equality_hash_consistency(m_emitted_type_set, c);
m_emitted_type_set.insert(c);
}
@@ -739,6 +798,7 @@ public:
bool
type_is_emitted(const type_base *t) const
{
+ check_set_equality_hash_consistency(m_emitted_type_set, t);
return m_emitted_type_set.find(t) != m_emitted_type_set.end();
}
@@ -792,6 +852,7 @@ public:
{
class_or_union* cl = is_class_or_union_type(t);
ABG_ASSERT(cl && cl->get_is_declaration_only());
+ check_set_equality_hash_consistency(m_emitted_decl_only_set, t);
m_emitted_decl_only_set.insert(t);
}
@@ -814,10 +875,8 @@ public:
bool
decl_only_type_is_emitted(const type_base* t) const
{
- type_ptr_set_type::const_iterator i = m_emitted_decl_only_set.find(t);
- if (i == m_emitted_decl_only_set.end())
- return false;
- return true;
+ check_set_equality_hash_consistency(m_emitted_decl_only_set, t);
+ return m_emitted_decl_only_set.find(t) != m_emitted_decl_only_set.end();
}
/// Test if a declaration-only class has been emitted.
@@ -2273,7 +2332,10 @@ record_unemitted_type(type_ptr_set_type& types,
== tu.get_absolute_path())
&& !ctxt.type_is_emitted(t)
&& !ctxt.decl_only_type_is_emitted(t))
- types.insert(t);
+ {
+ check_set_equality_hash_consistency(types, t);
+ types.insert(t);
+ }
}
/// Serialize a translation unit to an output stream.
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
` (2 preceding siblings ...)
2021-09-10 11:23 ` [PATCH 3/4] abg-writer: allow debugging of equality / hash issues Giuliano Procida
@ 2021-09-10 11:23 ` Giuliano Procida
3 siblings, 0 replies; 5+ messages in thread
From: Giuliano Procida @ 2021-09-10 11:23 UTC (permalink / raw)
To: libabigail; +Cc: dodji, kernel-team, gprocida, maennich
Before writing out extra referenced types, the function
write_translation_unit does not check if this was an already emitted
declaration-only type.
This may actually be intentional, given the following comment.
// We allow several *declarations* of the same class in the corpus,
// but only one definition.
* src/abg-writer.cc (write_translation_unit): Also check if an
an extra referenced type is declaration-only and already
emitted.
Signed-off-by: Giuliano Procida <gprocida@google.com>
---
src/abg-writer.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index cf664720..abd5a45a 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2484,7 +2484,7 @@ write_translation_unit(write_context& ctxt,
// We handle types which have declarations *and* function
// types here.
type_base_sptr t(*i, noop_deleter());
- if (!ctxt.type_is_emitted(t))
+ if (!ctxt.type_is_emitted(t) && !ctxt.decl_only_type_is_emitted(t))
{
if (decl_base* d = get_type_declaration(*i))
{
--
2.33.0.309.g3052b89438-goog
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-09-10 11:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-10 11:23 [PATCH 0/4] Looking at equality and hashing Giuliano Procida
2021-09-10 11:23 ` [PATCH 1/4] abg-writer: get_id_for_type: refactor insertion and look-up Giuliano Procida
2021-09-10 11:23 ` [PATCH 2/4] abg-writer: refactor recording of unemitted types Giuliano Procida
2021-09-10 11:23 ` [PATCH 3/4] abg-writer: allow debugging of equality / hash issues Giuliano Procida
2021-09-10 11:23 ` [PATCH 4/4] RFC: abg-writer: add a missing check for emitted declarations Giuliano Procida
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).