public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
* [RESEND PATCH] corpus/writer: sort emitted translation units by path name
@ 2020-04-30 22:58 Matthias Maennich
  2020-05-01 14:35 ` Giuliano Procida
  2020-05-05 11:30 ` Dodji Seketeli
  0 siblings, 2 replies; 4+ messages in thread
From: Matthias Maennich @ 2020-04-30 22:58 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, gprocida, kernel-team, maennich, mark

By sorting the corpora output, we achieve determinism for the emitted
XML file across multiple runs of abidw.

For that to happen, change the collection of translation units to a
std::set (instead of std::vector), sorted by absolute path name.

Test data needed adjustments, but changes are fully compatible.

	* include/abg-fwd.h: remove translation_units fwd declaration.
	* include/abg-ir.h: add SharedTranslationUnitComparator and add
	translation_units fwd declaration as a set using said comparator.
	* src/abg-corpus.cc (corpus::add): do checked insert into the
	translation_units set (rather than vector::pushback)
	* tests/data/test-annotate/test13-pr18894.so.abi: Adjust test data.
	* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
	* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
	* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
	* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
	* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
	* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
	* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.

Signed-off-by: Matthias Maennich <maennich@google.com>
---

I cut out the test data changes when sending this PATCH to avoid a crazy large
email. The test data can be restored quite easily, but I also staged the change
in the mm/sort-tu-by-name branch [1] easily cherry pickable as

    d5bdebdbd3a8 ("corpus/writer: sort emitted translation units by path name")

Cheers,
Matthias

[1] https://sourceware.org/git/?p=libabigail.git;a=commit;h=d5bdebdbd3a8a5abd804b314ded517ae6fa26d2e

 include/abg-fwd.h                             |     2 -
 include/abg-ir.h                              |    15 +
 src/abg-corpus.cc                             |     3 +-
 .../data/test-annotate/test13-pr18894.so.abi  |   562 +-
 .../data/test-annotate/test14-pr18893.so.abi  | 11708 +--
 .../data/test-annotate/test15-pr18892.so.abi  | 71712 +++++++--------
 .../data/test-annotate/test17-pr19027.so.abi  | 54448 +++++------
 ...st18-pr19037-libvtkRenderingLIC-6.1.so.abi |    58 +-
 ...19-pr19023-libtcmalloc_and_profiler.so.abi | 76289 ++++++++--------
 ...st20-pr19025-libvtkParallelCore-6.1.so.abi |  9220 +-
 .../data/test-annotate/test21-pr19092.so.abi  | 12358 +--
 11 files changed, 118387 insertions(+), 117988 deletions(-)

diff --git a/include/abg-fwd.h b/include/abg-fwd.h
index 1aab70a663de..cffa3f0d1aa3 100644
--- a/include/abg-fwd.h
+++ b/include/abg-fwd.h
@@ -119,8 +119,6 @@ class translation_unit;
 /// Convenience typedef for a shared pointer on a @ref
 /// translation_unit type.
 typedef shared_ptr<translation_unit> translation_unit_sptr;
-/// Convenience typedef for a vector of @ref translation_unit_sptr.
-typedef std::vector<translation_unit_sptr> translation_units;
 /// Convenience typedef for a map that associates a string to a
 /// translation unit.
 typedef unordered_map<string, translation_unit_sptr> string_tu_map_type;
diff --git a/include/abg-ir.h b/include/abg-ir.h
index fda10de5c537..83dbe02f605d 100644
--- a/include/abg-ir.h
+++ b/include/abg-ir.h
@@ -33,6 +33,7 @@
 #include <assert.h>
 #include <stdint.h>
 #include <cstdlib>
+#include <set>
 #include "abg-cxx-compat.h"
 #include "abg-fwd.h"
 #include "abg-hash.h"
@@ -707,6 +708,20 @@ public:
 					translation_unit& tu);
 };//end class translation_unit
 
+struct SharedTranslationUnitComparator
+{
+  bool
+  operator()(const translation_unit_sptr& lhs,
+	     const translation_unit_sptr& rhs) const
+  {
+    return lhs->get_absolute_path() < rhs->get_absolute_path();
+  }
+};
+
+/// Convenience typedef for a vector of @ref translation_unit_sptr.
+typedef std::set<translation_unit_sptr, SharedTranslationUnitComparator>
+    translation_units;
+
 string
 translation_unit_language_to_string(translation_unit::language);
 
diff --git a/src/abg-corpus.cc b/src/abg-corpus.cc
index 12f44fd1e4cf..e01c20f75b4d 100644
--- a/src/abg-corpus.cc
+++ b/src/abg-corpus.cc
@@ -560,7 +560,8 @@ corpus::add(const translation_unit_sptr tu)
 
   ABG_ASSERT(tu->get_environment() == get_environment());
 
-  priv_->members.push_back(tu);
+  ABG_ASSERT(priv_->members.insert(tu).second);
+
   if (!tu->get_absolute_path().empty())
     {
       // Update the path -> translation_unit map.
-- 
2.26.2.526.g744177e7f7-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND PATCH] corpus/writer: sort emitted translation units by path name
  2020-04-30 22:58 [RESEND PATCH] corpus/writer: sort emitted translation units by path name Matthias Maennich
@ 2020-05-01 14:35 ` Giuliano Procida
  2020-05-04 13:25   ` Matthias Maennich
  2020-05-05 11:30 ` Dodji Seketeli
  1 sibling, 1 reply; 4+ messages in thread
From: Giuliano Procida @ 2020-05-01 14:35 UTC (permalink / raw)
  To: Matthias Maennich; +Cc: libabigail, Dodji Seketeli, kernel-team, mark

Hi.

I had a patch which sorted cus in read_debug_info_into_corpus but I
had some concens. This was ages ago.

Anyway...

On Thu, 30 Apr 2020 at 23:58, Matthias Maennich <maennich@google.com> wrote:
>
> By sorting the corpora output, we achieve determinism for the emitted
> XML file across multiple runs of abidw.
>
> For that to happen, change the collection of translation units to a
> std::set (instead of std::vector), sorted by absolute path name.

Would it be excessive paranoia to use a multiset instead?
Or keep the vector and sort it with the comparator?

This would eliminate an ABG_ASSERT.

> Test data needed adjustments, but changes are fully compatible.
>
>         * include/abg-fwd.h: remove translation_units fwd declaration.
>         * include/abg-ir.h: add SharedTranslationUnitComparator and add
>         translation_units fwd declaration as a set using said comparator.
>         * src/abg-corpus.cc (corpus::add): do checked insert into the
>         translation_units set (rather than vector::pushback)

s/pushback/push_back/

>         * tests/data/test-annotate/test13-pr18894.so.abi: Adjust test data.
>         * tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
>         * tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
>         * tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
>         * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
>         * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
>         * tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
>         * tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
>

Reviewed-by: Giuliano Procida <gprocida@google.com>

> Signed-off-by: Matthias Maennich <maennich@google.com>
> ---
>
> I cut out the test data changes when sending this PATCH to avoid a crazy large
> email. The test data can be restored quite easily, but I also staged the change
> in the mm/sort-tu-by-name branch [1] easily cherry pickable as
>
>     d5bdebdbd3a8 ("corpus/writer: sort emitted translation units by path name")
>
> Cheers,
> Matthias
>
> [1] https://sourceware.org/git/?p=libabigail.git;a=commit;h=d5bdebdbd3a8a5abd804b314ded517ae6fa26d2e
>
>  include/abg-fwd.h                             |     2 -
>  include/abg-ir.h                              |    15 +
>  src/abg-corpus.cc                             |     3 +-
>  .../data/test-annotate/test13-pr18894.so.abi  |   562 +-
>  .../data/test-annotate/test14-pr18893.so.abi  | 11708 +--
>  .../data/test-annotate/test15-pr18892.so.abi  | 71712 +++++++--------
>  .../data/test-annotate/test17-pr19027.so.abi  | 54448 +++++------
>  ...st18-pr19037-libvtkRenderingLIC-6.1.so.abi |    58 +-
>  ...19-pr19023-libtcmalloc_and_profiler.so.abi | 76289 ++++++++--------
>  ...st20-pr19025-libvtkParallelCore-6.1.so.abi |  9220 +-
>  .../data/test-annotate/test21-pr19092.so.abi  | 12358 +--
>  11 files changed, 118387 insertions(+), 117988 deletions(-)
>
> diff --git a/include/abg-fwd.h b/include/abg-fwd.h
> index 1aab70a663de..cffa3f0d1aa3 100644
> --- a/include/abg-fwd.h
> +++ b/include/abg-fwd.h
> @@ -119,8 +119,6 @@ class translation_unit;
>  /// Convenience typedef for a shared pointer on a @ref
>  /// translation_unit type.
>  typedef shared_ptr<translation_unit> translation_unit_sptr;
> -/// Convenience typedef for a vector of @ref translation_unit_sptr.
> -typedef std::vector<translation_unit_sptr> translation_units;
>  /// Convenience typedef for a map that associates a string to a
>  /// translation unit.
>  typedef unordered_map<string, translation_unit_sptr> string_tu_map_type;
> diff --git a/include/abg-ir.h b/include/abg-ir.h
> index fda10de5c537..83dbe02f605d 100644
> --- a/include/abg-ir.h
> +++ b/include/abg-ir.h
> @@ -33,6 +33,7 @@
>  #include <assert.h>
>  #include <stdint.h>
>  #include <cstdlib>
> +#include <set>
>  #include "abg-cxx-compat.h"
>  #include "abg-fwd.h"
>  #include "abg-hash.h"
> @@ -707,6 +708,20 @@ public:
>                                         translation_unit& tu);
>  };//end class translation_unit
>
> +struct SharedTranslationUnitComparator
> +{
> +  bool
> +  operator()(const translation_unit_sptr& lhs,
> +            const translation_unit_sptr& rhs) const
> +  {
> +    return lhs->get_absolute_path() < rhs->get_absolute_path();
> +  }
> +};
> +
> +/// Convenience typedef for a vector of @ref translation_unit_sptr.

vector / set

> +typedef std::set<translation_unit_sptr, SharedTranslationUnitComparator>
> +    translation_units;
> +
>  string
>  translation_unit_language_to_string(translation_unit::language);
>
> diff --git a/src/abg-corpus.cc b/src/abg-corpus.cc
> index 12f44fd1e4cf..e01c20f75b4d 100644
> --- a/src/abg-corpus.cc
> +++ b/src/abg-corpus.cc
> @@ -560,7 +560,8 @@ corpus::add(const translation_unit_sptr tu)
>
>    ABG_ASSERT(tu->get_environment() == get_environment());
>
> -  priv_->members.push_back(tu);
> +  ABG_ASSERT(priv_->members.insert(tu).second);
> +
>    if (!tu->get_absolute_path().empty())
>      {
>        // Update the path -> translation_unit map.
> --
> 2.26.2.526.g744177e7f7-goog
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND PATCH] corpus/writer: sort emitted translation units by path name
  2020-05-01 14:35 ` Giuliano Procida
@ 2020-05-04 13:25   ` Matthias Maennich
  0 siblings, 0 replies; 4+ messages in thread
From: Matthias Maennich @ 2020-05-04 13:25 UTC (permalink / raw)
  To: Giuliano Procida; +Cc: libabigail, Dodji Seketeli, kernel-team, mark

On Fri, May 01, 2020 at 03:35:06PM +0100, Giuliano Procida wrote:
>Hi.
>
>I had a patch which sorted cus in read_debug_info_into_corpus but I
>had some concens. This was ages ago.
>
>Anyway...
>
>On Thu, 30 Apr 2020 at 23:58, Matthias Maennich <maennich@google.com> wrote:
>>
>> By sorting the corpora output, we achieve determinism for the emitted
>> XML file across multiple runs of abidw.
>>
>> For that to happen, change the collection of translation units to a
>> std::set (instead of std::vector), sorted by absolute path name.
>
>Would it be excessive paranoia to use a multiset instead?
>Or keep the vector and sort it with the comparator?
>
>This would eliminate an ABG_ASSERT.

This assert kicks in if we see a translation unit with the same name
twice. While there might be scenarios where the same relative name is
valid, that should still not happen. Hence my assumption that this
should never happen and we can assert there. Allowing multiple values
with the same key likely introduces issues downstream when working on
the set of translation units.

>
>> Test data needed adjustments, but changes are fully compatible.
>>
>>         * include/abg-fwd.h: remove translation_units fwd declaration.
>>         * include/abg-ir.h: add SharedTranslationUnitComparator and add
>>         translation_units fwd declaration as a set using said comparator.
>>         * src/abg-corpus.cc (corpus::add): do checked insert into the
>>         translation_units set (rather than vector::pushback)
>
>s/pushback/push_back/

Dodji, could you correct this if you agree with the patch and decide to
apply it?

>
>>         * tests/data/test-annotate/test13-pr18894.so.abi: Adjust test data.
>>         * tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
>>         * tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
>>         * tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
>>         * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
>>         * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
>>         * tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
>>         * tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
>>
>
>Reviewed-by: Giuliano Procida <gprocida@google.com>
>
>> Signed-off-by: Matthias Maennich <maennich@google.com>
>> ---
>>
>> I cut out the test data changes when sending this PATCH to avoid a crazy large
>> email. The test data can be restored quite easily, but I also staged the change
>> in the mm/sort-tu-by-name branch [1] easily cherry pickable as
>>
>>     d5bdebdbd3a8 ("corpus/writer: sort emitted translation units by path name")
>>
>> Cheers,
>> Matthias
>>
>> [1] https://sourceware.org/git/?p=libabigail.git;a=commit;h=d5bdebdbd3a8a5abd804b314ded517ae6fa26d2e
>>
>>  include/abg-fwd.h                             |     2 -
>>  include/abg-ir.h                              |    15 +
>>  src/abg-corpus.cc                             |     3 +-
>>  .../data/test-annotate/test13-pr18894.so.abi  |   562 +-
>>  .../data/test-annotate/test14-pr18893.so.abi  | 11708 +--
>>  .../data/test-annotate/test15-pr18892.so.abi  | 71712 +++++++--------
>>  .../data/test-annotate/test17-pr19027.so.abi  | 54448 +++++------
>>  ...st18-pr19037-libvtkRenderingLIC-6.1.so.abi |    58 +-
>>  ...19-pr19023-libtcmalloc_and_profiler.so.abi | 76289 ++++++++--------
>>  ...st20-pr19025-libvtkParallelCore-6.1.so.abi |  9220 +-
>>  .../data/test-annotate/test21-pr19092.so.abi  | 12358 +--
>>  11 files changed, 118387 insertions(+), 117988 deletions(-)
>>
>> diff --git a/include/abg-fwd.h b/include/abg-fwd.h
>> index 1aab70a663de..cffa3f0d1aa3 100644
>> --- a/include/abg-fwd.h
>> +++ b/include/abg-fwd.h
>> @@ -119,8 +119,6 @@ class translation_unit;
>>  /// Convenience typedef for a shared pointer on a @ref
>>  /// translation_unit type.
>>  typedef shared_ptr<translation_unit> translation_unit_sptr;
>> -/// Convenience typedef for a vector of @ref translation_unit_sptr.
>> -typedef std::vector<translation_unit_sptr> translation_units;
>>  /// Convenience typedef for a map that associates a string to a
>>  /// translation unit.
>>  typedef unordered_map<string, translation_unit_sptr> string_tu_map_type;
>> diff --git a/include/abg-ir.h b/include/abg-ir.h
>> index fda10de5c537..83dbe02f605d 100644
>> --- a/include/abg-ir.h
>> +++ b/include/abg-ir.h
>> @@ -33,6 +33,7 @@
>>  #include <assert.h>
>>  #include <stdint.h>
>>  #include <cstdlib>
>> +#include <set>
>>  #include "abg-cxx-compat.h"
>>  #include "abg-fwd.h"
>>  #include "abg-hash.h"
>> @@ -707,6 +708,20 @@ public:
>>                                         translation_unit& tu);
>>  };//end class translation_unit
>>
>> +struct SharedTranslationUnitComparator
>> +{
>> +  bool
>> +  operator()(const translation_unit_sptr& lhs,
>> +            const translation_unit_sptr& rhs) const
>> +  {
>> +    return lhs->get_absolute_path() < rhs->get_absolute_path();
>> +  }
>> +};
>> +
>> +/// Convenience typedef for a vector of @ref translation_unit_sptr.
>
>vector / set

Here as well. Thanks for catching that!

Cheers,
Matthias

>
>> +typedef std::set<translation_unit_sptr, SharedTranslationUnitComparator>
>> +    translation_units;
>> +
>>  string
>>  translation_unit_language_to_string(translation_unit::language);
>>
>> diff --git a/src/abg-corpus.cc b/src/abg-corpus.cc
>> index 12f44fd1e4cf..e01c20f75b4d 100644
>> --- a/src/abg-corpus.cc
>> +++ b/src/abg-corpus.cc
>> @@ -560,7 +560,8 @@ corpus::add(const translation_unit_sptr tu)
>>
>>    ABG_ASSERT(tu->get_environment() == get_environment());
>>
>> -  priv_->members.push_back(tu);
>> +  ABG_ASSERT(priv_->members.insert(tu).second);
>> +
>>    if (!tu->get_absolute_path().empty())
>>      {
>>        // Update the path -> translation_unit map.
>> --
>> 2.26.2.526.g744177e7f7-goog
>>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RESEND PATCH] corpus/writer: sort emitted translation units by path name
  2020-04-30 22:58 [RESEND PATCH] corpus/writer: sort emitted translation units by path name Matthias Maennich
  2020-05-01 14:35 ` Giuliano Procida
@ 2020-05-05 11:30 ` Dodji Seketeli
  1 sibling, 0 replies; 4+ messages in thread
From: Dodji Seketeli @ 2020-05-05 11:30 UTC (permalink / raw)
  To: Matthias Maennich; +Cc: libabigail, gprocida, kernel-team, mark

Hello Matthias,

Thank you for working on this!

Matthias Maennich <maennich@google.com> a écrit:

> By sorting the corpora output, we achieve determinism for the emitted
> XML file across multiple runs of abidw.
>
> For that to happen, change the collection of translation units to a
> std::set (instead of std::vector), sorted by absolute path name.

We were using a vector so if the analyzed binary itself didn't change
i.e, if the order of the translation units described in the DWARF didn't
change, several runs of abidw on that same binary should not have
changed.  Right?  I other words, we were seeing changes in the ordering
of translation units because the binary itself (its DWARF at least)
changed.

If that is right, then I think that even with this change, we still can
see non deterministic output in the ordering of type IDs inside a given
translation unit (TU) if the order of the definition of types and
declarations inside that TU changes.

But this is just a side note to raise awareness about this matter.

This work is an improvement.  And as such I think it's perfectly worth
getting in.

I have thus applied it to master, with some little changes that I am
discussing below.

[...]

> --- a/include/abg-ir.h
> +++ b/include/abg-ir.h
> @@ -33,6 +33,7 @@
>  #include <assert.h>
>  #include <stdint.h>
>  #include <cstdlib>
> +#include <set>
>  #include "abg-cxx-compat.h"
>  #include "abg-fwd.h"
>  #include "abg-hash.h"
> @@ -707,6 +708,20 @@ public:
>  					translation_unit& tu);
>  };//end class translation_unit
>  
> +struct SharedTranslationUnitComparator
> +{

In libabigail code base, we try to not use camel case for type and
function names.  The only exception to that is for template parameters,
I believe.  So I changed this to shared_translation_unit_comp, to comply
with how the other comparator functors are named.  I have also added
some comments to that type.

> +  bool
> +  operator()(const translation_unit_sptr& lhs,
> +	     const translation_unit_sptr& rhs) const
> +  {

I have added a comment for this too.

> +    return lhs->get_absolute_path() < rhs->get_absolute_path();
> +  }
> +};
> +
> +/// Convenience typedef for a vector of @ref translation_unit_sptr.

I changed vector into set here.

> +typedef std::set<translation_unit_sptr, SharedTranslationUnitComparator>
> +    translation_units;
> +
>  string
>  translation_unit_language_to_string(translation_unit::language);

[...]

>
> 	* include/abg-fwd.h: remove translation_units fwd declaration.
> 	* include/abg-ir.h: add SharedTranslationUnitComparator and add
> 	translation_units fwd declaration as a set using said comparator.
> 	* src/abg-corpus.cc (corpus::add): do checked insert into the
> 	translation_units set (rather than vector::pushback)
> 	* tests/data/test-annotate/test13-pr18894.so.abi: Adjust test data.
> 	* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
> 	* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
> 	* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
> 	* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
> 	* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
> 	* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
> 	* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
>

I have adjusted this ChangeLog according to the changes above.

[...]



Matthias Maennich <maennich@google.com> a écrit:

> On Fri, May 01, 2020 at 03:35:06PM +0100, Giuliano Procida wrote:
>>Hi.
>>
>>I had a patch which sorted cus in read_debug_info_into_corpus but I
>>had some concens. This was ages ago.
>>
>>Anyway...
>>
>>On Thu, 30 Apr 2020 at 23:58, Matthias Maennich <maennich@google.com> wrote:
>>>
>>> By sorting the corpora output, we achieve determinism for the emitted
>>> XML file across multiple runs of abidw.
>>>
>>> For that to happen, change the collection of translation units to a
>>> std::set (instead of std::vector), sorted by absolute path name.
>>
>>Would it be excessive paranoia to use a multiset instead?
>>Or keep the vector and sort it with the comparator?
>>
>>This would eliminate an ABG_ASSERT.
>
> This assert kicks in if we see a translation unit with the same name
> twice. While there might be scenarios where the same relative name is
> valid, that should still not happen. Hence my assumption that this
> should never happen and we can assert there. Allowing multiple values
> with the same key likely introduces issues downstream when working on
> the set of translation units.

I agree.  In general, I like having asserts that express a systemic
invariant like in this case.  They are invaluable in detecting
inconsistency in the state of the system.

>>
>>> Test data needed adjustments, but changes are fully compatible.
>>>
>>>         * include/abg-fwd.h: remove translation_units fwd declaration.
>>>         * include/abg-ir.h: add SharedTranslationUnitComparator and add
>>>         translation_units fwd declaration as a set using said comparator.
>>>         * src/abg-corpus.cc (corpus::add): do checked insert into the
>>>         translation_units set (rather than vector::pushback)
>>
>>s/pushback/push_back/
>
> Dodji, could you correct this if you agree with the patch and decide to
> apply it?

I have corrected that, thanks.

[...]


[...]

>>> +++ b/include/abg-ir.h

[...]

>>> +/// Convenience typedef for a vector of @ref translation_unit_sptr.
>>
>>vector / set
>
> Here as well.

Done.

Please find below the patch that I have applied.  I am leaving out the
tests adjustment part because of its size.  Just for the record, I have
applied the patch from the mm/sort-tu-by-name branch and this is what I
have applied exactly:

           d5bdebdb corpus/writer: sort emitted translation units by path name

------------------------->8<--------------------------------------------
commit 246ca2004930b52b2d7e73f2ba9faa7bcbcd3999
Author: Matthias Maennich <maennich@google.com>
Date:   Sun Jan 12 21:05:33 2020 +0000

    corpus/writer: sort emitted translation units by path name
    
    By sorting the corpora output, we achieve determinism for the emitted
    XML file across multiple runs of abidw.
    
    For that to happen, change the collection of translation units to a
    std::set (instead of std::vector), sorted by absolute path name.
    
    Test data needed adjustments, but changes are fully compatible.
    
            * include/abg-fwd.h: remove translation_units fwd declaration.
            * include/abg-ir.h (struct shared_translation_unit_comparator):
            Define new class.
            (translation_units): Define new typedef.
            * src/abg-corpus.cc (corpus::add): do checked insert into the
            translation_units set (rather than vector::push_back)
            * tests/data/test-annotate/test13-pr18894.so.abi: Adjust test data.
            * tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
            * tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
            * tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
            * tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
            * tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
            * tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
            * tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
    
    Signed-off-by: Matthias Maennich <maennich@google.com>
    Signed-off-by: Dodji Seketeli <dodji@redhat.com>

diff --git a/include/abg-fwd.h b/include/abg-fwd.h
index 1aab70a6..cffa3f0d 100644
--- a/include/abg-fwd.h
+++ b/include/abg-fwd.h
@@ -119,8 +119,6 @@ class translation_unit;
 /// Convenience typedef for a shared pointer on a @ref
 /// translation_unit type.
 typedef shared_ptr<translation_unit> translation_unit_sptr;
-/// Convenience typedef for a vector of @ref translation_unit_sptr.
-typedef std::vector<translation_unit_sptr> translation_units;
 /// Convenience typedef for a map that associates a string to a
 /// translation unit.
 typedef unordered_map<string, translation_unit_sptr> string_tu_map_type;
diff --git a/include/abg-ir.h b/include/abg-ir.h
index fda10de5..4b9626df 100644
--- a/include/abg-ir.h
+++ b/include/abg-ir.h
@@ -33,6 +33,7 @@
 #include <assert.h>
 #include <stdint.h>
 #include <cstdlib>
+#include <set>
 #include "abg-cxx-compat.h"
 #include "abg-fwd.h"
 #include "abg-hash.h"
@@ -707,6 +708,28 @@ public:
 					translation_unit& tu);
 };//end class translation_unit
 
+/// A comparison functor to compare translation units based on their
+/// absolute paths.
+struct shared_translation_unit_comp
+{
+  /// Compare two translations units based on their absolute paths.
+  ///
+  /// @param lhs the first translation unit to consider for the
+  /// comparison.
+  ///
+  /// @param rhs the second translatin unit to consider for the
+  /// comparison.
+  bool
+  operator()(const translation_unit_sptr& lhs,
+	     const translation_unit_sptr& rhs) const
+  {return lhs->get_absolute_path() < rhs->get_absolute_path();}
+}; // end struct shared_translation_unit_comp
+
+/// Convenience typedef for an ordered set of @ref
+/// translation_unit_sptr.
+typedef std::set<translation_unit_sptr,
+		 shared_translation_unit_comp> translation_units;
+
 string
 translation_unit_language_to_string(translation_unit::language);
 
diff --git a/src/abg-corpus.cc b/src/abg-corpus.cc
index 2e9f304d..1ca46086 100644
--- a/src/abg-corpus.cc
+++ b/src/abg-corpus.cc
@@ -560,7 +560,8 @@ corpus::add(const translation_unit_sptr tu)
 
   ABG_ASSERT(tu->get_environment() == get_environment());
 
-  priv_->members.push_back(tu);
+  ABG_ASSERT(priv_->members.insert(tu).second);
+
   if (!tu->get_absolute_path().empty())
     {
       // Update the path -> translation_unit map.
------------------------->8<--------------------------------------------

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-05 11:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-30 22:58 [RESEND PATCH] corpus/writer: sort emitted translation units by path name Matthias Maennich
2020-05-01 14:35 ` Giuliano Procida
2020-05-04 13:25   ` Matthias Maennich
2020-05-05 11:30 ` Dodji Seketeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).