public inbox for libabigail@sourceware.org
From: Giuliano Procida <gprocida@google.com>
To: Ben Woodard <woodard@redhat.com>
Cc: Dodji Seketeli <dodji@seketeli.org>,
	 "Guillermo E. Martinez via Libabigail"
	<libabigail@sourceware.org>
Subject: Re: [PATCH v2] Add regression tests for ctf reading
Date: Thu, 25 Nov 2021 09:34:05 +0000
Message-ID: <CAGvU0Hm62UivLHe+7fnA97powu+D+qM8BmYo6W06Hos1YZ7Hyg@mail.gmail.com>
In-Reply-To: <C42EAD6E-B2E3-4373-80AE-D665D6B94C43@redhat.com>

Hi Ben.

On Wed, 24 Nov 2021 at 19:10, Ben Woodard via Libabigail
<libabigail@sourceware.org> wrote:
>
>
>
> > On Nov 24, 2021, at 8:36 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
> >
> > Hello,
> >
> > [...]
> >
> > Thanks for working on this.  Nice patch, by the way!  I like its
> > direction.
> >
> > I have a few comments and I believe that when they are addressed, we'll
> > be able to apply the patch.
> >
> > [...]
> >
> >> Dependencies/limitations:
> >>
> >> * It was developed on top of the following patch:
> >> https://sourceware.org/pipermail/libabigail/2021q4/003853.html
> >>
> >> * Some CTF tests were *disabled* because the generated XML ABI
> >> corpus has the *same information* but the XML nodes *are not* always
> >> in the *same order*, so comparing with the diff command fails. Details here:
> >> https://sourceware.org/pipermail/libabigail/2021q4/003824.html
> >
> > In those cases where the abixml file generated from CTF is different
> > from the one generated from DWARF, I think we should have two
> > different reference abixml files to diff against.  No CTF test should
> > be disabled, I think.
>
> Here is a somewhat deeper question that I think needs to be considered. I’ve generally referred to it as “DWARF Idioms” but this email makes me think that it is even larger than that.
>
> For my work, I need libabigail to generate an abstract notion of the ABI corpus. How it constructs that abstract notion of the ABI needs to be independent of the producer. Think of it this way: say we take the same compiler and have it compile the same library, producing both CTF and DWARF. The ABI of the library doesn’t change. Since it is literally the same object, the program text is the same. Therefore the ABI is unquestionably the same. Any difference reported by libabigail is therefore a problem with libabigail: it is not abstracting the source material far enough into ABI artifacts to separate those artifacts from the implementation.
>
> So I kind of believe that we need to look more deeply into WHY the CTF and DWARF are not comparing as equivalent, begin the process of filing compiler bugs when we need to, and do what is necessary to abstract the ABI from the source material that libabigail used to construct its IR of the ABI corpus.
>
> So, I must say that I disagree with both Dodji’s approach here and, to a lesser extent, Guillermo’s approach of disabling the tests. I think that the tests where the CTF doesn’t match the DWARF should be investigated and, when necessary, marked “xfail” with a note citing their individual cause.
>
> I think that what we need to work towards is:
> - abidw produces the same output (or, more precisely, libabigail produces the same IR) whether you compile with -gdwarf-3, -gdwarf-4, -gdwarf-5, -gsplit-dwarf or -gctf.
> - For the most part, the ABI should not change between compiler versions. There may be a few cases that we need to look into where the compiler actually breaks ABI, and of course libabigail should flag those, but I would assert that libabigail needs to abstract its IR of the ABI enough that compiler version changes that don’t actually change the ABI of the ELF object are not reported as ABI breaks. It currently does pretty well at this.
> - Then, once that foundation is built, being able to abstract the ABI IR enough that differences in toolchains, e.g. LLVM vs GCC, are not flagged as changes in the object’s ABI. This is important to provide people with a tool that will allow them to mix toolchains within a project to achieve optimal code.
>

This almost sounds like a hierarchy of needs:

ABI tools produce identical output for identical inputs no matter
which platform they run on or which compiler / library was used to
build them.
- we have seen libabigail hash table trouble when built with Clang
- we don't support BTF byte sex conversion (the format is I think target-endian)
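
To make the byte-order point concrete, here is a minimal sketch of how
a reader could tell whether a BTF blob needs swapping. It relies only
on the BTF magic number (0xeB9F, from the kernel's linux/btf.h). The
helper names (swap16, swap32, detect_btf_byte_order) are hypothetical
and are not taken from libabigail or any other existing tool.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// BTF_MAGIC as defined in the Linux kernel's uapi header linux/btf.h.
constexpr std::uint16_t btf_magic = 0xeB9F;

// Hypothetical result type: how the blob's byte order relates to the host's.
enum class btf_byte_order { native, swapped, unknown };

// Swap the two bytes of a 16-bit value.
constexpr std::uint16_t swap16(std::uint16_t v)
{
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Reverse the four bytes of a 32-bit value.
constexpr std::uint32_t swap32(std::uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
       | ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

// Hypothetical helper: read the leading 16-bit magic in host byte order
// and report whether the remaining fields would need byte swapping.
inline btf_byte_order
detect_btf_byte_order(const unsigned char* data, std::size_t size)
{
  if (size < sizeof(std::uint16_t))
    return btf_byte_order::unknown;

  std::uint16_t magic;
  std::memcpy(&magic, data, sizeof(magic)); // avoids unaligned access

  if (magic == btf_magic)
    return btf_byte_order::native;
  if (swap16(magic) == btf_magic)
    return btf_byte_order::swapped;
  return btf_byte_order::unknown;
}
```

A reader that gets "swapped" back would then have to pass every
multi-byte header and type field through swap16 / swap32 before
interpreting it.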

each platform / compiler / optimisation / LTO / DWARF combination
generates object code that appears ABI-stable (varying just the ABI
tooling)
...
each platform's object code is ABI-equivalent (varying all the other things)

And then do the same for C++, Rust etc.

So far I've seen apparent ABI issues due to all the above things,
except possibly -Olevel, and seen bugs fixed in LLVM, elfutils,
dwarves and libabigail.

It would be useful to prioritise your needs. Which stability axes
bring most value? At a rough guess, we'd want varying optimisation
levels and LTO to be the least likely things to cause ABI differences,
maybe then DWARF version, followed by compiler then standard library?

It would also be useful to try to limit the combinations that will be
supported (nothing too old or obscure, for example). Is anyone still
using DWARF 3?
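
As a rough illustration of how quickly the combinations multiply, here
is a small sketch that enumerates the flag sets for a few stability
axes. The axis struct, the function names and the particular choice of
axes and flags are assumptions made for the example, not an agreed
support matrix.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one axis of build variation.
struct axis
{
  std::string name;                 // label only, e.g. "lto"
  std::vector<std::string> flags;   // alternatives; "" means "no flag"
};

// Join two flag strings with a space, skipping empty parts.
inline std::string
join_flags(const std::string& a, const std::string& b)
{
  if (a.empty()) return b;
  if (b.empty()) return a;
  return a + " " + b;
}

// Build the Cartesian product of all axes: one flag string per combination.
inline std::vector<std::string>
enumerate_flag_sets(const std::vector<axis>& axes)
{
  std::vector<std::string> result{""};
  for (const axis& a : axes)
    {
      std::vector<std::string> next;
      for (const std::string& prefix : result)
        for (const std::string& f : a.flags)
          next.push_back(join_flags(prefix, f));
      result.swap(next);
    }
  return result;
}

// Example axes, chosen for illustration only.
inline std::vector<axis>
example_axes()
{
  return {
    {"optimisation", {"-O0", "-O2"}},
    {"lto",          {"", "-flto"}},
    {"debug-format", {"-gdwarf-4", "-gdwarf-5", "-gctf"}},
  };
}
```

Even these three small axes give 2 * 2 * 3 = 12 builds of the same
source, each of which would need its ABI representation compared with
the others, which is why limiting the supported combinations matters.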

Giuliano.

> -ben
>
> >
> > So:
> >
> > This:
> >
> >   {
> >     "data/test-read-dwarf/test0",
> >     "",
> >     "",
> >     SEQUENCE_TYPE_ID_STYLE,
> >     "data/test-read-dwarf/test0.abi",
> >     "output/test-read-dwarf/test0.abi"
> >   },
> >
> > would be changed into:
> >
> >   {
> >     "data/test-read-common/test0",
> >     "",
> >     "",
> >     SEQUENCE_TYPE_ID_STYLE,
> >     "data/test-read-dwarf/test0.abi",
> >     "output/test-read-dwarf/test0.abi"
> >   },
> >
> > For the DWARF test entry in test-read-dwarf.cc, and it would be
> > changed into:
> >
> >   {
> >     "data/test-read-common/test0",
> >     "",
> >     "",
> >     SEQUENCE_TYPE_ID_STYLE,
> >     "data/test-read-ctf/test0.abi",
> >     "output/test-read-ctf/test0.abi"
> >   },
> >
> > for the CTF test entry in test-read-ctf.cc.
> >
> > By the way, I am seeing entries like this in test-read-ctf.cc:
> >
> >> +    "data/test-read-common/test3.so",
> >> +    "",
> >> +    "",
> >> +    SEQUENCE_TYPE_ID_STYLE,
> >> +    "data/test-read-common/test3-ctf.so.abi",
> >> +    "output/test-read-common/test3-ctf.so.abi"
> >
> > Here this entry does exactly what I am suggesting, even if
> > test3-ctf.so.abi is stored in data/test-read-common.
> >
> > So where exactly is the CTF test disabled?
> >
> > [...]
> >
> >
> >> diff --git a/tests/test-read-common.cc b/tests/test-read-common.cc
> >
> > [...]
> >
> >> +
> >> +namespace abigail
> >> +{
> >> +namespace tests
> >> +{
> >> +namespace read_common
> >> +{
> >
> > Because this file now contains an API definition that is to be used by
> > some tests, every single function of the file should be documented
> > using doxygen comments, just like we have in src/abg-*.cc files.
> >
> > For instance:
> >
> >> +
> >> +test_task::test_task(const InOutSpec &s,
> >> +                     string& a_out_abi_base,
> >> +                     string& a_in_elf_base,
> >> +                     string& a_in_abi_base)
> >
> > This function should be fully doxygen-documented.
> >
> >> +    : is_ok(true),
> >> +      spec(s),
> >> +      out_abi_base(a_out_abi_base),
> >> +      in_elf_base(a_in_elf_base),
> >> +      in_abi_base(a_in_abi_base)
> >> +  {}
> >> +
> >> +bool
> >> +test_task::serialize_corpus(const string& out_abi_path,
> >> +                            corpus_sptr corp)
> >
> > Likewise.
> >
> >> +{
> >> +  ofstream of(out_abi_path.c_str(), std::ios_base::trunc);
> >> +  if (!of.is_open())
> >> +    {
> >> +       error_message = string("failed to open for writing ") + out_abi_path + "\n";
> >> +       return false;
> >> +    }
> >> +
> >> +  write_context_sptr write_ctxt
> >> +      = create_write_context(corp->get_environment(), of);
> >> +  set_type_id_style(*write_ctxt, spec.type_id_style);
> >> +  is_ok = write_corpus(*write_ctxt, corp, /*indent=*/0);
> >> +  of.close();
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +bool
> >> +test_task::run_abidw(const string& extargs)
> >
> > Likewise.
> >
> >> +{
> >> +  string abidw = string(get_build_dir()) + "/tools/abidw";
> >> +  string drop_private_types;
> >> +  if (!in_public_headers_path.empty())
> >> +    drop_private_types += "--headers-dir " + in_public_headers_path +
> >> +      " --drop-private-types";
> >> +  string cmd = abidw + " " + drop_private_types + " --abidiff " + extargs +
> >> +   in_elf_path;
> >> +  if (system(cmd.c_str()))
> >> +    {
> >> +      error_message = string("ABIs differ:\n")
> >> +        + in_elf_path
> >> +        + "\nand:\n"
> >> +        + out_abi_path
> >> +        + "\n";
> >> +
> >> +      return false;
> >> +    }
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +bool
> >> +test_task::run_diff()
> >
> > Likewise.
> >
> >> +{
> >> +  set_in_abi_path();
> >> +  string cmd = "diff -u " + in_abi_path + " " + out_abi_path;
> >> +  if (system(cmd.c_str()))
> >> +    {
> >> +      error_message = string("ABIs differ:\n")
> >> +        + in_abi_path
> >> +        + "\nand:\n"
> >> +        + out_abi_path
> >> +        + "\n";
> >> +
> >> +      return false;
> >> +    }
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +void
> >> +display_usage(const string& prog_name, ostream& out)
> >
> > Likewise.
> >
> >> +{
> >> +  emit_prefix(prog_name, out)
> >> +    << "usage: " << prog_name << " [options]\n"
> >> +    << " where options can be: \n"
> >> +    << "  --help|-h  display this message\n"
> >> +    << "  --no-parallel execute testsuite in a single thread\n"
> >> +  ;
> >> +}
> >> +
> >> +bool
> >> +parse_command_line(int argc, char* argv[], options& opts)
> >
> > Likewise.
> >
> >> +{
> >> +  for (int i = 1; i < argc; ++i)
> >> +    {
> >> +      if (!strcmp(argv[i], "--no-parallel"))
> >> +        opts.parallel = false;
> >> +      else if (!strcmp(argv[i], "--help")
> >> +               || !strcmp(argv[i], "-h"))
> >> +        return false;
> >> +      else
> >> +        {
> >> +          if (strlen(argv[i]) >= 2 && argv[i][0] == '-' && argv[i][1] == '-')
> >> +            opts.wrong_option = argv[i];
> >> +          return false;
> >> +        }
> >> +    }
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +bool
> >> +run_tests(const size_t num_tests, const InOutSpec* specs,
> >> +          const options& opts, create_new_test new_test)
> >
> > Likewise.
> >
> >> +{
> >
> > [...]
> >
> >
> >> diff --git a/tests/test-read-common.h b/tests/test-read-common.h
> >
> > [...]
> >
> >> +/// This is an aggregate that specifies where a test shall get its
> >> +/// input from, and where it shall write its output to.
> >> +struct InOutSpec
> >> +{
> >> +  const char* in_elf_path;
> >> +  const char* in_suppr_spec_path;
> >> +  const char* in_public_headers_path;
> >> +  type_id_style_kind type_id_style;
> >> +  const char* in_abi_path;
> >> +  const char* out_abi_path;
> >> +};// end struct InOutSpec
> >> +
> >> +/// The task that performs the tests.
> >> +struct test_task : public abigail::workers::task
> >> +{
> >> +  bool is_ok;
> >> +  InOutSpec spec;
> >> +  string error_message;
> >> +  string out_abi_base;
> >> +  string in_elf_base;
> >> +  string in_abi_base;
> >> +
> >> +  string in_elf_path;
> >> +  string in_abi_path;
> >> +  string in_suppr_spec_path;
> >> +  string in_public_headers_path;
> >> +  string out_abi_path;
> >> +
> >> +  void
> >> +  set_in_elf_path()
> >
> > Please doxygen-document this function.
> >
> >> +  {
> >> +    in_elf_path = in_elf_base + spec.in_elf_path;
> >> +  }
> >> +
> >> +  void
> >> +  set_in_suppr_spec_path()
> >
> > Likewise.
> >
> >> +  {
> >> +    if (spec.in_suppr_spec_path)
> >> +      in_suppr_spec_path = in_elf_base + spec.in_suppr_spec_path;
> >> +    else
> >> +      in_suppr_spec_path.clear();
> >> +  }
> >> +
> >> +  void
> >> +  set_in_public_headers_path()
> >
> > Likewise.
> >
> >> +  {
> >> +    if (spec.in_public_headers_path)
> >> +      in_public_headers_path = spec.in_public_headers_path;
> >> +    if (!in_public_headers_path.empty())
> >> +      in_public_headers_path = in_elf_base + spec.in_public_headers_path;
> >> +  }
> >> +
> >> +  bool
> >> +  set_out_abi_path()
> >
> > Likewise.
> >
> >> +  {
> >> +    out_abi_path = out_abi_base + spec.out_abi_path;
> >> +    if (!abigail::tools_utils::ensure_parent_dir_created(out_abi_path))
> >> +      {
> >> +          error_message =
> >> +            string("Could not create parent directory for ") + out_abi_path;
> >> +          return false;
> >> +      }
> >> +    return true;
> >> +  }
> >> +
> >> +  void
> >> +  set_in_abi_path()
> >
> > Likewise.
> >
> >> +  {
> >> +    in_abi_path = in_abi_base + spec.in_abi_path;
> >> +  }
> >> +
> >
> > [...]
> >
> >> +  test_task(const InOutSpec &s,
> >> +            string& a_out_abi_base,
> >> +            string& a_in_elf_base,
> >> +            string& a_in_abi_base);
> >> +  bool
> >> +  serialize_corpus(const string& out_abi_path,
> >> +                   corpus_sptr corp);
> >> +  bool
> >> +  run_abidw(const string& extargs = "");
> >> +
> >> +  bool
> >> +  run_diff();
> >> +
> >> +  virtual
> >> +  ~test_task()
> >> +  {}
> >> +
> >> +}; // end struct test_task
> >> +
> >> +typedef shared_ptr<test_task> test_task_sptr;
> >> +
> >> +struct options
> >> +{
> >
> > Please doxygen-document this struct.
> >
> >> +  string        wrong_option;
> >> +  bool          parallel;
> >> +
> >> +  options()
> >> +    : parallel(true)
> >> +  {}
> >> +
> >> +  ~options()
> >> +  {
> >> +  }
> >> +};
> >
> > [...]
> >
> >> diff --git a/tests/test-read-ctf.cc b/tests/test-read-ctf.cc
> >
> > [...]
> >
> >
> >> +test_task_ctf::test_task_ctf(const InOutSpec &s,
> >> +                             string& a_out_abi_base,
> >> +                             string& a_in_elf_base,
> >> +                             string& a_in_abi_base)
> >
> > Please doxygen-document this function.
> >
> >> +        : test_task(s, a_out_abi_base, a_in_elf_base, a_in_abi_base)
> >> +  {}
> >> +
> >
> > [...]
> >
> >> +static test_task*
> >> +new_task(const InOutSpec* s, string& a_out_abi_base,
> >> +         string& a_in_elf_base, string& a_in_abi_base)
> >
> > Please doxygen-document this function.
> >
> >> +{
> >> +  return new test_task_ctf(*s, a_in_abi_base,
> >
> > This 'a_in_abi_base' should be a_out_abi_base.
> >
> >> +                           a_in_elf_base, a_in_abi_base);
> >> +}
> >
> > [...]
> >
> >
> >> diff --git a/tests/test-read-dwarf.cc b/tests/test-read-dwarf.cc
> >
> > [...]
> >
> >
> >> +static test_task*
> >> +new_task(const InOutSpec* s, string& a_out_abi_base,
> >> +         string& a_in_elf_base, string& a_in_abi_base)
> >
> > Please doxygen-document this function.
> >
> >> +{
> >> +  return new test_task_dwarf(*s, a_in_abi_base,
> >
> > This 'a_in_abi_base' should be a_out_abi_base.
> >
> >> +                             a_in_elf_base, a_in_abi_base);
> >> +}
> >>
> >> int
> >> main(int argc, char *argv[])
> >> {
> >>   bool no_parallel = false;
> >
> > This variable is now unused.
> >
> >
> > Thanks for working on this.  It's really appreciated!
> >
> > [...]
> >
> > Cheers,
> >
> > --
> >               Dodji
> >
>
