From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ua1-x932.google.com (mail-ua1-x932.google.com [IPv6:2607:f8b0:4864:20::932]) by sourceware.org (Postfix) with ESMTPS id 19A9E3858403 for ; Thu, 25 Nov 2021 09:34:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 19A9E3858403 Received: by mail-ua1-x932.google.com with SMTP id n6so11058135uak.1 for ; Thu, 25 Nov 2021 01:34:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=mw02VeYCU/HNpi8XtSajVP3az2rqi+2Kkn2fYZR/OsI=; b=nhnz5VpfDdlA2PbIkYPwugqpTpFaJav7Qf/h+wegYtOiKsO84uD1B/MAWPMra+Oi3+ enoJnsvWQTLcf1Z5sNzEQxT/2VxtPxf0u/XyM73nFcJLmPEy6wHjTEUQHN5h2pDt/VZG KH3Padd9j2sklzwNeZgOcYq39xzYLhQnWwRRgEDk5xdizsndBu5NcSrca1UxMIycnguG ZV4JRGfw9jlO1SAMd0+k9hQcNtfGEOYoR0LavvFVYgwhJnyyehTKIhpfWv0+Jgl/tJfR MqPp6HTXB4psXxo8KWOmXAjtP3tvUX+lsoQvbXAEvl9Fus9de80AlNvHFgGm/hvnbK4I IcnQ== X-Gm-Message-State: AOAM5304thjS0hRLOdaepRkW8GKEFgXOuMP2d0ao+swA3MPvftswBVt1 wkUjMFshHWQtYWyRfEaNh/g6IbEVg9U6FErAW66ntg== X-Google-Smtp-Source: ABdhPJx9KJkse9JSQ4OzC1UCTaBjeQ0uKhiTtcTe28tVJMx0QdaUGLy2AIWePUSTEd5rN0yzyZcfLu41zQi3yeFOl1M= X-Received: by 2002:ab0:4465:: with SMTP id m92mr22991114uam.47.1637832882322; Thu, 25 Nov 2021 01:34:42 -0800 (PST) MIME-Version: 1.0 References: <20211118041625.622972-1-guillermo.e.martinez@oracle.com> <20211122213353.2456208-1-guillermo.e.martinez@oracle.com> <87ee75tsst.fsf@seketeli.org> In-Reply-To: From: Giuliano Procida Date: Thu, 25 Nov 2021 09:34:05 +0000 Message-ID: Subject: Re: [PATCH v2] Add regression tests for ctf reading To: Ben Woodard Cc: Dodji Seketeli , "Guillermo E. Martinez via Libabigail" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-27.0 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Nov 2021 09:34:46 -0000 Hi Ben. On Wed, 24 Nov 2021 at 19:10, Ben Woodard via Libabigail wrote: > > > > > On Nov 24, 2021, at 8:36 AM, Dodji Seketeli wrote: > > > > Hello, > > > > [...] > > > > Thanks for working on this. Nice patch, by the way! I like its > > direction. > > > > I have a few comments and I believe that when they are addressed, we'll > > be able to apply the patch. > > > > [...] > > > >> Dependencies/limitations: > >> > >> * It was worked on the top of the following patches: > >> https://sourceware.org/pipermail/libabigail/2021q4/003853.html > >> > >> * Some CTF tests were *disabled* because it generates the XML ABI > >> corpus with *same information* but the XML nodes *are not* always > >> in the *same order*, so using diff command fails. Details here: > >> https://sourceware.org/pipermail/libabigail/2021q4/003824.html > > > > In those cases where the abixml file generated from CTF is different > > from the one generated from DWARF, I think we should have two > > different reference abixml files to diff against. No CTF test should > > be disabled, I think. > > Here is a somewhat deeper question that I think needs to be considered. I= =E2=80=99ve generally referred to it as =E2=80=9CDWARF Idioms=E2=80=9D but = this email makes me think that it is even larger than that. > > For my work, I need libabigail to generate an abstract notion of the ABI = corpus. How it constructs that abstract notion of the ABI needs to be indep= endent of the producer. Think of it this way, say we take the same compiler= and have it compile the same library producing both CTF and DWARF, the ABI= of the library doesn=E2=80=99t change. Since it is literally the same obje= ct, the program text is same. Therefore the ABI is unquestionably the the s= ame. Any difference reported by libabigail therefore is a problem with liba= bigail. It is not taking the source material and abstracting it enough into= the ABI artifacts to separate the artifacts from the implementation. > > So I kind of believe that we need to look more deeply into WHY the CTF an= d DWARF are not comparing as equivalent and begin the process of filing the= compiler bugs when we need to, and doing what is necessary to abstract the= ABI from the source material that libabigail used to construct its IR of t= he ABI corpus from. > > So, I must say that I disagree with both dodji=E2=80=99s approach here an= d to a lesser extent Guillermo=E2=80=99s approach of disabling the tests. I= think that the tests where the CTF doesn=E2=80=99t match the DWARF should = be investigated and when necessary marked =E2=80=9Cxfail=E2=80=9D with a no= te citing their individual cause. > > I think that what we need to work towards is: > abidw produces the same output (or more precisely libabigail produces the= same IR) whether you compile with -gdwarf-3 -gdwarf-4 -gdwarf-5 -gsplit-dw= arf -gctf > also for the most part the ABI should not change between compiler version= s. There may be a few cases that we need to look into where the compiler ac= tually breaks ABI and of course libabigail should flag those but I would as= sert that libabigail needs to abstract its IR of the ABI enough that compil= er version changes that don=E2=80=99t actually change the ABI of the ELF ob= ject are not reported as ABI breaks. It currently does pretty well at this = at the moment. > Then once that foundation is built, being able to abstract the ABI IR eno= ugh that differences in toolchains e.g. LLVM vs GCC are not flagged as chan= ges in the object=E2=80=99s ABI. This is important to provide people with a= tool that will allow them to mix toolchains within a project to achieve op= timal code. > This almost sounds like a hierarchy of needs: ABI tools produce identical output for identical inputs no matter which platform they run or which compiler / library was used to build them. - we have seen libabigail hash table trouble when built with Clang - we don't support BTF byte sex conversion (the format is I think target-en= dian) each platform / compiler / optimisation / LTO / DWARF combination generates object code that appears ABI-stable (varying just the ABI tooling) ... each platform's object code is ABI-equivalent (varying all the other things= ) And then do the same for C++, Rust etc. So far I've seen apparent ABI issues due to all the above things, except possibly -Olevel, and seen bugs fixed in LLVM, elfutils, dwarves and libabigail. It would be useful to prioritise your needs. Which stability axes bring most value? At a rough guess, we'd want varying optimisation levels and LTO to be least likely things to cause ABI differences, maybe then DWARF version, followed by compiler then standard library? It would also be useful to try to limit the combinations that will be supported (nothing too old or obscure, for example). Is anyone still using DWARF 3? Giuliano. > -ben > > > > > So: > > > > This: > > > > { > > "data/test-read-dwarf/test0", > > "", > > "", > > SEQUENCE_TYPE_ID_STYLE, > > "data/test-read-dwarf/test0.abi", > > "output/test-read-dwarf/test0.abi" > > }, > > > > would be changed into: > > > > { > > "data/test-read-common/test0", > > "", > > "", > > SEQUENCE_TYPE_ID_STYLE, > > "data/test-read-dwarf/test0.abi", > > "output/test-read-dwarf/test0.abi" > > }, > > > > For the DWARF test entry in test-read-dwarf.cc, and it would be > > changed into: > > > > { > > "data/test-read-common/test0", > > "", > > "", > > SEQUENCE_TYPE_ID_STYLE, > > "data/test-read-ctf/test0.abi", > > "output/test-read-ctf/test0.abi" > > }, > > > > for the CTF test netry in test-read-ctf.cc. > > > > By the way, I am seeing entries like this in test-read-ctf.cc: > > > >> + "data/test-read-common/test3.so", > >> + "", > >> + "", > >> + SEQUENCE_TYPE_ID_STYLE, > >> + "data/test-read-common/test3-ctf.so.abi", > >> + "output/test-read-common/test3-ctf.so.abi" > > > > Here this entry does exactly what I am suggesting, even if > > test3-ctf.so.abi is stored in data/test-read-common. > > > > So where exactly is the CTF test disabled? > > > > [...] > > > > > >> diff --git a/tests/test-read-common.cc b/tests/test-read-common.cc > > > > [...] > > > >> + > >> +namespace abigail > >> +{ > >> +namespace tests > >> +{ > >> +namespace read_common > >> +{ > > > > Because this file now contains an API definition that is to be used be > > some tests, every single function of the file should be documented > > using doxygen comments, just like we have in src/abg-*.cc files. > > > > For instance: > > > >> + > >> +test_task::test_task(const InOutSpec &s, > >> + string& a_out_abi_base, > >> + string& a_in_elf_base, > >> + string& a_in_abi_base) > > > > This function should be fully doxygen-documented. > > > >> + : is_ok(true), > >> + spec(s), > >> + out_abi_base(a_out_abi_base), > >> + in_elf_base(a_in_elf_base), > >> + in_abi_base(a_in_abi_base) > >> + {} > >> + > >> +bool > >> +test_task::serialize_corpus(const string& out_abi_path, > >> + corpus_sptr corp) > > > > Likewise. > > > >> +{ > >> + ofstream of(out_abi_path.c_str(), std::ios_base::trunc); > >> + if (!of.is_open()) > >> + { > >> + error_message =3D string("failed to read ") + out_abi_path + "= \n"; > >> + return false; > >> + } > >> + > >> + write_context_sptr write_ctxt > >> + =3D create_write_context(corp->get_environment(), of); > >> + set_type_id_style(*write_ctxt, spec.type_id_style); > >> + is_ok =3D write_corpus(*write_ctxt, corp, /*indent=3D*/0); > >> + of.close(); > >> + > >> + return true; > >> +} > >> + > >> +bool > >> +test_task::run_abidw(const string& extargs) > > > > Likewise. > > > >> +{ > >> + string abidw =3D string(get_build_dir()) + "/tools/abidw"; > >> + string drop_private_types; > >> + if (!in_public_headers_path.empty()) > >> + drop_private_types +=3D "--headers-dir " + in_public_headers_path= + > >> + " --drop-private-types"; > >> + string cmd =3D abidw + " " + drop_private_types + " --abidiff " + e= xtargs + > >> + in_elf_path; > >> + if (system(cmd.c_str())) > >> + { > >> + error_message =3D string("ABIs differ:\n") > >> + + in_elf_path > >> + + "\nand:\n" > >> + + out_abi_path > >> + + "\n"; > >> + > >> + return false; > >> + } > >> + > >> + return true; > >> +} > >> + > >> +bool > >> +test_task::run_diff() > > > > Likewise. > > > >> +{ > >> + set_in_abi_path(); > >> + string cmd =3D "diff -u " + in_abi_path + " " + out_abi_path; > >> + if (system(cmd.c_str())) > >> + { > >> + error_message =3D string("ABIs differ:\n") > >> + + in_abi_path > >> + + "\nand:\n" > >> + + out_abi_path > >> + + "\n"; > >> + > >> + return false; > >> + } > >> + > >> + return true; > >> +} > >> + > >> +void > >> +display_usage(const string& prog_name, ostream& out) > > > > Likewise. > > > >> +{ > >> + emit_prefix(prog_name, out) > >> + << "usage: " << prog_name << " [options]\n" > >> + << " where options can be: \n" > >> + << " --help|-h display this message\n" > >> + << " --no-parallel execute testsuite is a sigle thread\n" > >> + ; > >> +} > >> + > >> +bool > >> +parse_command_line(int argc, char* argv[], options& opts) > > > > Likewise. > > > >> +{ > >> + for (int i =3D 1; i < argc; ++i) > >> + { > >> + if (!strcmp(argv[i], "--no-parallel")) > >> + opts.parallel =3D false; > >> + else if (!strcmp(argv[i], "--help") > >> + || !strcmp(argv[i], "--h")) > >> + return false; > >> + else > >> + { > >> + if (strlen(argv[i]) >=3D 2 && argv[i][0] =3D=3D '-' && argv= [i][1] =3D=3D '-') > >> + opts.wrong_option =3D argv[i]; > >> + return false; > >> + } > >> + } > >> + > >> + return true; > >> +} > >> + > >> +bool > >> +run_tests(const size_t num_tests, const InOutSpec* specs, > >> + const options& opts, create_new_test new_test) > > > > Likewise. > > > >> +{ > > > > [...] > > > > > >> diff --git a/tests/test-read-common.h b/tests/test-read-common.h > > > > [...] > > > >> +/// This is an aggregate that specifies where a test shall get its > >> +/// input from, and where it shall write its ouput to. > >> +struct InOutSpec > >> +{ > >> + const char* in_elf_path; > >> + const char* in_suppr_spec_path; > >> + const char* in_public_headers_path; > >> + type_id_style_kind type_id_style; > >> + const char* in_abi_path; > >> + const char* out_abi_path; > >> +};// end struct InOutSpec > >> + > >> +/// The task that peforms the tests. > >> +struct test_task : public abigail::workers::task > >> +{ > >> + bool is_ok; > >> + InOutSpec spec; > >> + string error_message; > >> + string out_abi_base; > >> + string in_elf_base; > >> + string in_abi_base; > >> + > >> + string in_elf_path; > >> + string in_abi_path; > >> + string in_suppr_spec_path; > >> + string in_public_headers_path; > >> + string out_abi_path; > >> + > >> + void > >> + set_in_elf_path() > > > > Please doxygen-document this function. > > > >> + { > >> + in_elf_path =3D in_elf_base + spec.in_elf_path; > >> + } > >> + > >> + void > >> + set_in_suppr_spec_path() > > > > Likewise. > > > >> + { > >> + if (spec.in_suppr_spec_path) > >> + in_suppr_spec_path =3D in_elf_base + spec.in_suppr_spec_path; > >> + else > >> + in_suppr_spec_path.clear(); > >> + } > >> + > >> + void > >> + set_in_public_headers_path() > > > > Likewise. > > > >> + { > >> + if (spec.in_public_headers_path) > >> + in_public_headers_path =3D spec.in_public_headers_path; > >> + if (!in_public_headers_path.empty()) > >> + in_public_headers_path =3D in_elf_base + spec.in_public_headers= _path; > >> + } > >> + > >> + bool > >> + set_out_abi_path() > > > > Likewise. > > > >> + { > >> + out_abi_path =3D out_abi_base + spec.out_abi_path; > >> + if (!abigail::tools_utils::ensure_parent_dir_created(out_abi_path= )) > >> + { > >> + error_message =3D > >> + string("Could not create parent directory for ") + out_ab= i_path; > >> + return false; > >> + } > >> + return true; > >> + } > >> + > >> + void > >> + set_in_abi_path() > > > > Likewise. > > > >> + { > >> + in_abi_path =3D in_abi_base + spec.in_abi_path; > >> + } > >> + > > > > [...] > > > >> + test_task(const InOutSpec &s, > >> + string& a_out_abi_base, > >> + string& a_in_elf_base, > >> + string& a_in_abi_base); > >> + bool > >> + serialize_corpus(const string& out_abi_path, > >> + corpus_sptr corp); > >> + bool > >> + run_abidw(const string& extargs =3D ""); > >> + > >> + bool > >> + run_diff(); > >> + > >> + virtual > >> + ~test_task() > >> + {} > >> + > >> +}; // end struct test_task > >> + > >> +typedef shared_ptr test_task_sptr; > >> + > >> +struct options > >> +{ > > > > Please doxygen-document this struct. > > > >> + string wrong_option; > >> + bool parallel; > >> + > >> + options() > >> + : parallel(true) > >> + {} > >> + > >> + ~options() > >> + { > >> + } > >> +}; > > > > [...] > > > >> diff --git a/tests/test-read-ctf.cc b/tests/test-read-ctf.cc > > > > [...] > > > > > >> +test_task_ctf::test_task_ctf(const InOutSpec &s, > >> + string& a_out_abi_base, > >> + string& a_in_elf_base, > >> + string& a_in_abi_base) > > > > Please doxygen-document this function. > > > >> + : test_task(s, a_out_abi_base, a_in_elf_base, a_in_abi_base) > >> + {} > >> + > > > > [...] > > > >> +static test_task* > >> +new_task(const InOutSpec* s, string& a_out_abi_base, > >> + string& a_in_elf_base, string& a_in_abi_base) > > > > Please doxygen-document this function. > > > >> +{ > >> + return new test_task_ctf(*s, a_in_abi_base, > > > > This 'a_in_abi_base' should be a_out_abi_base. > > > >> + a_in_elf_base, a_in_abi_base); > >> +} > > > > [...] > > > > > >> diff --git a/tests/test-read-dwarf.cc b/tests/test-read-dwarf.cc > > > > [...] > > > > > >> +static test_task* > >> +new_task(const InOutSpec* s, string& a_out_abi_base, > >> + string& a_in_elf_base, string& a_in_abi_base) > > > > Please doxygen-document this function. > > > >> +{ > >> + return new test_task_dwarf(*s, a_in_abi_base, > > > > This 'a_in_abi_base' should be a_out_abi_base. > > > >> + a_in_elf_base, a_in_abi_base); > >> +} > >> > >> int > >> main(int argc, char *argv[]) > >> { > >> bool no_parallel =3D false; > > > > This variable is not unused. > > > > > > Thanks for working on this. It's really appreciated! > > > > [...] > > > > Cheers, > > > > -- > > Dodji > > >