From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 687983858403 for ; Thu, 25 Nov 2021 00:14:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 687983858403 Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-290-_EOTJSmnPtmLySnzmMRcuw-1; Wed, 24 Nov 2021 19:14:01 -0500 X-MC-Unique: _EOTJSmnPtmLySnzmMRcuw-1 Received: by mail-qv1-f72.google.com with SMTP id kk1-20020a056214508100b003a9d1b987caso4003539qvb.4 for ; Wed, 24 Nov 2021 16:14:01 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=nL27+Jq3jevv/YQJQu/MNargg26RZiSIkijAwshVXj8=; b=Eg7fLtJV09SBayGRi+ZDwCbAu/2Ae3WaSpgG4zzV24yBttXG/wpJE19xZwxfP2qFQf +dJBmtAg2RDekB2zXW461XJ08r8FwxkedUAJh1ecJam9XZ+sFpu0BmaV1IeaNcoLFbeI 3PL1JQDX7YKUVLjxoMWCvnU4o5QjlnVVFjda+DCgbETYR8MVYyoGOCsmOcWZz134ppra j/mb1B8PZMHm8aHGE8WsEQu8DyhrgN6L9ETDJYEBPg6ta+mZPSg0QZCTY/yTAthZXftq ds/gzr4aqxzBFmAIyq8eAIXWwbtRLBAsqwlTPyu7gl/7QNdVqGBMllFYehAP2N8JEFGv kxng== X-Gm-Message-State: AOAM531ff16+2htygZyUmpAU/OLtrBDmevsqpse1f6FzxGChauNcr1Zz jSLcz/O76tKR0loOxqVZUEIwq88AG1Y5K6luMLaWthEYCcCrpOP6/+hk4BHC39JcJVifQm4qxMN tPYGppghu922Uepj7BRNg X-Received: by 2002:a05:622a:292:: with SMTP id z18mr12566314qtw.205.1637799240589; Wed, 24 Nov 2021 16:14:00 -0800 (PST) X-Google-Smtp-Source: ABdhPJwHqGjPwCy12noWT3ttEeVxrhxY1+9ngS1XWZFBnMTbGbuC9ap/6xQnDEQ/+2jGV9iE8B/LiA== X-Received: by 2002:a05:622a:292:: with SMTP id z18mr12566262qtw.205.1637799240085; Wed, 24 Nov 2021 16:14:00 -0800 (PST) Received: from smtpclient.apple (47-208-193-143.trckcmtc01.res.dyn.suddenlink.net. [47.208.193.143]) by smtp.gmail.com with ESMTPSA id 139sm587599qkn.37.2021.11.24.16.13.58 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Nov 2021 16:13:59 -0800 (PST) From: Ben Woodard Message-Id: Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.20.0.1.32\)) Subject: Re: [PATCH v2] Add regression tests for ctf reading Date: Wed, 24 Nov 2021 16:13:57 -0800 In-Reply-To: Cc: "Guillermo E. Martinez via Libabigail" To: Dodji Seketeli References: <20211118041625.622972-1-guillermo.e.martinez@oracle.com> <20211122213353.2456208-1-guillermo.e.martinez@oracle.com> <87ee75tsst.fsf@seketeli.org> X-Mailer: Apple Mail (2.3693.20.0.1.32) X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, HTML_MESSAGE, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Nov 2021 00:14:09 -0000 > On Nov 24, 2021, at 11:09 AM, Ben Woodard wrote: >=20 >=20 >=20 >> On Nov 24, 2021, at 8:36 AM, Dodji Seketeli wrote: >>=20 >> Hello, >>=20 >> [...] >>=20 >> Thanks for working on this. Nice patch, by the way! I like its >> direction. >>=20 >> I have a few comments and I believe that when they are addressed, we'll >> be able to apply the patch. >>=20 >> [...] >>=20 >>> Dependencies/limitations: >>>=20 >>> * It was worked on the top of the following patches: >>> https://sourceware.org/pipermail/libabigail/2021q4/003853.html >>>=20 >>> * Some CTF tests were *disabled* because it generates the XML ABI >>> corpus with *same information* but the XML nodes *are not* always >>> in the *same order*, so using diff command fails. Details here: >>> https://sourceware.org/pipermail/libabigail/2021q4/003824.html >>=20 >> In those cases where the abixml file generated from CTF is different >> from the one generated from DWARF, I think we should have two >> different reference abixml files to diff against. No CTF test should >> be disabled, I think. >=20 > Here is a somewhat deeper question that I think needs to be considered. I= =E2=80=99ve generally referred to it as =E2=80=9CDWARF Idioms=E2=80=9D but = this email makes me think that it is even larger than that.=20 >=20 > For my work, I need libabigail to generate an abstract notion of the ABI = corpus. How it constructs that abstract notion of the ABI needs to be indep= endent of the producer. Think of it this way, say we take the same compiler= and have it compile the same library producing both CTF and DWARF, the ABI= of the library doesn=E2=80=99t change. Since it is literally the same obje= ct, the program text is same. Therefore the ABI is unquestionably the the s= ame. Any difference reported by libabigail therefore is a problem with liba= bigail. It is not taking the source material and abstracting it enough into= the ABI artifacts to separate the artifacts from the implementation.=20 >=20 > So I kind of believe that we need to look more deeply into WHY the CTF an= d DWARF are not comparing as equivalent and begin the process of filing the= compiler bugs when we need to, and doing what is necessary to abstract the= ABI from the source material that libabigail used to construct its IR of t= he ABI corpus from. >=20 > So, I must say that I disagree with both dodji=E2=80=99s approach here an= d to a lesser extent Guillermo=E2=80=99s approach of disabling the tests. I= think that the tests where the CTF doesn=E2=80=99t match the DWARF should = be investigated and when necessary marked =E2=80=9Cxfail=E2=80=9D with a no= te citing their individual cause.=20 >=20 > I think that what we need to work towards is: > abidw produces the same output (or more precisely libabigail produces the= same IR) whether you compile with -gdwarf-3 -gdwarf-4 -gdwarf-5 -gsplit-dw= arf -gctf > also for the most part the ABI should not change between compiler version= s. There may be a few cases that we need to look into where the compiler ac= tually breaks ABI and of course libabigail should flag those but I would as= sert that libabigail needs to abstract its IR of the ABI enough that compil= er version changes that don=E2=80=99t actually change the ABI of the ELF ob= ject are not reported as ABI breaks. It currently does pretty well at this = at the moment. > Then once that foundation is built, being able to abstract the ABI IR eno= ugh that differences in toolchains e.g. LLVM vs GCC are not flagged as chan= ges in the object=E2=80=99s ABI. This is important to provide people with a= tool that will allow them to mix toolchains within a project to achieve op= timal code. Here is the problem: $ head -5 ../../gcc-clang=20 Functions changes summary: 9 Removed (157 filtered out), 1070 Changed (1068= filtered out), 123 Added (369 filtered out) functions Variables changes summary: 0 Removed, 0 Changed, 0 Added variable Function symbols changes summary: 151 Removed, 33 Added function symbols no= t referenced by debug info Variable symbols changes summary: 1 Removed, 66 Added variable symbols not = referenced by debug info yet: $ LD_LIBRARY_PATH=3D../lib/:$LD_LIBRARY_PATH ./abidw --abidiff /usr/lib64/l= ibstdc++.so.6.0.29=20 Downloading from https://debuginfod.fedoraproject.org/ 5377772 Downloading from https://debuginfod.fedoraproject.org/ 32828914 Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added f= unction Variables changes summary: 0 Removed, 0 Changed, 0 Added variable $ LD_LIBRARY_PATH=3D../../l5/lib/:$LD_LIBRARY_PATH ./abidw --abidiff /usr/l= ib64/libstdc++.so.6.0.29=20 Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added f= unction Variables changes summary: 0 Removed, 0 Changed, 0 Added variable $ cd ../../l5/bin [ben@alien bin]$ LD_LIBRARY_PATH=3D../lib/:$LD_LIBRARY_PATH ./abidw --abidi= ff /usr/lib64/libstdc++.so.6.0.29=20 Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added f= unction Variables changes summary: 0 Removed, 0 Changed, 0 Added variable [ben@alien bin]$ LD_LIBRARY_PATH=3D../../g5/lib/:$LD_LIBRARY_PATH ./abidw -= -abidiff /usr/lib64/libstdc++.so.6.0.29=20 Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added f= unction Variables changes summary: 0 Removed, 0 Changed, 0 Added variable where the l5 directory contains a clang++ build of libabigail and g5 contai= ns a gcc build of it. (that one change that is filtered out is something I= =E2=80=99ve already reported). So obviously the gcc built version of abidw is able to run with the clang++= built version of the libabigail version of the library and vice versa. The= libraries are ABI compatible. Good work compiler authors. So why does liba= bigail report 5000 lines worth of differences between them? We are starting= to get to a point where we need to start classifying these thousands of di= fferences and see how many of them are compiler errors and how many of them= we can find a way to abstract away enough in the IR to avoid the errors pr= esented. As you guys introduce CTF, if we allow it to diverge from the IR that we ge= t when processing DWARF, then the abixml doesn=E2=80=99t actually represent= the ELF file=E2=80=99s ABI, it is much more limited than that because it i= s filtered through the compiler=E2=80=99s idiomatic way of expressing the f= eatures which are part of the ABI. It shouldn=E2=80=99t matter where we get the information describing the fil= e=E2=80=99s ABI, if the information is complete, then it shouldn=E2=80=99t = matter what source it came from. -ben >=20 > -ben >=20 >>=20 >> So: >>=20 >> This: >>=20 >> { >> "data/test-read-dwarf/test0", >> "", >> "", >> SEQUENCE_TYPE_ID_STYLE, >> "data/test-read-dwarf/test0.abi", >> "output/test-read-dwarf/test0.abi" >> }, >>=20 >> would be changed into: >>=20 >> { >> "data/test-read-common/test0", >> "", >> "", >> SEQUENCE_TYPE_ID_STYLE, >> "data/test-read-dwarf/test0.abi", >> "output/test-read-dwarf/test0.abi" >> }, >>=20 >> For the DWARF test entry in test-read-dwarf.cc, and it would be >> changed into: >>=20 >> { >> "data/test-read-common/test0", >> "", >> "", >> SEQUENCE_TYPE_ID_STYLE, >> "data/test-read-ctf/test0.abi", >> "output/test-read-ctf/test0.abi" >> }, >>=20 >> for the CTF test netry in test-read-ctf.cc. >>=20 >> By the way, I am seeing entries like this in test-read-ctf.cc: >>=20 >>> + "data/test-read-common/test3.so", >>> + "", >>> + "", >>> + SEQUENCE_TYPE_ID_STYLE, >>> + "data/test-read-common/test3-ctf.so.abi", >>> + "output/test-read-common/test3-ctf.so.abi" >>=20 >> Here this entry does exactly what I am suggesting, even if >> test3-ctf.so.abi is stored in data/test-read-common. >>=20 >> So where exactly is the CTF test disabled? >>=20 >> [...] >>=20 >>=20 >>> diff --git a/tests/test-read-common.cc b/tests/test-read-common.cc >>=20 >> [...] >>=20 >>> + >>> +namespace abigail >>> +{ >>> +namespace tests >>> +{ >>> +namespace read_common >>> +{ >>=20 >> Because this file now contains an API definition that is to be used be >> some tests, every single function of the file should be documented >> using doxygen comments, just like we have in src/abg-*.cc files. >>=20 >> For instance: >>=20 >>> + >>> +test_task::test_task(const InOutSpec &s, >>> + string& a_out_abi_base, >>> + string& a_in_elf_base, >>> + string& a_in_abi_base) >>=20 >> This function should be fully doxygen-documented. >>=20 >>> + : is_ok(true), >>> + spec(s), >>> + out_abi_base(a_out_abi_base), >>> + in_elf_base(a_in_elf_base), >>> + in_abi_base(a_in_abi_base) >>> + {} >>> + >>> +bool >>> +test_task::serialize_corpus(const string& out_abi_path, >>> + corpus_sptr corp) >>=20 >> Likewise. >>=20 >>> +{ >>> + ofstream of(out_abi_path.c_str(), std::ios_base::trunc); >>> + if (!of.is_open()) >>> + { >>> + error_message =3D string("failed to read ") + out_abi_path + "\= n"; >>> + return false; >>> + } >>> + >>> + write_context_sptr write_ctxt >>> + =3D create_write_context(corp->get_environment(), of); >>> + set_type_id_style(*write_ctxt, spec.type_id_style); >>> + is_ok =3D write_corpus(*write_ctxt, corp, /*indent=3D*/0); >>> + of.close(); >>> + >>> + return true; >>> +} >>> + >>> +bool >>> +test_task::run_abidw(const string& extargs) >>=20 >> Likewise. >>=20 >>> +{ >>> + string abidw =3D string(get_build_dir()) + "/tools/abidw"; >>> + string drop_private_types; >>> + if (!in_public_headers_path.empty()) >>> + drop_private_types +=3D "--headers-dir " + in_public_headers_path = + >>> + " --drop-private-types"; >>> + string cmd =3D abidw + " " + drop_private_types + " --abidiff " + ex= targs + >>> + in_elf_path; >>> + if (system(cmd.c_str())) >>> + { >>> + error_message =3D string("ABIs differ:\n") >>> + + in_elf_path >>> + + "\nand:\n" >>> + + out_abi_path >>> + + "\n"; >>> + >>> + return false; >>> + } >>> + >>> + return true; >>> +} >>> + >>> +bool >>> +test_task::run_diff() >>=20 >> Likewise. >>=20 >>> +{ >>> + set_in_abi_path(); >>> + string cmd =3D "diff -u " + in_abi_path + " " + out_abi_path; >>> + if (system(cmd.c_str())) >>> + { >>> + error_message =3D string("ABIs differ:\n") >>> + + in_abi_path >>> + + "\nand:\n" >>> + + out_abi_path >>> + + "\n"; >>> + >>> + return false; >>> + } >>> + >>> + return true; >>> +} >>> + >>> +void >>> +display_usage(const string& prog_name, ostream& out) >>=20 >> Likewise. >>=20 >>> +{ >>> + emit_prefix(prog_name, out) >>> + << "usage: " << prog_name << " [options]\n" >>> + << " where options can be: \n" >>> + << " --help|-h display this message\n" >>> + << " --no-parallel execute testsuite is a sigle thread\n" >>> + ; >>> +} >>> + >>> +bool >>> +parse_command_line(int argc, char* argv[], options& opts) >>=20 >> Likewise. >>=20 >>> +{ >>> + for (int i =3D 1; i < argc; ++i) >>> + { >>> + if (!strcmp(argv[i], "--no-parallel")) >>> + opts.parallel =3D false; >>> + else if (!strcmp(argv[i], "--help") >>> + || !strcmp(argv[i], "--h")) >>> + return false; >>> + else >>> + { >>> + if (strlen(argv[i]) >=3D 2 && argv[i][0] =3D=3D '-' && argv[= i][1] =3D=3D '-') >>> + opts.wrong_option =3D argv[i]; >>> + return false; >>> + } >>> + } >>> + >>> + return true; >>> +} >>> + >>> +bool >>> +run_tests(const size_t num_tests, const InOutSpec* specs, >>> + const options& opts, create_new_test new_test) >>=20 >> Likewise. >>=20 >>> +{ >>=20 >> [...] >>=20 >>=20 >>> diff --git a/tests/test-read-common.h b/tests/test-read-common.h >>=20 >> [...] >>=20 >>> +/// This is an aggregate that specifies where a test shall get its >>> +/// input from, and where it shall write its ouput to. >>> +struct InOutSpec >>> +{ >>> + const char* in_elf_path; >>> + const char* in_suppr_spec_path; >>> + const char* in_public_headers_path; >>> + type_id_style_kind type_id_style; >>> + const char* in_abi_path; >>> + const char* out_abi_path; >>> +};// end struct InOutSpec >>> + >>> +/// The task that peforms the tests. >>> +struct test_task : public abigail::workers::task >>> +{ >>> + bool is_ok; >>> + InOutSpec spec; >>> + string error_message; >>> + string out_abi_base; >>> + string in_elf_base; >>> + string in_abi_base; >>> + >>> + string in_elf_path; >>> + string in_abi_path; >>> + string in_suppr_spec_path; >>> + string in_public_headers_path; >>> + string out_abi_path; >>> + >>> + void >>> + set_in_elf_path() >>=20 >> Please doxygen-document this function. >>=20 >>> + { >>> + in_elf_path =3D in_elf_base + spec.in_elf_path; >>> + } >>> + >>> + void >>> + set_in_suppr_spec_path() >>=20 >> Likewise. >>=20 >>> + { >>> + if (spec.in_suppr_spec_path) >>> + in_suppr_spec_path =3D in_elf_base + spec.in_suppr_spec_path; >>> + else >>> + in_suppr_spec_path.clear(); >>> + } >>> + >>> + void >>> + set_in_public_headers_path() >>=20 >> Likewise. >>=20 >>> + { >>> + if (spec.in_public_headers_path) >>> + in_public_headers_path =3D spec.in_public_headers_path; >>> + if (!in_public_headers_path.empty()) >>> + in_public_headers_path =3D in_elf_base + spec.in_public_headers_= path; >>> + } >>> + >>> + bool >>> + set_out_abi_path() >>=20 >> Likewise. >>=20 >>> + { >>> + out_abi_path =3D out_abi_base + spec.out_abi_path; >>> + if (!abigail::tools_utils::ensure_parent_dir_created(out_abi_path)= ) >>> + { >>> + error_message =3D >>> + string("Could not create parent directory for ") + out_abi= _path; >>> + return false; >>> + } >>> + return true; >>> + } >>> + >>> + void >>> + set_in_abi_path() >>=20 >> Likewise. >>=20 >>> + { >>> + in_abi_path =3D in_abi_base + spec.in_abi_path; >>> + } >>> + >>=20 >> [...] >>=20 >>> + test_task(const InOutSpec &s, >>> + string& a_out_abi_base, >>> + string& a_in_elf_base, >>> + string& a_in_abi_base); >>> + bool >>> + serialize_corpus(const string& out_abi_path, >>> + corpus_sptr corp); >>> + bool >>> + run_abidw(const string& extargs =3D ""); >>> + >>> + bool >>> + run_diff(); >>> + >>> + virtual >>> + ~test_task() >>> + {} >>> + >>> +}; // end struct test_task >>> + >>> +typedef shared_ptr test_task_sptr; >>> + >>> +struct options >>> +{ >>=20 >> Please doxygen-document this struct. >>=20 >>> + string wrong_option; >>> + bool parallel; >>> + >>> + options() >>> + : parallel(true) >>> + {} >>> + >>> + ~options() >>> + { >>> + } >>> +}; >>=20 >> [...] >>=20 >>> diff --git a/tests/test-read-ctf.cc b/tests/test-read-ctf.cc >>=20 >> [...] >>=20 >>=20 >>> +test_task_ctf::test_task_ctf(const InOutSpec &s, >>> + string& a_out_abi_base, >>> + string& a_in_elf_base, >>> + string& a_in_abi_base) >>=20 >> Please doxygen-document this function. >>=20 >>> + : test_task(s, a_out_abi_base, a_in_elf_base, a_in_abi_base) >>> + {} >>> + >>=20 >> [...] >>=20 >>> +static test_task* >>> +new_task(const InOutSpec* s, string& a_out_abi_base, >>> + string& a_in_elf_base, string& a_in_abi_base) >>=20 >> Please doxygen-document this function. >>=20 >>> +{ >>> + return new test_task_ctf(*s, a_in_abi_base, >>=20 >> This 'a_in_abi_base' should be a_out_abi_base. >>=20 >>> + a_in_elf_base, a_in_abi_base); >>> +} >>=20 >> [...] >>=20 >>=20 >>> diff --git a/tests/test-read-dwarf.cc b/tests/test-read-dwarf.cc >>=20 >> [...] >>=20 >>=20 >>> +static test_task* >>> +new_task(const InOutSpec* s, string& a_out_abi_base, >>> + string& a_in_elf_base, string& a_in_abi_base) >>=20 >> Please doxygen-document this function. >>=20 >>> +{ >>> + return new test_task_dwarf(*s, a_in_abi_base, >>=20 >> This 'a_in_abi_base' should be a_out_abi_base. >>=20 >>> + a_in_elf_base, a_in_abi_base); >>> +} >>>=20 >>> int >>> main(int argc, char *argv[]) >>> { >>> bool no_parallel =3D false; >>=20 >> This variable is not unused. >>=20 >>=20 >> Thanks for working on this. It's really appreciated! >>=20 >> [...] >>=20 >> Cheers, >>=20 >> --=20 >> =09=09Dodji