Re: Use CTF as a fallback when no DWARFs debug info is present

public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed

From: Guillermo Martinez <guillermo.e.martinez@oracle.com>
To: Ben Woodard <woodard@redhat.com>
Cc: Ben Woodard via Libabigail <libabigail@sourceware.org>
Subject: Re: Use CTF as a fallback when no DWARFs debug info is present
Date: Thu, 14 Jul 2022 18:13:50 -0500	[thread overview]
Message-ID: <992b4e78-697d-89d2-5e3b-fbccfa464ed6@oracle.com> (raw)
In-Reply-To: <4148AF95-8356-482C-A674-518A3AC0F488@redhat.com>



On 7/14/22 13:09, Ben Woodard wrote:
>
>
>> On Jul 14, 2022, at 10:35 AM, Guillermo E. Martinez 
>> <guillermo.e.martinez@oracle.com> wrote:
>>
>> On Thursday, July 14, 2022 11:01:41 AM CDT Ben Woodard wrote:
>>>
>>>> On Jul 14, 2022, at 8:06 AM, Guillermo E. Martinez 
>>>> <guillermo.e.martinez@oracle.com> wrote:
>>>>
>>>> On Wednesday, July 13, 2022 8:37:13 PM CDT Ben Woodard wrote:
>>>> Hi Ben,
>>>>
>>>>> So let me begin by saying, I think we should push changes to the 
>>>>> API like this until after 2.1 is released. Dodji isn’t calling it 
>>>>> a beta or RC yet but we have been treating it that way. I’ve been 
>>>>> trying to get all my testing done and he has been trying to get 
>>>>> those same bugs fixed.
>>>>>
>>>>> After that, I want to really revisit the API and make it more 
>>>>> generally useful and less closely tied to the tools. What you are 
>>>>> proposing below is a baby step along the lines of what I I had in 
>>>>> mind.
>>>>>
>>>>>> On Jul 13, 2022, at 4:25 PM, Guillermo E. Martinez via Libabigail 
>>>>>> <libabigail@sourceware.org> wrote:
>>>>>>
>>>>>> Hello libabigail team,
>>>>>>
>>>>>> Talking with my colleges Jose E. Marchesi and Nick Alcock about 
>>>>>> of the user
>>>>>> interface provided by abidw, abidiff and some other tools in 
>>>>>> libabigail, we
>>>>>> think such tools should be looks for DWARF debug information, and 
>>>>>> if it is not
>>>>>> present fall back for CTF even without --ctf option, this 
>>>>>> behaviour is used by
>>>>>> GDB (looking for .ctf and .debug_* section,
>>>>>
>>>>> First of all, I think rather than having a:
>>>>>
>>>>> —enable-ctf compile time option
>>>>>
>>>>> Why don’t we simply make “WITH_CTF” an autoconf option that is 
>>>>> disabled if AC_CHECK_HEADER and AC_CHECK_LIB doesn’t find the 
>>>>> things that you need.
>>>> Ok, in that case we need to decide whether CTF support always will 
>>>> ship in libabigail,
>>>> i.e, if the only dependency: libctf is meet, then CTF support will 
>>>> be there by default.
>>>
>>> If the header is there and the library is there, then let’s build it 
>>> in without the config time option. I think we should do this but 
>>> after 2.1 releases though. I can’t see why it shouldn’t be there as 
>>> long as autotools disables it if the required headers and library 
>>> are not there at config time.
>>
>> Yeah, although configure statements and conditional compiling make me 
>> think
>> that was a reason to not incorporate CTF support in libabigail by 
>> default like
>> to ENABLE_RPM in abipkgdiff. But for sure, these motivations could have
>> changed, just we need to agreed.
>
> I think at the time CTF seemed more experimental. Now that you added a 
> lot more testing, it feels better tested and thus more supportable. 
> However, we’ll probably hand all the CTF issues to you and Jose.

Agree.
>
>>
>>> My general philosophy is: Use it more, test it more — find more 
>>> bugs. I personally haven’t been building or testing CTF. It might be 
>>> worth it for you to do what I just did with Fedora 36 which was in 
>>> essence:
>>>
>>> foreach package
>>> fedabipkgdiff —self-compare -a --from fc36 <package>
>>
>> Actually I did the same experiment just with *some* packages in OL8, 
>> using
>> abiddw and abidiff instead, so I will follow your advice :-).
>
> Note that this relies on Koji if you guys don’t use that you can fall 
> back to abipkgdiff wit a bit more scripting.
>
>>
>>> If you can do something equivalent with Oracle Linux 9 which I guess 
>>> is built with CTF, then I think that would be a really good test. I 
>>> found that flux-corehttps://github.com/flux-framework/flux-coreis a 
>>> really good scheduler for this kind of thing. I basically spin up 
>>> flux and then do:
>>
>> It seems good!, will do the same in Oracle Linux.
>>
>>> $ cat list-of-packages | flux mini bulksubmit --wait --progress 
>>> --job-name={} --output={}.out fedabipkgdiff --self-compare -a --from 
>>> fc36 {}
>>>
>>> You can either send me the PR and I will merge it into my develop 
>>> branch on github (right now practically empty) to be passed on to 
>>> the list after 2.1 releases or you can send it to the list and we’ll 
>>> just sit on it until 2.1 releases. The problem with the latter 
>>> approach is you might have to send an updated patch if it falls out 
>>> of sync with the master.
>>
>> Ok. agreed.
>>
>>>>>
>>>>>
>>>>> I don’t think that it should matter where we get the information 
>>>>> about the types from.
>>>> Yes.
>>>>> I would also be tempted to get rid the —use-ctf options except I 
>>>>> think that we should have the ability to do something like
>>>>> abidw —abidiff —use-ctf somelib.so
>>>>> And it would read the DWARF, then read the CTF and then compare 
>>>>> the corpus that we get from them.
>>>> Ok, IMHO, it just be needed in the testing harness, because at the 
>>>> end of the day both corpus
>>>> with the same input, will be generated no differences, just in some 
>>>> properties e.g: filepath, line,
>>>> column, etc.
>>>
>>> Heh in theory that should be the case, just like in theory
>>> $ abidw —abidiff <whatever>
>>> should always return no changes. However, I have found literally 
>>> hundreds of bugs doing that. I expect that if we were to take a 
>>> substantial portion of a distribution built with CTF we would find 
>>> plenty of problems.
>>
>> Oh, I see, I presume such of errors will be raised running 
>> `fedabipkgdiff —self-compare ...'
>> as you show above.
>
> Same errors just using fedabipkgdiff is a bigger hammer.
> As I’m tracking down issues and try to boil them down to see if they 
> are duplicates the process that I tend to do is something like:
>
>  1. automated testing as I stated above with fedabipkgdiff
>  2. reproduce the error with fedabipkgdiff manually
>  3. try fedabipkgdiff —dry-run which prints out the underlying
>     abipkgdiff commands
>  4. run the abipkgdiff commands manually
>  5. then if I want a closer look then I use abidw —abidiff
>  6. if that doesn’t tell me what I want then abidw <library> >
>     /tmp/<library>.abixml; abidiff —harmless /tmp/<library>.abixml
>     <library>
>

OK, thanks for algorithm.
>>
>>> I’m not enthusiastic about the syntax for the command line that I 
>>> suggested maybe we should nix the —with-ctf options too and just 
>>> have something like:
>>>
>>> $ abidw —abidiff-dwarf <library> # does DWARF to DWARF comparison - 
>>> fails if no DWARF is found
>>> $ abidw —abidiff-ctf <library> # does CTF to CTF - fails if no CTF 
>>> is found
>>>
>>> Then the current syntax:
>>> $ abidw —abidiff <library>
>>> # looks for DWARF and CTF if both are present then it compares DWARF 
>>> to CTF
>>> # if only DWARF is present then it compares DWARF to DWARF
>>> # if only CTF is present then it compares CTF to CTF
>>>
>>> What do other people think? Is there a better syntax to do this?
>>>
>>> I think that both Dodji and I kind of thought that wedging it into 
>>> abidw was a short term expedient and optimization and we never 
>>> thought that we would still be battling problems with this literally 
>>> years later. It is simpler than what I was doing at the beginning 
>>> which was:
>>>
>>> abidw <library> > /tmp/<library>.abixml
>>> abidiff /tmp/<library>.abixml <library>
>>>
>>> but as a coworker pointed out that is a non-intuitive way to 
>>> approach the problerm. I argued that doing A=A is counter-intuitive 
>>> test and yet discovering that A!=A is mind-spinning until you 
>>> realize that it represents a bug. She thought a better command line 
>>> syntax would be:
>>>
>>> abidiff —self-compare <library>
>>>
>>>>> Or maybe it would be
>>>>>> but in libabigail is done by the
>>>>>> symtab class), so I did some minor changes in abidw and abidiff
>>>>>
>>>>> don’t forget abicompat
>>>> Ok.
>>>>>> setting the
>>>>>> corpus origin depending of `opts.use_ctf', trying to build the 
>>>>>> copus with DWARF
>>>>>> debug information, but if it is not possible 
>>>>>> (`STATUS_DEBUG_INFO_NOT_FOUND')
>>>>>> looks for CTF info.
>>>>>
>>>>> I think that we should do a deeper rework here and work the way 
>>>>> that GDB works. In essence doing:
>>>>>
>>>>> In essence:
>>>>> read ELF
>>>>> if( elf-file has DWARF) then use that
>>>>> else if (look for .gnu_reflinks DWARF) then use that
>>>>> else if( look for DWARF5 split DWARF) then use that
>>>>> else if( look for CTF) then use that
>>>>> else if (fail-on-no-debug-info) then exit
>>>>> else warn on limited functionality.
>>>>>
>>>>> I think that all this logic should be encapsulated in an exported 
>>>>> function read_corpus and then all the programs use that interface 
>>>>> rather than the functions which are specific to what format the 
>>>>> information is gathered from.
>>>> Agree, because now days every reader implements the logic to get 
>>>> ELF info:
>>>> `read_debug_info_into_corpus' and `slurp_elf_info'.
>>>>>> corpus::origin origin = opts.use_ctf ? CTF_ORIGIN : DWARF_ORIGIN;
>>>>>> ...
>>>>>> if (origin == DWARF_ORIGIN)
>>>>>> {
>>>>>> // Build corpus with DWARF
>>>>>> }
>>>>>> if ((status & STATUS_DEBUG_INFO_NOT_FOUND) ||
>>>>>> origin == CTF_ORIGIN)
>>>>>> {
>>>>>> // Build corpus with CTF
>>>>>> }
>>>>>>
>>>>>> if (status & STATUS_DEBUG_INFO_NOT_FOUND)
>>>>>> {
>>>>>> // lets to know the user that no debug info
>>>>>> // was found in DWARF neither CTF
>>>>>> }
>>>>>>
>>>>>> Doing in this way, seem to work, however there are functions to 
>>>>>> notifying the
>>>>>> user about of missing debug information (e.g `handle_error') that 
>>>>>> use an
>>>>>> specific DWARF `read_context' reference, so if we want to reuse 
>>>>>> it, then we
>>>>>> could split common functionality/interface for both readers in a 
>>>>>> base class:
>>>>>>
>>>>>
>>>>> Dodji can weigh in here but this whole contex thing as an API is 
>>>>> one of the things that I wanted get rid of in the general purpose 
>>>>> API, IMHO it is too tightly tied into how the tools work. I kind 
>>>>> of know what he was trying to do, he didn’t want 50 parameters for 
>>>>> the functions and so he made this class to pass them all at once.
>>>>> I think we need to think about making a good general purpose API 
>>>>> and I think we can do better than sticking a data structure with 
>>>>> parsed command line options in it.
>>>>>> class base_read_context
>>>>>> {
>>>>>> ...
>>>>>> virtual const string&
>>>>>> alt_debug_info_path() const;
>>>>>>
>>>>>> void
>>>>>> exported_decls_builder(corpus::exported_decls_builder* b);
>>>>>> ...
>>>>>> };
>>>>>>
>>>>>> class dwarf_reader::read_context : public base_read_context
>>>>>> {
>>>>>> ...
>>>>>> };
>>>>>>
>>>>>> class ctf_reader::read_context : public base_read_context
>>>>>> {
>>>>>> ...
>>>>>> };
>>>>>>
>>>>>>
>>>>>> // Now reusing refers_to_alt_debug_info for both readers
>>>>>> bool
>>>>>> refers_to_alt_debug_info(const read_context&ctxt,
>>>>>> string& alt_di_path)
>>>>>> {
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Please let met know your feedback, I'll really appreciate!
>>>>>
>>>>> If you want to collaborate on this, send me a PR and we can 
>>>>> iterate on it on my develop branch on github and then I will push 
>>>>> it to Dodji right after we release 2.1
>>>> Ok, of course I will do so following the advice of people that have 
>>>> more experience working in libabigail.
>>>> Thanks for your comments!
>>>>> https://github.com/woodard/libabigail<https://github.com/woodard/libabigail>
>>>>
>>
>> Regards,
>> guillermo
>

     prev parent reply	other threads:[~2022-07-14 23:15 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-13 23:25 Guillermo E. Martinez
2022-07-14  1:37 ` Ben Woodard
2022-07-14 15:06   ` Guillermo E. Martinez
2022-07-14 16:01     ` Ben Woodard
2022-07-14 17:35       ` Guillermo E. Martinez
2022-07-14 18:09         ` Ben Woodard
2022-07-14 23:13           ` Guillermo Martinez [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=992b4e78-697d-89d2-5e3b-fbccfa464ed6@oracle.com \
    --to=guillermo.e.martinez@oracle.com \
    --cc=libabigail@sourceware.org \
    --cc=woodard@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).