public inbox for libabigail@sourceware.org
From: Giuliano Procida <gprocida@google.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>
Cc: libabigail@sourceware.org
Subject: Re: [PATCH] Add support for the CTF debug format to libabigail.
Date: Mon, 11 Oct 2021 16:26:26 +0100
Message-ID: <CAGvU0H=PZ8DxohMSvixhKGj4J6taA=A6s9PVYrJ83yRapA8frg@mail.gmail.com>
In-Reply-To: <8735p7sijh.fsf@oracle.com>

Hi.

On Mon, 11 Oct 2021 at 16:12, Jose E. Marchesi <jose.marchesi@oracle.com> wrote:
>
>
> Hi Giuliano.
> Thanks for the feedback.
>
> > On Mon, 11 Oct 2021 at 09:45, Jose E. Marchesi via Libabigail
> > <libabigail@sourceware.org> wrote:
> >>
> >> CTF (C Type Format) is a lightweight debugging format that provides
> >> information about C types and the association between functions and
> >> data symbols and types.  It is designed to be very compact and
> >> simple.
> >>
> >
> > It's nice to see you say "simple" here. I'm all in favour. However, a lot of
> > the https://github.com/oracle/binutils-gdb/wiki/libctf-todo items look
> > like they aim to reduce CTF binary size at the expense of greater
> > complexity. Will everything be abstracted away in libctf?
>
> libctf shall be able to abstract that additional complexity, yes.
>
> I agree the balance between compactness and simplicity is tricky: it is
> also a concern of mine.  However, I think that provided we don't
> change/miss the aim/scope of the debugging format, we shall be ok at the
> end.
>
> Consider DWARF for example.  Some people say it is not compact.  They
> are very wrong: DWARF is way more compact than, say, CTF, speaking in
> relative terms.  It is just that the scope of DWARF is so wide that the
> amount of information it encodes for typical programs is totally
> massive.
>
> >> This patch introduces support in libabigail to extract ABI information
> >> from CTF stored in ELF files.
> >>
> >> A few notes on this implementation:
> >>
> >> - The implementation is complete in terms of CTF support.  Every CTF
> >>   feature is processed and handled to generate libabigail IR.  This
> >>   includes basic types, typedefs, pointer, array and struct types.
> >>   The CTF records of data objects (variables) and functions are also
> >>   used in order to generate the corresponding libabigail IR artifacts.
> >>
> >> - The decoding of CTF data is done using the libctf library which is
> >>   part of binutils.  In order to link with it, binutils shall be built
> >>   with --enable-shared for libctf.so to become available.
> >>
> >> - This initial implementation aims for simplicity.  We have not
> >>   tried to resolve any and every corner case that may require special
> >>   handling.  We have observed that the DWARF front-end (which is
> >>   naturally way more complex as the scope is way bigger) is plagued
> >>   with hacks to handle such situations.  However, for the CTF support
> >>   we prefer to proceed in a simpler and more modest way: we will
> >>   handle these problems if/when we find them.  The fact that CTF only
> >>   supports C (currently) certainly helps there.
> >>
> >> - Likewise, in this basic support we are not handling symbol
> >>   suppressions or other goodies that libabigail provides.  We are new
> >>   to libabigail and ABI analysis, and at this point we simply don't
> >>   have a clear picture about what is most useful/relevant to support
> >>   or not.  With the maintainer's blessing, we will tackle that
> >>   functionality after this basic support is applied upstream.
> >>
> >> - The implementation in abg-ctf-reader.{cc,h} is pretty much
> >>   self-contained.  As a result there is some duplication in terms of
> >>   ELF handling with the DWARF reader, but since that logic is very
> >>   simple and can be easily implemented, we don't consider this to be a
> >>   big deal (for now.)  Hopefully the maintainers agree.
> >>
> >
> > The implementation is short which is great.
> >
> >> - The libabigail tools assume that ELF means to always use DWARF to
> >>   generate the ABI IR.  We added a new command-line option --ctf to
> >>   the tools in order to make them use the CTF debug info instead.
> >>   We are definitely not sure whether this is the best user interface.
> >>   In fact I would be surprised if it was ;)
> >>
> >> - We added support for --ctf to both abilint and abidiff.   We are not
> >>   sure whether it would make sense to add support for CTF to the other
> >>   tools.  Feedback welcome.
> >>
> >
> > For ease of testing / building up a useful regression test suite, please do
> > consider adding --ctf to abidw (or adding abictf?) which would give a
> > CTF -> XML utility. Plain diff (rather than abilint's ABI diff) can be used to
> > check for changes over time.
>
> What about renaming abidw to something like abielf?  Then --ctf would
> fit well.
>
> >> - We are pondering about what to do in terms of testing.  We have
> >>   cursorily tested this implementation using abilint and abidiff.  We
> >>   know we are generating IR corpora that seem to be ok.  It would be
> >>   good however to be able to run the libabigail testsuites using CTF.
> >>   However the testsuites may need some non-trivial changes in order to
> >>   make this possible.  Let's talk about that :)
> >>
> >
> > We created a small test suite for regression testing, initially when we
> > started working on BTF so that the developer could have something
> > to check their progress but also to track progress on certain libabigail
> > issues.
>
> Sounds like exactly what we need :)
>
> > There is a simple Makefile that refreshes objects and reports from C
> > and C++ source code. Each test case consists of a pair of either C
> > or C++ source files. Everything is enumerated just by globbing.
> >
> > Source is compiled with GCC 10 at present and BTF information is
> > obtained by running pahole -J on copies of the objects.
>
> Note that GCC now supports generating BTF natively.  That will become
> available in released form with GCC 12.
>

Good to know. At some point we may end up with 3 different BTF test
inputs (pahole, GCC and Clang).

> > There are Python wrappers that replicate these ABI extraction and diff
> > steps to check for discrepancies during continuous integration builds.
> >
> > The abidiff script is a bit special in that it expects comparing .o and
> > .xml in all 4 combinations to result in identical outcomes. stdout
> > and exit status are both captured and compared. As a result, there's
> > been a test we haven't been able to add for a while.
> >
> > We don't attempt to assert identity of abidiff of abidw XML with BTF diff
> > for the same files. There are format and diff algorithm implementation
> > differences that make this an impossibility. This may be possible for you
> > with CTF though and you can abidiff test all 9 combinations of DWARF,
> > CTF and XML inputs. You probably won't be able to assert that DWARF
> >  -> XML and CTF -> XML give identical results as text.
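
[Editorial sketch, not part of the original message.] The "all
combinations" check described above could look roughly like the Python
below. It is a minimal sketch under stated assumptions, not the Google
driver code: the function names (run_abidiff, outcomes_for_all_combinations,
disagreements) and the format labels are hypothetical, and the only external
command assumed is abidiff invoked with two input paths.

```python
import itertools
import subprocess
from typing import Callable, Dict, List, Tuple

Outcome = Tuple[int, str]  # (exit status, captured stdout)


def run_abidiff(old_path: str, new_path: str) -> Outcome:
    """Run abidiff on two inputs and capture exit status and stdout."""
    proc = subprocess.run(
        ["abidiff", old_path, new_path], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout


def outcomes_for_all_combinations(
    old: Dict[str, str],
    new: Dict[str, str],
    compare: Callable[[str, str], Outcome],
) -> Dict[Tuple[str, str], Outcome]:
    """Compare every (old format, new format) pair.

    `old` and `new` map a hypothetical format label (e.g. "o", "xml")
    to the path of that representation of the same library version.
    Two formats give 4 combinations, three formats give 9.
    """
    return {
        (old_fmt, new_fmt): compare(old[old_fmt], new[new_fmt])
        for old_fmt, new_fmt in itertools.product(old, new)
    }


def disagreements(
    outcomes: Dict[Tuple[str, str], Outcome],
) -> List[Tuple[str, str]]:
    """Return the combinations whose outcome differs from the first one."""
    if not outcomes:
        return []
    reference = next(iter(outcomes.values()))
    return sorted(combo for combo, out in outcomes.items() if out != reference)
```

A test would pass run_abidiff as the compare argument and fail if
disagreements() returns a non-empty list; passing a stub instead makes
the driver itself testable without abidiff installed.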
> >
> > The driver code and Makefile are part of the Google repo, but I'd be
> > happy to share all the test cases. It's probably time I organised them
> > a bit better anyway.
>
> Where is the Google repo?

That's the (private) Google monorepo. To be honest, we could probably
offer up the Makefile and Python scripts, but the latter assume
Google libraries.

Giuliano.
