public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: "Jose E. Marchesi" <jose.marchesi@oracle.com>
To: Dodji Seketeli <dodji@seketeli.org>
Cc: "Jose E. Marchesi via Libabigail" <libabigail@sourceware.org>
Subject: Re: [PATCH] Add support for the CTF debug format to libabigail.
Date: Wed, 27 Oct 2021 18:06:53 +0200	[thread overview]
Message-ID: <87bl3aih9e.fsf@oracle.com> (raw)
In-Reply-To: <87lf2ealn8.fsf@seketeli.org> (Dodji Seketeli's message of "Wed,  27 Oct 2021 10:59:23 +0200")


Hi Dodji.

>>     - The libabigail tools assume that ELF means to always use DWARF to
>>       generate the ABI IR.  We added a new command-line option --ctf to
>>       the tools in order to make them to use the CTF debug info instead.
>>       We are definitely not sure whether this is the best user interface.
>>       In fact I would be suprised if it was ;)
>
> It's OK for me, so far.  We can come up with a better way.  In theory,
> we should be able to automatically detect the debuginfo format
> attached to a given binary, now that we about to support more than one
> ;-) We should just make sure we can do that even when said debuginfo
> is split out into a separate file.

Right.  If both formats (DWARF and CTF) are present in a given binary
DWARF must have precedence, since it is capable of encoding more
information than CTF.

>
>>     - We added support for --ctf to both abilint and abidiff.
>
> Thanks for doing that!
>
> Oh, one thing that is missing is to add documentation to the
> doc/manuals/abi{diff,lint}.rst files for the new --ctf to these tools.
> Could you please add that in a subsequent iteration of this patch?

Will do for V2.

>>       We are not sure whether it would make sense to add support for CTF
>>       to the other tools.  Feedback welcome.
>
> I think it would be important to add something similar to abidw, as
> that tool is important to emit an abixml representation of a given
> binary.  A lot of users use the abixml representation to serialize a
> representation of the

I will add CTF support for abidw in V2.

>>     - We are pondering about what to do in terms of testing.  We have
>>       cursory tested this implementation using abilint and abidiff.  We
>>       know we are generating IR corpus that seem to be ok.  It would be
>>       good however to be able to run the libabigail testsuites using CTF.
>>       However the testsuites may need some non-trivial changes in order to
>>       make this possible.  Let's talk about that :)
>
> I think it wouldn't be that hard.  We can discuss that separately if
> you like.  But it just boils down to adding a new test similar to
> tests/test-read-dwarf.cc, that would be called tests/test-read-ctf.cc.
> That test would just read a binary built with ctf support, save its
> abixml representation to disk and compare it with an expected output.
> That would be a great start.  I can help with that, no problem.  But
> let's maybe discuss this in a separate thread, even after the initial
> patch is applied.

I asked Guillermo Rodriguez to look at that.  He is subscribed to this
list.

>> diff --git a/ChangeLog b/ChangeLog
>> index 30918b49..385fe067 100644
>> --- a/ChangeLog
>> +++ b/ChangeLog
>> @@ -1,3 +1,21 @@
>> +2021-10-11  Jose E. Marchesi  <jose.marchesi@oracle.com>
>> +
>> +	* configure.ac: Check for libctf.
>> +	* src/abg-ctf-reader.cc: New file.
>> +	* include/abg-ctf-reader.h: Likewise.
>> +	* src/Makefile.am (libabigail_la_SOURCES): Add abg-ctf-reader.cc
>> +	conditionally.
>> +	* include/Makefile.am (pkginclude_HEADERS): Add abg-ctf-reader.h
>> +	conditionally.
>> +	* tools/abilint.cc (struct options): New option `use_ctf'.
>> +	(display_usage): Documentation for --ctf.
>> +	(parse_command_line): Handle --ctf.
>> +	(main): Honour --ctf.
>> +	* tools/abidiff.cc (struct options): New option `use_ctf'.
>> +	(display_usage): Documentation for --ctf.
>> +	(parse_command_line): Handle --ctf.
>> +	(main): Honour --ctf.
>> +
>
> As explained in the COMMIT-LOG-GUIDELINES file from the source code at
> https://sourceware.org/git/?p=libabigail.git;a=blob_plain;f=COMMIT-LOG-GUIDELINES;hb=HEAD,
> the ChangeLog is automatically updated by a script before a Libabigail
> release.

Ok... I will DTRT.
Will also add a Sign-off-by in V2.

>
>> diff --git a/configure.ac b/configure.ac
>
> [...]
>
>> +    AC_DEFINE([CTF], 1,
>> +	     [Defined if user enables and system has the libctf library])
>
> To comply with the implicit convention throughout the code, I renamed
> the pre-processor macro used to enable the "CTF Support Feature" into
> "WITH_CTF", because all the other "feature enabling" macros are named
> WITH_XXX.  This produces the hunk below, that as been committed into
> the ctf-branch:
>
> @@ -266,7 +266,7 @@ if test x$ENABLE_CTF = xyes; then
>    AC_CHECK_LIB(ctf, ctf_open, [LIBCTF=yes], [LIBCTF=no])
>    if test x$LIBCTF = xyes; then
>      AC_MSG_NOTICE([activating CTF code])
> -    AC_DEFINE([WITH_CTF], 1,
> +    AC_DEFINE([CTF], 1,
>  	     [Defined if user enables and system has the libctf library])
>      CTF_LIBS=-lctf
>    else
>
> Of course, all uses the "CTF" macro throughout have been updated accordingly.

Sorry, somehow I overlooked that.  Thanks for fixing it.

>> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
>
> [...]
>
>> +  /// Associate a given CTF type ID with a given libabigail IR type.
>> +  void add_type (ctf_id_t ctf_type, type_base_sptr type)
>
> Throughout the existing source code, there is no space between the
> name of a function and the opening parenthesis.  This is based on the
> standard GNU C++ coding conventions followed, for instance, in
> libstdc++.  It would be appreciated if this patch could comply with
> the coding conventions of the rest of the source code.
>
> Those conventions are loosely defined in the CONTRIBUTING file under
> the chapter named "Coding language and style".  There is a
> .clang-format specification file that you can use to format the source
> code (semi?-)automatically using the clang formatter.  I don't use it
> myself as I just do the formatting manually as I write the code in
> Emacs but you can ask questions about that tool on the mailing list,
> should you need any help about it.

Ok I will remove spaces between function names and opening parenthesis.

> [...]
>
>> +static typedef_decl_sptr
>> +process_ctf_typedef (read_context *ctxt,
>> +                     corpus_sptr corp,
>> +                     translation_unit_sptr tunit,
>> +                     ctf_dict_t *ctf_dictionary,
>> +                     ctf_id_t ctf_type)
>> +{
>> +  typedef_decl_sptr result;
>> +
>> +  ctf_id_t ctf_utype = ctf_type_reference (ctf_dictionary, ctf_type);
>
> How about error handling here?  I mean, what if ctf_type is not a
> typedef or a type that has a reference to another type?

Hmmm, yes this code relies on CTF_TYPE to be a typedef type with a
reference to other type.

The first condition is guaranteed by the caller at the moment, which is
bad coupling. (bad bad jemarch.)

Will fix for V2.

>> +  result.reset (new typedef_decl (typedef_name, utype, location (),
>> +                                  typedef_name /* mangled_name */));
>
> I noticed that you are not setting the "location" information.  That
> information is quite useful for accurate diagnostics emitted by, e.g,
> abidiff.  For instance:
>
>     $ build/tools/abidiff --harmless tests/data/test-diff-dwarf/test2-v0.o tests/data/test-diff-dwarf/test2-v1.o 
>     Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>     Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
>     1 function with some indirect sub-type change:
>
>       [C] 'function void foo(int, char)' at test2-v1.cc:5:1 has some indirect sub-type changes:
> 	parameter 1 of type 'int' changed:
> 	  entity changed from 'int' to compatible type 'typedef Int' at test2-v1.cc:1:1
> 	parameter 2 of type 'char' changed:
> 	  entity changed from 'char' to compatible type 'typedef Char' at test2-v1.cc:2:1
>
>     $ 
>
> The location information that you see in the form of "at
> test2-v1.cc:1:1" can be useful, I believe.
>
> So why are you not retrieving location information from CTF? (honest
> question).  To be fair, I haven't found (from the documentation) how
> to get source location information from CTF.  But I am pretty sure I
> must be missing the information.  In any case, I think it might be
> interesting to add a comment in the code about the reason.
>
> In any case, if the information is not yet available, it's not a
> problem.  The patch will go in, regardless.  When source location is
> later available from CTF then this code will be amended accordingly.

CTF doesn't support location information.

> Also, there is something else that might be important that you are not
> doing here: supporting Naming Typedefs.
>
> Consider this construct:
>
> typedef enum {one, two} Number;
>
> Here, there is an anonymous enum that is created.  There is also a
> typedef that is created to "name" that anonymous enum.  In concrete
> terms, the enum will be referred-to as having the name "Number".
> Libabigail IR allows to set the typedef as being the "naming typedef"
> of the underlying type of the using the
> decl_base::set_naming_typedef() member function, on the underlying
> type (the enum).
>
> So the code would somewhat look like this:
>
>     if (is_anonymous_type(utype)
> 	&& (is_enum_type(utype) || is_class_or_union_type(utype))
>       // So utype is an anonymous enum, union or struct.
>       // So let's consider that the typedef 'result' is 
>       // a naming typedef for utype.
>       utype->set_naming_typedef(result);
>
> Does that make sense?

Oook, from the snippet above I infer this applies only to anonymous
enum, class/struct and union underlying types?

Sure, easy to do.  Will be in V2.

>> +  type_base_sptr utype = ctxt->lookup_type (ctf_utype);
>> +
>> +  if (!utype)
>> +    {
>> +      utype = process_ctf_type (ctxt, corp, tunit, ctf_dictionary, ctf_utype);
>> +      if (!utype)
>> +        return result;
>> +    }
>
> I noticed this pattern is used a lot throughout the code.
> Maybe it could be factorized into a function and used throughout the
> code base as, e.g:
>
>     type_base_sptr utype = process_or_lookup_type(ctxt, corp, tunit,
> 						  ctf_dictionary,
> 						  ctf_utype)
> What do you think?

Yes, totally.  Will do.

>
>> +static void
>> +process_ctf_sou_members (read_context *ctxt,
>> +                         corpus_sptr corp,
>> +                         translation_unit_sptr tunit,
>> +                         ctf_dict_t *ctf_dictionary,
>> +                         ctf_id_t ctf_type,
>> +                         class_or_union_sptr sou)
>> +{
>
> This function definition lacks doxygen comment, even if it's "just" a
> sub-routine of process_ctf_struct_type and process_ctf_union_type.

Ok.

> +static void
> +process_ctf_archive (read_context *ctxt, corpus_sptr corp)
> +{
> +  /* We only have a translation unit.  */
> +  translation_unit_sptr ir_translation_unit =
> +    std::make_shared<translation_unit> (ctxt->ir_env, "", 64);
>
> Interesting.  So, CTF doesn't keep the information about the
> translation unit a decl was defined in?  I guess that must be related
> to the fact that I haven't seen source location information so far.

Correct, no translation unit info in CTF.  What CTF has is deduplication
builtin in the linker (ld) so when several objects get combined the
corresponding .ctf sections get also merged and deduplicated.

> [...]
>
> +
> +      /* Iterate over the CTF functions stored in this archive.  */
> +      ctf_next_t *func_next = NULL;
> +      const char *func_name = NULL;
> +      ctf_id_t ctf_sym;
> +
> +      while ((ctf_sym = ctf_symbol_next (ctf_dict, &func_next, &func_name,
> +                                         1 /* functions symbols only */) != CTF_ERR))
> +      {
>
> How about function symbol visibility?  Won't ctf_symbol_next return
> the symbol of a function that is not necessarily "exported"?  In that
> case, because the symbol is not exported, it's related to a "private"
> function and so, it doesn't have ABI visibility.  So it should be
> discarded.  Just curious.

Hmmm, I think these symbols correspond to public symbols, but will
double-check and sure of that.

> diff --git a/tools/abidiff.cc b/tools/abidiff.cc
>
> [...]
>
> @@ -104,6 +105,7 @@ struct options
>  #ifdef WITH_DEBUG_SELF_COMPARISON
>    bool			do_debug;
>  #endif
> +  bool			use_ctf;
>
> I have protected this CTF support hunk with a #ifdef WITH_CTF guard,
> so that configuring with --disable-ctf leads to compiling the file
> just fine I've done so with all the CTF-specific hunks in abidiff.cc
> and abilint.cc.

Thanks for that.  I forgot :)

>
> All in all, I really like the patch.  Thanks a lot for working on
> this.

Well thanks for the review.
I will send a V2 soon.

  reply	other threads:[~2021-10-27 16:07 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-11  8:45 Jose E. Marchesi
2021-10-11 12:09 ` Giuliano Procida
2021-10-11 15:11   ` Jose E. Marchesi
2021-10-11 15:26     ` Giuliano Procida
2021-10-27  9:19     ` Dodji Seketeli
2021-10-27  8:59 ` Dodji Seketeli
2021-10-27 16:06   ` Jose E. Marchesi [this message]
2021-10-27 20:31   ` Ben Woodard
2021-10-29  9:35     ` Dodji Seketeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87bl3aih9e.fsf@oracle.com \
    --to=jose.marchesi@oracle.com \
    --cc=dodji@seketeli.org \
    --cc=libabigail@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).