public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: John Moon <quic_johmoo@quicinc.com>
To: Dodji Seketeli <dodji@seketeli.org>
Cc: <libabigail@sourceware.org>, Trilok Soni <quic_tsoni@quicinc.com>,
	"Satya Durga Srinivasu Prabhala" <quic_satyap@quicinc.com>
Subject: Re: abidiff improvements for kernel UAPI checker
Date: Mon, 24 Apr 2023 11:39:48 -0700	[thread overview]
Message-ID: <340b33bd-2b43-9f99-58e1-f1b77a51b48a@quicinc.com> (raw)
In-Reply-To: <87354tt32o.fsf@seketeli.org>

On 4/21/2023 1:03 PM, Dodji Seketeli wrote:
> Hello,
> 
> Dodji Seketeli <dodji@seketeli.org> a écrit:
> 
>>
>> I think feature this would not be difficult to add and would indeed be a
>> useful feature for others, I believe.
>>
>> Would it be okay if you'd have to carry a suppression specification
>> file that would have:
>>
>>      $ cat filter-enums.suppr
>>
>>      [suppress_type]
>>        type_kind = enum
>>        has_last_enumerator_change_only = true
>>
>>
>> Then you'd invoke the tool by doing:
>>
>>      $ abidiff --suppr filter-enum.suppr binary-version1 binary-version2
>>
>> To get an idea of what suppression specifications do, you can read more
>> at https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppression-specifications.
>>
>> Please note that the "has_last_enumerator_change_only" property is not
>> yet implemented.  I'd like to know first if this "user interface" would
>> fit your use case.  If it does then we can go ahead and add it.
> 
> Actually, I went ahead and played a little with this idea, in case you'd
> be interested, so I have a patched libabigail tree with the above
> implemented.  The actual property to set in the suppression
> specification file is "allow_last_enumerator_change_only".
> 
> Here is an example of what it does:
> 
> $ cat -n test0-v0.c
>       1	enum enum_type
>       2	{
>       3	  E0,
>       4	  E1,
>       5	  E_LAST
>       6	};
>       7
>       8	struct some_type
>       9	{
>      10	  enum enum_type m;
>      11	};
>      12
>      13	void
>      14	foo(struct some_type* s __attribute__((unused)))
>      15	{
>      16	}
> $ cat -n test0-v1.c
>       1	enum enum_type
>       2	{
>       3	  E0,
>       4	  E1,
>       5	  E2,
>       6	  E_LAST
>       7	};
>       8
>       9	struct some_type
>      10	{
>      11	  enum enum_type m;
>      12	};
>      13
>      14	void
>      15	foo(struct some_type* s __attribute__((unused)))
>      16	{
>      17	}
> $
> 
>      $ ./build/tools/abidiff test0-v0.o test0-v1.o || echo "exit code: $?"
>      Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>      Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> 
>      1 function with some indirect sub-type change:
> 
>        [C] 'function void foo(some_type*)' at test0-v0.c:14:1 has some indirect sub-type changes:
> 	parameter 1 of type 'some_type*' has sub-type changes:
> 	  in pointed to type 'struct some_type' at test0-v1.c:9:1:
> 	    type size hasn't changed
> 	    1 data member change:
> 	      type of 'enum_type m' changed:
> 		type size hasn't changed
> 		1 enumerator insertion:
> 		  'enum_type::E2' value '2'
> 		1 enumerator change:
> 		  'enum_type::E_LAST' from value '2' to '3' at test0-v1.c:1:1
> 
>      exit code: 4
>      $ cat last-enum-change-only.suppr
>      [suppress_type]
>        type_kind = enum
>        allow_last_enumerator_change_only = yes
>      $
>      $ /home/dodji/git/libabigail/filter-last-enumerator-change/build/tools/abidiff --suppr ./last-enum-change-only.suppr test0-v0.o test0-v1.o && echo "exit code: $?"
>      Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
>      Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> 
>      exit code: 0
>      $
> 
> 
> The git tree that you can check you, compile and test at your
> convenience is https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/filter-last-enumerator-change.
> 
> The patch in question is this one: https://sourceware.org/git/?p=libabigail.git;a=commit;h=239ecaa7dc5de9e9a37650248aa00b5cfc2e578e
> 
> The git command to use would be:
> 
> $ git clone -b 'users/dodji/filter-last-enumerator-change' git://sourceware.org/git/libabigail.git
> 
> Once you've got the necessary dependencies for your system, I guess
> compiling the tree would take:
> 
>      $ autoreconf
>      $ configure --prefix=</your/prefix>
>      $ make -j<amount-of-cpu-cores>
>      $ make install
> 
> Then play with /your/prefix/usr/bin/abidiff.
> 
> [...]
> 
> I hope this helps to at least start a productive discussion.
> 
> Cheers,
> 

Hi Dodji,

Thanks so much for getting the ball rolling on this! I agree that 
suppressions seem to be the correct mechanism here and I don't foresee 
any issue using them from the check script we're working on.

Before seeing this, I had gotten started on implementing some (hacky) 
filters based on the output of abidiff. While testing on the Linux 
codebase, I found some interesting corner cases. I'll go into the 
details here so hopefully we can have more discussion and design the 
right solution.

There are basically three classes of "false positives" that we'd like to 
eventually be able to filter:

1) Enum expansion
2) Padding field expansion
3) Flex array expansion


=== Enum expansion ===
Here are some places in the kernel where the script reported expansion 
as a breaking change:

One "MAX" member: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool.h#L1789

Two "MAX" members: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L6398

Three "MAX" members: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L3311

Having two is pretty common. I can only find the one occurrence of 
having three of these variants... maybe the suppression could take an 
"n_allowed" parameter of some sort?

Also, would it make sense to limit the check to enumerator variants that 
have somewhat common names for that case? It could evolve with time, but 
so far I've found "MAX", "LAST", "NUM", and "NBITS" being fairly common. 
Maybe the suppression could also take "allowed_strings"? Something like 
this?

[suppress_type]
   type_kind = enum
   allow_n_last_enumerator_changes = 3
   allow_n_last_enumerator_change_names = [".*MAX.*", ".*LAST.*", 
".*NUM.*", ".*NBITS.*"]


This is the algorithm I was working on to detect these changes:
   - The type size did not change
   - The change is to an enum
   - The number of changed variants is < 4
   - The changed variants contain at least one of the above strings
   - The changed variants' values are all >= the last newly-inserted value


=== Padding field expansion ===
For example, you'll see lots of padding fields like this: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/fuse.h#L908

These don't necessarily have to occur at the end of a struct. You could 
expand into a padding field in the middle somewhere.

This is the algorithm I was working on to detect these changes:
   - The type size did not change
   - There is only one data member being changed
   - The changed member name contains at least one of: "pad", 
"reserved", "end", or "last"
   - The first new member being inserted has an offset equal to the 
changed member's previous offset


=== Flex array expansion ===
For example: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/rseq.h#L149.

Leaving a flex array at the end essentially gives the struct an 
"infinite" padding field. The algorithm to detect looks almost identical 
to the ordinary padding expansion except this class can only occur with 
the flex array at the end of the struct.


=== Anonymous Enums ===
Another issue that comes up when comparing ABI across wide swaths of 
kernel commit history are changes to anonymous enums. From what I can 
tell, there's not a great way to handle this. If someone adds a new 
anonymous enum, the tag abidiff gives (e.g. enum __anonymous_enum__6) 
can change, so abidiff reports a new anonymous enum with all the same 
fields even though it was basically just a "rename".

For reference, this file is full of anonymous enums: 
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool_netlink.h#L15. 
Basically any change in there is triggering a failure from the script.

I'm not sure how to handle these changes... there are possible ABI 
breakages when changing anonymous enums, but it's unclear to me how to 
detect them.

Again, thanks so much for taking the initiative here. Let me know what 
you think about the above topics and what next steps we could take!

- John

  reply	other threads:[~2023-04-24 18:39 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-11  0:45 John Moon
2023-04-16 18:33 ` Trilok Soni
2023-04-21 18:21 ` Dodji Seketeli
2023-04-21 20:03   ` Dodji Seketeli
2023-04-24 18:39     ` John Moon [this message]
2023-05-10 14:21       ` Dodji Seketeli
2023-05-23 19:59         ` John Moon
2023-07-05 16:52           ` John Moon
2023-10-05 13:44             ` Support suppressing data members inserted right before flexible array members (was Re: abidiff improvements for kernel UAPI checker) Dodji Seketeli
2023-07-10 10:55           ` abidiff improvements for kernel UAPI checker Dodji Seketeli
2023-09-22 11:39       ` Dodji Seketeli
2023-09-22 11:51         ` Dodji Seketeli
2023-09-22 18:28           ` John Moon
2023-09-22 20:02             ` John Moon
2023-09-26  8:38             ` Dodji Seketeli
2023-09-27 17:37               ` John Moon
2023-09-29  9:52                 ` Dodji Seketeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=340b33bd-2b43-9f99-58e1-f1b77a51b48a@quicinc.com \
    --to=quic_johmoo@quicinc.com \
    --cc=dodji@seketeli.org \
    --cc=libabigail@sourceware.org \
    --cc=quic_satyap@quicinc.com \
    --cc=quic_tsoni@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).