From: John Moon <quic_johmoo@quicinc.com>
To: Dodji Seketeli <dodji@seketeli.org>
Cc: <libabigail@sourceware.org>, Trilok Soni <quic_tsoni@quicinc.com>,
"Satya Durga Srinivasu Prabhala" <quic_satyap@quicinc.com>
Subject: Re: abidiff improvements for kernel UAPI checker
Date: Mon, 24 Apr 2023 11:39:48 -0700 [thread overview]
Message-ID: <340b33bd-2b43-9f99-58e1-f1b77a51b48a@quicinc.com> (raw)
In-Reply-To: <87354tt32o.fsf@seketeli.org>
On 4/21/2023 1:03 PM, Dodji Seketeli wrote:
> Hello,
>
> Dodji Seketeli <dodji@seketeli.org> a écrit:
>
>>
>> I think feature this would not be difficult to add and would indeed be a
>> useful feature for others, I believe.
>>
>> Would it be okay if you'd have to carry a suppression specification
>> file that would have:
>>
>> $ cat filter-enums.suppr
>>
>> [suppress_type]
>> type_kind = enum
>> has_last_enumerator_change_only = true
>>
>>
>> Then you'd invoke the tool by doing:
>>
>> $ abidiff --suppr filter-enum.suppr binary-version1 binary-version2
>>
>> To get an idea of what suppression specifications do, you can read more
>> at https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppression-specifications.
>>
>> Please note that the "has_last_enumerator_change_only" property is not
>> yet implemented. I'd like to know first if this "user interface" would
>> fit your use case. If it does then we can go ahead and add it.
>
> Actually, I went ahead and played a little with this idea, in case you'd
> be interested, so I have a patched libabigail tree with the above
> implemented. The actual property to set in the suppression
> specification file is "allow_last_enumerator_change_only".
>
> Here is an example of what it does:
>
> $ cat -n test0-v0.c
> 1 enum enum_type
> 2 {
> 3 E0,
> 4 E1,
> 5 E_LAST
> 6 };
> 7
> 8 struct some_type
> 9 {
> 10 enum enum_type m;
> 11 };
> 12
> 13 void
> 14 foo(struct some_type* s __attribute__((unused)))
> 15 {
> 16 }
> $ cat -n test0-v1.c
> 1 enum enum_type
> 2 {
> 3 E0,
> 4 E1,
> 5 E2,
> 6 E_LAST
> 7 };
> 8
> 9 struct some_type
> 10 {
> 11 enum enum_type m;
> 12 };
> 13
> 14 void
> 15 foo(struct some_type* s __attribute__((unused)))
> 16 {
> 17 }
> $
>
> $ ./build/tools/abidiff test0-v0.o test0-v1.o || echo "exit code: $?"
> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> 1 function with some indirect sub-type change:
>
> [C] 'function void foo(some_type*)' at test0-v0.c:14:1 has some indirect sub-type changes:
> parameter 1 of type 'some_type*' has sub-type changes:
> in pointed to type 'struct some_type' at test0-v1.c:9:1:
> type size hasn't changed
> 1 data member change:
> type of 'enum_type m' changed:
> type size hasn't changed
> 1 enumerator insertion:
> 'enum_type::E2' value '2'
> 1 enumerator change:
> 'enum_type::E_LAST' from value '2' to '3' at test0-v1.c:1:1
>
> exit code: 4
> $ cat last-enum-change-only.suppr
> [suppress_type]
> type_kind = enum
> allow_last_enumerator_change_only = yes
> $
> $ /home/dodji/git/libabigail/filter-last-enumerator-change/build/tools/abidiff --suppr ./last-enum-change-only.suppr test0-v0.o test0-v1.o && echo "exit code: $?"
> Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> exit code: 0
> $
>
>
> The git tree that you can check you, compile and test at your
> convenience is https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/filter-last-enumerator-change.
>
> The patch in question is this one: https://sourceware.org/git/?p=libabigail.git;a=commit;h=239ecaa7dc5de9e9a37650248aa00b5cfc2e578e
>
> The git command to use would be:
>
> $ git clone -b 'users/dodji/filter-last-enumerator-change' git://sourceware.org/git/libabigail.git
>
> Once you've got the necessary dependencies for your system, I guess
> compiling the tree would take:
>
> $ autoreconf
> $ configure --prefix=</your/prefix>
> $ make -j<amount-of-cpu-cores>
> $ make install
>
> Then play with /your/prefix/usr/bin/abidiff.
>
> [...]
>
> I hope this helps to at least start a productive discussion.
>
> Cheers,
>
Hi Dodji,
Thanks so much for getting the ball rolling on this! I agree that
suppressions seem to be the correct mechanism here and I don't foresee
any issue using them from the check script we're working on.
Before seeing this, I had gotten started on implementing some (hacky)
filters based on the output of abidiff. While testing on the Linux
codebase, I found some interesting corner cases. I'll go into the
details here so hopefully we can have more discussion and design the
right solution.
There are basically three classes of "false positives" that we'd like to
eventually be able to filter:
1) Enum expansion
2) Padding field expansion
3) Flex array expansion
=== Enum expansion ===
Here are some places in the kernel where the script reported expansion
as a breaking change:
One "MAX" member:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool.h#L1789
Two "MAX" members:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L6398
Three "MAX" members:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L3311
Having two is pretty common. I can only find the one occurrence of
having three of these variants... maybe the suppression could take an
"n_allowed" parameter of some sort?
Also, would it make sense to limit the check to enumerator variants that
have somewhat common names for that case? It could evolve with time, but
so far I've found "MAX", "LAST", "NUM", and "NBITS" being fairly common.
Maybe the suppression could also take "allowed_strings"? Something like
this?
[suppress_type]
type_kind = enum
allow_n_last_enumerator_changes = 3
allow_n_last_enumerator_change_names = [".*MAX.*", ".*LAST.*",
".*NUM.*", ".*NBITS.*"]
This is the algorithm I was working on to detect these changes:
- The type size did not change
- The change is to an enum
- The number of changed variants is < 4
- The changed variants contain at least one of the above strings
- The changed variants' values are all >= the last newly-inserted value
=== Padding field expansion ===
For example, you'll see lots of padding fields like this:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/fuse.h#L908
These don't necessarily have to occur at the end of a struct. You could
expand into a padding field in the middle somewhere.
This is the algorithm I was working on to detect these changes:
- The type size did not change
- There is only one data member being changed
- The changed member name contains at least one of: "pad",
"reserved", "end", or "last"
- The first new member being inserted has an offset equal to the
changed member's previous offset
=== Flex array expansion ===
For example:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/rseq.h#L149.
Leaving a flex array at the end essentially gives the struct an
"infinite" padding field. The algorithm to detect looks almost identical
to the ordinary padding expansion except this class can only occur with
the flex array at the end of the struct.
=== Anonymous Enums ===
Another issue that comes up when comparing ABI across wide swaths of
kernel commit history are changes to anonymous enums. From what I can
tell, there's not a great way to handle this. If someone adds a new
anonymous enum, the tag abidiff gives (e.g. enum __anonymous_enum__6)
can change, so abidiff reports a new anonymous enum with all the same
fields even though it was basically just a "rename".
For reference, this file is full of anonymous enums:
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool_netlink.h#L15.
Basically any change in there is triggering a failure from the script.
I'm not sure how to handle these changes... there are possible ABI
breakages when changing anonymous enums, but it's unclear to me how to
detect them.
Again, thanks so much for taking the initiative here. Let me know what
you think about the above topics and what next steps we could take!
- John
next prev parent reply other threads:[~2023-04-24 18:39 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-11 0:45 John Moon
2023-04-16 18:33 ` Trilok Soni
2023-04-21 18:21 ` Dodji Seketeli
2023-04-21 20:03 ` Dodji Seketeli
2023-04-24 18:39 ` John Moon [this message]
2023-05-10 14:21 ` Dodji Seketeli
2023-05-23 19:59 ` John Moon
2023-07-05 16:52 ` John Moon
2023-10-05 13:44 ` Support suppressing data members inserted right before flexible array members (was Re: abidiff improvements for kernel UAPI checker) Dodji Seketeli
2023-07-10 10:55 ` abidiff improvements for kernel UAPI checker Dodji Seketeli
2023-09-22 11:39 ` Dodji Seketeli
2023-09-22 11:51 ` Dodji Seketeli
2023-09-22 18:28 ` John Moon
2023-09-22 20:02 ` John Moon
2023-09-26 8:38 ` Dodji Seketeli
2023-09-27 17:37 ` John Moon
2023-09-29 9:52 ` Dodji Seketeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=340b33bd-2b43-9f99-58e1-f1b77a51b48a@quicinc.com \
--to=quic_johmoo@quicinc.com \
--cc=dodji@seketeli.org \
--cc=libabigail@sourceware.org \
--cc=quic_satyap@quicinc.com \
--cc=quic_tsoni@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).