From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) by sourceware.org (Postfix) with ESMTPS id 070533858D1E for ; Mon, 24 Apr 2023 18:39:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 070533858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=quicinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=quicinc.com Received: from pps.filterd (m0279871.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33OHh09F007268; Mon, 24 Apr 2023 18:39:53 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=qcppdkim1; bh=OK4Gbf0f/P9cMma0qvwnSsgIBzuTqcvip+l7MBRNnhg=; b=OzqtlVK3BkykO9VXhfbqRKKP0DQsH5benVYOCNH6y7IHeAAClAmLBwXWB9Kp/wRWyDS0 GQw10XLTaSUolc2eip6jO8CBh+0/tWnWo0lmykpINRhM5C8MTq60pl97DoSz8ccNgWQH AAmmM0iqimHsjFIzd3oTfcOpyzuuYXB594e1MAIrE1lfVfULEiliusMeh7C1cVccK10Q R1npsfqyeT9RXepIyTYmJ4HXtmymq2fsfbh856aj8WTr0foB5+41l/YhUmBfKdiO3caX lRM48mmbYnyEzxgJhzGNUXtXPDCDFHmCfI6WWg8GawKtbsArzH3SVlY4G14XhwjzOdE/ lQ== Received: from nalasppmta05.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3q5qv6s4we-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Apr 2023 18:39:53 +0000 Received: from nalasex01c.na.qualcomm.com (nalasex01c.na.qualcomm.com [10.47.97.35]) by NALASPPMTA05.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTPS id 33OIdqo9023411 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 24 Apr 2023 18:39:52 GMT Received: from [10.110.66.28] (10.80.80.8) by nalasex01c.na.qualcomm.com (10.47.97.35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.42; Mon, 24 Apr 2023 11:39:49 -0700 Message-ID: <340b33bd-2b43-9f99-58e1-f1b77a51b48a@quicinc.com> Date: Mon, 24 Apr 2023 11:39:48 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Subject: Re: abidiff improvements for kernel UAPI checker To: Dodji Seketeli CC: , Trilok Soni , "Satya Durga Srinivasu Prabhala" References: <5363161d-8167-284e-e35d-9a8ef20adea9@quicinc.com> <877cu5t7tw.fsf@seketeli.org> <87354tt32o.fsf@seketeli.org> Content-Language: en-US From: John Moon In-Reply-To: <87354tt32o.fsf@seketeli.org> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 8bit X-Originating-IP: [10.80.80.8] X-ClientProxiedBy: nasanex01a.na.qualcomm.com (10.52.223.231) To nalasex01c.na.qualcomm.com (10.47.97.35) X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: E2cJnQIMkCKn2QYoCnaheJQEYfAInABY X-Proofpoint-ORIG-GUID: E2cJnQIMkCKn2QYoCnaheJQEYfAInABY X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-24_11,2023-04-21_01,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 spamscore=0 adultscore=0 impostorscore=0 lowpriorityscore=0 mlxlogscore=999 bulkscore=0 mlxscore=0 suspectscore=0 malwarescore=0 priorityscore=1501 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2304240168 X-Spam-Status: No, score=-5.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 4/21/2023 1:03 PM, Dodji Seketeli wrote: > Hello, > > Dodji Seketeli a écrit: > >> >> I think feature this would not be difficult to add and would indeed be a >> useful feature for others, I believe. >> >> Would it be okay if you'd have to carry a suppression specification >> file that would have: >> >> $ cat filter-enums.suppr >> >> [suppress_type] >> type_kind = enum >> has_last_enumerator_change_only = true >> >> >> Then you'd invoke the tool by doing: >> >> $ abidiff --suppr filter-enum.suppr binary-version1 binary-version2 >> >> To get an idea of what suppression specifications do, you can read more >> at https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppression-specifications. >> >> Please note that the "has_last_enumerator_change_only" property is not >> yet implemented. I'd like to know first if this "user interface" would >> fit your use case. If it does then we can go ahead and add it. > > Actually, I went ahead and played a little with this idea, in case you'd > be interested, so I have a patched libabigail tree with the above > implemented. The actual property to set in the suppression > specification file is "allow_last_enumerator_change_only". > > Here is an example of what it does: > > $ cat -n test0-v0.c > 1 enum enum_type > 2 { > 3 E0, > 4 E1, > 5 E_LAST > 6 }; > 7 > 8 struct some_type > 9 { > 10 enum enum_type m; > 11 }; > 12 > 13 void > 14 foo(struct some_type* s __attribute__((unused))) > 15 { > 16 } > $ cat -n test0-v1.c > 1 enum enum_type > 2 { > 3 E0, > 4 E1, > 5 E2, > 6 E_LAST > 7 }; > 8 > 9 struct some_type > 10 { > 11 enum enum_type m; > 12 }; > 13 > 14 void > 15 foo(struct some_type* s __attribute__((unused))) > 16 { > 17 } > $ > > $ ./build/tools/abidiff test0-v0.o test0-v1.o || echo "exit code: $?" > Functions changes summary: 0 Removed, 1 Changed, 0 Added function > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable > > 1 function with some indirect sub-type change: > > [C] 'function void foo(some_type*)' at test0-v0.c:14:1 has some indirect sub-type changes: > parameter 1 of type 'some_type*' has sub-type changes: > in pointed to type 'struct some_type' at test0-v1.c:9:1: > type size hasn't changed > 1 data member change: > type of 'enum_type m' changed: > type size hasn't changed > 1 enumerator insertion: > 'enum_type::E2' value '2' > 1 enumerator change: > 'enum_type::E_LAST' from value '2' to '3' at test0-v1.c:1:1 > > exit code: 4 > $ cat last-enum-change-only.suppr > [suppress_type] > type_kind = enum > allow_last_enumerator_change_only = yes > $ > $ /home/dodji/git/libabigail/filter-last-enumerator-change/build/tools/abidiff --suppr ./last-enum-change-only.suppr test0-v0.o test0-v1.o && echo "exit code: $?" > Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable > > exit code: 0 > $ > > > The git tree that you can check you, compile and test at your > convenience is https://sourceware.org/git/?p=libabigail.git;a=shortlog;h=refs/heads/users/dodji/filter-last-enumerator-change. > > The patch in question is this one: https://sourceware.org/git/?p=libabigail.git;a=commit;h=239ecaa7dc5de9e9a37650248aa00b5cfc2e578e > > The git command to use would be: > > $ git clone -b 'users/dodji/filter-last-enumerator-change' git://sourceware.org/git/libabigail.git > > Once you've got the necessary dependencies for your system, I guess > compiling the tree would take: > > $ autoreconf > $ configure --prefix= > $ make -j > $ make install > > Then play with /your/prefix/usr/bin/abidiff. > > [...] > > I hope this helps to at least start a productive discussion. > > Cheers, > Hi Dodji, Thanks so much for getting the ball rolling on this! I agree that suppressions seem to be the correct mechanism here and I don't foresee any issue using them from the check script we're working on. Before seeing this, I had gotten started on implementing some (hacky) filters based on the output of abidiff. While testing on the Linux codebase, I found some interesting corner cases. I'll go into the details here so hopefully we can have more discussion and design the right solution. There are basically three classes of "false positives" that we'd like to eventually be able to filter: 1) Enum expansion 2) Padding field expansion 3) Flex array expansion === Enum expansion === Here are some places in the kernel where the script reported expansion as a breaking change: One "MAX" member: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool.h#L1789 Two "MAX" members: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L6398 Three "MAX" members: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/nl80211.h#L3311 Having two is pretty common. I can only find the one occurrence of having three of these variants... maybe the suppression could take an "n_allowed" parameter of some sort? Also, would it make sense to limit the check to enumerator variants that have somewhat common names for that case? It could evolve with time, but so far I've found "MAX", "LAST", "NUM", and "NBITS" being fairly common. Maybe the suppression could also take "allowed_strings"? Something like this? [suppress_type] type_kind = enum allow_n_last_enumerator_changes = 3 allow_n_last_enumerator_change_names = [".*MAX.*", ".*LAST.*", ".*NUM.*", ".*NBITS.*"] This is the algorithm I was working on to detect these changes: - The type size did not change - The change is to an enum - The number of changed variants is < 4 - The changed variants contain at least one of the above strings - The changed variants' values are all >= the last newly-inserted value === Padding field expansion === For example, you'll see lots of padding fields like this: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/fuse.h#L908 These don't necessarily have to occur at the end of a struct. You could expand into a padding field in the middle somewhere. This is the algorithm I was working on to detect these changes: - The type size did not change - There is only one data member being changed - The changed member name contains at least one of: "pad", "reserved", "end", or "last" - The first new member being inserted has an offset equal to the changed member's previous offset === Flex array expansion === For example: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/rseq.h#L149. Leaving a flex array at the end essentially gives the struct an "infinite" padding field. The algorithm to detect looks almost identical to the ordinary padding expansion except this class can only occur with the flex array at the end of the struct. === Anonymous Enums === Another issue that comes up when comparing ABI across wide swaths of kernel commit history are changes to anonymous enums. From what I can tell, there's not a great way to handle this. If someone adds a new anonymous enum, the tag abidiff gives (e.g. enum __anonymous_enum__6) can change, so abidiff reports a new anonymous enum with all the same fields even though it was basically just a "rename". For reference, this file is full of anonymous enums: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ethtool_netlink.h#L15. Basically any change in there is triggering a failure from the script. I'm not sure how to handle these changes... there are possible ABI breakages when changing anonymous enums, but it's unclear to me how to detect them. Again, thanks so much for taking the initiative here. Let me know what you think about the above topics and what next steps we could take! - John