From: Dodji Seketeli
To: "Frank Ch. Eigler"
Cc: libabigail@sourceware.org, woodard@redhat.com
Subject: Re: idea: abigail abixml archive
Date: Fri, 17 Nov 2023 13:59:19 +0100
Message-ID: <87pm08y1qw.fsf@seketeli.org>
In-Reply-To: <20231115155306.GC15862@redhat.com>

Hello,

"Frank Ch. Eigler" wrote:

> Hi -
>
> I'd love some feedback about the following idea, related to using
> libabigail to assemble a crowdsourced database of abixml files for
> linux distros.
>
> The germ of the idea is that developers may need to know whether a
> binary they built or found is likely to be abi-compatible with a given
> distro / version.

Yes, if my memory serves, Ben Woodard (Cc'd on this message) was the
first person I heard talking about this feature.
I am guessing this is useful for users who build a binary on one distro
and would like to know whether they can run it on another distro without
resorting to things like containers.  Am I right in expressing the user
need here, or am I missing something?

> This is possible today by downloading the target distro binaries and
> running libabigail locally against them, or using front-end scripts
> like fedabipkgdiff that do the downloading first.

Today, even if you have access to the target distro, it's not practical
with the current tools to know if the binary is "ABI compatible" with
it.  If I understand things correctly, you would need to:

  1/ Get the ABI of the (transitive closure) set of dependencies of the
     binary on its original distro.  I call it orig-deps-abi.

  2/ Get the ABI of the set of dependencies of the binary on the target
     distro.  I call it target-deps-abi.

  3/ Compare target-deps-abi against orig-deps-abi.

abicompat, for instance, hasn't been designed for that.  We would need
to either extend it to support 1-3 or come up with a new tool for that.

> But this is a pain if one wants to compare against a range of versions
> or foreign distros.

Indeed.

> So the idea is instead to let people use a public archive of abixml
> artifacts instead of the binaries.  The abixml files are relatively
> tiny, barely-ever changing, and should be an effective proxy for the
> real binaries.  It's just a small matter of (a) storing, (b)
> using, and (c) collecting it.

OK.  And so the tool that does 1-3 would just download the archives of
the source and target distributions and do the work locally.  Correct?

> --------------------
>
> For storing this data, I envision overloading the libabigail git repo
> (or a new one) with storage of the abixml documents.

I am guessing the new tool itself would be yet another libabigail tool
alongside the 6 we already have, so it would be in the libabigail git
repo.  But the distro ABI archives would be better hosted in a separate
git repo somewhere.
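
Coming back to steps 1/ to 3/ above: to make them a bit more concrete,
here is a rough sketch of what can be approximated with the tools that
exist today.  It is not the tool I have in mind.  It compares the whole
ABI of each dependency rather than only the part the binary actually
uses, so it will over-report.  The helper names (parse_ldd,
dump_deps_abi, is_incompatible, compare_deps_abi) and the flat "one
abixml per library basename" directory are assumptions made up for the
illustration.

```shell
#!/bin/sh
# Rough approximation of steps 1/ to 3/.  All function names and the
# flat directory layout are hypothetical, for illustration only.

# Read `ldd` output on stdin and print the resolved library paths.
# Lines without "=> /path" (vdso, the loader, "not found") are skipped.
parse_ldd() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Steps 1/ and 2/: run once on each distro.  Dumps the ABI of every
# dependency of binary $1 into directory $2, one abixml per library.
dump_deps_abi() {
    mkdir -p "$2"
    ldd "$1" | parse_ldd | while IFS= read -r lib; do
        abidw --out-file "$2/$(basename "$lib").xml" "$lib"
    done
}

# abidiff sets bit 8 of its exit status on an incompatible ABI change.
is_incompatible() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Step 3/: compare target-deps-abi (dir $2) against orig-deps-abi (dir $1).
# Returns non-zero if a library is missing or changed incompatibly.
compare_deps_abi() {
    status=0
    for orig in "$1"/*.xml; do
        [ -f "$orig" ] || continue      # glob matched nothing
        name=$(basename "$orig" .xml)
        target=$2/$name.xml
        if [ ! -f "$target" ]; then
            echo "missing on target: $name"
            status=1
            continue
        fi
        abidiff "$orig" "$target" >/dev/null
        if is_incompatible $?; then
            echo "incompatible: $name"
            status=1
        fi
    done
    return $status
}
```

One would run dump_deps_abi on each distro, bring the two directories
together, and call compare_deps_abi on them.  Errors reported by abidiff
itself (the low bits of its exit status) are ignored here.
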
Users might also have their own private distro ABI repo somewhere, as
they see fit.  WDYT?

Heck, it could even just be a directory tree served by whatever
transport protocol users see fit.  Git would be our preferred choice, of
course.  This is similar to the way distro packages themselves are
organized today.

> To keep it dead simple, there could be one branch per /etc/os-release
> $ID/$VERSION_ID, one file per shared library in the distribution.  For
> example, a fedora-39-x86-64 copy of /usr/lib64/libc.so.6, the file
> abidw produces could sit at
>
>     repo   git://sourceware.org/git/libabigail.git
>     branch gitabixml/fedora/39/x86_64
>     file   /usr/lib64/libc.so.6.xml

I would even go further and put the abixml of the binary inside a
subdirectory tree carrying all that information, making it somewhat
independent from being in git:

>     repo   git://sourceware.org/git/distributions-abi.git
>     file   /fedora/39/x86_64/glibc/usr/lib64/libc.so.6.xml

Storing information about the package would be useful, for instance, to
handle conflicting packages that might have binaries with the same path.

> (Symlinks in the distro fs could be represented as symlinks in git.)

ACK.

> Updates to the distro package of course happen.  It seems natural to
> update the abixml file for the affected file(s) right there in place.

Yes.  Updating an abixml file would just be an overwrite.  We would not
need to handle merges etc.

> Since it may sometimes be desirable to track what package version
> (e.g. rpm n-v-r) is associated with the abixml data of a given
> version, we could use stylized records in the git commit text (or a
> git note, or maybe a tag).  That would mean one git commit per updated
> package, with metadata message like:
>
>     Package: glibc-2.38-7.fc39.x86_64
>
> Maybe abidw version tags would be useful to add.

I am not sure what you mean by abidw version tag.  A possible way to
handle this in a way that is not dependent on Git would be to store the
originating n-v-r in the abixml directly.
The tool that emits the abixml (from the original package) would be able
to do that.

> --------------------
>
> For using this data, I envision abidiff / abicompat taking a new form
> for its right operand.  It could be a git url identifying the distro
> branch or tag.  libabigail would fetch the corresponding file.xml
> within that.  Simplify/default the heck out of it for ease of use:
>
>     export BRANCH=fedora/39/x86_64
>     abicompat /bin/myprogram gitabixml:$BRANCH
>
> (Where "gitabixml:" could instruct the tool to look at the sourceware
> libabigail git / gitweb / cgit server.  Let users specify different or
> private git servers via environment variables or something.)

Yes, something like that.  I guess the specifics will depend on what we
end up settling on for the above (and below).

> --------------------
>
> For collecting this data, I envision writing some distro-specific
> scripts, kind of like fedabipkgdiff, being run by contributors or
> ourselves.  One flavour could run in operational installed distros,
> doing the equivalent of
>
>     find $PATHS -name '*.so.*' | while read -r lib; do
>         # or filter with elfclassify
>         package=`rpm -qf "$lib"`
>         # mirror the library's directory inside the repo
>         mkdir -p "$gitrepo`dirname "$lib"`"
>         abidw "$lib" > "$gitrepo$lib.xml"
>         (cd "$gitrepo"; git add ".$lib.xml";
>          git commit -m "Package: $package" ".$lib.xml")
>     done
>
> and rerun that occasionally as updates flow down from the distro.
> This could be done on a single beefy box running containers with
> different distros.

Yes, I like the idea of having something incremental like this.  There
would probably be some tweaks added to abidw to let it add a "version
string" (the N-V-R mentioned earlier) to the abixml, but these are
details.

Also, we need to add a mode to libabigail to let it rely on debuginfod
to find the debuginfo, because today it expects the user to provide the
debug info location in cases where it's not already installed on the
system.  Again, this is a detail, but it's going to matter as soon as
the rubber hits the road.
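
To make the incremental collection a bit more concrete, here is a
minimal sketch of one such pass that writes into the package-qualified
tree I suggested earlier (<id>/<version>/<arch>/<package>/<path>.xml).
The function names abixml_path and collect_abixml are hypothetical, and
so is the choice to skip symlinks and record the N-V-R only in the
commit message (abidw cannot embed it today).  It assumes an rpm-based
distro and an archive directory that is already a git work tree.

```shell
#!/bin/sh
# Hypothetical helper: map (distro id, version, arch, package name,
# absolute library path) to a path relative to the archive root.
abixml_path() {
    printf '%s/%s/%s/%s%s.xml\n' "$1" "$2" "$3" "$4" "$5"
}

# Sketch of one incremental collection pass.  Nothing runs until
# collect_abixml is called explicitly:
#   collect_abixml ARCHIVE_ROOT DIR...
collect_abixml() {
    archive=$1; shift
    . /etc/os-release                   # provides ID and VERSION_ID
    arch=$(uname -m)
    # Regular files only; symlinks are left out of this sketch.
    find "$@" -name '*.so.*' -type f | while IFS= read -r lib; do
        nvra=$(rpm -qf "$lib") || continue      # not owned by a package
        name=$(rpm -qf --queryformat '%{NAME}\n' "$lib")
        rel=$(abixml_path "$ID" "$VERSION_ID" "$arch" "$name" "$lib")
        mkdir -p "$(dirname "$archive/$rel")"
        abidw --out-file "$archive/$rel" "$lib" || continue
        git -C "$archive" add -- "$rel"
        # Commit only when the abixml actually changed.
        git -C "$archive" diff --cached --quiet -- "$rel" ||
            git -C "$archive" commit -q -m "Package: $nvra" -- "$rel"
    done
}
```

Rerunning the same call after a distro update would then only produce
commits for the libraries whose abixml changed.
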
> Another flavour could be to take a set of RPM/etc. archives on a
> filesystem (or an ISO image), incrementally decompress them, run abidw
> on the individual files, and similarly construct the git repo of
> abixml files.  (This is kind of like how debuginfod produces indexes
> from a bunch of RPMs.)

ACK.

> No matter how the local git repo is populated, each branch describing
> a data contributor's distro could be pushed to the central one,
> bringing that one up to date.  Patches representing updates could be
> emailed too, but no one will want to read/review that stuff.  We'd
> probably need a trusted pool of contributors who can just commit to
> areas of the central git repo.  Secured with gitsigur of course. :-)
>
> The central repo could be built up entirely gradually.  If some
> libraries were omitted from initial commits for a distro, a later
> contribution could fill in the gaps.

ACK.

Thank you for putting these thoughts together.  The idea is getting much
less abstract in my mind now.

Cheers,

-- 
Dodji