public inbox for libabigail@sourceware.org
From: Dodji Seketeli <dodji@seketeli.org>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: libabigail@sourceware.org, woodard@redhat.com
Subject: Re: idea: abigail abixml archive
Date: Fri, 17 Nov 2023 13:59:19 +0100	[thread overview]
Message-ID: <87pm08y1qw.fsf@seketeli.org> (raw)
In-Reply-To: <20231115155306.GC15862@redhat.com> (Frank Ch. Eigler's message of "Wed, 15 Nov 2023 10:53:06 -0500")

Hello,

"Frank Ch. Eigler" <fche@redhat.com> wrote:

> Hi -
>
> I'd love some feedback about the following idea, related to using
> libabigail to assemble a crowdsourced database of abixml files for
> linux distros.
>
> The germ of the idea is that developers may need to know whether a
> binary they built or found is likely to be abi-compatible with a given
> distro / version.

Yes, if my memory serves, Ben Woodard (Cc'd on this message) was the
first person I heard talking about this feature.

I am guessing this is useful for users who build a binary on a distro
and would like to know if they can run it on another distro without
using things like containers.

Am I right in expressing the user need here or am I missing something?

> This is possible today by downloading the target distro binaries and
> running libabigail locally against them, or using front-end scripts
> like fedabipkgdiff that do the downloading first.

Today, even if you have access to the target distro, it's not practical
with the current tools to know if the binary is "ABI compatible" with
it.

If I understand things correctly, you would need to:

    1/ Get the ABI of the (transitive closure) set of dependencies of
    the binary on its original distro.  I call it orig-deps-abi.

    2/ Get the ABI of the set of dependencies of the binary on the
    target distro.  I call it target-deps-abi.

    3/ compare target-deps-abi against orig-deps-abi.

abicompat, for instance, hasn't been designed for that.  We would need
to either extend it to support 1-3 or come up with a new tool for that.
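
To make steps 1-3 a bit more concrete, here is a rough shell sketch.  It
assumes that ldd is an acceptable way to get the transitive closure of
dependencies, and that the target distro's abixml files already sit in a
local directory tree mirroring the library paths.  The function names and
that directory layout are made up for illustration; "abidw --out-file"
and abidiff reading abixml files are existing features.

```shell
#!/bin/sh
# Sketch only: function names and directory layout are illustrative assumptions.

# Step 1a: extract resolved library paths from `ldd` output read on stdin.
# Matching lines look like: "libfoo.so.1 => /usr/lib64/libfoo.so.1 (0x...)".
# The vDSO, the dynamic loader and "not found" entries are skipped.
resolved_deps() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }' | sort -u
}

# Step 1b: emit orig-deps-abi, one abixml file per dependency of binary $1,
# into directory $2, mirroring each library's path.
emit_orig_deps_abi() {
    ldd "$1" | resolved_deps | while read -r lib; do
        mkdir -p "$2$(dirname "$lib")"
        abidw --out-file "$2$lib.xml" "$lib"
    done
}

# Steps 2-3: compare each abixml in the orig tree $1 with its counterpart in
# the target-deps-abi tree $2; report libraries that are missing or differ.
compare_deps_abi() {
    (cd "$1" && find . -name '*.xml') | while read -r rel; do
        if [ ! -f "$2/$rel" ]; then
            echo "missing on target: $rel"
        elif ! abidiff "$1/$rel" "$2/$rel" >/dev/null; then
            echo "ABI differs: $rel"
        fi
    done
}
```

This glosses over real problems, for instance libraries living under
different paths on different distros, so take it as an illustration of the
workflow rather than a design.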

> But this is a pain if one wants to compare against a range of versions
> or foreign distros.

Indeed.

> So the idea is to let people use a public archive of abixml
> artifacts instead of the binaries.  The abixml files are relatively
> tiny, barely-ever changing, and should be an effective proxy for the
> real binaries.  It's just a small matter of (a) storing, (b)
> using, and (c) collecting it.

OK.

And so the tool that does [1-3] would just download the archives of the
source and target distributions and do the work locally.  Correct?

> --------------------
>
> For storing this data, I envision overloading the libabigail git repo
> (or a new one) with storage of the abixml documents.

I am guessing the new tool itself would be yet another libabigail tool
alongside the 6 we already have, so it would be in the libabigail git
repo.

But the distro ABI archives would better be hosted in another git repo
somewhere.  And users might have their own private distro ABI repo
somewhere as they see fit.  WDYT?

Heck, it could even just be in a directory tree served by whatever
transport protocol users would see fit.  Git would be our preferred
choice, of course.  Similarly to the way distro packages themselves are
organized today.

>  To keep it dead simple, there could be one branch per /etc/os-release
> $ID/$VERSION_ID, one file per shared library in the distribution.  For
> example, a fedora-39-x86-64 copy of /usr/lib64/libc.so.6, the file
> abidw produces could sit at
>
>    repo  git://sourceware.org/git/libabigail.git 
>  branch  gitabixml/fedora/39/x86_64
>    file  /usr/lib64/libc.so.6.xml

I would even go further and put the abixml file inside a subdirectory
tree that encodes all that information, making the layout somewhat
independent from git:

    repo  git://sourceware.org/git/distributions-abi.git
    file  /fedora/39/x86_64/glibc/usr/lib64/libc.so.6.xml


Storing information about the package would be useful, for instance, to
handle conflicting packages that might have binaries with the same
path.
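
As a small illustration of that layout, the path of an abixml file inside
the archive could be derived as below.  The function name is hypothetical;
ID and VERSION_ID are the standard /etc/os-release fields mentioned above.

```shell
# Sketch: build the archive-relative path of an abixml file.
# Arguments: distro id, version id, architecture, package name, library path.
abixml_archive_path() {
    printf '%s/%s/%s/%s%s.xml\n' "$1" "$2" "$3" "$4" "$5"
}

# On an installed RPM-based system the arguments could be obtained with:
#   . /etc/os-release
#   pkg=$(rpm -qf --qf '%{NAME}\n' /usr/lib64/libc.so.6)
#   abixml_archive_path "$ID" "$VERSION_ID" "$(uname -m)" "$pkg" /usr/lib64/libc.so.6
```

Two packages shipping the same /usr/lib64/libfoo.so.1 would then land in
two distinct directories instead of overwriting each other.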

> (Symlinks in the distro fs could be represented as symlinks in git.)

ACK.


> Updates to the distro package of course happen.  It seems natural to
> update the abixml file for the affected file(s) right there in place.

Yes. Updating an abixml file would just be an overwrite.  We would not
need to handle merges etc.

> Since it may sometimes be desirable to track what package version
> (e.g. rpm n-v-r) is associated with the abixml data of a given
> version, we could use stylized records in the git commit text (or a
> git note, or maybe a tag).  That would mean one git commit per updated
> package, with metadata message like:
>
>   Package: glibc-2.38-7.fc39.x86_64
>
> Maybe abidw version tags would be useful to add.

I am not sure what you mean by abidw version tag.

A possible way to handle this in a way that is not dependent on Git
would be to store the originating n-v-r in the abixml directly.  The
tool that emits the abixml (from the original package) would be able to
do that.

> --------------------
>
> For using this data, I envision abidiff / abicompat taking a new form
> for its right operand.  It could be a git url identifying the distro
> branch or tag.  libabigail would fetch the corresponding file.xml
> within that.  Simplify/default the heck out of it for ease of use:
>
>   export BRANCH=fedora/39/x86_64
>   abicompat /bin/myprogram gitabixml:$BRANCH
>
> (Where "gitabixml:" could instruct the tool to look at the sourceware
> libabigail git / gitweb / cgit server.  Let users specify different or
> private git servers via environment variables or something.)

Yes, something like that.  I guess the specifics will depend on what we
end up settling on for the above (and below).
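
Just to illustrate the kind of resolution the tool would have to do, here
is a sketch that splits a "gitabixml:" operand into a repository, a branch
and a file, following the branch layout from your example.  Nothing like
this exists in libabigail today; the function name and the
ABIXML_GIT_REPO environment variable are invented for the example.

```shell
# Sketch: split a hypothetical "gitabixml:<branch>" operand into the pieces
# needed to fetch one abixml file.  ABIXML_GIT_REPO is a made-up variable
# that lets users point at a different or private repository.
resolve_gitabixml() {
    operand=$1 lib=$2
    case $operand in
        gitabixml:*) branch=${operand#gitabixml:} ;;
        *) echo "not a gitabixml operand: $operand" >&2; return 1 ;;
    esac
    repo=${ABIXML_GIT_REPO:-git://sourceware.org/git/libabigail.git}
    # Prints: <repo> <branch> <file within the branch>
    printf '%s %s %s\n' "$repo" "gitabixml/$branch" "${lib#/}.xml"
}
```

Once the three pieces are known, the file itself could be obtained with
plain git, for example by fetching the branch and running
"git show <branch>:<file>".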


> --------------------
>
> For collecting this data, I envision writing some distro-specific
> scripts, kind of like fedabipkgdiff, being run by contributors or
> ourselves.  One flavour could run in operational installed distros,
> doing the equivalent of
>
>     find $PATHS -name '*.so.*' | while read -r lib; do
>        # or filter with elfclassify
>        package=$(rpm -qf "$lib")
>        # write the abixml under $gitrepo, mirroring the library's path
>        mkdir -p "$gitrepo$(dirname "$lib")"
>        abidw "$lib" > "$gitrepo$lib.xml"
>        (cd "$gitrepo" && git add ".$lib.xml" &&
>           git commit -m "Package: $package" ".$lib.xml")
>     done
>
> and rerun that occasionally as updates flow down from the distro.
> This could be done on a single beefy box running containers with
> different distros.

Yes, I like the idea of having something incremental like this.

There would probably be some tweaks added to abidw to let it add a
"version string" (the N-V-R mentioned earlier) to the abixml, but these
are details.  Also, we need to add a mode to libabigail that lets it rely
on debuginfod to find the debuginfo, because today it expects the user to
provide the debug info location in cases where it's not already
installed on the system.  Again, this is a detail but it's going to
matter as soon as the rubber hits the road.

> Another flavour could be to take a set of RPM/etc. archives on a
> filesystem (or an ISO image), incrementally decompress them, run abidw
> on the individual files, and similarly construct the git repo of
> abixml files.  (This is kind of like how debuginfod produces indexes
> from a bunch of RPMs.)

ACK.
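
For that second flavour, a minimal sketch could look like the following.
It assumes rpm, rpm2cpio and cpio are available, and it stores the result
under a per-package directory as suggested earlier.  The function names
are hypothetical, and matching shared libraries by file name is a
simplification.

```shell
# Map an extracted file (relative path such as ./usr/lib64/libz.so.1)
# to its abixml location under <dest>/<package>/.
abixml_dest() {
    printf '%s/%s/%s.xml\n' "$1" "$2" "${3#./}"
}

# Sketch: extract RPM $1 into a scratch directory and write abixml files
# for the shared libraries it contains under $2/<package-name>/.
abixml_from_rpm() {
    rpm=$1 dest=$2
    name=$(rpm -qp --qf '%{NAME}' "$rpm") || return 1
    scratch=$(mktemp -d) || return 1
    rpm2cpio "$rpm" | (cd "$scratch" && cpio -idm --quiet)
    (cd "$scratch" && find . -type f -name '*.so*') | while read -r rel; do
        out=$(abixml_dest "$dest" "$name" "$rel")
        mkdir -p "$(dirname "$out")"
        abidw --out-file "$out" "$scratch/$rel"
    done
    rm -rf "$scratch"
}
```

Without the matching debuginfo, abidw can only describe the ELF symbols,
so a real script would also need to extract the debuginfo package and
pass its location with --debug-info-dir.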
>
> No matter how the local git repo is populated, each branch describing
> a data contributor's distro could be pushed to the central one,
> bringing that one up to date.  Patches representing updates could be
> emailed too, but no one will want to read/review that stuff.  We'd
> probably need a trusted pool of contributors who can just commit to
> areas of the central git repo.  Secured with gitsigur of course. :-)
>
> The central repo could be built up entirely gradually.  If some
> libraries were omitted from initial commits for a distro, a later
> contribution could fill in the gaps.

ACK.

Thank you for putting these thoughts together.  The idea is getting much
less abstract in my mind now.

Cheers,

-- 
		Dodji
