From: Giuliano Procida
Date: Tue, 21 Nov 2023 14:54:05 +0000
Subject: Re: idea: abigail abixml archive
References: <20231115155306.GC15862@redhat.com>
In-Reply-To: <20231115155306.GC15862@redhat.com>
To: "Frank Ch.
 Eigler"
Cc: libabigail@sourceware.org, Matthias Männich

Hi.

On Wed, 15 Nov 2023 at 15:53, Frank Ch. Eigler wrote:
>
> Hi -
>
> I'd love some feedback about the following idea, related to using
> libabigail to assemble a crowdsourced database of abixml files for
> Linux distros.
>
> The germ of the idea is that developers may need to know whether a
> binary they built or found is likely to be abi-compatible with a given
> distro / version.  This is possible today by downloading the target
> distro binaries and running libabigail locally against them, or using
> front-end scripts like fedabipkgdiff that do the downloading first.
> But this is a pain if one wants to compare against a range of versions
> or foreign distros.
>
> So the idea is to let people use a public archive of abixml
> artifacts instead of the binaries.  The abixml files are relatively
> tiny, barely-ever changing, and should be an effective proxy for the
> real binaries.  It's just a small matter of (a) storing, (b)
> using, and (c) collecting it.
>
> --------------------
>
> For storing this data, I envision overloading the libabigail git repo
> (or a new one) with storage of the abixml documents.  To keep it dead
> simple, there could be one branch per /etc/os-release $ID/$VERSION_ID,
> one file per shared library in the distribution.
> For example, for a
> fedora-39-x86-64 copy of /usr/lib64/libc.so.6, the file abidw produces
> could sit at
>
>     repo git://sourceware.org/git/libabigail.git
>     branch gitabixml/fedora/39/x86_64
>     file /usr/lib64/libc.so.6.xml
>
> (Symlinks in the distro fs could be represented as symlinks in git.)
>
> Updates to the distro package of course happen.  It seems natural to
> update the abixml file for the affected file(s) right there in place.
>
> Since it may sometimes be desirable to track what package version
> (e.g. rpm n-v-r) is associated with the abixml data of a given
> version, we could use stylized records in the git commit text (or a
> git note, or maybe a tag).  That would mean one git commit per updated
> package, with a metadata message like:
>
>     Package: glibc-2.38-7.fc39.x86_64
>
> Maybe abidw version tags would be useful to add.
>
> --------------------
>
> For using this data, I envision abidiff / abicompat taking a new form
> for its right operand.  It could be a git url identifying the distro
> branch or tag.  libabigail would fetch the corresponding file.xml
> within that.  Simplify/default the heck out of it for ease of use:
>
>     export BRANCH=fedora/39/x86_64
>     abicompat /bin/myprogram gitabixml:$BRANCH
>
> (Where "gitabixml:" could instruct the tool to look at the sourceware
> libabigail git / gitweb / cgit server.  Let users specify different or
> private git servers via environment variables or something.)
>
> --------------------
>
> For collecting this data, I envision writing some distro-specific
> scripts, kind of like fedabipkgdiff, being run by contributors or
> ourselves.
> One flavour could run in operational installed distros,
> doing the equivalent of
>
>     find $PATHS -name '*.so.*' | while read -r lib; do
>         # or filter with elfclassify
>         package=$(rpm -qf "$lib")
>         # mirror the library's absolute path under $gitrepo
>         mkdir -p "$gitrepo$(dirname "$lib")"
>         abidw "$lib" > "$gitrepo$lib.xml"
>         (cd "$gitrepo" && git add ".$lib.xml" && git commit -m "Package: $package" ".$lib.xml")
>     done
>
> and rerun that occasionally as updates flow down from the distro.
> This could be done on a single beefy box running containers with
> different distros.
>
> Another flavour could be to take a set of RPM/etc. archives on a
> filesystem (or an ISO image), incrementally decompress them, run abidw
> on the individual files, and similarly construct the git repo of
> abixml files.  (This is kind of like how debuginfod produces indexes
> from a bunch of RPMs.)
>
> No matter how the local git repo is populated, each branch describing
> a data contributor's distro could be pushed to the central one,
> bringing that one up to date.  Patches representing updates could be
> emailed too, but no one will want to read/review that stuff.  We'd
> probably need a trusted pool of contributors who can just commit to
> areas of the central git repo.  Secured with gitsigur of course. :-)
>
> The central repo could be built up entirely gradually.  If some
> libraries were omitted from initial commits for a distro, a later
> contribution could fill in the gaps.
>
> --------------------
>
> OK, how reasonable does all this sound?

This sounds like an interesting project, but you can go further.

Start at the level of a single binary with a single .so. The
questions we want answered are:

1. Will all undefined symbols resolve successfully (otherwise
   reporting missing symbols)?
2. Are the types of the resolved symbols compatible (otherwise
   reporting differences)?
3. Do we have ABI representations that let us answer 1. and 2. without
   having some binaries to hand? (Not yet.)
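
As a rough illustration of question 1 with binaries to hand (this is
not something libabigail or STG provides), the check for one binary
against one shared object could be sketched with binutils nm. The
function names below are made up for this sketch. It also assumes that
ignoring symbol versions is acceptable and it treats weak undefined
symbols as required, both of which a real tool would need to handle
properly.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of libabigail or STG.

# Print the dynamic symbols of an ELF file that match an nm filter
# (--undefined-only or --defined-only), one name per line, with any
# "@VERSION" suffix stripped so that names compare directly.
dynamic_symbols() {
    nm -D "$1" "$2" | awk '{ print $NF }' | sed 's/@.*$//' | LC_ALL=C sort -u
}

# Given a file of needed names and a file of provided names (both sorted
# as above), print the needed names that nothing provides.
missing_symbols() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Report the undefined symbols of binary $1 that shared object $2 does
# not define. Prints nothing if everything resolves.
check_resolves() {
    tmp=$(mktemp -d) || return 1
    dynamic_symbols --undefined-only "$1" > "$tmp/needed"
    dynamic_symbols --defined-only "$2" > "$tmp/provided"
    missing_symbols "$tmp/needed" "$tmp/provided"
    rm -rf "$tmp"
}
```

Something like "check_resolves /bin/myprogram /usr/lib64/libc.so.6"
would then list whatever the program imports that this one library does
not export. For a binary with several dependencies, the "provided" list
would need to be the union over all of them.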

Neither libabigail nor STG emits undefined symbols in its ABI
representation (yet), so answering 1. currently requires having
binaries and debug information to hand.

Now build this up to multiple binaries, SONAMEs, bundled and unbundled
shared objects, library dependencies (ELF DT_NEEDED entries), multiple
distributions, packages, versions and architectures, support for
plugins loaded with dlopen, etc.

Having a full database of existing libraries would allow compatibility
for a freshly-compiled binary (or package) to be checked (idea due to
Matthias Männich) or dependency hell to be explored without actually
installing any packages.

Similarly, if binaries (and their packages) are in the database,
questions about library upgrades could be answered.

Giuliano.

>
> - FChE
>