From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mark@klomp.org>
Received: from gnu.wildebeest.org (gnu.wildebeest.org [45.83.234.184])
 by sourceware.org (Postfix) with ESMTPS id A6B5D3857BB1
 for <overseers@sourceware.org>; Mon, 30 May 2022 22:01:11 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A6B5D3857BB1
Received: from reform (deer0x07.wildebeest.org [172.31.17.137])
 (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by gnu.wildebeest.org (Postfix) with ESMTPSA id 1524530006D6
 for <overseers@sourceware.org>; Tue, 31 May 2022 00:01:07 +0200 (CEST)
Received: by reform (Postfix, from userid 1000)
 id 62B272E8145F; Tue, 31 May 2022 00:01:07 +0200 (CEST)
Date: Tue, 31 May 2022 00:01:07 +0200
From: Mark Wielaard <mark@klomp.org>
To: overseers@sourceware.org
Subject: GNU Toolchain Infrastructure at sourceware
Message-ID: <YpU+o8C8hyFsgtuP@wildebeest.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL,
 KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: overseers@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Overseers mailing list <overseers.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/overseers>,
 <mailto:overseers-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/overseers/>
List-Post: <mailto:overseers@sourceware.org>
List-Help: <mailto:overseers-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/overseers>,
 <mailto:overseers-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Mon, 30 May 2022 22:01:13 -0000

Hi,

Here is a quick overview of some of the GNU Toolchain Infrastructure
work around CI/CD and how to integrate that better with the email
based workflow used on sourceware. Nothing that hasn't been discussed
before, but it might be helpful to have it all in one place. And to
see if we are moving in the right direction.

It seems these services are fairly low on resource usage, at least
compared to other web, git, email and bugzilla services. But please
take a look in case there are any concerns about the sustainability or
scalability on the current hosting infrastructure.

builder.sourceware.org has been running for a couple of months
now. The buildbot process seems to have a low load on the machine. Low
single digit %CPU. The state database has grown to ~800MB, the
gitpoller-work dir is ~4GB (that is mainly gcc ~2G, libabigail ~750M
and binutils-gdb ~600M, others are < 200M).

We have native/VM workers for ppc64le, s390x, ppc64, i386, arm64 and
armhf for debian, fedora and centos (although not all combinations)
and x86_64 container builders for fedora, debian and opensuse. See
https://builder.sourceware.org/ for the current sponsors. There are a
couple of other people who have offered runners. So at the moment
there is enough room for expansion of the builders. If we need more,
or more powerful runners, I intend to ask the GCC Compile Farm
maintainers or the OSUOSL.

There are 80 builders on 14 workers, doing ~120 builds a day (more on
weekdays, fewer on weekends). With the newly added opensuse
container builders this will likely jump to ~150 a day. There are a
couple of full testsuite builders (for gcc and binutils-gdb), but most
builders are "quick" CI builders, which will send email whenever a
regression is detected. They seem to catch and report a couple of
issues a week across all projects. There have been various flaky tests
but those all seem to have been disabled now.

Various builders upload their test results to the bunsendb so they can
be analysed. I am slightly worried about how long these uploads
take. For example the upload of results from the elfutils builders
takes almost half of the total build time (4 to 5 out of 10 to 12
minutes). Especially on the slower workers this can cause them to fall
behind on the build queue. The bunsendb.git is currently ~400MB and
contains ~2000 results. It seems to compress fairly well and could
maybe use periodic repacking (making a local clone reduces the size
to ~260MB).
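
To put rough numbers on that, here is a small back-of-the-envelope
calculation. It only uses the approximate figures quoted above; the
variable names are made up for illustration.

```python
# Figures quoted above; all values are approximate.
upload_minutes = (4, 5)      # time spent uploading results
build_minutes = (10, 12)     # total build time, including the upload

# Smallest and largest share of the build spent on the upload.
low = upload_minutes[0] / build_minutes[1]
high = upload_minutes[1] / build_minutes[0]

repo_mb = 400                # current bunsendb.git size
clone_mb = 260               # size of a fresh local clone
saving = (repo_mb - clone_mb) / repo_mb

print(f"upload share of build: {low:.0%} to {high:.0%}")
print(f"possible saving from repacking: {saving:.0%} ({repo_mb - clone_mb}MB)")
```

So the upload is somewhere between a third and a half of the build,
and a repack might reclaim roughly a third of the repository size.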

There is still a small TODO list for the buildbot:
https://sourceware.org/git/?p=builder.git;a=blob;f=TODO
But maintenance is now minimal with the builders just running without
needing supervision. Discussions have moved to the
buildbot@sourceware.org mailing list.

It is now up to the projects to provide testsuites (or subsets,
possibly split by arch) that are stable, and to act when the buildbot
reports a regression. The bunsendb should be able to help with
selecting non-flaky tests.

One of the workers runs on sourceware itself. It is currently used to
reconfig the buildbot itself (meta!). But projects can also use it to
run tasks on specific commits or periodically. These are tasks which
are now often done by a cron job or git hook, for example updating
documentation or websites, generating release tars or updating
bugzilla. The advantage over cron jobs is that a task can run
immediately and/or only when really needed, based on the files a
commit touches. The advantage over git hooks is that the tasks run in
the builder context, not in the context of the specific user that
pushed a commit.
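
As a rough sketch of the "only when really needed" part: buildbot
schedulers accept a fileIsImportant callable that is handed a change
and decides whether it is worth building. The predicate below is
written so it could be used that way, but the patterns and names are
hypothetical and not taken from the actual builder.git config.

```python
from fnmatch import fnmatch
from types import SimpleNamespace

# Hypothetical patterns: which paths count as documentation changes.
DOC_PATTERNS = ["doc/*", "*.texi", "NEWS"]

def touches_docs(change):
    """Return True if any file in the change matches a doc pattern.

    `change` only needs a `files` attribute holding a list of paths.
    """
    return any(fnmatch(path, pat)
               for path in change.files
               for pat in DOC_PATTERNS)

# Stand-ins for real change objects, just for trying the predicate.
doc_commit = SimpleNamespace(files=["doc/readelf.1", "src/readelf.c"])
code_commit = SimpleNamespace(files=["src/readelf.c"])
print(touches_docs(doc_commit), touches_docs(code_commit))
```

A documentation update task would then only be triggered for the first
commit, not the second.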

The current builder CI checks what has been committed on the main
branch of the projects. This makes sure that what is checked out is in
a good state and any pushed regressions are found early and often. Now
that most of these builders are green we can start watching user/try
branches. So when a user pushes to their try branch the same builder
CI checks are run, so a project developer knows their proposed
patch(es) won't break the build or introduce regressions.
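
Watching try branches mostly comes down to recognising them by name.
A minimal sketch, assuming a users/<name>/try-<topic> naming scheme
(that scheme is an assumption for illustration; the real convention
may differ):

```python
import re

# Assumed naming scheme: users/<name>/try-<topic>. Adjust to the real one.
TRY_BRANCH = re.compile(r"^users/(?P<user>[^/]+)/try-(?P<topic>.+)$")

def try_branch_owner(branch):
    """Return the user name for a try branch, or None for other branches."""
    m = TRY_BRANCH.match(branch)
    return m.group("user") if m else None

for b in ["main", "users/mark/try-fix-readelf", "users/mark/wip"]:
    print(b, "->", try_branch_owner(b))
```

The owner name could then be used to send the CI results to just that
developer instead of to the project list.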

The above only helps developers that have commit access on sourceware,
but not others who send in patches. For that we have
https://patchwork.sourceware.org/ plus the CICD trybot that DJ wrote
(https://sourceware.org/glibc/wiki/CICDDesign). To make this work
better and connect it to the buildbot we need to upgrade to the latest
patchwork (and update django).

The current trybot doesn't do authentication, which might not be OK
for all builders. So we want to either require checking for known GPG
keys on the patch emails or let a trusted developer set a flag in
patchwork. Once we have public-inbox set up we could also use b4 for
DKIM attestation for known/trusted hackers.
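
The either/or policy above can be written down as a simple gate. This
is only a sketch of the decision, not how the trybot or patchwork
work today: the Patch type, the fingerprint set and the function name
are all hypothetical, and actually verifying the signature is assumed
to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of key fingerprints for known developers.
KNOWN_FINGERPRINTS = {"0123456789ABCDEF0123456789ABCDEF01234567"}

@dataclass
class Patch:
    # Fingerprint of a verified signature, None if unsigned or unverified.
    signer_fingerprint: Optional[str] = None
    # Set by a trusted developer (for example through a patchwork flag).
    trusted_flag: bool = False

def may_run_on_builders(patch):
    """Allow a patch if it has a known signer or a trusted developer's flag."""
    signed_by_known_key = patch.signer_fingerprint in KNOWN_FINGERPRINTS
    return signed_by_known_key or patch.trusted_flag

print(may_run_on_builders(Patch()))                   # unsigned, unflagged
print(may_run_on_builders(Patch(trusted_flag=True)))  # flagged by a developer
```

Builders that don't need the protection could simply skip the check.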

Some projects have already experimented with public-inbox. But we
don't have an instance running on sourceware itself yet. This would
resolve complaints about the mailman archives not being very usable.

And for people wanting a more forge-like experience we are already
running a mirror of sourceware projects at sourcehut:
https://sr.ht/~sourceware/
This allows a user on sourcehut to fork any project, prepare their
patches and submit a merge request through email (without having to
locally set up git send-email or smtp - the patch emails are generated
server side).

The sourcehut mirror is currently read-only. Sourcehut is designed
around email based workflows, is fully Free Software, doesn't use
javascript and is much faster and less resource hungry than
(proprietary) alternatives. The sourcehut beta will have groups
support. We can test a self-hosted instance then. The various sr.ht
components are very modular so we can use only those parts we need.

Cheers,

Mark