From: Paolo Bonzini <bonzini@gnu.org>
To: overseers@sourceware.org
Cc: Bradley Kuhn <bkuhn@sfconservancy.org>
Subject: Role of sourceware for hosted projects
Date: Fri, 23 Sep 2022 16:04:12 +0200
Message-ID: <58991505-d402-bc5f-5ee8-fff48dcde6cd@gnu.org>

Hi all,

I have recently come across the discussions at Cauldron about 
Sourceware, by means of an article on LWN.net.  Even though I have not 
been a contributor to Sourceware projects for several years, they are 
still dear to my heart and I am familiar with several of them. 
Therefore, I would like to share my own observations about what 
sourceware.org can or should provide to the projects it hosts.

First of all, a few obligatory pieces of disclosure.  First, I am 
employed by Red Hat but not for anything related to the GNU Compiler 
Collection or any other Sourceware project.  Second, I was not at 
Cauldron and have only watched the online recording of the first hour of 
the BoFs; I trust the LWN.net editors and their article on the topic to 
provide a faithful account.  Third, even though I have discussed this 
with some of the people involved in the BoFs, that hasn't substantially 
changed my understanding of the situation as it is reflected below.


The obvious observation from the outside is that the two "opposing 
sides" (for lack of a better word) have different priorities on what 
Sourceware needs to provide for them.  There are several reasons for 
this: different projects that people work on, different emotional 
attachment, different jobs, whatnot.  However, both of them make the 
same mistake: they focus on the "next steps" without a full assessment 
of the *current* state of Sourceware.

The first part of the assessment in my opinion should be that most 
projects on sourceware.org are dead.  Some of them (e.g. GSL) have 
already migrated out of Sourceware in fact, but the others should be 
archived ASAP.  Archival means moving them to a different domain *and 
machine*, with redirects put in place on the sourceware.org website. 
The machine should not provide any maintainer of archived projects with 
shell access.  Archived projects can have a static website and have 
their sources available via git/gitweb, but no mailing lists, no bug 
tracking, no wiki, and no code running on the server such as commit 
hooks or CGI scripts.

Archiving these projects or turning their pages into hard redirects will 
leave, if I counted right, roughly ten projects that are alive: gcc, gdb, 
binutils and glibc ("the GNU toolchain"); elfutils, systemtap, 
cygwin/newlib and libabigail ("smaller projects"); and others that are 
alive but barely so (debugedit, dwz, etc.).  Also, most projects outside 
the GNU toolchain have only one main developer.

That's quite a varied lot, enough so that:

1) it does seem weird for Conservancy to be the fiscal sponsor for 
*Sourceware*, which is essentially "a machine" rather than a 
coherent/cohesive set of projects.  *Some* of the smaller projects might 
very well be interested in joining Conservancy, but different 
maintainers may have different requirements or priorities.

2) Migration to IT infrastructure hosted by the Linux Foundation, as in 
the "GTI" proposal, might not take into account the needs of smaller 
projects very well.  But there is no reason for these projects to live 
on the same server or have the same development model; and there's no 
reason for all of the services to be provided by a single server. 
Different services can be easily outsourced to different people, 
companies or external providers.


Based on the above, I think focus should be on simplifying the 
sourceware.org infrastructure, removing unnecessary services based on 
ease of migration, attack surface, and operating cost (both human and 
monetary).  This way Sourceware focuses on what is harder to obtain from 
external providers, and projects can use the best external 
infrastructure for everything else they need.

A prime target for this simplification, based mostly on my experience 
with QEMU, is source control and CI.  My understanding is that GTI comes 
from maintainers that are not confident that Sourceware can provide 
enough security---not because they do not trust the overseers, but 
rather because things go well until they don't.  For this reason, source 
control is the main concern of the people behind the GTI proposal.  If 
the bigger projects are worried about supply chain attacks, to begin 
with they can very easily migrate their source code control repositories 
elsewhere.  It could be operated by LF staff, or could be a "forge" like 
GitLab/GitHub; several projects use the latter purely for hosting while 
keeping development on mailing lists.  They don't even have to do the 
same thing between the four of them, though it would make sense for at 
least binutils, gcc and gdb to do so. I am rusty on how these three 
projects do release management, but perhaps they could even use a 
monorepo with per-project release branches, and tarballs that only 
include the relevant directories.

But while GTI focuses (as expected) on the four GNU projects that form 
the toolchain, Sourceware infrastructure might not be the most suitable 
even for smaller projects.  In fact, in my humble opinion maintainers of 
smaller and moribund projects should *also* figure out if they still 
need/want the infrastructure provided by sourceware.org.  Of the smaller 
projects, most might be better served by migrating to GitLab or GitHub; 
for example, CI there is easier to use than custom infrastructure based 
on patchwork and buildbot.  Sourcehut is another possibility; it scores 
pretty high on the GNU Ethical Repository Criteria, especially with 
respect to free Javascript, and some maintainers might value that as 
well.  Therefore, even though in the short term Sourceware can also host 
source code control repos for the smaller projects that do not want to 
migrate, I believe that this should be phased out for all non-archived 
projects.  Distributed version control makes migration easy enough, and 
backwards-compatible HTTP-level redirects are easy to set up for git, 
gitweb and also the websites.

The remaining common needs of large and small projects alike seem to be 
mailing lists, where migrating to a new domain and preserving archives 
can both be painful endeavors; bug tracking, for similar preservation 
reasons; and possibly a wiki.  Reducing Sourceware to these three 
services would reduce the operating cost (bandwidth for downloads and 
source control often dwarfs everything else) and the attack surface.

While other services such as patchwork can be included, they fall more 
into a nice-to-have category and are not indispensable if they break.  
This also makes them easier to manage, and not a primary component of 
the sourceware "bus factor".

That's all.  I admit I am not necessarily up to date on how Sourceware 
operates, so I apologize in advance for any incorrect assumptions I 
made.  Apart from that, I hope that this writeup can provide useful 
ideas, or even a path forward for both Sourceware and the GNU toolchain.

Thanks,

Paolo

Thread overview: 4+ messages
2022-09-23 14:04 Paolo Bonzini [this message]
2022-09-23 15:06 ` Frank Ch. Eigler
2022-09-23 21:59 ` Bradley M. Kuhn
2022-09-26  1:26 ` Mark Wielaard
