public inbox for overseers@sourceware.org
* Role of sourceware for hosted projects
@ 2022-09-23 14:04 Paolo Bonzini
  2022-09-23 15:06 ` Frank Ch. Eigler
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Paolo Bonzini @ 2022-09-23 14:04 UTC (permalink / raw)
  To: overseers; +Cc: Bradley Kuhn

Hi all,

I have recently come across the discussions at Cauldron about 
Sourceware, by means of an article on LWN.net.  Even though I have not 
been a contributor to Sourceware projects for several years, they are 
still dear to my heart and I am familiar with several of them. 
Therefore, I would like to share my own observations about what 
sourceware.org can or should provide to the projects it hosts.

First of all, a few obligatory pieces of disclosure.  First, I am 
employed by Red Hat but not for anything related to the GNU Compiler 
Collection or any other Sourceware project.  Second, I was not at 
Cauldron and have only watched the online recording of the first hour of 
the BoFs; I trust the LWN.net editors and their article on the topic to 
provide a faithful account.  Third, even though I have discussed this 
with some of the people involved in the BoFs, that hasn't substantially 
changed my understanding of the situation as it is reflected below.


The obvious observation from the outside is that the two "opposing 
sides" (for lack of a better word) have different priorities on what 
Sourceware needs to provide for them.  There are several reasons for 
this: different projects that people work on, different emotional 
attachment, different jobs, whatnot.  However, both of them make the 
same mistake: they focus on the "next steps" without a full assessment 
of the *current* state of Sourceware.

The first part of the assessment in my opinion should be that most 
projects on sourceware.org are dead.  Some of them (e.g. GSL) have 
already migrated out of Sourceware in fact, but the others should be 
archived ASAP.  Archival means moving them to a different domain *and 
machine*, with redirects put in place on the sourceware.org website. 
The machine should not provide any maintainer of archived projects with 
shell access.  Archived projects can have a static website and have 
their sources available via git/gitweb, but no mailing lists, no bug 
tracking, no wiki, and no code running on the server such as commit 
hooks or CGI scripts.

Archiving these projects or turning their pages into hard redirects will 
leave, if I counted right, roughly ten projects that are alive: gcc, gdb, 
binutils and glibc ("the GNU toolchain"); elfutils, systemtap, 
cygwin/newlib and libabigail ("smaller projects"); and others that are 
alive but barely so (debugedit, dwz, etc.).  Also, most projects outside 
the GNU toolchain have only one main developer.

That's quite a varied lot, enough so that:

1) it does seem weird for Conservancy to be the fiscal sponsor for 
*Sourceware*, which is essentially "a machine" rather than a 
coherent/cohesive set of projects.  *Some* of the smaller projects might 
very well be interested in joining Conservancy, but different 
maintainers may have different requirements or priorities.

2) Migration to IT infrastructure hosted by Linux Foundation, as in the 
"GTI" proposal, might not take into account the needs of smaller 
projects very well.  But there is no reason for these projects to live 
on the same server or have the same development model; and there's no 
reason for all of the services to be provided by a single server. 
Different services can be easily outsourced to different people, 
companies or external providers.


Based on the above, I think focus should be on simplifying the 
sourceware.org infrastructure, removing unnecessary services based on 
ease of migration, attack surface, operating cost (both human and 
monetary). This way Sourceware focuses on what is harder to obtain from 
external providers, and projects can use the best external 
infrastructure for everything else they need.

A prime target for this simplification, based mostly on my experience 
with QEMU, is source control and CI.  My understanding is that GTI comes 
from maintainers that are not confident that Sourceware can provide 
enough security---not because they do not trust the overseers, but 
rather because things go well until they don't.  For this reason, source 
control is the main concern of the people behind the GTI proposal.  If 
the bigger projects are worried about supply chain attacks, to begin 
with they can very easily migrate their source code control repositories 
elsewhere.  It could be operated by LF staff, or could be a "forge" like 
Gitlab/GitHub; several projects use the latter purely for hosting while 
keeping development on mailing lists.  They don't even have to do the 
same thing between the four of them, though it would make sense for at 
least binutils, gcc and gdb to do so. I am rusty on how these three 
projects do release management, but perhaps they could even use a 
monorepo with per-project release branches, and tarballs that only 
include the relevant directories.

But while GTI focuses (as expected) on the four GNU projects that form 
the toolchain, Sourceware infrastructure might not be the most suitable 
even for smaller projects.  In fact, in my humble opinion maintainers of 
smaller and moribund projects should *also* figure out if they still 
need/want the infrastructure provided by sourceware.org.  Of the smaller 
projects, most might be better served by migrating to Gitlab or GitHub; 
for example, CI there is easier to use than custom infrastructure based 
on patchwork and buildbot.  Sourcehut is another possibility; it scores 
pretty high on the GNU Ethical Repository Criteria, especially with 
respect to free Javascript, and some maintainers might value that as 
well.  Therefore, even though in the short term Sourceware can also host 
source code control repos for the smaller projects that do not want to 
migrate, I believe that this should be phased out for all non-archived 
projects.  Distributed version control makes migration easy enough, and 
backwards-compatible HTTP-level redirects are easy to set up for git, 
gitweb and also the websites.
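
As a rough illustration of the HTTP-level redirects mentioned above, here
is a minimal Python sketch of the path mapping such a redirect would
perform.  Everything specific in it is an assumption made for the example:
the host name git.example.org, the project name "exampleproj", the
/git/<project>.git/ URL layout, and the function name.  It does not
describe how sourceware.org is actually configured.

```python
from __future__ import annotations

from urllib.parse import urlsplit, urlunsplit

# Hypothetical values, for illustration only.
NEW_HOST = "git.example.org"
MIGRATED_PROJECTS = {"exampleproj"}


def redirect_target(request_path: str) -> str | None:
    """Return the new URL for a migrated repository, or None to serve locally.

    The path and query string are kept unchanged, so a git smart-HTTP
    request such as .../info/refs?service=git-upload-pack still makes
    sense on the new host.
    """
    parts = urlsplit(request_path)
    segments = parts.path.split("/")
    # Assumed layout: /git/<project>.git/<rest>
    if len(segments) < 3 or segments[1] != "git":
        return None
    repo = segments[2]
    project = repo[:-len(".git")] if repo.endswith(".git") else repo
    if project not in MIGRATED_PROJECTS:
        return None
    return urlunsplit(("https", NEW_HOST, parts.path, parts.query, ""))
```

A web server front end would answer with a permanent redirect to the
returned URL when it is not None, and handle the request locally otherwise.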

The remaining common needs of large and small projects alike seem to be 
mailing lists, where migrating to a new domain and preserving archives 
can both be painful endeavors; bug tracking, for similar preservation 
reasons; and possibly a wiki.  Reducing Sourceware to these three 
services would reduce the operating cost (bandwidth for downloads and 
source control often dwarfs everything else) and the attack surface.

While other services such as patchwork can be included, they fall more 
into a nice-to-have category and might not be completely indispensable 
when they break.  This also makes them easier to manage, and not a 
primary component of the sourceware "bus factor".

That's all.  I admit I am not necessarily up to date on how Sourceware 
operates, so I apologize in advance for any incorrect assumptions I 
made.  Apart from that, I hope that this writeup can provide useful 
ideas, or even a path forward for both Sourceware and the GNU toolchain.

Thanks,

Paolo


* Re: Role of sourceware for hosted projects
From: Frank Ch. Eigler @ 2022-09-23 15:06 UTC (permalink / raw)
  To: Overseers mailing list; +Cc: Paolo Bonzini

Hi, Paolo -


Thank you for your comments.  I'll address a couple of things.

> [...]
> The obvious observation from the outside is that the two "opposing sides"
> (for lack of a better word) have different priorities on what Sourceware
> needs to provide for them.  [...]

Indeed.  And the only way we can contemplate making improvements or
fixing shortcomings is by discussing them openly.  In addition, we
have been super consistent in saying that projects are welcome to
come, stay, and leave for whatever reason if they like.  This is one
reason I don't see any necessary conflict between the SFC and LF/GTI
proposals.


> The first part of the assessment in my opinion should be that most
> projects on sourceware.org are dead.  Some of them (e.g. GSL) have
> already migrated out of Sourceware in fact, but the others should be
> archived ASAP.  Archival means [...]

Most of your suggestions can be dealt with gradually, on a per-project
basis.  I believe that passive presence on the server is harmless, so
extraordinary attempts to transport & forward stuff are not necessary.
We have hardly any cgi stuff, especially in small projects, so access
control withdrawals for long-inactive users should be sufficient to
freeze things safely --- and quickly unfreeze if new activity appears.


> [...]  A prime target for this simplification, based mostly on my
> experience with QEMU, is source control and CI.  [...]  For this
> reason, source control is the main concern of the people behind the
> GTI proposal.  [...]

I have heard this as a general concern, but it's hard to match this to
a random drastic change and believe that it would or would not help.
So instead of generalities, I believe one particular threat model that
has been uttered here and there is ... "what if someone breaks into
sourceware and alters the code repositories?".  That is a reasonable
and specific concern.

Fortunately, all the active projects already / finally use git.  (gcc
was one of the last to switch).  As you know, git already has
excellent damage resiliency with every developer having full clones,
hash based content verification, etc.  Signed tags are common.  AIUI,
git is just not that vulnerable.
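
To make the hash-based content verification point concrete, here is a
small Python sketch of how an object ID for a file ("blob") is derived
in a repository using git's default SHA-1 object format.  The function
name is made up for the example; repositories created with the SHA-256
object format hash differently.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content.

    git hashes a short header ("blob <size>" plus a NUL byte) followed
    by the raw bytes, so any change to the content changes the ID.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Matches the output of: echo hello | git hash-object --stdin
    print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
```

Trees and commits are hashed the same way over content that embeds the
IDs of the objects they refer to, which is why a silently altered file
would show up as a different commit ID in every existing clone.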

In addition, any git project on sourceware has the option to go
further into security territory with gpg-signed git commit/merge/push
ops.  AIUI, this practically eliminates the possibility of malicious
code-repository damage, even with a full penetration of the server.
Sourceware's git server has supported this stuff for years.  I'm a
little bit surprised that hardly any project has taken advantage.


> [...]
> The remaining common needs of large and small projects alike [...]
> While other services such as patchwork can be included [...]

Those are reasonable suggestions.  OTOH, for small/medium projects, the
current set of colocated services is convenient, fully open-sourcy, and
community maintained.  Attack surface, yes, I suppose, but that's
balanced against a happy developing experience.


- FChE


* Re: Role of sourceware for hosted projects
From: Bradley M. Kuhn @ 2022-09-23 21:59 UTC (permalink / raw)
  To: overseers

Frank answered on other points; I just want to address the ones that relate
specifically to SFC:

Paolo Bonzini wrote at 07:04 (PDT):

> 1) it does seem weird for Conservancy to be the fiscal sponsor for
> *Sourceware*, which is essentially "a machine" rather than a
> coherent/cohesive set of projects.

Organizationally and governance-wise, it's not much different than a member
project that's a library used by other FOSS projects.  The users of such a
project are all developers, usually with their own FOSS projects, and those
FOSS projects that combine with the library aren't usually member projects
of SFC.

Logistically, it's different because the needs of an infrastructure
project are quite different than the needs of a software development
project.  However, I urge you not to think about Sourceware as merely a
machine, because that does a disservice to the individuals who have spent
person-years of their lives maintaining and improving this infrastructure.

From SFC's perspective, Sourceware is actually primarily a group of
volunteers, deploying solutions for the guest projects, working with the
community of the guest projects to assess needs and prioritize those needs.
Eventually, once Sourceware has a fiscal sponsor, it will probably also be a
community of volunteers working with paid contractors to do that work.
IBM's Red Hat (and Cygnus before it) is a donor to that effort — donating
bandwidth and machines.

> 2) Migration to IT infrastructure hosted by Linux Foundation, as in the
> "GTI" proposal, might not take into account the needs of smaller projects
> very well.

I really agree with this too, and it's a point that Mark has been raising
quite a bit — or at least tried to during his interrupted presentation.  I
think the Sourceware Overseers and SFC have been quite clear that feedback
from guest projects (and their own fiscal sponsors) about plans for fiscal
sponsorship *of Sourceware* are quite welcome.  However, we really have to
keep reiterating the distinction that *fiscal sponsorship of Sourceware has
nothing to do with the fiscal sponsorship of any guest projects*.  (e.g.,
GitHub isn't automatically fiscal sponsor if you host your project on
GitHub.)  But, we do hope guest projects (and their own fiscal sponsors, if
they have one) express their needs, concerns, and any other opinions.

Indeed, with regard to the GNU projects on Sourceware, the opinion of the
FSF is highly relevant and we should consider it.  I was really glad Zoë
(FSF's Executive Director) has been able to join this list and attend the
Cauldron sessions, and I look forward to hearing more from her.

> But there is no reason for these projects to live on the same server or
> have the same development model; and there's no reason for all of the
> services to be provided by a single server. Different services can be
> easily outsourced to different people, companies or external providers.

As we discussed on one of the public BBB chats, another part of SFC's
excited interest in helping Sourceware is how bad things have gotten with
regard to proprietary infrastructure for FOSS projects.  Sourceware is one
example among many of initiatives that are trying very hard to resist
proprietary software infrastructure for FOSS development.

SFC has been collaborating for the last year with OSU-OSL, and recently
began dialogue with projects such as Codeberg, and a few other ad-hoc
collectives that run self-hosted GitLab Community Edition instances.  As
part of our GiveUpGitHub campaign <https://giveupgithub.org>, we're seeking
to provide as many alternatives as we can to FOSS projects that don't want
to use proprietary infrastructure.  This is admittedly a very big challenge,
but in SFC's opinion, the best way to face a big challenge like this is to
diversify approaches — working in collaboration with volunteers who care
deeply about the issues.  When we received the Sourceware application, we
were thrilled about it for this very reason.

On this point, plainly stated: Linux Foundation does not offer a commitment
to FOSS infrastructure for FOSS projects.  At the Cauldron session, the
example was given of Yocto as a project that LF has served well with
infrastructure.  Yocto's mailing lists are hosted on groups.io, which is a
proprietary mailing list software for which even to *subscribe* to the
mailing list, every subscriber must agree not to attempt to reverse engineer
their mailing list system (which would include, say, trying to figure out
how they deal with things like spam to apply the solution to GNU Mailman).
That's as FOSS-unfriendly as it gets.  Yocto also uses a Slack instance for
their chat services.
-- 
Bradley M. Kuhn - he/him
Policy Fellow & Hacker-in-Residence at Software Freedom Conservancy
========================================================================
Become a Conservancy Sustainer today: https://sfconservancy.org/sustainer


* Re: Role of sourceware for hosted projects
From: Mark Wielaard @ 2022-09-26  1:26 UTC (permalink / raw)
  To: Overseers mailing list

Hi Paolo,

On Fri, Sep 23, 2022 at 04:04:12PM +0200, Paolo Bonzini via Overseers wrote:
> I have recently come across the discussions at Cauldron about Sourceware, by
> means of an article on LWN.net.

Which is 
https://lwn.net/SubscriberLink/908638/567de0001d86662c/

And I should really apologize for how I handled that BoF. I had
intended to show the infrastructure work the community had done this
last year and discuss the policies different projects could/had set to
use the new services, what had worked, what didn't work. Working up to
how the community could move these Sourceware infrastructure services
into the future with the Conservancy member proposal as a bridge to
the LF/GTI plan. For which we then also had the second BoF hour as
overflow if people wanted to discuss that even more. But clearly I
lost total control of the BoF. I thought I had prepared for that, to
make sure all discussion topics got enough time, but clearly hadn't
prepared enough.

> Even though I have not been a contributor
> to Sourceware projects for several years, they are still dear to my heart
> and I am familiar with several of them. Therefore, I would like to share my
> own observations about what sourceware.org can or should provide to the
> projects it hosts.

Thanks.

> First of all, a few obligatory pieces of disclosure.  First, I am employed
> by Red Hat but not for anything related to the GNU Compiler Collection or
> any other Sourceware project.  Second, I was not at Cauldron and have only
> watched the online recording of the first hour of the BoFs; I trust the
> LWN.net editors and their article on the topic to provide a faithful
> account.  Third, even though I have discussed this with some of the people
> involved in the BoFs, that hasn't substantially changed my understanding of
> the situation as it is reflected below.

And you have relevant background as one of the Qemu Conservancy PLC
members. Also I would love to hear about patchew.org.

For those only having attended the Cauldron BoF or having just read
the lwn article maybe some of this background information helps a bit
with understanding where in the process we are:

- Sourceware roadmap discussions
  https://sourceware.org/pipermail/overseers/2022q2/018453.html
  https://sourceware.org/pipermail/overseers/2022q2/018529.html
  https://sourceware.org/pipermail/overseers/2022q3/018636.html
  https://sourceware.org/pipermail/overseers/2022q3/018716.html
- Sourceware as Conservancy member project proposal
  https://sourceware.org/pipermail/overseers/2022q3/018802.html
- Full Sourceware SFC application text
  https://sourceware.org/pipermail/overseers/2022q3/018804.html
- Public SFC video chat meeting notes
  https://sourceware.org/pipermail/overseers/2022q3/018837.html
- Statement from Zoë Kooyman, Executive director FSF
  https://sourceware.org/pipermail/overseers/2022q3/018843.html

> The obvious observation from the outside is that the two "opposing sides"
> (for lack of a better word)

Note that I am really unhappy this is seen as "opposing sides" or
different "visions". I really hope we can work this out as a community
with some shared values. Sourceware being a Conservancy member project
really shouldn't be incompatible with finding sponsors through the
Linux Foundation. And I hope we can also work out what might make
sense as a managed service (I really like the BBB idea as something we
can try out together).

> have different priorities on what Sourceware
> needs to provide for them.  There are several reasons for this: different
> projects that people work on, different emotional attachment, different
> jobs, whatnot.  However, both of them make the same mistake: they focus on
> the "next steps" without a full assessment of the *current* state of
> Sourceware.

I think we did do an assessment of the current state and the next step
really just is setting up a fiscal sponsor without any impact. Once we
have a PLC we can start thinking about the next steps. People have
ideas, and I think we really should take the security issues seriously
once we really understand them. First priority is making sure there is
no disruption to the projects and to make sure that we are "disaster
proof". That doesn't mean we shouldn't be a bit more ambitious, but
let's first make sure we have a solid fiscal sponsor and governance
structure.

> The first part of the assessment in my opinion should be that most projects
> on sourceware.org are dead.

Correct. We have 40 projects that have stopped being developed or
moved to other hosting services. We really should work with the
Software Heritage and maybe archive.org to get them properly archived.

> if I counted right, roughly ten projects that are alive: gcc, gdb,
> binutils and glibc ("the GNU toolchain"); elfutils, systemtap, cygwin/newlib
> and libabigail ("smaller projects"); and others that are alive but barely so
> (debugedit, dwz, etc.).  Also, most projects outside the GNU toolchain have
> only one main developer.

The projects page could use an update. There are 26 alive projects (we
just added one this week), plus a couple of "meta" projects (basically
for the services like builder.sourceware.org which is its own
community). Some are much more active than others, and some just use a
few services like having a mailinglist or git repo or bugzilla, but we
do find the "long tail" also important.

> If the bigger projects
> are worried about supply chain attacks, to begin with they can very easily
> migrate their source code control repositories elsewhere.  It could be
> operated by LF staff, or could be a "forge" like Gitlab/GitHub; several
> projects use the latter purely for hosting while keeping development on
> mailing lists.

I think a higher priority is to have good mirrors. We already have a
sourcehut mirror at https://sr.ht/~sourceware/ and it would be great to
have some alternative mirrors; maybe LF/IT could set those up. Also
now that we have a public-inbox instance we could have the same for
the patches/mailinglists. With that and possibly commit signing and
patch attestation you can always verify the "supply chain" even when
one of the hosts is compromised since all verification is/can be done
against any mirror or your local copy.
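
A minimal Python sketch of that kind of cross-check, assuming each
mirror's branch and tag heads were collected with "git ls-remote" (which
prints one "<object id><TAB><ref name>" line per ref).  The function
names and the sample mirror labels are invented for the example.

```python
from __future__ import annotations


def parse_ls_remote(output: str) -> dict[str, str]:
    """Turn `git ls-remote` output into a {ref name: object id} mapping."""
    refs: dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        object_id, ref = line.split("\t", 1)
        refs[ref] = object_id
    return refs


def diverging_refs(
    mirrors: dict[str, dict[str, str]],
) -> dict[str, dict[str, str | None]]:
    """Return refs whose object id is not identical on every mirror.

    A ref that is missing on a mirror is reported with None for that
    mirror.  A mirror that merely lags behind is also reported, so a
    non-empty result means "look closer", not necessarily "compromised".
    """
    all_refs = set().union(*(refs.keys() for refs in mirrors.values()))
    report: dict[str, dict[str, str | None]] = {}
    for ref in sorted(all_refs):
        seen = {name: refs.get(ref) for name, refs in mirrors.items()}
        if len(set(seen.values())) > 1:
            report[ref] = seen
    return report
```

Because the object ids cover the full history behind each ref, agreement
between independently hosted mirrors and a developer's local clone is
strong evidence that none of them has been quietly rewritten.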

> They don't even have to do the same thing between the four
> of them, though it would make sense for at least binutils, gcc and gdb to do
> so. I am rusty on how these three projects do release management, but
> perhaps they could even use a monorepo with per-project release branches,
> and tarballs that only include the relevant directories.

binutils, gdb and sim use a mono-repo, but do separate releases and
even have separate contribution policies. They do share libiberty but
not as a shared repo. And some use gnulib, but with different update
policies. It isn't completely clear if all of them have even heard of
the LF/GTI proposal yet; we have sent email to the various steering
committees to figure out where they all stand on this.

> That's all.  I admit I am not necessarily up to date on how Sourceware
> operates, so I apologize in advance for any incorrect assumptions I made.
> Apart from that, I hope that this writeup can provide useful ideas, or even
> a path forward for both Sourceware and the GNU toolchain.

Thanks, it was interesting. I am not sure it really fits with the
current roadmap and making sure projects don't need to migrate away
from sourceware. But if projects want to migrate some of their
services you do list some interesting options.

Cheers,

Mark
