Date: Fri, 22 Jul 2022 18:48:29 +0200
From: Mark Wielaard
To: overseers@sourceware.org
Subject: Re: GNU Toolchain Infrastructure at sourceware
Message-ID: <20220722164829.GB27274@gnu.wildebeest.org>
References: <20220531163932.GA25222@redhat.com>
In-Reply-To: <20220531163932.GA25222@redhat.com>

Hi,

Just a quick size/resources update after two more months.

On Tue, May 31, 2022 at 12:39:32PM -0400, Frank Ch. Eigler via Overseers wrote:
> > builder.sourceware.org has been running for a couple of months
> > now. The buildbot process seems to have a low load on the machine.
> > Low single digit %CPU. The state database has grown to ~800MB, the
> > gitpoller-work dir is ~4GB (that is mainly gcc ~2G, libabigail
> > ~750M and binutils-gdb ~600M, others are < 200M).
>
> I suspect those git work trees could be made into shallow clones, or
> use git-alternates or somesuch to minimize unnecessary .git/objects
> duplication right there on the same machine.

While we haven't done that yet, the gitpoller-work git repos shrank to
2.5G, even though we added a new poller and, through the
try-schedulers, now also pull the users branches for binutils-gdb and
libabigail. It looks like the repositories are simply packed more
efficiently now, although I don't fully understand why the packing
improved that much.

We did add an extra debian-i386 stable VM worker, plus container
builders for debian testing and fedora rawhide. IBM provided power8,
power9 and power10 workers for the gdb and valgrind builders, and Arm
did the same for gdb with workers for armhf and arm64. OSUOSL provided
an 8 core arm64 worker on which we run fedora latest builders for all
projects (mostly replacing the little arm64 odroid board).

We added a glibc poller, scheduler and builders on fedora-x86_64,
fedora-arm64, debian-i386, rawhide-x86_64, debian-testing-x86_64,
fedora-s390x, debian-ppc64, fedora-ppc64le, opensuse-tumbleweed and
leap on x86_64 (see the config sketch below). We also added try
builders for binutils, gdb and libabigail, and a gccrs bootstrap
builder. The new gcc builders are build-only --disable-bootstrap
builders for just C and C++; we are still waiting on a larger x86_64
worker to run the full gcc testsuite.

We are seeing ~500 builds a day. CPU usage is slightly higher now
because builds are running all day, but it is mostly still in the
single digits. While new bunsen results are being uploaded it spikes
to ~40%/~70%. The state database is now 4GB.
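
For those wondering what adding such a poller/scheduler pair involves,
the master.cfg fragment looks roughly like the sketch below. This is
only an illustration; the repository URL, workdir, poll interval and
builder names are placeholders, not the exact builder.sourceware.org
configuration.

  # Minimal sketch of a buildbot poller plus scheduler; repo URL,
  # workdir and builder names are illustrative placeholders.
  from buildbot.plugins import changes, schedulers, util

  c = BuildmasterConfig = {}
  c['change_source'] = []
  c['schedulers'] = []

  # Poll the upstream git repository for new commits on master.
  c['change_source'].append(changes.GitPoller(
      repourl='https://sourceware.org/git/glibc.git',
      branches=['master'],
      pollInterval=300,                  # seconds between polls
      workdir='gitpoller-work/glibc'))

  # Kick off the per-distro builders whenever the poller sees a change.
  c['schedulers'].append(schedulers.SingleBranchScheduler(
      name='glibc-master-sched',
      change_filter=util.ChangeFilter(branch='master'),
      builderNames=['glibc-fedora-x86_64', 'glibc-debian-i386']))

The try-schedulers mentioned above are set up in a similar way, they
just watch the users branches instead of master.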

> > I am slightly worried about the long time these uploads take. For
> > example the upload of results from the elfutils builders takes a
> > significant amount of time (almost half) of the build (4 to 5 out
> > of 10 to 12 minutes). Especially on the slower workers this can
> > cause them to get behind on the build queue.
>
> Yeah, this appears to be a buildbot infrastructure limitation.
> Something is very inefficient about the way it evaluates file upload
> directives. If it turns into a bigger problem, can try to revisit.

The new cpio helped; the uploads take less time. But we are seeing
many more uploads, with some CPU spikes. I don't think it is
concerning, but it is something to watch.

> > The bunsendb.git is currently ~400MB and contains ~2000 results.
> > It seems it compresses fairly well and could maybe use a periodic
> > repacking (making a local clone reduces the size to ~260MB).
>
> No big deal, auto repack will do most of the work fine.

It now contains almost 10 times the number of results (19000+), but
the bunsendb.git is only 3 times as big (1.2G).

> Added to that, there is the bunsenql.sqlite database, which is at the
> moment around 2GB. It will ebb and flow with the bunsendb.git
> contents.

The sqlite database did indeed grow as much as the bunsendb.git; it is
now 3GB.

> I think all of the above storage levels are sustainable for at least
> several months at this rate, after which point we could start aging
> out old data. (Plus we have an extra TB of space for /sourceware[12]
> that I plan to bring online shortly.)

Yes, it looks like even with the 10x growth over 2 months we didn't
really increase resource usage that much, so we won't need the extra
TB of storage just yet. There is more than 128GB free at the moment,
which will certainly get us to next year. But having that extra TB of
storage online would help with expanding other services.

Cheers,

Mark