public inbox for overseers@sourceware.org
From: Mark Wielaard <mark@klomp.org>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Overseers mailing list <overseers@sourceware.org>
Subject: Re: GNU Toolchain Infrastructure at sourceware
Date: Tue, 31 May 2022 23:50:06 +0200
Message-ID: <YpaNjrCEB+OuIhd5@wildebeest.org>
In-Reply-To: <20220531163932.GA25222@redhat.com>

Hi Frank,

On Tue, May 31, 2022 at 12:39:32PM -0400, Frank Ch. Eigler wrote:
> > builder.sourceware.org has been running for a couple of months
> > now. The buildbot process seems to have a low load on the machine. Low
> > single digit %CPU. The state database has grown to ~800MB, the
> > gitpoller-work dir is ~4GB (that is mainly gcc ~2G, libabigail ~750M
> > and binutils-gdb ~600M, others are < 200M).
> 
> I suspect those git work trees could be made into shallow clones, or
> use git-alternates or somesuch to minimize unnecessary .git/objects
> duplication right there on the same machine.

Good point, the buildbot and the git repos share the same
machine/storage. We use https URLs for the changesource pollers
because we reuse the same URLs for the build factories, which run on
the workers. But they don't have to be the same. I'll experiment with
that, but will have to be careful not to reset the whole history.
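
A minimal sketch of the alternates idea, assuming the poller work dir
could be cloned with a reference to the repository already on the same
disk. The URL and both paths are hypothetical placeholders, and the
helper only builds the command line, it does not run git. The
--reference option of git clone records the local repository's object
store in .git/objects/info/alternates, so objects that already exist
there are not copied again.

```python
import shlex


def poller_clone_command(upstream_url, local_repo, workdir):
    """Build a `git clone` command that borrows objects from a local repo.

    With --reference, git lists local_repo's object store in the new
    clone's .git/objects/info/alternates instead of duplicating objects.
    """
    return ["git", "clone", "--reference", local_repo, upstream_url, workdir]


# Hypothetical URL and paths, for illustration only.
cmd = poller_clone_command(
    "https://sourceware.org/git/elfutils.git",
    "/git/elfutils.git",
    "gitpoller-work/elfutils",
)
print(shlex.join(cmd))
```

One caveat that matches the need to be careful: a clone that borrows
objects depends on the referenced repository keeping them, so pruning
on that side can break the borrowing work tree.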

A more general point is that gcc.git is really a couple of orders of
magnitude bigger than anything else. And that affects more things than
just the buildbot. I wonder if we should cut off a bit more history. It
would mean that people who really have to search back to before, say,
gcc-5 need to stitch in the gcc-old.git. But if it makes the default
clone 1GB smaller that would be really good.

> > I am slightly worried about the long time these uploads take. For
> > example the upload of results from the elfutils builders takes a
> > significant amount of time (almost half) of the build (4 to 5 out of
> > 10 to 12 minutes). Especially on the slower workers this can cause
> > them to get behind on the build queue.
> 
> Yeah, this appears to be a buildbot infrastructure limitation.
> Something is very inefficient about the way it evaluates file upload
> directives.  If it turns into a bigger problem, can try to revisit.

Do we really need all the individual .log and .trs files for each
test?  That is hundreds of files in some cases (like elfutils). Given
that one file seems to take about half a second, that easily adds up
to multiple minutes.
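
As a rough back-of-the-envelope check, assuming the upload time scales
linearly with the number of files at about 0.5 seconds each (the file
counts below are illustrative guesses, not measured numbers):

```python
def upload_minutes(num_files, seconds_per_file=0.5):
    """Estimate total upload time in minutes for per-file uploads."""
    return num_files * seconds_per_file / 60.0


# Illustrative file counts only; real builders will differ.
for n in (100, 300, 600):
    print(f"{n} files -> {upload_minutes(n):.1f} min")
```

Under that assumption, the 4 to 5 minutes seen on the elfutils builders
would correspond to roughly 480 to 600 uploaded files.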

> > The bunsendb.git is currently ~400MB and contains ~2000 results. It
> > seems it compresses fairly well and could maybe use a periodic
> > repacking (making a local clone reduces the size to ~260MB).
> 
> No big deal, auto repack will do most of the work fine.
> 
> Added to that, there is the bunsenql.sqlite database, which is at the
> moment around 2GB.  It will ebb and flow with the bunsendb.git
> contents.
> 
> I think all of the above storage levels are sustainable for at least
> several months at this rate, after which point we could start aging
> out old data.

There is a janitor which will delete builder logs older than ~2.5
months, which is about as long as the buildbot has been running. So I
expect the state database not to grow as much from now on.

> (Plus we have an extra TB of space for /sourceware[12]
> that I plan to bring online shortly.)

OK! Now we are talking. And here I was concerned about a couple of GBs :)

Thanks,

Mark
