Date: Tue, 31 May 2022 23:50:06 +0200
From: Mark Wielaard
To: "Frank Ch. Eigler"
Cc: Overseers mailing list
Subject: Re: GNU Toolchain Infrastructure at sourceware
In-Reply-To: <20220531163932.GA25222@redhat.com>

Hi Frank,

On Tue, May 31, 2022 at 12:39:32PM -0400, Frank Ch. Eigler wrote:
> > builder.sourceware.org has been running for a couple of months
> > now. The buildbot process seems to have a low load on the machine. Low
> > single digit %CPU. The state database has grown to ~800MB, the
> > gitpoller-work dir is ~4GB (that is mainly gcc ~2G, libabigail ~750M
> > and binutils-gdb ~600M, others are < 200M).
>
> I suspect those git work trees could be made into shallow clones, or
> use git-alternates or somesuch to minimize unnecessary .git/objects
> duplication right there on the same machine.

Good point, the buildbot and the git repos share the same
machine/storage. We use https URLs for the changesource pollers
because we reuse the same URLs for the build factories, which run on
the workers. But they don't have to be the same. I'll experiment with
that, but I will have to be careful not to reset the whole history.

A more general point is that gcc.git is really a couple of orders of
magnitude bigger than anything else. And that affects more things than
just the buildbot. I wonder if we should cut off a bit more history.
It would mean that people who really have to search back to before,
say, gcc-5 need to stitch in the gcc-old.git. But if it makes the
default clone 1GB smaller, that would be really good.

> > I am slightly worried about the long time these uploads take. For
> > example the upload of results from the elfutils builders takes a
> > significant amount of time (almost half) of the build (4 to 5 out of
> > 10 to 12 minutes). Especially on the slower workers this can cause
> > them to get behind on the build queue.
>
> Yeah, this appears to be a buildbot infrastructure limitation.
> Something is very inefficient about the way it evaluates file upload
> directives. If it turns into a bigger problem, can try to revisit.

Do we really need all the individual .log and .trs files for each
test? That is hundreds of files in some cases (like elfutils). Given
that one file seems to take about half a second to upload, that easily
adds up to multiple minutes.

> > The bunsendb.git is currently ~400MB and contains ~2000 results. It
> > seems it compresses fairly well and could maybe use a periodic
> > repacking (making a local clone reduces the size to ~260MB).
>
> No big deal, auto repack will do most of the work fine.
>
> Added to that, there is the bunsenql.sqlite database, which is at the
> moment around 2GB. It will ebb and flow with the bunsendb.git
> contents.
>
> I think all of the above storage levels are sustainable for at least
> several months at this rate, after which point we could start aging
> out old data.

There is a janitor which will delete builder logs older than ~2.5
months, which is about as long as the buildbot has been running. So I
expect the state database to not grow as much from now on.

> (Plus we have an extra TB of space for /sourceware[12]
> that I plan to bring online shortly.)

OK! Now we are talking. And here I was concerned about a couple of
GBs :)

Thanks,

Mark
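
As a rough check on the upload timing discussed above, here is a minimal Python sketch of the arithmetic. It assumes files are uploaded strictly one at a time at a fixed cost per file. The function name and the file counts are hypothetical; only the half-second-per-file figure comes from the message.

```python
def upload_seconds(num_files: int, seconds_per_file: float = 0.5) -> float:
    """Estimate total upload time, assuming files are sent one at a time."""
    return num_files * seconds_per_file


# Hypothetical file counts; 0.5 s per file is the figure quoted in the message.
for num_files in (200, 480, 600):
    minutes = upload_seconds(num_files) / 60
    print(f"{num_files} files -> {minutes:.1f} minutes")
```

Under that assumption, somewhere between 480 and 600 files would account for the 4 to 5 minutes reported for the elfutils builders, so bundling the .log and .trs files into a single archive before upload may be worth trying.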