From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mark Wielaard <mark@klomp.org>
To: buildbot@sourceware.org
Cc: Mark Wielaard <mark@klomp.org>
Subject: [PATCH] Allow multiple latent workers per host and add small/large vm workers
Date: Sun, 7 Aug 2022 17:45:59 +0200
Message-Id: <20220807154559.20146-1-mark@klomp.org>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: "The https://builder.sourceware.org/ buildbot"

This switches from doing multiple builds per container to multiple builds
per VM. A latent worker only does multiple builds if those builds use the
same container/image, but we often have multiple builds that each want a
different container image.

We also replicated worker dirs per image, which duplicated the git/src/build
dirs (and their disk space) for each image. Switch to having only one worker
dir per (shared) container hostname. This reduces the duplication to 2 (one
per container hostname) instead of 6 (one per image); the ccache dir is
still per image.

Also introduce two big VM workers: one on the ryzen9 box with 12 vcpus and
one on the OSUOSL multi-socket box with 16 vcpus. These are currently only
used for the gccrs bootstrap build and the full glibc check build.
---
 builder/containers/bb-start.sh | 13 +++++-
 builder/master.cfg             | 77 +++++++++++++++++++++++++++-------
 2 files changed, 73 insertions(+), 17 deletions(-)

diff --git a/builder/containers/bb-start.sh b/builder/containers/bb-start.sh
index 31cdbc9..a87a9e0 100755
--- a/builder/containers/bb-start.sh
+++ b/builder/containers/bb-start.sh
@@ -7,7 +7,16 @@
 # - IMAGE_NAME - name of the image
 # - CCACHE_LIBDIR - path where to search for ccache (usually /usr/lib64/ccache)
 
-worker_dir=shared/$IMAGE_NAME/worker
+# Use the (container) hostname to create different worker dirs in case
+# we have multiple containers running on the same build host (using the same
+# shared/ directory). Note that a latent worker only runs one container at
+# a time. Even if it does multiple builds, it will do those builds using
+# the same container/image. The latent worker only has one instantiation of
+# the worker/container running at a time. But multiple different
+# latent workers might be running on the same host. Those latent workers
+# should all set a different container hostname.
+hostname=$(hostname -s)
+worker_dir=shared/$hostname/worker
 tac_file=$worker_dir/buildbot.tac
 if [ ! -f $tac_file ]; then
   mkdir -p $worker_dir
@@ -20,6 +29,7 @@ fi
 # objcopy gives us the binutils version, iconv the glibc version
 echo buildbot@sourceware.org > $worker_dir/info/admin
 echo $IMAGE_NAME > $worker_dir/info/host
+echo $hostname >> $worker_dir/info/host
 gcc --version | head -1 >> $worker_dir/info/host
 objcopy --version | head -1 >> $worker_dir/info/host
 iconv --version | head -1 >> $worker_dir/info/host
@@ -28,6 +38,7 @@ iconv --version | head -1 >> $worker_dir/info/host
 unset WORKERPASS
 
 # Make sure ccache is in the PATH and uses a shared cache
+# There is one ccache dir per image shared between workers
 export PATH=$CCACHE_LIBDIR:$PATH
 mkdir -p shared/$IMAGE_NAME/ccache
 export CCACHE_DIR=/home/builder/shared/$IMAGE_NAME/ccache
diff --git a/builder/master.cfg b/builder/master.cfg
index ee752fa..b66506c 100644
--- a/builder/master.cfg
+++ b/builder/master.cfg
@@ -131,28 +131,56 @@ fedrawhide_x86_64_worker = worker.Worker("fedrawhide-x86_64",
 c['workers'].append(fedrawhide_x86_64_worker)
 
 # 3 (Fedora Core) VMs which can run container files
+# 2 can do simultaneous builds, and so have 2 latent workers with different
+# hostnames; 1 is a bit larger but only does one (larger) build at a time.
 # builders need to set the "container-file" property by using readContainerFile
-bb1_worker = worker.DockerLatentWorker("bb1",
+bb1_1_worker = worker.DockerLatentWorker("bb1-1",
                None,
                masterFQDN="builder.sourceware.org",
                docker_host="ssh://builder@bb.wildebeest.org:2021",
                dockerfile=util.Interpolate('%(prop:container-file)s'),
                volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bb1-1",
                build_wait_timeout=0,
-               max_builds=2,
+               max_builds=1,
                properties={'ncpus': 6, 'maxcpus': 8});
-c['workers'].append(bb1_worker)
+c['workers'].append(bb1_1_worker)
 
-bb2_worker = worker.DockerLatentWorker("bb2",
+bb1_2_worker = worker.DockerLatentWorker("bb1-2",
+               None,
+               masterFQDN="builder.sourceware.org",
+               docker_host="ssh://builder@bb.wildebeest.org:2021",
+               dockerfile=util.Interpolate('%(prop:container-file)s'),
+               volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bb1-2",
+               build_wait_timeout=0,
+               max_builds=1,
+               properties={'ncpus': 6, 'maxcpus': 8});
+c['workers'].append(bb1_2_worker)
+
+bb2_1_worker = worker.DockerLatentWorker("bb2-1",
+               None,
+               masterFQDN="builder.sourceware.org",
+               docker_host="ssh://builder@bb.wildebeest.org:2022",
+               dockerfile=util.Interpolate('%(prop:container-file)s'),
+               volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bb2-1",
+               build_wait_timeout=0,
+               max_builds=1,
+               properties={'ncpus': 6, 'maxcpus': 8});
+c['workers'].append(bb2_1_worker)
+
+bb2_2_worker = worker.DockerLatentWorker("bb2-2",
                None,
                masterFQDN="builder.sourceware.org",
                docker_host="ssh://builder@bb.wildebeest.org:2022",
                dockerfile=util.Interpolate('%(prop:container-file)s'),
                volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bb2-2",
                build_wait_timeout=0,
-               max_builds=2,
+               max_builds=1,
                properties={'ncpus': 6, 'maxcpus': 8});
-c['workers'].append(bb2_worker)
+c['workers'].append(bb2_2_worker)
 
 bb3_worker = worker.DockerLatentWorker("bb3",
                None,
@@ -160,24 +188,41 @@ bb3_worker = worker.DockerLatentWorker("bb3",
                docker_host="ssh://builder@bb.wildebeest.org:2023",
                dockerfile=util.Interpolate('%(prop:container-file)s'),
                volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bb3",
                build_wait_timeout=0,
-               max_builds=2,
-               properties={'ncpus': 6, 'maxcpus': 8});
+               max_builds=1,
+               properties={'ncpus': 6, 'maxcpus': 12});
 c['workers'].append(bb3_worker)
 
-# OSUOSL machine
-bbo1_worker = worker.DockerLatentWorker("bbo1",
+# OSUOSL machine, two latent workers
+# One for "small" fast jobs, one for (slow) large jobs
+bbo1_1_worker = worker.DockerLatentWorker("bbo1-1",
                None,
                masterFQDN="builder.sourceware.org",
                docker_host="ssh://builder@sourceware-builder1.osuosl.org",
                dockerfile=util.Interpolate('%(prop:container-file)s'),
                volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bbo1-1",
                build_wait_timeout=0,
-               max_builds=2,
-               properties={'ncpus': 4, 'maxcpus': 8});
-c['workers'].append(bbo1_worker)
+               max_builds=1,
+               properties={'ncpus': 8, 'maxcpus': 8});
+c['workers'].append(bbo1_1_worker)
 
-vm_workers = ['bb1', 'bb2', 'bb3', 'bbo1']
+bbo1_2_worker = worker.DockerLatentWorker("bbo1-2",
+               None,
+               masterFQDN="builder.sourceware.org",
+               docker_host="ssh://builder@sourceware-builder1.osuosl.org",
+               dockerfile=util.Interpolate('%(prop:container-file)s'),
+               volumes=["/home/builder/shared:/home/builder/shared"],
+               hostname="bbo1-2",
+               build_wait_timeout=0,
+               max_builds=1,
+               properties={'ncpus': 8, 'maxcpus': 16});
+c['workers'].append(bbo1_2_worker)
+
+vm_workers = ['bb1-1', 'bb1-2', 'bb2-1', 'bb2-2', 'bb3',
+              'bbo1-1', 'bbo1-2']
+big_vm_workers = ['bb3', 'bbo1-2']
 
 ibm_power8_worker = worker.Worker("ibm_power8",
                getpw("ibm_power8"),
@@ -1962,7 +2007,7 @@ gccrust_bootstrap_debian_amd64_builder = util.BuilderConfig(
     name="gccrust-bootstrap-debian-amd64",
     properties={'container-file': readContainerFile('debian-stable')},
-    workernames=vm_workers,
+    workernames=big_vm_workers,
     tags=["gccrust-bootstrap", "debian", "x86_64"],
     collapseRequests=True,
     factory=gccrust_bootstrap_factory)
@@ -3019,7 +3064,7 @@ glibc_fedora_x86_64_builder = util.BuilderConfig(
     collapseRequests=True,
     properties={'container-file': readContainerFile('fedora-latest')},
-    workernames=vm_workers,
+    workernames=big_vm_workers,
     tags=["glibc", "fedora", "x86_64"],
     factory=glibc_factory)
 c['builders'].append(glibc_fedora_x86_64_builder)
-- 
2.30.2
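
For reference, with this change a build host ends up with one worker dir per
container hostname and one ccache dir per image under shared/. A rough sketch
of the layout on the host running the bb1-1 and bb1-2 latent workers, assuming
IMAGE_NAME matches the container file names used elsewhere in this config
(debian-stable, fedora-latest):

  shared/bb1-1/worker/           worker dir (git/src/build) for bb1-1
  shared/bb1-2/worker/           worker dir (git/src/build) for bb1-2
  shared/debian-stable/ccache/   ccache dir, shared by all workers using this image
  shared/fedora-latest/ccache/   ccache dir, shared by all workers using this image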
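
Not part of this patch, but a possible follow-up: the new DockerLatentWorker
definitions only differ in worker name, docker_host and the ncpus/maxcpus
properties, so they could be generated by a small helper in master.cfg. A
rough sketch (the make_latent_vm_worker name is made up; the keyword arguments
just mirror the ones used above):

  from buildbot.plugins import worker, util

  # Hypothetical helper: one latent worker per container hostname; using the
  # worker name as hostname makes bb-start.sh pick a separate
  # shared/<name>/worker dir for it.
  def make_latent_vm_worker(name, docker_host, ncpus, maxcpus):
      return worker.DockerLatentWorker(
          name,
          None,
          masterFQDN="builder.sourceware.org",
          docker_host=docker_host,
          dockerfile=util.Interpolate('%(prop:container-file)s'),
          volumes=["/home/builder/shared:/home/builder/shared"],
          hostname=name,
          build_wait_timeout=0,
          max_builds=1,
          properties={'ncpus': ncpus, 'maxcpus': maxcpus})

  # For example, bb1-1 above would then become:
  # c['workers'].append(make_latent_vm_worker(
  #     "bb1-1", "ssh://builder@bb.wildebeest.org:2021", 6, 8))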