Message-ID: <6ef09f08f681437fcb43dd347f2cd949ed16ea86.camel@klomp.org>
Subject: Re: GNU Toolchain Infrastructure at sourceware
From: Mark Wielaard
To: Overseers mailing list
Cc: Joseph Myers
Date: Wed, 01 Jun 2022 12:09:44 +0200

Hi Joseph,

On Tue, 2022-05-31 at 22:12 +0000, Joseph Myers via Overseers wrote:
> On Tue, 31 May 2022, Mark Wielaard via Overseers wrote:
>
> > A more general point is that gcc.git is really a couple of orders
> > bigger than anything else. And that affects more things than just the
> > buildbot. I wonder if we should cut off a bit more history.
> > It would mean that people who really have to search back to before say
> > gcc-5 need to stitch in the gcc-old.git. But if it makes the default
> > clone 1GB smaller that would be really good.
>
> gcc-old.git is the old git-svn mirror, now read only. It has no relation
> to the main version of the history in gcc.git (the gcc-old.git history is
> also available in gcc.git under refs/git-old/ and refs/git-svn-old/, not
> fetched by default).

How embarrassing. I only assumed; I didn't actually check (nor did I
actually try to create such a "small" gcc.git). You are of course
right.

a) The objects fetched by default go all the way back, including the
   pre-egcs history.
b) By default you indeed only fetch about 1GB.

> The vast bulk of the approximately 6000 refs in gcc.git are *not* fetched
> by default, and the repository is set up with delta islands to make that
> efficient.
>
> People not wanting full history of the branches they clone can create a
> shallow clone with git clone --depth (and people wanting full history but
> only for one branch can use --single-branch).
>
> The GCC repository size is similar to that for the Linux kernel.

Right, so it isn't completely unusual to have such large repos. And
indeed the buildbot workers do use --depth 1. So for automation this
isn't really such a big deal.

But I would say that, like the Linux kernel tree, it is somewhat unusual
for developers to have such a giant tree. That 1GB is still somewhat big
on slower networks, and it does take significant resources when
resolving the deltas. --single-branch doesn't significantly reduce
that. But --depth 1 does of course.

IMHO it would be good to have something in between. Maybe a standard
tree of --depth ~20000 (around when we cut over to git). But maybe that
is still too little history as a default. And it could of course be
clearly documented. But who reads docs...
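To make the options above concrete, here is a minimal sketch that only
builds the three "git clone" command lines under discussion (full,
--single-branch, and --depth); it does not run them. The helper name
clone_command, the destination directory, and the URL are assumptions
for illustration only. The --depth and --single-branch flags are the
ones mentioned above.

```python
import shlex
from typing import List, Optional

# Assumed URL, for illustration only; use whichever remote you normally clone.
GCC_URL = "https://gcc.gnu.org/git/gcc.git"


def clone_command(url: str, dest: str,
                  depth: Optional[int] = None,
                  single_branch: bool = False) -> List[str]:
    """Build (but do not run) the argv for a `git clone` with limited history.

    clone_command is a hypothetical helper, not part of git or of any
    GCC tooling.
    """
    if depth is not None and depth < 1:
        raise ValueError("depth must be a positive number of commits")
    cmd = ["git", "clone"]
    if depth is not None:
        # Truncate history to the most recent `depth` commits.
        cmd += ["--depth", str(depth)]
    if single_branch:
        # Fetch the full history, but only for one branch.
        cmd.append("--single-branch")
    cmd += [url, dest]
    return cmd


if __name__ == "__main__":
    variants = [
        ("default", {}),
        ("one branch", {"single_branch": True}),
        ("buildbot-style", {"depth": 1}),
        ("in between", {"depth": 20000}),
    ]
    for label, options in variants:
        # Print each variant as a copy-pasteable shell command.
        print(f"{label}: {shlex.join(clone_command(GCC_URL, 'gcc', **options))}")
```

Passing the resulting list to subprocess.run would perform the actual
clone. Whether ~20000 commits is the right "in between" depth is
exactly the open question here.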
Another hurdle for the first-time gcc hacker is the default configure
flags. If you forget --disable-bootstrap for your first hacking session,
it really takes forever to build.

But maybe this is a better conversation for the gcc@ mailing list. I may
also be too eager to be popular with hackers who don't like reading
setup documentation. And in the end gcc does contain so many frontends
and libraries these days that a single "easy" default is impossible to
give.

Cheers,

Mark