From: Jason Molenda
To: overseers@sources.redhat.com
Date: Wed, 29 Jan 2003 08:10:00 -0000
Subject: A little cvs benchmarking - fileattr speedup estimates
Message-ID: <20030129001054.A42309@molenda.com>

Jason's summary for those who don't like reading: fileattr gives us 2-4x speedups on cvs updates against HEAD, and subversions.gnu.org seems faster than sourceware, probably because of lower overall system load.

I spent last weekend pounding on otherwise-idle cvs servers for some stuff I was doing at Apple. The numbers comparing gcc.gnu.org (sourceware) and subversions.gnu.org (the FSF's system) are a little interesting so I thought I'd summarize them. I used the gcc sources for all the tests because subversions rsyncs a copy of the repo a few times a day.

Checkout speeds for both systems were identical - around 100 seconds from inside Apple. From home, even with gzip compression enabled, my 768kbps down-link was the bottleneck - I never got a gcc checkout in less than ~370 seconds.

I did many, many test runs over the weekend. Each test run did a sequence three times on each server -- a sequence was a checkout followed immediately by two do-nothing cvs updates.
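
A rough sketch of what one such test run could look like in Python. This is not the script used for the numbers in this mail; the function names, the `-z3 -Q` options, and the injectable `runner` argument are assumptions made for illustration, and the CVSROOT strings are supplied by the caller.

```python
import os
import shutil
import subprocess
import tempfile
import time
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A runner takes (argv, working directory) and executes one command.
Runner = Callable[[Sequence[str], str], None]


def run_cvs(argv: Sequence[str], cwd: str) -> None:
    """Run one cvs command for real, discarding its output."""
    subprocess.run(argv, cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def timed(runner: Runner, argv: Sequence[str], cwd: str) -> float:
    """Return the wall-clock seconds the runner spent on one command."""
    start = time.monotonic()
    runner(argv, cwd)
    return time.monotonic() - start


def one_sequence(cvsroot: str, module: str, tag: Optional[str],
                 runner: Runner = run_cvs) -> Tuple[float, List[float]]:
    """One checkout followed immediately by two do-nothing updates."""
    workdir = tempfile.mkdtemp(prefix="cvsbench-")
    try:
        checkout = ["cvs", "-z3", "-Q", "-d", cvsroot, "checkout"]
        if tag:  # e.g. "gcc-3_1-branch"; None means HEAD
            checkout += ["-r", tag]
        checkout.append(module)
        checkout_s = timed(runner, checkout, workdir)
        tree = os.path.join(workdir, module)
        updates = [timed(runner, ["cvs", "-z3", "-Q", "update"], tree)
                   for _ in range(2)]
        return checkout_s, updates
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


def benchmark_run(servers: Dict[str, str], module: str,
                  tag: Optional[str] = None, repeats: int = 3,
                  runner: Runner = run_cvs) -> Dict[str, List[List[float]]]:
    """Run the sequence `repeats` times per server; keep only update times."""
    return {
        name: [one_sequence(cvsroot, module, tag, runner)[1]
               for _ in range(repeats)]
        for name, cvsroot in servers.items()
    }
```

Calling `benchmark_run({"gcc.gnu.org": "<CVSROOT>"}, "gcc", tag="gcc-3_1-branch")` with a real CVSROOT filled in would return three `[first_update, second_update]` pairs per server, the same shape as the per-server lists reported further down.
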
The idea with these double updates is to give the system a chance to cache everything in memory, and make it clear how effective the server was at doing that caching.

In general, subversions did a much better job at keeping everything cached in memory. It's possible they have more than the 2GB of RAM that sourceware has, but I'm putting my money on their having less load imposed on their server. It isn't always obvious, but there's a lot of stuff being served off sourceware and http is a particularly tricky little service, sneaking up and eating lots of memory when you're not looking. Bad httpd, bad!

In my test runs against the top of tree (where sourceware has the fileattr caches and subversions has to read in at least the first block of each RCS file), sourceware was about 2x faster to complete. These times were from almost perfectly idle servers--I have enough test runs that I'm confident of when they were otherwise idle. In real terms, sourceware could do a 'cvs update' of the entire gcc repository in 10-11 seconds; subversions did it in 17-22 seconds.

I wanted a harder test, so I modified the script to do its checkouts against gcc-3_1-branch, which will cause both cvs servers to page in at least a couple pages of each RCS file. When I did that, sourceware got about 4x slower (40-50 seconds to complete) and subversions stayed mostly about the same time-wise. e.g.

  Starting at Mon Jan 27 05:29:42 2003 client-system-localtime.
  gcc.gnu.org seconds to perform updates: [63, 41], [109, 54], [97, 58]
  subversions.gnu.org seconds to perform updates: [25, 28], [44, 26], [29, 24]
  Finished at Mon Jan 27 06:16:58 2003 client-system-localtime.

  Starting at Mon Jan 27 06:16:58 2003 client-system-localtime.
  gcc.gnu.org seconds to perform updates: [56, 49], [78, 84], [29, 19]
  subversions.gnu.org seconds to perform updates: [46, 25], [28, 21], [25, 21]
  Finished at Mon Jan 27 07:02:29 2003 client-system-localtime.

('client system localtime' is GMT-8, PST)

You can imagine what's going on if you look closely. When you have a pair with a big first number and a smaller second number, the server couldn't keep all the RCS file pages cached in memory after the checkout; it had to fetch some of them from disk again for the first update. Once it did that, it was able to keep these pages on hand until the next update, and that one completed faster. Ignore the occasional bigger number - these numbers always have weird spikes when other processes fire up on the servers.

The general trend is clear -- sourceware takes something like 50-60 seconds to do a cvs update against gcc-3_1-branch when it took 10-11 seconds against HEAD with fileattr. subversions.gnu.org took 25-30 seconds for gcc-3_1-branch when it could do HEAD in 17-22. So in this gcc-3_1-branch update case, subversions is spanking sourceware by about 2x. Maybe faster processors? Fewer services/less load? I honestly don't know for sure.

One thing you really want to measure is fileattr cvs update time vs no-fileattr cvs update time, on the same server with the same load. I didn't do that (I could do it by creating a custom cvs not installed anywhere easily visible and using that, I suppose), but you can draw some conclusions from all these numbers.

subversions.gnu.org paid a penalty of about 35% for doing a cvs update on gcc-3_1-branch vs HEAD. Given that sourceware has cvs update times of 50-60 seconds for gcc-3_1-branch, if that same ratio holds true for sourceware, a cvs update of HEAD without fileattr would take around 40 seconds. It takes only 10-11 seconds with the fileattr cache, so that would lead me to say the fileattr cache gives us a ~4x speedup on this system.

Well anyhow, that's long winded and not that interesting, but I thought I'd share as long as I'd spent all that time when I could have been outside or something. :-)

Disclaimer: I was purposely testing these servers when there was little appreciable load on them.
It's important to separate "how fast can this server be" from "how heavily loaded is this system"; conducting any measurements when those two are intermixed makes it difficult to draw any conclusion beyond "it wuz reel slow the other day". At the same time, the combination is entirely relevant to end users because that dictates what kind of actual performance they'll see.

J
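
For anyone who wants to redo the fileattr estimate from this mail with their own timings, the arithmetic can be sketched roughly as below. The function names are made up for illustration, and taking the midpoint of each quoted range is an assumption; with midpoints the branch penalty on subversions comes out nearer 41% than 35%, but the conclusion lands in the same place.

```python
from typing import Tuple

Range = Tuple[float, float]  # (low, high) update time in seconds


def midpoint(r: Range) -> float:
    """Collapse a quoted range such as 17-22 seconds to a single figure."""
    return (r[0] + r[1]) / 2


def estimate_fileattr_speedup(ref_head: Range, ref_branch: Range,
                              target_branch: Range,
                              target_head_cached: Range
                              ) -> Tuple[float, float, float]:
    """Estimate the speedup from fileattr on the target server.

    ref_*    : timings from a server without fileattr (subversions here)
    target_* : timings from the server with fileattr on HEAD (sourceware)

    Assumes the branch-vs-HEAD ratio seen on the reference server would
    also hold on the target server if it had no fileattr cache.
    """
    # How much slower a branch update is than a HEAD update, no fileattr.
    branch_ratio = midpoint(ref_branch) / midpoint(ref_head)
    # What HEAD would cost on the target without the fileattr cache.
    head_uncached = midpoint(target_branch) / branch_ratio
    # Measured HEAD time with fileattr vs. the estimate without it.
    speedup = head_uncached / midpoint(target_head_cached)
    return branch_ratio, head_uncached, speedup


ratio, head_uncached, speedup = estimate_fileattr_speedup(
    ref_head=(17, 22),            # subversions, HEAD
    ref_branch=(25, 30),          # subversions, gcc-3_1-branch
    target_branch=(50, 60),       # sourceware, gcc-3_1-branch
    target_head_cached=(10, 11),  # sourceware, HEAD with fileattr
)
print(f"branch/HEAD ratio {ratio:.2f}, "
      f"estimated uncached HEAD {head_uncached:.0f}s, "
      f"fileattr speedup {speedup:.1f}x")
```

With the ranges quoted above this gives an estimated uncached HEAD update of about 39 seconds and a speedup of about 3.7x, in line with the "around 40 seconds" and "~4x" figures.
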