From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Molenda
To: Phil Edwards
Cc: overseers@sources.redhat.com
Subject: Re: splitting sourceware into pieces?
Date: Wed, 21 Feb 2001 13:06:00 -0000
Message-id: <20010221130610.A11627@shell17.ba.best.com>
References: <20010221142147.A20931@disaster.jaj.com>
X-SW-Source: 2001-q1/msg00294.html

On Wed, Feb 21, 2001 at 02:21:47PM -0500, Phil Edwards wrote:
> During the middle of the day, mail seems to be taking upwards of half an
> hour to get processed by the lists; CVS connections can be dog slow for
> anyone who has less than an OC-3, etc, etc, you all have heard this before.

I haven't monitored the system's performance in a long time, but I would
suspect (read: I'd be surprised if I'm wrong) that the bottleneck is
entirely in the T1 to the system.

The oft-mentioned co-location would address this problem. I gather a new
system would be constructed at that time as well. The disks won't get
much faster (drive technology hasn't advanced much in the past two years,
and I'd put in the fastest ~$1k drives I could find back then), but the
CPU will be much faster (it has a 450MHz P-III right now). I suppose
memory might get a bit bigger, but it has 768MB right now.

I kind of doubt we're CPU starved very often, though. If the network is
the bottleneck, then separating the functionality across multiple hosts
doesn't gain much.

If there were a bonanza of extra CPU, a few more tricks could be used,
like maybe using a loadable apache module which would gzip _all_ the
content on the site on the fly. Right now there is a module which will
send gzipped versions of content if they exist, but they have to be
pre-generated, and it can't interoperate with server side includes. This
change would decrease the response time for people browsing the web
archives of mailing lists by quite a bit, I think.

> You get the picture; it doesn't need to all be on one system.

It doesn't need to be, but it makes a lot of things a whole lot easier
when it is. On the good side, all the services on sourceware are tightly
integrated so we can have mailing list archives right in the ftp dir with
no problems. On the bad side, it makes maintenance a lot harder because
anyone wanting to change things has to figure out how all the hair sticks
together.

If it were necessary to stick to the T1 and people were unhappy with
system performance, I'd attack the ftp traffic side first. It's
something like half the traffic, and most of it is gzipped (not even
bzip2'ed!!) gcc snapshots and the like. I'd at least get these
compressed in .bz2 instead of .gz, and I'd probably make the snapshots
etc available only to mirror sites and require users to get them from the
mirrors.

I haven't looked over the ftp logs (/www/logs/ftp*), but I'm sure some
examination of what content is being downloaded the most would suggest
some obvious alternatives for reducing the load on the T1.

Jason
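
For anyone wanting to try the log examination suggested above, here is a minimal sketch. It assumes the files under /www/logs/ftp* are in the common wu-ftpd style xferlog layout (five date tokens, transfer time, remote host, file size, filename, transfer type, action flag, direction); the message does not say which format the logs use, so treat that as a guess. The function names tally_downloads and report are hypothetical.

```python
from collections import Counter


def tally_downloads(lines):
    """Count hits and bytes per file for outgoing (download) transfers.

    ASSUMPTION: each line follows the xferlog layout, so after splitting
    on whitespace: fields[7] = size in bytes, fields[8] = filename,
    fields[11] = direction ('o' for outgoing, 'i' for incoming).
    """
    hits, volume = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 12:
            continue  # too short to be an xferlog record; skip it
        size, name, direction = fields[7], fields[8], fields[11]
        if direction != "o" or not size.isdigit():
            continue  # ignore uploads and malformed size fields
        hits[name] += 1
        volume[name] += int(size)
    return hits, volume


def report(lines, top=10):
    """Return (filename, hits, bytes, percent of total bytes) for the
    files that account for the most downloaded bytes."""
    hits, volume = tally_downloads(lines)
    total = sum(volume.values()) or 1  # avoid dividing by zero on empty logs
    return [
        (name, hits[name], nbytes, 100.0 * nbytes / total)
        for name, nbytes in volume.most_common(top)
    ]
```

Feeding the lines of each log file to report() would give the handful of files that account for most of the bytes, which is the sort of list that would show whether the gcc snapshots really dominate the T1.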