From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jason Molenda
To: Andrew Cagney
Cc: overseers@sourceware.cygnus.com
Subject: Re: ftp mirrors
Date: Sat, 30 Dec 2000 06:08:00 -0000
Message-id: <20000610235503.A8352@shell17.ba.best.com>
References: <200006091545.LAA00938.cygnus.project.sourcemaster@envy.delorie.com> <3941CC03.B74B3373@cygnus.com> <20000610010132.A20997@shell17.ba.best.com> <394223B7.A1A849E9@cygnus.com> <3942E532.5D17574@cygnus.com>
X-SW-Source: 2000/msg00636.html

On Sun, Jun 11, 2000 at 11:02:43AM +1000, Andrew Cagney wrote:

> Given two nightly snapshots there are very few differences. Once the
> tar ball has gone through gzip, however, all similarity is lost.

I guess I don't understand this. You're suggesting that rsync is clever
enough to see two files on the server, foo-2000-06-09 and foo-2000-06-10,
see that the client already has foo-2000-06-09, and make the leap that
the -09 and -10 files are probably pretty close to each other, so do a
diff between the server's -09 and -10 and send that diff? I've never
heard of that - do you have something to back it up? I must admit to
being a little incredulous.

> Its a lot more efficient to rsync the uncompressed tar-ball than it is
> to down load the compressed version (well it is for me :-).

I'd have to see a clear example of this before I believed it - it just
doesn't make any sense. You saw a file, which existed in both .tar and
.tar.gz format on sourceware, and neither existed on your local host,
and rsync'ing both of them to your system resulted in the .tar file
downloading noticeably faster than the .tar.gz file? I'm sorry, but I
just can't take that on faith. Can you outline a little more explicitly
what you're measuring here?

Think about it. The .tar file is 30MB on disk. The .tar.gz, which was
gzipped with gzip -9 presumably, is 10MB. The default gzip -3 gives you,
say, a file length of 13MB. If rsync sends the .tar.gz, it sends 10MB of
data. If rsync sends the .tar file without compression, it sends 30MB of
data. If rsync sends the .tar file with compression, it sends 10-13MB of
data. Where's the gain?

> One issue raised by individual testers during the gdb 5.0 release
> process was the logistics of repeatedly dragging down 10mb tar-balls.

Let me introduce you to my good friend, "diff". :-)

If someone is following a release process, they either need to (a) use
CVS, (b) download diffs back to the last fully downloaded snapshot they
have, or (c) have an amazingly fast net connection. If they don't have
an amazingly fast net connection, and they're downloading snapshots
every day *and complaining about it*, then the solution here is to
educate them on the magic of patch, not to put uncompressed files on the
ftp server. Fix the developer's problem at the right place -- at the
developer.

> Next time around I'll have the un-compressed tar ball available. It
> occures to me that this could be scaled :-)

Of course I don't maintain sourceware any longer, but the disk usage for
having uncompressed tar files, and having dozens of people download a
file that is 4x larger than necessary -- for no gain -- is simply
unacceptable. We don't have the disk space, we don't have the bandwidth.
But as I said, I don't run sourceware any longer, so it's out of my
hands.

Jason
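
The "Where's the gain?" arithmetic above can be written out as a small
sketch. It covers only the case the message describes, where the client
has no earlier copy of the file, so the whole file must cross the wire.
The sizes are the figures quoted in the message (the 13MB one is the
author's own estimate); the function names and scenario labels are
hypothetical, chosen for illustration.

```python
# Sizes in MB, taken from the message above. The upper bound of 13 MB for
# in-transit compression is the author's own estimate, not a measurement.
TAR_MB = 30
TAR_GZ_MB = 10
TAR_COMPRESSED_IN_TRANSIT_MB = (10, 13)  # (best case, worst case)


def full_transfer_costs():
    """Return (best, worst) MB sent for each way of fetching a new snapshot.

    Assumes the client holds no earlier copy, so nothing can be skipped.
    """
    return {
        "tar.gz as-is": (TAR_GZ_MB, TAR_GZ_MB),
        "tar, no compression": (TAR_MB, TAR_MB),
        "tar, compressed in transit": TAR_COMPRESSED_IN_TRANSIT_MB,
    }


def cheapest(costs):
    """Pick the scenario with the smallest worst-case transfer."""
    return min(costs, key=lambda name: costs[name][1])


if __name__ == "__main__":
    costs = full_transfer_costs()
    for name, (best, worst) in costs.items():
        print(f"{name}: {best}-{worst} MB")
    print("cheapest:", cheapest(costs))
```

Under these figures, sending the uncompressed .tar is never cheaper than
sending the .tar.gz for a first-time download, which is the point the
message makes.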