From: Brian Inglis
Subject: Re: scallywag / cygport not pulling lzip
Reply-To: cygwin-apps@cygwin.com
To: cygwin-apps@cygwin.com
Organization: Systematic Software
Date: Tue, 12 Jan 2021 09:28:09 -0700
References: <3cd5f2f8-b292-0ac1-de18-753a4513f6ba@gmail.com> <87wnwnk1er.fsf@Otto.invalid> <87a6tj9wst.fsf@Rainer.invalid> <871rev9rbx.fsf@Rainer.invalid> <6bbcf804-593b-16be-dfd1-e1430c1c4bf8@SystematicSw.ab.ca>
In-Reply-To: <6bbcf804-593b-16be-dfd1-e1430c1c4bf8@SystematicSw.ab.ca>
List-Id: Cygwin package maintainer discussion list

On 2021-01-11 15:05, Brian Inglis wrote:
> On 2021-01-08 12:11, Achim Gratz wrote:
>> Brian Inglis writes:
>>> Do we know what the frequency weighted difference is on bandwidth of
>>> packages actually downloaded?
>>
>> Not that I know of, as everything goes through mirrors.  But I happen to
>> have a complete Cygwin mirror on disk at the moment plus another one
>> that only has the packages for my install and that's a fairly large
>> installation, but without a desktop environment:
>>
>> 30G     /mnt/mirror/cygwin
>> 149G    /mnt/fullmirror/cygwin

Currently mirrors.html states 160GB total.

>> So you can probably assume that only about 20% of the files are
>> frequently accessed (likely significantly less since most folks would
>> not install the debuginfo or source packages that are included in the
>> above figure).
>>
>>> I am more concerned with mirror providers (and also the lack of them)
>>> especially those with limited resources, and those in marginal
>>> locations and circumstances, for whom download time and charges may
>>> override other considerations, and perhaps prevent them (or many) from
>>> accessing or taking full advantage of available software.
>>
>> We could save way more space than that by de-duplicating the noarch
>> parts into their own archives as I have already demonstrated before.
>> The last time I did that I was cutting out around 30GiB IIRC.

Rsync delta download data transfer quantities are still significant on
metered or limited accounts with high overage charges: mirror storage cost
is trivial by comparison.
Parents are being hit with $Ks monthly overage charges run up by their kids
downloading and running games on multiple devices, or cut off after hitting
their limit if they block overages, which prevents access to school or work
until the next month capacity allowance kicks in!

>>> I doubt the unarchiving time difference is more than a blip in the
>>> total time required to *download* *AND* install any package, greatly
>>> outweighed by the download time difference, unless you are on a big
>>> pipe to a nearby mirror.
>>
>> It is not, with a typical VDSL connection I'd be able to download faster
>> than I can install on a more typical machine, I need only about 5MiB/s
>> to saturate the filesystem for small files and around 20…40MiB/s for
>> large ones (to an NVMe drive, a spinning disk or some of the slower SSD
>> can't sustain that).  But that point is somewhat moot since setup will
>> always mirror to disk first and that's not easy to change since we read
>> the file twice: once for the SHA512 check (which can use up to around
>> 300MiB/s input bandwidth somewhat higher in peaks and then the actual
>> installation).
>
> Some setup phase stats from my own most recent upgrade of about 130MB of
> downloads, and stats since 2013 (nearly 8 years) since I last cleared setup.log:

Added average and standard deviation per run for multiple runs, and total
times for all runs, including another set with data only since the solver
was added early 2018; trivial runs with no download etc.
phase times are not counted, but their solve times may be included in stats:

$ cyg-setup-phase-times.awk /var/log/setup.log.full
sv 00:04:26 dl 00:01:28 pr 00:00:02 ui 00:00:36 ex 00:03:35 pi 00:07:12 tot 00:17:19 runs 1

$ cyg-setup-phase-times.awk /var/log/setup.log.solve
sv 00:04:06 dl 00:00:33 pr 00:00:04 ui 00:00:20 ex 00:01:16 pi 00:12:05 avg 00:18:27 runs 66
sv 00:06:11 dl 00:00:53 pr 00:00:17 ui 00:00:25 ex 00:02:05 pi 00:11:08 dev 00:12:57 runs 66
sv 04:30:50 dl 00:37:07 pr 00:04:57 ui 00:22:03 ex 01:24:25 pi 13:18:32 tot 20:17:54 runs 66

$ cyg-setup-phase-times.awk /var/log/setup.log
sv 00:00:45 dl 00:00:25 pr 00:00:02 ui 00:00:08 ex 00:01:01 pi 00:04:01 avg 00:06:24 runs 358
sv 00:03:06 dl 00:00:57 pr 00:00:09 ui 00:00:15 ex 00:05:29 pi 00:06:25 dev 00:09:03 runs 358
sv 04:32:46 dl 02:31:27 pr 00:16:08 ui 00:51:10 ex 06:04:40 pi 23:58:16 tot 1 14:14:27 runs 358

> phases are:
>
> sv - solve formerly Adding required packages - high times are interaction delays
> dl - download
> pr - preremove
> ui - uninstall
> ex - extract
> pi - postinstall
>
> so your comments about extracts are validated, taking nearly 3 times as long as
> downloads on a currently 2MByte/s medium speed cable modem link to a nearby
> (7.5km direct, 11km drive, 15 hop 10ms round trip) university campus mirror in
> recent years.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains too much technical detail.
Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
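
For anyone wanting to cross-check the phase totals quoted above, here is a
minimal Python sketch. It is not the cyg-setup-phase-times.awk script (whose
source is not shown in this message); the function name parse_phase_line and
the field layout "label [days] HH:MM:SS" are assumptions inferred only from
the printed output. It parses the 358-run totals line, checks that the six
phases add up to the reported total, and computes the extract/download ratio.

```python
import re

# Totals line for all 358 runs, copied from the stats above.
TOTALS = ("sv 04:32:46 dl 02:31:27 pr 00:16:08 ui 00:51:10 "
          "ex 06:04:40 pi 23:58:16 tot 1 14:14:27 runs 358")


def parse_phase_line(line):
    """Return {label: seconds} for each 'label [days] HH:MM:SS' field.

    Hypothetical helper: the optional day count is assumed from the
    'tot 1 14:14:27' field; 'runs N' has no time and is skipped.
    """
    fields = re.findall(r"([a-z]+) (?:(\d+) )?(\d+):(\d\d):(\d\d)", line)
    return {
        label: ((int(days or 0) * 24 + int(h)) * 60 + int(m)) * 60 + int(s)
        for label, days, h, m, s in fields
    }


times = parse_phase_line(TOTALS)
phases = ("sv", "dl", "pr", "ui", "ex", "pi")
phase_sum = sum(times[p] for p in phases)   # seconds across all six phases
ex_to_dl = times["ex"] / times["dl"]        # extract time relative to download

if __name__ == "__main__":
    print("phases sum to reported total:", phase_sum == times["tot"])
    print("extract/download ratio: %.2f" % ex_to_dl)
```

On these totals the six phases do add up to the reported total, and extract
time comes out at roughly 2.4 times download time over all 358 runs.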