From: Christian Franke
To: cygwin-apps@cygwin.com
Subject: Re: [PATCH setup] Add new option '--compact-os'
Date: Thu, 13 May 2021 16:42:58 +0200

Corinna Vinschen via Cygwin-apps wrote:
> On May 12 16:14, Jon Turney wrote:
>> On 08/05/2021 21:03, Christian Franke wrote:
>> [...]
>>> +bool io_stream_cygfile::compact_os_is_available = (OSMajorVersion () >= 10);
>> The documentation seems a bit vague, but are we really expecting this to
>> work on Windows 10 1507?
> I think this could even work under 8.1 from what I can see on MSDN.

I skipped all Win8*, so I didn't test with 8.1 :-)

This page says "Available starting with Windows 10":
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_provider_external_info_v0

It also says "Header: ntifs.h" but in recent "Windows Kits" all required
defines are in winioctl.h. These defines are enabled even for
'>= _WIN32_WINNT_WIN7'. According to a test I did some time ago, Win7
could not read these files.

>
>>> +{
>>> +  const char * const p = name.c_str();
>>> +  if (!(!strncmp (p, "/bin/", 5) || !strncmp (p, "/sbin/", 6) || !strncmp (p, "/usr/", 5)))
>>> +    return true; /* File is not in R/O tree. */
>>> +  const size_t len = name.size(); /* >= 5 */
>>> +  if (!strcmp (p + (len - 4), ".dll") || !strcmp (p + (len - 3), ".so"))
>>> +    return true; /* Rebase will open file for writing which uncompresses the file. */
>>> +  if (!strcmp (p + (len - 3), ".gz") || !strcmp (p + (len - 3), ".xz"))
>>> +    return true; /* File is already compressed. */
>> Is this an assertion that there are no .bz2, .lzma, .zst etc. files in the
>> install?
> Another question is this: FILE_PROVIDER_COMPRESSION_LZX
> "This algorithm is designed to be highly compact, and provides for small
> footprint for infrequently accessed data."
>
> When running a shell script, certain executables (especially coreutils,
> gawk, sed, grep, find) are not so very infrequently accessed. Is this
> compression really feasible for these binaries? Did you compare shell
> script performance with non-compressed, XPRESS16K and LZX compressed
> /bin dir?

Good point. I have now tested this with a ./configure script run after
reboot: there was no significant difference between /bin/*.exe (only)
being uncompressed, NTFS-, XPRESS16K- or LZX-compressed. The run always
took around 23s.

Here is a read speed test with fast and slow storage on a 10+ year old
i7-2600K (4C/8T). The 256MiB test file was generated by concatenating
various EXE files. All file accesses were the first after reboot. AV
(Defender) was turned off:

 Compression MiB      T1     T2   T3,T4
 ======================================
 None        256   0.69s  10.1s  <0.02s
 NTFS        159   1.03s   8.1s  <0.02s
 XPRESS4K    138   -
 XPRESS8K    128   -
 XPRESS16K   123   0.64s   5.4s  <0.02s
 LZX          97   0.79s   4.8s  <0.02s

T1,T2: Read whole file: time dd if=FILE bs=FILESIZE of=/dev/null
T3,T4: Read last byte:  time dd if=FILE bs=1 skip=FILESIZE-1 of=/dev/null
T1,T3: SATA SSD, raw read speed with dd bs=1M: ~520MB/s
T2,T4: USB3 flash drive via USB2, raw read speed: ~27MB/s

As expected, compression helps to improve 'virtual' read speed on slow
storage. Otherwise, it depends on storage speed, CPU speed, system
load, ...

Unexpectedly (for me), even LZX seems to be suitable for random reads,
which are done when EXE files are preloaded or paged in.

If the files were already cached, all read times were similar: ~0.135s
for the whole file.

For more flexibility, I will provide a new version of the patch with a
'--compact-os ALGORITHM' option.

Thanks,
Christian
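
The two dd read tests from the legend could be scripted roughly as
follows. This is only a sketch and not part of the patch: the function
names file_size, read_whole and read_last_byte are made up for
illustration, and the file argument stands for whatever test file was
prepared. As in the measurements, timings are only meaningful for the
first access after a reboot, because cached reads all look alike.

```shell
# Size of a file in bytes; tr strips padding some wc implementations add.
file_size() {
  wc -c < "$1" | tr -d ' '
}

# T1/T2: read the whole file with a block size equal to the file size.
read_whole() {
  dd if="$1" bs="$(file_size "$1")" of=/dev/null 2>/dev/null
}

# T3/T4: skip to the last byte and read only that one.
read_last_byte() {
  dd if="$1" bs=1 skip="$(( $(file_size "$1") - 1 ))" of=/dev/null 2>/dev/null
}
```

In bash, 'time read_whole FILE' then corresponds to T1/T2 and
'time read_last_byte FILE' to T3/T4.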