From: Brian Inglis
Reply-To: Brian.Inglis@SystematicSw.ab.ca
To: cygwin@cygwin.com
Subject: Re: RFE: find -d -size 0 => doesn't find empty directories
Date: Fri, 02 Nov 2018 05:05:00 -0000
Message-ID: <787c5490-d4a8-46e1-20d6-e7fc9a1f5db8@SystematicSw.ab.ca>
References: <5BDA347D.8070909@tlinx.org> <1792215646.20181101191249@yandex.ru>
In-Reply-To: <1792215646.20181101191249@yandex.ru>

On 2018-11-01 10:12, Andrey Repin wrote:
> L A Walsh wrote:
>> Unfortunately, due to directories really not being in the user
>> disk data space, but in the MFT(zone) (I think), the size
>> comes back as zero ('0') for directories.
>> Would it be possible (if not problematic) for the cygwin
>> emulation layer to return some non-zero value if the
>> directory has actual entries in it (ignoring structural
>> values like "." and "..")? Maybe return as 'size' either
>> a dummy number proportional to #entries (like 10*#entries),
>> or something like summing up actual number (+1) of characters
>> in the file list?
>> Would that be difficult to do, or add?

> Having something to this extent would be useful in case of searching for
> directories with too many files, for example.
> I'd vote for something like (entries << 7), which is closer to an average ext2
> counter. No need to ignore anything.

I believe readdir(3) overhead is already high, and adding extraneous lookups
to add metadata which is not readily available under NTFS/exFAT would slow it
even further.

Do you really want readdir(3) or stat(3) to recurse to sum the entry sizes for
each subdirectory?

Some of us have some large messy directories more reminiscent of Unix systems
than typical of Windows systems.

$ time du -sh /tmp/
91M     /tmp/

real    0m5.125s
user    0m0.125s
sys     0m1.077s

$ time du -sh /var/log/
496M    /var/log/

real    0m42.725s
user    0m0.687s
sys     0m9.139s

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
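
For reference, the (entries << 7) idea quoted above can be sketched in user
space. This is only an illustration, not how the Cygwin layer works or would
implement it: the function name synthetic_dir_size is made up here, and the
code simply counts entries with Python's os.scandir(), which does not yield
"." or "..", so an empty directory comes out as 0. Each call has to enumerate
the whole directory, which is the per-directory cost raised in the reply.

```python
import os
import tempfile


def synthetic_dir_size(path: str) -> int:
    """Return a made-up 'size' for a directory: entry count shifted left by 7.

    Hypothetical helper for illustration only. os.scandir() skips "." and
    "..", so an empty directory yields 0. The whole directory is enumerated
    on every call.
    """
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count << 7


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        empty = os.path.join(root, "empty")
        full = os.path.join(root, "full")
        os.mkdir(empty)
        os.mkdir(full)
        # Create three empty files so that "full" has exactly three entries.
        for name in ("a", "b", "c"):
            open(os.path.join(full, name), "w").close()
        print(synthetic_dir_size(empty))  # 0
        print(synthetic_dir_size(full))   # 384, i.e. 3 << 7
```

The sketch does not recurse; summing over subdirectories as well would
multiply the enumeration cost by the depth and breadth of the tree.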