public inbox for cygwin@cygwin.com
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: cygwin@cygwin.com
Subject: Re: RFE: find <path> -d -size 0 => doesn't find empty directories
Date: Fri, 02 Nov 2018 05:05:00 -0000
Message-ID: <787c5490-d4a8-46e1-20d6-e7fc9a1f5db8@SystematicSw.ab.ca>
In-Reply-To: <1792215646.20181101191249@yandex.ru>

On 2018-11-01 10:12, Andrey Repin wrote:
> L A Walsh wrote:
>> Unfortunately, due to directories really not being in the user
>> disk data space, but in the MFT(zone) (I think), the size
>> comes back as zero ('0') for directories.

>> Would it be possible (if not problematic) for the cygwin
>> emulation layer to return some non-zero value if the
>> directory has actual entries in it (ignoring structural
>> values like "." and "..")?  Maybe return as 'size' either
>> a dummy number proportional to #entries (like 10*#entries),
>> or something like summing up actual number (+1) of characters
>> in the file list?

>> Would that be difficult to do, or add?

> Having something to this extent would be useful in case of searching for
> directories with too many files, for example.

> I'd vote for something like (entries << 7), which is closer to an average ext2
> counter. No need to ignore anything.

I believe readdir(3) overhead is already high, and adding extra lookups to
synthesize metadata that is not readily available under NTFS/exFAT would slow
it down even further.
Do you really want readdir(3) or stat(3) to recurse and sum the entry sizes for
each subdirectory?
Some of us have large, messy directories more reminiscent of Unix systems than
typical of Windows systems, where just walking the tree is already slow:

$ time du -sh /tmp/
91M     /tmp/

real    0m5.125s
user    0m0.125s
sys     0m1.077s
$ time du -sh /var/log/
496M    /var/log/

real    0m42.725s
user    0m0.687s
sys     0m9.139s
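
To make the cost concrete, here is a minimal Python sketch (not Cygwin code) of
what a synthetic directory size along the lines of the proposed (entries << 7)
would involve. The names synthetic_dir_size and is_empty_dir are hypothetical,
and so is the assumption that the emulation layer would compute the value this
way. The point is only that counting entries means enumerating the whole
directory on every call, whereas an emptiness check can stop at the first entry.

```python
import os
import tempfile


def synthetic_dir_size(path, shift=7):
    """Hypothetical st_size for a directory: (number of entries) << shift.

    Every call has to enumerate the whole directory, so the cost grows
    with the number of entries, unlike a single metadata lookup.
    """
    with os.scandir(path) as entries:  # scandir never yields "." or ".."
        count = sum(1 for _ in entries)
    return count << shift


def is_empty_dir(path):
    """Return True if the directory has no entries.

    Stops after the first entry, so it stays cheap for huge directories.
    """
    with os.scandir(path) as entries:
        return next(entries, None) is None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(is_empty_dir(d))        # freshly created, so no entries yet
        for name in ("a", "b", "c"):
            open(os.path.join(d, name), "w").close()
        print(synthetic_dir_size(d))  # three entries, shifted left by 7
```

If the goal is only to find empty directories, the early-exit check is all
that is needed; the full count only matters for the "too many files" use case.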

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
