public inbox for cygwin@cygwin.com
From: Linda Walsh <cygwin@tlinx.org>
To: cygwin@cygwin.com
Subject: Re: locate and updatedb
Date: Sat, 13 Feb 2016 12:15:00 -0000
Message-ID: <56BF1E4D.5000901@tlinx.org>
In-Reply-To: <56BD0D87.6030008@gmail.com>

Marco Atzeri wrote:
> On 11/02/2016 19:33, Byron Boulton wrote:
>> On 2/11/2016 1:18 PM, cyg Simple wrote:
>>> On 2/11/2016 9:00 AM, Byron Boulton wrote:
>>>> Does anyone here have success using `updatedb` and `locate` in cygwin? I
>>>> use `locate` heavily on my Linux machines, but every time I've tried to
>>>> run `updatedb` on cygwin I've given up and killed the process because it
>>>> is taking too long.
---
	There's a reason why on linux it is usually set to run
when you are asleep.  ;-)

>>>>  Is there something wrong with cygwin's
>>>> implementation of `updatedb` making it not work at all or making it
>>>> slower than on my Linux machines? Or are there others who have success
>>>> using it on cygwin?

But it might have to do with disk speed and memory.  Laptop drives
are usually among the slowest.


I ran it just now (this is with MS's Home Essentials
real-time protection turned on).

law.Bliss/bin> time index_files.sh
670592 (process ID) old priority 0, new priority 19
44.21sec 15.06usr 28.30sys (98.09% cpu)
> locate / >/tmp/all
> wc /tmp/all
  1479146   4014375 133322318 /tmp/all
> df .
Filesystem      Size  Used Avail Use% Mounted on
C:              949G  585G  365G  62% /
----
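
For a rough sense of scale, the figures in the transcript above can be turned
into a scan rate and an average line length. This is only arithmetic on the
numbers shown; the function name scan_stats is made up for this sketch.

```shell
#!/bin/bash
# scan_stats ENTRIES BYTES SECONDS
# ENTRIES and BYTES are the line and byte counts wc reported for /tmp/all;
# SECONDS is the elapsed time reported by `time`.
scan_stats() {
  # awk does the floating-point division that bash arithmetic cannot.
  awk -v n="$1" -v b="$2" -v t="$3" 'BEGIN {
    printf "entries per second: %.0f\n", n / t
    printf "average bytes per line: %.1f\n", b / n
  }'
}

# Figures copied from the transcript.
scan_stats 1479146 133322318 44.21
```

That comes to roughly 33,000 entries per second, with each line of locate
output averaging about 90 bytes including its newline.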
 

So ~1.5 million entries... Using the following exclusions:

---(index_files.sh)----
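#!/bin/bash
# Run at the lowest priority so indexing does not compete with other work.
renice +19 $$
# Roots to index; add /windows/sysnative when that directory exists.
Local="/"
if [[ -d /windows/sysnative/. ]]; then
  Local+=" /windows/sysnative/."
fi
# Whitespace-separated paths for updatedb to skip.
Prunepaths='/.usr /proc /C /B /H /I /M /D /P /System[[:space:]]Volume[[:space:]]Information /Windows/CSC /pagefile.sys /Music /Pictures /Share /Media /home /Doc /$RECYCLE.BIN /cygdrive'
# Net is not set anywhere in the script as posted; default to empty (no network paths).
Net="${Net:-}"

/bin/updatedb --findoptions=-noleaf --localpaths="$Local" --prunepaths="$Prunepaths" --netpaths="$Net"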
----
Most of those paths are pruned either because they are redundant
or because they live on a local network server...
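
To see how much a prune list shrinks the walk (and, per Marco's point quoted
below, how much of the run time is find itself), one can count what find
visits with a few directories skipped. A minimal sketch: count_unpruned is a
made-up helper, and it uses plain -path tests rather than whatever updatedb
does internally with --prunepaths.

```shell
#!/bin/bash
# count_unpruned ROOT [PRUNE_DIR...]
# Prints how many paths find reports under ROOT when each PRUNE_DIR
# (and everything below it) is skipped.
# Note: find treats each PRUNE_DIR as a glob pattern, so this sketch
# assumes directory names without wildcard characters.
count_unpruned() {
  local root=$1
  shift
  local -a prune=()
  local p
  for p in "$@"; do
    # Build: -path A -o -path B -o ...
    if (( ${#prune[@]} )); then
      prune+=(-o)
    fi
    prune+=(-path "$p")
  done
  if (( ${#prune[@]} )); then
    find "$root" \( "${prune[@]}" \) -prune -o -print | wc -l
  else
    find "$root" -print | wc -l
  fi
}
```

For example, `time count_unpruned / /proc /cygdrive` gives both a count and a
traversal time to compare against a full updatedb run. It does not pass
-noleaf, so timings may differ somewhat from the script above.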

That's fairly fast compared with the full MS Home Essentials malware
scan I run once a week, which takes ~8-16 hours (it scans a
few of my network directories as well).






>>>
>>> Processing every file on the drive will be slow just because it's
>>> Windows.  Initializing the database with updatedb will require a large
>>> amount of time.  There are processes such as AntiVirus intrusion
>>> protection that might make it even slower.
>>>
>> Hmmm, the reason the slowness is particularly strange to me is that in
>> place of using `locate` from my cygwin terminal, I have to use a program
>> called "Everything Search Engine" available at www.voidtools.com. The
>> first time I install it, it takes maybe a few minutes to index the hard
>> drive, then every once in a while when I open the program it takes a few
>> seconds to update the index, but in general the performance for indexing
>> and searching the index is comparable to `updatedb` and `locate` on a
>> Linux machine, so it's possible to do on Windows.
>>
>> Byron
>>
> 
> The time taken by updatedb is mainly due to
> the execution time of "find" on the disks.
> 
> It takes ~ 70 minutes for my 500 GB of data,
> and likely the AV is impacting the execution.
> 
> I suspect voidtools is using MS disk indexing
> to speed things up.
> 
> 
> Regards
> Marco
> 
> 
> 
> 
> -- 
> Problem reports:       http://cygwin.com/problems.html
> FAQ:                   http://cygwin.com/faq/
> Documentation:         http://cygwin.com/docs.html
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
> 


