public inbox for cygwin@cygwin.com
From: Kaz Kylheku <kaz@kylheku.com>
To: Martin Wege <martin.l.wege@gmail.com>
Cc: cygwin@cygwin.com
Subject: Re: rfe: CYGWIN fslinktypes option? Re: Catastrophic Cygwin find . -ls, grep performance on samba share compared to WSL&Linux
Date: Thu, 21 Dec 2023 12:32:51 -0800
Message-ID: <4723aab7e2b331cb81946eff0fb4e862@kylheku.com>
In-Reply-To: <CANH4o6OjJJZQkbELt+H3WdAxQbLGZ1DL0ytevknRpbTO9sVUig@mail.gmail.com>

On 2023-12-21 04:16, Martin Wege via Cygwin wrote:
> On Wed, Dec 20, 2023 at 6:21 PM Kaz Kylheku via Cygwin
> <cygwin@cygwin.com> wrote:
>>
>> On 2023-12-17 22:22, Dan Shelton via Cygwin wrote:
>> > It would be nice if someone from the Cygwin authors could assist me in
>> > figuring out why this happens.
>>
>> Cygwin is famously slow; this is nothing new. We are grateful
>> for Cygwin because it makes stuff work at all; if it were blazing
>> fast that would be a bonus.
>>
>> E.g. git operations (clone, rebase, ...); ./configure scripts; ...: all
>> run like molasses.
>>
>> The following is just my fast and loose opinion, shot from the hip,
>> and possibly off or wrong, but it likely has to do with the layering.
>> Cygwin's core API is based on a C library called Newlib. Cygwin bolts
>> Newlib to Windows by means of an additional shim below Newlib that
>> is based on C++ objects, where there is path munging going on and such,
>> and that's where the Win32 calls get made. It's an additional abstraction.
> 
> I disagree with that. Ok, part of that is that the layering causes
> more memory allocations and copies, but this is not the root cause.

I seem to recall that most operations that take a path argument have
to convert the path from Cygwin form to Win32 form, and I think that
also involves going from 8-bit to UTF-16. That's gotta hurt a bit.
 
> The root cause is IMO the extra Win32 syscalls (>= 3 per file lookup,
> compared to 1 on Linux) to lookup the *.lnk and *.exe.lnk files on
> filesystems which have native link support (NTFS, ReFS, SMBFS, NFS).
> On SMBFS and NFS it hurts the most, because access latency is the
> highest for networked filesystems.

Could some intelligent caching be added there? (Discussion of
associated invalidation problem in 3... 2.... 1... )

Can you discuss more details, so people don't have to dive into code
to understand it? If we are accessing some file "foo", the application
or user may actually be referring to a "foo.lnk" link. But in the
happy case that "foo" exists, why would we bother looking for "foo.lnk"?

If "foo" does not exist, but "foo.lnk" does, that could probably be
cached, so that next time "foo" is accessed, we go straight for "foo.lnk",
and keep using that while it exists.

If someone has both "foo" and "foo.lnk" in the same directory,
that's a bit of a degenerate case; how important is it to be "correct"
there, anyway?
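
To make the caching idea concrete, here is a rough sketch in Python.
It is only a toy model: FakeFS, LinkCache and the probe counter are
made-up names for illustration, and none of this reflects how the
Cygwin DLL actually performs lookups. It shows the three cases above:
the plain name wins without any ".lnk" probe, a ".lnk" hit is
remembered, and a remembered hit is reused while the link exists.

```python
# Illustrative model only: all names here are assumptions,
# not the Cygwin DLL's actual lookup code.

class FakeFS:
    """Stand-in for a slow (e.g. networked) filesystem that counts probes."""
    def __init__(self, names):
        self.names = set(names)
        self.probes = 0

    def exists(self, name):
        self.probes += 1              # each probe models one round trip
        return name in self.names


class LinkCache:
    """Remembers which names were last resolved through a .lnk file."""
    def __init__(self, fs):
        self.fs = fs
        self.via_lnk = set()

    def resolve(self, name):
        lnk = name + ".lnk"
        # Cached hint: go straight for foo.lnk while it still exists.
        if name in self.via_lnk:
            if self.fs.exists(lnk):
                return lnk
            self.via_lnk.discard(name)    # hint went stale; fall through
        # Happy case: the plain name exists, so never look for foo.lnk.
        if self.fs.exists(name):
            return name
        # Fallback: probe foo.lnk and remember the answer if it is there.
        if self.fs.exists(lnk):
            self.via_lnk.add(name)
            return lnk
        return None


fs = FakeFS({"foo", "bar.lnk"})
cache = LinkCache(fs)
for name in ("foo", "bar", "bar"):
    before = fs.probes
    result = cache.resolve(name)
    print(name, "->", result, "probes:", fs.probes - before)
```

The trade-off is the invalidation problem already mentioned: once
"bar" is cached as going through "bar.lnk", a plain "bar" created
later is not noticed until the link disappears.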

> So my proposal would be to add an option ('fslinktypes') to the CYGWIN
> environment variable to define which types of links are supported:
> default 'all', which is a shortcut for 'native,lnk,lnkexe'.
> So in case people do not want 'lnk' link support they just add
> CYGWIN+=' fslinktypes:native' to env, to turn off support for
> lnk/lnk.exe style links, and be happy.

So this complements the winsymlinks option? winsymlinks has to do
with how the Cygwin DLL creates symbolic links, whereas this has to do
with what objects are recognized as links.

The implementation would probably want to compare fslinktypes
and winsymlinks to make sure they are consistent with each other;
if winsymlinks tells Cygwin to make .lnk files, but then fslinktypes
banishes them, that's a conflict that could be diagnosed somehow.
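
A rough sketch of such a check, in Python for brevity. Note that
'fslinktypes' exists only as a proposal in this thread, and the
function names below are invented for illustration; this is not how
the Cygwin DLL parses the CYGWIN variable. The sketch assumes that a
bare 'winsymlinks' means the same as 'winsymlinks:lnk', and it only
checks the .lnk case.

```python
# Hypothetical sketch: 'fslinktypes' is only a proposal from this thread,
# and this parser is not the Cygwin DLL's real option handling.

ALL_LINK_TYPES = {"native", "lnk", "lnkexe"}


def parse_cygwin(value):
    """Split a CYGWIN-style string into {option: argument-or-None}."""
    opts = {}
    for word in value.split():
        name, _, arg = word.partition(":")
        opts[name] = arg or None
    return opts


def recognized_link_types(opts):
    """Link types to recognize; the proposed default is 'all'."""
    arg = opts.get("fslinktypes") or "all"
    types = set(arg.split(","))
    if "all" in types:
        return set(ALL_LINK_TYPES)
    return types & ALL_LINK_TYPES


def check_consistency(opts):
    """Return a warning string if created .lnk links would not be recognized."""
    if "winsymlinks" not in opts:
        return None
    # Assumption: a bare 'winsymlinks' selects the .lnk (shortcut) style.
    creates = opts["winsymlinks"] or "lnk"
    if creates == "lnk" and "lnk" not in recognized_link_types(opts):
        return "winsymlinks creates .lnk files, but fslinktypes excludes 'lnk'"
    return None


for setting in ("winsymlinks:lnk fslinktypes:native",
                "winsymlinks:lnk",
                "winsymlinks:native fslinktypes:native"):
    print(repr(setting), "->", check_consistency(parse_cygwin(setting)))
```

Whether such a conflict should be a warning, an error, or silently
resolved in favor of one option is a separate design question.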
