public inbox for cygwin@cygwin.com
 help / color / mirror / Atom feed
From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Subject: Re: Cygwin generates syscalls for *.lnk files on filesystems with native symlink support?
Date: Mon, 8 Jan 2024 14:56:59 +0100	[thread overview]
Message-ID: <ZZv_K1NyxE-btqQt@calimero.vinschen.de> (raw)
In-Reply-To: <CALXu0Ue9SyJod+0k24pQzs3KPg1RPquRfhN3tw3GYG-qMt_+DQ@mail.gmail.com>

On Dec 18 13:04, Cedric Blancher via Cygwin wrote:
> On Fri, 1 Sept 2023 at 13:00, Corinna Vinschen via Cygwin
> <cygwin@cygwin.com> wrote:
> >
> > On Sep  1 06:23, Cedric Blancher via Cygwin wrote:
> > > Good morning!
> > >
> > > During a Cygwin 3.4.8-1.x86_64 debugging session I noticed something
> > > odd when I looked at the network traffic generated by one of our
> > > cluster nodes:
> > > It seems that for each call to a tool (i.e. starting "sed" from
> > > "bash") Cygwin searches for *.lnk files.
> > >
> > > Is this correct even when the filesystem in question has native
> > > symlink support (e.g. NFS)?
> >
> > Yes.  During file handling, Cygwin doesn't know what filesystem a
> > file is on until it could actually open the file and request file
> > and filesystem info from the open handle.
> 
> Why? If you have the path name you could lookup the (cached) mount
> points, and determine the filesystem type. Same solution applies for
> UNC paths, where you can easily lookup the filesystem type, and cache
> it per-process or in Cygserver.

No can do.

We *do* filesystem info caching, but this requires to be able to
open the file in the first place.  If the file exists, we're done, but
if not, we have to evaluate all symbolic links in the path first.
Simply reducing the path to the mount point and then translate it into
a Windows path doesn't cut it.

After the file (or its parent dir) could be opened, so we know the path
is valid, we can fetch the filesystem info right from the open file
handle.  At this point, we can shortcut the whole thing, reducing
the three necessary calls to kernel functions to only the first one
and skipping all the filesystem evaluation code.

> > So if Cygwin couldn't open
> > "foo" because the NtCreateFile call returned with status
> > STATUS_OBJECT_PATH_NOT_FOUND or STATUS_OBJECT_NAME_NOT_FOUND, or
> > STATUS_NO_SUCH_FILE, or one of the countless other status codes the
> > kernel (or the driver) might return in case a file doesn't exist,
> > it will tack on .lnk and .exe and, for historical reasons, .exe.lnk,
> > and try again.
> 
> Can this machinery please be turned off via CYGWIN env var option? As
> discussed in https://www.mail-archive.com/cygwin@cygwin.com/msg174547.html
> this machinery causes very bad filesystem lookup performance, and it
> would IMO a good idea to have an option to turn this off, and just
> allow and expect native links (for NTFS, ReFS and NFS). Maybe CYGWIN
> env var option winsymlinks_expect:native? winsymlinks_expect takes a :
> seperated list which symlink types are to be expected.

We won't add any more options to the CYGWIN env variable if it's not
necessary, and there's no proof at all that this is necessary in this
case.

If the file exists as specified, there will be no further check for
.exe or .lnk.  If it doesn't exist, *and* the return code from
the kernel is STATUS_OBJECT_PATH_NOT_FOUND (or any one of a
number of equivalent return values) we're done here and are going
to fall back to checking for symlinks in the leading path components.

If we got over that, the check for .exe is unavoidable.

So we're at the .lnk check. I'm pretty sure, if you try to measure the
performance it's not that bad. After all, at this point, the OS has
already cached the parent dir info and the NtOpenFile call returns
adequately snappy.

We might get away with dropping the .exe.lnk check.  In the past we
had the problem that these file existed, and you can still easily
create them, but theoretically nobody should do this.  Getting rid
of this check may be a minor backward incompatibility.

The additional problem with getting rid of .lnk is this: Even if you
switch to native symlinks exclusively (which may subtly break POSIX
compatibility, which is why Cygwin defaults to
IO_REPARSE_TAG_LX_SYMLINK) you won't switch to native symlinks
exclusively.  The reason is that FIFOs (and other device files but those
don;'t really matter since they were never fully utilized) are
implemented as Windows shortcuts.

Having said that.  I'm not opposed to fixes and improvements to the
path handling code.  It's long, winding and convoluted, and there's
certainly room for improvement.

We won't do that for Cygwin 3.5, though.  If all goes well, it will be
release in the next few weeks.

After that, when we cut the 3.5 branch, the main branch is open for new
stuff for 3.6.

However, before making generic suggestions, which may or may not be
helpful, it would be really great if you could have a look into the code
and try to understand why things are done the way they are done.  I'm
open to questions if you don't understand the code, but at least the
last couple of years, we tried hard to add comments everywhere to
explain things.  If you see a way to improve the code *without breaking
stuff*, I'm all ears, but it needs to be based on real, testable code
changes or at least POC code.

For instance: Getting rid of .lnk files isn't easy with backward
compatibility in mind.  Creating FIFOs differently isn't that much
trouble, but if a new DLL doesn't handle .lnk files it doesn't handle
the old FIFOs and already existing installations would break for no
apparent reason.  We don't want that.  Also, as long as we support MVFS,
we still need shortcut symlinks.  So we need either a transition period
or an incredibly bright idea how to fix it on the spot.


Corinna

  reply	other threads:[~2024-01-08 13:57 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-01  4:23 Cedric Blancher
2023-09-01 10:57 ` Corinna Vinschen
2023-09-26  5:12   ` Cedric Blancher
2023-12-18 12:04   ` Cedric Blancher
2024-01-08 13:56     ` Corinna Vinschen [this message]
2024-01-08 17:11       ` matthew patton
2024-01-08 18:11         ` Corinna Vinschen
2024-01-08 18:44           ` matthew patton
2024-01-08 19:05             ` Rainer Emrich
2024-01-08 19:17             ` Jeffrey Altman
2024-01-08 19:21               ` Corinna Vinschen
2024-01-08 19:57               ` matthew patton
2024-01-08 20:27                 ` Brian Inglis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZv_K1NyxE-btqQt@calimero.vinschen.de \
    --to=corinna-cygwin@cygwin.com \
    --cc=cygwin@cygwin.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).