From: Jeffrey Altman <jaltman@secure-endpoints.com>
To: cygwin@cygwin.com
Subject: Re: ls/stat on OneDrive causes download of files
Date: Wed, 6 Mar 2024 13:55:17 -0500
Message-ID: <7d9fe460-5704-424b-a89b-e34ef2176d38@secure-endpoints.com>
In-Reply-To: <ZeilkJK7Csryuzkc@calimero.vinschen.de>

On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> We can add an explicit call to
>
> RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
>
> Given that the content of reparse points with these reparse tags are
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.
>
>
> Corinna

I'm not an expert in this area by any means, but here are my
recollections from when Microsoft presented in person on cloud
placeholders to filter and filesystem developers many years ago.

Files and directories that are placeholders should have either the
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN
file attributes set. When these attributes are set, applications and
mini filters are advised not to "read" or "open" the files or
directories unless they absolutely need to because doing so will cause
the placeholder to be replaced by an object containing the actual data
which might take a long time to fetch, might cost the end user money, or
might fail depending upon the network connectivity. In particular,
anti-malware should ignore them during scans and only analyze the data
when it is fetched locally by an end user application.
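
As a rough illustration of the attribute test described above, here is a
minimal C sketch. The helper name is_recall_placeholder is hypothetical,
and the two flag values are repeated from the Windows SDK headers only so
the snippet compiles on its own; on Windows you would include <windows.h>
and pass in attributes obtained from directory enumeration (for example
the dwFileAttributes field of a WIN32_FIND_DATAW).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h). They are repeated here
   only so this sketch is self-contained; prefer <windows.h> on Windows. */
#ifndef FILE_ATTRIBUTE_RECALL_ON_OPEN
#define FILE_ATTRIBUTE_RECALL_ON_OPEN        0x00040000u
#endif
#ifndef FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
#define FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 0x00400000u
#endif

/* Hypothetical helper: returns true if either recall attribute is set,
   i.e. the object is probably not fully present on local storage. */
bool is_recall_placeholder(uint32_t attrs)
{
    const uint32_t recall_mask = FILE_ATTRIBUTE_RECALL_ON_OPEN |
                                 FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS;
    return (attrs & recall_mask) != 0;
}
```

A caller would check this before deciding to open or read an object, for
instance while listing a directory.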

I believe that IO_REPARSE_TAG_FILE_PLACEHOLDER was replaced by
IO_REPARSE_TAG_CLOUD_1 .. IO_REPARSE_TAG_CLOUD_F. Any reparse tag
attached to a placeholder object is for the interpretation of the filter
associated with the back-end storage and not for the consumption of
applications. The content of the reparse points can be back-end
proprietary; different reparse data for OneDrive, iCloud, Dropbox, etc.
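
For what it's worth, recognizing the tag family without looking inside
the buffer is straightforward. The sketch below is only an illustration:
the helper names are hypothetical, and the tag values are repeated from
the Windows SDK headers so the snippet builds standalone. It deliberately
does not attempt to parse the reparse data, since that is private to each
back end.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h); repeated here only so
   the sketch is self-contained. */
#ifndef IO_REPARSE_TAG_FILE_PLACEHOLDER
#define IO_REPARSE_TAG_FILE_PLACEHOLDER 0x80000015u
#endif
#ifndef IO_REPARSE_TAG_CLOUD
#define IO_REPARSE_TAG_CLOUD            0x9000001Au
#endif
#ifndef IO_REPARSE_TAG_CLOUD_MASK
#define IO_REPARSE_TAG_CLOUD_MASK       0x0000F000u
#endif

/* Hypothetical helper: true for IO_REPARSE_TAG_CLOUD and for
   IO_REPARSE_TAG_CLOUD_1 .. IO_REPARSE_TAG_CLOUD_F, which differ from the
   base tag only in the bits covered by IO_REPARSE_TAG_CLOUD_MASK. */
bool is_cloud_reparse_tag(uint32_t tag)
{
    return (tag & ~IO_REPARSE_TAG_CLOUD_MASK) == IO_REPARSE_TAG_CLOUD;
}

/* Hypothetical helper: also accepts the older file placeholder tag. */
bool is_placeholder_reparse_tag(uint32_t tag)
{
    return tag == IO_REPARSE_TAG_FILE_PLACEHOLDER || is_cloud_reparse_tag(tag);
}
```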

The default ProcessPlaceholderCompatibilityMode is
PHCM_EXPOSE_PLACEHOLDERS, which makes the FILE_ATTRIBUTE flags and
reparse tags visible. Microsoft maintains a database of processes for
which PHCM_DISGUISE_PLACEHOLDER is set, which hides that information. It's
unclear to me whether explicitly setting the placeholder compatibility mode
is useful.

I'm not sure that exposing the object as a symlink is a good idea. A
POSIX symlink is an object whose type and target information cannot
change. In the case of a placeholder, the placeholder is silently
replaced by the actual object either when the object is opened or the
object's data is accessed. An application that believes it knows that
the object is a symlink will be mighty confused when it turns out to be
a file or a directory.

Perhaps the question that needs to be asked is whether there are opens
that can be skipped if an object is known to not be locally present
(either of the FILE_ATTRIBUTE flags is set)?
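
To make the question concrete, here is one way it could be framed in C,
purely as a thought experiment. The enum, the function name, and the
policy itself are my assumptions based on the attribute names (any open
recalls a RECALL_ON_OPEN object; only reading data recalls a
RECALL_ON_DATA_ACCESS object; relying on directory enumeration alone
recalls nothing). None of this is documented Cygwin or Windows behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h); repeated here only so
   the sketch is self-contained. */
#ifndef FILE_ATTRIBUTE_RECALL_ON_OPEN
#define FILE_ATTRIBUTE_RECALL_ON_OPEN        0x00040000u
#endif
#ifndef FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
#define FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 0x00400000u
#endif

/* Hypothetical classification of what the caller is about to do. */
enum access_kind {
    ACCESS_ENUM_METADATA, /* use only what directory enumeration returned */
    ACCESS_OPEN_HANDLE,   /* open the object without reading its data */
    ACCESS_READ_DATA      /* open the object and read its contents */
};

/* Hypothetical policy: returns true if the planned access could cause the
   placeholder to be fetched, so the caller may prefer to skip it. */
bool may_trigger_recall(uint32_t attrs, enum access_kind kind)
{
    switch (kind) {
    case ACCESS_ENUM_METADATA:
        return false;
    case ACCESS_OPEN_HANDLE:
        return (attrs & FILE_ATTRIBUTE_RECALL_ON_OPEN) != 0;
    case ACCESS_READ_DATA:
        return (attrs & (FILE_ATTRIBUTE_RECALL_ON_OPEN |
                         FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS)) != 0;
    }
    return true; /* unknown access kind: assume the worst */
}
```

Under this assumed policy, a stat-like call that needs more than the
enumerated metadata would be skippable only for RECALL_ON_OPEN objects,
which is exactly the part I am unsure about.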

Jeffrey Altman