* ls/stat on OneDrive causes download of files
@ 2024-03-06 0:54 Marcin Wisnicki
2024-03-06 13:22 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-06 0:54 UTC (permalink / raw)
To: cygwin
If I invoke ls or anything else that does stat inside OneDrive folder
it will trigger download of all files.
OneDrive uses placeholder files[1] to represent remote files.
I'm guessing reading file content in stat is to support detection of
actually executable files as in here[2]?
I think this should be disabled on non-hydrated placeholder files.
Running `find` or `ls -R` and having your entire OneDrive downloaded
is extremely problematic.
I could live without executable scripts in the OneDrive folder and
it's easy to mark files as always offline to solve it.
Another idea is to skip checking files with extensions known to be
non-executable such as jpg (or just any extension that is not known
to be executable).
This was previously reported in
https://github.com/msys2/msys2-runtime/issues/206.
[1] https://learn.microsoft.com/en-us/windows/win32/w8cookbook/placeholder-files
[2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ls/stat on OneDrive causes download of files
2024-03-06 0:54 ls/stat on OneDrive causes download of files Marcin Wisnicki
@ 2024-03-06 13:22 ` Corinna Vinschen
2024-03-06 13:28 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 13:22 UTC (permalink / raw)
To: cygwin
On Mar 5 19:54, Marcin Wisnicki via Cygwin wrote:
> If I invoke ls or anything else that does stat inside OneDrive folder
> it will trigger download of all files.
>
> OneDrive uses placeholder files[1] to represent remote files.
>
> I'm guessing reading file content in stat is to support detection of
> actually executable files as in here[2]?
>
> I think this should be disabled on non-hydrated placeholder files.
> Running `find` or `ls -R` and having your entire OneDrive downloaded
> is extremely problematic.
>
> I could live without executable scripts in the OneDrive folder and
> it's easy to mark files as always offline to solve it.
>
> Another idea is to skip checking files with extensions known to be
> non-executable such as jpg (or just any extension that is not known
> to be executable).
Nothing of this makes sense from a POSIX library POV. The library can
either not handle placeholder files specially, as today, or it can
handle them all the same way.
Given these placeholder files are actually reparse points of type
IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
data buffer is undocumented. It would be helpful if somebody using
OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
The NtReadFile call at this point is not the problem. It would be
helpful to point to Cygwin's source instead of MSYS2, btw.
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 13:22 ` Corinna Vinschen
@ 2024-03-06 13:28 ` Corinna Vinschen
2024-03-06 13:54 ` Brian Inglis
0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 13:28 UTC (permalink / raw)
To: cygwin
On Mar 6 14:22, Corinna Vinschen via Cygwin wrote:
> On Mar 5 19:54, Marcin Wisnicki via Cygwin wrote:
> > If I invoke ls or anything else that does stat inside OneDrive folder
> > it will trigger download of all files.
> >
> > OneDrive uses placeholder files[1] to represent remote files.
> >
> > I'm guessing reading file content in stat is to support detection of
> > actually executable files as in here[2]?
> >
> > I think this should be disabled on non-hydrated placeholder files.
> > Running `find` or `ls -R` and having your entire OneDrive downloaded
> > is extremely problematic.
> >
> > I could live without executable scripts in the OneDrive folder and
> > it's easy to mark files as always offline to solve it.
> >
> > Another idea is to skip checking files with extensions known to be
> > non-executable such as jpg (or just any extension that is not known
> > to be executable).
>
> Nothing of this makes sense from a POSIX library POV. The library can
> either not handle placeholder files specially, as today, or it can
> handle them all the same way.
>
> Given these placeholder files are actually reparse points of type
> IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
>
> However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> data buffer is undocumented. It would be helpful if somebody using
> OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
>
> > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
>
> The NtReadFile call at this point is not the problem. It would be
> helpful to point to Cygwin's source instead of MSYS2, btw.
Oh, btw., this is from
https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
IO_REPARSE_TAG_FILE_PLACEHOLDER
0x80000015
Obsolete.
---------
Used by Windows Shell for legacy placeholder files in Windows 8.1.
Server-side interpretation only, not meaningful over the wire.
So even if we support them, what is their replacement in W10 and later?
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 13:28 ` Corinna Vinschen
@ 2024-03-06 13:54 ` Brian Inglis
2024-03-06 17:19 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Brian Inglis @ 2024-03-06 13:54 UTC (permalink / raw)
To: cygwin
On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> On Mar 6 14:22, Corinna Vinschen via Cygwin wrote:
>> On Mar 5 19:54, Marcin Wisnicki via Cygwin wrote:
>>> If I invoke ls or anything else that does stat inside OneDrive folder
>>> it will trigger download of all files.
>>>
>>> OneDrive uses placeholder files[1] to represent remote files.
>>>
>>> I'm guessing reading file content in stat is to support detection of
>>> actually executable files as in here[2]?
>>>
>>> I think this should be disabled on non-hydrated placeholder files.
>>> Running `find` or `ls -R` and having your entire OneDrive downloaded
>>> is extremely problematic.
>>>
>>> I could live without executable scripts in the OneDrive folder and
>>> it's easy to mark files as always offline to solve it.
>>>
>>> Another idea is to skip checking files with extensions known to be
>>> non-executable such as jpg (or just any extension that is not known
>>> to be executable).
>>
>> Nothing of this makes sense from a POSIX library POV. The library can
>> either not handle placeholder files specially, as today, or it can
>> handle them all the same way.
>>
>> Given these placeholder files are actually reparse points of type
>> IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
>>
>> However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
>> data buffer is undocumented. It would be helpful if somebody using
>> OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
>>
>>> [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
>>
>> The NtReadFile call at this point is not the problem. It would be
>> helpful to point to Cygwin's source instead of MSYS2, btw.
>
> Oh, btw., this is from
> https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
>
> IO_REPARSE_TAG_FILE_PLACEHOLDER
> 0x80000015
>
> Obsolete.
> ---------
> Used by Windows Shell for legacy placeholder files in Windows 8.1.
> Server-side interpretation only, not meaningful over the wire.
>
> So even if we support them, what is their replacement in W10 and later?
May or may not help:
https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder
--
Take care. Thanks, Brian Inglis Calgary, Alberta, Canada
La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
* Re: ls/stat on OneDrive causes download of files
2024-03-06 13:54 ` Brian Inglis
@ 2024-03-06 17:19 ` Corinna Vinschen
2024-03-06 18:55 ` Jeffrey Altman
2024-03-06 19:00 ` Corinna Vinschen
0 siblings, 2 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 17:19 UTC (permalink / raw)
To: cygwin
On Mar 6 06:54, Brian Inglis via Cygwin wrote:
> On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> > On Mar 6 14:22, Corinna Vinschen via Cygwin wrote:
> > > Given these placeholder files are actually reparse points of type
> > > IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
> > >
> > > However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> > > data buffer is undocumented. It would be helpful if somebody using
> > > OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> > >
> > > > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
> > >
> > > The NtReadFile call at this point is not the problem. It would be
> > > helpful to point to Cygwin's source instead of MSYS2, btw.
> >
> > Oh, btw., this is from
> > https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
> >
> > IO_REPARSE_TAG_FILE_PLACEHOLDER
> > 0x80000015
> >
> > Obsolete.
> > ---------
> > Used by Windows Shell for legacy placeholder files in Windows 8.1.
> > Server-side interpretation only, not meaningful over the wire.
> >
> > So even if we support them, what is their replacement in W10 and later?
>
> May or may not help:
>
> https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder
We can add an explicit call to
RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
we still have to know what the reparse point buffer actually contains.
Given that the content of reparse points with these reparse tags is
undocumented, some people using cloud services should examine these
reparse points so we can add some suitable code to Cygwin.
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 17:19 ` Corinna Vinschen
@ 2024-03-06 18:55 ` Jeffrey Altman
2024-03-06 19:14 ` Corinna Vinschen
` (2 more replies)
2024-03-06 19:00 ` Corinna Vinschen
1 sibling, 3 replies; 17+ messages in thread
From: Jeffrey Altman @ 2024-03-06 18:55 UTC (permalink / raw)
To: cygwin
On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> We can add an explicit call to
>
> RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
>
> Given that the content of reparse points with these reparse tags is
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.
>
>
> Corinna
I'm not an expert in this area by any means but here are my
recollections from when Microsoft presented in-person on cloud
placeholders to filter and filesystem developers many years ago.
Files and directories that are placeholders should have either the
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN
file attributes set. When these attributes are set, applications and
mini filters are advised not to "read" or "open" the files or
directories unless they absolutely need to because doing so will cause
the placeholder to be replaced by an object containing the actual data
which might take a long time to fetch, might cost the end user money, or
might fail depending upon the network connectivity. In particular,
anti-malware should ignore them during scans and only analyze the data
when it is fetched locally by an end user application.
I believe that IO_REPARSE_TAG_FILE_PLACEHOLDER was replaced by
IO_REPARSE_TAG_CLOUD_1 ..IO_REPARSE_TAG_CLOUD_F. Any reparse tag
attached to a placeholder object is for the interpretation of the filter
associated with the back-end storage and not for the consumption of
applications. The content of the reparse tags can be back-end
proprietary; different reparse data for onedrive, icloud, dropbox, etc.
The default ProcessPlaceholderCompatibilityMode is
PHCM_EXPOSE_PLACEHOLDERS which makes the FILE_ATTRIBUTE flags and
reparse tags visible. Microsoft maintains a database of processes for
which PHCM_DISGUISE_PLACEHOLDER is set which hides that information. It's
unclear to me that explicitly setting the placeholder compatibility mode
is useful.
I'm not sure that exposing the object as a symlink is a good idea. A
posix symlink is an object whose type and target information cannot
change. In the case of a placeholder, the placeholder is silently
replaced by the actual object either when the object is opened or the
object's data is accessed. An application that believes it knows that
the object is a symlink will be mighty confused when it turns out to be
a file or a directory.
Perhaps the question that needs to be asked is whether there are opens
that can be skipped if an object is known to not be locally present
(either of the FILE_ATTRIBUTE flags is set)?
Jeffrey Altman
* Re: ls/stat on OneDrive causes download of files
2024-03-06 17:19 ` Corinna Vinschen
2024-03-06 18:55 ` Jeffrey Altman
@ 2024-03-06 19:00 ` Corinna Vinschen
1 sibling, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 19:00 UTC (permalink / raw)
To: cygwin
On Mar 6 18:19, Corinna Vinschen via Cygwin wrote:
> On Mar 6 06:54, Brian Inglis via Cygwin wrote:
> > On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> > > On Mar 6 14:22, Corinna Vinschen via Cygwin wrote:
> > > > Given these placeholder files are actually reparse points of type
> > > > IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
> > > >
> > > > However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> > > > data buffer is undocumented. It would be helpful if somebody using
> > > > OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> > > >
> > > > > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
> > > >
> > > > The NtReadFile call at this point is not the problem. It would be
> > > > helpful to point to Cygwin's source instead of MSYS2, btw.
> > >
> > > Oh, btw., this is from
> > > https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
> > >
> > > IO_REPARSE_TAG_FILE_PLACEHOLDER
> > > 0x80000015
> > >
> > > Obsolete.
> > > ---------
> > > Used by Windows Shell for legacy placeholder files in Windows 8.1.
> > > Server-side interpretation only, not meaningful over the wire.
> > >
> > > So even if we support them, what is their replacement in W10 and later?
> >
> > May or may not help:
> >
> > https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder
>
> We can add an explicit call to
>
> RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
>
> Given that the content of reparse points with these reparse tags is
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.
Reading further on this it seems that one cannot easily compare these
reparse points with symlinks.
The tag values are 0x80000015 for IO_REPARSE_TAG_FILE_PLACEHOLDER and
0x9000001AL up to 0x9000F01AL for the IO_REPARSE_TAG_CLOUD_* tags. None
of them have the name surrogate bit set, so they don't "represent
another named entity in the system". In other words, these reparse
points represent themselves rather than pointing to some other file, as
symlinks do.
Additionally the IO_REPARSE_TAG_CLOUD_* tags all have the directory bit
set so it seems they are used on the parent(?) directory of the local
OneDrive copy only, but not on the files inside it.
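The bit tests described above can be sketched in portable C. This is only an illustration: the tag values are the ones quoted in this mail, the symlink tag 0xA000000C is added for contrast, and the TAG_ names and helper functions are made up for this sketch rather than taken from Cygwin or the Windows headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tag values as quoted in this thread; the symlink tag is added for contrast. */
#define TAG_FILE_PLACEHOLDER 0x80000015u
#define TAG_CLOUD_FIRST      0x9000001Au
#define TAG_CLOUD_LAST       0x9000F01Au
#define TAG_SYMLINK          0xA000000Cu

/* Flag bits inside a reparse tag (hypothetical names for this sketch). */
#define TAG_BIT_NAME_SURROGATE 0x20000000u
#define TAG_BIT_DIRECTORY      0x10000000u

/* True if the tag "represents another named entity in the system". */
static inline bool tag_is_name_surrogate(uint32_t tag)
{
  return (tag & TAG_BIT_NAME_SURROGATE) != 0;
}

/* True if the directory bit is set in the tag. */
static inline bool tag_has_directory_bit(uint32_t tag)
{
  return (tag & TAG_BIT_DIRECTORY) != 0;
}

/* The cloud tags differ only in bits 12..15, so mask those out. */
static inline bool tag_is_cloud(uint32_t tag)
{
  return (tag & 0xFFFF0FFFu) == TAG_CLOUD_FIRST;
}

/* Compile-time checks of the statements made in the mail. */
static_assert((TAG_FILE_PLACEHOLDER & TAG_BIT_NAME_SURROGATE) == 0,
              "placeholder tag is not a name surrogate");
static_assert((TAG_CLOUD_LAST & TAG_BIT_NAME_SURROGATE) == 0,
              "cloud tags are not name surrogates");
static_assert((TAG_SYMLINK & TAG_BIT_NAME_SURROGATE) != 0,
              "symlink tag is a name surrogate");
```

With these helpers, tag_is_name_surrogate() is false for the placeholder and cloud tags and true for the symlink tag, which is the difference pointed out above.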
Bottom line:
I wonder if the real deal is not the reparse tag and the reparse
content, but whether or not the file has the FILE_ATTRIBUTE_OFFLINE flag
set.
If so, we can try to disable any action within path conversion, as
well as in our stat(2) and readdir(3) implementation which would
trigger onlining an offline file.
Can anybody confirm that the idea is right, or if I'm missing something?
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 18:55 ` Jeffrey Altman
@ 2024-03-06 19:14 ` Corinna Vinschen
2024-03-07 9:06 ` Corinna Vinschen
2024-03-08 10:37 ` Corinna Vinschen
2 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 19:14 UTC (permalink / raw)
To: cygwin
Hi Jeffrey,
looks like writing our mails overlapped:
https://cygwin.com/pipermail/cygwin/2024-March/255622.html
On Mar 6 13:55, Jeffrey Altman via Cygwin wrote:
> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> > We can add an explicit call to
> >
> > RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
> >
> > and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> > IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> > we still have to know what the reparse point buffer actually contains.
> >
> > Given that the content of reparse points with these reparse tags is
> > undocumented, some people using cloud services should examine these
> > reparse points so we can add some suitable code to Cygwin.
> >
> >
> > Corinna
> I'm not an expert in this area by any means but here are my recollections
> from when Microsoft presented in-person on cloud placeholders to filter and
> filesystem developers many years ago.
>
> Files and directories that are placeholders should have either the
> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
> attributes set. When these attributes are set, applications and mini filters
> are advised not to "read" or "open" the files or directories unless they
> absolutely need to
Per https://learn.microsoft.com/en-us/windows/win32/fileio/file-attribute-constants
FILE_ATTRIBUTE_RECALL_ON_OPEN only appears in directory listing
classes, but not in standard FILE_BASIC_INFORMATION and alike.
That's a bit of a problem considering how we check files during
path conversion.
The MSDN article doesn't state the same for
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS, which is good, I think.
But I'm a bit puzzled then in terms of FILE_ATTRIBUTE_OFFLINE. Is it
not used for OneDrive files?
> [...]
> I'm not sure that exposing the object as a symlink is a good idea.
Yeah, that's what I realized as well, see my aforementioned mail.
> Perhaps the question that needs to be asked is whether there are opens that
> can be skipped if an object is known to not be locally present (either of
> the FILE_ATTRIBUTE flags is set)?
This may be the way to go, see my mail. It wouldn't be much of
a problem to check all attribute bits, i.e. FILE_ATTRIBUTE_OFFLINE,
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS and FILE_ATTRIBUTE_RECALL_ON_OPEN.
Maybe that's what we should do.
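A check of all three attribute bits could look roughly like this. The numeric values are those of the Windows constants named above; the ATTR_ names and file_needs_recall() are hypothetical, and the sketch avoids any Windows includes so it compiles anywhere. It is not the actual Cygwin change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Numeric values of the Windows file attribute constants named above. */
#define ATTR_OFFLINE               0x00001000u /* FILE_ATTRIBUTE_OFFLINE */
#define ATTR_RECALL_ON_OPEN        0x00040000u /* FILE_ATTRIBUTE_RECALL_ON_OPEN */
#define ATTR_RECALL_ON_DATA_ACCESS 0x00400000u /* FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS */

/* Any of these bits means the data may not be locally present. */
#define ATTR_NOT_LOCAL \
  (ATTR_OFFLINE | ATTR_RECALL_ON_OPEN | ATTR_RECALL_ON_DATA_ACCESS)

static_assert(ATTR_NOT_LOCAL == 0x00441000u, "the three bits must not overlap");

/* True if touching the file's data could trigger a download (recall). */
static inline bool file_needs_recall(uint32_t attrs)
{
  return (attrs & ATTR_NOT_LOCAL) != 0;
}
```

A caller would test the attributes it already got from a directory listing or a basic-information query and skip any step that reads file content when file_needs_recall() returns true.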
Thanks,
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 18:55 ` Jeffrey Altman
2024-03-06 19:14 ` Corinna Vinschen
@ 2024-03-07 9:06 ` Corinna Vinschen
2024-03-08 10:37 ` Corinna Vinschen
2 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-07 9:06 UTC (permalink / raw)
To: Jeffrey Altman; +Cc: cygwin
Hi Jeffrey,
apart from the attribute stuff...
On Mar 6 13:55, Jeffrey Altman via Cygwin wrote:
> The default ProcessPlaceholderCompatibilityMode is PHCM_EXPOSE_PLACEHOLDERS
> which makes the FILE_ATTRIBUTE flags and reparse tags visible. Microsoft
> maintains a database of processes for which PHCM_DISGUISE_PLACEHOLDER is set
> which hides that information. It's unclear to me that explicitly setting the
> placeholder compatibility mode is useful.
What I see as a problem here is this:
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-rtlsetprocessplaceholdercompatibilitymode
Quote:
"Most Windows applications see exposed placeholders by default. For
 ^^^^
compatibility reasons, Windows may decide that certain applications
                       ^^^^^^^^^^^^^^^^^^
see disguised placeholders by default."
But then again, in other news from Microsoft:
https://learn.microsoft.com/en-us/windows/win32/cfapi/build-a-cloud-file-sync-engine#compatibility-with-applications-that-use-reparse-points
Quote:
"[...] the cloud files API always hides its reparse points from all
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
applications except for sync engines and processes whose main image
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
resides under %systemroot%. Applications that understand reparse
^^^^^^^^^^^^^^^^^^^^^^^^^^^
points correctly can force the platform to expose cloud files API
reparse points using RtlSetProcessPlaceholderCompatibilityMode or
RtlSetThreadPlaceholderCompatibilityMode."
Considering these two statements, it's totally unclear to a process, if
it just defaults to "exposed" or "disguised".
Fortunately we can ask Windows by calling the
RtlQueryProcessPlaceholderCompatibilityMode() function, right?
Let's have a look into the documentation at
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-rtlqueryprocessplaceholdercompatibilitymode
Quote:
"Return value
This function returns the process's placeholder compatibily mode
(PHCM_xxx), or a negative value on error (PCHM_ERROR_xxx). Contains
one of the following values:
Compatibility Mode              Value
PHCM_APPLICATION_DEFAULT        0
PHCM_DISGUISE_PLACEHOLDER       1
PHCM_EXPOSE_PLACEHOLDERS        2
PHCM_MAX                        2
PHCM_ERROR_INVALID_PARAMETER    -1
PHCM_ERROR_NO_TEB               -2"
So I called the function right at the start of the Cygwin DLL, and it
returns the value 0, i. e., PHCM_APPLICATION_DEFAULT.
At this point the process *still* has no idea if placeholders are
exposed or disguised. What a great API! \o/
So, from the above, and if we really want to be sure that placeholders
will be exposed, I don't see any way around calling
RtlSetProcessPlaceholderCompatibilityMode(PHCM_EXPOSE_PLACEHOLDERS)
explicitly at DLL startup.
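The reasoning can be put into a few lines of standalone C. The enum values are copied from the documentation table quoted above; the two helper functions are hypothetical names for this sketch, and no Windows header or Rtl* call is involved, so this only models the decision, not the actual startup code.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the documentation table quoted above. */
enum
{
  PHCM_APPLICATION_DEFAULT     = 0,
  PHCM_DISGUISE_PLACEHOLDER    = 1,
  PHCM_EXPOSE_PLACEHOLDERS     = 2,
  PHCM_MAX                     = 2,
  PHCM_ERROR_INVALID_PARAMETER = -1,
  PHCM_ERROR_NO_TEB            = -2
};

static_assert(PHCM_MAX == PHCM_EXPOSE_PLACEHOLDERS, "table lists both as 2");

/* Does the queried value tell the process what it will actually see?
   PHCM_APPLICATION_DEFAULT does not, because the default may be either. */
static inline bool mode_is_conclusive(int mode)
{
  return mode == PHCM_DISGUISE_PLACEHOLDER || mode == PHCM_EXPOSE_PLACEHOLDERS;
}

/* To be sure placeholders are exposed, set the mode explicitly unless the
   query already reported PHCM_EXPOSE_PLACEHOLDERS. */
static inline bool must_force_expose(int mode)
{
  return mode != PHCM_EXPOSE_PLACEHOLDERS;
}
```

For the value 0 observed at DLL startup, mode_is_conclusive() is false and must_force_expose() is true, which matches the conclusion drawn above.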
What do you think?
Thanks,
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-06 18:55 ` Jeffrey Altman
2024-03-06 19:14 ` Corinna Vinschen
2024-03-07 9:06 ` Corinna Vinschen
@ 2024-03-08 10:37 ` Corinna Vinschen
2024-03-08 12:52 ` Thomas Wolff
2 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 10:37 UTC (permalink / raw)
To: Jeffrey Altman; +Cc: cygwin
Hi Jeffrey,
On Mar 6 13:55, Jeffrey Altman via Cygwin wrote:
> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> > We can add an explicit call to
> >
> > RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
> > [...]
> Files and directories that are placeholders should have either the
> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
> attributes set. When these attributes are set, applications and mini filters
> are advised not to "read" or "open" the files or directories unless they
> absolutely need to because doing so will cause the placeholder to be
> replaced by an object containing the actual data which might take a long
> time to fetch,
Yesterday I stumbled over a certain NtCreateFile flag:
FILE_OPEN_NO_RECALL (0x00400000)
Instructs any filters that perform offline storage or virtualization
to not recall the contents of the file as a result of this open.
MS-CIFS described it like this:
FILE_OPEN_NO_RECALL
0x00400000
In a hierarchical storage management environment, this option
requests that the file SHOULD NOT be recalled from tertiary storage
such as tape. A file recall can take up to several minutes in a
hierarchical storage management environment. The clients can specify
this option to avoid such delays.
This sounds like we could simply add this flag to all NtOpenFile calls
used for path conversion or stat-like operations, without having to care
for any file attributes specifically.
Does that make sense?
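As a rough sketch of the idea, assuming the open options are assembled in one place: the flag values are the documented Windows ones, while the OPT_ names, open_options() and its metadata_only parameter are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Numeric values of the corresponding Windows open-option flags. */
#define OPT_OPEN_FOR_BACKUP_INTENT 0x00004000u /* FILE_OPEN_FOR_BACKUP_INTENT */
#define OPT_OPEN_REPARSE_POINT     0x00200000u /* FILE_OPEN_REPARSE_POINT */
#define OPT_OPEN_NO_RECALL         0x00400000u /* FILE_OPEN_NO_RECALL */

static_assert((OPT_OPEN_NO_RECALL
               & (OPT_OPEN_FOR_BACKUP_INTENT | OPT_OPEN_REPARSE_POINT)) == 0,
              "the no-recall flag must not collide with the other options");

/* Add the no-recall flag whenever the open is only needed for metadata
   (path conversion, stat-like calls); leave real data opens untouched. */
static inline uint32_t open_options(uint32_t base, bool metadata_only)
{
  return metadata_only ? (base | OPT_OPEN_NO_RECALL) : base;
}
```

The resulting value would be passed as the open-options argument of the NtOpenFile call; opens that really need the file content keep their original options.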
Thanks,
Corinna
* Re: ls/stat on OneDrive causes download of files
2024-03-08 10:37 ` Corinna Vinschen
@ 2024-03-08 12:52 ` Thomas Wolff
2024-03-08 13:15 ` Jeffrey Altman
0 siblings, 1 reply; 17+ messages in thread
From: Thomas Wolff @ 2024-03-08 12:52 UTC (permalink / raw)
To: cygwin
Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
> Hi Jeffrey,
>
> On Mar 6 13:55, Jeffrey Altman via Cygwin wrote:
>> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
>>> We can add an explicit call to
>>>
>>> RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>>> [...]
>> Files and directories that are placeholders should have either the
>> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
>> attributes set. When these attributes are set, applications and mini filters
>> are advised not to "read" or "open" the files or directories unless they
>> absolutely need to because doing so will cause the placeholder to be
>> replaced by an object containing the actual data which might take a long
>> time to fetch,
> Yesterday I stumbled over a certain NtCreateFile flag:
>
> FILE_OPEN_NO_RECALL (0x00400000)
>
> Instructs any filters that perform offline storage or virtualization
> to not recall the contents of the file as a result of this open.
>
> MS-CIFS described it like this:
>
> FILE_OPEN_NO_RECALL
> 0x00400000
>
> In a hierarchical storage management environment, this option
> requests that the file SHOULD NOT be recalled from tertiary storage
> such as tape. A file recall can take up to several minutes in a
> hierarchical storage management environment. The clients can specify
> this option to avoid such delays.
>
> This sounds like we could simply add this flag to all NtOpenFile
> used for path conversion or stat-like calls, without having to care
> for any file attributes specifically.
>
> Does that make sense?
Sounds good, without even studying the other details...
I speculate some more handling would still be needed to avoid executable
detection via magic tags.
* Re: ls/stat on OneDrive causes download of files
2024-03-08 12:52 ` Thomas Wolff
@ 2024-03-08 13:15 ` Jeffrey Altman
2024-03-08 13:56 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Jeffrey Altman @ 2024-03-08 13:15 UTC (permalink / raw)
To: Thomas Wolff, cygwin
On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
>
> Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
>> Hi Jeffrey,
>>
>> On Mar 6 13:55, Jeffrey Altman via Cygwin wrote:
>>> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
>>>> We can add an explicit call to
>>>>
>>>> RtlSetProcessPlaceholderCompatibilityMode
>>>> (PHCM_EXPOSE_PLACEHOLDERS);
>>>> [...]
>>> Files and directories that are placeholders should have either the
>>> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or
>>> FILE_ATTRIBUTE_RECALL_ON_OPEN file
>>> attributes set. When these attributes are set, applications and mini
>>> filters
>>> are advised not to "read" or "open" the files or directories unless
>>> they
>>> absolutely need to because doing so will cause the placeholder to be
>>> replaced by an object containing the actual data which might take a
>>> long
>>> time to fetch,
>> Yesterday I stumbled over a certain NtCreateFile flag:
>>
>> FILE_OPEN_NO_RECALL (0x00400000)
>>
>> Instructs any filters that perform offline storage or
>> virtualization
>> to not recall the contents of the file as a result of this open.
>>
>> MS-CIFS described it like this:
>>
>> FILE_OPEN_NO_RECALL
>> 0x00400000
>>
>> In a hierarchical storage management environment, this option
>> requests that the file SHOULD NOT be recalled from tertiary storage
>> such as tape. A file recall can take up to several minutes in a
>> hierarchical storage management environment. The clients can
>> specify
>> this option to avoid such delays.
>>
>> This sounds like we could simply add this flag to all NtOpenFile
>> used for path conversion or stat-like calls, without having to care
>> for any file attributes specifically.
>>
>> Does that make sense?
> Sounds good, without even studying the other details...
> I speculate some more handling would still be needed to avoid executable
> detection via magic tags.
>
Agreed. FILE_OPEN_NO_RECALL has been defined for at least a decade but
was not documented by Microsoft until relatively recently.
Another suggestion would be to try opening the file with
FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
required. See
https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
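A minimal sketch of that choice, with the numeric values of the Windows access rights and otherwise hypothetical names (ACCESS_ macros, stat_access_mask()); it only shows how the access mask would be picked, not how any particular project does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Numeric values of the corresponding Windows access rights. */
#define ACCESS_FILE_READ_DATA       0x00000001u /* FILE_READ_DATA */
#define ACCESS_FILE_READ_ATTRIBUTES 0x00000080u /* FILE_READ_ATTRIBUTES */
#define ACCESS_GENERIC_READ         0x80000000u /* GENERIC_READ */

static_assert((ACCESS_FILE_READ_ATTRIBUTES & ACCESS_FILE_READ_DATA) == 0,
              "reading attributes does not imply reading data");

/* Ask only for attribute access unless the file data is really needed. */
static inline uint32_t stat_access_mask(bool need_data)
{
  return need_data ? ACCESS_GENERIC_READ : ACCESS_FILE_READ_ATTRIBUTES;
}
```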
Jeffrey Altman
* Re: ls/stat on OneDrive causes download of files
2024-03-08 13:15 ` Jeffrey Altman
@ 2024-03-08 13:56 ` Corinna Vinschen
2024-03-08 22:21 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 13:56 UTC (permalink / raw)
To: Jeffrey Altman; +Cc: Thomas Wolff, cygwin
On Mar 8 08:15, Jeffrey Altman via Cygwin wrote:
> On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
> > Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
> > > Yesterday I stumbled over a certain NtCreateFile flag:
> > >
> > > FILE_OPEN_NO_RECALL (0x00400000)
> > >
> > > Instructs any filters that perform offline storage or
> > > virtualization
> > > to not recall the contents of the file as a result of this open.
> > >
> > > MS-CIFS described it like this:
> > >
> > > FILE_OPEN_NO_RECALL
> > > 0x00400000
> > >
> > > In a hierarchical storage management environment, this option
> > > requests that the file SHOULD NOT be recalled from tertiary storage
> > > such as tape. A file recall can take up to several minutes in a
> > > hierarchical storage management environment. The clients can
> > > specify
> > > this option to avoid such delays.
> > >
> > > This sounds like we could simply add this flag to all NtOpenFile
> > > used for path conversion or stat-like calls, without having to care
> > > > for any file attributes specifically.
> > >
> > > Does that make sense?
> > Sounds good, without even studying the other details...
> > I speculate some more handling would still be needed to avoid executable
> > detection via magic tags.
> >
> Agreed. FILE_OPEN_NO_RECALL has been defined for at least a decade but was
> not documented by Microsoft until relatively recently.
Thanks for the feedback, guys.
> Another suggestion would be to try opening the file with
> FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
> required. See
>
> https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
Cygwin uses the minimum of required permissions in NtCreateFile/
NtOpenFile calls anyway.
I'm just running a test cygwin DLL locally with a lot of added
FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
being offline to allow skipping some code.
I think I'll push this change in a bit so we get a test release out
so people using OneDrive can test.
Thanks,
Corinna
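
The idea discussed above can be sketched roughly as follows. This is an
illustration only, not the actual Cygwin change: it models the flag
arithmetic in Python and does not call any Windows API. The helper names
stat_open_options and is_offline are hypothetical, and which attribute
bits Cygwin really treats as "offline" is an assumption, since the
thread does not say. FILE_OPEN_NO_RECALL is the value quoted in this
thread; the other values are the documented Windows constants.

```python
# Sketch only: models the flag arithmetic, no Windows API is called.

# NtCreateFile/NtOpenFile open options.
FILE_OPEN_FOR_BACKUP_INTENT = 0x00004000
FILE_OPEN_NO_RECALL = 0x00400000  # value quoted in this thread

# Attribute bits that mark a file whose data is not available locally.
# Assumption: the exact set tested by Cygwin is not stated in the thread.
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

OFFLINE_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)


def stat_open_options(base_options: int) -> int:
    """Open options for a metadata-only open (hypothetical helper).

    Adds FILE_OPEN_NO_RECALL so that storage filters are asked not to
    recall the file contents as a result of the open.
    """
    return base_options | FILE_OPEN_NO_RECALL


def is_offline(file_attributes: int) -> bool:
    """True if any of the assumed 'offline' attribute bits is set."""
    return bool(file_attributes & OFFLINE_MASK)
```

A caller would pass the result of stat_open_options() as the open
options of a stat-like open, and use is_offline() on the returned file
attributes to skip any code path that would read file data.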
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ls/stat on OneDrive causes download of files
2024-03-08 13:56 ` Corinna Vinschen
@ 2024-03-08 22:21 ` Corinna Vinschen
2024-03-08 22:26 ` Marcin Wisnicki
0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 22:21 UTC (permalink / raw)
To: Marcin Wisnicki, cygwin; +Cc: Jeffrey Altman, Thomas Wolff
On Mar 8 14:56, Corinna Vinschen via Cygwin wrote:
> On Mar 8 08:15, Jeffrey Altman via Cygwin wrote:
> > On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
> > > Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
> > > > FILE_OPEN_NO_RECALL (0x00400000)
> > > > [...]
> > > > > This sounds like we could simply add this flag to all NtOpenFile
> > > > > calls used for path conversion or stat-like calls, without having
> > > > > to care for any file attributes specifically.
> > > >
> > > > Does that make sense?
> > > Sounds good, without even studying the other details...
> > > I speculate some more handling would still be needed to avoid executable
> > > detection via magic tags.
> > >
> > > Agreed. FILE_OPEN_NO_RECALL has been defined for at least a decade but was
> > > not documented by Microsoft until relatively recently.
>
> Thanks for the feedback, guys.
>
> > Another suggestion would be to try opening the file with
> > FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
> > required. See
> >
> > https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
>
> Cygwin uses the minimum of required permissions in NtCreateFile/
> NtOpenFile calls anyway.
>
> I'm just running a test cygwin DLL locally with a lot of added
> FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
> being offline to allow skipping some code.
>
> I think I'll push this change in a bit so we get a test release out
> so people using OneDrive can test.
I pushed this change as well as a followup change to make sure we don't
inadvertently recall an offline file. I also added handling for the
Pinned and Unpinned attributes to chattr(1) and lsattr(1).
The full set of changes can be tested by installing the Cygwin test
release 3.6.0-0.77.g06aa5a751682.
Please give it a try. If you still see an offline file being recalled
in a situation that doesn't warrant it, please report it. We will then
have to analyze that situation further.
Thanks,
Corinna
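
For readers unfamiliar with the Pinned and Unpinned attributes, the
following Python sketch shows how the two bits could be decoded from a
file's attribute word. It is an illustration under assumptions, not the
chattr(1)/lsattr(1) implementation: the function name pin_state and the
returned strings are hypothetical. The two constants are the documented
Windows values; "pinned" asks for the file to be kept fully present
locally, "unpinned" asks for it not to be kept locally except while it
is being accessed.

```python
# Sketch only: decodes attribute bits, no Windows API is called.

FILE_ATTRIBUTE_PINNED = 0x00080000    # keep file data local
FILE_ATTRIBUTE_UNPINNED = 0x00100000  # do not keep file data local


def pin_state(file_attributes: int) -> str:
    """Return 'pinned', 'unpinned' or 'unspecified' (hypothetical helper)."""
    if file_attributes & FILE_ATTRIBUTE_PINNED:
        return "pinned"
    if file_attributes & FILE_ATTRIBUTE_UNPINNED:
        return "unpinned"
    return "unspecified"
```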
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ls/stat on OneDrive causes download of files
2024-03-08 22:21 ` Corinna Vinschen
@ 2024-03-08 22:26 ` Marcin Wisnicki
2024-03-09 20:29 ` Marcin Wisnicki
0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-08 22:26 UTC (permalink / raw)
To: cygwin, Jeffrey Altman, Thomas Wolff
On 2024-03-08 17:21, Corinna Vinschen wrote:
> On Mar 8 14:56, Corinna Vinschen via Cygwin wrote:
>> On Mar 8 08:15, Jeffrey Altman via Cygwin wrote:
>>> On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
>>>> Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
>>>>> FILE_OPEN_NO_RECALL (0x00400000)
>>>>> [...]
>>>>> This sounds like we could simply add this flag to all NtOpenFile
>>>>> calls used for path conversion or stat-like calls, without having
>>>>> to care for any file attributes specifically.
>>>>>
>>>>> Does that make sense?
>>>> Sounds good, without even studying the other details...
>>>> I speculate some more handling would still be needed to avoid executable
>>>> detection via magic tags.
>>>>
>>> Agreed. FILE_OPEN_NO_RECALL has been defined for at least a decade but was
>>> not documented by Microsoft until relatively recently.
>> Thanks for the feedback, guys.
>>
>>> Another suggestion would be to try opening the file with
>>> FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
>>> required. See
>>>
>>> https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
>> Cygwin uses the minimum of required permissions in NtCreateFile/
>> NtOpenFile calls anyway.
>>
>> I'm just running a test cygwin DLL locally with a lot of added
>> FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
>> being offline to allow skipping some code.
>>
>> I think I'll push this change in a bit so we get a test release out
>> so people using OneDrive can test.
> I pushed this change as well as a followup change to make sure we don't
> inadvertently recall an offline file. I also added handling for the
> Pinned and Unpinned attributes to chattr(1) and lsattr(1).
>
> The full set of changes can be tested by installing the Cygwin test
> release 3.6.0-0.77.g06aa5a751682.
>
> Please give it a try. If you encounter a situation which still results
> in recalling an offline file in a situation which doesn't qualify for
> it, please report. We will have to analyze that situation further
> then.
>
>
> Thanks,
> Corinna
Thanks for doing this work so quickly. I'm not subscribed to this
mailing list so I didn't see previous messages.
I will try to check this in Cygwin this weekend, but I should tell you
that I'm not a Cygwin user, and I have now found a report from another
user claiming this only happens in MSYS and not in Cygwin.
https://github.com/msys2/MSYS2-packages/issues/3049
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ls/stat on OneDrive causes download of files
2024-03-08 22:26 ` Marcin Wisnicki
@ 2024-03-09 20:29 ` Marcin Wisnicki
2024-03-11 17:04 ` Corinna Vinschen
0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-09 20:29 UTC (permalink / raw)
To: cygwin, Jeffrey Altman, Thomas Wolff
I did more testing and found out that the problem does not happen in
Cygwin by default, because Cygwin mounts with acl, which doesn't do
header sniffing, while MSYS uses noacl.
Testing on an mp4 file in OneDrive, when I use noacl in cygwin it
triggers the read as well.
After upgrading to the test version the read is gone and an mp4 file
is not executable.
Thank you!
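
The behaviour observed above (header sniffing on noacl mounts, skipped
for offline files after the fix) can be modelled with a small Python
sketch. This is an assumption-laden illustration, not Cygwin's code:
the function name looks_executable, the read_header callback, the set
of "offline" attribute bits, and the choice of `#!` and `MZ` as the
magic markers are all assumed for the example.

```python
# Sketch only: models the decision, no Windows API is called.
from typing import Callable

# Documented Windows attribute bits; treating exactly these three as
# "offline" is an assumption.
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

OFFLINE_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)


def looks_executable(file_attributes: int,
                     read_header: Callable[[], bytes]) -> bool:
    """Guess executability from the file header (hypothetical helper).

    read_header is only called when the file data is local, so an
    offline placeholder is never read and therefore never recalled.
    """
    if file_attributes & OFFLINE_MASK:
        return False  # skip sniffing; report as not executable
    header = read_header()
    # Assumed markers: script shebang and PE/DOS executable magic.
    return header.startswith((b"#!", b"MZ"))
```

With this model, an offline mp4 placeholder is reported as not
executable without its content being touched, which matches the result
described in the test above.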
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ls/stat on OneDrive causes download of files
2024-03-09 20:29 ` Marcin Wisnicki
@ 2024-03-11 17:04 ` Corinna Vinschen
0 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-11 17:04 UTC (permalink / raw)
To: Marcin Wisnicki; +Cc: cygwin, Jeffrey Altman, Thomas Wolff
On Mar 9 15:29, Marcin Wisnicki via Cygwin wrote:
> I did more testing and found out that the problem does not happen in
> cygwin by default because cygwin mounts with acl which doesn't do
> header sniffing while msys uses noacl.
>
> Testing on an mp4 file in OneDrive, when I use noacl in cygwin it
> triggers the read as well.
> After upgrading to the test version the read is gone and an mp4 file
> is not executable.
>
> Thank you!
Thanks a lot for testing. I backported the changes (minus the lsattr(1)/
chattr(1) changes) to the 3.5 branch, so they will be released with
3.5.2 in the next few weeks.
Corinna
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-03-11 17:04 UTC | newest]
Thread overview: 17+ messages
2024-03-06 0:54 ls/stat on OneDrive causes download of files Marcin Wisnicki
2024-03-06 13:22 ` Corinna Vinschen
2024-03-06 13:28 ` Corinna Vinschen
2024-03-06 13:54 ` Brian Inglis
2024-03-06 17:19 ` Corinna Vinschen
2024-03-06 18:55 ` Jeffrey Altman
2024-03-06 19:14 ` Corinna Vinschen
2024-03-07 9:06 ` Corinna Vinschen
2024-03-08 10:37 ` Corinna Vinschen
2024-03-08 12:52 ` Thomas Wolff
2024-03-08 13:15 ` Jeffrey Altman
2024-03-08 13:56 ` Corinna Vinschen
2024-03-08 22:21 ` Corinna Vinschen
2024-03-08 22:26 ` Marcin Wisnicki
2024-03-09 20:29 ` Marcin Wisnicki
2024-03-11 17:04 ` Corinna Vinschen
2024-03-06 19:00 ` Corinna Vinschen