public inbox for cygwin@cygwin.com
* ls/stat on OneDrive causes download of files
@ 2024-03-06  0:54 Marcin Wisnicki
  2024-03-06 13:22 ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-06  0:54 UTC (permalink / raw)
  To: cygwin

If I invoke ls or anything else that calls stat inside a OneDrive folder,
it triggers a download of all files.

OneDrive uses placeholder files[1] to represent remote files.

I'm guessing stat reads file content in order to detect files that are
actually executable, as in here[2]?

I think this should be disabled on non-hydrated placeholder files.
Running `find` or `ls -R` and having your entire OneDrive downloaded
is extremely problematic.

I could live without executable scripts in the OneDrive folder and
it's easy to mark files as always offline to solve it.

Another idea is to skip checking files with extensions known to be
non-executable, such as jpg (or simply any extension that is not known
to be executable).

This was previously reported in
https://github.com/msys2/msys2-runtime/issues/206.

[1] https://learn.microsoft.com/en-us/windows/win32/w8cookbook/placeholder-files
[2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548


* Re: ls/stat on OneDrive causes download of files
  2024-03-06  0:54 ls/stat on OneDrive causes download of files Marcin Wisnicki
@ 2024-03-06 13:22 ` Corinna Vinschen
  2024-03-06 13:28   ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 13:22 UTC (permalink / raw)
  To: cygwin

On Mar  5 19:54, Marcin Wisnicki via Cygwin wrote:
> If I invoke ls or anything else that does stat inside OneDrive folder
> it will trigger download of all files.
> 
> OneDrive uses placeholder files[1] to represent remote files.
> 
> I'm guessing reading file content in stat is to support detection of
> actually executable files as in here[2]?
> 
> I think this should be disabled on non-hydrated placeholder files.
> Running `find` or 'ls -R` and having your entire OneDrive downloaded
> is extremely problematic.
> 
> I could live without executable scripts in the OneDrive folder and
> it's easy to mark files as always offline to solve it.
> 
> Another idea is to skip checking files with extensions known to be
> non-executable such as jpg (or just any extensions that is not known
> to be executable).

None of this makes sense from a POSIX library POV.  The library can
either not handle placeholder files specially, as today, or it can
handle them all the same way.

Given these placeholder files are actually reparse points of type
IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.

However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
data buffer is undocumented.  It would be helpful if somebody using
OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.

> [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548

The NtReadFile call at this point is not the problem.  It would be
helpful to point to Cygwin's source instead of MSYS2, btw.


Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 13:22 ` Corinna Vinschen
@ 2024-03-06 13:28   ` Corinna Vinschen
  2024-03-06 13:54     ` Brian Inglis
  0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 13:28 UTC (permalink / raw)
  To: cygwin

On Mar  6 14:22, Corinna Vinschen via Cygwin wrote:
> On Mar  5 19:54, Marcin Wisnicki via Cygwin wrote:
> > If I invoke ls or anything else that does stat inside OneDrive folder
> > it will trigger download of all files.
> > 
> > OneDrive uses placeholder files[1] to represent remote files.
> > 
> > I'm guessing reading file content in stat is to support detection of
> > actually executable files as in here[2]?
> > 
> > I think this should be disabled on non-hydrated placeholder files.
> > Running `find` or 'ls -R` and having your entire OneDrive downloaded
> > is extremely problematic.
> > 
> > I could live without executable scripts in the OneDrive folder and
> > it's easy to mark files as always offline to solve it.
> > 
> > Another idea is to skip checking files with extensions known to be
> > non-executable such as jpg (or just any extensions that is not known
> > to be executable).
> 
> Nothing of this makes sense from a POSIX library POV.  The library can
> either not handle placeholder files specially, as today, or it can
> handle them all the same way.
> 
> Given these placeholder files are actually reparse points of type
> IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
> 
> However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> data buffer is undocumented.  It would be helpful if somebody using
> OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> 
> > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
> 
> The NtReadFile call at this point is not the problem.  It would be
> helpful to point to Cygwin's source instead of MSYS2, btw.

Oh, btw., this is from
https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:

  IO_REPARSE_TAG_FILE_PLACEHOLDER
  0x80000015

    Obsolete.
    ---------
    Used by Windows Shell for legacy placeholder files in Windows 8.1.
    Server-side interpretation only, not meaningful over the wire.

So even if we support them, what is their replacement in W10 and later?


Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 13:28   ` Corinna Vinschen
@ 2024-03-06 13:54     ` Brian Inglis
  2024-03-06 17:19       ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Brian Inglis @ 2024-03-06 13:54 UTC (permalink / raw)
  To: cygwin

On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> On Mar  6 14:22, Corinna Vinschen via Cygwin wrote:
>> On Mar  5 19:54, Marcin Wisnicki via Cygwin wrote:
>>> If I invoke ls or anything else that does stat inside OneDrive folder
>>> it will trigger download of all files.
>>>
>>> OneDrive uses placeholder files[1] to represent remote files.
>>>
>>> I'm guessing reading file content in stat is to support detection of
>>> actually executable files as in here[2]?
>>>
>>> I think this should be disabled on non-hydrated placeholder files.
>>> Running `find` or 'ls -R` and having your entire OneDrive downloaded
>>> is extremely problematic.
>>>
>>> I could live without executable scripts in the OneDrive folder and
>>> it's easy to mark files as always offline to solve it.
>>>
>>> Another idea is to skip checking files with extensions known to be
>>> non-executable such as jpg (or just any extensions that is not known
>>> to be executable).
>>
>> Nothing of this makes sense from a POSIX library POV.  The library can
>> either not handle placeholder files specially, as today, or it can
>> handle them all the same way.
>>
>> Given these placeholder files are actually reparse points of type
>> IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
>>
>> However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
>> data buffer is undocumented.  It would be helpful if somebody using
>> OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
>>
>>> [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
>>
>> The NtReadFile call at this point is not the problem.  It would be
>> helpful to point to Cygwin's source instead of MSYS2, btw.
> 
> Oh, btw., this is from
> https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
> 
>    IO_REPARSE_TAG_FILE_PLACEHOLDER
>    0x80000015
> 
>      Obsolete.
>      ---------
>      Used by Windows Shell for legacy placeholder files in Windows 8.1.
>      Server-side interpretation only, not meaningful over the wire.
> 
> So even if we support them, what is their replacement in W10 and later?

May or may not help:

https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder
-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                 -- Antoine de Saint-Exupéry



* Re: ls/stat on OneDrive causes download of files
  2024-03-06 13:54     ` Brian Inglis
@ 2024-03-06 17:19       ` Corinna Vinschen
  2024-03-06 18:55         ` Jeffrey Altman
  2024-03-06 19:00         ` Corinna Vinschen
  0 siblings, 2 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 17:19 UTC (permalink / raw)
  To: cygwin

On Mar  6 06:54, Brian Inglis via Cygwin wrote:
> On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> > On Mar  6 14:22, Corinna Vinschen via Cygwin wrote:
> > > Given these placeholder files are actually reparse points of type
> > > IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
> > > 
> > > However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> > > data buffer is undocumented.  It would be helpful if somebody using
> > > OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> > > 
> > > > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
> > > 
> > > The NtReadFile call at this point is not the problem.  It would be
> > > helpful to point to Cygwin's source instead of MSYS2, btw.
> > 
> > Oh, btw., this is from
> > https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
> > 
> >    IO_REPARSE_TAG_FILE_PLACEHOLDER
> >    0x80000015
> > 
> >      Obsolete.
> >      ---------
> >      Used by Windows Shell for legacy placeholder files in Windows 8.1.
> >      Server-side interpretation only, not meaningful over the wire.
> > 
> > So even if we support them, what is their replacement in W10 and later?
> 
> May or not help:
> 
> https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder

We can add an explicit call to

  RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);

and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
we still have to know what the reparse point buffer actually contains.

Given that the content of reparse points with these reparse tags is
undocumented, some people using cloud services should examine these
reparse points so we can add some suitable code to Cygwin.


Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 17:19       ` Corinna Vinschen
@ 2024-03-06 18:55         ` Jeffrey Altman
  2024-03-06 19:14           ` Corinna Vinschen
                             ` (2 more replies)
  2024-03-06 19:00         ` Corinna Vinschen
  1 sibling, 3 replies; 17+ messages in thread
From: Jeffrey Altman @ 2024-03-06 18:55 UTC (permalink / raw)
  To: cygwin

On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> We can add an explicit call to
>
>    RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
>
> Given that the content of reparse points with these reparse tags are
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.
>
>
> Corinna
I'm not an expert in this area by any means but here are my 
recollections from when Microsoft presented in-person on cloud 
placeholders to filter and filesystem developers many years ago.

Files and directories that are placeholders should have either the 
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN 
file attributes set. When these attributes are set, applications and 
mini filters are advised not to "read" or "open" the files or 
directories unless they absolutely need to because doing so will cause 
the placeholder to be replaced by an object containing the actual data 
which might take a long time to fetch, might cost the end user money, or 
might fail depending upon the network connectivity. In particular, 
anti-malware should ignore them during scans and only analyze the data 
when it is fetched locally by an end user application.

I believe that IO_REPARSE_TAG_FILE_PLACEHOLDER was replaced by 
IO_REPARSE_TAG_CLOUD_1 ..IO_REPARSE_TAG_CLOUD_F. Any reparse tag 
attached to a placeholder object is for the interpretation of the filter 
associated with the back-end storage and not for the consumption of 
applications. The content of the reparse tags can be back-end 
proprietary; different reparse data for onedrive, icloud, dropbox, etc.

The default ProcessPlaceholderCompatibilityMode is 
PHCM_EXPOSE_PLACEHOLDERS, which makes the FILE_ATTRIBUTE flags and 
reparse tags visible. Microsoft maintains a database of processes for 
which PHCM_DISGUISE_PLACEHOLDER is set, which hides that information. It's 
unclear to me that explicitly setting the placeholder compatibility mode 
is useful.

I'm not sure that exposing the object as a symlink is a good idea. A 
posix symlink is an object whose type and target information cannot 
change. In the case of a placeholder, the placeholder is silently 
replaced by the actual object either when the object is opened or the 
object's data is accessed. An application that believes it knows that 
the object is a symlink will be mighty confused when it turns out to be 
a file or a directory.

Perhaps the question that needs to be asked is whether there are opens 
that can be skipped if an object is known to not be locally present 
(either of the FILE_ATTRIBUTE flags is set)?

Jeffrey Altman




* Re: ls/stat on OneDrive causes download of files
  2024-03-06 17:19       ` Corinna Vinschen
  2024-03-06 18:55         ` Jeffrey Altman
@ 2024-03-06 19:00         ` Corinna Vinschen
  1 sibling, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 19:00 UTC (permalink / raw)
  To: cygwin

On Mar  6 18:19, Corinna Vinschen via Cygwin wrote:
> On Mar  6 06:54, Brian Inglis via Cygwin wrote:
> > On 2024-03-06 06:28, Corinna Vinschen via Cygwin wrote:
> > > On Mar  6 14:22, Corinna Vinschen via Cygwin wrote:
> > > > Given these placeholder files are actually reparse points of type
> > > > IO_REPARSE_TAG_FILE_PLACEHOLDER, we can handle them as symbolic links.
> > > > 
> > > > However, the structure of the IO_REPARSE_TAG_FILE_PLACEHOLDER reparse
> > > > data buffer is undocumented.  It would be helpful if somebody using
> > > > OneDrive would examine the content of the attached REPARSE_DATA_BUFFER.
> > > > 
> > > > > [2] https://github.com/msys2/msys2-runtime/blob/msys2-3.4.10/winsup/cygwin/fhandler/disk_file.cc#L548
> > > > 
> > > > The NtReadFile call at this point is not the problem.  It would be
> > > > helpful to point to Cygwin's source instead of MSYS2, btw.
> > > 
> > > Oh, btw., this is from
> > > https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4:
> > > 
> > >    IO_REPARSE_TAG_FILE_PLACEHOLDER
> > >    0x80000015
> > > 
> > >      Obsolete.
> > >      ---------
> > >      Used by Windows Shell for legacy placeholder files in Windows 8.1.
> > >      Server-side interpretation only, not meaningful over the wire.
> > > 
> > > So even if we support them, what is their replacement in W10 and later?
> > 
> > May or not help:
> > 
> > https://stackoverflow.com/questions/59152220/cant-get-reparse-point-information-for-the-onedrive-folder
> 
> We can add an explicit call to
> 
>   RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
> 
> and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> we still have to know what the reparse point buffer actually contains.
> 
> Given that the content of reparse points with these reparse tags are
> undocumented, some people using cloud services should examine these
> reparse points so we can add some suitable code to Cygwin.

Reading further on this it seems that one cannot easily compare these
reparse points with symlinks.

The tag values are 0x80000015 for IO_REPARSE_TAG_FILE_PLACEHOLDER and
0x9000001AL up to 0x9000F01AL for the IO_REPARSE_TAG_CLOUD_* tags.  None
of them have the name surrogate bit set, so they don't "represent
another named entity in the system".  In other words, these reparse
points represent themselves rather than pointing to some other file, as
symlinks do.

Additionally the IO_REPARSE_TAG_CLOUD_* tags all have the directory bit
set so it seems they are used on the parent(?) directory of the local
OneDrive copy only, but not on the files inside it.
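
The bit tests described above can be sketched in plain C.  This is only an
illustration, not Cygwin code: the tag values are the ones quoted in this
mail, the symlink tag 0xA000000C is added for comparison, the bit masks are
assumed to match the IsReparseTag* macros in the Windows headers, and all
helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tag values as quoted in the mail above. */
#define TAG_FILE_PLACEHOLDER 0x80000015u
#define TAG_CLOUD_FIRST      0x9000001Au
#define TAG_CLOUD_LAST       0x9000F01Au
/* Symlink tag, included only for comparison. */
#define TAG_SYMLINK          0xA000000Cu

/* Bit masks, assumed to match the Windows IsReparseTag* macros. */
#define TAG_BIT_NAME_SURROGATE 0x20000000u
#define TAG_BIT_DIRECTORY      0x10000000u

/* True if the tag "represents another named entity in the system". */
static inline bool tag_is_name_surrogate(uint32_t tag)
{
  return (tag & TAG_BIT_NAME_SURROGATE) != 0;
}

/* True if the directory bit is set in the tag. */
static inline bool tag_has_directory_bit(uint32_t tag)
{
  return (tag & TAG_BIT_DIRECTORY) != 0;
}

/* True for any of the sixteen cloud tags; they differ only in bits 12..15. */
static inline bool tag_is_cloud(uint32_t tag)
{
  return (tag & 0xFFFF0FFFu) == TAG_CLOUD_FIRST;
}

/* The observations from the mail, written as assertions. */
static inline void tag_examples(void)
{
  assert(!tag_is_name_surrogate(TAG_FILE_PLACEHOLDER));
  assert(!tag_is_name_surrogate(TAG_CLOUD_FIRST));
  assert(tag_is_name_surrogate(TAG_SYMLINK));
  assert(tag_has_directory_bit(TAG_CLOUD_LAST));
  assert(!tag_has_directory_bit(TAG_FILE_PLACEHOLDER));
}
```

Only the symlink tag has the name surrogate bit set, which is the reason
these reparse points cannot simply be treated like symlinks.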

Bottom line:

I wonder if the real deal is not the reparse tag and the reparse
content, but whether or not the file has the FILE_ATTRIBUTE_OFFLINE flag
set.

If so, we can try to disable any action within path conversion, as 
well as in our stat(2) and readdir(3) implementation which would
trigger onlining an offline file.

Can anybody confirm that the idea is right, or if I'm missing something?


Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 18:55         ` Jeffrey Altman
@ 2024-03-06 19:14           ` Corinna Vinschen
  2024-03-07  9:06           ` Corinna Vinschen
  2024-03-08 10:37           ` Corinna Vinschen
  2 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-06 19:14 UTC (permalink / raw)
  To: cygwin

Hi Jeffrey,


looks like writing our mails overlapped:
https://cygwin.com/pipermail/cygwin/2024-March/255622.html

On Mar  6 13:55, Jeffrey Altman via Cygwin wrote:
> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> > We can add an explicit call to
> > 
> >    RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
> > 
> > and we can recognize the IO_REPARSE_TAG_FILE_PLACEHOLDER and
> > IO_REPARSE_TAG_CLOUD_* tags during symlink evaluation, but even then
> > we still have to know what the reparse point buffer actually contains.
> > 
> > Given that the content of reparse points with these reparse tags are
> > undocumented, some people using cloud services should examine these
> > reparse points so we can add some suitable code to Cygwin.
> > 
> > 
> > Corinna
> I'm not an expert in this area by any means but here are my recollections
> from when Microsoft presented in-person on cloud placeholders to filter and
> filesystem developers many years ago.
> 
> Files and directories that are placeholders should have either the
> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
> attributes set. When these attributes are set, applications and mini filters
> are advised not to "read" or "open" the files or directories unless they
> absolutely need to

Per https://learn.microsoft.com/en-us/windows/win32/fileio/file-attribute-constants
FILE_ATTRIBUTE_RECALL_ON_OPEN only appears in directory listing
classes, but not in standard FILE_BASIC_INFORMATION and the like.
That's a bit of a problem considering how we check files during
path conversion.

The MSDN article doesn't state the same for
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS, which is good, I think.

But I'm a bit puzzled then in terms of FILE_ATTRIBUTE_OFFLINE.  Is it
not used for OneDrive files?

> [...]
> I'm not sure that exposing the object as a symlink is a good idea.

Yeah, that's what I realized as well, see my aforementioned mail.

> Perhaps the question that needs to be asked is whether there are opens that
> can be skipped if an object is known to not be locally present (either of
> the FILE_ATTRIBUTE flags are set)?

This may be the way to go, see my mail.  It wouldn't be much of
a problem to check all attribute bits, i.e. FILE_ATTRIBUTE_OFFLINE,
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS and FILE_ATTRIBUTE_RECALL_ON_OPEN.
Maybe that's what we should do.
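
A minimal sketch of such a combined check, assuming the attribute values
from the Windows SDK headers (repeated here so the snippet builds anywhere).
The ATTR_* names and the helper are hypothetical and not taken from Cygwin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values assumed to match the Windows SDK definitions. */
#define ATTR_ARCHIVE               0x00000020u /* FILE_ATTRIBUTE_ARCHIVE */
#define ATTR_OFFLINE               0x00001000u /* FILE_ATTRIBUTE_OFFLINE */
#define ATTR_RECALL_ON_OPEN        0x00040000u /* FILE_ATTRIBUTE_RECALL_ON_OPEN */
#define ATTR_RECALL_ON_DATA_ACCESS 0x00400000u /* FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS */

/* All bits that hint at "the data is not available locally". */
#define ATTR_NOT_LOCAL \
  (ATTR_OFFLINE | ATTR_RECALL_ON_OPEN | ATTR_RECALL_ON_DATA_ACCESS)

/* Hypothetical helper: true if any of the three bits is set. */
static inline bool attrs_not_local(uint32_t attrs)
{
  return (attrs & ATTR_NOT_LOCAL) != 0;
}

static inline void attr_examples(void)
{
  /* A plain local file with only the archive bit set. */
  assert(!attrs_not_local(ATTR_ARCHIVE));
  /* A placeholder whose data is recalled on first read. */
  assert(attrs_not_local(ATTR_ARCHIVE | ATTR_RECALL_ON_DATA_ACCESS));
}
```

Testing against a single mask keeps the check cheap enough to run on every
file seen during path conversion.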


Thanks,
Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 18:55         ` Jeffrey Altman
  2024-03-06 19:14           ` Corinna Vinschen
@ 2024-03-07  9:06           ` Corinna Vinschen
  2024-03-08 10:37           ` Corinna Vinschen
  2 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-07  9:06 UTC (permalink / raw)
  To: Jeffrey Altman; +Cc: cygwin

Hi Jeffrey,


apart from the attribute stuff...


On Mar  6 13:55, Jeffrey Altman via Cygwin wrote:
> The default ProcessPlaceholderCompaibilityMode is PHCM_EXPOSE_PLACEHOLDERS
> which makes the FILE_ATTRIBUTE flags and reparse tags visible. Microsoft
> maintains a database of processes for which PHCM_DISGUISE_PLACEHOLDER is set
> which hides that information. Its unclear to me that explicitly setting the
> placeholder compatibility mode is useful.

What I see as a problem here is this:

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-rtlsetprocessplaceholdercompatibilitymode

Quote:

  "Most Windows applications see exposed placeholders by default. For
   ^^^^
   compatibility reasons, Windows may decide that certain applications
                          ^^^^^^^^^^^^^^^^^^
   see disguised placeholders by default."

But then again, in other news from Microsoft:

https://learn.microsoft.com/en-us/windows/win32/cfapi/build-a-cloud-file-sync-engine#compatibility-with-applications-that-use-reparse-points

Quote:

  "[...] the cloud files API always hides its reparse points from all
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   applications except for sync engines and processes whose main image
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   resides under %systemroot%. Applications that understand reparse
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   points correctly can force the platform to expose cloud files API
   reparse points using RtlSetProcessPlaceholderCompatibilityMode or
   RtlSetThreadPlaceholderCompatibilityMode."

Considering these two statements, it's totally unclear to a process
whether it defaults to "exposed" or "disguised".

Fortunately we can ask Windows by calling the
RtlQueryProcessPlaceholderCompatibilityMode() function, right?

Let's have a look into the documentation at
https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-rtlqueryprocessplaceholdercompatibilitymode

Quote:

  "Return value

  This function returns the process's placeholder compatibility mode
  (PHCM_xxx), or a negative value on error (PHCM_ERROR_xxx). Contains
  one of the following values:

  Compatibility Mode           Value
  PHCM_APPLICATION_DEFAULT       0
  PHCM_DISGUISE_PLACEHOLDER      1
  PHCM_EXPOSE_PLACEHOLDERS       2
  PHCM_MAX                       2
  PHCM_ERROR_INVALID_PARAMETER  -1
  PHCM_ERROR_NO_TEB             -2"

So I called the function right at the start of the Cygwin DLL, and it
returns the value 0, i. e., PHCM_APPLICATION_DEFAULT.

At this point the process *still* has no idea if placeholders are
exposed or disguised.  What a great API! \o/

So, from the above, and if we really want to be sure that placeholders
will be exposed, I don't see any way around calling
RtlSetProcessPlaceholderCompatibilityMode(PHCM_EXPOSE_PLACEHOLDERS)
explicitly at DLL startup.
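
The reasoning can be sketched as follows.  The constants are copied from
the table quoted above; the two helpers are hypothetical and only classify
a value as it would be returned by the query function, they do not call
into Windows.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the documentation table quoted above. */
enum {
  PHCM_APPLICATION_DEFAULT     = 0,
  PHCM_DISGUISE_PLACEHOLDER    = 1,
  PHCM_EXPOSE_PLACEHOLDERS     = 2,
  PHCM_ERROR_INVALID_PARAMETER = -1,
  PHCM_ERROR_NO_TEB            = -2
};

/* Hypothetical helper: does the queried mode settle the question at all? */
static inline bool phcm_is_conclusive(int mode)
{
  return mode == PHCM_DISGUISE_PLACEHOLDER
         || mode == PHCM_EXPOSE_PLACEHOLDERS;
}

/* Hypothetical helper: an explicit "expose" call is needed unless the
   query already proves that placeholders are exposed. */
static inline bool phcm_needs_explicit_expose(int mode)
{
  return mode != PHCM_EXPOSE_PLACEHOLDERS;
}

static inline void phcm_examples(void)
{
  /* The value observed at Cygwin DLL startup says nothing either way. */
  assert(!phcm_is_conclusive(PHCM_APPLICATION_DEFAULT));
  assert(phcm_needs_explicit_expose(PHCM_APPLICATION_DEFAULT));
  assert(!phcm_needs_explicit_expose(PHCM_EXPOSE_PLACEHOLDERS));
}
```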

What do you think?


Thanks,
Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-06 18:55         ` Jeffrey Altman
  2024-03-06 19:14           ` Corinna Vinschen
  2024-03-07  9:06           ` Corinna Vinschen
@ 2024-03-08 10:37           ` Corinna Vinschen
  2024-03-08 12:52             ` Thomas Wolff
  2 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 10:37 UTC (permalink / raw)
  To: Jeffrey Altman; +Cc: cygwin

Hi Jeffrey,

On Mar  6 13:55, Jeffrey Altman via Cygwin wrote:
> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
> > We can add an explicit call to
> > 
> >    RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
> > [...]
> Files and directories that are placeholders should have either the
> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
> attributes set. When these attributes are set, applications and mini filters
> are advised not to "read" or "open" the files or directories unless they
> absolutely need to because doing so will cause the placeholder to be
> replaced by an object containing the actual data which might take a long
> time to fetch,

Yesterday I stumbled over a certain NtCreateFile flag:

  FILE_OPEN_NO_RECALL (0x00400000)

    Instructs any filters that perform offline storage or virtualization
    to not recall the contents of the file as a result of this open.

MS-CIFS describes it like this:

  FILE_OPEN_NO_RECALL
  0x00400000

    In a hierarchical storage management environment, this option
    requests that the file SHOULD NOT be recalled from tertiary storage
    such as tape. A file recall can take up to several minutes in a
    hierarchical storage management environment. The clients can specify
    this option to avoid such delays.

This sounds like we could simply add this flag to all NtOpenFile calls
used for path conversion or stat-like calls, without having to care
about any file attributes specifically.
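
As a rough sketch of the idea: the flag value is the one quoted above, the
two other option values are assumed to match the Windows headers and are
used only as an example of options a metadata-only open might already pass,
and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define OPT_OPEN_FOR_BACKUP_INTENT 0x00004000u /* FILE_OPEN_FOR_BACKUP_INTENT */
#define OPT_OPEN_REPARSE_POINT     0x00200000u /* FILE_OPEN_REPARSE_POINT */
#define OPT_OPEN_NO_RECALL         0x00400000u /* FILE_OPEN_NO_RECALL, as quoted */

/* Hypothetical helper: open options for opens that only need metadata.
   Whatever the caller passes, the no-recall bit is always added. */
static inline uint32_t metadata_open_options(uint32_t base)
{
  return base | OPT_OPEN_NO_RECALL;
}

static inline void open_option_examples(void)
{
  uint32_t opts = metadata_open_options(OPT_OPEN_FOR_BACKUP_INTENT
                                        | OPT_OPEN_REPARSE_POINT);
  assert(opts == 0x00604000u);
  /* Applying the helper twice changes nothing. */
  assert(metadata_open_options(opts) == opts);
}
```

Funnelling all stat-like opens through one helper would make it hard to
forget the flag in a single call site.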

Does that make sense?


Thanks,
Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-08 10:37           ` Corinna Vinschen
@ 2024-03-08 12:52             ` Thomas Wolff
  2024-03-08 13:15               ` Jeffrey Altman
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Wolff @ 2024-03-08 12:52 UTC (permalink / raw)
  To: cygwin



On 08.03.2024 at 11:37, Corinna Vinschen via Cygwin wrote:
> Hi Jeffrey,
>
> On Mar  6 13:55, Jeffrey Altman via Cygwin wrote:
>> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
>>> We can add an explicit call to
>>>
>>>     RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>>> [...]
>> Files and directories that are placeholders should have either the
>> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
>> attributes set. When these attributes are set, applications and mini filters
>> are advised not to "read" or "open" the files or directories unless they
>> absolutely need to because doing so will cause the placeholder to be
>> replaced by an object containing the actual data which might take a long
>> time to fetch,
> Yesterday I stumbled over a certain NtCreateFile flag:
>
>    FILE_OPEN_NO_RECALL (0x00400000)
>
>      Instructs any filters that perform offline storage or virtualization
>      to not recall the contents of the file as a result of this open.
>
> MS-CIFS described it like this:
>
>    FILE_OPEN_NO_RECALL
>    0x00400000
>
>      In a hierarchical storage management environment, this option
>      requests that the file SHOULD NOT be recalled from tertiary storage
>      such as tape. A file recall can take up to several minutes in a
>      hierarchical storage management environment. The clients can specify
>      this option to avoid such delays.
>
> This sounds like we could simply add this flag to all NtOpenFile
> used for path conversion or stat-like calls, without having to care
> for any file attributes specificially.
>
> Does that make sense?
Sounds good, without even studying the other details...
I speculate some more handling would still be needed to avoid executable
detection via magic tags.
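
A sketch of what such extra handling might look like, under loud
assumptions: the magic values "#!" and "MZ" are chosen for illustration
only, the reader callback and the in-memory file stand in for a real read,
and none of the names come from Cygwin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed magic values for illustration: script and PE headers. */
static inline bool looks_executable(const unsigned char *buf, size_t len)
{
  if (len < 2)
    return false;
  return (buf[0] == '#' && buf[1] == '!') || (buf[0] == 'M' && buf[1] == 'Z');
}

/* Hypothetical reader callback: fills buf with up to cap leading bytes. */
typedef bool (*read_head_fn)(void *ctx, unsigned char *buf, size_t cap,
                             size_t *got);

/* Skip the content sniffing entirely for files that are not local,
   so that no read (and therefore no recall) is ever issued for them. */
static inline bool exec_by_magic(bool not_local, read_head_fn read_head,
                                 void *ctx)
{
  unsigned char head[2];
  size_t got = 0;

  if (not_local)
    return false;
  if (!read_head(ctx, head, sizeof head, &got))
    return false;
  return looks_executable(head, got);
}

/* In-memory stand-in for a file; counts how often it is read. */
struct mem_file { const char *data; size_t len; int reads; };

static inline bool mem_read_head(void *ctx, unsigned char *buf, size_t cap,
                                 size_t *got)
{
  struct mem_file *f = ctx;
  size_t n = f->len < cap ? f->len : cap;

  memcpy(buf, f->data, n);
  *got = n;
  f->reads++;
  return true;
}
```

With this shape the price is that a non-local script is never reported as
executable until it has been fetched by some other means.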


* Re: ls/stat on OneDrive causes download of files
  2024-03-08 12:52             ` Thomas Wolff
@ 2024-03-08 13:15               ` Jeffrey Altman
  2024-03-08 13:56                 ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Jeffrey Altman @ 2024-03-08 13:15 UTC (permalink / raw)
  To: Thomas Wolff, cygwin

On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
>
> On 08.03.2024 at 11:37, Corinna Vinschen via Cygwin wrote:
>> Hi Jeffrey,
>>
>> On Mar  6 13:55, Jeffrey Altman via Cygwin wrote:
>>> On 3/6/2024 12:19 PM, Corinna Vinschen via Cygwin wrote:
>>>> We can add an explicit call to
>>>>
>>>>     RtlSetProcessPlaceholderCompatibilityMode (PHCM_EXPOSE_PLACEHOLDERS);
>>>> [...]
>>> Files and directories that are placeholders should have either the
>>> FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS or FILE_ATTRIBUTE_RECALL_ON_OPEN file
>>> attributes set. When these attributes are set, applications and mini filters
>>> are advised not to "read" or "open" the files or directories unless they
>>> absolutely need to because doing so will cause the placeholder to be
>>> replaced by an object containing the actual data which might take a long
>>> time to fetch,
>> Yesterday I stumbled over a certain NtCreateFile flag:
>>
>>    FILE_OPEN_NO_RECALL (0x00400000)
>>
>>      Instructs any filters that perform offline storage or virtualization
>>      to not recall the contents of the file as a result of this open.
>>
>> MS-CIFS described it like this:
>>
>>    FILE_OPEN_NO_RECALL
>>    0x00400000
>>
>>      In a hierarchical storage management environment, this option
>>      requests that the file SHOULD NOT be recalled from tertiary storage
>>      such as tape. A file recall can take up to several minutes in a
>>      hierarchical storage management environment. The clients can specify
>>      this option to avoid such delays.
>>
>> This sounds like we could simply add this flag to all NtOpenFile
>> used for path conversion or stat-like calls, without having to care
>> for any file attributes specificially.
>>
>> Does that make sense?
> Sounds good, without even studying the other details...
> I speculate some more handling would still be needed to avoid executable
> detection via magic tags.
>
Agreed.  FILE_OPEN_NO_RECALL has been defined for at least a decade but 
was not documented by Microsoft until relatively recently.

Another suggestion would be to try opening the file with 
FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not 
required.  See

https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
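
A minimal sketch of that choice, assuming the access mask values from the
Windows headers (repeated here so the snippet is self-contained); the
ACC_* names and the helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACC_FILE_READ_ATTRIBUTES 0x00000080u /* FILE_READ_ATTRIBUTES */
#define ACC_GENERIC_READ         0x80000000u /* GENERIC_READ */

/* Hypothetical helper: request data access only if the caller really
   needs the file content; metadata queries get the minimal mask. */
static inline uint32_t desired_access(bool need_data)
{
  return need_data ? ACC_GENERIC_READ : ACC_FILE_READ_ATTRIBUTES;
}

static inline void access_examples(void)
{
  /* A stat-like call never asks for read access to the data. */
  assert(desired_access(false) == ACC_FILE_READ_ATTRIBUTES);
  assert((desired_access(false) & ACC_GENERIC_READ) == 0);
  assert(desired_access(true) == ACC_GENERIC_READ);
}
```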

Jeffrey Altman




* Re: ls/stat on OneDrive causes download of files
  2024-03-08 13:15               ` Jeffrey Altman
@ 2024-03-08 13:56                 ` Corinna Vinschen
  2024-03-08 22:21                   ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 13:56 UTC (permalink / raw)
  To: Jeffrey Altman; +Cc: Thomas Wolff, cygwin

On Mar  8 08:15, Jeffrey Altman via Cygwin wrote:
> On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
> > On 08.03.2024 at 11:37, Corinna Vinschen via Cygwin wrote:
> > > Yesterday I stumbled over a certain NtCreateFile flag:
> > > 
> > >    FILE_OPEN_NO_RECALL (0x00400000)
> > > 
> > >      Instructs any filters that perform offline storage or virtualization
> > >      to not recall the contents of the file as a result of this open.
> > > 
> > > MS-CIFS described it like this:
> > > 
> > >    FILE_OPEN_NO_RECALL
> > >    0x00400000
> > > 
> > >      In a hierarchical storage management environment, this option
> > >      requests that the file SHOULD NOT be recalled from tertiary storage
> > >      such as tape. A file recall can take up to several minutes in a
> > >      hierarchical storage management environment. The clients can
> > >      specify this option to avoid such delays.
> > > 
> > > This sounds like we could simply add this flag to all NtOpenFile
> > > used for path conversion or stat-like calls, without having to care
> > > for any file attributes specifically.
> > > 
> > > Does that make sense?
> > Sounds good, without even studying the other details...
> > I speculate some more handling would still be needed to avoid executable
> > detection via magic tags.
> > 
> Agreed.   FILE_OPEN_NO_RECALL has been defined for at least a decade but was
> not documented by Microsoft until relatively recently.

Thanks for the feedback, guys.

> Another suggestion would be to try opening the file with
> FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
> required.  See
> 
> https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493

Cygwin uses the minimum of required permissions in NtCreateFile/
NtOpenFile calls anyway.

I'm just running a test cygwin DLL locally with a lot of added
FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
being offline to allow skipping some code.
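
Roughly, the idea can be sketched like this in C.  This is an
illustration, not the actual patch: stat_open_options() and
is_offline() are hypothetical names, and which attribute bits count as
"offline" is an assumption.  The constants carry the values from the
Windows SDK headers and are defined locally so the snippet builds
anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Create option discussed in this thread. */
#define SK_FILE_OPEN_NO_RECALL                  0x00400000u

/* File attribute bits (Windows SDK values), defined locally. */
#define SK_FILE_ATTRIBUTE_OFFLINE               0x00001000u
#define SK_FILE_ATTRIBUTE_RECALL_ON_OPEN        0x00040000u
#define SK_FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 0x00400000u

/* Hypothetical: add the no-recall option to whatever create options a
   path-conversion or stat-like open already uses. */
static uint32_t stat_open_options(uint32_t base_options)
{
    return base_options | SK_FILE_OPEN_NO_RECALL;
}

/* Hypothetical: true if reading the file data would likely trigger a
   recall.  The choice of bits here is an assumption. */
static bool is_offline(uint32_t attrs)
{
    const uint32_t mask = SK_FILE_ATTRIBUTE_OFFLINE
                        | SK_FILE_ATTRIBUTE_RECALL_ON_OPEN
                        | SK_FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS;
    return (attrs & mask) != 0;
}
```

Code that would otherwise read file contents can test is_offline()
first and skip the read for placeholders.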

I think I'll push this change in a bit so we get a test release out
so people using OneDrive can test.


Thanks,
Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-08 13:56                 ` Corinna Vinschen
@ 2024-03-08 22:21                   ` Corinna Vinschen
  2024-03-08 22:26                     ` Marcin Wisnicki
  0 siblings, 1 reply; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-08 22:21 UTC (permalink / raw)
  To: Marcin Wisnicki, cygwin; +Cc: Jeffrey Altman, Thomas Wolff

On Mar  8 14:56, Corinna Vinschen via Cygwin wrote:
> On Mar  8 08:15, Jeffrey Altman via Cygwin wrote:
> > On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
> > > Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
> > > >    FILE_OPEN_NO_RECALL (0x00400000)
> > > > [...]
> > > > This sounds like we could simply add this flag to all NtOpenFile
> > > > used for path conversion or stat-like calls, without having to care
> > > > for any file attributes specifically.
> > > > 
> > > > Does that make sense?
> > > Sounds good, without even studying the other details...
> > > I speculate some more handling would still be needed to avoid executable
> > > detection via magic tags.
> > > 
> > Agreed.   FILE_OPEN_NO_RECALL has been defined for at least a decade but was
> > not documented by Microsoft until relatively recently.
> 
> Thanks for the feedback, guys.
> 
> > Another suggestion would be to try opening the file with
> > FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
> > required.  See
> > 
> > https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
> 
> Cygwin uses the minimum of required permissions in NtCreateFile/
> NtOpenFile calls anyway.
> 
> I'm just running a test cygwin DLL locally with a lot of added
> FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
> being offline to allow skipping some code.
> 
> I think I'll push this change in a bit so we get a test release out
> so people using OneDrive can test.

I pushed this change as well as a followup change to make sure we don't
inadvertently recall an offline file.  I also added handling for the
Pinned and Unpinned attributes to chattr(1) and lsattr(1).
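
For readers unfamiliar with these two attributes, here is a minimal C
sketch of how the bits can be decoded.  It is not the chattr(1) or
lsattr(1) code; the enum and pin_state() are hypothetical names, and
the constants carry the Windows SDK values, defined locally so the
snippet builds anywhere.

```c
#include <stdint.h>

/* Windows SDK attribute values, defined locally under SK_ names. */
#define SK_FILE_ATTRIBUTE_PINNED   0x00080000u /* keep fully present locally */
#define SK_FILE_ATTRIBUTE_UNPINNED 0x00100000u /* may be kept online-only */

/* Hypothetical result type for the sketch. */
enum pin_state_t { PIN_UNSPECIFIED = 0, PIN_PINNED, PIN_UNPINNED };

/* Hypothetical helper: map the attribute bits to the user's intent. */
static enum pin_state_t pin_state(uint32_t attrs)
{
    if (attrs & SK_FILE_ATTRIBUTE_PINNED)
        return PIN_PINNED;
    if (attrs & SK_FILE_ATTRIBUTE_UNPINNED)
        return PIN_UNPINNED;
    return PIN_UNSPECIFIED;
}
```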

The full set of changes can be tested by installing the Cygwin test
release 3.6.0-0.77.g06aa5a751682.

Please give it a try.  If you still see an offline file being recalled
in a situation that doesn't warrant it, please report it.  We will then
have to analyze that situation further.


Thanks,
Corinna


* Re: ls/stat on OneDrive causes download of files
  2024-03-08 22:21                   ` Corinna Vinschen
@ 2024-03-08 22:26                     ` Marcin Wisnicki
  2024-03-09 20:29                       ` Marcin Wisnicki
  0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-08 22:26 UTC (permalink / raw)
  To: cygwin, Jeffrey Altman, Thomas Wolff

On 2024-03-08 17:21, Corinna Vinschen wrote:
> On Mar  8 14:56, Corinna Vinschen via Cygwin wrote:
>> On Mar  8 08:15, Jeffrey Altman via Cygwin wrote:
>>> On 3/8/2024 7:52 AM, Thomas Wolff via Cygwin wrote:
>>>> Am 08.03.2024 um 11:37 schrieb Corinna Vinschen via Cygwin:
>>>>>     FILE_OPEN_NO_RECALL (0x00400000)
>>>>> [...]
>>>>> This sounds like we could simply add this flag to all NtOpenFile
>>>>> used for path conversion or stat-like calls, without having to care
>>>>> for any file attributes specifically.
>>>>>
>>>>> Does that make sense?
>>>> Sounds good, without even studying the other details...
>>>> I speculate some more handling would still be needed to avoid executable
>>>> detection via magic tags.
>>>>
>>> Agreed.   FILE_OPEN_NO_RECALL has been defined for at least a decade but was
>>> not documented by Microsoft until relatively recently.
>> Thanks for the feedback, guys.
>>
>>> Another suggestion would be to try opening the file with
>>> FILE_READ_ATTRIBUTES instead of GENERIC_READ if the file data is not
>>> required.  See
>>>
>>> https://github.com/microsoft/BuildXL/commit/4fb8e7ce07d243ccd95de0d66da551538a794493
>> Cygwin uses the minimum of required permissions in NtCreateFile/
>> NtOpenFile calls anyway.
>>
>> I'm just running a test cygwin DLL locally with a lot of added
>> FILE_OPEN_NO_RECALL bits and a couple of added attribute checks for
>> being offline to allow skipping some code.
>>
>> I think I'll push this change in a bit so we get a test release out
>> so people using OneDrive can test.
> I pushed this change as well as a followup change to make sure we don't
> inadvertently recall an offline file.  I also added handling for the
> Pinned and Unpinned attributes to chattr(1) and lsattr(1).
>
> The full set of changes can be tested by installing the Cygwin test
> release 3.6.0-0.77.g06aa5a751682.
>
> Please give it a try.  If you encounter a situation which still results
> in recalling an offline file in a situation which doesn't qualify for
> it, please report.  We will have to analyze that situation further
> then.
>
>
> Thanks,
> Corinna

Thanks for doing this work so quickly. I'm not subscribed to this 
mailing list so I didn't see previous messages.

I will try to check this in Cygwin this weekend, but I should tell you 
that I'm not a Cygwin user and have now found a report from another user 
claiming this only happens in MSys and not in Cygwin.

https://github.com/msys2/MSYS2-packages/issues/3049



* Re: ls/stat on OneDrive causes download of files
  2024-03-08 22:26                     ` Marcin Wisnicki
@ 2024-03-09 20:29                       ` Marcin Wisnicki
  2024-03-11 17:04                         ` Corinna Vinschen
  0 siblings, 1 reply; 17+ messages in thread
From: Marcin Wisnicki @ 2024-03-09 20:29 UTC (permalink / raw)
  To: cygwin, Jeffrey Altman, Thomas Wolff

I did more testing and found that the problem does not happen in
Cygwin by default because Cygwin mounts with acl, which doesn't do
header sniffing, while msys mounts with noacl.

Testing on an mp4 file in OneDrive: when I mount with noacl in Cygwin,
it triggers the read as well.
After upgrading to the test version the read is gone and the mp4 file
is not reported as executable.
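
The behaviour described above can be sketched in C as follows.  This is
an illustration under stated assumptions, not Cygwin's code: the helper
names are hypothetical, the magic values checked ("MZ" and "#!") are
illustrative, and the offline mask uses Windows SDK attribute values
defined locally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* OFFLINE | RECALL_ON_OPEN | RECALL_ON_DATA_ACCESS (Windows SDK values);
   treating all three as "offline" is an assumption of this sketch. */
#define SK_OFFLINE_MASK (0x00001000u | 0x00040000u | 0x00400000u)

/* Hypothetical: does the start of the file look like an executable?
   Only two illustrative magic values are checked. */
static bool looks_executable(const unsigned char *hdr, size_t len)
{
    if (len < 2)
        return false;
    return (hdr[0] == 'M' && hdr[1] == 'Z')
        || (hdr[0] == '#' && hdr[1] == '!');
}

/* Hypothetical: should a stat-like call read the header at all?
   With acl the permission bits come from the ACL, so no read is
   needed.  With noacl the header is sniffed, unless the file is an
   offline placeholder, in which case the read is skipped. */
static bool should_sniff_header(uint32_t attrs, bool mounted_with_acl)
{
    if (mounted_with_acl)
        return false;
    return (attrs & SK_OFFLINE_MASK) == 0;
}
```

With this logic an offline mp4 on a noacl mount is never read, and a
hydrated mp4 is read but not treated as executable because its header
matches neither magic value.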

Thank you!



* Re: ls/stat on OneDrive causes download of files
  2024-03-09 20:29                       ` Marcin Wisnicki
@ 2024-03-11 17:04                         ` Corinna Vinschen
  0 siblings, 0 replies; 17+ messages in thread
From: Corinna Vinschen @ 2024-03-11 17:04 UTC (permalink / raw)
  To: Marcin Wisnicki; +Cc: cygwin, Jeffrey Altman, Thomas Wolff

On Mar  9 15:29, Marcin Wisnicki via Cygwin wrote:
> I did more testing and found out that the problem does not happen in
> cygwin by default because cygwin mounts with acl which doesn't do
> header sniffing while msys uses noacl.
> 
> Testing on an mp4 file in OneDrive, when I use noacl in cygwin it
> triggers the read as well.
> After upgrading to the test version the read is gone and an mp4 file
> is not executable.
> 
> Thank you!

Thanks a lot for testing.  I backported the changes (minus the lsattr(1)/
chattr(1) changes) to the 3.5 branch so they will be released with
3.5.2 in the next few weeks.


Corinna



Thread overview: 17+ messages
2024-03-06  0:54 ls/stat on OneDrive causes download of files Marcin Wisnicki
2024-03-06 13:22 ` Corinna Vinschen
2024-03-06 13:28   ` Corinna Vinschen
2024-03-06 13:54     ` Brian Inglis
2024-03-06 17:19       ` Corinna Vinschen
2024-03-06 18:55         ` Jeffrey Altman
2024-03-06 19:14           ` Corinna Vinschen
2024-03-07  9:06           ` Corinna Vinschen
2024-03-08 10:37           ` Corinna Vinschen
2024-03-08 12:52             ` Thomas Wolff
2024-03-08 13:15               ` Jeffrey Altman
2024-03-08 13:56                 ` Corinna Vinschen
2024-03-08 22:21                   ` Corinna Vinschen
2024-03-08 22:26                     ` Marcin Wisnicki
2024-03-09 20:29                       ` Marcin Wisnicki
2024-03-11 17:04                         ` Corinna Vinschen
2024-03-06 19:00         ` Corinna Vinschen
