public inbox for cygwin@cygwin.com
* Re: untarring symlinks with ../ fails randomly
@ 2011-06-30 12:43 Wolf Geldmacher
  2011-06-30 13:37 ` Corinna Vinschen
  0 siblings, 1 reply; 22+ messages in thread
From: Wolf Geldmacher @ 2011-06-30 12:43 UTC (permalink / raw)
  To: cygwin

Dear all,

just joining after being hit by an obviously known issue:

Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in
symbolic links being randomly substituted by zero length mode 0 files as
described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.

For me this happens on both Windows 7 and Windows Server 2008.

The interesting (new?) tidbit:

In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to
cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.

As this is the only change I made: Is it possible that the problem is
not an issue with tar (as discussed on the mailing list) but in fact a
regression in cygwin1.dll?

Regards,
Wolf



--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


* Re: untarring symlinks with ../ fails randomly
  2011-06-30 12:43 untarring symlinks with ../ fails randomly Wolf Geldmacher
@ 2011-06-30 13:37 ` Corinna Vinschen
  2011-06-30 15:05   ` Ken Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Corinna Vinschen @ 2011-06-30 13:37 UTC (permalink / raw)
  To: cygwin

On Jun 30 14:43, Wolf Geldmacher wrote:
> Dear all,
> 
> just joining after being hit by an obviously known issue:
> 
> Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in
> symbolic links being randomly substituted by zero length mode 0 files as
> described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.
> 
> For me this happens on both Windows 7 and Windows Server 2008.
> 
> The interesting (new?) tidbit:
> 
> In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to
> cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.
> 
> As this is the only change I made: Is it possible that the problem is
> not an issue with tar (as discussed on the mailing list) but in fact a
> regression in cygwin1.dll?

I never saw this happen.  Therefore, somebody actually seeing this
problem has to debug it.  The least I need is an strace.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: untarring symlinks with ../ fails randomly
  2011-06-30 13:37 ` Corinna Vinschen
@ 2011-06-30 15:05   ` Ken Brown
  2011-06-30 15:26     ` Corinna Vinschen
  2011-06-30 15:28     ` Wolf Geldmacher
  0 siblings, 2 replies; 22+ messages in thread
From: Ken Brown @ 2011-06-30 15:05 UTC (permalink / raw)
  To: cygwin

On 6/30/2011 9:37 AM, Corinna Vinschen wrote:
> On Jun 30 14:43, Wolf Geldmacher wrote:
>> Dear all,
>>
>> just joining after being hit by an obviously known issue:
>>
>> Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in
>> symbolic links being randomly substituted by zero length mode 0 files as
>> described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.
>>
>> For me this happens on both Windows 7 and Windows Server 2008.
>>
>> The interesting (new?) tidbit:
>>
>> In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to
>> cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.
>>
>> As this is the only change I made: Is it possible that the problem is
>> not an issue with tar (as discussed on the mailing list) but in fact a
>> regression in cygwin1.dll?
>
> I never saw this happen.  Therefore, somebody actually seeing this
> problem has to debug it.  The least I need is an strace.

I'm not sure it needs further debugging.  The patch to tar given in

   http://cygwin.com/ml/cygwin/2011-04/msg00385.html

solves the problem.  Eric said (in the next message) that he will apply 
this to the next release of tar.  In the meantime, it's easy to download 
the tar source, apply the patch, and rebuild.

Ken


* Re: untarring symlinks with ../ fails randomly
  2011-06-30 15:05   ` Ken Brown
@ 2011-06-30 15:26     ` Corinna Vinschen
  2011-06-30 15:28     ` Wolf Geldmacher
  1 sibling, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-06-30 15:26 UTC (permalink / raw)
  To: cygwin

On Jun 30 11:05, Ken Brown wrote:
> On 6/30/2011 9:37 AM, Corinna Vinschen wrote:
> >On Jun 30 14:43, Wolf Geldmacher wrote:
> >>Dear all,
> >>
> >>just joining after being hit by an obviously known issue:
> >>
> >>Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in
> >>symbolic links being randomly substituted by zero length mode 0 files as
> >>described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.
> >>
> >>For me this happens on both Windows 7 and Windows Server 2008.
> >>
> >>The interesting (new?) tidbit:
> >>
> >>In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to
> >>cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.
> >>
> >>As this is the only change I made: Is it possible that the problem is
> >>not an issue with tar (as discussed on the mailing list) but in fact a
> >>regression in cygwin1.dll?
> >
> >I never saw this happen.  Therefore, somebody actually seeing this
> >problem has to debug it.  The least I need is an strace.
> 
> I'm not sure it needs further debugging.  The patch to tar given in
> 
>   http://cygwin.com/ml/cygwin/2011-04/msg00385.html
> 
> solves the problem.  Eric said (in the next message) that he will
> apply this to the next release of tar.  In the meantime, it's easy
> to download the tar source, apply the patch, and rebuild.

Thanks for the reminder.  That's one problem less to worry about :}


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: untarring symlinks with ../ fails randomly
  2011-06-30 15:05   ` Ken Brown
  2011-06-30 15:26     ` Corinna Vinschen
@ 2011-06-30 15:28     ` Wolf Geldmacher
  2011-07-04  9:16       ` untarring symlinks with ../ fails randomly, silghtly OT Wolf Geldmacher
  1 sibling, 1 reply; 22+ messages in thread
From: Wolf Geldmacher @ 2011-06-30 15:28 UTC (permalink / raw)
  To: cygwin

[-- Attachment #1: Type: text/plain, Size: 2063 bytes --]

On Thu, 2011-06-30 at 11:05 -0400, Ken Brown wrote:
> On 6/30/2011 9:37 AM, Corinna Vinschen wrote:
> > On Jun 30 14:43, Wolf Geldmacher wrote:
> >> Dear all,
> >>
> >> just joining after being hit by an obviously known issue:
> >>
> >> Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in
> >> symbolic links being randomly substituted by zero length mode 0 files as
> >> described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.
> >>
> >> For me this happens on both Windows 7 and Windows Server 2008.
> >>
> >> The interesting (new?) tidbit:
> >>
> >> In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to
> >> cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.
> >>
> >> As this is the only change I made: Is it possible that the problem is
> >> not an issue with tar (as discussed on the mailing list) but in fact a
> >> regression in cygwin1.dll?
> >
> > I never saw this happen.  Therefore, somebody actually seeing this
> > problem has to debug it.  The least I need is an strace.
> 
> I'm not sure it needs further debugging.  The patch to tar given in
> 
>    http://cygwin.com/ml/cygwin/2011-04/msg00385.html
> 
> solves the problem.  Eric said (in the next message) that he will apply 
> this to the next release of tar.  In the meantime, it's easy to download 
> the tar source, apply the patch, and rebuild.

I'm afraid this might only address a symptom rather than fix the root cause.

Anyway, please find attached two (rather big and therefore gzip'd) script
files with:
	- cygcheck -srv output,
	- strace of the tar command (succeeding for 1.7.8, failing
	  for 1.7.9),
	- ls -lR of the resulting directory hierarchy
of a tar archive that has a src directory with 100 files and a tgt
directory with symlinks to the files in ../src. The archive was
created via:

 # Build a tree with 100 files in src and matching symlinks in tgt.
 mkdir symlinks symlinks/src symlinks/tgt
 cd symlinks/src
 i=0
 while [ "$i" -lt 100 ]; do
    j=$(printf '%03d' "$i"); echo "$j" > "$j"; i=$((i+1))
 done
 cd ../tgt
 ln -s ../src/* .
 cd ../..
 tar czvf symlinks.tar.gz symlinks
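
A minimal sketch of how an extracted copy of such an archive could be
checked for the failure, assuming the layout described above (every entry
in tgt should be a symlink). The function name count_non_symlinks is
hypothetical and not part of the attached scripts; it simply counts the
entries in a directory that are not symbolic links, which is where the
zero-length mode 0 files would show up.

```shell
# Hypothetical helper (an assumption, not from the attached scripts):
# print the number of entries in directory $1 that are NOT symbolic links.
count_non_symlinks() {
    dir=$1
    count=0
    for f in "$dir"/*; do
        # Skip the literal, unexpanded pattern left by an empty directory;
        # keep dangling symlinks, which fail -e but pass -L.
        [ -e "$f" ] || [ -L "$f" ] || continue
        # Anything that is not a symlink counts as a failed extraction.
        [ -L "$f" ] || count=$((count + 1))
    done
    echo "$count"
}
```

After extracting with "tar xzf symlinks.tar.gz" into an empty directory,
running "count_non_symlinks symlinks/tgt" should print 0 on a good run;
any other number would indicate replaced symlinks.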

Regards,
Wolf

[-- Attachment #2: typescript-1.7.8.1.gz --]
[-- Type: application/x-gzip, Size: 232222 bytes --]

[-- Attachment #3: typescript-1.7.9.1.gz --]
[-- Type: application/x-gzip, Size: 150918 bytes --]


* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-06-30 15:28     ` Wolf Geldmacher
@ 2011-07-04  9:16       ` Wolf Geldmacher
  2011-07-04 10:47         ` Corinna Vinschen
  0 siblings, 1 reply; 22+ messages in thread
From: Wolf Geldmacher @ 2011-07-04  9:16 UTC (permalink / raw)
  To: cygwin

As an aside:
	I also used to have some trouble with "rm -rf" of a directory
	hierarchy failing more or less reproducibly (like: 80% of the
	time) because files were presumably still "in use". Repeating
	the command several times would succeed, though.

	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
	seems to have solved that issue as well - still have to see
	the first "retry to delete".

This may or may not be related to the original report, as it also reeks
of a race condition during file/directory operations.

Cheers,
Wolf



* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04  9:16       ` untarring symlinks with ../ fails randomly, silghtly OT Wolf Geldmacher
@ 2011-07-04 10:47         ` Corinna Vinschen
  2011-07-04 10:56           ` Ryan Johnson
                             ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-04 10:47 UTC (permalink / raw)
  To: cygwin

On Jul  4 11:15, Wolf Geldmacher wrote:
> As an aside:
> 	I also used to have some trouble with "rm -rf" of a directory
> 	hierarchy failing more or less reproducibly (like: 80% of the
> 	time) because files were presumably still "in use". Repeating
> 	the command several times would succeed, though.
> 
> 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> 	seems to have solved that issue as well - still have to see
> 	the first "retry to delete".
> 
> This may or may not be related to the original report, as it also reeks
> of a race condition during file/directory operations.

I can neither reproduce the tar problem, nor can I reproduce the rm
problem.  I tried this under 2008R2 which is basically the same as your
W7-64 bit.  I used local and remote drives to test the issue but to no
avail.

Are you sure this isn't a BLODA problem which is triggered by the
changes in 1.7.9?

I just took a look through the changes between 1.7.8 and 1.7.9, and
the list of changes which affect filesystem access is pretty small:

2011-03-14  Corinna Vinschen  <...>

        * fhandler_disk_file.cc (fhandler_base::fstat_by_handle): Only use
        file id as inode number if it masters the isgood_inode check.

2011-03-08  Corinna Vinschen  <...>

        * fhandler.cc (fhandler_base::open): When creating a file on a
        filesystem supporting ACLs, create the file with WRITE_DAC access.
        Explain why.
        * fhandler_disk_file.cc (fhandler_disk_file::mkdir): Ditto for
        directories.
        * fhandler_socket.cc (fhandler_socket::bind): Ditto for sockets.
        * path.cc (symlink_worker): Ditto for symlinks.
        * security.cc (get_file_sd): Always call GetSecurityInfo for directories
        on XP and Server 2003.  Improve comment to explain why.

So, is it possible that the request for WRITE_DAC access in the call to
NtCreateFile triggers some hiccup of your virus checker?  It could easily
explain both effects.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 10:47         ` Corinna Vinschen
@ 2011-07-04 10:56           ` Ryan Johnson
  2011-07-04 11:34             ` Corinna Vinschen
  2011-07-04 12:00           ` Wolf Geldmacher
  2011-07-05 12:12           ` Corinna Vinschen
  2 siblings, 1 reply; 22+ messages in thread
From: Ryan Johnson @ 2011-07-04 10:56 UTC (permalink / raw)
  To: cygwin

On 04/07/2011 6:46 AM, Corinna Vinschen wrote:
> On Jul  4 11:15, Wolf Geldmacher wrote:
>> As an aside:
>> 	I also used to have some trouble with "rm -rf" of a directory
>> 	hierarchy failing more or less reproducibly (like: 80% of the
>> 	time) because files were presumably still "in use". Repeating
>> 	the command several times would succeed, though.
>>
>> 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
>> 	seems to have solved that issue as well - still have to see
>> 	the first "retry to delete".
>>
>> This may or may not be related to the original report, as it also reeks
>> of a race condition during file/directory operations.
> I can neither reproduce the tar problem, nor can I reproduce the rm
> problem.  I tried this under 2008R2 which is basically the same as your
> W7-64 bit.  I used local and remote drives to test the issue but to no
> avail.
>
> Are you sure this isn't a BLODA problem which is triggered by the
> changes in 1.7.9?
>
> I just took a look through the changes between 1.7.8 and 1.7.9, and
> the list of changes which affect filesystem access is pretty small:
>
> [snip]
>
> So, is it possible that the request for WRITE_DAC access in the call to
> NtCreateFile triggers some hiccup of your virus checker?  It could easily
> explain both effects.
I have also seen the rm -rf problem occasionally on my w7-64 machine, 
and I don't think anything from BLODA is installed.

However, I haven't noticed the issue since disabling the search indexer 
on my machine. I did this on the hunch that I often delete large 
directory trees which aren't very old (e.g. after untar/configure/make 
of some source package), and that it wouldn't be a big surprise if 
indexing and cygwin's rm don't mix for whatever reason.

Thoughts?
Ryan



* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 10:56           ` Ryan Johnson
@ 2011-07-04 11:34             ` Corinna Vinschen
  2011-07-04 11:55               ` Corinna Vinschen
                                 ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-04 11:34 UTC (permalink / raw)
  To: cygwin

On Jul  4 06:56, Ryan Johnson wrote:
> On 04/07/2011 6:46 AM, Corinna Vinschen wrote:
> >On Jul  4 11:15, Wolf Geldmacher wrote:
> >>As an aside:
> >>	I also used to have some trouble with "rm -rf" of a directory
> >>	hierarchy failing more or less reproducibly (like: 80% of the
> >>	time) because files were presumably still "in use". Repeating
> >>	the command several times would succeed, though.
> >>
> >>	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> >>	seems to have solved that issue as well - still have to see
> >>	the first "retry to delete".
> >>
> >>This may or may not be related to the original report, as it also reeks
> >>of a race condition during file/directory operations.
> >I can neither reproduce the tar problem, nor can I reproduce the rm
> >problem.  I tried this under 2008R2 which is basically the same as your
> >W7-64 bit.  I used local and remote drives to test the issue but to no
> >avail.
> >
> >Are you sure this isn't a BLODA problem which is triggered by the
> >changes in 1.7.9?
> >
> >I just took a look through the changes between 1.7.8 and 1.7.9, and
> >the list of changes which affect filesystem access is pretty small:
> >
> >[snip]
> >
> >So, is it possible that the request for WRITE_DAC access in the call to
> >NtCreateFile triggers some hiccup of your virus checker?  It could easily
> >explain both effects.
> I have also seen the rm -rf problem occasionally on my w7-64
> machine, and I don't think anything from BLODA is installed.

Also with 1.7.8?  Given the minor number of FS-related changes, it's
so very unlikely that they would cause a difference between 1.7.8 and
1.7.9.

> However, I haven't noticed the issue since disabling the search
> indexer on my machine. I did this on the hunch that I often delete
> large directory trees which aren't very old (e.g. after
> untar/configure/make of some source package), and that it wouldn't
> be a big surprise if indexing and cygwin's rm don't mix for whatever
> reason.

Hard to imagine that setting the WRITE_DAC flag would interfere with the
search indexer.  On second thought, the flag is only set if a file does
not exist yet and NtCreateFile gets called to create the file.  That
makes it especially unlikely that this would affect unlinking.

However, given that you can reproduce the issue, could you test the
scenario again?  If the issue occurs, can you disable the following code
in fhandler.cc and see if it changes anything?

  else if (!exists () && has_acls ())
    /* If we are about to create the file and the filesystem supports
       ACLs, we will overwrite the DACL after the call to NtCreateFile.
       This requires a handle with additional WRITE_DAC access,
       otherwise set_file_sd has to open the file again. */
    access |= WRITE_DAC;
 

Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 11:34             ` Corinna Vinschen
@ 2011-07-04 11:55               ` Corinna Vinschen
  2011-07-04 12:07               ` Wolf Geldmacher
  2011-07-04 12:22               ` Ryan Johnson
  2 siblings, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-04 11:55 UTC (permalink / raw)
  To: cygwin

On Jul  4 13:33, Corinna Vinschen wrote:
> On Jul  4 06:56, Ryan Johnson wrote:
> > I have also seen the rm -rf problem occasionally on my w7-64
> > machine, and I don't think anything from BLODA is installed.
> 
> Also with 1.7.8?  Given the minor number of FS-related changes, it's
> so very unlikely that they would cause a difference between 1.7.8 and
> 1.7.9.

Btw., just because a piece of software is not on the BLODA list, doesn't
mean it's not a potential BLODA...


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 10:47         ` Corinna Vinschen
  2011-07-04 10:56           ` Ryan Johnson
@ 2011-07-04 12:00           ` Wolf Geldmacher
  2011-07-05 12:12           ` Corinna Vinschen
  2 siblings, 0 replies; 22+ messages in thread
From: Wolf Geldmacher @ 2011-07-04 12:00 UTC (permalink / raw)
  To: cygwin

On Mon, 2011-07-04 at 12:46 +0200, Corinna Vinschen wrote:
> On Jul  4 11:15, Wolf Geldmacher wrote:
> > As an aside:
> > 	I also used to have some trouble with "rm -rf" of a directory
> > 	hierarchy failing more or less reproducibly (like: 80% of the
> > 	time) because files were presumably still "in use". Repeating
> > 	the command several times would succeed, though.
> > 
> > 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > 	seems to have solved that issue as well - still have to see
> > 	the first "retry to delete".
> > 
> > This may or may not be related to the original report, as it also reeks
> > of a race condition during file/directory operations.
> 
> I can neither reproduce the tar problem, nor can I reproduce the rm
> problem.  I tried this under 2008R2 which is basically the same as your
> W7-64 bit.  I used local and remote drives to test the issue but to no
> avail.
> 
> Are you sure this isn't a BLODA problem which is triggered by the
> changes in 1.7.9?
> 
> I just took a look through the changes between 1.7.8 and 1.7.9, and
> the list of changes which affect filesystem access is pretty small:
> 
> 2011-03-14  Corinna Vinschen  <...>
> 
>         * fhandler_disk_file.cc (fhandler_base::fstat_by_handle): Only use
>         file id as inode number if it masters the isgood_inode check.
> 
> 2011-03-08  Corinna Vinschen  <...>
> 
>         * fhandler.cc (fhandler_base::open): When creating a file on a
>         filesystem supporting ACLs, create the file with WRITE_DAC access.
>         Explain why.
>         * fhandler_disk_file.cc (fhandler_disk_file::mkdir): Ditto for
>         directories.
>         * fhandler_socket.cc (fhandler_socket::bind): Ditto for sockets.
>         * path.cc (symlink_worker): Ditto for symlinks.
>         * security.cc (get_file_sd): Always call GetSecurityInfo for directories
>         on XP and Server 2003.  Improve comment to explain why.
> 
> So, is it possible that the request for WRITE_DAC access in the call to
> NtCreateFile triggers some hiccup of your virus checker?  It could easily
> explain both effects.
The machines I'm observing this on are not running anti virus (or any of
the listed BLODA) software as they are used internally as build
(compile) only servers, but (as I just found out) do run indexing.
Turned indexing off and will go back to 1.7.9.1 on one of the machines
to check.

What may also be different is that these machines are virtual machines
running on an ESX server with disks on a SAN, which results in disk
access times being almost comparable to SSD times.

> Corinna
> 




* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 11:34             ` Corinna Vinschen
  2011-07-04 11:55               ` Corinna Vinschen
@ 2011-07-04 12:07               ` Wolf Geldmacher
  2011-07-04 12:22               ` Ryan Johnson
  2 siblings, 0 replies; 22+ messages in thread
From: Wolf Geldmacher @ 2011-07-04 12:07 UTC (permalink / raw)
  To: cygwin

On Mon, 2011-07-04 at 13:33 +0200, Corinna Vinschen wrote:
> On Jul  4 06:56, Ryan Johnson wrote:
> > On 04/07/2011 6:46 AM, Corinna Vinschen wrote:
> > >On Jul  4 11:15, Wolf Geldmacher wrote:
> > >>As an aside:
> > >>	I also used to have some trouble with "rm -rf" of a directory
> > >>	hierarchy failing more or less reproducibly (like: 80% of the
> > >>	time) because files were presumably still "in use". Repeating
> > >>	the command several times would succeed, though.
> > >>
> > >>	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > >>	seems to have solved that issue as well - still have to see
> > >>	the first "retry to delete".
> > >>
> > >>This may or may not be related to the original report, as it also reeks
> > >>of a race condition during file/directory operations.
> > >I can neither reproduce the tar problem, nor can I reproduce the rm
> > >problem.  I tried this under 2008R2 which is basically the same as your
> > >W7-64 bit.  I used local and remote drives to test the issue but to no
> > >avail.
> > >
> > >Are you sure this isn't a BLODA problem which is triggered by the
> > >changes in 1.7.9?
> > >
> > >I just took a look through the changes between 1.7.8 and 1.7.9, and
> > >the list of changes which affect filesystem access is pretty small:
> > >
> > >[snip]
> > >
> > >So, is it possible that the request for WRITE_DAC access in the call to
> > >NtCreateFile triggers some hiccup of your virus checker?  It could easily
> > >explain both effects.
> > I have also seen the rm -rf problem occasionally on my w7-64
> > machine, and I don't think anything from BLODA is installed.
> 
> Also with 1.7.8?  Given the minor number of FS-related changes, it's
> > so very unlikely that they would cause a difference between 1.7.8 and
> 1.7.9.
> 
> > However, I haven't noticed the issue since disabling the search
> > indexer on my machine. I did this on the hunch that I often delete
> > large directory trees which aren't very old (e.g. after
> > untar/configure/make of some source package), and that it wouldn't
> > be a big surprise if indexing and cygwin's rm don't mix for whatever
> > reason.
> 
> Hard to imagine that setting the WRITE_DAC flag would interfere with the
> search indexer.  On second thought, the flag is only set if a file does
> not exist yet and NtCreateFile gets called to create the file.  That
> makes it especially unlikely that this would affect unlinking.
> 
> However, given that you can reproduce the issue, could you test the
> scenario again?  If the issue occurs, can you disable the following code
> in fhandler.cc and see if it changes anything?
> 
>   else if (!exists () && has_acls ())
>     /* If we are about to create the file and the filesystem supports
>        ACLs, we will overwrite the DACL after the call to NtCreateFile.
>        This requires a handle with additional WRITE_DAC access,
>        otherwise set_file_sd has to open the file again. */
>     access |= WRITE_DAC;
>  
> 
> Thanks,
> Corinna

If turning off indexing (which is not really necessary for a machine in a
build farm anyway) does not result in any change, I'll try your suggestion.

Thanks for your support!

Wolf




* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 11:34             ` Corinna Vinschen
  2011-07-04 11:55               ` Corinna Vinschen
  2011-07-04 12:07               ` Wolf Geldmacher
@ 2011-07-04 12:22               ` Ryan Johnson
  2011-07-04 14:21                 ` Corinna Vinschen
  2011-07-04 15:05                 ` Ryan Johnson
  2 siblings, 2 replies; 22+ messages in thread
From: Ryan Johnson @ 2011-07-04 12:22 UTC (permalink / raw)
  To: cygwin

On 04/07/2011 7:33 AM, Corinna Vinschen wrote:
> On Jul  4 06:56, Ryan Johnson wrote:
>> On 04/07/2011 6:46 AM, Corinna Vinschen wrote:
>>> On Jul  4 11:15, Wolf Geldmacher wrote:
>>>> As an aside:
>>>> 	I also used to have some trouble with "rm -rf" of a directory
>>>> 	hierarchy failing more or less reproducibly (like: 80% of the
>>>> 	time) because files were presumably still "in use". Repeating
>>>> 	the command several times would succeed, though.
>>>>
>>>> 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
>>>> 	seems to have solved that issue as well - still have to see
>>>> 	the first "retry to delete".
>>>>
>>>> This may or may not be related to the original report, as it also reeks
>>>> of a race condition during file/directory operations.
>>> I can neither reproduce the tar problem, nor can I reproduce the rm
>>> problem.  I tried this under 2008R2 which is basically the same as your
>>> W7-64 bit.  I used local and remote drives to test the issue but to no
>>> avail.
>>>
>>> Are you sure this isn't a BLODA problem which is triggered by the
>>> changes in 1.7.9?
>>>
>>> I just took a look through the changes between 1.7.8 and 1.7.9, and
>>> the list of changes which affect filesystem access is pretty small:
>>>
>>> [snip]
>>>
>>> So, is it possible that the request for WRITE_DAC access in the call to
>>> NtCreateFile triggers some hiccup of your virus checker?  It could easily
>>> explain both effects.
>> I have also seen the rm -rf problem occasionally on my w7-64
>> machine, and I don't think anything from BLODA is installed.
> Also with 1.7.8?  Given the minor number of FS-related changes, it's
> so very unlikely that they would cause a difference between 1.7.8 and
> 1.7.9.
>
>> However, I haven't noticed the issue since disabling the search
>> indexer on my machine. I did this on the hunch that I often delete
>> large directory trees which aren't very old (e.g. after
>> untar/configure/make of some source package), and that it wouldn't
>> be a big surprise if indexing and cygwin's rm don't mix for whatever
>> reason.
> Hard to imagine that setting the WRITE_DAC flag would interfere with the
> search indexer.  On second thought, the flag is only set if a file does
> not exist yet and NtCreateFile gets called to create the file.  That
> makes it especially unlikely that this would affect unlinking.
>
> However, given that you can reproduce the issue, could you test the
> scenario again?  If the issue occurs, can you disable the following code
> in fhandler.cc and see if it changes anything?
>
>   else if (!exists () && has_acls ())
>     /* If we are about to create the file and the filesystem supports
>        ACLs, we will overwrite the DACL after the call to NtCreateFile.
>        This requires a handle with additional WRITE_DAC access,
>        otherwise set_file_sd has to open the file again. */
>     access |= WRITE_DAC;
>
Sorry, I have no idea which version of the dll I had at the time. It was 
at least a month ago, maybe more.

However, I was wrong about not seeing the problem since. Choosing a 
random source dir to blow away:
> $ rm -rf Python-2.6.6
> rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> $ rm -rf Python-2.6.6
> $

This seems to happen more than half the time (different non-empty dir 
every time). Naturally, running under strace makes the problem go away 
(it doesn't help that strace kills stderr, where any error messages 
might have gone).

Running the following command 10x:

$ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6 || 
(echo 'Retrying...' && rm -rf Python-2.6.6))

I get six times with no error, two times with one error, one time each 
with two and three errors.

I'm currently updating and rebuilding my cygwin sources to try out your 
patch...

Ryan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 12:22               ` Ryan Johnson
@ 2011-07-04 14:21                 ` Corinna Vinschen
  2011-07-04 15:13                   ` Corinna Vinschen
  2011-07-05 12:12                   ` Corinna Vinschen
  2011-07-04 15:05                 ` Ryan Johnson
  1 sibling, 2 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-04 14:21 UTC (permalink / raw)
  To: cygwin

On Jul  4 08:21, Ryan Johnson wrote:
> However, I was wrong about not seeing the problem since. Choosing a
> random source dir to blow away:
> >$ rm -rf Python-2.6.6
> >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> >$ rm -rf Python-2.6.6
> >$
> 
> This seems to happen more than half the time (different non-empty
> dir every time). Naturally, running under strace makes the problem
> go away (it doesn't help that strace kills stderr, where any error
> messages might have gone).
> 
> Running the following command 10x:
> 
> $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6
> || (echo 'Retrying...' && rm -rf Python-2.6.6))
> 
> I get six times with no error, two times with one error, one time
> each with two and three errors.

I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS
on 2K8R2 and it just works.  In a VM.

> I'm currently updating and rebuilding my cygwin sources to try out
> your patch...

Thanks.  I'm just installing W7 64 on real hardware and I'll try again
as soon as the boring install-windows/update-windows/configure-windows/
install-cygwin/configure-cygwin process is finished.


Corinna


(*) I used Python-2.6.5.tar.bz2, not 2.6.6, but I can't believe that's 
    the reason I don't see the problem...

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 12:22               ` Ryan Johnson
  2011-07-04 14:21                 ` Corinna Vinschen
@ 2011-07-04 15:05                 ` Ryan Johnson
  2011-07-04 21:44                   ` Mark Geisert
  1 sibling, 1 reply; 22+ messages in thread
From: Ryan Johnson @ 2011-07-04 15:05 UTC (permalink / raw)
  To: cygwin

On 04/07/2011 8:21 AM, Ryan Johnson wrote:
> On 04/07/2011 7:33 AM, Corinna Vinschen wrote:
>> On Jul  4 06:56, Ryan Johnson wrote:
>>> On 04/07/2011 6:46 AM, Corinna Vinschen wrote:
>>>> On Jul  4 11:15, Wolf Geldmacher wrote:
>>>>> As an aside:
>>>>>     I also used to have some trouble with "rm -rf" of a directory
>>>>>     hierarchy failing more or less reproducibly (like: 80% of the
>>>>>     time) because files were presumably still "in use". Repeating
>>>>>     the command several times would succeed, though.
>>>>>
>>>>>     Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
>>>>>     seems to have solved that issue as well - still have to see
>>>>>     the first "retry to delete".
>>>>>
>>>>> This may or may not be related to the original report, as it also reeks
>>>>> of a race condition during file/directory operations.
>>>> I can neither reproduce the tar problem, nor can I reproduce the rm
>>>> problem.  I tried this under 2008R2 which is basically the same as your
>>>> W7-64 bit.  I used local and remote drives to test the issue but to no
>>>> avail.
>>>>
>>>> Are you sure this isn't a BLODA problem which is triggered by the
>>>> changes in 1.7.9?
>>>>
>>>> I just took a look through the changes between 1.7.8 and 1.7.9, and
>>>> the list of changes which affect filesystem access is pretty small:
>>>>
>>>> [snip]
>>>>
>>>> So, is it possible that the request for WRITE_DAC access in the call to
>>>> NtCreateFile triggers some hiccup of your virus checker?  It could easily
>>>> explain both effects.
>>> I have also seen the rm -rf problem occasionally on my w7-64
>>> machine, and I don't think anything from BLODA is installed.
>> Also with 1.7.8?  Given the minor number of FS-related changes, it's
>> so very unlikely that they would cause a difference between 1.7.8 and
>> 1.7.9.
>>
>>> However, I haven't noticed the issue since disabling the search
>>> indexer on my machine. I did this on the hunch that I often delete
>>> large directory trees which aren't very old (e.g. after
>>> untar/configure/make of some source package), and that it wouldn't
>>> be a big surprise if indexing and cygwin's rm don't mix for whatever
>>> reason.
>> Hard to imagine that setting the WRITE_DAC flag would interfere with the
>> search indexer.  On second thought, the flag is only set if a file does
>> not exist yet and NtCreateFile gets called to create the file.  That
>> makes it especially unlikely that this would affect unlinking.
>>
>> However, given that you can reproduce the issue, could you test the
>> scenario again?  If the issue occurs, can you disable the following code
>> in fhandler.cc and see if it changes anything?
>>
>>   else if (!exists () && has_acls ())
>>     /* If we are about to create the file and the filesystem supports
>>        ACLs, we will overwrite the DACL after the call to NtCreateFile.
>>        This requires a handle with additional WRITE_DAC access,
>>        otherwise set_file_sd has to open the file again. */
>>     access |= WRITE_DAC;
>>
> Sorry, I have no idea which version of the dll I had at the time. It 
> was at least a month ago, maybe more.
>
> However, I was wrong about not seeing the problem since. Choosing a 
> random source dir to blow away:
>> $ rm -rf Python-2.6.6
>> rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
>> $ rm -rf Python-2.6.6
>> $
>
> This seems to happen more than half the time (different non-empty dir 
> every time). Naturally, running under strace makes the problem go away 
> (it doesn't help that strace kills stderr, where any error messages 
> might have gone).
>
> Running the following command 10x:
>
> $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6 || 
> (echo 'Retrying...' && rm -rf Python-2.6.6))
>
> I get six times with no error, two times with one error, one time each 
> with two and three errors.
>
> I'm currently updating and rebuilding my cygwin sources to try out 
> your patch...
Updated, built, and reproduced, with and without the patch. If anything 
it's more common in my dev build -- it happened on the first try both times.

Any idea of how to debug this? We need some instantaneous version of 
lsof or something...

Ryan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 14:21                 ` Corinna Vinschen
@ 2011-07-04 15:13                   ` Corinna Vinschen
  2011-07-05 12:12                   ` Corinna Vinschen
  1 sibling, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-04 15:13 UTC (permalink / raw)
  To: cygwin

On Jul  4 16:20, Corinna Vinschen wrote:
> On Jul  4 08:21, Ryan Johnson wrote:
> > However, I was wrong about not seeing the problem since. Choosing a
> > random source dir to blow away:
> > >$ rm -rf Python-2.6.6
> > >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> > >$ rm -rf Python-2.6.6
> > >$
> > 
> > This seems to happen more than half the time (different non-empty
> > dir every time). Naturally, running under strace makes the problem
> > go away (it doesn't help that strace kills stderr, where any error
> > messages might have gone).
> > 
> > Running the following command 10x:
> > 
> > $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6
> > || (echo 'Retrying...' && rm -rf Python-2.6.6))
> > 
> > I get six times with no error, two times with one error, one time
> > each with two and three errors.
> 
> I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS
> on 2K8R2 and it just works.  In a VM.

Btw., I would like to stress again that Cygwin does *not* lock files
it opens, except in very rare circumstances.  It always opens files
with all sharing flags set, except in these scenarios:

1. @file handling, file is opened w/ FILE_SHARE_READ only.

2. rename(): Omit FILE_SHARE_DELETE flag on Samba to avoid
   STATUS_ACCESS_DENIED if file has the DOS R/O attribute set.

3. unlink(): Tries to open with FILE_SHARE_DELETE only to check for
   files-in-use.  If that works, the file is deleted anyway.  If not,
   it retries to open with all sharing flags set.

4. On exit, if a DLL can't be found, the executable is opened without
   FILE_SHARE_DELETE to scan for DLLs.

5. When exec'ing a file, it's potentially tested for being a script.
   If so, the FILE_SHARE_DELETE is omitted.

I'm going to change 1, 4, and 5, but they can't be the culprit for what
you see.

If a file can't be removed, it's typically a non-Cygwin process holding
a handle to the file with file sharing set to 0.  Consider that a Cygwin
process opens the file with all sharing flags set, so removing the file
will at least work by moving it to the trashcan.  Well, except on remote
drives, that is, because we don't even know if a trashcan is available
on the remote drive and even if so, most of the time it's not accessible
from remote.
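
As a rough illustration of the sharing argument above, here is a deliberately
simplified Python model.  The names Share, ALL and can_open_for_delete are made
up for this sketch and are not Cygwin or Windows APIs; the only rule modelled is
that a new open asking for delete access succeeds only if every handle already
open on the file was opened with delete sharing.

```python
from enum import Flag, auto


class Share(Flag):
    """Hypothetical stand-in for the Windows FILE_SHARE_* bits."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    DELETE = auto()


# "All sharing flags set", which is how the text says Cygwin normally opens files.
ALL = Share.READ | Share.WRITE | Share.DELETE


def can_open_for_delete(existing_share_modes):
    """True if every handle already open on the file permits delete sharing."""
    return all(Share.DELETE in mode for mode in existing_share_modes)


# Two handles opened with all sharing flags do not block a delete.
print(can_open_for_delete([ALL, ALL]))
# One handle opened with sharing set to 0 is enough to block it.
print(can_open_for_delete([ALL, Share.NONE]))
```

In this model only the handle opened with sharing set to 0 gets in the way,
which is why a non-Cygwin process is the usual suspect.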


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 15:05                 ` Ryan Johnson
@ 2011-07-04 21:44                   ` Mark Geisert
  0 siblings, 0 replies; 22+ messages in thread
From: Mark Geisert @ 2011-07-04 21:44 UTC (permalink / raw)
  To: cygwin

> Any idea of how to debug this? We need some instantaneous version of 
> lsof or something...

Not what you asked for, but useful for debugging stuff like this: FileMon and
ProcessMonitor from Sysinternals.com (now a MS site).  Just in case you haven't
run across them before...

..mark



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 14:21                 ` Corinna Vinschen
  2011-07-04 15:13                   ` Corinna Vinschen
@ 2011-07-05 12:12                   ` Corinna Vinschen
  1 sibling, 0 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-05 12:12 UTC (permalink / raw)
  To: cygwin

On Jul  4 16:20, Corinna Vinschen wrote:
> On Jul  4 08:21, Ryan Johnson wrote:
> > However, I was wrong about not seeing the problem since. Choosing a
> > random source dir to blow away:
> > >$ rm -rf Python-2.6.6
> > >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> > >$ rm -rf Python-2.6.6
> > >$
> > 
> > This seems to happen more than half the time (different non-empty
> > dir every time). Naturally, running under strace makes the problem
> > go away (it doesn't help that strace kills stderr, where any error
> > messages might have gone).
> > 
> > Running the following command 10x:
> > 
> > $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6
> > || (echo 'Retrying...' && rm -rf Python-2.6.6))
> > 
> > I get six times with no error, two times with one error, one time
> > each with two and three errors.
> 
> I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS
> on 2K8R2 and it just works.  In a VM.
> 
> > I'm currently updating and rebuilding my cygwin sources to try out
> > your patch...
> 
> Thanks.  I'm just installing W7 64 on real hardware and I'll try again
> as soon as the boring install-windows/update-windows/configure-windows/
> install-cygwin/configure-cygwin process is finished.

Nope, I still can't reproduce the rm problem.  Works still fine for me,
on real W7 64 hardware.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-04 10:47         ` Corinna Vinschen
  2011-07-04 10:56           ` Ryan Johnson
  2011-07-04 12:00           ` Wolf Geldmacher
@ 2011-07-05 12:12           ` Corinna Vinschen
  2011-07-05 12:21             ` Ryan Johnson
  2011-07-05 15:53             ` Wolf Geldmacher
  2 siblings, 2 replies; 22+ messages in thread
From: Corinna Vinschen @ 2011-07-05 12:12 UTC (permalink / raw)
  To: cygwin

On Jul  4 12:46, Corinna Vinschen wrote:
> On Jul  4 11:15, Wolf Geldmacher wrote:
> > As an aside:
> > 	I also used to have some trouble with "rm -rf" of a directory
> > 	hierarchy failing more or less reproducibly (like: 80% of the
> > 	time) because files were presumably still "in use". Repeating
> > 	the command several times would succeed, though.
> > 
> > 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > 	seems to have solved that issue as well - still have to see
> > 	the first "retry to delete".
> > 
> > This may or may not be related to the original report, as it also reeks
> > of a race condition during file/directory operations.
> 
> I can neither reproduce the tar problem, nor can I reproduce the rm
> problem.  I tried this under 2008R2 which is basically the same as your
> W7-64 bit.  I used local and remote drives to test the issue but to no
> avail.

Finally I managed to reproduce the problem and now I see what happens.

Windows does not write back the file change timestamp unless the file
buffers are flushed.  This usually occurs at close time.  In contrast to
POSIX specifications the timestamps are *not* automatically updated when
a call to fetch file metadata is performed.

Here's what tar does when creating the symlink:

  1. create file with 000 permissions
  2. fstat
  3. close file
  [...]
  4. stat file
  5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been
     overwritten.

The problem is that the call to fstat on the opened handle gets some
value of the change time timestamp, but the subsequent close changes
the timestamp again.
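
The five steps above can be sketched in Python roughly as follows.  This is not
tar's actual code, only the same sequence of calls, and the helper name
placeholder_unchanged is made up for the sketch.  On a system that behaves the
way POSIX expects, the two change times agree; the failure described here is
the case where they do not.

```python
import os
import tempfile


def placeholder_unchanged(path):
    """Create a mode-000 placeholder and re-check its change time by name."""
    # Step 1: create the file with 000 permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o000)
    try:
        # Step 2: fstat on the still-open descriptor.
        at_create = os.fstat(fd)
    finally:
        # Step 3: close the file.
        os.close(fd)
    # Step 4: stat the file by name later on.
    later = os.stat(path)
    # Step 5: a differing change time is read as "placeholder was overwritten".
    return at_create.st_ctime_ns == later.st_ctime_ns


with tempfile.TemporaryDirectory() as tmp:
    print("placeholder unchanged:",
          placeholder_unchanged(os.path.join(tmp, "placeholder")))
```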

Speculation: It seems that the timestamp fstat sees is the timestamp
created at the time NtCreateFile is called, while the timestamp from the
call to NtSetSecurityFile to change the DACL is cached and only updated
when calling NtClose.

This also explains why this doesn't occur in 1.7.8.  In 1.7.8, the DACL
has been written using another file handle, because the original handle
didn't have the right to change the DACL.  By adding the WRITE_DAC flag,
I allowed Cygwin to use the original file handle to write the DACL.  The
difference is:

1.7.8:

  - create file
  -   open file for writing the DACL
  -   write DACL
  -   close
  - do whatever the original handle was opened for
  - close

1.7.9:

  - create file
  - write DACL
  - do whatever the original handle was opened for
  - close

So, with 1.7.9 the close call after writing the DACL is missing, which
accounts for the missing flushing of the file metadata.
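
To make the difference between the two sequences concrete, here is a toy model
in Python.  It encodes only the speculation above (a change made through a
handle stays cached until that handle is closed) and says nothing about what
NTFS really does internally.  ToyFile and both sequence functions are
hypothetical names invented for the sketch.

```python
class ToyFile:
    """Toy file whose change time is written back only when a handle closes."""

    def __init__(self):
        self.clock = 0
        self.ctime = self._tick()   # change time visible to fstat and stat
        self.pending = {}           # handle -> change time not yet written back
        self.handles = 0

    def _tick(self):
        self.clock += 1
        return self.clock

    def open(self):
        self.handles += 1
        return self.handles

    def write_dacl(self, handle):
        # Assumption: the new change time is cached on the handle.
        self.pending[handle] = self._tick()

    def close(self, handle):
        # Assumption: closing the handle writes the cached change time back.
        if handle in self.pending:
            self.ctime = self.pending.pop(handle)

    def stat(self):
        return self.ctime


def sequence_1_7_8():
    f = ToyFile()
    original = f.open()      # create file
    extra = f.open()         # open file for writing the DACL
    f.write_dacl(extra)      # write DACL
    f.close(extra)           # close: change time is written back here
    seen_by_fstat = f.stat() # fstat on the original handle
    f.close(original)
    return seen_by_fstat == f.stat()


def sequence_1_7_9():
    f = ToyFile()
    original = f.open()      # create file
    f.write_dacl(original)   # write DACL through the same handle
    seen_by_fstat = f.stat() # fstat still sees the creation-time value
    f.close(original)        # close: change time moves only now
    return seen_by_fstat == f.stat()


print("1.7.8 timestamps match:", sequence_1_7_8())
print("1.7.9 timestamps match:", sequence_1_7_9())
```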

By calling FlushFileBuffers in fstat before calling NtQueryInformationFile
I can fix the problem.  Unfortunately that slows down applications like tar,
which use fstat a lot, a lot.

There are two solutions: either revert to the 1.7.8 state, which means
that writing the DACL requires opening the file again, or call
FlushFileBuffers in fstat.
I compared both solutions.  On my hardware, calling tar xzf on your file
is 500% slower if fstat calls FlushFileBuffers compared to just dropping
the WRITE_DAC flag from the open call.  Wow!  Imagine that I added the
WRITE_DAC flag to gain performance...
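
For anyone who wants to get a feeling for the cost of flushing before every
metadata query, here is a small Python sketch.  It uses os.fsync as a POSIX
stand-in for FlushFileBuffers, which is an assumption of the sketch and not what
Cygwin does, so the numbers will not match the tar measurement above.  The
helper name time_fstat is made up.

```python
import os
import tempfile
import time


def time_fstat(fd, rounds, flush_first):
    """Return seconds spent on `rounds` fstat calls, optionally flushing first."""
    start = time.perf_counter()
    for _ in range(rounds):
        if flush_first:
            os.fsync(fd)   # stand-in for FlushFileBuffers (assumption)
        os.fstat(fd)
    return time.perf_counter() - start


with tempfile.TemporaryFile() as f:
    f.write(b"x" * 4096)
    f.flush()
    plain = time_fstat(f.fileno(), 200, flush_first=False)
    flushed = time_fstat(f.fileno(), 200, flush_first=True)
    print(f"fstat only: {plain:.6f}s, fsync + fstat: {flushed:.6f}s")
```

How large the gap is depends heavily on the filesystem and hardware.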

So I guess this all boils down to the fact that adding WRITE_DAC was
not really a good move.  It's a shame that Windows punishes every attempt
to speed up file operations with a rise in non-POSIXy behaviour :-(((

I changed that in CVS and right now I'm generating a new developer
snapshot on http://cygwin.com/snapshots/.  Give it a try, please.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-05 12:12           ` Corinna Vinschen
@ 2011-07-05 12:21             ` Ryan Johnson
  2011-07-05 17:02               ` Wolf Geldmacher
  2011-07-05 15:53             ` Wolf Geldmacher
  1 sibling, 1 reply; 22+ messages in thread
From: Ryan Johnson @ 2011-07-05 12:21 UTC (permalink / raw)
  To: cygwin

On 05/07/2011 8:10 AM, Corinna Vinschen wrote:
> On Jul  4 12:46, Corinna Vinschen wrote:
>> On Jul  4 11:15, Wolf Geldmacher wrote:
>>> As an aside:
>>> 	I also used to have some trouble with "rm -rf" of a directory
>>> 	hierarchy failing more or less reproducibly (like: 80% of the
>>> 	time) because files were presumably still "in use". Repeating
>>> 	the command several times would succeed, though.
>>>
>>> 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
>>> 	seems to have solved that issue as well - still have to see
>>> 	the first "retry to delete".
>>>
>>> This may or may not be related to the original report, as it also reeks
>>> of a race condition during file/directory operations.
>> I can neither reproduce the tar problem, nor can I reproduce the rm
>> problem.  I tried this under 2008R2 which is basically the same as your
>> W7-64 bit.  I used local and remote drives to test the issue but to no
>> avail.
> Finally I managed to reproduce the problem and now I see what happens.
>
> Windows does not write back the file change timestamp unless the file
> buffers are flushed.  This usually occurs at close time.  In contrast to
> POSIX specifications the timestamps are *not* automatically updated when
> a call to fetch file metadata is performed.
>
> Here's what tar does when creating the symlink:
>
>    1. create file with 000 permissions
>    2. fstat
>    3. close file
>    [...]
>    4. stat file
>    5. if fstat.st_ctime != stat.st_ctime ==>  symlink placeholder has been
>       overwritten.
>
> The problem is that the call to fstat on the opened handle gets some
> value of the change time timestamp, but the subsequent close changes
> the timestamp again.
Wow. That must have been one hairy debug session... my hat goes off to you!

Ryan



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-05 12:12           ` Corinna Vinschen
  2011-07-05 12:21             ` Ryan Johnson
@ 2011-07-05 15:53             ` Wolf Geldmacher
  1 sibling, 0 replies; 22+ messages in thread
From: Wolf Geldmacher @ 2011-07-05 15:53 UTC (permalink / raw)
  To: cygwin

On Tue, 2011-07-05 at 14:10 +0200, Corinna Vinschen wrote:
> On Jul  4 12:46, Corinna Vinschen wrote:
> > On Jul  4 11:15, Wolf Geldmacher wrote:
> > > As an aside:
> > > 	I also used to have some trouble with "rm -rf" of a directory
> > > 	hierarchy failing more or less reproducibly (like: 80% of the
> > > 	time) because files were presumably still "in use". Repeating
> > > 	the command several times would succeed, though.
> > > 
> > > 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > > 	seems to have solved that issue as well - still have to see
> > > 	the first "retry to delete".
> > > 
> > > This may or may not be related to the original report, as it also reeks
> > > of a race condition during file/directory operations.
> > 
> > I can neither reproduce the tar problem, nor can I reproduce the rm
> > problem.  I tried this under 2008R2 which is basically the same as your
> > W7-64 bit.  I used local and remote drives to test the issue but to no
> > avail.
> 
> Finally I managed to reproduce the problem and now I see what happens.
> 
> Windows does not write back the file change timestamp unless the file
> buffers are flushed.  This usually occurs at close time.  In contrast to
> POSIX specifications the timestamps are *not* automatically updated when
> a call to fetch file metadata is performed.
> 
> Here's what tar does when creating the symlink:
> 
>   1. create file with 000 permissions
>   2. fstat
>   3. close file
>   [...]
>   4. stat file
>   5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been
>      overwritten.
> 
> The problem is that the call to fstat on the opened handle gets some
> value of the change time timestamp, but the subsequent close changes
> the timestamp again.
> 
> Speculation: It seems that the timestamp fstat sees is the timestamp
> created at the time NtCreateFile is called, while the timestamp from the
> call to NtSetSecurityFile to change the DACL is cached and only updated
> when calling NtClose.
> 
> This also explains why this doesn't occur in 1.7.8.  In 1.7.8, the DACL
> has been written using another file handle, because the original handle
> didn't have the right to change the DACL.  By adding the WRITE_DAC flag,
> I allowed Cygwin to use the original file handle to write the DACL.  The
> difference is:
> 
> 1.7.8:
> 
>   - create file
>   -   open file for writing the DACL
>   -   write DACL
>   -   close
>   - do whatever the original handle was opened for
>   - close
> 
> 1.7.9:
> 
>   - create file
>   - write DACL
>   - do whatever the original handle was opened for
>   - close
> 
> So, with 1.7.9 the close call after writing the DACL is missing, which
> accounts for the missing flushing of the file metadata.
> 
> By calling FlushFileBuffers in fstat before calling NtQueryInformationFile
> I can fix the problem.  Unfortunately that slows down applications like tar,
> which use fstat a lot, a lot.
> 
> There are two solutions, one is reverting to the 1.7.8 state, which
> means, writing the DACL requires to open the file again, or calling
> FlushFileBuffers in fstat.
> I compared both solutions.  On my hardware, calling tar xzf on your file
> is 500% slower if fstat calls FlushFileBuffers compared to just dropping
> the WRITE_DAC flag from the open call.  Wow!  Imagine that I added the
> WRITE_DAC flag to gain performance...
> 
> So I guess this all boils down to the fact that adding WRITE_DAC was
> not really a good move.  It's a shame that Windows punishes every try
> to speed up file operations with a raise in non-POSIXy behaviour :-(((
> 
> I changed that in CVS and right now I'm generating a new developer
> snapshot on http://cygwin.com/snapshots/.  Give it a try, please.
> 
> 
> Thanks,
> Corinna
I downloaded and installed the daily dll: I can no longer reproduce the
"failing symlink" problem at all which was 100% reproducible before. So
it looks like your diagnosis and the fix are correct. Thank you very
much for your support!

Regarding the "rm -rf failing" problem: Although I could no longer
reproduce the issue on the test machine when I downgraded to the older
dll, it *did* happen yesterday night on the nightly build with a 1.7.8
cygwin1.dll - so it seems to be unrelated to the WRITE_DAC change, which
incidentally also agrees with Ryan's test results.

Thanks again & Regards
Wolf




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: untarring symlinks with ../ fails randomly, silghtly OT
  2011-07-05 12:21             ` Ryan Johnson
@ 2011-07-05 17:02               ` Wolf Geldmacher
  0 siblings, 0 replies; 22+ messages in thread
From: Wolf Geldmacher @ 2011-07-05 17:02 UTC (permalink / raw)
  To: cygwin

On Tue, 2011-07-05 at 08:21 -0400, Ryan Johnson wrote:
> On 05/07/2011 8:10 AM, Corinna Vinschen wrote:
> > On Jul  4 12:46, Corinna Vinschen wrote:
> >> On Jul  4 11:15, Wolf Geldmacher wrote:
> >>> As an aside:
> >>> 	I also used to have some trouble with "rm -rf" of a directory
> >>> 	hierarchy failing more or less reproducibly (like: 80% of the
> >>> 	time) because files were presumably still "in use". Repeating
> >>> 	the command several times would succeed, though.
> >>>
> >>> 	Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> >>> 	seems to have solved that issue as well - still have to see
> >>> 	the first "retry to delete".
> >>>
> >>> This may or may not be related to the original report, as it also reeks
> >>> of a race condition during file/directory operations.
> >> I can neither reproduce the tar problem, nor can I reproduce the rm
> >> problem.  I tried this under 2008R2 which is basically the same as your
> >> W7-64 bit.  I used local and remote drives to test the issue but to no
> >> avail.
> > Finally I managed to reproduce the problem and now I see what happens.
> >
> > Windows does not write back the file change timestamp unless the file
> > buffers are flushed.  This usually occurs at close time.  In contrast to
> > POSIX specifications the timestamps are *not* automatically updated when
> > a call to fetch file metadata is performed.
> >
> > Here's what tar does when creating the symlink:
> >
> >    1. create file with 000 permissions
> >    2. fstat
> >    3. close file
> >    [...]
> >    4. stat file
> >    5. if fstat.st_ctime != stat.st_ctime ==>  symlink placeholder has been
> >       overwritten.
> >
> > The problem is that the call to fstat on the opened handle gets some
> > value of the change time timestamp, but the subsequent close changes
> > the timestamp again.
> Wow. That must have been one hairy debug session... my hat goes off to you!
> 
> Ryan
Definitely agree! (Where's the "Like" button?)





^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2011-07-05 17:02 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-30 12:43 untarring symlinks with ../ fails randomly Wolf Geldmacher
2011-06-30 13:37 ` Corinna Vinschen
2011-06-30 15:05   ` Ken Brown
2011-06-30 15:26     ` Corinna Vinschen
2011-06-30 15:28     ` Wolf Geldmacher
2011-07-04  9:16       ` untarring symlinks with ../ fails randomly, silghtly OT Wolf Geldmacher
2011-07-04 10:47         ` Corinna Vinschen
2011-07-04 10:56           ` Ryan Johnson
2011-07-04 11:34             ` Corinna Vinschen
2011-07-04 11:55               ` Corinna Vinschen
2011-07-04 12:07               ` Wolf Geldmacher
2011-07-04 12:22               ` Ryan Johnson
2011-07-04 14:21                 ` Corinna Vinschen
2011-07-04 15:13                   ` Corinna Vinschen
2011-07-05 12:12                   ` Corinna Vinschen
2011-07-04 15:05                 ` Ryan Johnson
2011-07-04 21:44                   ` Mark Geisert
2011-07-04 12:00           ` Wolf Geldmacher
2011-07-05 12:12           ` Corinna Vinschen
2011-07-05 12:21             ` Ryan Johnson
2011-07-05 17:02               ` Wolf Geldmacher
2011-07-05 15:53             ` Wolf Geldmacher
