* Re: untarring symlinks with ../ fails randomly
From: Wolf Geldmacher @ 2011-06-30 12:43 UTC
To: cygwin

Dear all,

just joining after being hit by an obviously known issue:

Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in symbolic links being randomly substituted by zero length mode 0 files as described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html.

For me this happens on both Windows 7 and Windows Server 2008.

The interesting (new?) tidbit:

In sheer desperation I downgraded from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 via setup.exe and the issue seems to be gone.

As this is the only change I made: Is it possible that the problem is not an issue with tar (as discussed on the mailing list) but in fact a regression in cygwin1.dll?

Regards,
Wolf

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
* Re: untarring symlinks with ../ fails randomly 2011-06-30 12:43 untarring symlinks with ../ fails randomly Wolf Geldmacher @ 2011-06-30 13:37 ` Corinna Vinschen 2011-06-30 15:05 ` Ken Brown 0 siblings, 1 reply; 22+ messages in thread From: Corinna Vinschen @ 2011-06-30 13:37 UTC (permalink / raw) To: cygwin On Jun 30 14:43, Wolf Geldmacher wrote: > Dear all, > > just joining after being hit by an obviously known issue: > > Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in > symbolic links being randomly substituted by zero length mode 0 files as > described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html. > > For me this happens on both Windows 7 and Windows Server 2008. > > The interesting (new?) tidbit: > > In sheer desparation I downgraded from cygwin1.dll/1.7.9.1 to > cygwin1.ddl/1.7.8.1 via setup.exe and the issue seems to be gone. > > As this is the only change I made: Is it possible that the problem is > not an issue with tar (as discussed on the mailing list) but in fact a > regression in cygwin1.dll? I never saw this happen. Therefore, somebody actually seeing this problem has to debug it. The least I need is an strace. Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Project Co-Leader cygwin AT cygwin DOT com Red Hat -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: untarring symlinks with ../ fails randomly 2011-06-30 13:37 ` Corinna Vinschen @ 2011-06-30 15:05 ` Ken Brown 2011-06-30 15:26 ` Corinna Vinschen 2011-06-30 15:28 ` Wolf Geldmacher 0 siblings, 2 replies; 22+ messages in thread From: Ken Brown @ 2011-06-30 15:05 UTC (permalink / raw) To: cygwin On 6/30/2011 9:37 AM, Corinna Vinschen wrote: > On Jun 30 14:43, Wolf Geldmacher wrote: >> Dear all, >> >> just joining after being hit by an obviously known issue: >> >> Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in >> symbolic links being randomly substituted by zero length mode 0 files as >> described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html. >> >> For me this happens on both Windows 7 and Windows Server 2008. >> >> The interesting (new?) tidbit: >> >> In sheer desparation I downgraded from cygwin1.dll/1.7.9.1 to >> cygwin1.ddl/1.7.8.1 via setup.exe and the issue seems to be gone. >> >> As this is the only change I made: Is it possible that the problem is >> not an issue with tar (as discussed on the mailing list) but in fact a >> regression in cygwin1.dll? > > I never saw this happen. Therefore, somebody actually seeing this > problem has to debug it. The least I need is an strace. I'm not sure it needs further debugging. The patch to tar given in http://cygwin.com/ml/cygwin/2011-04/msg00385.html solves the problem. Eric said (in the next message) that he will apply this to the next release of tar. In the meantime, it's easy to download the tar source, apply the patch, and rebuild. Ken -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: untarring symlinks with ../ fails randomly 2011-06-30 15:05 ` Ken Brown @ 2011-06-30 15:26 ` Corinna Vinschen 2011-06-30 15:28 ` Wolf Geldmacher 1 sibling, 0 replies; 22+ messages in thread From: Corinna Vinschen @ 2011-06-30 15:26 UTC (permalink / raw) To: cygwin On Jun 30 11:05, Ken Brown wrote: > On 6/30/2011 9:37 AM, Corinna Vinschen wrote: > >On Jun 30 14:43, Wolf Geldmacher wrote: > >>Dear all, > >> > >>just joining after being hit by an obviously known issue: > >> > >>Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in > >>symbolic links being randomly substituted by zero length mode 0 files as > >>described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html. > >> > >>For me this happens on both Windows 7 and Windows Server 2008. > >> > >>The interesting (new?) tidbit: > >> > >>In sheer desparation I downgraded from cygwin1.dll/1.7.9.1 to > >>cygwin1.ddl/1.7.8.1 via setup.exe and the issue seems to be gone. > >> > >>As this is the only change I made: Is it possible that the problem is > >>not an issue with tar (as discussed on the mailing list) but in fact a > >>regression in cygwin1.dll? > > > >I never saw this happen. Therefore, somebody actually seeing this > >problem has to debug it. The least I need is an strace. > > I'm not sure it needs further debugging. The patch to tar given in > > http://cygwin.com/ml/cygwin/2011-04/msg00385.html > > solves the problem. Eric said (in the next message) that he will > apply this to the next release of tar. In the meantime, it's easy > to download the tar source, apply the patch, and rebuild. Thanks for the reminder. 
That's one problem less to worry about :}

Corinna
* Re: untarring symlinks with ../ fails randomly 2011-06-30 15:05 ` Ken Brown 2011-06-30 15:26 ` Corinna Vinschen @ 2011-06-30 15:28 ` Wolf Geldmacher 2011-07-04 9:16 ` untarring symlinks with ../ fails randomly, silghtly OT Wolf Geldmacher 1 sibling, 1 reply; 22+ messages in thread From: Wolf Geldmacher @ 2011-06-30 15:28 UTC (permalink / raw) To: cygwin [-- Attachment #1: Type: text/plain, Size: 2063 bytes --] On Thu, 2011-06-30 at 11:05 -0400, Ken Brown wrote: > On 6/30/2011 9:37 AM, Corinna Vinschen wrote: > > On Jun 30 14:43, Wolf Geldmacher wrote: > >> Dear all, > >> > >> just joining after being hit by an obviously known issue: > >> > >> Running tar (in my case to extract openssl-0.9.8r.tar.gz) results in > >> symbolic links being randomly substituted by zero length mode 0 files as > >> described in http://sourceware.org/ml/cygwin/2011-04/msg00299.html. > >> > >> For me this happens on both Windows 7 and Windows Server 2008. > >> > >> The interesting (new?) tidbit: > >> > >> In sheer desparation I downgraded from cygwin1.dll/1.7.9.1 to > >> cygwin1.ddl/1.7.8.1 via setup.exe and the issue seems to be gone. > >> > >> As this is the only change I made: Is it possible that the problem is > >> not an issue with tar (as discussed on the mailing list) but in fact a > >> regression in cygwin1.dll? > > > > I never saw this happen. Therefore, somebody actually seeing this > > problem has to debug it. The least I need is an strace. > > I'm not sure it needs further debugging. The patch to tar given in > > http://cygwin.com/ml/cygwin/2011-04/msg00385.html > > solves the problem. Eric said (in the next message) that he will apply > this to the next release of tar. In the meantime, it's easy to download > the tar source, apply the patch, and rebuild. I'm afraid this might only solve a symptom and not fix the root cause. 
Anyways, please find attached two (rather big and therefore gzip'd) script files with:

- cygcheck -srv output,
- strace of the tar command (succeeding for 1.7.8, failing for 1.7.9),
- ls -lR of the resulting directory hierarchy of a tar archive that has a src directory with 100 files and a tgt directory with symlinks to the files in ../src.

The archive was created via:

    mkdir symlinks symlinks/src symlinks/tgt
    cd symlinks/src
    let i=0
    while [ $i -lt 100 ]; do
        j=`printf '%03d' $i`
        echo $j > $j
        let i=$((i+1))
    done
    cd ../tgt
    ln -s ../src/* .
    cd ../..
    tar czvf symlinks.tar.gz symlinks

Regards,
Wolf

[-- Attachment #2: typescript-1.7.8.1.gz --]
[-- Type: application/x-gzip, Size: 232222 bytes --]
[-- Attachment #3: typescript-1.7.9.1.gz --]
[-- Type: application/x-gzip, Size: 150918 bytes --]
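The failure described in this thread (symlinks replaced by zero length mode 0 files) can be checked for mechanically after extracting the test archive. The following is a minimal sketch and not part of the original message: the function name check_symlinks is hypothetical, and it assumes that every entry in the given directory is supposed to be a symbolic link, as is the case for symlinks/tgt in the archive built above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list every entry in a
# directory that exists but is NOT a symbolic link, and return non-zero
# if at least one such entry is found.
check_symlinks() {
    dir=$1
    bad=0
    for entry in "$dir"/*; do
        # -L is true only for symbolic links (dangling ones included).
        if [ ! -L "$entry" ]; then
            # An unmatched glob leaves the literal pattern behind; skip it.
            [ -e "$entry" ] || continue
            echo "not a symlink: $entry"
            bad=$((bad + 1))
        fi
    done
    echo "$bad entries are not symlinks"
    [ "$bad" -eq 0 ]
}
```

A possible use after extraction would be `tar xzf symlinks.tar.gz && check_symlinks symlinks/tgt`. A non-zero exit status would then point at entries that were not created as symlinks.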
* Re: untarring symlinks with ../ fails randomly, silghtly OT
From: Wolf Geldmacher @ 2011-07-04 9:16 UTC
To: cygwin

As an aside:

I also used to have some trouble with "rm -rf" of a directory hierarchy failing more or less reproducibly (like: 80% of the time) because files were presumably still "in use". Repeating the command several times would succeed, though.

Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 seems to have solved that issue as well - I have yet to see the first "retry to delete".

This may or may not be related to the original report, as it also reeks of a race condition during file/directory operations.

Cheers,
Wolf
* Re: untarring symlinks with ../ fails randomly, silghtly OT
From: Corinna Vinschen @ 2011-07-04 10:47 UTC
To: cygwin

On Jul 4 11:15, Wolf Geldmacher wrote:
> As an aside:
> I also used to have some trouble with "rm -rf" of a directory
> hierarchy failing more or less reproducibly (like: 80% of the
> time) because files were presumably still "in use". Repeating
> the command several times would succeed, though.
>
> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> seems to have solved that issue as well - still have to see
> the first "retry to delete".
>
> This may or may not be related to the original report, as it also reeks
> of a race condition during file/directory operations.

I can neither reproduce the tar problem, nor can I reproduce the rm problem. I tried this under 2008R2 which is basically the same as your W7-64 bit. I used local and remote drives to test the issue but to no avail.

Are you sure this isn't a BLODA problem which is triggered by the changes in 1.7.9?

I just took a look through the changes between 1.7.8 and 1.7.9, and the list of changes which affect filesystem access is pretty small:

2011-03-14  Corinna Vinschen  <...>

    * fhandler_disk_file.cc (fhandler_base::fstat_by_handle): Only use file id as inode number if it masters the isgood_inode check.

2011-03-08  Corinna Vinschen  <...>

    * fhandler.cc (fhandler_base::open): When creating a file on a filesystem supporting ACLs, create the file with WRITE_DAC access. Explain why.
    * fhandler_disk_file.cc (fhandler_disk_file::mkdir): Ditto for directories.
    * fhandler_socket.cc (fhandler_socket::bind): Ditto for sockets.
    * path.cc (symlink_worker): Ditto for symlinks.
    * security.cc (get_file_sd): Always call GetSecurityInfo for directories on XP and Server 2003. Improve comment to explain why.

So, is it possible that the request for WRITE_DAC access in the call to NtCreateFile triggers some hiccup of your virus checker? It could easily explain both effects.

Corinna
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 10:47 ` Corinna Vinschen @ 2011-07-04 10:56 ` Ryan Johnson 2011-07-04 11:34 ` Corinna Vinschen 2011-07-04 12:00 ` Wolf Geldmacher 2011-07-05 12:12 ` Corinna Vinschen 2 siblings, 1 reply; 22+ messages in thread From: Ryan Johnson @ 2011-07-04 10:56 UTC (permalink / raw) To: cygwin On 04/07/2011 6:46 AM, Corinna Vinschen wrote: > On Jul 4 11:15, Wolf Geldmacher wrote: >> As an aside: >> I also used to have some trouble with "rm -rf" of a directory >> hierarchy failing more or less reproducibly (like: 80% of the >> time) because files were presumably still "in use". Repeating >> the command several times would succeed, though. >> >> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 >> seems to have solved that issue as well - still have to see >> the first "retry to delete". >> >> This may or may not be related to the original report, as it also reeks >> of a race condition during file/directory operations. > I can neither reproduce the tar problem, nor can I reprocude the rm > problem. I tried this under 2008R2 which is basically the same as your > W7-64 bit. I used local and remote drives to test the issue but to no > avail. > > Are you sure this isn't a BLODA problem which is triggered by the > changes in 1.7.9? > > I just took a look through the changes between 1.7.8 and 1.7.9, and > the list of changes which affect filesystem access is pretty small: > > [snip] > > So, is it possible that the request for WRITE_DAC access in the call to > NtCreateFile triggers some hiccup of your virus checker? It could easily > explain both effects. I have also seen the rm -rf problem occasionally on my w7-64 machine, and I don't think anything from BLODA is installed. However, I haven't noticed the issue since disabling the search indexer on my machine. I did this on the hunch that I often delete large directory trees which aren't very old (e.g. 
after untar/configure/make of some source package), and that it wouldn't be a big surprise if indexing and cygwin's rm don't mix for whatever reason.

Thoughts?
Ryan
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 10:56 ` Ryan Johnson @ 2011-07-04 11:34 ` Corinna Vinschen 2011-07-04 11:55 ` Corinna Vinschen ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Corinna Vinschen @ 2011-07-04 11:34 UTC (permalink / raw) To: cygwin On Jul 4 06:56, Ryan Johnson wrote: > On 04/07/2011 6:46 AM, Corinna Vinschen wrote: > >On Jul 4 11:15, Wolf Geldmacher wrote: > >>As an aside: > >> I also used to have some trouble with "rm -rf" of a directory > >> hierarchy failing more or less reproducibly (like: 80% of the > >> time) because files were presumably still "in use". Repeating > >> the command several times would succeed, though. > >> > >> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 > >> seems to have solved that issue as well - still have to see > >> the first "retry to delete". > >> > >>This may or may not be related to the original report, as it also reeks > >>of a race condition during file/directory operations. > >I can neither reproduce the tar problem, nor can I reprocude the rm > >problem. I tried this under 2008R2 which is basically the same as your > >W7-64 bit. I used local and remote drives to test the issue but to no > >avail. > > > >Are you sure this isn't a BLODA problem which is triggered by the > >changes in 1.7.9? > > > >I just took a look through the changes between 1.7.8 and 1.7.9, and > >the list of changes which affect filesystem access is pretty small: > > > >[snip] > > > >So, is it possible that the request for WRITE_DAC access in the call to > >NtCreateFile triggers some hiccup of your virus checker? It could easily > >explain both effects. > I have also seen the rm -rf problem occasionally on my w7-64 > machine, and I don't think anything from BLODA is installed. Also with 1.7.8? Given the minor number of FS-related changes, it's so very unlikely that they would cause a differnce between 1.7.8 and 1.7.9. 
> However, I haven't noticed the issue since disabling the search
> indexer on my machine. I did this on the hunch that I often delete
> large directory trees which aren't very old (e.g. after
> untar/configure/make of some source package), and that it wouldn't
> be a big surprise if indexing and cygwin's rm don't mix for whatever
> reason.

Hard to imagine that setting the WRITE_DAC flag would interfere with the search indexer. On second thought, the flag is only set if a file does not exist yet and NtCreateFile gets called to create the file. That makes it especially unlikely that this would affect unlinking.

However, given that you can reproduce the issue, could you test the scenario again? If the issue occurs, can you disable the following code in fhandler.cc and see if it changes anything?

      else if (!exists () && has_acls ())
        /* If we are about to create the file and the filesystem supports
           ACLs, we will overwrite the DACL after the call to NtCreateFile.
           This requires a handle with additional WRITE_DAC access,
           otherwise set_file_sd has to open the file again. */
        access |= WRITE_DAC;

Thanks,
Corinna
* Re: untarring symlinks with ../ fails randomly, silghtly OT
From: Corinna Vinschen @ 2011-07-04 11:55 UTC
To: cygwin

On Jul 4 13:33, Corinna Vinschen wrote:
> On Jul 4 06:56, Ryan Johnson wrote:
> > I have also seen the rm -rf problem occasionally on my w7-64
> > machine, and I don't think anything from BLODA is installed.
>
> Also with 1.7.8? Given the minor number of FS-related changes, it's
> so very unlikely that they would cause a difference between 1.7.8 and
> 1.7.9.

Btw., just because a piece of software is not on the BLODA list, doesn't mean it's not a potential BLODA...

Corinna
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 11:34 ` Corinna Vinschen 2011-07-04 11:55 ` Corinna Vinschen @ 2011-07-04 12:07 ` Wolf Geldmacher 2011-07-04 12:22 ` Ryan Johnson 2 siblings, 0 replies; 22+ messages in thread From: Wolf Geldmacher @ 2011-07-04 12:07 UTC (permalink / raw) To: cygwin On Mon, 2011-07-04 at 13:33 +0200, Corinna Vinschen wrote: > On Jul 4 06:56, Ryan Johnson wrote: > > On 04/07/2011 6:46 AM, Corinna Vinschen wrote: > > >On Jul 4 11:15, Wolf Geldmacher wrote: > > >>As an aside: > > >> I also used to have some trouble with "rm -rf" of a directory > > >> hierarchy failing more or less reproducibly (like: 80% of the > > >> time) because files were presumably still "in use". Repeating > > >> the command several times would succeed, though. > > >> > > >> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 > > >> seems to have solved that issue as well - still have to see > > >> the first "retry to delete". > > >> > > >>This may or may not be related to the original report, as it also reeks > > >>of a race condition during file/directory operations. > > >I can neither reproduce the tar problem, nor can I reprocude the rm > > >problem. I tried this under 2008R2 which is basically the same as your > > >W7-64 bit. I used local and remote drives to test the issue but to no > > >avail. > > > > > >Are you sure this isn't a BLODA problem which is triggered by the > > >changes in 1.7.9? > > > > > >I just took a look through the changes between 1.7.8 and 1.7.9, and > > >the list of changes which affect filesystem access is pretty small: > > > > > >[snip] > > > > > >So, is it possible that the request for WRITE_DAC access in the call to > > >NtCreateFile triggers some hiccup of your virus checker? It could easily > > >explain both effects. > > I have also seen the rm -rf problem occasionally on my w7-64 > > machine, and I don't think anything from BLODA is installed. > > Also with 1.7.8? 
Given the minor number of FS-related changes, it's > so very unlikely that they would cause a differnce between 1.7.8 and > 1.7.9. > > > However, I haven't noticed the issue since disabling the search > > indexer on my machine. I did this on the hunch that I often delete > > large directory trees which aren't very old (e.g. after > > untar/configure/make of some source package), and that it wouldn't > > be a big surprise if indexing and cygwin's rm don't mix for whatever > > reason. > > Hard to imagine that setting the WRITE_DAC flag would interfere with the > search indexer. On second thought, the flag is only set if a file does > not exist yet and NtCreateFile gets called to create the file. That > makes it especially unlikely that this would affect unlinking. > > However, given that you can reproduce the issue, could you test the > scenario again? If the issue occurs, can you disable the following code > in fhandler.cc and see if it changes anything? > > 616 else if (!exists () && has_acls ()) > 617 /* If we are about to create the file and the filesystem supports > 618 ACLs, we will overwrite the DACL after the call to NtCreateFile. > 619 This requires a handle with additional WRITE_DAC access, > 620 otherwise set_file_sd has to open the file again. */ > 621 access |= WRITE_DAC; > > > Thanks, > Corinna If turning off indexing (which is not really necessary for a machine in a build farm anyway) does not result in any change, I'll try your suggestion. Thanks for your support! Wolf -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 11:34 ` Corinna Vinschen 2011-07-04 11:55 ` Corinna Vinschen 2011-07-04 12:07 ` Wolf Geldmacher @ 2011-07-04 12:22 ` Ryan Johnson 2011-07-04 14:21 ` Corinna Vinschen 2011-07-04 15:05 ` Ryan Johnson 2 siblings, 2 replies; 22+ messages in thread From: Ryan Johnson @ 2011-07-04 12:22 UTC (permalink / raw) To: cygwin On 04/07/2011 7:33 AM, Corinna Vinschen wrote: > On Jul 4 06:56, Ryan Johnson wrote: >> On 04/07/2011 6:46 AM, Corinna Vinschen wrote: >>> On Jul 4 11:15, Wolf Geldmacher wrote: >>>> As an aside: >>>> I also used to have some trouble with "rm -rf" of a directory >>>> hierarchy failing more or less reproducibly (like: 80% of the >>>> time) because files were presumably still "in use". Repeating >>>> the command several times would succeed, though. >>>> >>>> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 >>>> seems to have solved that issue as well - still have to see >>>> the first "retry to delete". >>>> >>>> This may or may not be related to the original report, as it also reeks >>>> of a race condition during file/directory operations. >>> I can neither reproduce the tar problem, nor can I reprocude the rm >>> problem. I tried this under 2008R2 which is basically the same as your >>> W7-64 bit. I used local and remote drives to test the issue but to no >>> avail. >>> >>> Are you sure this isn't a BLODA problem which is triggered by the >>> changes in 1.7.9? >>> >>> I just took a look through the changes between 1.7.8 and 1.7.9, and >>> the list of changes which affect filesystem access is pretty small: >>> >>> [snip] >>> >>> So, is it possible that the request for WRITE_DAC access in the call to >>> NtCreateFile triggers some hiccup of your virus checker? It could easily >>> explain both effects. >> I have also seen the rm -rf problem occasionally on my w7-64 >> machine, and I don't think anything from BLODA is installed. > Also with 1.7.8? 
Given the minor number of FS-related changes, it's > so very unlikely that they would cause a differnce between 1.7.8 and > 1.7.9. > >> However, I haven't noticed the issue since disabling the search >> indexer on my machine. I did this on the hunch that I often delete >> large directory trees which aren't very old (e.g. after >> untar/configure/make of some source package), and that it wouldn't >> be a big surprise if indexing and cygwin's rm don't mix for whatever >> reason. > Hard to imagine that setting the WRITE_DAC flag would interfere with the > search indexer. On second thought, the flag is only set if a file does > not exist yet and NtCreateFile gets called to create the file. That > makes it especially unlikely that this would affect unlinking. > > However, given that you can reproduce the issue, could you test the > scenario again? If the issue occurs, can you disable the following code > in fhandler.cc and see if it changes anything? > > 616 else if (!exists ()&& has_acls ()) > 617 /* If we are about to create the file and the filesystem supports > 618 ACLs, we will overwrite the DACL after the call to NtCreateFile. > 619 This requires a handle with additional WRITE_DAC access, > 620 otherwise set_file_sd has to open the file again. */ > 621 access |= WRITE_DAC; > Sorry, I have no idea which version of the dll I had at the time. It was at least a month ago, maybe more. However, I was wrong about not seeing the problem since. Choosing a random source dir to blow away: > $ rm -rf Python-2.6.6 > rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty > $ rm -rf Python-2.6.6 > $ This seems to happen more than half the time (different non-empty dir every time). Naturally, running under strace makes the problem go away (it doesn't help that strace kills stderr, where any error messages might have gone). Running the following command 10x: $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6 || (echo 'Retrying...' 
&& rm -rf Python-2.6.6))

Six runs give no error, two runs give one error, and one run each gives two and three errors.

I'm currently updating and rebuilding my cygwin sources to try out your patch...

Ryan
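Repeating the extract-and-delete cycle by hand and tallying the failures can be scripted. The following is a minimal sketch under stated assumptions, not something posted in the thread: the function names count_failures and cycle are hypothetical, and the archive and directory names are simply taken from the command quoted above.

```shell
#!/bin/sh
# Hypothetical helper: run a command a given number of times and report
# how many of the runs returned a non-zero exit status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}

# One extract-and-delete cycle, modelled on the command in the message
# above. If the first rm fails, retry once to clean up, but still report
# the cycle as failed.
cycle() {
    tar -xaf Python-2.6.6.tar.bz2 && sleep 3 || return 1
    rm -rf Python-2.6.6 || { rm -rf Python-2.6.6; return 1; }
}
```

With the archive in the current directory, `count_failures 10 cycle` would print a single summary line for ten cycles. Error messages from rm still go to stderr as usual.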
* Re: untarring symlinks with ../ fails randomly, silghtly OT
From: Corinna Vinschen @ 2011-07-04 14:21 UTC
To: cygwin

On Jul 4 08:21, Ryan Johnson wrote:
> However, I was wrong about not seeing the problem since. Choosing a
> random source dir to blow away:
> >$ rm -rf Python-2.6.6
> >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> >$ rm -rf Python-2.6.6
> >$
>
> This seems to happen more than half the time (different non-empty
> dir every time). Naturally, running under strace makes the problem
> go away (it doesn't help that strace kills stderr, where any error
> messages might have gone).
>
> Running the following command 10x:
>
> $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6
> || (echo 'Retrying...' && rm -rf Python-2.6.6))
>
> I get six times with no error, two times with one error, one time
> each with two and three errors.

I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS on 2K8R2 and it just works. In a VM.

> I'm currently updating and rebuilding my cygwin sources to try out
> your patch...

Thanks. I'm just installing W7 64 on real hardware and I'll try again as soon as the boring install-windows/update-windows/configure-windows/install-cygwin/configure-cygwin process is finished.

Corinna

(*) I used Python-2.6.5.tar.bz2, not 2.6.6, but I can't believe that's the reason I don't see the problem...
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 14:21 ` Corinna Vinschen @ 2011-07-04 15:13 ` Corinna Vinschen 2011-07-05 12:12 ` Corinna Vinschen 1 sibling, 0 replies; 22+ messages in thread From: Corinna Vinschen @ 2011-07-04 15:13 UTC (permalink / raw) To: cygwin

On Jul  4 16:20, Corinna Vinschen wrote:
> On Jul  4 08:21, Ryan Johnson wrote:
> > However, I was wrong about not seeing the problem since. Choosing a
> > random source dir to blow away:
> > >$ rm -rf Python-2.6.6
> > >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
> > >$ rm -rf Python-2.6.6
> > >$
> >
> > This seems to happen more than half the time (different non-empty
> > dir every time). Naturally, running under strace makes the problem
> > go away (it doesn't help that strace kills stderr, where any error
> > messages might have gone).
> >
> > Running the following command 10x:
> >
> > $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6
> > || (echo 'Retrying...' && rm -rf Python-2.6.6))
> >
> > I get six times with no error, two times with one error, one time
> > each with two and three errors.
>
> I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS
> on 2K8R2 and it just works. In a VM.

Btw., I would like to stress again that Cygwin does *not* lock files it
opens, except in very rare circumstances. It always opens files with all
sharing flags set, except in these scenarios:

1. @file handling, file is opened w/ FILE_SHARE_READ only.

2. rename(): Omit FILE_SHARE_DELETE flag on Samba to avoid
   STATUS_ACCESS_DENIED if file has the DOS R/O attribute set.

3. unlink(): Tries to open with FILE_SHARE_DELETE only to check for
   files-in-use. If that works, the file is deleted anyway. If not, it
   retries to open with all sharing flags set.

4. On exit, if a DLL can't be found, the executable is opened without
   FILE_SHARE_DELETE to scan for DLLs.

5. When exec'ing a file, it's potentially tested for being a script.
   If so, the FILE_SHARE_DELETE is omitted.

I'm going to change 1, 4, and 5, but they can't be the culprit for what
you see.

If a file can't be removed, it's typically a non-Cygwin process holding
a handle to the file with file sharing set to 0. Consider that a Cygwin
process opens the file with all sharing flags set, so removing the file
will at least work by moving it to the trashcan. Well, except on remote
drives, that is, because we don't even know if a trashcan is available
on the remote drive and even if so, most of the time it's not accessible
from remote.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 14:21 ` Corinna Vinschen 2011-07-04 15:13 ` Corinna Vinschen @ 2011-07-05 12:12 ` Corinna Vinschen 1 sibling, 0 replies; 22+ messages in thread From: Corinna Vinschen @ 2011-07-05 12:12 UTC (permalink / raw) To: cygwin On Jul 4 16:20, Corinna Vinschen wrote: > On Jul 4 08:21, Ryan Johnson wrote: > > However, I was wrong about not seeing the problem since. Choosing a > > random source dir to blow away: > > >$ rm -rf Python-2.6.6 > > >rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty > > >$ rm -rf Python-2.6.6 > > >$ > > > > This seems to happen more than half the time (different non-empty > > dir every time). Naturally, running under strace makes the problem > > go away (it doesn't help that strace kills stderr, where any error > > messages might have gone). > > > > Running the following command 10x: > > > > $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6 > > || (echo 'Retrying...' && rm -rf Python-2.6.6)) > > > > I get six times with no error, two times with one error, one time > > each with two and three errors. > > I tried this(*) with Cygwin 1.7.9 as well as with the latest from CVS > on 2K8R2 and it just works. In a VM. > > > I'm currently updating and rebuilding my cygwin sources to try out > > your patch... > > Thanks. I'm just installing W7 64 on real hardware and I'll try again > as soon as the boring install-windows/update-windows/configure-windows/ > install-cygwin/configure-cygwin process is finished. Nope, I still can't reproduce the rm problem. Works still fine for me, on real W7 64 hardware. 
Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Project Co-Leader cygwin AT cygwin DOT com Red Hat
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 12:22 ` Ryan Johnson 2011-07-04 14:21 ` Corinna Vinschen @ 2011-07-04 15:05 ` Ryan Johnson 2011-07-04 21:44 ` Mark Geisert 1 sibling, 1 reply; 22+ messages in thread From: Ryan Johnson @ 2011-07-04 15:05 UTC (permalink / raw) To: cygwin On 04/07/2011 8:21 AM, Ryan Johnson wrote: > On 04/07/2011 7:33 AM, Corinna Vinschen wrote: >> On Jul 4 06:56, Ryan Johnson wrote: >>> On 04/07/2011 6:46 AM, Corinna Vinschen wrote: >>>> On Jul 4 11:15, Wolf Geldmacher wrote: >>>>> As an aside: >>>>> I also used to have some trouble with "rm -rf" of a directory >>>>> hierarchy failing more or less reproducibly (like: 80% of the >>>>> time) because files were presumably still "in use". Repeating >>>>> the command several times would succeed, though. >>>>> >>>>> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 >>>>> seems to have solved that issue as well - still have to see >>>>> the first "retry to delete". >>>>> >>>>> This may or may not be related to the original report, as it also >>>>> reeks >>>>> of a race condition during file/directory operations. >>>> I can neither reproduce the tar problem, nor can I reprocude the rm >>>> problem. I tried this under 2008R2 which is basically the same as >>>> your >>>> W7-64 bit. I used local and remote drives to test the issue but to no >>>> avail. >>>> >>>> Are you sure this isn't a BLODA problem which is triggered by the >>>> changes in 1.7.9? >>>> >>>> I just took a look through the changes between 1.7.8 and 1.7.9, and >>>> the list of changes which affect filesystem access is pretty small: >>>> >>>> [snip] >>>> >>>> So, is it possible that the request for WRITE_DAC access in the >>>> call to >>>> NtCreateFile triggers some hiccup of your virus checker? It could >>>> easily >>>> explain both effects. >>> I have also seen the rm -rf problem occasionally on my w7-64 >>> machine, and I don't think anything from BLODA is installed. >> Also with 1.7.8? 
Given the minor number of FS-related changes, it's
>> so very unlikely that they would cause a difference between 1.7.8 and
>> 1.7.9.
>>
>>> However, I haven't noticed the issue since disabling the search
>>> indexer on my machine. I did this on the hunch that I often delete
>>> large directory trees which aren't very old (e.g. after
>>> untar/configure/make of some source package), and that it wouldn't
>>> be a big surprise if indexing and cygwin's rm don't mix for whatever
>>> reason.
>> Hard to imagine that setting the WRITE_DAC flag would interfere with the
>> search indexer. On second thought, the flag is only set if a file does
>> not exist yet and NtCreateFile gets called to create the file. That
>> makes it especially unlikely that this would affect unlinking.
>>
>> However, given that you can reproduce the issue, could you test the
>> scenario again? If the issue occurs, can you disable the following code
>> in fhandler.cc and see if it changes anything?
>>
>>   else if (!exists () && has_acls ())
>>     /* If we are about to create the file and the filesystem supports
>>        ACLs, we will overwrite the DACL after the call to NtCreateFile.
>>        This requires a handle with additional WRITE_DAC access,
>>        otherwise set_file_sd has to open the file again.  */
>>     access |= WRITE_DAC;
>>
> Sorry, I have no idea which version of the dll I had at the time. It
> was at least a month ago, maybe more.
>
> However, I was wrong about not seeing the problem since. Choosing a
> random source dir to blow away:
>> $ rm -rf Python-2.6.6
>> rm: cannot remove `Python-2.6.6/Lib/lib2to3/tests': Directory not empty
>> $ rm -rf Python-2.6.6
>> $
>
> This seems to happen more than half the time (different non-empty dir
> every time). Naturally, running under strace makes the problem go away
> (it doesn't help that strace kills stderr, where any error messages
> might have gone).
> > Running the following command 10x: > > $ tar -xaf Python-2.6.6.tar.bz2 && sleep 3 && (rm -rf Python-2.6.6 || > (echo 'Retrying...' && rm -rf Python-2.6.6)) > > I get six times with no error, two times with one error, one time each > with two and three errors. > > I'm currently updating and rebuilding my cygwin sources to try out > your patch... Updated, built, and reproduced, with and without the patch. If anything it's more common in my dev build -- it happened on the first try both times. Any idea of how to debug this? We need some instantaneous version of lsof or something... Ryan -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
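The "run it 10x and count the errors" experiment quoted in this message can be automated. What follows is only a rough Python sketch under stated assumptions, not something posted in the thread: the helper names `run_tar_extract`, `run_rm` and `tally_rm_errors` are made up for illustration, and an "error" is assumed to mean one line that `rm` prints on stderr. As in the original command, `rm -rf` is retried once if the first attempt complains.

```python
import subprocess
import time
from collections import Counter


def run_tar_extract(archive):
    # Hypothetical helper: same tar flags as the command quoted in the thread.
    subprocess.run(["tar", "-xaf", archive], check=True)


def run_rm(tree):
    # Hypothetical helper: returns the error lines rm printed on stderr;
    # an empty list means the tree was removed without complaints.
    proc = subprocess.run(["rm", "-rf", tree],
                          stderr=subprocess.PIPE, text=True)
    return [line for line in proc.stderr.splitlines() if line.strip()]


def tally_rm_errors(extract, remove, runs=10, delay=3.0):
    """Extract, wait, remove; return a Counter mapping errors-per-run to runs."""
    tally = Counter()
    for _ in range(runs):
        extract()
        time.sleep(delay)
        errors = list(remove())
        if errors:
            # Mirrors the "Retrying..." branch: exactly one second attempt,
            # whose error lines (if any) are counted as well.
            errors += remove()
        tally[len(errors)] += 1
    return tally
```

Calling `tally_rm_errors(lambda: run_tar_extract("Python-2.6.6.tar.bz2"), lambda: run_rm("Python-2.6.6"))` would reproduce the shape of the experiment; the extract and remove steps are passed in as callables so that the counting logic can be checked without touching the filesystem.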
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 15:05 ` Ryan Johnson @ 2011-07-04 21:44 ` Mark Geisert 0 siblings, 0 replies; 22+ messages in thread From: Mark Geisert @ 2011-07-04 21:44 UTC (permalink / raw) To: cygwin

> Any idea of how to debug this? We need some instantaneous version of
> lsof or something...

Not what you asked for, but useful for debugging stuff like this: FileMon and ProcessMonitor from Sysinternals.com (now a MS site). Just in case you haven't run across them before...

..mark
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 10:47 ` Corinna Vinschen 2011-07-04 10:56 ` Ryan Johnson @ 2011-07-04 12:00 ` Wolf Geldmacher 2011-07-05 12:12 ` Corinna Vinschen 2 siblings, 0 replies; 22+ messages in thread From: Wolf Geldmacher @ 2011-07-04 12:00 UTC (permalink / raw) To: cygwin On Mon, 2011-07-04 at 12:46 +0200, Corinna Vinschen wrote: > On Jul 4 11:15, Wolf Geldmacher wrote: > > As an aside: > > I also used to have some trouble with "rm -rf" of a directory > > hierarchy failing more or less reproducibly (like: 80% of the > > time) because files were presumably still "in use". Repeating > > the command several times would succeed, though. > > > > Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 > > seems to have solved that issue as well - still have to see > > the first "retry to delete". > > > > This may or may not be related to the original report, as it also reeks > > of a race condition during file/directory operations. > > I can neither reproduce the tar problem, nor can I reprocude the rm > problem. I tried this under 2008R2 which is basically the same as your > W7-64 bit. I used local and remote drives to test the issue but to no > avail. > > Are you sure this isn't a BLODA problem which is triggered by the > changes in 1.7.9? > > I just took a look through the changes between 1.7.8 and 1.7.9, and > the list of changes which affect filesystem access is pretty small: > > 2011-03-14 Corinna Vinschen <...> > > * fhandler_disk_file.cc (fhandler_base::fstat_by_handle): Only use > file id as inode number if it masters the isgood_inode check. > > 2011-03-08 Corinna Vinschen <...> > > * fhandler.cc (fhandler_base::open): When creating a file on a > filesystem supporting ACLs, create the file with WRITE_DAC access. > Explain why. > * fhandler_disk_file.cc (fhandler_disk_file::mkdir): Ditto for > directories. > * fhandler_socket.cc (fhandler_socket::bind): Ditto for sockets. 
> * path.cc (symlink_worker): Ditto for symlinks. > * security.cc (get_file_sd): Always call GetSecurityInfo for directories > on XP and Server 2003. Improve comment to explain why. > > So, is it possible that the request for WRITE_DAC access in the call to > NtCreateFile triggers some hiccup of your virus checker? It could easily > explain both effects. The machines I'm observing this on are not running anti virus (or any of the listed BLODA) software as they are used internally as build (compile) only servers, but (as I just found out) do run indexing. Turned indexing off and will go back to 1.7.9.1 on one of the machines to check. What also may be different that these machines are virtual machines running on an ESX server with disks on a SAN, which results in disk access times being almost comparable to SSD times. > Corinna > -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-04 10:47 ` Corinna Vinschen 2011-07-04 10:56 ` Ryan Johnson 2011-07-04 12:00 ` Wolf Geldmacher @ 2011-07-05 12:12 ` Corinna Vinschen 2011-07-05 12:21 ` Ryan Johnson 2011-07-05 15:53 ` Wolf Geldmacher 2 siblings, 2 replies; 22+ messages in thread From: Corinna Vinschen @ 2011-07-05 12:12 UTC (permalink / raw) To: cygwin

On Jul  4 12:46, Corinna Vinschen wrote:
> On Jul  4 11:15, Wolf Geldmacher wrote:
> > As an aside:
> > I also used to have some trouble with "rm -rf" of a directory
> > hierarchy failing more or less reproducibly (like: 80% of the
> > time) because files were presumably still "in use". Repeating
> > the command several times would succeed, though.
> >
> > Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1
> > seems to have solved that issue as well - still have to see
> > the first "retry to delete".
> >
> > This may or may not be related to the original report, as it also reeks
> > of a race condition during file/directory operations.
>
> I can neither reproduce the tar problem, nor can I reproduce the rm
> problem. I tried this under 2008R2 which is basically the same as your
> W7-64 bit. I used local and remote drives to test the issue but to no
> avail.

Finally I managed to reproduce the problem and now I see what happens.

Windows does not write back the file change timestamp unless the file
buffers are flushed. This usually occurs at close time. In contrast to
POSIX specifications the timestamps are *not* automatically updated when
a call to fetch file metadata is performed.

Here's what tar does when creating the symlink:

1. create file with 000 permissions
2. fstat
3. close file
[...]
4. stat file
5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been
   overwritten.

The problem is that the call to fstat on the opened handle gets some
value of the change time timestamp, but the subsequent close changes
the timestamp again.

Speculation: It seems that the timestamp fstat sees is the timestamp
created at the time NtCreateFile is called, while the timestamp from the
call to NtSetSecurityFile to change the DACL is cached and only updated
when calling NtClose.

This also explains why this doesn't occur in 1.7.8. In 1.7.8, the DACL
was written using another file handle, because the original handle
didn't have the right to change the DACL. By adding the WRITE_DAC flag,
I allowed Cygwin to use the original file handle to write the DACL. The
difference is:

1.7.8:

- create file
- open file for writing the DACL
- write DACL
- close
- do whatever the original handle was opened for
- close

1.7.9:

- create file
- write DACL
- do whatever the original handle was opened for
- close

So, with 1.7.9 the close call after writing the DACL is missing, which
accounts for the missing flushing of the file metadata.

By calling FlushFileBuffers in fstat before calling NtQueryInformationFile
I can fix the problem. Unfortunately that slows down applications like tar,
which use fstat a lot, a lot.

There are two solutions: one is reverting to the 1.7.8 state, which
means writing the DACL requires opening the file again; the other is
calling FlushFileBuffers in fstat.

I compared both solutions. On my hardware, calling tar xzf on your file
is 500% slower if fstat calls FlushFileBuffers compared to just dropping
the WRITE_DAC flag from the open call. Wow! Imagine that I added the
WRITE_DAC flag to gain performance...

So I guess this all boils down to the fact that adding WRITE_DAC was
not really a good move. It's a shame that Windows punishes every try
to speed up file operations with a raise in non-POSIXy behaviour :-(((

I changed that in CVS and right now I'm generating a new developer
snapshot on http://cygwn.com/snapshots/. Give it a try, please.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
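The five steps tar performs on its symlink placeholder, as listed in this message, can be sketched roughly as follows. This is a minimal Python illustration of the sequence as described here, not GNU tar's actual code; the function names `placeholder_unchanged` and `demo` are made up for the sketch, and only the change time is compared because that is all the message mentions. On a POSIX system the two timestamps are expected to match; the failure described in this message corresponds to the comparison coming out unequal because the close updated the change time after fstat had read it.

```python
import os
import tempfile


def placeholder_unchanged(path):
    """Create an empty mode-000 placeholder and report whether its change
    time is still the same after the handle has been closed."""
    # Step 1: create the file with 000 permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o000)
    try:
        # Step 2: fstat on the still-open handle.
        at_creation = os.fstat(fd)
    finally:
        # Step 3: close the file.
        os.close(fd)
    # Step 4: stat the file by name later on.
    later = os.stat(path)
    # Step 5: a differing change time is read as "placeholder was overwritten".
    return at_creation.st_ctime_ns == later.st_ctime_ns


def demo():
    # Runs the check once inside a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        return placeholder_unchanged(os.path.join(tmp, "placeholder"))
```

Calling `demo()` on the affected DLL versus a fixed snapshot would be one way to observe the difference without involving tar, assuming the Python runtime in use goes through the same Cygwin open/fstat/close path.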
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-05 12:12 ` Corinna Vinschen @ 2011-07-05 12:21 ` Ryan Johnson 2011-07-05 17:02 ` Wolf Geldmacher 2011-07-05 15:53 ` Wolf Geldmacher 1 sibling, 1 reply; 22+ messages in thread From: Ryan Johnson @ 2011-07-05 12:21 UTC (permalink / raw) To: cygwin On 05/07/2011 8:10 AM, Corinna Vinschen wrote: > On Jul 4 12:46, Corinna Vinschen wrote: >> On Jul 4 11:15, Wolf Geldmacher wrote: >>> As an aside: >>> I also used to have some trouble with "rm -rf" of a directory >>> hierarchy failing more or less reproducibly (like: 80% of the >>> time) because files were presumably still "in use". Repeating >>> the command several times would succeed, though. >>> >>> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 >>> seems to have solved that issue as well - still have to see >>> the first "retry to delete". >>> >>> This may or may not be related to the original report, as it also reeks >>> of a race condition during file/directory operations. >> I can neither reproduce the tar problem, nor can I reprocude the rm >> problem. I tried this under 2008R2 which is basically the same as your >> W7-64 bit. I used local and remote drives to test the issue but to no >> avail. > Finally I managed to reproduce the problem and now I see what happens. > > Windows does not write back the file change timestamp unless the file > buffers are flushed. This usually occurs at close time. In contrast to > POSIX specifications the timestamps are *not* automatically updated when > a call to fetch file metadata is performed. > > Here's what tar does when creating the symlink: > > 1. create file with 000 permissions > 2. fstat > 3. close file > [...] > 4. stat file > 5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been > overwritten. > > The problem is that the call to fstat on the opened handle gets some > value of the change time timestamp, but the subsequent close changes > the timestamp again. Wow. 
That must have been one hairy debug session... my hat goes off to you! Ryan
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-05 12:21 ` Ryan Johnson @ 2011-07-05 17:02 ` Wolf Geldmacher 0 siblings, 0 replies; 22+ messages in thread From: Wolf Geldmacher @ 2011-07-05 17:02 UTC (permalink / raw) To: cygwin On Tue, 2011-07-05 at 08:21 -0400, Ryan Johnson wrote: > On 05/07/2011 8:10 AM, Corinna Vinschen wrote: > > On Jul 4 12:46, Corinna Vinschen wrote: > >> On Jul 4 11:15, Wolf Geldmacher wrote: > >>> As an aside: > >>> I also used to have some trouble with "rm -rf" of a directory > >>> hierarchy failing more or less reproducibly (like: 80% of the > >>> time) because files were presumably still "in use". Repeating > >>> the command several times would succeed, though. > >>> > >>> Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 > >>> seems to have solved that issue as well - still have to see > >>> the first "retry to delete". > >>> > >>> This may or may not be related to the original report, as it also reeks > >>> of a race condition during file/directory operations. > >> I can neither reproduce the tar problem, nor can I reprocude the rm > >> problem. I tried this under 2008R2 which is basically the same as your > >> W7-64 bit. I used local and remote drives to test the issue but to no > >> avail. > > Finally I managed to reproduce the problem and now I see what happens. > > > > Windows does not write back the file change timestamp unless the file > > buffers are flushed. This usually occurs at close time. In contrast to > > POSIX specifications the timestamps are *not* automatically updated when > > a call to fetch file metadata is performed. > > > > Here's what tar does when creating the symlink: > > > > 1. create file with 000 permissions > > 2. fstat > > 3. close file > > [...] > > 4. stat file > > 5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been > > overwritten. 
> > > > The problem is that the call to fstat on the opened handle gets some > > value of the change time timestamp, but the subsequent close changes > > the timestamp again. > Wow. That must have been one hairy debug session... my hat goes off to you! > > Ryan Definitely agree! (Where's the "Like" button?) -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: untarring symlinks with ../ fails randomly, silghtly OT 2011-07-05 12:12 ` Corinna Vinschen 2011-07-05 12:21 ` Ryan Johnson @ 2011-07-05 15:53 ` Wolf Geldmacher 1 sibling, 0 replies; 22+ messages in thread From: Wolf Geldmacher @ 2011-07-05 15:53 UTC (permalink / raw) To: cygwin On Tue, 2011-07-05 at 14:10 +0200, Corinna Vinschen wrote: > On Jul 4 12:46, Corinna Vinschen wrote: > > On Jul 4 11:15, Wolf Geldmacher wrote: > > > As an aside: > > > I also used to have some trouble with "rm -rf" of a directory > > > hierarchy failing more or less reproducibly (like: 80% of the > > > time) because files were presumably still "in use". Repeating > > > the command several times would succeed, though. > > > > > > Downgrading from cygwin1.dll/1.7.9.1 to cygwin1.dll/1.7.8.1 > > > seems to have solved that issue as well - still have to see > > > the first "retry to delete". > > > > > > This may or may not be related to the original report, as it also reeks > > > of a race condition during file/directory operations. > > > > I can neither reproduce the tar problem, nor can I reprocude the rm > > problem. I tried this under 2008R2 which is basically the same as your > > W7-64 bit. I used local and remote drives to test the issue but to no > > avail. > > Finally I managed to reproduce the problem and now I see what happens. > > Windows does not write back the file change timestamp unless the file > buffers are flushed. This usually occurs at close time. In contrast to > POSIX specifications the timestamps are *not* automatically updated when > a call to fetch file metadata is performed. > > Here's what tar does when creating the symlink: > > 1. create file with 000 permissions > 2. fstat > 3. close file > [...] > 4. stat file > 5. if fstat.st_ctime != stat.st_ctime ==> symlink placeholder has been > overwritten. 
> > The problem is that the call to fstat on the opened handle gets some > value of the change time timestamp, but the subsequent close changes > the timestamp again. > > Speculation: It seems that the timestamp fstat sees is the timestamp > created at the time NtCreateFile is called, while the timestamp from the > call to NtSetSecurityFile to change the DACL is cached and only updated > when calling NtClose. > > This also explains why this doesn't occur in 1.7.8. In 1.7.8, the DACL > has been written using another file handle, because the original handle > didn't have the right to change the DACL. By adding the WRITE_DAC flag, > I allowed Cygwin to use the original file handle to write the DACL. The > difference is: > > 1.7.8: > > - create file > - open file for writing the DACL > - write DACL > - close > - do whatever the orignal handle was opened for > - close > > 1.7.9: > > - create file > - write DACL > - do whatever the orignal handle was opened for > - close > > So, with 1.7.9 the close call after writing the DACL is missing, which > accounts for the missing flushing of the file metadata. > > By calling FlushFileBuffers in fstat before calling NtQueryInformationFile > I can fix the problem. Unfortunately that slows down applications like tar, > which use fstat a lot, a lot. > > There are two solutions, one is reverting to the 1.7.8 state, which > means, writing the DACL requires to open the file again, or calling > FlushFileBuffers in fstat. > I compared both solutions. On my hardware, calling tar xzf on your file > is 500% slower if fstat calls FlushFileBuffers compared to just dropping > the WRITE_DAC flag from the open call. Wow! Imagine that I added the > WRITE_DAC flag to gain performance... > > So I guess this all boils down to the fact that adding WRITE_DAC was > not really a good move. 
It's a shame that Windows punishes every try
> to speed up file operations with a raise in non-POSIXy behaviour :-(((
>
> I changed that in CVS and right now I'm generating a new developer
> snapshot on http://cygwn.com/snapshots/. Give it a try, please.
>
>
> Thanks,
> Corinna

I downloaded and installed the daily dll: I can no longer reproduce the "failing symlink" problem at all, which was 100% reproducible before.

So it looks like your diagnosis and the fix are correct. Thank you very much for your support!

Regarding the "rm -rf failing" problem: Although I could no longer reproduce the issue on the test machine when I downgraded to the older dll, it *did* happen last night on the nightly build with a 1.7.8 cygwin1.dll - so it seems to be unrelated to the WRITE_DAC change, which incidentally also agrees with Ryan's test results.

Thanks again & Regards
Wolf
end of thread, other threads:[~2011-07-05 17:02 UTC | newest]

Thread overview: 22+ messages
2011-06-30 12:43 untarring symlinks with ../ fails randomly Wolf Geldmacher
2011-06-30 13:37 ` Corinna Vinschen
2011-06-30 15:05 ` Ken Brown
2011-06-30 15:26 ` Corinna Vinschen
2011-06-30 15:28 ` Wolf Geldmacher
2011-07-04 9:16 ` untarring symlinks with ../ fails randomly, silghtly OT Wolf Geldmacher
2011-07-04 10:47 ` Corinna Vinschen
2011-07-04 10:56 ` Ryan Johnson
2011-07-04 11:34 ` Corinna Vinschen
2011-07-04 11:55 ` Corinna Vinschen
2011-07-04 12:07 ` Wolf Geldmacher
2011-07-04 12:22 ` Ryan Johnson
2011-07-04 14:21 ` Corinna Vinschen
2011-07-04 15:13 ` Corinna Vinschen
2011-07-05 12:12 ` Corinna Vinschen
2011-07-04 15:05 ` Ryan Johnson
2011-07-04 21:44 ` Mark Geisert
2011-07-04 12:00 ` Wolf Geldmacher
2011-07-05 12:12 ` Corinna Vinschen
2011-07-05 12:21 ` Ryan Johnson
2011-07-05 17:02 ` Wolf Geldmacher
2011-07-05 15:53 ` Wolf Geldmacher