From: Kyzer <stuart.caie@gmail.com>
To: cygwin@cygwin.com
Subject: With bad UTF-8, cygwin can create files it can't read
Date: Wed, 25 Mar 2015 15:26:00 -0000
Message-ID: <CAOCY71AaRWGEFVcPqLKNEjqWEkELdfLD-KBvxMAQCi0wt2A5ZA@mail.gmail.com>
Hello,
I've found that if you use cygwin to create a file whose name is
badly-encoded UTF-8, readdir() returns an entry with a name that cygwin
won't subsequently accept.
* create a file using filename with hex bytes F4 8F BF BF
* readdir() reports the filename as hex bytes E2 8E B3 ED BF BF
* attempting to open or unlink the filename E2 8E B3 ED BF BF fails
* attempting to open or unlink the filename F4 8F BF BF succeeds
Here's a test case. Beware that it will delete everything in the
current directory.
#include <stdio.h>
#include <dirent.h>
#include <unistd.h>   /* unlink() */
int main(void) {
    DIR *d;
    struct dirent *de;
    char *fname = "\xF4\x8F\xBF\xBF";
    // touch file
    fclose(fopen(fname, "wb"));
    // iterate through dir, trying to unlink every non-dot entry
    d = opendir(".");
    while ((de = readdir(d))) {
        if (de->d_name[0] == '.') continue;
        printf("unlink(%s) = %d\n", de->d_name, unlink(de->d_name));
    }
    closedir(d);
    // show that unlink works if you know the real filename
    printf("unlink(%s) = %d\n", fname, unlink(fname));
    return 0;
}
This outputs (piped through hexdump -C):
00000000 75 6e 6c 69 6e 6b 28 e2 8e b3 ed bf bf 29 20 3d |unlink(......) =|
00000010 20 2d 31 0a 75 6e 6c 69 6e 6b 28 f4 8f bf bf 29 | -1.unlink(....)|
00000020 20 3d 20 30 0a | = 0.|
00000025
i.e.
unlink(\xe2\x8e\xb3\xed\xbf\xbf) = -1
unlink(\xf4\x8f\xbf\xbf) = 0
This is with cygwin package 1.7.35:
$ cygcheck -c cygwin
Cygwin Package Information
Package Version Status
cygwin 1.7.35-1 OK
Windows / DOS does not have the problem:
c:\test\t>dir
Volume in drive C has no label.
Volume Serial Number is ....-....
Directory of c:\test\t
25/03/2015 14:30 <DIR> .
25/03/2015 14:30 <DIR> ..
25/03/2015 14:30 0 ??
1 File(s) 0 bytes
2 Dir(s) 39,906,525,184 bytes free
c:\test\t>del *
c:\test\t\*, Are you sure (Y/N)? y
c:\test\t>dir
Volume in drive C has no label.
Volume Serial Number is ....-....
Directory of c:\test\t
25/03/2015 14:31 <DIR> .
25/03/2015 14:31 <DIR> ..
0 File(s) 0 bytes
2 Dir(s) 39,906,525,184 bytes free
Regards
Stuart