From: Warren Young <wyml@etr-usa.com>
To: cygwin@cygwin.com
Subject: Re: With bad UTF-8, cygwin can create files it can't read
Date: Wed, 01 Apr 2015 16:01:00 -0000
Message-ID: <F7BC8B64-DE90-4F01-9C8F-2BB3511B4EF5@etr-usa.com>
In-Reply-To: <20150401133401.GV13285@calimero.vinschen.de>
On Apr 1, 2015, at 7:34 AM, Corinna Vinschen <corinna-cygwin@cygwin.com> wrote:
>
> As you probably know, Unicode values beyond the base plane (that is,
> everything > 0xffff in UTF-32 and > ef bf bf in UTF-8 notation)
> are represented as so-called surrogate pairs in UTF-16, two UTF-16
> values in the 0xd800 - 0xdfff range.
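To make the quoted point concrete, here is a minimal Python sketch of the encodings involved. The helper name utf16_units and the sample character U+1F600 are my own choices for illustration; nothing here is Cygwin code.

```python
def utf16_units(ch):
    """Return the UTF-16 code units for a one-character string."""
    data = ch.encode("utf-16-be")
    # Each UTF-16 code unit is two bytes, big-endian here.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_max = "\uffff"        # last code point of the base plane
beyond = "\U0001F600"     # an arbitrary code point above 0xffff

# In the base plane: three UTF-8 bytes, a single UTF-16 unit.
print(bmp_max.encode("utf-8").hex(), [hex(u) for u in utf16_units(bmp_max)])

# Beyond it: four UTF-8 bytes, and two UTF-16 units that both fall
# in the 0xd800 - 0xdfff surrogate range.
print(beyond.encode("utf-8").hex(), [hex(u) for u in utf16_units(beyond)])
```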
I ran across a similar strangeness in Unicode earlier today. Does Cygwin cope with, or care about, Unicode normalization forms?
http://goo.gl/jnsqhC
For example, will open(2) cope with any UTF-8 form of a string that you could pass in UTF-16 encoding to CreateFile()?
You could imagine, say, a web app getting a string from a user, then using that to access a file on disk. A different browser given the “same” string could result in a different series of bytes passed to the Cygwin POSIX layer.
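As a sketch of what I mean, the following Python shows one visible file name yielding two different UTF-8 byte strings, depending on whether the sender used the composed (NFC) or decomposed (NFD) form. The file name and the helpers utf8_forms and same_name are made up for illustration; this says nothing about what Cygwin's open(2) actually does with either form.

```python
import unicodedata

def utf8_forms(name):
    """Return the UTF-8 bytes of name under NFC and NFD normalization."""
    return {
        form: unicodedata.normalize(form, name).encode("utf-8")
        for form in ("NFC", "NFD")
    }

def same_name(a, b, form="NFC"):
    """Compare two names after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

forms = utf8_forms("caf\u00e9.txt")   # hypothetical file name
print(forms["NFC"])   # e-acute as one code point, U+00E9
print(forms["NFD"])   # plain 'e' followed by combining acute, U+0301

# The byte strings differ, yet the names match once normalized.
print(forms["NFC"] == forms["NFD"])
print(same_name("caf\u00e9.txt", "cafe\u0301.txt"))
```

An application that wanted both browsers to reach the same file could normalize the name to one form before calling open(2), which sidesteps the question of what the layers underneath do.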