From: Andy Koppe <andy.koppe@gmail.com>
To: cygwin@cygwin.com, bug-gnulib@gnu.org, bug-coreutils@gnu.org
Subject: Re: 16-bit wchar_t on Windows and Cygwin
Date: Wed, 02 Feb 2011 20:28:00 -0000
Message-ID: <AANLkTinVjgOWPar+8prQA2aE4FphJcm-Y1oq3c1D_wta@mail.gmail.com>
In-Reply-To: <20110202163516.GI2675@calimero.vinschen.de>

On 2 February 2011 16:35, Corinna Vinschen wrote:
> On Feb  2 17:28, Corinna Vinschen wrote:
>> On Feb  2 17:02, Bruno Haible wrote:
>> > But if you say that the application should convert UTF-16 surrogates
>> > to UTF-32 before calling iswalpha: That's certainly a requirement
>> > for Cygwin 1.7.x applications that want to support the entire Unicode
>> > character set. But it's outside of POSIX, and many GNU programs will
>> > not want to include this added complexity. Just try to apply this
>> > suggestion to gnulib's quotearg.c, then estimate the time someone
>> > would need to apply it also to regcomp.c, strftime.c, mbscasestr.c,
>> > coreutils/src/wc.c, and so on.
>>
>> Cygwin's regcomp is taken from FreeBSD and is UTF-16 capable, including
>> surrogate handling.  It only required two changes in the code.
>
> Btw., I would be sure glad if Cygwin would use a wchar_t of 4 bytes as
> well.  The problem is that this requires too many changes at once to
> work right, and it would introduce a lot of backward compatibility
> problems which would have to be handled.

Cygwin 1.7 might have been a good point for that change, because the
lack of proper locale and charset support in previous versions meant
that backward compatibility was much less of a concern than it is now.
But it's a difficult change indeed, and it's not entirely clear that
it's worthwhile. I guess 64-bit Cygwin (if or when it happens) might
be the next opportunity.

> If only the ones who decided that wchar_t in Cygwin should have the
> same size as WCHAR_T in the underlying Windows would have thought twice
> about the implications...

Windows Unicode support was introduced with Windows NT in 1993,
whereas Unicode was only extended beyond 16 bits with version 2.0 in
1996. Cygwin was first released in 1995, the year before that. If the Unicode
extension was a consideration at all (which I'd doubt), wchar_t !=
WCHAR probably seemed far more daunting than having to deal with
surrogates at some point down the line.
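
For reference, "dealing with surrogates" on a platform with 16-bit
code units comes down to something like the following. This is only a
minimal sketch in C, not code from Cygwin, newlib, or gnulib; the
helper names (is_high_surrogate, is_low_surrogate, combine_surrogates,
utf16_next) are made up for illustration, and replacing unpaired
surrogates with U+FFFD is my assumption about error handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not an existing library API. */

/* High (lead) surrogates occupy U+D800..U+DBFF. */
static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

/* Low (trail) surrogates occupy U+DC00..U+DFFF. */
static int is_low_surrogate(uint16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

/* Combine a valid surrogate pair into a code point in U+10000..U+10FFFF. */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u + ((uint32_t)(hi - 0xD800u) << 10) + (uint32_t)(lo - 0xDC00u);
}

/*
 * Decode one code point from the UTF-16 units s[0..n).
 * Stores the result in *cp and returns the number of units consumed
 * (0 only if n == 0).  An unpaired surrogate is reported as U+FFFD;
 * that policy is an assumption of this sketch.
 */
static size_t utf16_next(const uint16_t *s, size_t n, uint32_t *cp)
{
    if (n == 0)
        return 0;
    if (is_high_surrogate(s[0]) && n >= 2 && is_low_surrogate(s[1])) {
        *cp = combine_surrogates(s[0], s[1]);
        return 2;
    }
    if (is_high_surrogate(s[0]) || is_low_surrogate(s[0])) {
        *cp = 0xFFFDu;  /* unpaired surrogate */
        return 1;
    }
    *cp = s[0];         /* ordinary BMP character */
    return 1;
}
```

The arithmetic itself is small. The cost Bruno describes is that every
loop that currently steps through a wchar_t string one element at a
time would need to go through something like utf16_next instead, and
would then need a classification function that accepts the full 32-bit
value.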

Andy
