* Old UTF16 patch
From: Elena Zannoni @ 2007-11-01 23:31 UTC
To: gcc; +Cc: Tom Tromey
Hi,
does anybody know if this patch ever got merged into GCC, or if UTF-16
is currently supported?
ftp://ftp.sap.com/pub/i18N/utf16/ugcc-3.2/README
Tom, I saw you replied to this thread, so maybe you know about this:
http://mail.nl.linux.org/linux-utf8/2001-07/msg00064.html
I believe the patch was originally from SUSE. If it hasn't been merged,
I'll do some more digging and see if somebody from Oracle can integrate
it. My understanding is that it hasn't been integrated yet.
thanks
elena
* Re: Old UTF16 patch
From: Joseph S. Myers @ 2007-11-02 0:08 UTC
To: Elena Zannoni; +Cc: gcc, Tom Tromey
I haven't followed any developments relating to TR19769 in WG14 after its
publication in detail; has WG14 yet given an answer on what should be done
with u'C' where C represents a single character that requires a surrogate
pair to represent in UTF-16 (to name one noted place where the TR
underspecifies things)?
I don't think there's much worthwhile in those old patches. Start with
the ISO TR text, produce testcases that cover everything there and the
desired semantics for everything the TR leaves unspecified or
underspecified, and only once the testcases are settled work out an
implementation for the agreed semantics.
A TR is not a standard, so for C this must be disabled in all strict
conformance modes (note that it affects the rules for lexing and so
changes the semantics of conforming programs); likewise for C++98. The
C++0x draft includes the notation from TR19769, so the feature should be
enabled by default in C++0x (and so far as the C TR is compatible with
C++0x, both should be followed in both C and C++ when the feature is
enabled).
--
Joseph S. Myers
joseph@codesourcery.com
* Re: Old UTF16 patch
From: Lawrence Crowl @ 2007-11-06 20:53 UTC
To: Joseph S. Myers; +Cc: Elena Zannoni, gcc, Tom Tromey
On 11/1/07, Joseph S. Myers <joseph@codesourcery.com> wrote:
> I haven't followed any developments relating to TR19769 in WG14
> after its publication in detail; has WG14 yet given an answer
> on what should be done with u'C' where C represents a single
> character that requires a surrogate pair to represent in UTF-16
> (to name one noted place where the TR underspecifies things)?
Pending such an answer, I think gcc should make such characters
ill-formed. The text in the C TR is "The corresponding character
constant is denoted by u'c-char-sequence' and has the type char16_t."
Given that surrogate pairs are unrepresentable in that type, I
conclude that the intent was to make character literals requiring
surrogates ill-formed. The C++0x draft also makes such characters
ill-formed. Furthermore, making them ill-formed will be upward
compatible should the C committee choose some other interpretation.
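
To make the surrogate question concrete, here is a minimal C++11 sketch of
the arithmetic involved. It is only an illustration: the names
needs_surrogate_pair, Utf16Pair, to_surrogates and utf16_units are
hypothetical and come from neither the TR nor any GCC patch. It shows why a
code point above U+FFFF cannot fit in a single char16_t, which is the case
a u'...' literal would have to reject under the "ill-formed" reading.

```cpp
#include <cassert>

// Hypothetical helper (not from the TR or GCC): true if `cp` lies above the
// Basic Multilingual Plane and so needs two UTF-16 code units.
constexpr bool needs_surrogate_pair(char32_t cp) {
    return cp > 0xFFFF && cp <= 0x10FFFF;
}

// A high/low surrogate pair; two char16_t values, never one.
struct Utf16Pair {
    char16_t high;
    char16_t low;
};

// Standard UTF-16 encoding of a supplementary-plane code point.
// Precondition: needs_surrogate_pair(cp) is true.
constexpr Utf16Pair to_surrogates(char32_t cp) {
    return Utf16Pair{
        static_cast<char16_t>(0xD800 + ((cp - 0x10000) >> 10)),
        static_cast<char16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF))};
}

// Number of UTF-16 code units needed for one code point.
inline int utf16_units(char32_t cp) {
    assert(cp <= 0x10FFFF && "not a Unicode code point");
    return needs_surrogate_pair(cp) ? 2 : 1;
}

// U+00E9 fits in one char16_t, so u'\u00E9' is unproblematic.
static_assert(!needs_surrogate_pair(U'\u00E9'), "BMP character");

// U+1D11E (MUSICAL SYMBOL G CLEF) encodes as D834 DD1E, which is two code
// units; a single char16_t character constant cannot hold it.
static_assert(needs_surrogate_pair(U'\U0001D11E'), "supplementary character");
static_assert(to_surrogates(U'\U0001D11E').high == 0xD834, "high surrogate");
static_assert(to_surrogates(U'\U0001D11E').low == 0xDD1E, "low surrogate");
```

A front end following the interpretation above would presumably apply a
check like needs_surrogate_pair to the single character of a u'...'
constant and issue a diagnostic when it returns true.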
> A TR is not a standard, so for C this must be disabled in all strict
> conformance modes (note that it affects the rules for lexing and so
> changes the semantics of conforming programs); likewise for C++98.
> The C++0x draft includes the notation from TR19769, so the feature
> should be enabled by default in C++0x (and so far as the C TR is
> compatible with C++0x, both should be followed in both C and C++
> when the feature is enabled).
Note that char16_t and char32_t are typedefs in C but primitive types
in C++, just like wchar_t.
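
One visible consequence of that difference, sketched in C++11 (the function
name kind is hypothetical, purely for illustration): because char16_t is a
distinct type in C++ rather than a typedef, it can be overloaded separately
from the integer type it shares a representation with. With the C typedef,
the two declarations below would name the same parameter type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// In C++ char16_t is its own type, even though it has the same size as
// uint_least16_t.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is a distinct type in C++");
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t),
              "same size as the underlying integer type");

// Two separate overloads; this is only possible because the types differ.
inline std::string kind(char16_t) { return "char16_t"; }
inline std::string kind(std::uint_least16_t) { return "uint_least16_t"; }
```

Calling kind(u'x') selects the first overload, while passing a value of
type std::uint_least16_t selects the second.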
--
Lawrence Crowl