From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Joseph S. Myers"
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org
Subject: Re: java/2319: invalid UTF-8 sequences should be rejected
Date: Mon, 19 Mar 2001 09:06:00 -0000
Message-id: <20010319170601.22568.qmail@sourceware.cygnus.com>
X-SW-Source: 2001-03/msg00132.html
List-Id:

The following reply was made to PR java/2319; it has been noted by GNATS.

From: "Joseph S. Myers"
To:
Cc: ,
Subject: Re: java/2319: invalid UTF-8 sequences should be rejected
Date: Mon, 19 Mar 2001 17:00:47 +0000 (GMT)

On 19 Mar 2001 tromey@redhat.com wrote:

> Currently the compiler accepts invalid UTF-8 sequences
> when reading a file. Instead we ought to diagnose
> such sequences as errors.

Also note that the invalid sequences that should be rejected include
over-long sequences and UTF-8 encodings that would map to values in the
UTF-16 surrogate range.

http://www.cl.cam.ac.uk/~mgk25/unicode.html
http://www.unicode.org/unicode/uni2errata/UTF-8_Corrigendum.html

-- 
Joseph S. Myers
jsm28@cam.ac.uk
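
For illustration only, the two classes of invalid input named above (over-long sequences, and encodings of values in the UTF-16 surrogate range) could be caught with a check along the following lines. This is a minimal sketch in Python, not the compiler's actual reader; the function name `decode_utf8_strict` and the table layout are hypothetical, and the sketch assumes the input is limited to the range U+0000..U+10FFFF.

```python
def decode_utf8_strict(data: bytes) -> list:
    """Decode UTF-8 bytes into code points; raise ValueError on invalid input.

    Hypothetical helper for illustration, not taken from any compiler source.
    """
    # (lead-byte mask, lead-byte pattern, continuation bytes, smallest legal value)
    forms = [
        (0x80, 0x00, 0, 0x0000),   # 0xxxxxxx
        (0xE0, 0xC0, 1, 0x0080),   # 110xxxxx 10xxxxxx
        (0xF0, 0xE0, 2, 0x0800),   # 1110xxxx 10xxxxxx 10xxxxxx
        (0xF8, 0xF0, 3, 0x10000),  # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    ]
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        for mask, pattern, extra, minimum in forms:
            if (lead & mask) == pattern:
                break
        else:
            # Lone continuation byte, or a lead byte of the form 11111xxx.
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")

        if i + extra >= len(data):
            raise ValueError(f"truncated sequence at offset {i}")

        # Payload bits of the lead byte, then six bits per continuation byte.
        cp = lead & ~mask & 0xFF
        for b in data[i + 1 : i + 1 + extra]:
            if (b & 0xC0) != 0x80:
                raise ValueError(f"bad continuation byte 0x{b:02X} after offset {i}")
            cp = (cp << 6) | (b & 0x3F)

        if cp < minimum:
            # The value would have fit in a shorter form: over-long sequence.
            raise ValueError(f"over-long encoding of U+{cp:04X} at offset {i}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"encoded UTF-16 surrogate U+{cp:04X} at offset {i}")
        if cp > 0x10FFFF:
            raise ValueError(f"value 0x{cp:X} beyond U+10FFFF at offset {i}")

        out.append(cp)
        i += 1 + extra
    return out
```

With this sketch, `b"\xc0\x80"` (an over-long encoding of U+0000) and `b"\xed\xa0\x80"` (an encoding of the surrogate U+D800) both raise `ValueError`, while well-formed input such as `b"\xe2\x82\xac"` (U+20AC) decodes normally. The minimum-value column is what detects over-long forms: each sequence length has a smallest value it may legitimately carry.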