public inbox for gcc-prs@sourceware.org
From: jjc@jclark.com
To: gcc-gnats@gcc.gnu.org
Subject: libgcj/9802: Bug in surrogate handling in Unicode to UTF-8 conversion
Date: Sat, 22 Feb 2003 09:56:00 -0000
Message-ID: <20030222095110.15975.qmail@sources.redhat.com>

>Number:         9802
>Category:       libgcj
>Synopsis:       Bug in surrogate handling in Unicode to UTF-8 conversion
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 22 09:56:01 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     jjc@jclark.com
>Release:        gcc version 3.3 20030217 (prerelease)
>Organization:
>Environment:
Red Hat Linux 8.0
>Description:
The following program

class Bug {
  static public char surrogate1(int c) {
    return (char)(((c - 0x10000) >> 10) | 0xD800);
  }

  static public char surrogate2(int c) {
    return (char)(((c - 0x10000) & 0x3FF) | 0xDC00);
  }

  static public void main(String[] args) throws java.io.UnsupportedEncodingException {
    int ch = 0x10300;
    char[] v = new char[2];
    v[0] = surrogate1(ch);
    v[1] = surrogate2(ch);
    String str = new String(v);
    str.getBytes("UTF-8");
  }
}

when compiled and executed throws an exception

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
   at gnu.gcj.convert.Output_UTF8.write(char[], int, int) (/home/jjc/gcc/lib/libgcj.so.4.0.0)
   at gnu.gcj.convert.UnicodeToBytes.write(java.lang.String, int, int, char[]) (/home/jjc/gcc/lib/libgcj.so.4.0.0)
   at java.lang.String.getBytes(java.lang.String) (/home/jjc/gcc/lib/libgcj.so.4.0.0)
   at Bug.main(java.lang.String[]) (Unknown Source)
>How-To-Repeat:
>Fix:
I haven't tested this, but I suspect the following should fix it:

*** gcc/libjava/gnu/gcj/convert/Output_UTF8.java~	2000-08-09 00:35:32.000000000 +0700
--- gcc/libjava/gnu/gcj/convert/Output_UTF8.java	2003-02-22 16:38:52.000000000 +0700
***************
*** 104,109 ****
--- 104,110 ----
  	      {
  		value = (hi_part - 0xD800) * 0x400 + (ch - 0xDC00) + 0x10000;
  		buf[count++] = (byte) (0xF0 | (value >> 18));
+ 		avail--;
  		bytes_todo = 3;
  		hi_part = 0;
  	      }
>Release-Note:
>Audit-Trail:
>Unformatted:
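
For reference, the arithmetic in the report can be checked independently of libgcj. The following is only a minimal sketch, not the actual Output_UTF8 code: the class name SurrogateUtf8Sketch and the helpers highSurrogate, lowSurrogate, combine and encodeSupplementary are hypothetical names chosen for illustration, and the way `avail` is tracked here is an assumption about what the counter is meant to represent (bytes of buffer space still free). It splits U+10300 into a surrogate pair, recombines it with the expression from the patch, and emits the four UTF-8 bytes while decrementing `avail` once for every byte stored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SurrogateUtf8Sketch {
    // Same arithmetic as surrogate1/surrogate2 in the report.
    static char highSurrogate(int cp) {
        return (char) (((cp - 0x10000) >> 10) | 0xD800);
    }

    static char lowSurrogate(int cp) {
        return (char) (((cp - 0x10000) & 0x3FF) | 0xDC00);
    }

    // Recombination expression taken from the patch context.
    static int combine(char hi, char lo) {
        return (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
    }

    // Hypothetical helper: encode one supplementary code point (U+10000..U+10FFFF)
    // into a 4-byte buffer, counting down the free space for every byte stored.
    static byte[] encodeSupplementary(int value) {
        byte[] buf = new byte[4];
        int count = 0;
        int avail = buf.length; // assumed meaning: bytes of space still free

        buf[count++] = (byte) (0xF0 | (value >> 18)); // lead byte
        avail--;                                      // the step the patch adds

        // Three continuation bytes, 6 bits each, most significant first.
        for (int shift = 12; shift >= 0; shift -= 6) {
            buf[count++] = (byte) (0x80 | ((value >> shift) & 0x3F));
            avail--;
        }
        if (avail != 0) {
            throw new IllegalStateException("space accounting is off by " + avail);
        }
        return buf;
    }

    public static void main(String[] args) {
        int ch = 0x10300;
        char hi = highSurrogate(ch);
        char lo = lowSurrogate(ch);
        byte[] manual = encodeSupplementary(combine(hi, lo));
        byte[] reference = new String(new char[] { hi, lo }).getBytes(StandardCharsets.UTF_8);

        StringBuilder sb = new StringBuilder();
        for (byte b : manual) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(sb.toString().trim());
        System.out.println("matches JDK encoder: " + Arrays.equals(manual, reference));
    }
}
```

For U+10300 the surrogate pair is D800 DF00 and the UTF-8 form is F0 90 8C 80. If the lead byte is stored without the matching decrement, a converter that relies on the counter would believe it has one more byte of room than it really does, which is consistent with the out-of-bounds index in the stack trace above.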
Thread overview: 3+ messages
2003-02-22  9:56 jjc [this message]
2003-02-22 13:46 Mark Wielaard
2003-02-22 14:56 James Clark