From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mauve-discuss-return-505-listarch-mauve-discuss=sources.redhat.com@sources.redhat.com>
Received: (qmail 5733 invoked by alias); 26 Mar 2003 20:57:39 -0000
Mailing-List: contact mauve-discuss-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:mauve-discuss-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/mauve-discuss/>
List-Post: <mailto:mauve-discuss@sources.redhat.com>
List-Help: <mailto:mauve-discuss-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: mauve-discuss-owner@sources.redhat.com
Received: (qmail 5714 invoked from network); 26 Mar 2003 20:57:38 -0000
Received: from unknown (HELO merc61.na.sas.com) (149.173.6.14)
  by sources.redhat.com with SMTP; 26 Mar 2003 20:57:38 -0000
Received: from merc18.na.sas.com ([10.16.12.224]) by 10.19.11.2 with InterScan Messaging Security Suite; Wed, 26 Mar 2003 15:57:37 -0500
X-MimeOLE: Produced By Microsoft Exchange V6.0.6410.0
Content-Class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: problems with "InputStreamReader.read" tests in "java.io.Utf8Encoding"
Date: Wed, 26 Mar 2003 20:57:00 -0000
Message-ID: <F2E670D5036BE14E89473A3FAEDE6ACE01BD6C28@merc18.na.sas.com>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
From: "Steve Murry" <Steve.Murry@sas.com>
To: <mauve-discuss@sources.redhat.com>
X-SW-Source: 2003-q1/txt/msg00042.txt.bz2

Can someone check the 'negative' testcases within
java.io.Utf8Encoding.mojo to see if these are valid tests or not?
Specifically, I'm referring to the 9 testcases with data values that are
declared 'test5_bytes' through 'test13_bytes'.  The testcases expect a
CharConversionException when decoding illegal UTF-8 byte strings.  In some
cases the UTF-8 data is incorrect and in others it represents codepoints
that have not yet been assigned (at least for Unicode 3.2).  As I read the
Sun API description for InputStreamReader.read, I would expect either
MalformedInputException or UTFDataFormatException to be thrown instead (the
API description doesn't seem very precise in this area).  In fact, most of
the platforms that we have run these testcases against do not throw any
type of exception at all!  Only the IBM JREs throw the expected
CharConversionException.  We also found the following paragraph in one of
the Sun bug descriptions:


>This is a bug in the tests. The specification of java.io.InputStreamReader
>does not require that an implementation throw IOExceptions on malformed
>input when decoding bytes in the UTF-8 charset. That our implementation has
>done this historically is a bug that was fixed as part of 4503732.

I'm not sure I agree with this statement myself (BTW - I believe they are
referring to one of their own internal tests, not Mauve), but I'm just
trying to get Mauve's take on all of this.  I would appreciate anyone's
thoughts.
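
For anyone who wants to compare JREs, here is a minimal sketch of the kind
of probe I have in mind.  It is not the Mauve testcase: the class name
MalformedUtf8Probe and the two-byte sample (an overlong encoding of U+0000)
are made up for illustration and are not taken from 'test5_bytes' through
'test13_bytes'.  It assumes the java.nio.charset API from J2SE 1.4.  The
first reader is built the usual way, so whatever it does with bad input is
up to the implementation; the second is handed a decoder that has been
explicitly told to report malformed input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

class MalformedUtf8Probe {
    // Hypothetical sample: 0xC0 0x80 is an overlong (illegal) encoding of U+0000.
    static final byte[] BAD = { (byte) 0xC0, (byte) 0x80 };

    // Reads the reader to end of stream. Returns "ok:<chars read>" on success,
    // or the class name of the IOException that read() threw.
    static String drain(Reader r) {
        try {
            int n = 0;
            while (r.read() != -1) {
                n++;
            }
            return "ok:" + n;
        } catch (IOException e) {
            return e.getClass().getName();
        }
    }

    // Plain InputStreamReader: handling of malformed input is left to the JRE.
    static String lenient() {
        return drain(new InputStreamReader(
                new ByteArrayInputStream(BAD), Charset.forName("UTF-8")));
    }

    // InputStreamReader over a decoder configured to report errors
    // instead of replacing or ignoring the offending bytes.
    static String strict() {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return drain(new InputStreamReader(new ByteArrayInputStream(BAD), decoder));
    }

    public static void main(String[] args) {
        System.out.println("default reader: " + lenient());
        System.out.println("REPORT decoder: " + strict());
    }
}
```

The second case throws a java.nio.charset exception rather than anything
from java.io, so it does not settle what the testcases should expect; it
only separates "the decoder noticed the bad bytes" from "the reader chose
to surface that as an exception".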

Thanks,

Steve Murry
SAS Institute