public inbox for mauve-discuss@sourceware.org
* problems with "InputStreamReader.read" tests in "java.io.Utf8Encoding"
@ 2003-03-26 20:57 Steve Murry
From: Steve Murry @ 2003-03-26 20:57 UTC
  To: mauve-discuss

Can someone check the 'negative' testcases within
java.io.Utf8Encoding.mojo to see if these are valid tests or not?
Specifically, I'm referring to the 9 testcases with data values that are
declared 'test5_bytes' through 'test13_bytes'.  The testcases expect a
CharConversionException when decoding illegal UTF-8 byte strings.  In some
cases the UTF-8 data is incorrect and in others it represents code points
that have not yet been assigned (at least for Unicode 3.2).  As I read the
Sun API description for InputStreamReader.read, I would expect either
MalformedInputException or UTFDataFormatException to be thrown instead (the
API description doesn't seem very precise in this area).  In fact, most of
the platforms that we have run these testcases against do not throw any
type of exception at all!  Only the IBM JREs throw the expected
CharConversionException.  We also found the following paragraph in one of
the Sun bug descriptions:


>This is a bug in the tests. The specification of java.io.InputStreamReader
>does not require that an implementation throw IOExceptions on malformed
>input when decoding bytes in the UTF-8 charset. That our implementation has
>done this historically is a bug that was fixed as part of 4503732.

I'm not sure I agree with this statement myself (BTW - I believe they are
referring to one of their own internal tests, not Mauve), but I'm just
trying to get Mauve's take on all of this.  I would appreciate anyone's
thoughts.

Thanks,

Steve Murry
SAS Institute


* Re: problems with "InputStreamReader.read" tests in "java.io.Utf8Encoding"
  2003-03-27 14:09 Steve Murry
@ 2003-04-10  3:05 ` Brian Jones
From: Brian Jones @ 2003-04-10  3:05 UTC
  To: Steve Murry; +Cc: mauve-discuss

Steve Murry <stmurr@unx.sas.com> writes:

> (I'm reposting my earlier email due to format problems. Sorry 'bout that) 
> 
> 
> Can someone check the 'negative' testcases within
> java.io.Utf8Encoding.mojo to see if these are valid tests or not?
> Specifically, I'm referring to the 9 testcases with data values that
> are declared 'test5_bytes' through 'test13_bytes'.  The testcases
> expect a CharConversionException when decoding illegal UTF-8 byte
> strings.  In some cases the UTF-8 data is incorrect and in others it
> represents codepoints that have not yet been assigned (at least for
> Unicode 3.2).  As I read the Sun API description for
> InputStreamReader.read, I would expect either
> MalformedInputException or UTFDataFormatException to be thrown
> instead (the API description doesn't seem very precise in this
> area).  In fact, most of the platforms that we have run these
> testcases against do not throw any type of exception at all!  Only
> the IBM JREs throw the expected CharConversionException.  We also
> found the following paragraph in one of the Sun bug descriptions:
> 
> >This is a bug in the tests. The specification of
> >java.io.InputStreamReader does not require that an implementation
> >throw IOExceptions on malformed input when decoding bytes in the
> >UTF-8 charset. That our implementation has done this historically
> >is a bug that was fixed as part of 4503732.
> 
> I'm not sure I agree with this statement myself (BTW - I believe
> they are referring to one of their own internal tests, not Mauve),
> but I'm just trying to get Mauve's take on all of this.  I would
> appreciate anyone's thoughts.

I'm not a Character/UTF-8/Unicode expert so I don't think I can answer
your question.

Brian
-- 
Brian Jones <cbj@gnu.org>


* problems with "InputStreamReader.read" tests in "java.io.Utf8Encoding"
@ 2003-03-27 14:09 Steve Murry
  2003-04-10  3:05 ` Brian Jones
From: Steve Murry @ 2003-03-27 14:09 UTC
  To: mauve-discuss

(I'm reposting my earlier email due to format problems. Sorry 'bout that) 


Can someone check the 'negative' testcases within java.io.Utf8Encoding.mojo 
to see if these are valid tests or not? Specifically, I'm referring to the 
9 testcases with data values that are declared 'test5_bytes' through 
'test13_bytes'.  The testcases expect a CharConversionException when decoding
illegal UTF-8 byte strings.  In some cases the UTF-8 data is incorrect and 
in others it represents codepoints that have not yet been assigned (at least 
for Unicode 3.2).  As I read the Sun API description for InputStreamReader.read, 
I would expect either MalformedInputException or UTFDataFormatException to 
be thrown instead (the API description doesn't seem very precise in this area). 
In fact, most of the platforms that we have run these testcases against do not 
throw any type of exception at all!  Only the IBM JREs throw the expected 
CharConversionException.   We also found the following paragraph in one 
of the Sun bug descriptions: 

>This is a bug in the tests. The specification of 
>java.io.InputStreamReader does not require that an implementation throw 
>IOExceptions on malformed input when decoding bytes in the UTF-8 
>charset. That our implementation has done this historically is a bug 
>that was fixed as part of 4503732. 

I'm not sure I agree with this statement myself (BTW - I believe they are 
referring to one of their own internal tests, not Mauve), but I'm just 
trying to get Mauve's take on all of this.  I would appreciate anyone's thoughts. 

Thanks, 

Steve Murry 
SAS Institute 
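
For anyone who wants to compare runtimes, the behaviour in question can be
sketched with the java.nio.charset API (JDK 1.4 and later; StandardCharsets
needs Java 7).  This is only an illustrative sketch and is not taken from the
Mauve test: the class name, the helper names, and the sample bytes are all
made up for the example.  On the java.nio-based runtimes I am aware of, an
InputStreamReader created from a charset substitutes U+FFFD for malformed
input and throws nothing, while one created from a CharsetDecoder set to
CodingErrorAction.REPORT throws java.nio.charset.MalformedInputException
(a CharacterCodingException, not a CharConversionException).  Other
runtimes, including those discussed above, may well differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here comes from the Mauve sources.
class Utf8DecodeSketch {
    // Assumed sample data: 'A', a stray continuation byte (0x80), 'B'.
    static final byte[] BAD_BYTES = { 0x41, (byte) 0x80, 0x42 };

    // Drain a Reader one char at a time, as InputStreamReader.read() would.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Reader built from a Charset: malformed input is replaced, not reported.
    static String decodeLenient(byte[] bytes) throws IOException {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            return readAll(r);
        }
    }

    // Reader built from a decoder that is told to REPORT errors:
    // malformed input surfaces as a CharacterCodingException.
    static String decodeStrict(byte[] bytes) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), decoder)) {
            return readAll(r);
        }
    }

    public static void main(String[] args) throws IOException {
        String lenient = decodeLenient(BAD_BYTES);
        System.out.println("lenient length: " + lenient.length());
        try {
            decodeStrict(BAD_BYTES);
            System.out.println("strict: no exception");
        } catch (CharacterCodingException e) {
            System.out.println("strict: " + e.getClass().getName());
        }
    }
}
```

Running main on each JRE under test shows quickly whether that runtime
replaces or reports malformed UTF-8.  Note that the strict path only covers
malformed byte sequences; well-formed bytes that encode an unassigned code
point are a separate case and are not rejected by this decoder setup.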
