From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mauve-discuss-return-488-listarch-mauve-discuss=sources.redhat.com@sources.redhat.com>
Received: (qmail 24639 invoked by alias); 14 Feb 2003 23:58:31 -0000
Mailing-List: contact mauve-discuss-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:mauve-discuss-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/mauve-discuss/>
List-Post: <mailto:mauve-discuss@sources.redhat.com>
List-Help: <mailto:mauve-discuss-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: mauve-discuss-owner@sources.redhat.com
Received: (qmail 24578 invoked from network); 14 Feb 2003 23:58:30 -0000
Received: from unknown (HELO delenn.fl.net.au) (202.181.0.28)
  by 172.16.49.205 with SMTP; 14 Feb 2003 23:58:30 -0000
Received: from solomon (a3-p48.syd.fl.net.au [202.181.1.112])
	by delenn.fl.net.au (Postfix) with ESMTP
	id 887E317FE00; Sat, 15 Feb 2003 11:05:54 +1100 (EST)
Content-Type: text/plain;
  charset="utf-8"
From: "Raif S. Naffah" <raif@fl.net.au>
Reply-To: raif@fl.net.au
To: Mark Wielaard <mark@klomp.org>
Subject: Re: new test cases (long)
Date: Fri, 14 Feb 2003 23:58:00 -0000
User-Agent: KMail/1.4.3
Cc: Mauve <mauve-discuss@sources.redhat.com>
References: <200302090318.02317.raif@fl.net.au> <1045223836.30202.326.camel@elsschot>
In-Reply-To: <1045223836.30202.326.camel@elsschot>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Message-Id: <200302151101.21183.raif@fl.net.au>
X-SW-Source: 2003-q1/txt/msg00025.txt.bz2

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

hello Mark,

On Friday 14 February 2003 22:57, Mark Wielaard wrote:
> Hi Raif,
>
> On Sat, 2003-02-08 at 17:17, Raif S. Naffah wrote:
> > the tests are to ensure that the mandated (as per public Javadoc
> > 1.3.1 and 1.4.1) minimal character encodings are supported by the
> > bytecode interpreter.
> > [...]
> > +	* gnu/testlet/java/lang/String/getBytes14: new test
>
> Here you test for "ISO8859_15". I looked here:
> http://java.sun.com/j2se/1.4.1/docs/api/java/nio/charset/Charset.html
> but couldn't see where it said this is a required character set.

yes, it is not listed there.  but i refer you to the
<.../j2sdk1.4.1/docs/guide/intl/encoding.doc.html> page of the public
documentation of sun's jdk-1.4.1; 2nd paragraph:

"Sun's Java 2 Software Development Kit, Standard Edition, v. 1.4.1 for
all platforms (Solaris™ operating environment, Linux, and Microsoft
Windows) and the Java 2 Runtime Environment, Standard Edition, v. 1.4.1
for Solaris and Linux support all encodings shown on this page..."

and further down the same page, there is a table giving the "Basic
Encoding Set (contained in lib/rt.jar) - Supported by java.nio, java.io
and java.lang APIs."  in the "Canonical Name for java.io and java.lang
API" column, next to the ISO-8859-15 row entry, there is a reference to
"extended encoding set."  i took this to mean that the canonical name
is to be taken from the second set; i.e. the extended encoding set.

there are two possible deductions from this page:

a. "ISO-8859-15" is a MUST encoding in java.nio, as well as in java.io
and java.lang, but in the last two the canonical name is as stated in
the "extended set", i.e. "ISO8859_15" (ISO 8859-15, Latin alphabet No.
9), and hence the supporting classes are in charsets.jar rather than in
rt.jar.

b. "ISO-8859-15" is only a MUST encoding in java.nio, but not in java.io
or java.lang.

i adopted the first.

there is of course the 3rd possibility that the writer(s) of these
documentation pages contradicted themselves.

the code itself (sun's jdk 1.4.1_01) does support ISO-8859-15, which
can be thought of as the litmus test.
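
a minimal sketch of that litmus test, assuming a stock JDK: it asks the
java.lang API for the extended-set name "ISO8859_15" and asks java.nio
for "ISO-8859-15".  the class and method names (Latin9Check, encodeEuro,
nioSupports) are made up for illustration and are not taken from the
Mauve test cases.  the euro sign is used as the probe because ISO
8859-15 maps it to the single byte 0xA4, whereas ISO 8859-1 has no
mapping for it.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class Latin9Check {
    // Encodes the euro sign (U+20AC) through the java.lang API.
    // Throws UnsupportedEncodingException if the runtime does not know the name.
    static byte[] encodeEuro(String encodingName) throws UnsupportedEncodingException {
        return "\u20AC".getBytes(encodingName);
    }

    // Asks java.nio whether the name (canonical or alias) is recognised.
    static boolean nioSupports(String charsetName) {
        return Charset.isSupported(charsetName);
    }

    public static void main(String[] args) {
        // Names discussed in this thread; both are probed through both APIs.
        String[] names = { "ISO8859_15", "ISO-8859-15" };
        for (String name : names) {
            System.out.println("java.nio knows " + name + ": " + nioSupports(name));
            try {
                byte[] bytes = encodeEuro(name);
                System.out.println("java.lang encodes the euro sign with " + name
                        + " as 0x" + Integer.toHexString(bytes[0] & 0xFF));
            } catch (UnsupportedEncodingException e) {
                System.out.println("java.lang does not support " + name);
            }
        }
    }
}
```

a runtime that rejects the name in encodeEuro would point towards
deduction b for that runtime; one that accepts it is consistent with a.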


> Is it really required or just nice to have since the Sun
> implementation supports it? (Which might still be a good reason to
> add them to Mauve, but then I would like to label them explicitly as
> such.)

my interpretation was that it is a MUST.


> Also you seem to test (in getBytes13) for the "historical names" for
> which I couldn't find a definition.

the relevant javadoc page in sun's jdk 1.3.1_06
<.../jdk1.3.1/docs/guide/intl/encoding.doc.html> lists the required
encodings:

"...Sun's Java 2 Runtime Environment, Standard Edition, v. 1.3.1 for
Windows comes in two different versions: US-only and international. The
US-only version only supports the encodings shown in the first table.
The international version (which includes the lib\i18n.jar file)
supports all encodings shown on this page."

it then proceeds to list the "Basic Encoding Set" (contained in rt.jar)
where those names are defined.

the only difference from the 1.4.1 set is Latin Alphabet No. 9.


>... Do you know where they are
> specified? InputStreamReader and OutputStreamWriter getEncoding() are
> supposed to return them but they don't document what they actually
> look like.

the references sun cites are:

* The Unicode standard
<http://www.unicode.org/unicode/standard/standard.html>, and
* The Unicode FAQ <http://www.unicode.org/unicode/faq>.
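
short of a written definition, the names can at least be observed.  the
sketch below opens an InputStreamReader over an empty stream and prints
what getEncoding() reports for each requested name.  the class and
method names (EncodingNameProbe, reportedName) are hypothetical, the
try-with-resources syntax assumes a much later Java than 1.3/1.4, and
the names printed are whatever the runtime at hand returns, so no
particular output is claimed here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class EncodingNameProbe {
    // Opens a reader with the requested encoding name and returns the
    // name that getEncoding() reports for it on this runtime.
    static String reportedName(String requestedName) throws IOException {
        try (InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), requestedName)) {
            return reader.getEncoding();
        }
    }

    public static void main(String[] args) throws IOException {
        // Canonical java.nio names; the right-hand side is implementation-defined.
        String[] names = { "ISO-8859-1", "UTF-8", "US-ASCII" };
        for (String name : names) {
            System.out.println(name + " -> " + reportedName(name));
        }
    }
}
```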


cheers;
rsn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Que du magnifique

iD8DBQE+TYNP+e1AKnsTRiERA/LLAKCrscVg83sy882JsHImp/ybGSipCgCgquAV
4VLd68TlTPVpPV5w296qSCc=
=X4EK
-----END PGP SIGNATURE-----