Mailing-List: contact mauve-discuss-help@sources.redhat.com; run by ezmlm
From: "Raif S. Naffah"
Reply-To: raif@fl.net.au
To: Mark Wielaard
Cc: Mauve
Subject: Re: new test cases (long)
Date: Fri, 14 Feb 2003 23:58:00 -0000
Message-Id: <200302151101.21183.raif@fl.net.au>
References: <200302090318.02317.raif@fl.net.au> <1045223836.30202.326.camel@elsschot>
In-Reply-To: <1045223836.30202.326.camel@elsschot>
X-SW-Source: 2003-q1/txt/msg00025.txt.bz2

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

hello Mark,

On Friday 14 February 2003 22:57, Mark Wielaard wrote:
> Hi Raif,
>
> On Sat, 2003-02-08 at 17:17, Raif S. Naffah wrote:
> > the tests are to ensure that the mandated (as per public Javadoc
> > 1.3.1 and 1.4.1) minimal character encodings are supported by the
> > bytecode interpreter.
> > [...]
> > + * gnu/testlet/java/lang/String/getBytes14: new test
>
> Here you test for "ISO8859_15". I looked here:
> http://java.sun.com/j2se/1.4.1/docs/api/java/nio/charset/Charset.html
> but couldn't see where it said this is a required character set.

yes, it is not listed there.
but i refer you to the
<.../j2sdk1.4.1/docs/guide/intl/encoding.doc.html> page of the public
documentation of sun's jdk-1.4.1; 2nd paragraph:

"Sun's Java 2 Software Development Kit, Standard Edition, v. 1.4.1 for
all platforms (Solaris(TM) operating environment, Linux, and Microsoft
Windows) and the Java 2 Runtime Environment, Standard Edition, v. 1.4.1
for Solaris and Linux support all encodings shown on this page..."

and further down the same page, a table giving the "Basic Encoding Set
(contained in lib/rt.jar) - Supported by java.nio, java.io and
java.lang APIs."

in the "Canonical Name for java.io and java.lang API" column, next to
the ISO-8859-15 row entry, there is a reference to the "extended
encoding set." i took this to mean that the canonical name is to be
taken from the second set; i.e. the extended encoding set.

there are two possible deductions from this page:

a. "ISO-8859-15" is a MUST encoding in java.nio, as well as in java.io
   and java.lang, but in the last two the canonical name is as stated
   in the "extended set", i.e. "ISO8859_15" (ISO 8859-15, Latin
   alphabet No. 9), and hence the supporting classes are in
   charsets.jar rather than in rt.jar.

b. "ISO-8859-15" is only a MUST encoding in java.nio, but not in
   java.io nor java.lang.

i adopted the first.

there is of course the 3rd possibility that the writer(s) of these
documentation pages contradict each other.

the code itself (for sun's jdk 1.4.1_01) does support ISO-8859-15,
which can be thought of as the litmus test.

> Is it really required or just nice to have since the Sun
> implementation supports it? (Which might still be a good reason to
> add them to Mauve, but then I would like to label them explicitly as
> such.)

my interpretation of it was that it is a MUST.

> Also you seem to test (in getBytes13) for the "historical names" for
> which I couldn't find a definition.
the relevant javadoc page in sun's jdk 1.3.1_06
<.../jdk1.3.1/docs/guide/intl/encoding.doc.html> lists the required
encodings:

"...Sun's Java 2 Runtime Environment, Standard Edition, v. 1.3.1 for
Windows comes in two different versions: US-only and international. The
US-only version only supports the encodings shown in the first table.
The international version (which includes the lib\i18n.jar file)
supports all encodings shown on this page."

it then proceeds to list the "Basic Encoding Set" (contained in rt.jar)
where those names are defined. the only difference is the Latin
Alphabet #9.

>... Do you know where they are
> specified? InputStreamReader and OutputStreamWriter getEncoding() are
> supposed to return them but they don't document what they actually
> look like.

the references sun cites are:

* The Unicode standard, and
* The Unicode FAQ.

cheers;
rsn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Que du magnifique

iD8DBQE+TYNP+e1AKnsTRiERA/LLAKCrscVg83sy882JsHImp/ybGSipCgCgquAV
4VLd68TlTPVpPV5w296qSCc=
=X4EK
-----END PGP SIGNATURE-----
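
for readers following the thread, here is a minimal sketch of how one
could probe a given encoding name against the three APIs discussed
above (java.lang.String, java.io and java.nio). it is not the Mauve
testlet code; the class name EncodingProbe and its method names are
hypothetical, and the java.nio check only applies to 1.4 and later.
whether "ISO8859_15" is accepted depends on the runtime, which is the
very point under discussion, so the sketch reports the outcome rather
than assuming it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.io.UnsupportedEncodingException;

// Hypothetical helper (not part of Mauve): probes how one encoding name
// is treated by the java.lang, java.io and java.nio APIs.
class EncodingProbe {

    // Returns the encoded bytes, or null if String.getBytes rejects the name.
    static byte[] encodeOrNull(String text, String encodingName) {
        try {
            return text.getBytes(encodingName);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    // Returns the name reported by InputStreamReader.getEncoding(),
    // or null if java.io rejects the requested name.
    static String ioNameOrNull(String encodingName) {
        try (InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), encodingName)) {
            return reader.getEncoding();
        } catch (IOException e) {
            // UnsupportedEncodingException is a subclass of IOException.
            return null;
        }
    }

    // Returns true if java.nio knows the name; illegal names count as unsupported.
    static boolean nioSupports(String charsetName) {
        try {
            return Charset.isSupported(charsetName);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = { "ISO8859_15", "ISO-8859-15" };
        for (String name : names) {
            System.out.println(name
                + ": String.getBytes=" + (encodeOrNull("a", name) != null)
                + ", java.io name=" + ioNameOrNull(name)
                + ", java.nio=" + nioSupports(name));
        }
    }
}
```

running main on a given runtime shows which of the two deductions (a)
or (b) that runtime behaves like, and what getEncoding() actually
returns for each name.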