Date: Mon, 24 Jul 2017 22:51:00 -0000
From: Rafal Luzynski
To: Florian Weimer, Carlos O'Donell
Cc: libc-alpha@sourceware.org, Mike FABIAN, Andreas Schwab
Message-ID: <364005592.108952.1500935986620@poczta.nazwa.pl>
In-Reply-To: <87379lczdi.fsf@mid.deneb.enyo.de>
References: <5f71f2f6-be0e-2b5d-91ce-03386eafa7f7@redhat.com> <87h8y13gvb.fsf@mid.deneb.enyo.de> <87379lczdi.fsf@mid.deneb.enyo.de>
Subject: Re: Is it OK to write ASCII strings directly into locale source files?

24.07.2017 23:13 Florian Weimer wrote:
>
> * Carlos O'Donell:
> > [...]
> > My only technical objection with writing straight UTF-8 is that it could
> > lead to more mistakes, and Mike just found one in CLDR where an Arabic
> > Farsi character was used incorrectly because it displayed the same glyph.
> > It was caught when harmonizing with glibc where you have to write out the
> > code points (Mike filed a bug upstream with CLDR).
>
> Wasn't it caught by locale testing which revealed that the locale
> wasn't compatible with ISO-8859-6? [...]

This is exactly what happened.  The character was not representable
in ISO-8859-6.
There was no problem in UTF-8.

> [...]
> > My preference would be to start small, start using the POSIX portable
> > character set to it's maximum extent for all latin-based languages,
>
> I would still prefer the encoding for control characters which
> are in the portable character set.  So I have to object to the
> “maximum” part. :)

I agree modulo the concerns which I expressed in another email:
let's investigate the history behind it and if we still don't know
then let's just wait for the 2.26 release.

Regards,

Rafal
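
For illustration only, here is a minimal sketch of the kind of representability check discussed above. It is not the glibc locale test itself. The thread does not name the offending character, so the pair used here (U+064A ARABIC LETTER YEH and U+06CC ARABIC LETTER FARSI YEH) is an assumption picked because the two letters can look alike; the helper name `representable` is likewise hypothetical.

```python
import unicodedata

# Assumed example pair; the thread does not say which character was involved.
ARABIC_YEH = "\u064A"  # part of ISO-8859-6
FARSI_YEH = "\u06CC"   # not part of ISO-8859-6


def representable(ch: str, charset: str) -> bool:
    """Return True if ch can be encoded in the given charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for ch in (ARABIC_YEH, FARSI_YEH):
        # Print the code point, its Unicode name, and whether each
        # charset can represent it.
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"ISO-8859-6={representable(ch, 'iso8859_6')}, "
            f"UTF-8={representable(ch, 'utf-8')}"
        )
```

Both letters encode in UTF-8, so a UTF-8-only check would not flag the substitution; only the ISO-8859-6 check distinguishes them.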