Date: Wed, 19 Dec 2018 22:42:00 -0000
From: Rafal Luzynski
To: Egor Kobylkin, libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin", Marko Myllynen, mfabian@redhat.com
Message-ID: <749726562.674232.1545259279320@poczta.nazwa.pl>
Subject: Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]

8.12.2018 22:51 Egor Kobylkin wrote:
>
> Rafal, Dmitry, Marko, Mike
>
> On 08.12.18 00:35, Rafal Luzynski wrote:
> > 19.11.2018 12:10 Egor Kobylkin wrote:
> >>
> >> Changelog v10:
> >> * Removed ISO 9.1995 GOST 7.79-2000 System A
> >> (transliteration to Latin with diacritics) as conflicting with
> >> System B within glibc mechanics and not solving BZ #2872
> >
> > I'm in favor of implementing System A and dropping System B instead.
>
> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
> fails". The ISO 9 System A does not map to ASCII so it is not a solution
> to BZ #2872 at all.

I did not mean implementing System A and nothing more.
I meant implementing System A and a fallback for ASCII, which can be
similar to System B, but we wouldn't be able to call it "System B"
because it would differ in a few cases.

> I was scratching my head as to how can we avoid the explosion of the
> scope for this patch. And then it appeared to me that it was wrong to
> target all the present locales for the ASCII translit. This seems to be
> the root cause for this prolonged A vs. B discussions. The proper target
> for my table is actually the C locale translit file
> (locale/C-translit.h.in). I will submit a proper patch shortly.

I saw your patch v11 and now I must say I'm sorry for making noise,
because it was me who said that I didn't mind adding Cyrillic -> ASCII
transliteration to the C locale. I said so before taking a look at the
current contents of the transliteration in the C locale. When I looked
at it I realized that it does not support any national characters, not
even those from modified Latin alphabets (like the ones used in most
Western European languages). It only contains mathematical, physical,
commercial, diacritical, etc. characters. So I'm no longer sure it
should support Cyrillic -> ASCII. But maybe I'm wrong again; maybe it
should support it and just nobody has implemented it yet.

> If anyone wants to keep working on the implementation of the Latin
> Diacritics transliteration of the Cyrillic letters (System A) you are
> welcome to use the tables I have submitted before (v9). That would be a
> new feature for glibc as per my understanding. Let's just make super
> clear the distinction of the System A (Latin with Diacritics, non-ASCII)
> to the ASCII translit as mentioned in BZ #2872 (System B).

I liked your v9 patch more. I really appreciate your work and I'm not
going to ask you to provide more patches because I think that so far you
have provided all possible versions. I hope that your work will not be
lost.

> My focus is super sharp on helping with Cyrillic -> ASCII translit
> availability for a default installation with glibc.

I understand your aim and I agree to support ASCII. Our disagreements
are:

* whether to support conversion Cyrillic -> extended Latin as well,
* which standard to implement,
* what to do if the standard is ambiguous or if some details cannot be
  implemented for technical reasons.

Regards,

Rafal
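
As an illustration of the "System A plus an ASCII fallback" idea
discussed above, here is a minimal Python sketch. It is not glibc code
and not the table from any version of the patch. The names CANDIDATES,
can_encode and transliterate are hypothetical, and the table is a tiny
illustrative subset (lowercase only). The assumption is that each
Cyrillic letter carries an ordered list of replacements, the diacritic
form first and an ASCII digraph second, and that the first replacement
the target charset can represent wins.

```python
# Hypothetical, illustrative subset; NOT the table from the patch.
# Each Cyrillic letter lists its candidates in order of preference:
# a Latin letter with a diacritic first, then a plain ASCII spelling.
CANDIDATES = {
    "ж": ["ž", "zh"],
    "ч": ["č", "ch"],
    "ш": ["š", "sh"],
    "а": ["a"],
    "к": ["k"],
    "у": ["u"],
}


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Replace only the characters that charset cannot represent."""
    out = []
    for ch in text:
        if can_encode(ch, charset):
            # The target charset has this character; keep it unchanged.
            out.append(ch)
            continue
        for candidate in CANDIDATES.get(ch, []):
            if can_encode(candidate, charset):
                # First representable candidate wins.
                out.append(candidate)
                break
        else:
            # No candidate fits (or the letter is not in the table).
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    for charset in ("iso-8859-2", "ascii", "utf-8"):
        print(charset, transliterate("чаша", charset))
```

With this arrangement a target such as ISO-8859-2 would get the
diacritic forms, plain ASCII would get the digraphs, and a target that
already contains Cyrillic is left untouched. The sketch also shows why
the ASCII column could not simply be called "System B": it is whatever
fallback is listed, and it is free to differ from the standard.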