From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 67468 invoked by alias); 11 Mar 2019 13:59:47 -0000
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help:
Sender: libc-locales-owner@sourceware.org
Received: (qmail 67449 invoked by uid 89); 11 Mar 2019 13:59:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.4 required=5.0
 tests=BAYES_00,KAM_NUMSUBJECT,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.1
 spammy=Individual, Countries, HTo:U*libc-locales, dig
X-HELO: mout.kundenserver.de
Subject: PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
From: Egor Kobylkin
To: Marko Myllynen, libc-alpha@sourceware.org, libc-locales@sourceware.org,
 Carlos O'Donell, Rafal Luzynski, Mike Fabian
Cc: Siddhesh Poyarekar, "Dmitry V. Levin"
References: <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com>
 <20180412224352.GB2911@altlinux.org>
 <2124833400.35614.1546698902753@poczta.nazwa.pl>
 <908ed415-cfe4-804c-f421-4351ef062edc@kobylkin.com>
 <6d076299-babd-406a-b1fe-87778f54bf36@kobylkin.com>
 <41aff10b-9cf1-638c-4fbc-8c4f4122f2e9@kobylkin.com>
 <728016cc-f2c2-c05d-6753-5c16e3adc8f1@kobylkin.com>
Message-ID: <15a24f58-8a5b-0b3a-d71a-4baeec8b7c5b@kobylkin.com>
Date: Mon, 11 Mar 2019 13:59:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1
MIME-Version: 1.0
In-Reply-To: <728016cc-f2c2-c05d-6753-5c16e3adc8f1@kobylkin.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2019-q1/txt/msg00067.txt.bz2

On 04.03.19 23:11, Egor Kobylkin wrote:
> ping
>
> On 14.02.19 17:48, Marko Myllynen wrote:
>> Hi Carlos, Mike, Rafal,
>>
>> It seems clear that you all are currently too busy to have a look at
>> this but would you have any estimate when you might be able to review
>> this so that we could consider merging?
>>
>> FWIW, I chatted with Egor off-list and we're on the same page wrt the
>> following, hopefully this gives you a bit of a jump start for this
>> subject when you have time to dig deeper:
>>
>> 1) Built-in C locale doesn't read/use any translit_* files and it can't
>> have any fallback mechanisms and it only supports ASCII so using GOST
>> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
>> be the appropriate way to implement Cyrillic transliteration for the
>> built-in C locale (it adds some 8KB to the binary).
>>
>> 2) Other locales read/use translit_* files and with them fallbacks and
>> non-ASCII are possible so it would seem preferable to first try ISO 9 /
>> GOST 7.79 System A and only if that fails then use GOST 7.79 System B
>> (in which case the end result should match with the built-in C locale).
>> For this the translit_cyrillic file should be added (as per patch v9 +
>> changes mentioned in patches v10 and v12).
>>
>> 3) Individual locale files can then be updated to use translit_cyrillic
>> as appropriate (see patch v9) and language/national specific conventions
>> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
>>
>> Thanks,
>>
>> On 04/02/2019 09.14, Egor Kobylkin wrote:
>>> Carlos,
>>> are you comfortable picking this up again this month?
>>>
>>> I would really love to have a reliable action plan to get this committed
>>> for 2.30. Maybe cut out a subset that is undisputed and commit only that
>>> first. It looks kinda like an eternal moving target otherwise.
>>>
>>> for your reference:
>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>>>
>>> Bests,
>>> Egor Kobylkin
>>>
>>> On 09.01.19 21:03, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>>>
>>>>>>> Good catch!
>>>>>>> Should we maybe split this into two patches, one for C and
>>>>>>> the other for "country" locales? They have different codes and
>>>>>>> functionality so it looks like it would be easier to keep focus.
>>>>>>
>>>>>> That would probably make sense, the standard C/POSIX locale won't
>>>>>> support System A so it also narrows down solution alternatives
>>>>>> with it.
>>>>>>
>>>>>>> "Country" locales in localedata/locales/ can then have the exact same
>>>>>>> translit table included or they can have any other flavor - I don't
>>>>>>> see a problem here.
>>>>>>
>>>>>> Indeed, and since those files are not limited to ASCII, perhaps we
>>>>>> could now reconsider the v9 approach for them, i.e., prefer System A if
>>>>>> possible, otherwise use System B / ASCII (just need to make sure that
>>>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>>>
>>>>> Happy to hear the split seems to be a clear cut one.
>>>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>>>> (number) and title for clarity in communication?
>>>>
>>>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>>>> NEWS entry might be more appropriate (with the full details explained
>>>> in the commit messages of course) but I'll leave this to others to decide.
>>>>
>>>>> This way it would probably be easier to have the decision making
>>>>> process tied up for both patches (separately). We may want to get the
>>>>> v12 POSIX out of the door in 2.30 then and can take all the time we
>>>>> need to set up the rules for "Countries" locales as you need them to be.
>>>>
>>>> Perhaps Rafal or Carlos have better suggestions but I would think we
>>>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>>>> part (that would be what you posted as v12), then patch 2/3 adds
>>>> translit_cyrillic (based on your v9 so it supports ISO 9:1995 / GOST 7.79
>>>> System A and GOST 7.79 System B as a fall-back (which would match the
>>>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>>>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
>>>> alternative suggestions so it might be best to wait for their feedback
>>>> before doing anything yet (it's unfortunate you've had to do so many
>>>> iterations around this already but I think we've all learned something
>>>> during the process and the end result will be more correct than any of
>>>> the earlier versions).
>>>>
>>>> Thanks,
>>>>
>>
>>
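
A rough sketch of the lookup order described in points 1) and 2) of the quoted summary: try ISO 9 / GOST 7.79 System A first, and fall back to GOST 7.79 System B when the target charset cannot represent the System A result. This is illustrative Python only, not glibc code and not taken from any of the patches. The two tables are a hand-picked three-letter excerpt, and the names SYSTEM_A, SYSTEM_B, encodable and transliterate are made up for this example.

```python
# Hypothetical three-letter excerpts, for illustration only.
# System A (ISO 9) uses Latin letters with diacritics.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š"}
# System B uses plain ASCII letter combinations.
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh"}


def encodable(s, encoding):
    """Return True if the string s can be represented in the target encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding):
    """Keep characters the target charset supports; otherwise try
    System A first, then System B, and emit '?' if neither applies."""
    out = []
    for ch in text:
        if encodable(ch, encoding):
            out.append(ch)
            continue
        for table in (SYSTEM_A, SYSTEM_B):
            candidate = table.get(ch)
            if candidate is not None and encodable(candidate, encoding):
                out.append(candidate)
                break
        else:
            # No table produced something the target charset can hold.
            out.append("?")
    return "".join(out)


print(transliterate("жш", "iso-8859-2"))  # System A fits Latin-2: žš
print(transliterate("жш", "ascii"))       # falls back to System B: zhsh
```

With an ASCII target the System A candidates are rejected and the result is the System B form, which is the property asked for above: the ASCII fall-back of the other locales should match what the built-in C locale produces.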