From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 71582 invoked by alias); 14 Mar 2019 19:49:00 -0000
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: libc-locales-owner@sourceware.org
Received: (qmail 71551 invoked by uid 89); 14 Mar 2019 19:49:00 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=BAYES_00,KAM_NUMSUBJECT,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.1 spammy=HTo:U*libc-locales, 110319, H*Ad:U*siddhesh, 8kb
X-HELO: mout.kundenserver.de
Subject: PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
From: Egor Kobylkin
To: libc-alpha@sourceware.org, libc-locales@sourceware.org, Carlos O'Donell
Cc: Marko Myllynen , Rafal Luzynski , Mike Fabian , Siddhesh Poyarekar , "Dmitry V. Levin"
References: <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com> <20180412224352.GB2911@altlinux.org> <2124833400.35614.1546698902753@poczta.nazwa.pl> <908ed415-cfe4-804c-f421-4351ef062edc@kobylkin.com> <6d076299-babd-406a-b1fe-87778f54bf36@kobylkin.com> <41aff10b-9cf1-638c-4fbc-8c4f4122f2e9@kobylkin.com> <728016cc-f2c2-c05d-6753-5c16e3adc8f1@kobylkin.com> <15a24f58-8a5b-0b3a-d71a-4baeec8b7c5b@kobylkin.com>
Message-ID: <63a52a0c-df2d-01bb-c1a7-e92a81d39d3d@kobylkin.com>
Date: Thu, 14 Mar 2019 19:49:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1
MIME-Version: 1.0
In-Reply-To: <15a24f58-8a5b-0b3a-d71a-4baeec8b7c5b@kobylkin.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2019-q1/txt/msg00068.txt.bz2

On 11.03.19 14:59, Egor Kobylkin wrote:
>
>
> On 04.03.19 23:11, Egor Kobylkin wrote:
>> ping
>>
>> On 14.02.19 17:48, Marko Myllynen wrote:
>>> Hi Carlos, Mike, Rafal,
>>>
>>> It seems clear that you all are currently too busy to have a look at
>>> this but would you have
>>> any estimate when you might be able to review
>>> this so that we could consider merging?
>>>
>>> FWIW, I chatted with Egor off-list and we're on the same page wrt the
>>> following, hopefully this gives you a bit of a jump start for this
>>> subject when you have time to dig deeper:
>>>
>>> 1) Built-in C locale doesn't read/use any translit_* files and it can't
>>> have any fallback mechanisms and it only supports ASCII so using GOST
>>> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
>>> be the appropriate way to implement Cyrillic transliteration for the
>>> built-in C locale (it adds some 8KB to the binary).
>>>
>>> 2) Other locales read/use translit_* files and with them fallbacks and
>>> non-ASCII are possible so it would seem preferable to first try ISO 9 /
>>> GOST 7.79 System A and only if that fails then use GOST 7.79 System B
>>> (in which case the end result should match with the built-in C locale).
>>> For this the translit_cyrillic file should be added (as per patch v9 +
>>> changes mentioned in patches v10 and v12).
>>>
>>> 3) Individual locale files can then be updated to use translit_cyrillic
>>> as appropriate (see patch v9) and language/national specific conventions
>>> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
>>>
>>> Thanks,
>>>
>>> On 04/02/2019 09.14, Egor Kobylkin wrote:
>>>> Carlos,
>>>> are you comfortable picking this up again this month?
>>>>
>>>> I would really love to have a reliable action plan to get this committed
>>>> for 2.30. Maybe cut out a subset that is undisputed and commit only that
>>>> first. It looks kinda like an eternal moving target otherwise.
>>>>
>>>> for your reference:
>>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
>>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>>>>
>>>> Bests,
>>>> Egor Kobylkin
>>>>
>>>> On 09.01.19 21:03, Marko Myllynen wrote:
>>>>> Hi,
>>>>>
>>>>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>>>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>>>>
>>>>>>>> Good catch! Should we maybe split this into two patches, one for C and
>>>>>>>> the other for "country" locales? They have different codes and
>>>>>>>> functionality so it looks like it would be easier to keep focus.
>>>>>>>
>>>>>>> That would probably make sense, the standard C/POSIX locale won't
>>>>>>> support System A so it also narrows down solution alternatives with it.
>>>>>>>
>>>>>>>> "Country" locales in localedata/locales/ can then have the exact same
>>>>>>>> translit table included or they can have any other flavor - I don't see
>>>>>>>> a problem here.
>>>>>>>
>>>>>>> Indeed, and since those files are not limited to ASCII, perhaps we could
>>>>>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>>>>>> possible, otherwise use System B / ASCII (just need to make sure that
>>>>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>>>>
>>>>>> Happy to hear the split seems to be a clear cut one.
>>>>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>>>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>>>>> (number) and title for clarity in communication?
>>>>>
>>>>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>>>>> NEWS entry might be more appropriate (with the full details explained in
>>>>> the commit messages of course) but I'll leave this to others to decide.
>>>>>
>>>>>> This way it would probably be easier to have the decision making process
>>>>>> tied up for both patches (separately). We may want to get the v12 POSIX
>>>>>> out of the door in 2.30 then and can take all the time we need to set up
>>>>>> the rules for "Countries" locales as you need them to be.
>>>>>
>>>>> Perhaps Rafal or Carlos have better suggestions but I would think we
>>>>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>>>>> part (that would be what you posted as v12), then patch 2/3 adds
>>>>> translit_cyrillic (based on your v9 so it supports ISO 9:1995 / GOST 7.79
>>>>> System A and GOST 7.79 System B as a fall-back (which would match the
>>>>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>>>>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
>>>>> alternative suggestions so it might be best to wait for their feedback
>>>>> before doing anything yet (it's unfortunate you've had to do so many
>>>>> iterations around this already but I think we've all learned something
>>>>> during the process and the end result will be more correct than any of
>>>>> the earlier versions).
>>>>>
>>>>> Thanks,
>>>>>
>>>
>>>
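
For readers skimming the thread, the fallback order discussed in the quoted messages (try ISO 9 / GOST 7.79 System A first, use GOST 7.79 System B only if the target character set cannot represent the System A result) can be sketched roughly as below. This is only an illustration in Python, not how glibc implements it: the names SYSTEM_A, SYSTEM_B, can_encode and transliterate are made up for this sketch, and the tables hold just three letters rather than the mappings from the patches.

```python
# Illustrative three-letter subset of the GOST 7.79 mappings.
# These dictionaries are hypothetical; they are NOT the tables from the patches.
SYSTEM_A = {"а": "a", "ж": "ž", "щ": "ŝ"}      # one letter each, may be non-ASCII
SYSTEM_B = {"а": "a", "ж": "zh", "щ": "shh"}   # ASCII-only, may use digraphs


def can_encode(text, encoding):
    """Return True if `text` is representable in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding):
    """Keep a character if the target encoding has it; otherwise try the
    System A replacement, then the System B replacement, then give up."""
    out = []
    for ch in text:
        candidates = (ch, SYSTEM_A.get(ch), SYSTEM_B.get(ch))
        for cand in candidates:
            if cand is not None and can_encode(cand, encoding):
                out.append(cand)
                break
        else:
            out.append("?")  # nothing representable; placeholder chosen for this sketch
    return "".join(out)
```

With an ASCII target only the System B column can ever succeed, which corresponds to the C/POSIX case in point 1. With a target such as ISO 8859-2 the result can mix both systems: "ж" becomes "ž" (System A), while "щ" falls through to "shh" because "ŝ" is not in that character set.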