From: Marko Myllynen
To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha@sourceware.org, libc-locales@sourceware.org,
 "Dmitry V. Levin", Volodymyr Lisivka, Carlos O'Donell, Max Kutny,
 danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Thu, 11 Oct 2018 10:10:00 -0000
Message-ID: <5cdb2fe4-c0c9-50e5-99f4-a8618a5fdcd4@redhat.com>
In-Reply-To: <63fb4fae-a93b-7aff-13df-4452cbc8853f@redhat.com>
References: <41532e13-a63d-5df1-ab37-05eb4d6c8d0a@kobylkin.com>
 <20180412224352.GB2911@altlinux.org>
 <16e785f3-2e9f-ceb2-698f-dc33c91a5d5e@kobylkin.com>
 <20181003091949.GA21486@rap.rap.dk>
 <21d872b2-613e-d1f5-26c0-baa4b5721df9@kobylkin.com>
 <1485772360.805333.1538731225156@poczta.nazwa.pl>
 <19e29568-e710-535f-4f90-98dbcec930ed@kobylkin.com>
 <1028447684.826961.1539036295224@poczta.nazwa.pl>
 <63fb4fae-a93b-7aff-13df-4452cbc8853f@redhat.com>

Hi,

On 2018-10-09 19:10, Marko Myllynen wrote:
>
> One thing that might be helpful here could be something like:
>
> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
> ž
>
> That is, force transliteration of each character (if defined) even if
> it's part of the target character set. AFAICS this is not currently
> possible.

FWIW, this is currently not possible with iconv(1) but uconv(1) supports
this with -x (AFAICS it's using ICU not glibc locale data):

https://en.wikipedia.org/wiki/uconv
https://linux.die.net/man/1/uconv
https://github.com/unicode-org/icu/tree/master/icu4c/source/extra/uconv

Cheers,

-- 
Marko Myllynen
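
P.S. For illustration, a minimal sketch of the uconv approach. It
assumes the ICU command-line tools are installed. The transform ID
"Cyrillic-Latin" is just one of ICU's own rule sets picked as an
example; it is neither glibc locale data nor the table proposed in this
patch. force_translit is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: force transliteration even though the target charset (UTF-8)
# can already represent the input character.

# Hypothetical helper: read UTF-8 on stdin, apply an ICU transform,
# write UTF-8 on stdout. "Cyrillic-Latin" is an ICU transform ID chosen
# for illustration only; uconv -x does not consult glibc locale data.
force_translit() {
    uconv -f UTF-8 -t UTF-8 -x 'Cyrillic-Latin'
}

if command -v uconv >/dev/null 2>&1; then
    # The transform is applied to every character it covers.
    printf '%s\n' 'ж' | force_translit
else
    echo 'uconv not found; it ships with the ICU command-line tools' >&2
fi
```

The exact output depends on the ICU transform rules, so it need not
match what a glibc locale would define for the same character.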