From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 45869 invoked by alias); 5 May 2015 09:46:42 -0000
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-locales-owner@sourceware.org
Received: (qmail 16671 invoked by uid 48); 5 May 2015 09:40:48 -0000
From: "maiku.fabian at gmail dot com"
To: libc-locales@sourceware.org
Subject: [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
Date: Tue, 05 May 2015 09:46:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: 2.3.5
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: maiku.fabian at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-q2/txt/msg00027.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=2253

--- Comment #7 from Mike FABIAN ---
(In reply to Samuel Thibault from comment #6)
> Err, but here e+combineacute *is* representable in latin1, it's eacute. So
> transliteration should not discard the accent.

Yes, maybe. But is this doable with the glibc transliteration system?

All the glibc/localedata/locales/translit_* files just transliterate
one single character to another character or a list of characters.
It never starts with a character sequence. So I guess this is not supported.

As Jungshik Shin suggests in comment#1, iconv could normalize the input to NFC before attempting a transliteration.
Certainly not without transliteration, as Rich Felker writes in comment#3, but *if* transliteration is used, normalizing to NFC and then doing the transliteration might be a reasonable approach.
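
To illustrate the "normalize to NFC first" idea, here is a rough sketch in Python rather than in glibc's iconv. It only shows the Unicode behaviour involved: the sequence e + U+0301 has no Latin-1 encoding as it stands, but its NFC form is the single character U+00E9, which Latin-1 does contain. The helper names to_latin1_via_nfc and encodes_directly are made up for this example, and nothing here reflects how iconv is or would be implemented.

```python
import unicodedata


def to_latin1_via_nfc(text: str) -> bytes:
    """Hypothetical helper: compose to NFC, then encode strictly as Latin-1."""
    composed = unicodedata.normalize("NFC", text)
    return composed.encode("latin-1")


def encodes_directly(text: str) -> bool:
    """Report whether text is representable in Latin-1 without normalization."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


decomposed = "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(encodes_directly(decomposed))   # False: U+0301 has no Latin-1 code
print(to_latin1_via_nfc(decomposed))  # b'\xe9': NFC composes the pair to U+00E9
```

A sequence that has no precomposed form would still fail the strict encode after NFC, so a transliteration step would remain necessary for those cases.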