From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 67271 invoked by alias); 4 May 2015 20:40:16 -0000
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-locales-owner@sourceware.org
Received: (qmail 4208 invoked by uid 48); 4 May 2015 18:53:54 -0000
From: "maiku.fabian at gmail dot com"
To: libc-locales@sourceware.org
Subject: [Bug localedata/12031] iconv -t ascii//translit with Greek characters
Date: Mon, 04 May 2015 20:40:00 -0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: unspecified
X-Bugzilla-Severity: normal
X-Bugzilla-Who: maiku.fabian at gmail dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: libc-locales at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields: cc
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-q2/txt/msg00020.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=12031

Mike FABIAN changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maiku.fabian at gmail dot com

--- Comment #8 from Mike FABIAN ---
(In reply to Petter Reinholdtsen from comment #5)
> (In reply to comment #4)
> > gives me "ae,?,a" but in my opinion it should give me "ae,o,a".
> [...]
> > Is this a bug?
>
> I believe it is a bug.

It works in recent glibc (glibc-2.20-8.fc21.x86_64) in *all* locales
except C/POSIX.

$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=nb_NO.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a
$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=en_US.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a
$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=POSIX iconv -t ascii//TRANSLIT
iconv: illegal input sequence at position 0

The result is independent of the locale because all locales (except
C/POSIX) include translit_neutral, where this is defined.

> The request to change transliteration for æøå is
> http://sourceware.org/bugzilla/show_bug.cgi?id=89 .  Please explain there
> why you believe it should transliterate to ae,o,a and not ae,oe,aa.

For Scandinavian locales, transliterating 'Æ,æ,Ø,ø,Å,å' to
'Ae, ae, Oe, oe, Aa, aa' is more appropriate. For most other locales,
transliterating å to a is probably OK. I am a bit puzzled about
Æ -> AE; shouldn’t this be transliterated to Ae, even in English
locales? (The same goes for Ø: transliterating it to just O, or maybe
Oe, in translit_neutral for all locales which do not have special
rules seems better.)

The patch attached to
https://sourceware.org/bugzilla/show_bug.cgi?id=89#c5 fixes the
transliteration for Norwegian locales (nn_NO and nb_NO). Probably the
same fix should also be applied to the Swedish and Finnish locales
(and maybe to the Icelandic locales as well).
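
To illustrate the lookup order discussed above, here is a rough sketch in
plain shell. It is only a toy model and not how glibc implements
transliteration: the function names (translit, translit_neutral,
translit_scandinavian) are made up for this example, the neutral rules are
copied from the iconv output quoted above, and the Scandinavian rules are
the proposed 'Ae, ae, Oe, oe, Aa, aa' mapping, which I am assuming would
take precedence over the shared rules.

```shell
#!/bin/sh
# Toy model of locale-specific rules with a shared fallback.
# NOT glibc code; all function names here are hypothetical.

# Shared rules, as produced by iconv in the non-C locales quoted above.
translit_neutral() {
    sed -e 's/Æ/AE/g' -e 's/æ/ae/g' \
        -e 's/Ø/OE/g' -e 's/ø/oe/g' \
        -e 's/Å/A/g'  -e 's/å/a/g'
}

# Proposed overrides for Scandinavian locales (assumption: they are
# applied before the shared rules, so they win where both match).
translit_scandinavian() {
    sed -e 's/Æ/Ae/g' -e 's/Ø/Oe/g' \
        -e 's/Å/Aa/g' -e 's/å/aa/g'
}

# translit LOCALE: read text on stdin, write the ASCII result on stdout.
translit() {
    case "$1" in
        C|POSIX)     out=$(cat) ;;   # no rules are loaded at all
        nb_NO|nn_NO) out=$(translit_scandinavian | translit_neutral) ;;
        *)           out=$(translit_neutral) ;;
    esac
    # Delete every ASCII byte; anything left over had no matching rule.
    rest=$(printf '%s' "$out" | LC_ALL=C tr -d '\000-\177')
    if [ -n "$rest" ]; then
        echo "translit: no rule for some input characters" >&2
        return 1
    fi
    printf '%s\n' "$out"
}

printf '%s\n' 'Æ,æ,Ø,ø,Å,å' | translit nb_NO   # prints Ae,ae,Oe,oe,Aa,aa
```

With any other locale name, such as en_US, the same input goes through the
shared rules only and gives AE,ae,OE,oe,A,a, while C or POSIX fails because
no rule is available, mirroring the "illegal input sequence" error above.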