From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 08 Oct 2019 19:20:00 -0000
From: "Diego (Egor) Kobylkin"
Reply-To: "Diego (Egor) Kobylkin"
To: "libc-locales@sourceware.org", "libc-alpha@sourceware.org", Carlos O'Donell, Rafal Luzynski
Cc: Marko Myllynen, Siddhesh Poyarekar
Subject: Re: [PATCH] locale/C-translit.h.in: Greek -> ASCII transliteration table [BZ #12031]
In-Reply-To: <15ng4NDEFJeZhF1FBBL6X6CB9aroE4hyWVKzRshoOYhTmf-Cj2U64VAczBw5-eTCL0PqD_Urr7Fjv0P1bZMtTIwmoE7kiaGesv6e6KJhB_U=@kobylkin.com>
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
X-SW-Source: 2019-q4/txt/msg00012.txt.bz2

Carlos, Rafal,

here is another patch for an ASCII transliteration bug, [BZ #12031], this time
for Greek.

You were instrumental in getting the other transliteration patch [BZ #2872]
approved, so I want to make you aware of this one.

To be clear, it has nothing to do with Cyrillic. It is entirely a
Greek -> ASCII transliteration table. Yet it has exactly the same structure
as [BZ #2872], so it would be logical for the two of you to re-run the same
tests you did for [BZ #2872].

Since it is Greek, there may of course be other considerations as well. I am
happy to hear about them from anyone else at any time.

Best regards,
Egor

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, September 4, 2019 9:31 AM, Diego (Egor) Kobylkin wrote:

> Dear locale maintainers,
>
> this patch fixes glibc bug 12031, "iconv -t ascii//translit with Greek
> characters" [1], by adding Greek transliteration rows to
> locale/C-translit.h.in.
>
> This work follows on the heels of the successfully committed patch for
> virtually the same bug, [BZ #2872], which concerned Cyrillic characters. [2]
>
> AFAIK there are many versions of transcription tables for Greek to ASCII
> transcription. Given that the current iconv logic can only transliterate one
> symbol to many, but not many symbols to many, we take the "Standard" part of
> the Romanization_of_Greek#Modern_Greek table [3] and keep only the
> one-letter Greek graphemes. That "standard" seems to be close to ELOT 743,
> but it is not the same.
>
> So we omit things like M and Μπ being transliterated as M and B
> respectively. Rather, Μπ will be treated as two separate graphemes and
> transliterated as Mp.
>
> Here is a list of the standards I have collected so far. There doesn't seem
> to be a way to harmonize them all into one, but if anyone wants to propose a
> solution, please do.
>
> - ΕΛΟΤ 743 https://www.teicrete.gr/users/kutrulis/Ergalia/ELOT743.htm Passports.
> - ISO 843 https://en.wikipedia.org/wiki/ISO_843
> - ALA-LC https://www.loc.gov/catdir/cpso/romanization/greek.pdf Book titles.
> - BGN/PCGN http://libraries.ucsd.edu/bib/fed/USBGN_romanization.pdf
> - http://geonames.nga.mil/gns/html/Romanization/Romanization_Greek.pdf Geographical names.
>
> Furthermore, to cover the whole U+0370-U+03FF Greek/Coptic Unicode range, I
> have asked around and made a best-effort transliteration for the remaining
> characters that are not covered by the above standards.
>
> Should you have better sources for the actual translit entries, please make
> sure to send your feedback!
>
> The patch is attached.
>
> Best regards,
> Egor Kobylkin
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=12031
> [2] https://sourceware.org/ml/libc-alpha/2019-07/msg00477.html
> [3] https://en.wikipedia.org/wiki/Romanization_of_Greek#Modern_Greek
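
The single-grapheme restriction described in the quoted message (Μπ treated as two graphemes and transliterated as Mp, not B) can be sketched roughly as follows. This is only an illustration in Python, not the glibc implementation: the table `GREEK_TO_ASCII`, the function `translit`, and the `?` fallback are hypothetical names and choices made for this example, and the handful of entries are assumptions, not rows copied from the patch or from locale/C-translit.h.in.

```python
# Hypothetical subset of a Greek -> ASCII table. The entries are illustrative
# assumptions, not taken from the patch or from locale/C-translit.h.in.
GREEK_TO_ASCII = {
    "\u039c": "M",   # GREEK CAPITAL LETTER MU
    "\u03bc": "m",   # GREEK SMALL LETTER MU
    "\u03c0": "p",   # GREEK SMALL LETTER PI
    "\u03b1": "a",   # GREEK SMALL LETTER ALPHA
    "\u03b8": "th",  # GREEK SMALL LETTER THETA: one code point, two ASCII chars
    "\u03c8": "ps",  # GREEK SMALL LETTER PSI
}


def translit(text: str) -> str:
    """Map each code point on its own (one-to-many, never many-to-many)."""
    out = []
    for ch in text:
        if ch in GREEK_TO_ASCII:
            out.append(GREEK_TO_ASCII[ch])
        elif ord(ch) < 128:
            out.append(ch)   # already ASCII, pass through unchanged
        else:
            out.append("?")  # placeholder for unmapped characters (assumption)
    return "".join(out)


if __name__ == "__main__":
    # Capital mu followed by small pi: the digraph is not recognised as a
    # unit, so the result is "Mp" rather than "B".
    print(translit("\u039c\u03c0"))
    # Small theta followed by small alpha: one-to-many expansion gives "tha".
    print(translit("\u03b8\u03b1"))
```

Because the lookup key is always a single code point, a digraph rule such as Μπ -> B cannot be expressed in this scheme without looking at neighbouring characters, which is the limitation the message works around by keeping only one-letter graphemes.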