public inbox for libc-locales@sourceware.org
From: "Diego (Egor) Kobylkin" <egor@kobylkin.com>
To: "libc-locales@sourceware.org" <libc-locales@sourceware.org>,
	"libc-alpha@sourceware.org" <libc-alpha@sourceware.org>,
	Carlos O'Donell <carlos@redhat.com>,
	Rafal Luzynski <digitalfreak@lingonborough.com>
Cc: Marko Myllynen <myllynen@redhat.com>,
	Siddhesh Poyarekar <siddhesh@gotplt.org>
Subject: Re: [PATCH] locale/C-translit.h.in: Greek -> ASCII transliteration table [BZ #12031]
Date: Tue, 08 Oct 2019 19:20:00 -0000
Message-ID: <PSddXYnx56_367SrGMUyF8KffacTiODk5jhRvwco4GDZfNJu6I-4kFG3Uv8e_qnFlF6E0CkJatGGGI6crwjS4B0V-GPWgDDTjigVBQRRePo=@kobylkin.com>
In-Reply-To: <15ng4NDEFJeZhF1FBBL6X6CB9aroE4hyWVKzRshoOYhTmf-Cj2U64VAczBw5-eTCL0PqD_Urr7Fjv0P1bZMtTIwmoE7kiaGesv6e6KJhB_U=@kobylkin.com>


Carlos, Rafal,

here is another patch for an ASCII transliteration bug, [BZ #12031], this time
for Greek.

You were instrumental in getting the other transliteration patch, [BZ #2872],
approved, so I want to make you aware of this one.

Just to be clear, it has nothing to do with Cyrillic. It is entirely a
Greek -> ASCII transliteration table, yet it has exactly the same structure
as [BZ #2872]. So it would be logical for the two of you to simply re-run the
same tests you did for [BZ #2872].

Given that it is Greek, there may of course be other considerations as well.
I am happy to hear about them from anyone at any time.

Best regards,
Egor



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, September 4, 2019 9:31 AM, Diego (Egor) Kobylkin <egor@kobylkin.com> wrote:

> Dear locale maintainers,
>
> This patch fixes glibc bug 12031, "iconv -t ascii//translit with Greek
> characters" [1], by adding Greek transliteration rows to
> locale/C-translit.h.in.
>
> This work follows on the heels of the successfully committed patch for
> virtually the same bug, [BZ #2872], which concerned Cyrillic characters. [2]
>
> AFAIK there are many versions of transcription tables for Greek to ASCII
> transcription. Given that the current iconv logic can only transliterate one
> symbol to many, but not many to many, we take the "Standard" part of
> the Romanization_of_Greek#Modern_Greek table [3] and keep only the
> single-letter Greek graphemes. That "standard" seems to be close to
> ELOT 743, but not identical to it.
>
> So we omit things like M and Μπ being transliterated as M and B respectively.
> Rather, Μπ will be treated as two separate graphemes and transliterated as Mp.
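
To illustrate the one-to-many limitation described above, here is a minimal
sketch in Python. It is not the glibc implementation and not the table from
the patch: the function name translit() and the handful of entries are
assumptions chosen for illustration only. Each input code point is looked up
on its own, so a digraph such as Μπ can never be matched as a unit.

```python
# Illustrative subset only. These entries are assumptions made for this
# sketch; they are NOT copied from the patch to locale/C-translit.h.in.
TABLE = {
    "\u039c": "M",   # GREEK CAPITAL LETTER MU
    "\u03bc": "m",   # GREEK SMALL LETTER MU
    "\u03a0": "P",   # GREEK CAPITAL LETTER PI
    "\u03c0": "p",   # GREEK SMALL LETTER PI
    "\u0398": "Th",  # GREEK CAPITAL LETTER THETA (one symbol to two)
    "\u03b8": "th",  # GREEK SMALL LETTER THETA
    "\u03a8": "Ps",  # GREEK CAPITAL LETTER PSI
    "\u03c8": "ps",  # GREEK SMALL LETTER PSI
}


def translit(text, table=TABLE, fallback="?"):
    """Transliterate one code point at a time (one-to-many only)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # already ASCII, pass through unchanged
        else:
            # Only a single code point is ever used as the lookup key,
            # so a many-to-many rule such as "Μπ" -> "B" cannot apply.
            out.append(table.get(ch, fallback))
    return "".join(out)


print(translit("\u039c\u03c0"))  # Μπ comes out as "Mp", not "B"
print(translit("\u03b8"))        # one symbol expands to "th"
```

Supporting a rule like Μπ -> B would require matching on sequences of input
symbols, which is the many-to-many case the message says iconv cannot handle.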
>
> Here is the list of some standards I have collected so far. There doesn't seem
> to be a way to harmonize them all into one. But if anyone wants to propose a
> solution, please do.
>
> -   ΕΛΟΤ 743 https://www.teicrete.gr/users/kutrulis/Ergalia/ELOT743.htm Passports.
> -   ISO 843 https://en.wikipedia.org/wiki/ISO_843
> -   ALA-LC https://www.loc.gov/catdir/cpso/romanization/greek.pdf Book titles.
> -   BGN/PCGN http://libraries.ucsd.edu/bib/fed/USBGN_romanization.pdf
> -   http://geonames.nga.mil/gns/html/Romanization/Romanization_Greek.pdf Geographical names.
>
> Furthermore, to cover the whole U+0370-U+03FF Greek/Coptic Unicode range, I
> have asked around and made a best-effort transliteration for the remaining
> characters not covered by the above standards.
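
One way to check coverage of that range could look like the sketch below. It
is only an illustration: the helper names and the tiny sample_table are
hypothetical, and the set of assigned code points depends on the Unicode
version shipped with the Python interpreter.

```python
import unicodedata

# Greek and Coptic block, U+0370 through U+03FF inclusive.
GREEK_COPTIC = range(0x0370, 0x0400)


def assigned_code_points(block=GREEK_COPTIC):
    """Return the code points in the block that have a Unicode name."""
    return [cp for cp in block if unicodedata.name(chr(cp), None) is not None]


def missing_entries(table, block=GREEK_COPTIC):
    """Return assigned code points in the block with no entry in table."""
    return [cp for cp in assigned_code_points(block) if chr(cp) not in table]


# Hypothetical, deliberately incomplete table used only to exercise the check.
sample_table = {"\u03b1": "a", "\u03b2": "v"}

# Print the first few code points that still lack a transliteration.
for cp in missing_entries(sample_table)[:5]:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

Run against the full table, an empty result from missing_entries() would
indicate that every assigned character in the block has an entry.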
>
> Should you have better sources for the actual translit entries, please make
> sure to send your feedback!
>
> The patch is attached.
>
> Best regards,
> Egor Kobylkin
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=12031 [1]
> https://sourceware.org/ml/libc-alpha/2019-07/msg00477.html [2]
> https://en.wikipedia.org/wiki/Romanization_of_Greek#Modern_Greek [3]



