* SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
From: Egor Kobylkin @ 2018-07-17 19:34 UTC
To: libc-alpha, libc-locales; +Cc: Dmitry V. Levin, Volodymyr Lisivka
Dear locale maintainers,
please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1],
by adding the Cyrillic transliteration table file translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and including it in all your locales going forward.
The patch is included inline below.
I have excluded from this patch the locales that already mention Cyrillic
or have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to decide explicitly whether and how to
include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
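The expected output above is enough to recover the mappings for the letters it contains. The Python sketch below is illustrative only. The dictionary is a partial mapping inferred from that one line of expected output, not the contents of the translit_cyrillic attachment, and the names GOST_SUBSET and transliterate are invented for this example. The input sentence is an assumption: it is the well-known Russian pangram that the expected output appears to correspond to, and it is not quoted in this message.

```python
# Partial Cyrillic -> ASCII mapping, inferred from the expected output above.
# Assumption: this is NOT the translit_cyrillic table itself, only the
# lowercase letters that occur in the sample sentence.
GOST_SUBSET = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "yo", "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x",
    # "cz" is what the sample shows; other contexts are not covered here.
    "ц": "cz",
    "ч": "ch", "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'",
    "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}


def transliterate(text):
    """Replace mapped Cyrillic letters; keep ASCII; emit '?' for the rest."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in GOST_SUBSET:
            latin = GOST_SUBSET[lower]
            # Carry an initial capital over to the Latin replacement.
            if ch != lower:
                latin = latin.capitalize()
            out.append(latin)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Mimics the '?' fallback visible in the failing output.
            out.append("?")
    return "".join(out)


# Assumed input: the pangram behind the expected output (not quoted above).
SAMPLE = "Съешь ещё этих мягких французских булок, да выпей же чаю."
print(transliterate(SAMPLE))
```

With this subset the sketch prints the same transliterated sentence as the expected output quoted above. Any non-ASCII character outside the subset still comes out as a question mark, which is the behaviour the missing table causes for the whole Cyrillic range today.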
Root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore, it has to be included in the active locale
at compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
the transliteration contains only ASCII codes but can still be read by a
native speaker. Among other things, it is useful for processing Cyrillic
texts and filenames with programs or on systems that are not specifically
prepared to work with Cyrillic, don't have the corresponding fonts
installed, or can't handle UTF-8.
The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (mapping) is based on the official GOST 7.79-2000 source
(Federal Agency on Technical Regulating and Metrology of the Russian
Federation [2]). In practice, an independent source with identical
content [3] was used, and the table was prepared in a spreadsheet [6].
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a translit_start/end
stanza and generated a patch for them.
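The search-and-patch step can be sketched in Python as follows. This is a hypothetical illustration, not the script actually used to generate the patch. The function name add_cyrillic_include and the sample locale text are invented for the example, and the exclusion set is copied from the list of excluded locales earlier in this message.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'

# Locales this message lists as excluded from the patch.
EXCLUDED = {
    "az_AZ", "iso14651_t1_common", "ky_KG", "mn_MN", "sr_RS", "tg_TJ",
    "tk_TM", "tt_RU", "uk_UA", "uz_UZ", "uz_UZ@cyrillic",
}


def add_cyrillic_include(name, text):
    """Return locale source text with the include placed before translit_end.

    The text is returned unchanged if the locale is excluded, already has
    the include, or has no translit_end line.
    """
    if name in EXCLUDED or INCLUDE_LINE in text:
        return text
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            # Insert just before the end of the stanza, as the diff does.
            lines.insert(i, INCLUDE_LINE)
            break
    return "\n".join(lines)


# Made-up minimal LC_CTYPE fragment, shaped like the hunks in the diff.
sample = "\n".join([
    "LC_CTYPE",
    'copy "i18n"',
    "translit_start",
    'include "translit_combining";""',
    "translit_end",
    "END LC_CTYPE",
])

patched = add_cyrillic_include("ru_RU", sample)
print(patched)
```

Placing the new line directly before translit_end keeps any locale-specific rules and earlier includes ahead of the Cyrillic table, which matches the hunks in the diff below. The function is idempotent, so running it twice on the same file adds the include only once.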
Cyrillic transliteration of, e.g., Russian text may already have
worked to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA
locales, which have their transliteration tables included inline.
However, it would not be the standard Cyrillic transliteration
described above.
I am excluding these locales from this proposed patch. I have written
directly to the locale maintainer email addresses listed in the files but
have received no reply so far, except from Volodymyr Lisivka
<vlisivka@gmail.com> (uk_UA), who has confirmed the exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
2018-07-17 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nan_TW@latin: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_IN@devanagari: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-07-17 19:34 ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-07-17 19:41 ` Carlos O'Donell
2018-07-17 19:50 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2018-07-17 19:41 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
Cc: Dmitry V. Levin, Volodymyr Lisivka
On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
> Dear locale maintainers,
>
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
We are currently preparing for the 2.28 release and it may take
a while to review this change and the structure of the changes,
and the data itself.
Is it OK if this material is reviewed for 2.29 inclusion (after
August 1st)?
Cheers,
Carlos.
* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-07-17 19:41 ` Carlos O'Donell
@ 2018-07-17 19:50 ` Egor Kobylkin
2018-07-17 19:59 ` Carlos O'Donell
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-07-17 19:50 UTC (permalink / raw)
To: Carlos O'Donell, libc-alpha, libc-locales
Cc: Dmitry V. Levin, Volodymyr Lisivka
On 17.07.2018 21:40, Carlos O'Donell wrote:
> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>> Dear locale maintainers,
>>
>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>
> We are currently preparing for the 2.28 release and it may take
> a while to review this change and the structure of the changes,
> and the data itself.
>
> Is it OK if this material is reviewed for 2.29 inclusion (after
> August 1st)?
It's fine with me to postpone it for 2.29 inclusion (after August 1st).
Should I send a reminder in August?
Bests,
Egor
* Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-07-17 19:50 ` Egor Kobylkin
@ 2018-07-17 19:59 ` Carlos O'Donell
0 siblings, 0 replies; 111+ messages in thread
From: Carlos O'Donell @ 2018-07-17 19:59 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
Cc: Dmitry V. Levin, Volodymyr Lisivka
On 07/17/2018 03:50 PM, Egor Kobylkin wrote:
> On 17.07.2018 21:40, Carlos O'Donell wrote:
>> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>
>> We are currently preparing for the 2.28 release and it may take
>> a while to review this change and the structure of the changes,
>> and the data itself.
>>
>> Is it OK if this material is reviewed for 2.29 inclusion (after
>> August 1st)?
>
> It's fine with me to postpone it for 2.29 inclusion (after August 1st).
> Should I send a reminder in August?
Yes please, ping the original patches again in August and we can
review. In the meantime others may feel free to review, but we won't
consider them for inclusion yet e.g. don't block the release.
--
Cheers,
Carlos.
* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
[not found] ` <20180412224352.GB2911@altlinux.org>
2018-07-17 19:34 ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-08-06 19:00 ` Egor Kobylkin
2018-10-03 9:20 ` Egor Kobylkin
2018-10-11 2:58 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
` (11 subsequent siblings)
13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-08-06 19:00 UTC (permalink / raw)
To: libc-alpha, libc-locales
Cc: Dmitry V. Levin, Volodymyr Lisivka, Carlos O'Donell,
Max Kutny, danilo
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
This is a re-submission for the consideration for 2.29 on a request from
Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to decide explicitly whether, and if so
how, to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic text in most locales, including ru_RU [1] [8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
That is, it produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
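For illustration only, and not part of the patch: a minimal Python sketch of the mapping that yields the expected line above. It assumes the Cyrillic test input is the pangram "Съешь ещё этих мягких французских булок, да выпей же чаю." (inferred from the expected output, since the input is not quoted in this mail). It uses only the first alternative of each table entry and passes unmapped characters through unchanged, whereas iconv prints "?" for them. The names TRANSLIT and transliterate are hypothetical.

```python
# Illustrative sketch only: first (preferred) replacement of each lowercase
# entry in translit_cyrillic. This is not how iconv itself is implemented.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}
# The capital-letter entries in the table are the upper-cased counterparts.
TRANSLIT.update({k.upper(): v.upper() for k, v in list(TRANSLIT.items())})


def transliterate(text):
    """Replace each mapped Cyrillic letter; leave every other character as is."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```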
Root problem and the fix:
The root problem is the missing transliteration table, which I am
supplying here. In addition, the table has to be included in the
active locale when the locale is compiled, so that iconv can use it.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in the Cyrillic alphabet to ASCII using
//TRANSLIT. While UTF-8 encoded Cyrillic text requires Cyrillic fonts,
the result of the transliteration contains only ASCII characters yet can
still be read by a native speaker. Among other things, this is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, do not have
the corresponding fonts installed, or cannot handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on GOST 7.79-2000 official source
(Federal Agency on Technical Regulating and Metrology Of Russian
Federation [2]). Technically an independent but identical source [3] was
used and prepared in a spreadsheet [6].
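As a reading aid (an illustrative sketch, not part of the patch or of glibc), the entries of the table can be decoded as follows. Each entry names a source code point followed by one or more replacements separated by semicolons. My understanding is that the alternatives are listed in order of preference, but that is an assumption here, not something this mail states. The helper name parse_entry is hypothetical.

```python
import re

# Matches one <Uxxxx> code point token as used in glibc locale sources.
UCS_TOKEN = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(tokens):
    """Turn a run of <Uxxxx> tokens into the string they denote."""
    return "".join(chr(int(h, 16)) for h in UCS_TOKEN.findall(tokens))


def parse_entry(line):
    """Decode one entry such as '<U0401> "<U0059><U004F>";<U0059>'.

    Returns (source character, list of replacement strings).
    """
    source, _, targets = line.partition(" ")
    return decode(source), [decode(t) for t in targets.split(";")]


if __name__ == "__main__":
    print(parse_entry('<U0401> "<U0059><U004F>";<U0059>'))
    print(parse_entry("<U0410> <U0041>"))
```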
The documentation suggests that a transliteration table is included
by adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
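The search-and-insert step could look roughly like the sketch below. This is not the script actually used to generate the patch. The names add_cyrillic_include and INCLUDE_LINE are hypothetical, and the sketch assumes the stanza markers sit alone on their lines. It places the new include directly before translit_end, which is where the hunks in this patch put it.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert INCLUDE_LINE before the first translit_end line.

    The text is returned unchanged if it has no translit_start/translit_end
    stanza or if the include is already present.
    """
    lines = locale_text.splitlines()
    stripped = [line.strip() for line in lines]
    if INCLUDE_LINE in stripped:
        return locale_text
    if "translit_start" not in stripped or "translit_end" not in stripped:
        return locale_text
    lines.insert(stripped.index("translit_end"), INCLUDE_LINE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "translit_start\n"
        'include "translit_combining";""\n'
        "translit_end\n"
        "END LC_CTYPE\n"
    )
    print(add_cyrillic_include(sample), end="")
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply repeatedly.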
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
However it would not be the standard Russian Cyrillic transliteration as
described above.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=8590
[7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=8618
Best regards,
Egor Kobylkin
---
2018-07-17 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
table from Cyrillic to Latin.
* locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
section.
* locales/aa_DJ: likewise
* locales/af_ZA: likewise
* locales/ak_GH: likewise
* locales/am_ET: likewise
* locales/ar_EG: likewise
* locales/be_BY: likewise
* locales/bem_ZM: likewise
* locales/ber_DZ: likewise
* locales/ber_MA: likewise
* locales/bg_BG: likewise
* locales/bi_VU: likewise
* locales/bn_BD: likewise
* locales/bo_CN: likewise
* locales/ca_ES: likewise
* locales/ce_RU: likewise
* locales/cs_CZ: likewise
* locales/cv_RU: likewise
* locales/cy_GB: likewise
* locales/da_DK: likewise
* locales/de_DE: likewise
* locales/dv_MV: likewise
* locales/dz_BT: likewise
* locales/el_GR: likewise
* locales/en_GB: likewise
* locales/en_NG: likewise
* locales/en_ZM: likewise
* locales/es_CU: likewise
* locales/es_ES: likewise
* locales/et_EE: likewise
* locales/fa_IR: likewise
* locales/ff_SN: likewise
* locales/fi_FI: likewise
* locales/fr_FR: likewise
* locales/ga_IE: likewise
* locales/gd_GB: likewise
* locales/gu_IN: likewise
* locales/gv_GB: likewise
* locales/he_IL: likewise
* locales/hi_IN: likewise
* locales/hif_FJ: likewise
* locales/hr_HR: likewise
* locales/ht_HT: likewise
* locales/hu_HU: likewise
* locales/hy_AM: likewise
* locales/id_ID: likewise
* locales/is_IS: likewise
* locales/it_IT: likewise
* locales/ja_JP: likewise
* locales/kk_KZ: likewise
* locales/km_KH: likewise
* locales/kn_IN: likewise
* locales/ko_KR: likewise
* locales/ks_IN: likewise
* locales/kw_GB: likewise
* locales/lb_LU: likewise
* locales/lg_UG: likewise
* locales/lij_IT: likewise
* locales/ln_CD: likewise
* locales/lo_LA: likewise
* locales/lt_LT: likewise
* locales/lv_LV: likewise
* locales/mg_MG: likewise
* locales/mhr_RU: likewise
* locales/mk_MK: likewise
* locales/ml_IN: likewise
* locales/ms_MY: likewise
* locales/mt_MT: likewise
* locales/nan_TW@latin: likewise
* locales/nb_NO: likewise
* locales/ne_NP: likewise
* locales/nhn_MX: likewise
* locales/niu_NU: likewise
* locales/niu_NZ: likewise
* locales/nl_NL: likewise
* locales/nr_ZA: likewise
* locales/oc_FR: likewise
* locales/om_KE: likewise
* locales/or_IN: likewise
* locales/os_RU: likewise
* locales/pa_IN: likewise
* locales/pa_PK: likewise
* locales/pl_PL: likewise
* locales/pt_PT: likewise
* locales/quz_PE: likewise
* locales/ro_RO: likewise
* locales/ru_RU: likewise
* locales/rw_RW: likewise
* locales/sa_IN: likewise
* locales/sd_IN: likewise
* locales/sd_IN@devanagari: likewise
* locales/sd_PK: likewise
* locales/se_NO: likewise
* locales/sgs_LT: likewise
* locales/si_LK: likewise
* locales/sk_SK: likewise
* locales/sl_SI: likewise
* locales/sm_WS: likewise
* locales/so_SO: likewise
* locales/sq_AL: likewise
* locales/ss_ZA: likewise
* locales/st_ZA: likewise
* locales/sv_SE: likewise
* locales/sw_KE: likewise
* locales/ta_IN: likewise
* locales/te_IN: likewise
* locales/th_TH: likewise
* locales/ti_ET: likewise
* locales/tn_ZA: likewise
* locales/to_TO: likewise
* locales/tpi_PG: likewise
* locales/tr_TR: likewise
* locales/ts_ZA: likewise
* locales/unm_US: likewise
* locales/ur_IN: likewise
* locales/ur_PK: likewise
* locales/ve_ZA: likewise
* locales/vi_VN: likewise
* locales/wa_BE: likewise
* locales/wo_SN: likewise
* locales/xh_ZA: likewise
* locales/yi_US: likewise
* locales/zh_CN: likewise
* locales/zu_ZA: likewise
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
@@ -2292,6 +2292,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
@@ -56,6 +56,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
@@ -1396,6 +1396,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
@@ -166,6 +166,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
@@ -86,6 +86,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
@@ -2311,6 +2311,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
@@ -69,6 +69,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
@@ -167,6 +167,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
@@ -52,6 +52,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
@@ -50,6 +50,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
@@ -109,6 +109,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
@@ -79,6 +79,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
@@ -42,6 +42,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
@@ -137,6 +137,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
@@ -47,6 +47,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
@@ -153,6 +153,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
@@ -478,6 +478,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
@@ -55,6 +55,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
@@ -2161,6 +2161,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
@@ -1682,6 +1682,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
@@ -158,6 +158,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
@@ -873,6 +873,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
@@ -78,6 +78,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "<U0065><U005E>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
@@ -51,6 +51,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
@@ -77,6 +77,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
@@ -2122,6 +2122,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
@@ -53,6 +53,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
@@ -140,6 +140,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
@@ -142,6 +142,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
@@ -144,6 +144,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
@@ -74,6 +74,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
@@ -205,6 +205,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
@@ -59,6 +59,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
@@ -139,6 +139,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
@@ -866,6 +866,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
@@ -2430,6 +2430,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
@@ -0,0 +1,151 @@
+escape_char /
+comment_char %
+
+% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
+% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=8590
+% Up to three characters are required to do a reversible transliteration.
+
+LC_CTYPE
+
+translit_start
+
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>";<U0059>
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>";<U005A>
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>";<U0043>
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>";<U0043>
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>";<U0053>
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0060><U0060>";<U0060>
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0027>";<U0059>
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>";<U0045>
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>";<U0059>
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>";<U0059>
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>";<U007A>
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>";<U0063>
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>";<U0063>
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>";<U0073>
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>";<U0073>
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>";<U0060>
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0027>";<U0079>
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>";<U0065>
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>";<U0079>
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>";<U0079>
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>";<U0079>
+
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
@@ -69,6 +69,7 @@
<U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
<U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
@@ -55,6 +55,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
@@ -73,6 +73,7 @@
<U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
<U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
<U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 5169 bytes --]
From: Carlos O'Donell <carlos@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, libc-alpha@sourceware.org, libc-locales@sourceware.org
Cc: "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>
Subject: Re: SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
Date: Tue, 17 Jul 2018 15:59:27 -0400
Message-ID: <e2d97c8e-ba84-2244-aec6-fc0d5f560570@redhat.com>
On 07/17/2018 03:50 PM, Egor Kobylkin wrote:
> On 17.07.2018 21:40, Carlos O'Donell wrote:
>> On 07/17/2018 03:34 PM, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>
>> We are currently preparing for the 2.28 release and it may take
>> a while to review this change and the structure of the changes,
>> and the data itself.
>>
>> Is it OK if this material is reviewed for 2.29 inclusion (after
>> August 1st)?
>
> It's fine with me to postpone it for 2.29 inclusion (after August 1st).
> Should I send a reminder in August?
Yes please, ping the original patches again in August and we can
review. In the meantime others may feel free to review, but we won't
consider them for inclusion yet, i.e. they won't block the release.
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-08-06 19:00 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
@ 2018-10-03 9:20 ` Egor Kobylkin
2018-10-03 9:32 ` Keld Simonsen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-03 9:20 UTC (permalink / raw)
To: libc-alpha, libc-locales
Cc: Dmitry V. Levin, Volodymyr Lisivka, Carlos O'Donell,
Max Kutny, danilo
Ping.
Absent feedback, I am wondering whether anything is missing in this
patch from the maintainers' standpoint. More than two months have passed
since the original submission.
If I can be of assistance, please do not hesitate to contact me,
Egor Kobylkin
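
For reviewers who want to sanity-check the mapping without rebuilding any
locales, here is a minimal Python sketch. It is not part of the patch and does
not use glibc: it only copies the first alternative listed for each letter in
translit_cyrillic and applies it character by character. The names LETTERS,
LATIN, TABLE and translit are hypothetical helpers, and the sample sentence is
assumed to be the Russian pangram behind the CYRILLIC line of the test input.

```python
# Illustrative sketch only; not part of the patch and not how iconv works
# internally. Each letter maps to the FIRST alternative listed for it in
# translit_cyrillic (e.g. <U0436> -> "zh", ignoring the "z" fallback).
LETTERS = "абвгдежзийклмнопрстуфхцчшщъыьэюяё"
LATIN = ("a b v g d e zh z i j k l m n o p r s t u f x "
         "cz ch sh shh `` y' ` e` yu ya yo").split()

# Lowercase entries, then the capitals; in the table every capital maps to
# the upper-cased form of the lowercase replacement (ZH, CZ, SHH, Y', ...).
TABLE = dict(zip(LETTERS, LATIN))
TABLE.update({c.upper(): s.upper() for c, s in zip(LETTERS, LATIN)})


def translit(text):
    """Replace each mapped Cyrillic letter; pass everything else through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


print(translit("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

Unlike iconv with //TRANSLIT, this sketch leaves unmapped characters unchanged
instead of replacing them with a question mark, and it never falls back to the
shorter second alternative.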
On 06.08.2018 21:00, Egor Kobylkin wrote:
> Dear locale maintainers,
>
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>
> add Cyrillic transliteration table translit_cyrillic file
>
> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>
> to localedata/locales/ and include it in all your locales going forward.
>
> Patch included inline below.
>
> This is a re-submission for consideration for 2.29 at the request of
> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
>
> Their maintainers are requested to make an explicit decision on how and
> whether at all to include this patch.
>
>
>
> Current bug effect:
>
> The glibc wiki explicitly lists this use case as the test example
>
> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>
> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>
> currently it fails on Cyrillic texts in most locales including ru_RU [1]
> [8] [9]:
>
> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>
> - It produces a string of question marks and spaces.
>
> This is what it should produce, and it does so after the patch is applied:
>
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>
>
> Root problem and the fix:
>
> The root problem is the missing transliteration table that I am
> supplying here. Furthermore it has to be referenced/included into the
> active locale at the compilation time to be used by iconv.
>
>
>
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from a
> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>
> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> a transliteration has only ASCII codes but still can be read by a native
> speaker. Among other things it is useful for processing the Cyrillic
> texts and filenames by programs or on systems that are not specifically
> prepared to work with Cyrillic, don't have corresponding fonts installed
> or can't handle UTF-8.
>
> The transliteration table itself is attached as a file translit_cyrillic
> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> (Federal Agency on Technical Regulating and Metrology Of Russian
> Federation [2]). Technically an independent but identical source [3] was
> used and prepared in a spreadsheet [6].
>
> The documentation suggests that a transliteration table is included by
> adding the line *include "translit_cyrillic";""* to the translit_start
> section of LC_CTYPE
> http://man7.org/linux/man-pages/man5/locale.5.html [5]
> In practice I searched for all locales that have a
> translit_start/translit_end stanza and generated a patch for them.
>
> The Cyrillic transliteration of e.g. Russian text may have already
> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> have their transliteration tables included inline.
> However it would not be the standard Russian Cyrillic transliteration as
> described above.
> I am excluding these locales from this proposed patch. I have written
> directly to locale maintainer emails listed in the files. Volodymyr
> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
> exclusion.
>
> Links:
>
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?id=8590
> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>
> Best regards,
> Egor Kobylkin
>
> ---
> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>
> [BZ #2872]
> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
> table from Cyrillic to Latin.
> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
> section.
> * locales/aa_DJ: likewise
> * locales/af_ZA: likewise
> * locales/ak_GH: likewise
> * locales/am_ET: likewise
> * locales/ar_EG: likewise
> * locales/be_BY: likewise
> * locales/bem_ZM: likewise
> * locales/ber_DZ: likewise
> * locales/ber_MA: likewise
> * locales/bg_BG: likewise
> * locales/bi_VU: likewise
> * locales/bn_BD: likewise
> * locales/bo_CN: likewise
> * locales/ca_ES: likewise
> * locales/ce_RU: likewise
> * locales/cs_CZ: likewise
> * locales/cv_RU: likewise
> * locales/cy_GB: likewise
> * locales/da_DK: likewise
> * locales/de_DE: likewise
> * locales/dv_MV: likewise
> * locales/dz_BT: likewise
> * locales/el_GR: likewise
> * locales/en_GB: likewise
> * locales/en_NG: likewise
> * locales/en_ZM: likewise
> * locales/es_CU: likewise
> * locales/es_ES: likewise
> * locales/et_EE: likewise
> * locales/fa_IR: likewise
> * locales/ff_SN: likewise
> * locales/fi_FI: likewise
> * locales/fr_FR: likewise
> * locales/ga_IE: likewise
> * locales/gd_GB: likewise
> * locales/gu_IN: likewise
> * locales/gv_GB: likewise
> * locales/he_IL: likewise
> * locales/hi_IN: likewise
> * locales/hif_FJ: likewise
> * locales/hr_HR: likewise
> * locales/ht_HT: likewise
> * locales/hu_HU: likewise
> * locales/hy_AM: likewise
> * locales/id_ID: likewise
> * locales/is_IS: likewise
> * locales/it_IT: likewise
> * locales/ja_JP: likewise
> * locales/kk_KZ: likewise
> * locales/km_KH: likewise
> * locales/kn_IN: likewise
> * locales/ko_KR: likewise
> * locales/ks_IN: likewise
> * locales/kw_GB: likewise
> * locales/lb_LU: likewise
> * locales/lg_UG: likewise
> * locales/lij_IT: likewise
> * locales/ln_CD: likewise
> * locales/lo_LA: likewise
> * locales/lt_LT: likewise
> * locales/lv_LV: likewise
> * locales/mg_MG: likewise
> * locales/mhr_RU: likewise
> * locales/mk_MK: likewise
> * locales/ml_IN: likewise
> * locales/ms_MY: likewise
> * locales/mt_MT: likewise
> * locales/nan_TW@latin: likewise
> * locales/nb_NO: likewise
> * locales/ne_NP: likewise
> * locales/nhn_MX: likewise
> * locales/niu_NU: likewise
> * locales/niu_NZ: likewise
> * locales/nl_NL: likewise
> * locales/nr_ZA: likewise
> * locales/oc_FR: likewise
> * locales/om_KE: likewise
> * locales/or_IN: likewise
> * locales/os_RU: likewise
> * locales/pa_IN: likewise
> * locales/pa_PK: likewise
> * locales/pl_PL: likewise
> * locales/pt_PT: likewise
> * locales/quz_PE: likewise
> * locales/ro_RO: likewise
> * locales/ru_RU: likewise
> * locales/rw_RW: likewise
> * locales/sa_IN: likewise
> * locales/sd_IN: likewise
> * locales/sd_IN@devanagari: likewise
> * locales/sd_PK: likewise
> * locales/se_NO: likewise
> * locales/sgs_LT: likewise
> * locales/si_LK: likewise
> * locales/sk_SK: likewise
> * locales/sl_SI: likewise
> * locales/sm_WS: likewise
> * locales/so_SO: likewise
> * locales/sq_AL: likewise
> * locales/ss_ZA: likewise
> * locales/st_ZA: likewise
> * locales/sv_SE: likewise
> * locales/sw_KE: likewise
> * locales/ta_IN: likewise
> * locales/te_IN: likewise
> * locales/th_TH: likewise
> * locales/ti_ET: likewise
> * locales/tn_ZA: likewise
> * locales/to_TO: likewise
> * locales/tpi_PG: likewise
> * locales/tr_TR: likewise
> * locales/ts_ZA: likewise
> * locales/unm_US: likewise
> * locales/ur_IN: likewise
> * locales/ur_PK: likewise
> * locales/ve_ZA: likewise
> * locales/vi_VN: likewise
> * locales/wa_BE: likewise
> * locales/wo_SN: likewise
> * locales/xh_ZA: likewise
> * locales/yi_US: likewise
> * locales/zh_CN: likewise
> * locales/zu_ZA: likewise
>
>
> diff -uNr a/localedata/locales/C b/localedata/locales/C
> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
> @@ -2292,6 +2292,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
> @@ -70,6 +70,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
> @@ -72,6 +72,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
> @@ -56,6 +56,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
> @@ -1396,6 +1396,7 @@
> <U137A> <U0060><U0039><U0030>
> <U137B> <U0060><U0031><U0030><U0030>
> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> +include "translit_cyrillic";""
> translit_end
> %
> END LC_CTYPE
> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
> +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
> @@ -44,6 +44,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
> @@ -69,6 +69,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
> @@ -42,6 +42,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
> @@ -166,6 +166,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
> @@ -86,6 +86,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
> @@ -49,6 +49,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
> @@ -39,6 +39,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
> @@ -63,6 +63,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
> @@ -43,6 +43,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
> @@ -72,6 +72,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
> @@ -39,6 +39,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
> @@ -2311,6 +2311,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
> @@ -109,6 +109,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
> @@ -69,6 +69,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
> @@ -167,6 +167,7 @@
> % LATIN SMALL LETTER O WITH STROKE -> "oe"
> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
> @@ -78,6 +78,7 @@
> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
> <U201F> <U00AB>;<U0022>
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
> @@ -52,6 +52,7 @@
> include "translit_combining";""
>
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
> @@ -55,6 +55,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
> @@ -50,6 +50,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
> @@ -42,6 +42,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
> @@ -73,6 +73,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
> @@ -109,6 +109,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
> @@ -79,6 +79,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
> @@ -42,6 +42,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
> @@ -137,6 +137,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
> % In France, accents are simply omitted if they cannot be represented.
> include "translit_combining";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
> @@ -54,6 +54,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
> @@ -47,6 +47,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
> @@ -62,6 +62,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
> @@ -57,6 +57,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
> @@ -61,6 +61,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
> @@ -37,6 +37,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
> @@ -153,6 +153,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
> @@ -478,6 +478,7 @@
> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
> @@ -77,6 +77,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
> @@ -55,6 +55,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
> @@ -2161,6 +2161,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
> @@ -1682,6 +1682,7 @@
> include "translit_combining";""
> include "translit_cjk_variants";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
> @@ -158,6 +158,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
> @@ -873,6 +873,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
> @@ -63,6 +63,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
> @@ -6099,6 +6099,7 @@
> include "translit_combining";""
> include "translit_hangul";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
> @@ -46,6 +46,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
> @@ -58,6 +58,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
> @@ -78,6 +78,7 @@
> % LATIN SMALL LETTER E WITH CIRCUMFLEX
> <U00EA> "<U0065><U005E>"
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
> @@ -57,6 +57,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
> @@ -47,6 +47,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
> @@ -39,6 +39,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
> @@ -51,6 +51,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
> @@ -77,6 +77,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
> @@ -2122,6 +2122,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
> @@ -55,6 +55,7 @@
> % Accents are simply omitted if they cannot be represented.
> include "translit_combining";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
> @@ -49,6 +49,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
> %
> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
> @@ -45,6 +45,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
> @@ -47,6 +47,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
> @@ -53,6 +53,7 @@
> % accents are simply omitted if they cannot be represented.
> include "translit_combining";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
> @@ -154,6 +154,7 @@
> % LATIN SMALL LETTER O WITH STROKE -> "oe"
> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
> @@ -43,6 +43,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
> @@ -57,6 +57,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
> @@ -66,6 +66,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
> @@ -62,6 +62,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
> @@ -140,6 +140,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
> --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
> @@ -62,6 +62,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
> @@ -70,6 +70,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
> @@ -60,6 +60,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
> @@ -58,6 +58,7 @@
> % Farsi yeh -> yeh
> <U06CC> "<U064A>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
> @@ -142,6 +142,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
> @@ -59,6 +59,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
> @@ -57,6 +57,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
> @@ -144,6 +144,7 @@
> <U0162> "<U021A>";"<U0054>"
> <U0163> "<U021B>";"<U0074>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
> @@ -74,6 +74,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
> @@ -45,6 +45,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
> @@ -44,6 +44,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
> @@ -46,6 +46,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
> @@ -44,6 +44,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
> @@ -39,6 +39,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
> @@ -205,6 +205,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
> @@ -59,6 +59,7 @@
> copy "i18n"
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
> @@ -45,6 +45,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
> @@ -68,6 +68,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
> @@ -91,6 +91,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
> @@ -37,6 +37,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
> @@ -70,6 +70,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
> @@ -45,6 +45,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
> @@ -68,6 +68,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
> @@ -64,6 +64,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
> @@ -139,6 +139,7 @@
> % LATIN SMALL LETTER O WITH STROKE -> "oe"
> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
> @@ -44,6 +44,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
> --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
> @@ -63,6 +63,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
> @@ -63,6 +63,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
> @@ -58,6 +58,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
> @@ -866,6 +866,7 @@
> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> %
> END LC_CTYPE
> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
> @@ -69,6 +69,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
> @@ -36,6 +36,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
> @@ -37,6 +37,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
> @@ -2430,6 +2430,7 @@
>
> % TURKISH LIRA SIGN
> <U20BA> "<U0054><U004C>"
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
> @@ -0,0 +1,151 @@
> +escape_char /
> +comment_char %
> +
> +% Transliterations that converts cyrillic letters to ascii symbols
> inspired by GOST 7.79-2000
> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> +% Generated from UnicodeData.txt with
> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
> +% Up to three characters are required to do a reversible transliteration.
> +
> +LC_CTYPE
> +
> +translit_start
> +
> +
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> "<U0059><U004F>";<U0059>
> +% CYRILLIC CAPITAL LETTER A
> +<U0410> <U0041>
> +% CYRILLIC CAPITAL LETTER BE
> +<U0411> <U0042>
> +% CYRILLIC CAPITAL LETTER VE
> +<U0412> <U0056>
> +% CYRILLIC CAPITAL LETTER GHE
> +<U0413> <U0047>
> +% CYRILLIC CAPITAL LETTER DE
> +<U0414> <U0044>
> +% CYRILLIC CAPITAL LETTER IE
> +<U0415> <U0045>
> +% CYRILLIC CAPITAL LETTER ZHE
> +<U0416> "<U005A><U0048>";<U005A>
> +% CYRILLIC CAPITAL LETTER ZE
> +<U0417> <U005A>
> +% CYRILLIC CAPITAL LETTER I
> +<U0418> <U0049>
> +% CYRILLIC CAPITAL LETTER SHORT I
> +<U0419> <U004A>
> +% CYRILLIC CAPITAL LETTER KA
> +<U041A> <U004B>
> +% CYRILLIC CAPITAL LETTER EL
> +<U041B> <U004C>
> +% CYRILLIC CAPITAL LETTER EM
> +<U041C> <U004D>
> +% CYRILLIC CAPITAL LETTER EN
> +<U041D> <U004E>
> +% CYRILLIC CAPITAL LETTER O
> +<U041E> <U004F>
> +% CYRILLIC CAPITAL LETTER PE
> +<U041F> <U0050>
> +% CYRILLIC CAPITAL LETTER ER
> +<U0420> <U0052>
> +% CYRILLIC CAPITAL LETTER ES
> +<U0421> <U0053>
> +% CYRILLIC CAPITAL LETTER TE
> +<U0422> <U0054>
> +% CYRILLIC CAPITAL LETTER U
> +<U0423> <U0055>
> +% CYRILLIC CAPITAL LETTER EF
> +<U0424> <U0046>
> +% CYRILLIC CAPITAL LETTER HA
> +<U0425> <U0058>
> +% CYRILLIC CAPITAL LETTER TSE
> +<U0426> "<U0043><U005A>";<U0043>
> +% CYRILLIC CAPITAL LETTER CHE
> +<U0427> "<U0043><U0048>";<U0043>
> +% CYRILLIC CAPITAL LETTER SHA
> +<U0428> "<U0053><U0048>";<U0053>
> +% CYRILLIC CAPITAL LETTER SHCHA
> +<U0429> "<U0053><U0048><U0048>";<U0053>
> +% CYRILLIC CAPITAL LETTER HARD SIGN
> +<U042A> "<U0060><U0060>";<U0060>
> +% CYRILLIC CAPITAL LETTER YERU
> +<U042B> "<U0059><U0027>";<U0059>
> +% CYRILLIC CAPITAL LETTER SOFT SIGN
> +<U042C> <U0060>
> +% CYRILLIC CAPITAL LETTER E
> +<U042D> "<U0045><U0060>";<U0045>
> +% CYRILLIC CAPITAL LETTER YU
> +<U042E> "<U0059><U0055>";<U0059>
> +% CYRILLIC CAPITAL LETTER YA
> +<U042F> "<U0059><U0041>";<U0059>
> +% CYRILLIC SMALL LETTER A
> +<U0430> <U0061>
> +% CYRILLIC SMALL LETTER BE
> +<U0431> <U0062>
> +% CYRILLIC SMALL LETTER VE
> +<U0432> <U0076>
> +% CYRILLIC SMALL LETTER GHE
> +<U0433> <U0067>
> +% CYRILLIC SMALL LETTER DE
> +<U0434> <U0064>
> +% CYRILLIC SMALL LETTER IE
> +<U0435> <U0065>
> +% CYRILLIC SMALL LETTER ZHE
> +<U0436> "<U007A><U0068>";<U007A>
> +% CYRILLIC SMALL LETTER ZE
> +<U0437> <U007A>
> +% CYRILLIC SMALL LETTER I
> +<U0438> <U0069>
> +% CYRILLIC SMALL LETTER SHORT I
> +<U0439> <U006A>
> +% CYRILLIC SMALL LETTER KA
> +<U043A> <U006B>
> +% CYRILLIC SMALL LETTER EL
> +<U043B> <U006C>
> +% CYRILLIC SMALL LETTER EM
> +<U043C> <U006D>
> +% CYRILLIC SMALL LETTER EN
> +<U043D> <U006E>
> +% CYRILLIC SMALL LETTER O
> +<U043E> <U006F>
> +% CYRILLIC SMALL LETTER PE
> +<U043F> <U0070>
> +% CYRILLIC SMALL LETTER ER
> +<U0440> <U0072>
> +% CYRILLIC SMALL LETTER ES
> +<U0441> <U0073>
> +% CYRILLIC SMALL LETTER TE
> +<U0442> <U0074>
> +% CYRILLIC SMALL LETTER U
> +<U0443> <U0075>
> +% CYRILLIC SMALL LETTER EF
> +<U0444> <U0066>
> +% CYRILLIC SMALL LETTER HA
> +<U0445> <U0078>
> +% CYRILLIC SMALL LETTER TSE
> +<U0446> "<U0063><U007A>";<U0063>
> +% CYRILLIC SMALL LETTER CHE
> +<U0447> "<U0063><U0068>";<U0063>
> +% CYRILLIC SMALL LETTER SHA
> +<U0448> "<U0073><U0068>";<U0073>
> +% CYRILLIC SMALL LETTER SHCHA
> +<U0449> "<U0073><U0068><U0068>";<U0073>
> +% CYRILLIC SMALL LETTER HARD SIGN
> +<U044A> "<U0060><U0060>";<U0060>
> +% CYRILLIC SMALL LETTER YERU
> +<U044B> "<U0079><U0027>";<U0079>
> +% CYRILLIC SMALL LETTER SOFT SIGN
> +<U044C> <U0060>
> +% CYRILLIC SMALL LETTER E
> +<U044D> "<U0065><U0060>";<U0065>
> +% CYRILLIC SMALL LETTER YU
> +<U044E> "<U0079><U0075>";<U0079>
> +% CYRILLIC SMALL LETTER YA
> +<U044F> "<U0079><U0061>";<U0079>
> +% CYRILLIC SMALL LETTER IO
> +<U0451> "<U0079><U006F>";<U0079>
> +
> +
> +translit_end
> +
> +END LC_CTYPE
> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
> @@ -64,6 +64,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
> @@ -48,6 +48,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
> @@ -46,6 +46,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
> % Farsi yeh -> yeh
> <U06CC> "<U064A>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
> @@ -67,6 +67,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
> % dong sign -> d// -> dd
> <U20AB> "<U0111>";"<U0064><U0064>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
> @@ -69,6 +69,7 @@
> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
> @@ -55,6 +55,7 @@
> % Accents are simply omitted if they cannot be represented.
> include "translit_combining";""
>
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
> @@ -66,6 +66,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
> @@ -73,6 +73,7 @@
> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
> +include "translit_cyrillic";""
> translit_end
>
> END LC_CTYPE
> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
> @@ -58,6 +58,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
>
> class "hanzi"; /
> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
> @@ -70,6 +70,7 @@
>
> translit_start
> include "translit_combining";""
> +include "translit_cyrillic";""
> translit_end
> END LC_CTYPE
>
>
>
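For readers following the translit_cyrillic hunk quoted above: an entry such as `<U0416> "<U005A><U0048>";<U005A>` lists replacement candidates separated by semicolons. The sketch below is a rough Python model of that lookup, not glibc's implementation. The names `LOWER`, `LATIN`, `build_table` and `transliterate` are made up for illustration, and it assumes candidates are tried left to right until one is representable in the target charset.

```python
# Illustrative model of the translit_cyrillic table; not glibc code.
# Lowercase Cyrillic letters and their primary Latin replacements,
# in the same pairing as the quoted table (U+0430..U+044F, then U+0451).
LOWER = "абвгдежзийклмнопрстуфхцчшщъыьэюяё"
LATIN = ["a", "b", "v", "g", "d", "e", "zh", "z", "i", "j", "k", "l", "m",
         "n", "o", "p", "r", "s", "t", "u", "f", "x", "cz", "ch", "sh",
         "shh", "``", "y'", "`", "e`", "yu", "ya", "yo"]


def build_table():
    """Map each letter to its ordered candidate list, e.g. 'ж' -> ['zh', 'z']."""
    table = {}
    for cyr, lat in zip(LOWER, LATIN):
        # Capital entries in the table are the uppercased lowercase entries.
        for src, dst in ((cyr, lat), (cyr.upper(), lat.upper())):
            # Multi-character entries carry a one-character fallback,
            # mirroring the "primary";fallback form in the locale file.
            table[src] = [dst, dst[0]] if len(dst) > 1 else [dst]
    return table


TABLE = build_table()


def representable(text, encoding):
    """Return True if text can be encoded in the target charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding="ascii"):
    """Replace unrepresentable characters with the first usable candidate."""
    out = []
    for ch in text:
        if representable(ch, encoding):
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if representable(candidate, encoding):
                out.append(candidate)
                break
        else:
            # No entry (or no usable candidate): model the '?' seen in the bug.
            out.append("?")
    return "".join(out)


print(transliterate("Съешь ещё этих мягких французских булок, да выпей же чаю."))
```

With this table the pangram from the test input comes out as the string quoted in the submission (S``esh` eshhyo e`tix ...), while a letter the table does not cover, such as U+0407, is still replaced by a question mark.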
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-03 9:20 ` Egor Kobylkin
@ 2018-10-03 9:32 ` Keld Simonsen
2018-10-03 15:01 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Keld Simonsen @ 2018-10-03 9:32 UTC (permalink / raw)
To: Egor Kobylkin
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi
Please note that transliteration of Cyrillic to Latin is not universal.
There are different schemes for, for example, German, English and Danish, and
there is also an ISO standard for it.
But do go forward with fixing this bug.
Best regards
Keld
On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
> Ping.
>
> Absent feedback, I am wondering if anything could be missing in this
> patch from the maintainers' standpoint. More than two months have passed
> since the original submission.
>
> If I can be of assistance, please do not hesitate to contact me,
> Egor Kobylkin
>
> On 06.08.2018 21:00, Egor Kobylkin wrote:
> > Dear locale maintainers,
> >
> > fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
> >
> > add Cyrillic transliteration table translit_cyrillic file
> >
> > https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
> >
> > to localedata/locales/ and include it in all your locales going forward.
> >
> > Patch included inline below.
> >
> > This is a re-submission for the consideration for 2.29 on a request from
> > Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
> >
> > From this patch I have excluded locales that already mention cyrillic or
> > have a transliteration table for it:
> > az_AZ
> > iso14651_t1_common
> > ky_KG
> > mn_MN
> > sr_RS
> > tg_TJ
> > tk_TM
> > tt_RU
> > uk_UA
> > uz_UZ
> > uz_UZ@cyrillic
> >
> > Their maintainers are requested to make an explicit decision on how and
> > whether at all to include this patch.
> >
> >
> >
> > Current bug effect:
> >
> > The glibc wiki explicitly lists this use case as the test example
> >
> > https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
> >
> > LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt
> >
> > currently it fails on Cyrillic texts in most locales including ru_RU [1]
> > [8] [9]:
> >
> > LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
> > translit-test-input.txt |grep CYRILLIC
> >
> > CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
> >
> > - It produces a string of question marks and spaces.
> >
> > This is what it should produce and it does so after the patch applied:
> >
> > CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> > chayu.
> >
> >
> > Root problem and the fix:
> >
> > The root problem is the missing transliteration table that I am
> > supplying here. Furthermore it has to be referenced/included into the
> > active locale at the compilation time to be used by iconv.
> >
> >
> >
> > COMMIT MESSAGE:
> > This translit_cyrillic table enables conversion (e.g. with iconv) from a
> > UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
> >
> > While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
> > a transliteration has only ASCII codes but can still be read by a native
> > speaker. Among other things it is useful for processing the Cyrillic
> > texts and filenames by programs or on systems that are not specifically
> > prepared to work with Cyrillic, don't have corresponding fonts installed
> > or can't handle UTF-8.
> >
> > The transliteration table itself is attached as a file translit_cyrillic
> > [7]. Its content (mapping) is based on GOST 7.79-2000 official source
> > (Federal Agency on Technical Regulating and Metrology Of Russian
> > Federation [2]). Technically an independent but identical source [3] was
> > used and prepared in a spreadsheet [6].
> >
> > The documentation suggests that the transliteration tables inclusion is
> > done by adding *include "translit_cyrillic";""* string into LC_CTYPE
> > translit_start section
> > http://man7.org/linux/man-pages/man5/locale.5.html [5]
> > In practice, I searched for all locales that have a
> > translit_start/translit_end stanza and generated a patch for them.
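As an illustration of the step described in the quoted paragraph, here is a small Python sketch that adds the include line to a locale file's text. It is a guess at the kind of scripted edit involved, not the tool actually used to generate the patch. The function name `add_cyrillic_include` is hypothetical, and the placement immediately before `translit_end` is taken from the hunks in the patch.

```python
# Hypothetical helper; the original patch may have been produced differently.
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert the include line just before the first translit_end.

    Text that already has the include, or that has no translit stanza,
    is returned unchanged.
    """
    lines = locale_text.splitlines()
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_text
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE)
            break
    else:
        # No translit_start/translit_end stanza in this locale.
        return locale_text
    trailing_newline = "\n" if locale_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(sample))
```

Putting the new include last in the stanza keeps a locale's own inline rules and its existing includes ahead of it, which is the layout the hunks for tr_TR and da_DK show.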
> >
> > The Cyrillic transliteration of e.g. Russian text may have already
> > worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
> > have their transliteration tables included inline.
> > However it would not be the standard Russian Cyrillic transliteration as
> > described above.
> > I am excluding these locales from this proposed patch. I have written
> > directly to locale maintainer emails listed in the files. Volodymyr
> > Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> > Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
> > exclusion.
> >
> > Links:
> >
> > [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> > [2] GOST 7.79-2000 official source
> > http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> > available in low quality gif format)
> > [3] http://transliteration.ru/gost-7-79-2000/ and
> > http://www.yfermer.ru/specifications/285821.html
> > [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> > https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> > [5] http://man7.org/linux/man-pages/man5/locale.5.html
> > [6] Spreadsheet for generating translit_cyrillic
> > https://sourceware.org/bugzilla/attachment.cgi?id=8590
> > [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
> > [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> > [9] translit-test-input.txt
> > https://sourceware.org/bugzilla/attachment.cgi?id=8618
> >
> > Best regards,
> > Egor Kobylkin
> >
> > ---
> > 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
> >
> > [BZ #2872]
> > * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
> > table from Cyrillic to Latin.
> > * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
> > section.
> > * locales/aa_DJ: likewise
> > * locales/af_ZA: likewise
> > * locales/ak_GH: likewise
> > * locales/am_ET: likewise
> > * locales/ar_EG: likewise
> > * locales/be_BY: likewise
> > * locales/bem_ZM: likewise
> > * locales/ber_DZ: likewise
> > * locales/ber_MA: likewise
> > * locales/bg_BG: likewise
> > * locales/bi_VU: likewise
> > * locales/bn_BD: likewise
> > * locales/bo_CN: likewise
> > * locales/ca_ES: likewise
> > * locales/ce_RU: likewise
> > * locales/cs_CZ: likewise
> > * locales/cv_RU: likewise
> > * locales/cy_GB: likewise
> > * locales/da_DK: likewise
> > * locales/de_DE: likewise
> > * locales/dv_MV: likewise
> > * locales/dz_BT: likewise
> > * locales/el_GR: likewise
> > * locales/en_GB: likewise
> > * locales/en_NG: likewise
> > * locales/en_ZM: likewise
> > * locales/es_CU: likewise
> > * locales/es_ES: likewise
> > * locales/et_EE: likewise
> > * locales/fa_IR: likewise
> > * locales/ff_SN: likewise
> > * locales/fi_FI: likewise
> > * locales/fr_FR: likewise
> > * locales/ga_IE: likewise
> > * locales/gd_GB: likewise
> > * locales/gu_IN: likewise
> > * locales/gv_GB: likewise
> > * locales/he_IL: likewise
> > * locales/hi_IN: likewise
> > * locales/hif_FJ: likewise
> > * locales/hr_HR: likewise
> > * locales/ht_HT: likewise
> > * locales/hu_HU: likewise
> > * locales/hy_AM: likewise
> > * locales/id_ID: likewise
> > * locales/is_IS: likewise
> > * locales/it_IT: likewise
> > * locales/ja_JP: likewise
> > * locales/kk_KZ: likewise
> > * locales/km_KH: likewise
> > * locales/kn_IN: likewise
> > * locales/ko_KR: likewise
> > * locales/ks_IN: likewise
> > * locales/kw_GB: likewise
> > * locales/lb_LU: likewise
> > * locales/lg_UG: likewise
> > * locales/lij_IT: likewise
> > * locales/ln_CD: likewise
> > * locales/lo_LA: likewise
> > * locales/lt_LT: likewise
> > * locales/lv_LV: likewise
> > * locales/mg_MG: likewise
> > * locales/mhr_RU: likewise
> > * locales/mk_MK: likewise
> > * locales/ml_IN: likewise
> > * locales/ms_MY: likewise
> > * locales/mt_MT: likewise
> > * locales/nan_TW@latin: likewise
> > * locales/nb_NO: likewise
> > * locales/ne_NP: likewise
> > * locales/nhn_MX: likewise
> > * locales/niu_NU: likewise
> > * locales/niu_NZ: likewise
> > * locales/nl_NL: likewise
> > * locales/nr_ZA: likewise
> > * locales/oc_FR: likewise
> > * locales/om_KE: likewise
> > * locales/or_IN: likewise
> > * locales/os_RU: likewise
> > * locales/pa_IN: likewise
> > * locales/pa_PK: likewise
> > * locales/pl_PL: likewise
> > * locales/pt_PT: likewise
> > * locales/quz_PE: likewise
> > * locales/ro_RO: likewise
> > * locales/ru_RU: likewise
> > * locales/rw_RW: likewise
> > * locales/sa_IN: likewise
> > * locales/sd_IN: likewise
> > * locales/sd_IN@devanagari: likewise
> > * locales/sd_PK: likewise
> > * locales/se_NO: likewise
> > * locales/sgs_LT: likewise
> > * locales/si_LK: likewise
> > * locales/sk_SK: likewise
> > * locales/sl_SI: likewise
> > * locales/sm_WS: likewise
> > * locales/so_SO: likewise
> > * locales/sq_AL: likewise
> > * locales/ss_ZA: likewise
> > * locales/st_ZA: likewise
> > * locales/sv_SE: likewise
> > * locales/sw_KE: likewise
> > * locales/ta_IN: likewise
> > * locales/te_IN: likewise
> > * locales/th_TH: likewise
> > * locales/ti_ET: likewise
> > * locales/tn_ZA: likewise
> > * locales/to_TO: likewise
> > * locales/tpi_PG: likewise
> > * locales/tr_TR: likewise
> > * locales/ts_ZA: likewise
> > * locales/unm_US: likewise
> > * locales/ur_IN: likewise
> > * locales/ur_PK: likewise
> > * locales/ve_ZA: likewise
> > * locales/vi_VN: likewise
> > * locales/wa_BE: likewise
> > * locales/wo_SN: likewise
> > * locales/xh_ZA: likewise
> > * locales/yi_US: likewise
> > * locales/zh_CN: likewise
> > * locales/zu_ZA: likewise
> >
> >
> > diff -uNr a/localedata/locales/C b/localedata/locales/C
> > --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
> > @@ -2292,6 +2292,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
> > --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
> > @@ -70,6 +70,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
> > --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
> > @@ -72,6 +72,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
> > --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
> > @@ -56,6 +56,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> > --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
> > @@ -1396,6 +1396,7 @@
> > <U137A> <U0060><U0039><U0030>
> > <U137B> <U0060><U0031><U0030><U0030>
> > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> > +include "translit_cyrillic";""
> > translit_end
> > %
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
> > --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
> > +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
> > @@ -44,6 +44,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
> > --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
> > @@ -69,6 +69,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
> > --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
> > @@ -42,6 +42,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
> > --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
> > @@ -166,6 +166,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
> > --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
> > @@ -86,6 +86,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
> > --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
> > @@ -49,6 +49,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
> > --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
> > @@ -39,6 +39,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
> > --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
> > @@ -63,6 +63,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
> > --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
> > @@ -43,6 +43,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
> > --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
> > @@ -72,6 +72,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
> > --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
> > @@ -39,6 +39,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
> > --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
> > +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
> > @@ -2311,6 +2311,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
> > --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
> > @@ -109,6 +109,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
> > --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
> > @@ -69,6 +69,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
> > --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
> > @@ -167,6 +167,7 @@
> > % LATIN SMALL LETTER O WITH STROKE -> "oe"
> > <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
> > --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
> > @@ -78,6 +78,7 @@
> > % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
> > <U201F> <U00AB>;<U0022>
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
> > --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
> > @@ -52,6 +52,7 @@
> > include "translit_combining";""
> >
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
> > --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
> > --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
> > --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
> > @@ -55,6 +55,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
> > --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
> > +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
> > @@ -50,6 +50,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
> > --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
> > @@ -42,6 +42,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
> > --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
> > --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
> > @@ -73,6 +73,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
> > --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
> > @@ -109,6 +109,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
> > --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
> > @@ -79,6 +79,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
> > --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
> > @@ -42,6 +42,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
> > --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
> > +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
> > @@ -137,6 +137,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
> > --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> > % In France, accents are simply omitted if they cannot be represented.
> > include "translit_combining";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
> > --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
> > @@ -54,6 +54,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
> > --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
> > @@ -47,6 +47,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
> > --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
> > @@ -62,6 +62,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
> > --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
> > @@ -57,6 +57,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
> > --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
> > --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
> > @@ -61,6 +61,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
> > --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
> > @@ -37,6 +37,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
> > --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
> > @@ -153,6 +153,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
> > --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
> > --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
> > +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
> > @@ -478,6 +478,7 @@
> > <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
> > <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
> > --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
> > @@ -77,6 +77,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
> > --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
> > @@ -55,6 +55,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
> > --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
> > @@ -2161,6 +2161,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
> > --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
> > --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
> > @@ -1682,6 +1682,7 @@
> > include "translit_combining";""
> > include "translit_cjk_variants";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
> > --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
> > @@ -158,6 +158,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
> > --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
> > @@ -873,6 +873,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
> > --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
> > @@ -63,6 +63,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
> > --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
> > @@ -6099,6 +6099,7 @@
> > include "translit_combining";""
> > include "translit_hangul";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
> > --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
> > @@ -46,6 +46,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
> > --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
> > @@ -58,6 +58,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
> > --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
> > @@ -78,6 +78,7 @@
> > % LATIN SMALL LETTER E WITH CIRCUMFLEX
> > <U00EA> "<U0065><U005E>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
> > --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
> > @@ -57,6 +57,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
> > --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
> > +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
> > @@ -47,6 +47,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
> > --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
> > @@ -39,6 +39,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
> > --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
> > @@ -51,6 +51,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
> > --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
> > @@ -77,6 +77,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
> > --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
> > @@ -2122,6 +2122,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
> > --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
> > @@ -55,6 +55,7 @@
> > % Accents are simply omitted if they cannot be represented.
> > include "translit_combining";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
> > --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
> > --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
> > @@ -49,6 +49,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
> > --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> > %
> > diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
> > --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
> > @@ -45,6 +45,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
> > --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
> > @@ -47,6 +47,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
> > --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
> > @@ -53,6 +53,7 @@
> > % accents are simply omitted if they cannot be represented.
> > include "translit_combining";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
> > --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
> > @@ -154,6 +154,7 @@
> > % LATIN SMALL LETTER O WITH STROKE -> "oe"
> > <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
> > --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
> > @@ -43,6 +43,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
> > --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
> > --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
> > --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
> > --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
> > +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
> > @@ -57,6 +57,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
> > --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
> > @@ -66,6 +66,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
> > --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
> > @@ -62,6 +62,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
> > --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
> > @@ -140,6 +140,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
> > --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
> > @@ -62,6 +62,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
> > --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
> > @@ -70,6 +70,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
> > --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
> > @@ -60,6 +60,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
> > --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
> > @@ -58,6 +58,7 @@
> > % Farsi yeh -> yeh
> > <U06CC> "<U064A>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
> > --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
> > @@ -142,6 +142,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
> > --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
> > @@ -59,6 +59,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
> > --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
> > @@ -57,6 +57,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
> > --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
> > @@ -144,6 +144,7 @@
> > <U0162> "<U021A>";"<U0054>"
> > <U0163> "<U021B>";"<U0074>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
> > --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
> > @@ -74,6 +74,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
> > --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
> > @@ -45,6 +45,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
> > --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
> > @@ -44,6 +44,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
> > --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
> > @@ -46,6 +46,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
> > --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
> > @@ -44,6 +44,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> > --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
> > @@ -39,6 +39,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
> > --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
> > @@ -205,6 +205,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
> > --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
> > @@ -59,6 +59,7 @@
> > copy "i18n"
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
> > --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
> > @@ -45,6 +45,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
> > --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
> > @@ -68,6 +68,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
> > --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
> > +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
> > @@ -91,6 +91,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
> > --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
> > @@ -37,6 +37,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
> > --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
> > @@ -70,6 +70,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
> > --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
> > @@ -45,6 +45,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
> > --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
> > @@ -68,6 +68,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
> > --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
> > @@ -64,6 +64,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
> > --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
> > @@ -139,6 +139,7 @@
> > % LATIN SMALL LETTER O WITH STROKE -> "oe"
> > <U00F8> "<U006F><U0338>";"<U006F><U0065>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
> > --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
> > @@ -44,6 +44,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
> > --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
> > @@ -63,6 +63,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
> > --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
> > @@ -63,6 +63,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
> > --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
> > @@ -58,6 +58,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
> > --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
> > @@ -866,6 +866,7 @@
> > <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> >
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > %
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
> > --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
> > @@ -69,6 +69,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
> > --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
> > @@ -36,6 +36,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
> > --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
> > +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
> > @@ -37,6 +37,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
> > --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
> > @@ -2430,6 +2430,7 @@
> >
> > % TURKISH LIRA SIGN
> > <U20BA> "<U0054><U004C>"
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
> > --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
> > +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
> > @@ -0,0 +1,151 @@
> > +escape_char /
> > +comment_char %
> > +
> > +% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
> > +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> > +% Generated from UnicodeData.txt with
> > +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
> > +% Up to three characters are required to do a reversible transliteration.
> > +
> > +LC_CTYPE
> > +
> > +translit_start
> > +
> > +
> > +% CYRILLIC CAPITAL LETTER IO
> > +<U0401> "<U0059><U004F>";<U0059>
> > +% CYRILLIC CAPITAL LETTER A
> > +<U0410> <U0041>
> > +% CYRILLIC CAPITAL LETTER BE
> > +<U0411> <U0042>
> > +% CYRILLIC CAPITAL LETTER VE
> > +<U0412> <U0056>
> > +% CYRILLIC CAPITAL LETTER GHE
> > +<U0413> <U0047>
> > +% CYRILLIC CAPITAL LETTER DE
> > +<U0414> <U0044>
> > +% CYRILLIC CAPITAL LETTER IE
> > +<U0415> <U0045>
> > +% CYRILLIC CAPITAL LETTER ZHE
> > +<U0416> "<U005A><U0048>";<U005A>
> > +% CYRILLIC CAPITAL LETTER ZE
> > +<U0417> <U005A>
> > +% CYRILLIC CAPITAL LETTER I
> > +<U0418> <U0049>
> > +% CYRILLIC CAPITAL LETTER SHORT I
> > +<U0419> <U004A>
> > +% CYRILLIC CAPITAL LETTER KA
> > +<U041A> <U004B>
> > +% CYRILLIC CAPITAL LETTER EL
> > +<U041B> <U004C>
> > +% CYRILLIC CAPITAL LETTER EM
> > +<U041C> <U004D>
> > +% CYRILLIC CAPITAL LETTER EN
> > +<U041D> <U004E>
> > +% CYRILLIC CAPITAL LETTER O
> > +<U041E> <U004F>
> > +% CYRILLIC CAPITAL LETTER PE
> > +<U041F> <U0050>
> > +% CYRILLIC CAPITAL LETTER ER
> > +<U0420> <U0052>
> > +% CYRILLIC CAPITAL LETTER ES
> > +<U0421> <U0053>
> > +% CYRILLIC CAPITAL LETTER TE
> > +<U0422> <U0054>
> > +% CYRILLIC CAPITAL LETTER U
> > +<U0423> <U0055>
> > +% CYRILLIC CAPITAL LETTER EF
> > +<U0424> <U0046>
> > +% CYRILLIC CAPITAL LETTER HA
> > +<U0425> <U0058>
> > +% CYRILLIC CAPITAL LETTER TSE
> > +<U0426> "<U0043><U005A>";<U0043>
> > +% CYRILLIC CAPITAL LETTER CHE
> > +<U0427> "<U0043><U0048>";<U0043>
> > +% CYRILLIC CAPITAL LETTER SHA
> > +<U0428> "<U0053><U0048>";<U0053>
> > +% CYRILLIC CAPITAL LETTER SHCHA
> > +<U0429> "<U0053><U0048><U0048>";<U0053>
> > +% CYRILLIC CAPITAL LETTER HARD SIGN
> > +<U042A> "<U0060><U0060>";<U0060>
> > +% CYRILLIC CAPITAL LETTER YERU
> > +<U042B> "<U0059><U0027>";<U0059>
> > +% CYRILLIC CAPITAL LETTER SOFT SIGN
> > +<U042C> <U0060>
> > +% CYRILLIC CAPITAL LETTER E
> > +<U042D> "<U0045><U0060>";<U0045>
> > +% CYRILLIC CAPITAL LETTER YU
> > +<U042E> "<U0059><U0055>";<U0059>
> > +% CYRILLIC CAPITAL LETTER YA
> > +<U042F> "<U0059><U0041>";<U0059>
> > +% CYRILLIC SMALL LETTER A
> > +<U0430> <U0061>
> > +% CYRILLIC SMALL LETTER BE
> > +<U0431> <U0062>
> > +% CYRILLIC SMALL LETTER VE
> > +<U0432> <U0076>
> > +% CYRILLIC SMALL LETTER GHE
> > +<U0433> <U0067>
> > +% CYRILLIC SMALL LETTER DE
> > +<U0434> <U0064>
> > +% CYRILLIC SMALL LETTER IE
> > +<U0435> <U0065>
> > +% CYRILLIC SMALL LETTER ZHE
> > +<U0436> "<U007A><U0068>";<U007A>
> > +% CYRILLIC SMALL LETTER ZE
> > +<U0437> <U007A>
> > +% CYRILLIC SMALL LETTER I
> > +<U0438> <U0069>
> > +% CYRILLIC SMALL LETTER SHORT I
> > +<U0439> <U006A>
> > +% CYRILLIC SMALL LETTER KA
> > +<U043A> <U006B>
> > +% CYRILLIC SMALL LETTER EL
> > +<U043B> <U006C>
> > +% CYRILLIC SMALL LETTER EM
> > +<U043C> <U006D>
> > +% CYRILLIC SMALL LETTER EN
> > +<U043D> <U006E>
> > +% CYRILLIC SMALL LETTER O
> > +<U043E> <U006F>
> > +% CYRILLIC SMALL LETTER PE
> > +<U043F> <U0070>
> > +% CYRILLIC SMALL LETTER ER
> > +<U0440> <U0072>
> > +% CYRILLIC SMALL LETTER ES
> > +<U0441> <U0073>
> > +% CYRILLIC SMALL LETTER TE
> > +<U0442> <U0074>
> > +% CYRILLIC SMALL LETTER U
> > +<U0443> <U0075>
> > +% CYRILLIC SMALL LETTER EF
> > +<U0444> <U0066>
> > +% CYRILLIC SMALL LETTER HA
> > +<U0445> <U0078>
> > +% CYRILLIC SMALL LETTER TSE
> > +<U0446> "<U0063><U007A>";<U0063>
> > +% CYRILLIC SMALL LETTER CHE
> > +<U0447> "<U0063><U0068>";<U0063>
> > +% CYRILLIC SMALL LETTER SHA
> > +<U0448> "<U0073><U0068>";<U0073>
> > +% CYRILLIC SMALL LETTER SHCHA
> > +<U0449> "<U0073><U0068><U0068>";<U0073>
> > +% CYRILLIC SMALL LETTER HARD SIGN
> > +<U044A> "<U0060><U0060>";<U0060>
> > +% CYRILLIC SMALL LETTER YERU
> > +<U044B> "<U0079><U0027>";<U0079>
> > +% CYRILLIC SMALL LETTER SOFT SIGN
> > +<U044C> <U0060>
> > +% CYRILLIC SMALL LETTER E
> > +<U044D> "<U0065><U0060>";<U0065>
> > +% CYRILLIC SMALL LETTER YU
> > +<U044E> "<U0079><U0075>";<U0079>
> > +% CYRILLIC SMALL LETTER YA
> > +<U044F> "<U0079><U0061>";<U0079>
> > +% CYRILLIC SMALL LETTER IO
> > +<U0451> "<U0079><U006F>";<U0079>
> > +
> > +
> > +translit_end
> > +
> > +END LC_CTYPE
> > diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
> > --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
> > @@ -64,6 +64,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
> > --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
> > @@ -48,6 +48,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
> > --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
> > @@ -46,6 +46,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
> > --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> > % Farsi yeh -> yeh
> > <U06CC> "<U064A>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
> > --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
> > @@ -67,6 +67,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
> > --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> > % dong sign -> d// -> dd
> > <U20AB> "<U0111>";"<U0064><U0064>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
> > --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
> > @@ -69,6 +69,7 @@
> > <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
> > <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
> >
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
> > --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
> > @@ -55,6 +55,7 @@
> > % Accents are simply omitted if they cannot be represented.
> > include "translit_combining";""
> >
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
> > --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
> > @@ -66,6 +66,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> > diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
> > --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
> > @@ -73,6 +73,7 @@
> > <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
> > <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
> > <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
> > +include "translit_cyrillic";""
> > translit_end
> >
> > END LC_CTYPE
> > diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
> > --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
> > +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
> > @@ -58,6 +58,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> >
> > class "hanzi"; /
> > diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
> > --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
> > +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
> > @@ -70,6 +70,7 @@
> >
> > translit_start
> > include "translit_combining";""
> > +include "translit_cyrillic";""
> > translit_end
> > END LC_CTYPE
> >
> >
> >
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-03 9:32 ` Keld Simonsen
@ 2018-10-03 15:01 ` Egor Kobylkin
2018-10-05 9:20 ` Marko Myllynen
2018-10-05 9:56 ` Rafal Luzynski
0 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-03 15:01 UTC (permalink / raw)
To: Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
On 03.10.2018 11:19, Keld Simonsen wrote:
> Hi
>
> Please note that transliteration of Cyrillic to Latin is not universal.
> There are different schemes for, e.g., German, English and Danish, and
> there is also an ISO standard for it.
Thanks for your feedback, Keld!
Could the locale maintainers who would prefer not to include this patch
state so explicitly here?
That is:
- If a specific locale has a different preferred Cyrillic
transliteration table, its maintainers may want to point me to it so
that I can supply a separate table/patch.
- Or they could state explicitly that, for whatever reason, they would
like to exclude their locale from this default Cyrillic transliteration
patch altogether.
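
For maintainers weighing either option, here is a rough Python model of how I read the entries in the proposed table: each line maps one Cyrillic code point to a semicolon-separated list of alternatives, and the first alternative that fits the target charset is used. This is only a sketch of my understanding, not glibc's actual implementation; the helper names (decode, parse_table, transliterate) are made up for illustration, and the three entries are copied from the patch.

```python
import re

# Three entries copied from the proposed translit_cyrillic table.
TABLE_TEXT = '''
% CYRILLIC CAPITAL LETTER TSE
<U0426> "<U0043><U005A>";<U0043>
% CYRILLIC SMALL LETTER ZHE
<U0436> "<U007A><U0068>";<U007A>
% CYRILLIC SMALL LETTER ZE
<U0437> <U007A>
'''

UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0043><U005A>' (optionally double-quoted) into 'CZ'."""
    return "".join(chr(int(h, 16)) for h in UCS.findall(token))


def parse_table(text):
    """Hypothetical parser: source character -> ordered list of alternatives."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        source, alternatives = line.split(None, 1)
        table[decode(source)] = [decode(a) for a in alternatives.split(";")]
    return table


def transliterate(text, table, encoding="ascii"):
    """Assumed model: keep encodable characters, else take the first
    alternative that fits the target encoding, else emit '?'."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        for candidate in table.get(ch, []):
            try:
                candidate.encode(encoding)
            except UnicodeEncodeError:
                continue
            out.append(candidate)
            break
        else:
            out.append("?")
    return "".join(out)


TABLE = parse_table(TABLE_TEXT)
```

A locale that prefers a different scheme would only need different right-hand sides for the same code points, which is why a separate table per scheme seems workable to me.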
--Egor
>
> But do go forward with fixing this bug.
>
> Best regards
> Keld
>
> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>> Ping.
>>
>> In the absence of feedback, I am wondering whether anything is missing
>> from this patch from the maintainers' standpoint. More than two months
>> have passed since the original submission.
>>
>> If I can be of assistance, please do not hesitate to contact me,
>> Egor Kobylkin
>>
>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>> Dear locale maintainers,
>>>
>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>
>>> add Cyrillic transliteration table translit_cyrillic file
>>>
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>
>>> to localedata/locales/ and include it in all your locales going forward.
>>>
>>> Patch included inline below.
>>>
>>> This is a re-submission for the consideration for 2.29 on a request from
>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>
>>> From this patch I have excluded locales that already mention cyrillic or
>>> have a transliteration table for it:
>>> az_AZ
>>> iso14651_t1_common
>>> ky_KG
>>> mn_MN
>>> sr_RS
>>> tg_TJ
>>> tk_TM
>>> tt_RU
>>> uk_UA
>>> uz_UZ
>>> uz_UZ@cyrillic
>>>
>>> Their maintainers are requested to make an explicit decision on how and
>>> whether at all to include this patch.
>>>
>>>
>>>
>>> Current bug effect:
>>>
>>> The glibc wiki explicitly lists this use case as the test example
>>>
>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>
>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
>>>
>>> Currently it fails on Cyrillic text in most locales, including ru_RU
>>> [1] [8] [9]:
>>>
>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
>>>
>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>
>>> - It produces a string of question marks and spaces.
>>>
>>> This is what it should produce, and does produce once the patch is applied:
>>>
>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
>>>
>>>
>>> Root problem and the fix:
>>>
>>> The root problem is the missing transliteration table that I am
>>> supplying here. Furthermore it has to be referenced/included into the
>>> active locale at the compilation time to be used by iconv.
>>>
>>>
>>>
>>> COMMIT MESSAGE:
>>> This translit_cyrillic table enables conversion (e.g. with iconv) from
>>> UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.
>>>
>>> While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result
>>> of the transliteration contains only ASCII characters yet can still be
>>> read by a native speaker. Among other things, this is useful for
>>> processing Cyrillic texts and filenames with programs or on systems
>>> that are not specifically prepared to work with Cyrillic, do not have
>>> the corresponding fonts installed, or cannot handle UTF-8.
>>>
>>> The transliteration table itself is attached as the file
>>> translit_cyrillic [7]. Its content (the mapping) is based on the
>>> official GOST 7.79-2000 source (Federal Agency on Technical Regulating
>>> and Metrology of the Russian Federation [2]). In practice, an
>>> independent but identical source [3] was used and processed in a
>>> spreadsheet [6].
>>>
>>> The documentation suggests that a transliteration table is included
>>> by adding the line *include "translit_cyrillic";""* to the
>>> translit_start section of LC_CTYPE
>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>> In practice, I searched for all locales that have a
>>> translit_start/translit_end stanza and generated a patch for them.
>>>
>>> The Cyrillic transliteration of e.g. Russian text may have already
>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>> have their transliteration tables included inline.
>>> However it would not be the standard Russian Cyrillic transliteration as
>>> described above.
>>> I am excluding these locales from this proposed patch. I have written
>>> directly to the maintainer email addresses listed in the locale files.
>>> Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
>>> (uk_UA) and Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have
>>> confirmed the exclusion.
>>>
>>> Links:
>>>
>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>> [2] GOST 7.79-2000 official source
>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>> available in low quality gif format)
>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>> http://www.yfermer.ru/specifications/285821.html
>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>> [6] Spreadsheet for generating translit_cyrillic
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>> [9] translit-test-input.txt
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>
>>> Best regards,
>>> Egor Kobylkin
>>>
>>> ---
>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>>>
>>> [BZ #2872]
>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>> table from Cyrillic to Latin.
>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>> section.
>>> * locales/aa_DJ: likewise
>>> * locales/af_ZA: likewise
>>> * locales/ak_GH: likewise
>>> * locales/am_ET: likewise
>>> * locales/ar_EG: likewise
>>> * locales/be_BY: likewise
>>> * locales/bem_ZM: likewise
>>> * locales/ber_DZ: likewise
>>> * locales/ber_MA: likewise
>>> * locales/bg_BG: likewise
>>> * locales/bi_VU: likewise
>>> * locales/bn_BD: likewise
>>> * locales/bo_CN: likewise
>>> * locales/ca_ES: likewise
>>> * locales/ce_RU: likewise
>>> * locales/cs_CZ: likewise
>>> * locales/cv_RU: likewise
>>> * locales/cy_GB: likewise
>>> * locales/da_DK: likewise
>>> * locales/de_DE: likewise
>>> * locales/dv_MV: likewise
>>> * locales/dz_BT: likewise
>>> * locales/el_GR: likewise
>>> * locales/en_GB: likewise
>>> * locales/en_NG: likewise
>>> * locales/en_ZM: likewise
>>> * locales/es_CU: likewise
>>> * locales/es_ES: likewise
>>> * locales/et_EE: likewise
>>> * locales/fa_IR: likewise
>>> * locales/ff_SN: likewise
>>> * locales/fi_FI: likewise
>>> * locales/fr_FR: likewise
>>> * locales/ga_IE: likewise
>>> * locales/gd_GB: likewise
>>> * locales/gu_IN: likewise
>>> * locales/gv_GB: likewise
>>> * locales/he_IL: likewise
>>> * locales/hi_IN: likewise
>>> * locales/hif_FJ: likewise
>>> * locales/hr_HR: likewise
>>> * locales/ht_HT: likewise
>>> * locales/hu_HU: likewise
>>> * locales/hy_AM: likewise
>>> * locales/id_ID: likewise
>>> * locales/is_IS: likewise
>>> * locales/it_IT: likewise
>>> * locales/ja_JP: likewise
>>> * locales/kk_KZ: likewise
>>> * locales/km_KH: likewise
>>> * locales/kn_IN: likewise
>>> * locales/ko_KR: likewise
>>> * locales/ks_IN: likewise
>>> * locales/kw_GB: likewise
>>> * locales/lb_LU: likewise
>>> * locales/lg_UG: likewise
>>> * locales/lij_IT: likewise
>>> * locales/ln_CD: likewise
>>> * locales/lo_LA: likewise
>>> * locales/lt_LT: likewise
>>> * locales/lv_LV: likewise
>>> * locales/mg_MG: likewise
>>> * locales/mhr_RU: likewise
>>> * locales/mk_MK: likewise
>>> * locales/ml_IN: likewise
>>> * locales/ms_MY: likewise
>>> * locales/mt_MT: likewise
>>> * locales/nan_TW@latin: likewise
>>> * locales/nb_NO: likewise
>>> * locales/ne_NP: likewise
>>> * locales/nhn_MX: likewise
>>> * locales/niu_NU: likewise
>>> * locales/niu_NZ: likewise
>>> * locales/nl_NL: likewise
>>> * locales/nr_ZA: likewise
>>> * locales/oc_FR: likewise
>>> * locales/om_KE: likewise
>>> * locales/or_IN: likewise
>>> * locales/os_RU: likewise
>>> * locales/pa_IN: likewise
>>> * locales/pa_PK: likewise
>>> * locales/pl_PL: likewise
>>> * locales/pt_PT: likewise
>>> * locales/quz_PE: likewise
>>> * locales/ro_RO: likewise
>>> * locales/ru_RU: likewise
>>> * locales/rw_RW: likewise
>>> * locales/sa_IN: likewise
>>> * locales/sd_IN: likewise
>>> * locales/sd_IN@devanagari: likewise
>>> * locales/sd_PK: likewise
>>> * locales/se_NO: likewise
>>> * locales/sgs_LT: likewise
>>> * locales/si_LK: likewise
>>> * locales/sk_SK: likewise
>>> * locales/sl_SI: likewise
>>> * locales/sm_WS: likewise
>>> * locales/so_SO: likewise
>>> * locales/sq_AL: likewise
>>> * locales/ss_ZA: likewise
>>> * locales/st_ZA: likewise
>>> * locales/sv_SE: likewise
>>> * locales/sw_KE: likewise
>>> * locales/ta_IN: likewise
>>> * locales/te_IN: likewise
>>> * locales/th_TH: likewise
>>> * locales/ti_ET: likewise
>>> * locales/tn_ZA: likewise
>>> * locales/to_TO: likewise
>>> * locales/tpi_PG: likewise
>>> * locales/tr_TR: likewise
>>> * locales/ts_ZA: likewise
>>> * locales/unm_US: likewise
>>> * locales/ur_IN: likewise
>>> * locales/ur_PK: likewise
>>> * locales/ve_ZA: likewise
>>> * locales/vi_VN: likewise
>>> * locales/wa_BE: likewise
>>> * locales/wo_SN: likewise
>>> * locales/xh_ZA: likewise
>>> * locales/yi_US: likewise
>>> * locales/zh_CN: likewise
>>> * locales/zu_ZA: likewise
>>>
>>>
>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
>>> @@ -2292,6 +2292,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
>>> @@ -72,6 +72,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
>>> @@ -56,6 +56,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
>>> @@ -1396,6 +1396,7 @@
>>> <U137A> <U0060><U0039><U0030>
>>> <U137B> <U0060><U0031><U0030><U0030>
>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> %
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
>>> +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
>>> @@ -166,6 +166,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
>>> @@ -86,6 +86,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
>>> @@ -49,6 +49,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
>>> @@ -43,6 +43,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
>>> @@ -72,6 +72,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
>>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
>>> @@ -2311,6 +2311,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
>>> @@ -109,6 +109,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
>>> @@ -167,6 +167,7 @@
>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
>>> @@ -78,6 +78,7 @@
>>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>> <U201F> <U00AB>;<U0022>
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
>>> @@ -52,6 +52,7 @@
>>> include "translit_combining";""
>>>
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
>>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
>>> @@ -50,6 +50,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
>>> @@ -73,6 +73,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
>>> @@ -109,6 +109,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
>>> @@ -79,6 +79,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
>>> @@ -42,6 +42,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
>>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
>>> @@ -137,6 +137,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>> % In France, accents are simply omitted if they cannot be represented.
>>> include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
>>> @@ -54,6 +54,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
>>> @@ -61,6 +61,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
>>> @@ -153,6 +153,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
>>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
>>> @@ -478,6 +478,7 @@
>>> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
>>> @@ -77,6 +77,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
>>> @@ -2161,6 +2161,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
>>> @@ -1682,6 +1682,7 @@
>>> include "translit_combining";""
>>> include "translit_cjk_variants";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
>>> @@ -158,6 +158,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
>>> @@ -873,6 +873,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
>>> @@ -6099,6 +6099,7 @@
>>> include "translit_combining";""
>>> include "translit_hangul";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
>>> @@ -78,6 +78,7 @@
>>> % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>> <U00EA> "<U0065><U005E>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
>>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
>>> @@ -51,6 +51,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
>>> @@ -77,6 +77,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
>>> @@ -2122,6 +2122,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>> % Accents are simply omitted if they cannot be represented.
>>> include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
>>> @@ -49,6 +49,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>> %
>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
>>> @@ -47,6 +47,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
>>> @@ -53,6 +53,7 @@
>>> % accents are simply omitted if they cannot be represented.
>>> include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
>>> @@ -154,6 +154,7 @@
>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
>>> @@ -43,6 +43,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
>>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
>>> @@ -66,6 +66,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
>>> @@ -140,6 +140,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>> --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
>>> @@ -62,6 +62,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
>>> @@ -60,6 +60,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>> % Farsi yeh -> yeh
>>> <U06CC> "<U064A>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
>>> @@ -142,6 +142,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
>>> @@ -57,6 +57,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
>>> @@ -144,6 +144,7 @@
>>> <U0162> "<U021A>";"<U0054>"
>>> <U0163> "<U021B>";"<U0074>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
>>> @@ -74,6 +74,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
>>> @@ -39,6 +39,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
>>> @@ -205,6 +205,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
>>> @@ -59,6 +59,7 @@
>>> copy "i18n"
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
>>> @@ -68,6 +68,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
>>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
>>> @@ -91,6 +91,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
>>> @@ -45,6 +45,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
>>> @@ -68,6 +68,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
>>> @@ -64,6 +64,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
>>> @@ -139,6 +139,7 @@
>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
>>> @@ -44,6 +44,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>> --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
>>> @@ -63,6 +63,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
>>> @@ -866,6 +866,7 @@
>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> %
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
>>> @@ -36,6 +36,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
>>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
>>> @@ -37,6 +37,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
>>> @@ -2430,6 +2430,7 @@
>>>
>>> % TURKISH LIRA SIGN
>>> <U20BA> "<U0054><U004C>"
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
>>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
>>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
>>> @@ -0,0 +1,151 @@
>>> +escape_char /
>>> +comment_char %
>>> +
>>> +% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>> +% Generated from UnicodeData.txt with
>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>> +% Up to three characters are required to do a reversible transliteration.
>>> +
>>> +LC_CTYPE
>>> +
>>> +translit_start
>>> +
>>> +
>>> +% CYRILLIC CAPITAL LETTER IO
>>> +<U0401> "<U0059><U004F>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER A
>>> +<U0410> <U0041>
>>> +% CYRILLIC CAPITAL LETTER BE
>>> +<U0411> <U0042>
>>> +% CYRILLIC CAPITAL LETTER VE
>>> +<U0412> <U0056>
>>> +% CYRILLIC CAPITAL LETTER GHE
>>> +<U0413> <U0047>
>>> +% CYRILLIC CAPITAL LETTER DE
>>> +<U0414> <U0044>
>>> +% CYRILLIC CAPITAL LETTER IE
>>> +<U0415> <U0045>
>>> +% CYRILLIC CAPITAL LETTER ZHE
>>> +<U0416> "<U005A><U0048>";<U005A>
>>> +% CYRILLIC CAPITAL LETTER ZE
>>> +<U0417> <U005A>
>>> +% CYRILLIC CAPITAL LETTER I
>>> +<U0418> <U0049>
>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>> +<U0419> <U004A>
>>> +% CYRILLIC CAPITAL LETTER KA
>>> +<U041A> <U004B>
>>> +% CYRILLIC CAPITAL LETTER EL
>>> +<U041B> <U004C>
>>> +% CYRILLIC CAPITAL LETTER EM
>>> +<U041C> <U004D>
>>> +% CYRILLIC CAPITAL LETTER EN
>>> +<U041D> <U004E>
>>> +% CYRILLIC CAPITAL LETTER O
>>> +<U041E> <U004F>
>>> +% CYRILLIC CAPITAL LETTER PE
>>> +<U041F> <U0050>
>>> +% CYRILLIC CAPITAL LETTER ER
>>> +<U0420> <U0052>
>>> +% CYRILLIC CAPITAL LETTER ES
>>> +<U0421> <U0053>
>>> +% CYRILLIC CAPITAL LETTER TE
>>> +<U0422> <U0054>
>>> +% CYRILLIC CAPITAL LETTER U
>>> +<U0423> <U0055>
>>> +% CYRILLIC CAPITAL LETTER EF
>>> +<U0424> <U0046>
>>> +% CYRILLIC CAPITAL LETTER HA
>>> +<U0425> <U0058>
>>> +% CYRILLIC CAPITAL LETTER TSE
>>> +<U0426> "<U0043><U005A>";<U0043>
>>> +% CYRILLIC CAPITAL LETTER CHE
>>> +<U0427> "<U0043><U0048>";<U0043>
>>> +% CYRILLIC CAPITAL LETTER SHA
>>> +<U0428> "<U0053><U0048>";<U0053>
>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>> +<U042A> "<U0060><U0060>";<U0060>
>>> +% CYRILLIC CAPITAL LETTER YERU
>>> +<U042B> "<U0059><U0027>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>> +<U042C> <U0060>
>>> +% CYRILLIC CAPITAL LETTER E
>>> +<U042D> "<U0045><U0060>";<U0045>
>>> +% CYRILLIC CAPITAL LETTER YU
>>> +<U042E> "<U0059><U0055>";<U0059>
>>> +% CYRILLIC CAPITAL LETTER YA
>>> +<U042F> "<U0059><U0041>";<U0059>
>>> +% CYRILLIC SMALL LETTER A
>>> +<U0430> <U0061>
>>> +% CYRILLIC SMALL LETTER BE
>>> +<U0431> <U0062>
>>> +% CYRILLIC SMALL LETTER VE
>>> +<U0432> <U0076>
>>> +% CYRILLIC SMALL LETTER GHE
>>> +<U0433> <U0067>
>>> +% CYRILLIC SMALL LETTER DE
>>> +<U0434> <U0064>
>>> +% CYRILLIC SMALL LETTER IE
>>> +<U0435> <U0065>
>>> +% CYRILLIC SMALL LETTER ZHE
>>> +<U0436> "<U007A><U0068>";<U007A>
>>> +% CYRILLIC SMALL LETTER ZE
>>> +<U0437> <U007A>
>>> +% CYRILLIC SMALL LETTER I
>>> +<U0438> <U0069>
>>> +% CYRILLIC SMALL LETTER SHORT I
>>> +<U0439> <U006A>
>>> +% CYRILLIC SMALL LETTER KA
>>> +<U043A> <U006B>
>>> +% CYRILLIC SMALL LETTER EL
>>> +<U043B> <U006C>
>>> +% CYRILLIC SMALL LETTER EM
>>> +<U043C> <U006D>
>>> +% CYRILLIC SMALL LETTER EN
>>> +<U043D> <U006E>
>>> +% CYRILLIC SMALL LETTER O
>>> +<U043E> <U006F>
>>> +% CYRILLIC SMALL LETTER PE
>>> +<U043F> <U0070>
>>> +% CYRILLIC SMALL LETTER ER
>>> +<U0440> <U0072>
>>> +% CYRILLIC SMALL LETTER ES
>>> +<U0441> <U0073>
>>> +% CYRILLIC SMALL LETTER TE
>>> +<U0442> <U0074>
>>> +% CYRILLIC SMALL LETTER U
>>> +<U0443> <U0075>
>>> +% CYRILLIC SMALL LETTER EF
>>> +<U0444> <U0066>
>>> +% CYRILLIC SMALL LETTER HA
>>> +<U0445> <U0078>
>>> +% CYRILLIC SMALL LETTER TSE
>>> +<U0446> "<U0063><U007A>";<U0063>
>>> +% CYRILLIC SMALL LETTER CHE
>>> +<U0447> "<U0063><U0068>";<U0063>
>>> +% CYRILLIC SMALL LETTER SHA
>>> +<U0448> "<U0073><U0068>";<U0073>
>>> +% CYRILLIC SMALL LETTER SHCHA
>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>> +<U044A> "<U0060><U0060>";<U0060>
>>> +% CYRILLIC SMALL LETTER YERU
>>> +<U044B> "<U0079><U0027>";<U0079>
>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>> +<U044C> <U0060>
>>> +% CYRILLIC SMALL LETTER E
>>> +<U044D> "<U0065><U0060>";<U0065>
>>> +% CYRILLIC SMALL LETTER YU
>>> +<U044E> "<U0079><U0075>";<U0079>
>>> +% CYRILLIC SMALL LETTER YA
>>> +<U044F> "<U0079><U0061>";<U0079>
>>> +% CYRILLIC SMALL LETTER IO
>>> +<U0451> "<U0079><U006F>";<U0079>
>>> +
>>> +
>>> +translit_end
>>> +
>>> +END LC_CTYPE
>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
>>> @@ -64,6 +64,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
>>> @@ -48,6 +48,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
>>> @@ -46,6 +46,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>> % Farsi yeh -> yeh
>>> <U06CC> "<U064A>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
>>> @@ -67,6 +67,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>> % dong sign -> d// -> dd
>>> <U20AB> "<U0111>";"<U0064><U0064>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
>>> @@ -69,6 +69,7 @@
>>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
>>> @@ -55,6 +55,7 @@
>>> % Accents are simply omitted if they cannot be represented.
>>> include "translit_combining";""
>>>
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
>>> @@ -66,6 +66,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
>>> @@ -73,6 +73,7 @@
>>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> END LC_CTYPE
>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
>>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
>>> @@ -58,6 +58,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>>
>>> class "hanzi"; /
>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
>>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
>>> @@ -70,6 +70,7 @@
>>>
>>> translit_start
>>> include "translit_combining";""
>>> +include "translit_cyrillic";""
>>> translit_end
>>> END LC_CTYPE
>>>
>>>
>>>
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-03 15:01 ` Egor Kobylkin
@ 2018-10-05 9:20 ` Marko Myllynen
2018-10-05 9:56 ` Rafal Luzynski
1 sibling, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05 9:20 UTC (permalink / raw)
To: Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi Egor,
Thanks for your patience with this one.
On 2018-10-03 12:32, Egor Kobylkin wrote:
> On 03.10.2018 11:19, Keld Simonsen wrote:
>>
>> Please note that transliteration of Cyrillic to Latin is not universal.
>> There are different schemes for German, English and Danish, for example, and
>> there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
>
> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.
The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?
I think it would be best to leave those locales which already have
Cyrillic transliteration defined as-is (as you've done) unless there are
some issues with them; there's probably a good reason why they were
added in the first place.
For other locales, using ISO 9 instead of not doing transliteration at
all may not be entirely correct but I'd suppose it's better to provide
at least some sort of transliteration (even if not entirely correct)
than sequences of question marks. But as you say, locale maintainers may
know better the case for individual locales.
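To illustrate the trade-off, here is a minimal Python sketch (not how
glibc or iconv implement it). It uses the small-letter mappings visible
in the quoted translit_cyrillic hunk, first alternative only, and falls
back to "?" for anything unmapped. The names TRANSLIT and to_ascii are
made up for this example.

```python
# Subset of the proposed translit_cyrillic table (first alternative only).
# TRANSLIT and to_ascii are illustrative names, not part of glibc.
TRANSLIT = {
    "\u0440": "r",    # CYRILLIC SMALL LETTER ER
    "\u0441": "s",    # ES
    "\u0442": "t",    # TE
    "\u0443": "u",    # U
    "\u0444": "f",    # EF
    "\u0445": "x",    # HA
    "\u0446": "cz",   # TSE
    "\u0447": "ch",   # CHE
    "\u0448": "sh",   # SHA
    "\u0449": "shh",  # SHCHA
    "\u044a": "``",   # HARD SIGN
    "\u044b": "y'",   # YERU
    "\u044c": "`",    # SOFT SIGN
    "\u044d": "e`",   # E
    "\u044e": "yu",   # YU
    "\u044f": "ya",   # YA
    "\u0451": "yo",   # IO
}


def to_ascii(text, table=TRANSLIT):
    """Replace each non-ASCII character via the table, or with '?'."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                  # already ASCII, keep as-is
        else:
            out.append(table.get(ch, "?"))  # unmapped -> question mark
    return "".join(out)


if __name__ == "__main__":
    word = "\u0447\u0430\u044e"      # U+0430 is not in the subset above
    print(to_ascii(word, table={}))  # no table at all
    print(to_ascii(word))            # partial table
```

With an empty table every Cyrillic letter turns into "?", while the
partial table already yields something a reader can guess at.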
Wrt language-specific differences Keld mentioned, Finnish Wikipedia
article on transliteration gives an example, see the table on right at
https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian /
international / Finnish / Swedish / English / French / German / Polish /
phonetic transliteration of a Russian name. (The table also shows that
for correct transliteration ASCII letters are not enough for some
languages.)
Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at
http://jkorpela.fi/iso9.html8 and
https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):
1) transliteration of Russian is mostly as per ISO 9 but with national
differences defined in SFS 4900
2) transliteration of Russian and Ukrainian names have some slight
differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
pronunciation of adjacent letters, for example U+0435 becomes U+0065 (e)
except when at the beginning of a word it becomes U+006A U+0065 (je)
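As a rough illustration of point 3), the Python sketch below handles
only the word-initial rule for U+0435. It is a simplification made for
this example: it ignores the pronunciation of adjacent letters, the
capital letter and every other Cyrillic character, and translit_e is a
hypothetical name. The point is that the output depends on the previous
character, which a per-character table entry cannot express.

```python
def translit_e(text):
    """Transliterate only U+0435; every other character passes through.

    U+0435 becomes "je" at the start of a word and "e" elsewhere.
    """
    out = []
    prev_is_letter = False  # True if the previous character was a letter
    for ch in text:
        if ch == "\u0435":
            out.append("e" if prev_is_letter else "je")
        else:
            out.append(ch)
        prev_is_letter = ch.isalpha()
    return "".join(out)


if __name__ == "__main__":
    # One U+0435 on its own, then one after a Latin placeholder letter.
    print(translit_e("\u0435 x\u0435"))
```

The same input character yields two different outputs here, so a fixed
one-to-one mapping has to pick one of them as a compromise.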
Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI, some differences referred to in 1) above are straightforward to
implement but for 2) and 3) some compromises probably need to be made,
unfortunately.
Thanks,
>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> Absent feedback, I am wondering whether anything is missing from this
>>> patch from the maintainers' standpoint. More than two months have passed
>>> since the original submission.
>>>
>>> If I can be of assistance, please do not hesitate to contact me,
>>> Egor Kobylkin
>>>
>>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>>> Dear locale maintainers,
>>>>
>>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>>
>>>> add Cyrillic transliteration table translit_cyrillic file
>>>>
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>>
>>>> to localedata/locales/ and include it in all your locales going forward.
>>>>
>>>> Patch included inline below.
>>>>
>>>> This is a re-submission for the consideration for 2.29 on a request from
>>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>>
>>>> From this patch I have excluded locales that already mention cyrillic or
>>>> have a transliteration table for it:
>>>> az_AZ
>>>> iso14651_t1_common
>>>> ky_KG
>>>> mn_MN
>>>> sr_RS
>>>> tg_TJ
>>>> tk_TM
>>>> tt_RU
>>>> uk_UA
>>>> uz_UZ
>>>> uz_UZ@cyrillic
>>>>
>>>> Their maintainers are requested to make an explicit decision on how and
>>>> whether at all to include this patch.
>>>>
>>>>
>>>>
>>>> Current bug effect:
>>>>
>>>> The glibc wiki explicitly lists this use case as the test example
>>>>
>>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>>
>>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt
>>>>
>>>> currently it fails on Cyrillic texts in most locales including ru_RU [1]
>>>> [8] [9]:
>>>>
>>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt |grep CYRILLIC
>>>>
>>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>>
>>>> - It produces a string of question marks and spaces.
>>>>
>>>> This is what it should produce, and does produce once the patch is applied:
>>>>
>>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
>>>> chayu.
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore it has to be referenced/included into the
>>>> active locale at the compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
>>>> a transliteration has only ASCII codes but can still be read by a native
>>>> speaker. Among other things it is useful for processing the Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>>> or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that a transliteration table is included by
>>>> adding the line *include "translit_cyrillic";""* to the translit_start
>>>> section of LC_CTYPE
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> In practice I have searched for all locales that have a
>>>> translit_start/end stanza and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch. I have written
>>>> directly to locale maintainer emails listed in the files. Volodymyr
>>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>>> exclusion.
>>>>
>>>> Links:
>>>>
>>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> [2] GOST 7.79-2000 official source
>>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>>> available in low quality gif format)
>>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>>> http://www.yfermer.ru/specifications/285821.html
>>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>>> [6] Spreadsheet for generating translit_cyrillic
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>>> [9] translit-test-input.txt
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>>
>>>> Best regards,
>>>> Egor Kobylkin
>>>>
>>>> ---
>>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>>>>
>>>> [BZ #2872]
>>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>>> table from Cyrillic to Latin.
>>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>>> section.
>>>> * locales/aa_DJ: likewise
>>>> * locales/af_ZA: likewise
>>>> * locales/ak_GH: likewise
>>>> * locales/am_ET: likewise
>>>> * locales/ar_EG: likewise
>>>> * locales/be_BY: likewise
>>>> * locales/bem_ZM: likewise
>>>> * locales/ber_DZ: likewise
>>>> * locales/ber_MA: likewise
>>>> * locales/bg_BG: likewise
>>>> * locales/bi_VU: likewise
>>>> * locales/bn_BD: likewise
>>>> * locales/bo_CN: likewise
>>>> * locales/ca_ES: likewise
>>>> * locales/ce_RU: likewise
>>>> * locales/cs_CZ: likewise
>>>> * locales/cv_RU: likewise
>>>> * locales/cy_GB: likewise
>>>> * locales/da_DK: likewise
>>>> * locales/de_DE: likewise
>>>> * locales/dv_MV: likewise
>>>> * locales/dz_BT: likewise
>>>> * locales/el_GR: likewise
>>>> * locales/en_GB: likewise
>>>> * locales/en_NG: likewise
>>>> * locales/en_ZM: likewise
>>>> * locales/es_CU: likewise
>>>> * locales/es_ES: likewise
>>>> * locales/et_EE: likewise
>>>> * locales/fa_IR: likewise
>>>> * locales/ff_SN: likewise
>>>> * locales/fi_FI: likewise
>>>> * locales/fr_FR: likewise
>>>> * locales/ga_IE: likewise
>>>> * locales/gd_GB: likewise
>>>> * locales/gu_IN: likewise
>>>> * locales/gv_GB: likewise
>>>> * locales/he_IL: likewise
>>>> * locales/hi_IN: likewise
>>>> * locales/hif_FJ: likewise
>>>> * locales/hr_HR: likewise
>>>> * locales/ht_HT: likewise
>>>> * locales/hu_HU: likewise
>>>> * locales/hy_AM: likewise
>>>> * locales/id_ID: likewise
>>>> * locales/is_IS: likewise
>>>> * locales/it_IT: likewise
>>>> * locales/ja_JP: likewise
>>>> * locales/kk_KZ: likewise
>>>> * locales/km_KH: likewise
>>>> * locales/kn_IN: likewise
>>>> * locales/ko_KR: likewise
>>>> * locales/ks_IN: likewise
>>>> * locales/kw_GB: likewise
>>>> * locales/lb_LU: likewise
>>>> * locales/lg_UG: likewise
>>>> * locales/lij_IT: likewise
>>>> * locales/ln_CD: likewise
>>>> * locales/lo_LA: likewise
>>>> * locales/lt_LT: likewise
>>>> * locales/lv_LV: likewise
>>>> * locales/mg_MG: likewise
>>>> * locales/mhr_RU: likewise
>>>> * locales/mk_MK: likewise
>>>> * locales/ml_IN: likewise
>>>> * locales/ms_MY: likewise
>>>> * locales/mt_MT: likewise
>>>> * locales/nan_TW@latin: likewise
>>>> * locales/nb_NO: likewise
>>>> * locales/ne_NP: likewise
>>>> * locales/nhn_MX: likewise
>>>> * locales/niu_NU: likewise
>>>> * locales/niu_NZ: likewise
>>>> * locales/nl_NL: likewise
>>>> * locales/nr_ZA: likewise
>>>> * locales/oc_FR: likewise
>>>> * locales/om_KE: likewise
>>>> * locales/or_IN: likewise
>>>> * locales/os_RU: likewise
>>>> * locales/pa_IN: likewise
>>>> * locales/pa_PK: likewise
>>>> * locales/pl_PL: likewise
>>>> * locales/pt_PT: likewise
>>>> * locales/quz_PE: likewise
>>>> * locales/ro_RO: likewise
>>>> * locales/ru_RU: likewise
>>>> * locales/rw_RW: likewise
>>>> * locales/sa_IN: likewise
>>>> * locales/sd_IN: likewise
>>>> * locales/sd_IN@devanagari: likewise
>>>> * locales/sd_PK: likewise
>>>> * locales/se_NO: likewise
>>>> * locales/sgs_LT: likewise
>>>> * locales/si_LK: likewise
>>>> * locales/sk_SK: likewise
>>>> * locales/sl_SI: likewise
>>>> * locales/sm_WS: likewise
>>>> * locales/so_SO: likewise
>>>> * locales/sq_AL: likewise
>>>> * locales/ss_ZA: likewise
>>>> * locales/st_ZA: likewise
>>>> * locales/sv_SE: likewise
>>>> * locales/sw_KE: likewise
>>>> * locales/ta_IN: likewise
>>>> * locales/te_IN: likewise
>>>> * locales/th_TH: likewise
>>>> * locales/ti_ET: likewise
>>>> * locales/tn_ZA: likewise
>>>> * locales/to_TO: likewise
>>>> * locales/tpi_PG: likewise
>>>> * locales/tr_TR: likewise
>>>> * locales/ts_ZA: likewise
>>>> * locales/unm_US: likewise
>>>> * locales/ur_IN: likewise
>>>> * locales/ur_PK: likewise
>>>> * locales/ve_ZA: likewise
>>>> * locales/vi_VN: likewise
>>>> * locales/wa_BE: likewise
>>>> * locales/wo_SN: likewise
>>>> * locales/xh_ZA: likewise
>>>> * locales/yi_US: likewise
>>>> * locales/zh_CN: likewise
>>>> * locales/zu_ZA: likewise
>>>>
>>>>
>>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -2292,6 +2292,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -56,6 +56,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -1396,6 +1396,7 @@
>>>> <U137A> <U0060><U0039><U0030>
>>>> <U137B> <U0060><U0031><U0030><U0030>
>>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> %
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -166,6 +166,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -86,6 +86,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -2311,6 +2311,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -167,6 +167,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>>> <U201F> <U00AB>;<U0022>
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -52,6 +52,7 @@
>>>> include "translit_combining";""
>>>>
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -50,6 +50,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -79,6 +79,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -137,6 +137,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> % In France, accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -54,6 +54,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>>> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -61,6 +61,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -153,6 +153,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -478,6 +478,7 @@
>>>> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -2161,6 +2161,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -1682,6 +1682,7 @@
>>>> include "translit_combining";""
>>>> include "translit_cjk_variants";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -158,6 +158,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -873,6 +873,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -6099,6 +6099,7 @@
>>>> include "translit_combining";""
>>>> include "translit_hangul";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>>> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>> % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>>> <U00EA> "<U0065><U005E>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -51,6 +51,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -2122,6 +2122,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>> % Accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>> %
>>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -53,6 +53,7 @@
>>>> % accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -154,6 +154,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>>> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -140,6 +140,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>>> --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % Farsi yeh -> yeh
>>>> <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -142,6 +142,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -144,6 +144,7 @@
>>>> <U0162> "<U021A>";"<U0054>"
>>>> <U0163> "<U021B>";"<U0074>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -74,6 +74,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -205,6 +205,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -91,6 +91,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>>> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -139,6 +139,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>>> --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -866,6 +866,7 @@
>>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>>
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> %
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -36,6 +36,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -2430,6 +2430,7 @@
>>>>
>>>> % TURKISH LIRA SIGN
>>>> <U20BA> "<U0054><U004C>"
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
>>>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
>>>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -0,0 +1,151 @@
>>>> +escape_char /
>>>> +comment_char %
>>>> +
>>>> +% Transliterations that convert Cyrillic letters to ASCII symbols, inspired by GOST 7.79-2000
>>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> +% Generated from UnicodeData.txt with
>>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> +% Up to three characters are required to do a reversible transliteration.
>>>> +
>>>> +LC_CTYPE
>>>> +
>>>> +translit_start
>>>> +
>>>> +
>>>> +% CYRILLIC CAPITAL LETTER IO
>>>> +<U0401> "<U0059><U004F>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER A
>>>> +<U0410> <U0041>
>>>> +% CYRILLIC CAPITAL LETTER BE
>>>> +<U0411> <U0042>
>>>> +% CYRILLIC CAPITAL LETTER VE
>>>> +<U0412> <U0056>
>>>> +% CYRILLIC CAPITAL LETTER GHE
>>>> +<U0413> <U0047>
>>>> +% CYRILLIC CAPITAL LETTER DE
>>>> +<U0414> <U0044>
>>>> +% CYRILLIC CAPITAL LETTER IE
>>>> +<U0415> <U0045>
>>>> +% CYRILLIC CAPITAL LETTER ZHE
>>>> +<U0416> "<U005A><U0048>";<U005A>
>>>> +% CYRILLIC CAPITAL LETTER ZE
>>>> +<U0417> <U005A>
>>>> +% CYRILLIC CAPITAL LETTER I
>>>> +<U0418> <U0049>
>>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>>> +<U0419> <U004A>
>>>> +% CYRILLIC CAPITAL LETTER KA
>>>> +<U041A> <U004B>
>>>> +% CYRILLIC CAPITAL LETTER EL
>>>> +<U041B> <U004C>
>>>> +% CYRILLIC CAPITAL LETTER EM
>>>> +<U041C> <U004D>
>>>> +% CYRILLIC CAPITAL LETTER EN
>>>> +<U041D> <U004E>
>>>> +% CYRILLIC CAPITAL LETTER O
>>>> +<U041E> <U004F>
>>>> +% CYRILLIC CAPITAL LETTER PE
>>>> +<U041F> <U0050>
>>>> +% CYRILLIC CAPITAL LETTER ER
>>>> +<U0420> <U0052>
>>>> +% CYRILLIC CAPITAL LETTER ES
>>>> +<U0421> <U0053>
>>>> +% CYRILLIC CAPITAL LETTER TE
>>>> +<U0422> <U0054>
>>>> +% CYRILLIC CAPITAL LETTER U
>>>> +<U0423> <U0055>
>>>> +% CYRILLIC CAPITAL LETTER EF
>>>> +<U0424> <U0046>
>>>> +% CYRILLIC CAPITAL LETTER HA
>>>> +<U0425> <U0058>
>>>> +% CYRILLIC CAPITAL LETTER TSE
>>>> +<U0426> "<U0043><U005A>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER CHE
>>>> +<U0427> "<U0043><U0048>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER SHA
>>>> +<U0428> "<U0053><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>>> +<U042A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC CAPITAL LETTER YERU
>>>> +<U042B> "<U0059><U0027>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>>> +<U042C> <U0060>
>>>> +% CYRILLIC CAPITAL LETTER E
>>>> +<U042D> "<U0045><U0060>";<U0045>
>>>> +% CYRILLIC CAPITAL LETTER YU
>>>> +<U042E> "<U0059><U0055>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER YA
>>>> +<U042F> "<U0059><U0041>";<U0059>
>>>> +% CYRILLIC SMALL LETTER A
>>>> +<U0430> <U0061>
>>>> +% CYRILLIC SMALL LETTER BE
>>>> +<U0431> <U0062>
>>>> +% CYRILLIC SMALL LETTER VE
>>>> +<U0432> <U0076>
>>>> +% CYRILLIC SMALL LETTER GHE
>>>> +<U0433> <U0067>
>>>> +% CYRILLIC SMALL LETTER DE
>>>> +<U0434> <U0064>
>>>> +% CYRILLIC SMALL LETTER IE
>>>> +<U0435> <U0065>
>>>> +% CYRILLIC SMALL LETTER ZHE
>>>> +<U0436> "<U007A><U0068>";<U007A>
>>>> +% CYRILLIC SMALL LETTER ZE
>>>> +<U0437> <U007A>
>>>> +% CYRILLIC SMALL LETTER I
>>>> +<U0438> <U0069>
>>>> +% CYRILLIC SMALL LETTER SHORT I
>>>> +<U0439> <U006A>
>>>> +% CYRILLIC SMALL LETTER KA
>>>> +<U043A> <U006B>
>>>> +% CYRILLIC SMALL LETTER EL
>>>> +<U043B> <U006C>
>>>> +% CYRILLIC SMALL LETTER EM
>>>> +<U043C> <U006D>
>>>> +% CYRILLIC SMALL LETTER EN
>>>> +<U043D> <U006E>
>>>> +% CYRILLIC SMALL LETTER O
>>>> +<U043E> <U006F>
>>>> +% CYRILLIC SMALL LETTER PE
>>>> +<U043F> <U0070>
>>>> +% CYRILLIC SMALL LETTER ER
>>>> +<U0440> <U0072>
>>>> +% CYRILLIC SMALL LETTER ES
>>>> +<U0441> <U0073>
>>>> +% CYRILLIC SMALL LETTER TE
>>>> +<U0442> <U0074>
>>>> +% CYRILLIC SMALL LETTER U
>>>> +<U0443> <U0075>
>>>> +% CYRILLIC SMALL LETTER EF
>>>> +<U0444> <U0066>
>>>> +% CYRILLIC SMALL LETTER HA
>>>> +<U0445> <U0078>
>>>> +% CYRILLIC SMALL LETTER TSE
>>>> +<U0446> "<U0063><U007A>";<U0063>
>>>> +% CYRILLIC SMALL LETTER CHE
>>>> +<U0447> "<U0063><U0068>";<U0063>
>>>> +% CYRILLIC SMALL LETTER SHA
>>>> +<U0448> "<U0073><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER SHCHA
>>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>>> +<U044A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC SMALL LETTER YERU
>>>> +<U044B> "<U0079><U0027>";<U0079>
>>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>>> +<U044C> <U0060>
>>>> +% CYRILLIC SMALL LETTER E
>>>> +<U044D> "<U0065><U0060>";<U0065>
>>>> +% CYRILLIC SMALL LETTER YU
>>>> +<U044E> "<U0079><U0075>";<U0079>
>>>> +% CYRILLIC SMALL LETTER YA
>>>> +<U044F> "<U0079><U0061>";<U0079>
>>>> +% CYRILLIC SMALL LETTER IO
>>>> +<U0451> "<U0079><U006F>";<U0079>
>>>> +
>>>> +
>>>> +translit_end
>>>> +
>>>> +END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -48,6 +48,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % Farsi yeh -> yeh
>>>> <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -67,6 +67,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % dong sign -> d// -> dd
>>>> <U20AB> "<U0111>";"<U0064><U0064>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>> % Accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> class "hanzi"; /
>>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
>>>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>>
>>>>
>
--
Marko Myllynen
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-03 15:01 ` Egor Kobylkin
2018-10-05 9:20 ` Marko Myllynen
@ 2018-10-05 9:56 ` Rafal Luzynski
2018-10-05 11:54 ` Egor Kobylkin
[not found] ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
1 sibling, 2 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-05 9:56 UTC (permalink / raw)
To: Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> On 03.10.2018 11:19, Keld Simonsen wrote:
> > Hi
> >
> > Please note that transliteration of Cyrillic to Latin is not universal.
> > There are different schemes for, for example, German, English and Danish, and
> > there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
I think it is about me so I must reply. I am sorry about that and the sole
reason is my lack of time. I'm just a volunteer here, that means it's not
my regular job to work on locale data nor anything in glibc nor in any other
open source project. I do these things only in my free time which I don't
have much. Of course you will see my contributions here and there but they
are either trivial or take me months to complete. Your patches are on my
radar but I can't tell any ETA for them. Of course, there are other people
around here and they are all welcome to come and join.
> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.
As Keld wrote, there are probably separate rules for every language so
I don't think you should treat your rules as universal and include them
in every locale. At first sight, it seems to me they work only for English
(as a destination locale). Also, although it is called "transliteration
from Cyrillic" it seems that it covers only Russian alphabet. What about
other languages which use Cyrillic alphabet but add their own diacritic
characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
Mari, Ossetian, Yakut, Tatar, and more. What about languages which use
Cyrillic alphabet but transliterate their respective letters in a different
way than Russian? For example, Russian "Ъ" is (I think) usually skipped
in transliteration, I think you propose "``", but when transliterating from
Bulgarian they usually transliterate this as "ă".
Few remarks:
* I think you transliterate "щ" as "shh", wouldn't "shch" be better?
* You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
in Polish language "cz" is a correct transliteration of "ч".
* You transliterate "й" as "j", this is fine in many languages but wouldn't
"y" be better in English?
* In case of "е": how will you know if it is correct to transliterate it
to "e" or "ie" or "je" or "ye"?
These remarks are obviously incomplete, your patch deserves much more
attention to review.
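To make the remarks above concrete, here is a small Python sketch (not glibc
code, and not how iconv works internally). It applies the first-choice
replacements for the lowercase Russian letters as they appear in the posted
translit_cyrillic table, then repeats the run with the three alternatives
suggested above ("shch", "ts", "y") swapped in. The names PATCH_TABLE,
REVIEW_OVERLAY and transliterate are made up for illustration, and the sample
sentence is assumed to be the lowercase form of the pangram from the test
input.

```python
import unicodedata

# First-choice replacements copied from the posted translit_cyrillic table
# (the single-letter fallbacks after ';' are left out of this sketch).
PATCH_TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "\u0439": "j", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "cz", "ч": "ch",
    "ш": "sh", "щ": "shh", "ъ": "``", "ы": "y'", "ь": "`", "э": "e`",
    "ю": "yu", "я": "ya", "\u0451": "yo",
}

# Alternatives raised in the review remarks; a hypothetical overlay only.
REVIEW_OVERLAY = {"щ": "shch", "ц": "ts", "\u0439": "y"}


def transliterate(text, table):
    """Replace each character found in `table`; pass everything else through."""
    text = unicodedata.normalize("NFC", text)  # keep U+0439/U+0451 precomposed
    return "".join(table.get(ch, ch) for ch in text)


sample = "съешь ещё этих мягких французских булок, да выпей же чаю."
print(transliterate(sample, PATCH_TABLE))
print(transliterate(sample, {**PATCH_TABLE, **REVIEW_OVERLAY}))
```

A per-character lookup like this cannot express the last remark (choosing
between "e", "ie", "je" and "ye" for the same letter), because that choice
depends on context rather than on the letter alone.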
Best regards,
Rafal
* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-05 9:56 ` Rafal Luzynski
@ 2018-10-05 11:54 ` Egor Kobylkin
2018-10-08 22:23 ` Rafal Luzynski
[not found] ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-05 11:54 UTC (permalink / raw)
To: Rafal Luzynski, Keld Simonsen, Marko Myllynen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 6105 bytes --]
removed a png image attachment
Keld, Marko, Rafal, other locale maintainers,
all of this is written with a minimal viable fix for this bug in mind,
as soon as possible. I want to avoid wasting maintainers' time on
fundamental discussions here (even though they arise for perfectly good reasons).
I see three options:
1. those locale maintainers that are fine with using ISO
9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
2. those that want a different table can create their own
variant based on the spreadsheet I have prepared
https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
this patch.
3. those that want to omit a cyrillic transliteration altogether for now
state so and just carry over the bug #2872 from the year 2006.
Does this make sense to you?
Just to be super clear on this: the patch is a stopgap _ASCII_
transliteration table. ASCII being AMERICAN Standard Code for
Information Interchange, that is obviously orthogonal to any
transliteration rule of other countries. As such it is not explicitly
targeting transliteration standards of any country.
The patch reflects the Russian variant of ISO
9:1995/GOST_7.79_System_B because a) ISO 9:1995/GOST_7.79_System_B is
available and can be helpful to a majority of Cyrillic users, and b) I
have access to it, not least because I am proficient in Russian.
It is offered to all the respective locale maintainers as a stopgap
solution. Stopgap in the sense that it is better to have some
transliteration than not to have any at all and carry over the bug from
2006. That it may be a somewhat officially correct transliteration for
ru_RU is a bonus. In that sense I would dub the discussion on the
correctness for other languages "offtopic". Let me know if this is not OK.
You are all correctly pointing out the deficiencies of this approach.
However, I have not yet found a better straightforward approach.
Happy to hear from you on how this could be handled.
There is a danger of being caught in the web of language/country
differences. I propose just pruning the locales that are not comfortable
including this current table. We can address possible solutions in the
second wave of patching.
I am wary of getting into discussions on specific country variants just
because of the sheer complexity of this topic. It is probably better
addressed by the respective maintainers of their locales. I do not see a
"one size fits all" solution as possible in this first wave.
I would like to have this "three options plan of action" vetted first
and then we could go into the specifics. (Like, for instance, which
characters should be included in the table, and in which
transliteration form.)
I am looking forward to your reply,
Egor Kobylkin
P.S. specifically as to how to address languages other than Ru included in
GOST_7.79_System_B: we can take the first option left to right from that
table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
locales/languages but with errors where Ru supersedes their own variants.
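The "first option left to right" rule could be sketched in Python roughly as
follows. This only illustrates the selection logic: the function name
first_option and the row layout are assumptions, and the per-language values
are placeholders that merely mark which column they came from. They are NOT
taken from GOST 7.79-2000.

```python
# Column order named in the P.S.; earlier columns win.
COLUMN_ORDER = ["Ru", "By", "Uk", "Bg", "Mk"]


def first_option(row, order=COLUMN_ORDER):
    """Return (language, value) for the first column that defines a value."""
    for lang in order:
        value = row.get(lang)
        if value:
            return lang, value
    return None


# Placeholder rows: the letters are real, the values are not real mappings.
rows = {
    "\u044a": {"Ru": "<ru>", "Bg": "<bg>"},  # hard sign: the Ru column shadows Bg
    "\u0456": {"By": "<by>", "Uk": "<uk>"},  # no Ru column: By is picked
}

for letter, row in rows.items():
    print(f"U+{ord(letter):04X}", first_option(row))
```

The first row shows the error case mentioned above: a Bulgarian text would
get the Russian variant for the hard sign because Ru comes first.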
On 05.10.2018 11:20, Rafal Luzynski wrote:
> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>> Hi
>>>
>>> Please note that transliteration of Cyrillic to Latin is not universal.
>>> There are different schemes for, for example, German, English and Danish, and
>>> there is also an ISO standard for it.
>>
>> Thanks for your feedback, Keld!
>>
>> Could the locale maintainers that wouldn't like to include this patch
>> explicitly state so here?
>
> I think it is about me so I must reply. I am sorry about that and the sole
> reason is my lack of time. I'm just a volunteer here, that means it's not
> my regular job to work on locale data nor anything in glibc nor in any other
> open source project. I do these things only in my free time which I don't
> have much. Of course you will see my contributions here and there but they
> are either trivial or take me months to complete. Your patches are on my
> radar but I can't tell any ETA for them. Of course, there are other people
> around here and they are all welcome to come and join.
>
>> That is:
>> - In the case that there is a different preferred cyrillic
>> transliteration table for any specific locale their maintainers may want
>> to point me to it so I can supply a separate table/patch.
>> - Or they could state explicitly that for some reason they would like to
>> exclude their locale from the patch for a default cyrillic
>> transliteration altogether.
>
> As Keld wrote, there are probably separate rules for every language so
> I don't think you should treat your rules as universal and include them
> in every locale. At first sight, it seems to me they work only for English
> (as a destination locale). Also, although it is called "transliteration
> from Cyrillic" it seems that it covers only Russian alphabet. What about
> other languages which use Cyrillic alphabet but add their own diacritic
> characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
> Mari, Ossetian, Yakut, Tatar, and more. What about languages which use
> Cyrillic alphabet but transliterate their respective letters in a different
> way than Russian? For example, Russian "Ъ" is (I think) usually skipped
> in transliteration, I think you propose "``", but when transliterating from
> Bulgarian they usually transliterate this as "ă".
>
> Few remarks:
>
> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
> * You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
> in Polish language "cz" is a correct transliteration of "ч".
> * You transliterate "й" as "j", this is fine in many languages but wouldn't
> "y" be better in English?
> * In case of "е": how will you know if it is correct to transliterate it
> to "e" or "ie" or "je" or "ye"?
>
> These remarks are obviously incomplete, your patch deserves much more
> attention to review.
>
> Best regards,
>
> Rafal
>
[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 71090 bytes --]
From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Keld Simonsen <keld@keldix.com>
Cc: libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Fri, 5 Oct 2018 11:43:46 +0300
Message-ID: <66f29205-d7fe-478c-26f9-f3a1d7eb9f25@redhat.com>
Hi Egor,
Thanks for your patience with this one.
On 2018-10-03 12:32, Egor Kobylkin wrote:
> On 03.10.2018 11:19, Keld Simonsen wrote:
>>
>> Please note that transliteration of Cyrillic to Latin is not universal.
>> There are different schemes for, for example, German, English and Danish, and
>> there is also an ISO standard for it.
>
> Thanks for your feedback, Keld!
>
> Could the locale maintainers that wouldn't like to include this patch
> explicitly state so here?
>
> That is:
> - In the case that there is a different preferred cyrillic
> transliteration table for any specific locale their maintainers may want
> to point me to it so I can supply a separate table/patch.
> - Or they could state explicitly that for some reason they would like to
> exclude their locale from the patch for a default cyrillic
> transliteration altogether.
The Wikipedia article https://en.wikipedia.org/wiki/ISO_9 helps to
understand that ISO 9:1995 and GOST 7.79-2000 System A are identical so
perhaps you could mention both ISO 9 and the Wikipedia article in the
commit log. translit_cyrillic includes every transliteration defined in
ISO 9:1995 and GOST 7.79-2000, correct?
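One way to check a coverage question like this mechanically is to parse the
table and list which wanted characters have no entry. The Python sketch below
assumes the line format visible in the posted patch (a <Uxxxx> source, then
one or more ';'-separated alternatives, with '%' starting a comment). The
names parse_translit_line and missing_sources are hypothetical, diff prefixes
such as '+' must be stripped first, and the list of wanted characters has to
come from the standard itself.

```python
import re

CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols):
    """Turn a run of <Uxxxx> symbols into the string it denotes."""
    return "".join(chr(int(h, 16)) for h in CODEPOINT.findall(symbols))


def parse_translit_line(line):
    """Return (source, [alternatives]) or None for comments and other lines."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2 or not parts[0].startswith("<U"):
        return None
    # Assumes ';' only ever separates alternatives, as in the posted table.
    return decode(parts[0]), [decode(alt) for alt in parts[1].split(";")]


def missing_sources(table_text, wanted):
    """List the characters in `wanted` that have no entry in the table text."""
    covered = set()
    for line in table_text.splitlines():
        parsed = parse_translit_line(line)
        if parsed:
            covered.add(parsed[0])
    return [ch for ch in wanted if ch not in covered]


# Two entries copied from the patch, used here as a tiny stand-in table.
EXCERPT = """\
% CYRILLIC SMALL LETTER TSE
<U0446> "<U0063><U007A>";<U0063>
% CYRILLIC SMALL LETTER SOFT SIGN
<U044C> <U0060>
translit_end
"""

# U+0456 is not in the excerpt, so it is reported as missing.
print(missing_sources(EXCERPT, "\u0446\u044c\u0456"))
```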
I think it would be best to leave those locales which already have
Cyrillic transliteration defined as-is (as you've done) unless
there are some issues with them; there's probably a good reason why they
were added in the first place.
For other locales, using ISO 9 instead of not doing transliteration at
all may not be entirely correct but I'd suppose it's better to provide
at least some sort of transliteration (even if not entirely correct)
than sequences of question marks. But as you say, locale maintainers may
know better the case for individual locales.
Regarding the language-specific differences Keld mentioned, the Finnish
Wikipedia article on transliteration gives an example; see the table on the right at
https://fi.wikipedia.org/wiki/Siirtokirjoitus for Russian /
international / Finnish / Swedish / English / French / German / Polish /
phonetic transliteration of a Russian name. (The table also shows that
for correct transliteration ASCII letters are not enough for some
languages.)
Some of the differences and language-specific aspects are probably
impossible to take fully into account within the locale system we have
today. For example, in Finnish (the tables at
http://jkorpela.fi/iso9.html8 and
https://fi.wikipedia.org/wiki/Ven%C3%A4j%C3%A4n_translitterointi might
also be helpful):
1) transliteration of Russian is mostly as per ISO 9 but with national
differences defined in SFS 4900
2) transliteration of Russian and Ukrainian names have some slight
differences according to http://jkorpela.fi/iso9.html8
3) transliteration of a letter depends on its position within a word or
pronunciation of adjacent letters, for example U+0435 becomes U+0065 (e)
except when at the beginning of a word it becomes U+006A U+0065 (je)
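Point 3) is the kind of rule a one-character-to-string table cannot express.
As an illustration only, here is a Python sketch of just the single rule
quoted above (lowercase U+0435 becomes "je" at the start of a word and "e"
elsewhere). The function name translit_ie is made up, uppercase U+0415 and
the pronunciation-dependent cases are deliberately ignored, and other
Cyrillic letters are passed through unchanged.

```python
import re

# U+0435 at a word boundary, i.e. at the start of a word.
WORD_INITIAL_IE = re.compile(r"\b\u0435")


def translit_ie(text):
    """Apply only the position-dependent rule for U+0435."""
    text = WORD_INITIAL_IE.sub("je", text)  # word-initial occurrences first
    return text.replace("\u0435", "e")      # every remaining occurrence


# Two consecutive U+0435: the first is word-initial, the second is not.
print(translit_ie("\u0435\u0435"))
```

A translit table entry for U+0435 has to pick one replacement regardless of
position, which is why compromises are likely for rules of this kind.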
Hopefully we'll hear comments from others as well. Once your patch is
merged, I'll try to come up with the needed locale-specific changes for
fi_FI, some differences referred to in 1) above are straightforward to
implement but for 2) and 3) some compromises probably need to be made,
unfortunately.
Thanks,
>> On Wed, Oct 03, 2018 at 10:26:40AM +0200, Egor Kobylkin wrote:
>>> Ping.
>>>
>>> Absent of feedback I am wondering if anything could be missing in this
>>> patch from the maintainers standpoint. More than two months have passed
>>> since the original submission.
>>>
>>> If I can be of assistance, please do not hesitate to contact me,
>>> Egor Kobylkin
>>>
>>> On 06.08.2018 21:00, Egor Kobylkin wrote:
>>>> Dear locale maintainers,
>>>>
>>>> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>>>>
>>>> add Cyrillic transliteration table translit_cyrillic file
>>>>
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
>>>>
>>>> to localedata/locales/ and include it in all your locales going forward.
>>>>
>>>> Patch included inline below.
>>>>
>>>> This is a re-submission for the consideration for 2.29 on a request from
>>>> Carlos O'Donell https://sourceware.org/ml/libc-alpha/2018-07/msg00506.html
>>>>
>>>> From this patch I have excluded locales that already mention cyrillic or
>>>> have a transliteration table for it:
>>>> az_AZ
>>>> iso14651_t1_common
>>>> ky_KG
>>>> mn_MN
>>>> sr_RS
>>>> tg_TJ
>>>> tk_TM
>>>> tt_RU
>>>> uk_UA
>>>> uz_UZ
>>>> uz_UZ@cyrillic
>>>>
>>>> Their maintainers are requested to make an explicit decision on how and
>>>> whether at all to include this patch.
>>>>
>>>>
>>>>
>>>> Current bug effect:
>>>>
>>>> The glibc wiki explicitly lists this use case as the test example
>>>>
>>>> https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
>>>>
>>>> LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt
>>>>
>>>> currently it fails on Cyrillic texts in most locales including ru_RU [1]
>>>> [8] [9]:
>>>>
>>>> LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
>>>> translit-test-input.txt |grep CYRILLIC
>>>>
>>>> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>>>>
>>>> - It produces a string of question marks and spaces.
>>>>
>>>> This is what it should produce and it does so after the patch applied:
>>>>
>>>> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
>>>> chayu.
>>>>
>>>>
>>>> Root problem and the fix:
>>>>
>>>> The root problem is the missing transliteration table that I am
>>>> supplying here. Furthermore it has to be referenced/included into the
>>>> active locale at the compilation time to be used by iconv.
>>>>
>>>>
>>>>
>>>> COMMIT MESSAGE:
>>>> This translit_cyrillic table enables conversion (e.g. with iconv) from a
>>>> UTF-8 encoded text based on Cyrillic alphabet to a ASCII//TRANSLIT text.
>>>>
>>>> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
>>>> a transliteration has only ASCII codes but still can be read by a native
>>>> speaker. Among other things it is useful for processing the Cyrillic
>>>> texts and filenames by programs or on systems that are not specifically
>>>> prepared to work with Cyrillic, don't have corresponding fonts installed
>>>> or can't handle UTF-8.
>>>>
>>>> The transliteration table itself is attached as a file translit_cyrillic
>>>> [7]. Its content (mapping) is based on GOST 7.79-2000 official source
>>>> (Federal Agency on Technical Regulating and Metrology Of Russian
>>>> Federation [2]). Technically an independent but identical source [3] was
>>>> used and prepared in a spreadsheet [6].
>>>>
>>>> The documentation suggests that the transliteration tables inclusion is
>>>> done by adding *include "translit_cyrillic";""* string into LC_CTYPE
>>>> translit_start section
>>>> http://man7.org/linux/man-pages/man5/locale.5.html [5]
>>>> Practically I have searched for all locales that have a
>>>> translit_start/end stance and generated a patch for them.
>>>>
>>>> The Cyrillic transliteration of e.g. Russian text may have already
>>>> worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
>>>> have their transliteration tables included inline.
>>>> However it would not be the standard Russian Cyrillic transliteration as
>>>> described above.
>>>> I am excluding these locales from this proposed patch. I have written
>>>> directly to locale maintainer emails listed in the files. Volodymyr
>>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>>> ???????????? ?????????? <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>>> exclusion.
>>>>
>>>> Links:
>>>>
>>>> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> [2] GOST 7.79-2000 official source
>>>> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
>>>> available in low quality gif format)
>>>> [3] http://transliteration.ru/gost-7-79-2000/ and
>>>> http://www.yfermer.ru/specifications/285821.html
>>>> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
>>>> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>>>> [5] http://man7.org/linux/man-pages/man5/locale.5.html
>>>> [6] Spreadsheet for generating translit_cyrillic
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> [7] translit_cyrillic https://sourceware.org/bugzilla/attachment.cgi?id=8591
>>>> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
>>>> [9] translit-test-input.txt
>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8618
>>>>
>>>> Best regards,
>>>> Egor Kobylkin
>>>>
>>>> ---
>>>> 2018-07-17 Egor Kobylkin <egor@kobylkin.com>
>>>>
>>>> [BZ #2872]
>>>> * locales/translit_cyrillic: add Russian GOST 7.79-2000 transliteration
>>>> table from Cyrillic to Latin.
>>>> * locales/C: add include "translit_cyrillic";"" to LC_CTYPE translit
>>>> section.
>>>> * locales/aa_DJ: likewise
>>>> * locales/af_ZA: likewise
>>>> * locales/ak_GH: likewise
>>>> * locales/am_ET: likewise
>>>> * locales/ar_EG: likewise
>>>> * locales/be_BY: likewise
>>>> * locales/bem_ZM: likewise
>>>> * locales/ber_DZ: likewise
>>>> * locales/ber_MA: likewise
>>>> * locales/bg_BG: likewise
>>>> * locales/bi_VU: likewise
>>>> * locales/bn_BD: likewise
>>>> * locales/bo_CN: likewise
>>>> * locales/ca_ES: likewise
>>>> * locales/ce_RU: likewise
>>>> * locales/cs_CZ: likewise
>>>> * locales/cv_RU: likewise
>>>> * locales/cy_GB: likewise
>>>> * locales/da_DK: likewise
>>>> * locales/de_DE: likewise
>>>> * locales/dv_MV: likewise
>>>> * locales/dz_BT: likewise
>>>> * locales/el_GR: likewise
>>>> * locales/en_GB: likewise
>>>> * locales/en_NG: likewise
>>>> * locales/en_ZM: likewise
>>>> * locales/es_CU: likewise
>>>> * locales/es_ES: likewise
>>>> * locales/et_EE: likewise
>>>> * locales/fa_IR: likewise
>>>> * locales/ff_SN: likewise
>>>> * locales/fi_FI: likewise
>>>> * locales/fr_FR: likewise
>>>> * locales/ga_IE: likewise
>>>> * locales/gd_GB: likewise
>>>> * locales/gu_IN: likewise
>>>> * locales/gv_GB: likewise
>>>> * locales/he_IL: likewise
>>>> * locales/hi_IN: likewise
>>>> * locales/hif_FJ: likewise
>>>> * locales/hr_HR: likewise
>>>> * locales/ht_HT: likewise
>>>> * locales/hu_HU: likewise
>>>> * locales/hy_AM: likewise
>>>> * locales/id_ID: likewise
>>>> * locales/is_IS: likewise
>>>> * locales/it_IT: likewise
>>>> * locales/ja_JP: likewise
>>>> * locales/kk_KZ: likewise
>>>> * locales/km_KH: likewise
>>>> * locales/kn_IN: likewise
>>>> * locales/ko_KR: likewise
>>>> * locales/ks_IN: likewise
>>>> * locales/kw_GB: likewise
>>>> * locales/lb_LU: likewise
>>>> * locales/lg_UG: likewise
>>>> * locales/lij_IT: likewise
>>>> * locales/ln_CD: likewise
>>>> * locales/lo_LA: likewise
>>>> * locales/lt_LT: likewise
>>>> * locales/lv_LV: likewise
>>>> * locales/mg_MG: likewise
>>>> * locales/mhr_RU: likewise
>>>> * locales/mk_MK: likewise
>>>> * locales/ml_IN: likewise
>>>> * locales/ms_MY: likewise
>>>> * locales/mt_MT: likewise
>>>> * locales/nan_TW@latin: likewise
>>>> * locales/nb_NO: likewise
>>>> * locales/ne_NP: likewise
>>>> * locales/nhn_MX: likewise
>>>> * locales/niu_NU: likewise
>>>> * locales/niu_NZ: likewise
>>>> * locales/nl_NL: likewise
>>>> * locales/nr_ZA: likewise
>>>> * locales/oc_FR: likewise
>>>> * locales/om_KE: likewise
>>>> * locales/or_IN: likewise
>>>> * locales/os_RU: likewise
>>>> * locales/pa_IN: likewise
>>>> * locales/pa_PK: likewise
>>>> * locales/pl_PL: likewise
>>>> * locales/pt_PT: likewise
>>>> * locales/quz_PE: likewise
>>>> * locales/ro_RO: likewise
>>>> * locales/ru_RU: likewise
>>>> * locales/rw_RW: likewise
>>>> * locales/sa_IN: likewise
>>>> * locales/sd_IN: likewise
>>>> * locales/sd_IN@devanagari: likewise
>>>> * locales/sd_PK: likewise
>>>> * locales/se_NO: likewise
>>>> * locales/sgs_LT: likewise
>>>> * locales/si_LK: likewise
>>>> * locales/sk_SK: likewise
>>>> * locales/sl_SI: likewise
>>>> * locales/sm_WS: likewise
>>>> * locales/so_SO: likewise
>>>> * locales/sq_AL: likewise
>>>> * locales/ss_ZA: likewise
>>>> * locales/st_ZA: likewise
>>>> * locales/sv_SE: likewise
>>>> * locales/sw_KE: likewise
>>>> * locales/ta_IN: likewise
>>>> * locales/te_IN: likewise
>>>> * locales/th_TH: likewise
>>>> * locales/ti_ET: likewise
>>>> * locales/tn_ZA: likewise
>>>> * locales/to_TO: likewise
>>>> * locales/tpi_PG: likewise
>>>> * locales/tr_TR: likewise
>>>> * locales/ts_ZA: likewise
>>>> * locales/unm_US: likewise
>>>> * locales/ur_IN: likewise
>>>> * locales/ur_PK: likewise
>>>> * locales/ve_ZA: likewise
>>>> * locales/vi_VN: likewise
>>>> * locales/wa_BE: likewise
>>>> * locales/wo_SN: likewise
>>>> * locales/xh_ZA: likewise
>>>> * locales/yi_US: likewise
>>>> * locales/zh_CN: likewise
>>>> * locales/zu_ZA: likewise
>>>>
>>>>
>>>> diff -uNr a/localedata/locales/C b/localedata/locales/C
>>>> --- a/localedata/locales/C 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/C 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -2292,6 +2292,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
>>>> --- a/localedata/locales/aa_DJ 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/aa_DJ 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
>>>> --- a/localedata/locales/af_ZA 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/af_ZA 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
>>>> --- a/localedata/locales/ak_GH 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ak_GH 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -56,6 +56,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
>>>> --- a/localedata/locales/am_ET 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/am_ET 2018-07-17 17:55:47.000000000 +0000
>>>> @@ -1396,6 +1396,7 @@
>>>> <U137A> <U0060><U0039><U0030>
>>>> <U137B> <U0060><U0031><U0030><U0030>
>>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> %
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
>>>> --- a/localedata/locales/ar_EG 2018-07-17 17:49:12.000000000 +0000
>>>> +++ b/localedata/locales/ar_EG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
>>>> --- a/localedata/locales/be_BY 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/be_BY 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
>>>> --- a/localedata/locales/bem_ZM 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bem_ZM 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
>>>> --- a/localedata/locales/ber_DZ 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_DZ 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -166,6 +166,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
>>>> --- a/localedata/locales/ber_MA 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ber_MA 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -86,6 +86,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
>>>> --- a/localedata/locales/bg_BG 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bg_BG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
>>>> --- a/localedata/locales/bi_VU 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bi_VU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
>>>> --- a/localedata/locales/bn_BD 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bn_BD 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
>>>> --- a/localedata/locales/bo_CN 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/bo_CN 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
>>>> --- a/localedata/locales/ca_ES 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ca_ES 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -72,6 +72,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
>>>> --- a/localedata/locales/ce_RU 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/ce_RU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
>>>> --- a/localedata/locales/cs_CZ 2018-07-17 17:49:13.000000000 +0000
>>>> +++ b/localedata/locales/cs_CZ 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -2311,6 +2311,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
>>>> --- a/localedata/locales/cv_RU 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cv_RU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
>>>> --- a/localedata/locales/cy_GB 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/cy_GB 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
>>>> --- a/localedata/locales/da_DK 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/da_DK 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -167,6 +167,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
>>>> --- a/localedata/locales/de_DE 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/de_DE 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>> % DOUBLE HIGH-REVERSED-9 QUOTATION MARK
>>>> <U201F> <U00AB>;<U0022>
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
>>>> --- a/localedata/locales/dv_MV 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dv_MV 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -52,6 +52,7 @@
>>>> include "translit_combining";""
>>>>
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
>>>> --- a/localedata/locales/dz_BT 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/dz_BT 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
>>>> --- a/localedata/locales/el_GR 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/el_GR 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
>>>> --- a/localedata/locales/en_GB 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_GB 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
>>>> --- a/localedata/locales/en_NG 2018-07-17 17:49:14.000000000 +0000
>>>> +++ b/localedata/locales/en_NG 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -50,6 +50,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
>>>> --- a/localedata/locales/en_ZM 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/en_ZM 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
>>>> --- a/localedata/locales/es_CU 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_CU 2018-07-17 17:55:48.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
>>>> --- a/localedata/locales/es_ES 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/es_ES 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
>>>> --- a/localedata/locales/et_EE 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/et_EE 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -109,6 +109,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
>>>> --- a/localedata/locales/fa_IR 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fa_IR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -79,6 +79,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
>>>> --- a/localedata/locales/ff_SN 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/ff_SN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -42,6 +42,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
>>>> --- a/localedata/locales/fi_FI 2018-07-17 17:49:15.000000000 +0000
>>>> +++ b/localedata/locales/fi_FI 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -137,6 +137,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
>>>> --- a/localedata/locales/fr_FR 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/fr_FR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> % In France, accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
>>>> --- a/localedata/locales/ga_IE 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ga_IE 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -54,6 +54,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
>>>> --- a/localedata/locales/gd_GB 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gd_GB 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
>>>> --- a/localedata/locales/gu_IN 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gu_IN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
>>>> --- a/localedata/locales/gv_GB 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/gv_GB 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
>>>> --- a/localedata/locales/he_IL 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/he_IL 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
>>>> --- a/localedata/locales/hi_IN 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hi_IN 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -61,6 +61,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
>>>> --- a/localedata/locales/hif_FJ 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hif_FJ 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
>>>> --- a/localedata/locales/hr_HR 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hr_HR 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -153,6 +153,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
>>>> --- a/localedata/locales/ht_HT 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/ht_HT 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
>>>> --- a/localedata/locales/hu_HU 2018-07-17 17:49:16.000000000 +0000
>>>> +++ b/localedata/locales/hu_HU 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -478,6 +478,7 @@
>>>> <U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
>>>> <U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
>>>> --- a/localedata/locales/hy_AM 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/hy_AM 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
>>>> --- a/localedata/locales/id_ID 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/id_ID 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
>>>> --- a/localedata/locales/is_IS 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/is_IS 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -2161,6 +2161,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
>>>> --- a/localedata/locales/it_IT 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/it_IT 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
>>>> --- a/localedata/locales/ja_JP 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ja_JP 2018-07-17 17:55:49.000000000 +0000
>>>> @@ -1682,6 +1682,7 @@
>>>> include "translit_combining";""
>>>> include "translit_cjk_variants";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
>>>> --- a/localedata/locales/kk_KZ 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kk_KZ 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -158,6 +158,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
>>>> --- a/localedata/locales/km_KH 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/km_KH 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -873,6 +873,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
>>>> --- a/localedata/locales/kn_IN 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kn_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
>>>> --- a/localedata/locales/ko_KR 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ko_KR 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -6099,6 +6099,7 @@
>>>> include "translit_combining";""
>>>> include "translit_hangul";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
>>>> --- a/localedata/locales/ks_IN 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/ks_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
>>>> --- a/localedata/locales/kw_GB 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/kw_GB 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
>>>> --- a/localedata/locales/lb_LU 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lb_LU 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -78,6 +78,7 @@
>>>> % LATIN SMALL LETTER E WITH CIRCUMFLEX
>>>> <U00EA> "<U0065><U005E>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
>>>> --- a/localedata/locales/lg_UG 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lg_UG 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
>>>> --- a/localedata/locales/lij_IT 2018-07-17 17:49:17.000000000 +0000
>>>> +++ b/localedata/locales/lij_IT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
>>>> --- a/localedata/locales/ln_CD 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ln_CD 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
>>>> --- a/localedata/locales/lo_LA 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lo_LA 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -51,6 +51,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
>>>> --- a/localedata/locales/lt_LT 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lt_LT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -77,6 +77,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
>>>> --- a/localedata/locales/lv_LV 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/lv_LV 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -2122,6 +2122,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
>>>> --- a/localedata/locales/mg_MG 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mg_MG 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>> % Accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
>>>> --- a/localedata/locales/mhr_RU 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mhr_RU 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
>>>> --- a/localedata/locales/mk_MK 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mk_MK 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -49,6 +49,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
>>>> --- a/localedata/locales/ml_IN 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ml_IN 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>> %
>>>> diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
>>>> --- a/localedata/locales/ms_MY 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ms_MY 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
>>>> --- a/localedata/locales/mt_MT 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/mt_MT 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -47,6 +47,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
>>>> --- a/localedata/locales/nan_TW@latin 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nan_TW@latin 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -53,6 +53,7 @@
>>>> % accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
>>>> --- a/localedata/locales/nb_NO 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nb_NO 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -154,6 +154,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
>>>> --- a/localedata/locales/ne_NP 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/ne_NP 2018-07-17 17:55:50.000000000 +0000
>>>> @@ -43,6 +43,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
>>>> --- a/localedata/locales/nhn_MX 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nhn_MX 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
>>>> --- a/localedata/locales/niu_NU 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
>>>> --- a/localedata/locales/niu_NZ 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/niu_NZ 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
>>>> --- a/localedata/locales/nl_NL 2018-07-17 17:49:18.000000000 +0000
>>>> +++ b/localedata/locales/nl_NL 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
>>>> --- a/localedata/locales/nr_ZA 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/nr_ZA 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
>>>> --- a/localedata/locales/oc_FR 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/oc_FR 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
>>>> --- a/localedata/locales/om_KE 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/om_KE 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -140,6 +140,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
>>>> --- a/localedata/locales/or_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/or_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -62,6 +62,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
>>>> --- a/localedata/locales/os_RU 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/os_RU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
>>>> --- a/localedata/locales/pa_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -60,6 +60,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
>>>> --- a/localedata/locales/pa_PK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pa_PK 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % Farsi yeh -> yeh
>>>> <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
>>>> --- a/localedata/locales/pl_PL 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pl_PL 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -142,6 +142,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
>>>> --- a/localedata/locales/pt_PT 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/pt_PT 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
>>>> --- a/localedata/locales/quz_PE 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/quz_PE 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -57,6 +57,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
>>>> --- a/localedata/locales/ro_RO 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ro_RO 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -144,6 +144,7 @@
>>>> <U0162> "<U021A>";"<U0054>"
>>>> <U0163> "<U021B>";"<U0074>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
>>>> --- a/localedata/locales/ru_RU 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/ru_RU 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -74,6 +74,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
>>>> --- a/localedata/locales/rw_RW 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/rw_RW 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
>>>> --- a/localedata/locales/sa_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sa_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
>>>> --- a/localedata/locales/sd_IN 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
>>>> --- a/localedata/locales/sd_IN@devanagari 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_IN@devanagari 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
>>>> --- a/localedata/locales/sd_PK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sd_PK 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
>>>> --- a/localedata/locales/se_NO 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/se_NO 2018-07-17 17:55:51.000000000 +0000
>>>> @@ -205,6 +205,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
>>>> --- a/localedata/locales/sgs_LT 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sgs_LT 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -59,6 +59,7 @@
>>>> copy "i18n"
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
>>>> --- a/localedata/locales/si_LK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/si_LK 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
>>>> --- a/localedata/locales/sk_SK 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sk_SK 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
>>>> --- a/localedata/locales/sl_SI 2018-07-17 17:49:19.000000000 +0000
>>>> +++ b/localedata/locales/sl_SI 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -91,6 +91,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
>>>> --- a/localedata/locales/sm_WS 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sm_WS 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
>>>> --- a/localedata/locales/so_SO 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/so_SO 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
>>>> --- a/localedata/locales/sq_AL 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sq_AL 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -45,6 +45,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
>>>> --- a/localedata/locales/ss_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ss_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -68,6 +68,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
>>>> --- a/localedata/locales/st_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/st_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
>>>> --- a/localedata/locales/sv_SE 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sv_SE 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -139,6 +139,7 @@
>>>> % LATIN SMALL LETTER O WITH STROKE -> "oe"
>>>> <U00F8> "<U006F><U0338>";"<U006F><U0065>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
>>>> --- a/localedata/locales/sw_KE 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/sw_KE 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -44,6 +44,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
>>>> --- a/localedata/locales/ta_IN 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ta_IN 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
>>>> --- a/localedata/locales/te_IN 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/te_IN 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -63,6 +63,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
>>>> --- a/localedata/locales/th_TH 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/th_TH 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
>>>> --- a/localedata/locales/ti_ET 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/ti_ET 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -866,6 +866,7 @@
>>>> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
>>>>
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> %
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
>>>> --- a/localedata/locales/tn_ZA 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tn_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
>>>> --- a/localedata/locales/to_TO 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/to_TO 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -36,6 +36,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
>>>> --- a/localedata/locales/tpi_PG 2018-07-17 17:49:20.000000000 +0000
>>>> +++ b/localedata/locales/tpi_PG 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -37,6 +37,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
>>>> --- a/localedata/locales/tr_TR 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/tr_TR 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -2430,6 +2430,7 @@
>>>>
>>>> % TURKISH LIRA SIGN
>>>> <U20BA> "<U0054><U004C>"
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
>>>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
>>>> +++ b/localedata/locales/translit_cyrillic 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -0,0 +1,151 @@
>>>> +escape_char /
>>>> +comment_char %
>>>> +
>>>> +% Transliterations that converts cyrillic letters to ascii symbols inspired by GOST 7.79-2000
>>>> +% https://sourceware.org/bugzilla/show_bug.cgi?id=2872
>>>> +% Generated from UnicodeData.txt with
>>>> +% https://sourceware.org/bugzilla/attachment.cgi?id=8590
>>>> +% Up to three characters are required to do a reversible transliteration.
>>>> +
>>>> +LC_CTYPE
>>>> +
>>>> +translit_start
>>>> +
>>>> +
>>>> +% CYRILLIC CAPITAL LETTER IO
>>>> +<U0401> "<U0059><U004F>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER A
>>>> +<U0410> <U0041>
>>>> +% CYRILLIC CAPITAL LETTER BE
>>>> +<U0411> <U0042>
>>>> +% CYRILLIC CAPITAL LETTER VE
>>>> +<U0412> <U0056>
>>>> +% CYRILLIC CAPITAL LETTER GHE
>>>> +<U0413> <U0047>
>>>> +% CYRILLIC CAPITAL LETTER DE
>>>> +<U0414> <U0044>
>>>> +% CYRILLIC CAPITAL LETTER IE
>>>> +<U0415> <U0045>
>>>> +% CYRILLIC CAPITAL LETTER ZHE
>>>> +<U0416> "<U005A><U0048>";<U005A>
>>>> +% CYRILLIC CAPITAL LETTER ZE
>>>> +<U0417> <U005A>
>>>> +% CYRILLIC CAPITAL LETTER I
>>>> +<U0418> <U0049>
>>>> +% CYRILLIC CAPITAL LETTER SHORT I
>>>> +<U0419> <U004A>
>>>> +% CYRILLIC CAPITAL LETTER KA
>>>> +<U041A> <U004B>
>>>> +% CYRILLIC CAPITAL LETTER EL
>>>> +<U041B> <U004C>
>>>> +% CYRILLIC CAPITAL LETTER EM
>>>> +<U041C> <U004D>
>>>> +% CYRILLIC CAPITAL LETTER EN
>>>> +<U041D> <U004E>
>>>> +% CYRILLIC CAPITAL LETTER O
>>>> +<U041E> <U004F>
>>>> +% CYRILLIC CAPITAL LETTER PE
>>>> +<U041F> <U0050>
>>>> +% CYRILLIC CAPITAL LETTER ER
>>>> +<U0420> <U0052>
>>>> +% CYRILLIC CAPITAL LETTER ES
>>>> +<U0421> <U0053>
>>>> +% CYRILLIC CAPITAL LETTER TE
>>>> +<U0422> <U0054>
>>>> +% CYRILLIC CAPITAL LETTER U
>>>> +<U0423> <U0055>
>>>> +% CYRILLIC CAPITAL LETTER EF
>>>> +<U0424> <U0046>
>>>> +% CYRILLIC CAPITAL LETTER HA
>>>> +<U0425> <U0058>
>>>> +% CYRILLIC CAPITAL LETTER TSE
>>>> +<U0426> "<U0043><U005A>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER CHE
>>>> +<U0427> "<U0043><U0048>";<U0043>
>>>> +% CYRILLIC CAPITAL LETTER SHA
>>>> +<U0428> "<U0053><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER SHCHA
>>>> +<U0429> "<U0053><U0048><U0048>";<U0053>
>>>> +% CYRILLIC CAPITAL LETTER HARD SIGN
>>>> +<U042A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC CAPITAL LETTER YERU
>>>> +<U042B> "<U0059><U0027>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER SOFT SIGN
>>>> +<U042C> <U0060>
>>>> +% CYRILLIC CAPITAL LETTER E
>>>> +<U042D> "<U0045><U0060>";<U0045>
>>>> +% CYRILLIC CAPITAL LETTER YU
>>>> +<U042E> "<U0059><U0055>";<U0059>
>>>> +% CYRILLIC CAPITAL LETTER YA
>>>> +<U042F> "<U0059><U0041>";<U0059>
>>>> +% CYRILLIC SMALL LETTER A
>>>> +<U0430> <U0061>
>>>> +% CYRILLIC SMALL LETTER BE
>>>> +<U0431> <U0062>
>>>> +% CYRILLIC SMALL LETTER VE
>>>> +<U0432> <U0076>
>>>> +% CYRILLIC SMALL LETTER GHE
>>>> +<U0433> <U0067>
>>>> +% CYRILLIC SMALL LETTER DE
>>>> +<U0434> <U0064>
>>>> +% CYRILLIC SMALL LETTER IE
>>>> +<U0435> <U0065>
>>>> +% CYRILLIC SMALL LETTER ZHE
>>>> +<U0436> "<U007A><U0068>";<U007A>
>>>> +% CYRILLIC SMALL LETTER ZE
>>>> +<U0437> <U007A>
>>>> +% CYRILLIC SMALL LETTER I
>>>> +<U0438> <U0069>
>>>> +% CYRILLIC SMALL LETTER SHORT I
>>>> +<U0439> <U006A>
>>>> +% CYRILLIC SMALL LETTER KA
>>>> +<U043A> <U006B>
>>>> +% CYRILLIC SMALL LETTER EL
>>>> +<U043B> <U006C>
>>>> +% CYRILLIC SMALL LETTER EM
>>>> +<U043C> <U006D>
>>>> +% CYRILLIC SMALL LETTER EN
>>>> +<U043D> <U006E>
>>>> +% CYRILLIC SMALL LETTER O
>>>> +<U043E> <U006F>
>>>> +% CYRILLIC SMALL LETTER PE
>>>> +<U043F> <U0070>
>>>> +% CYRILLIC SMALL LETTER ER
>>>> +<U0440> <U0072>
>>>> +% CYRILLIC SMALL LETTER ES
>>>> +<U0441> <U0073>
>>>> +% CYRILLIC SMALL LETTER TE
>>>> +<U0442> <U0074>
>>>> +% CYRILLIC SMALL LETTER U
>>>> +<U0443> <U0075>
>>>> +% CYRILLIC SMALL LETTER EF
>>>> +<U0444> <U0066>
>>>> +% CYRILLIC SMALL LETTER HA
>>>> +<U0445> <U0078>
>>>> +% CYRILLIC SMALL LETTER TSE
>>>> +<U0446> "<U0063><U007A>";<U0063>
>>>> +% CYRILLIC SMALL LETTER CHE
>>>> +<U0447> "<U0063><U0068>";<U0063>
>>>> +% CYRILLIC SMALL LETTER SHA
>>>> +<U0448> "<U0073><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER SHCHA
>>>> +<U0449> "<U0073><U0068><U0068>";<U0073>
>>>> +% CYRILLIC SMALL LETTER HARD SIGN
>>>> +<U044A> "<U0060><U0060>";<U0060>
>>>> +% CYRILLIC SMALL LETTER YERU
>>>> +<U044B> "<U0079><U0027>";<U0079>
>>>> +% CYRILLIC SMALL LETTER SOFT SIGN
>>>> +<U044C> <U0060>
>>>> +% CYRILLIC SMALL LETTER E
>>>> +<U044D> "<U0065><U0060>";<U0065>
>>>> +% CYRILLIC SMALL LETTER YU
>>>> +<U044E> "<U0079><U0075>";<U0079>
>>>> +% CYRILLIC SMALL LETTER YA
>>>> +<U044F> "<U0079><U0061>";<U0079>
>>>> +% CYRILLIC SMALL LETTER IO
>>>> +<U0451> "<U0079><U006F>";<U0079>
>>>> +
>>>> +
>>>> +translit_end
>>>> +
>>>> +END LC_CTYPE
>>>> diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
>>>> --- a/localedata/locales/ts_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ts_ZA 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -64,6 +64,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
>>>> --- a/localedata/locales/unm_US 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/unm_US 2018-07-17 17:55:52.000000000 +0000
>>>> @@ -48,6 +48,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
>>>> --- a/localedata/locales/ur_IN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_IN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -46,6 +46,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
>>>> --- a/localedata/locales/ur_PK 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ur_PK 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % Farsi yeh -> yeh
>>>> <U06CC> "<U064A>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
>>>> --- a/localedata/locales/ve_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/ve_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -67,6 +67,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
>>>> --- a/localedata/locales/vi_VN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/vi_VN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>> % dong sign -> d// -> dd
>>>> <U20AB> "<U0111>";"<U0064><U0064>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
>>>> --- a/localedata/locales/wa_BE 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wa_BE 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -69,6 +69,7 @@
>>>> <U00C5> "<U0041><U030A>";"<U0041>";"<U0041><U0055>"
>>>> <U00E5> "<U0061><U030A>";"<U0061>";"<U0061><U0075>"
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
>>>> --- a/localedata/locales/wo_SN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/wo_SN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -55,6 +55,7 @@
>>>> % Accents are simply omitted if they cannot be represented.
>>>> include "translit_combining";""
>>>>
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
>>>> --- a/localedata/locales/xh_ZA 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/xh_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -66,6 +66,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>> diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
>>>> --- a/localedata/locales/yi_US 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/yi_US 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -73,6 +73,7 @@
>>>> <U05F0> "<U05D5><U05D5>";"<U0077><U0077>"
>>>> <U05F1> "<U05D5><U05D9>";"<U0077><U006A>"
>>>> <U05F2> "<U05D9><U05D9>";"<U006A><U006A>"
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> END LC_CTYPE
>>>> diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
>>>> --- a/localedata/locales/zh_CN 2018-07-17 17:49:21.000000000 +0000
>>>> +++ b/localedata/locales/zh_CN 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -58,6 +58,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>>
>>>> class "hanzi"; /
>>>> diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
>>>> --- a/localedata/locales/zu_ZA 2018-07-17 17:49:22.000000000 +0000
>>>> +++ b/localedata/locales/zu_ZA 2018-07-17 17:55:53.000000000 +0000
>>>> @@ -70,6 +70,7 @@
>>>>
>>>> translit_start
>>>> include "translit_combining";""
>>>> +include "translit_cyrillic";""
>>>> translit_end
>>>> END LC_CTYPE
>>>>
>>>>
>>>>
>
--
Marko Myllynen
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
[not found] ` <deacdf31-d0bb-a92d-1de3-934d6b4cb158@kobylkin.com>
@ 2018-10-05 12:01 ` Marko Myllynen
2018-10-05 12:21 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05 12:01 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
possible and if not, then fall back to GOST 7.79 System B?
Implementation-wise, the current translit_* files have a few examples where a
non-ASCII transliteration is tried first before an ASCII fallback. These
examples are from translit_neutral:
% NARROW NO-BREAK SPACE
<U202F> <U00A0>;<U0020>
% REVERSED TRIPLE PRIME
<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
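To make the entry format above easier to read, here is a minimal Python sketch that decodes such a line into its source character and its ordered list of alternatives. It is an illustration only, not how glibc's localedef parses these files; the names `decode` and `parse_entry` are hypothetical, and the sketch assumes simple entries written purely in `<Uxxxx>` notation (it does not handle `include` lines or comments).

```python
import re

# Matches one <Uxxxx> code point reference as used in the locale sources.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U0053><U0048>' (optionally double-quoted) into 'SH'."""
    return "".join(chr(int(h, 16)) for h in _UCS.findall(token))


def parse_entry(line):
    """Split one translit entry into (source char, [alternatives in order])."""
    source, alternatives = line.split(None, 1)
    # Alternatives are separated by ';' and listed in priority order.
    return decode(source), [decode(alt) for alt in alternatives.split(";")]


if __name__ == "__main__":
    # NARROW NO-BREAK SPACE: first NO-BREAK SPACE, then plain SPACE.
    print(parse_entry("<U202F> <U00A0>;<U0020>"))
    # REVERSED TRIPLE PRIME: three reversed primes, then three backticks.
    print(parse_entry('<U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"'))
```

In both entries the first alternative is non-ASCII and the last one is plain ASCII, which is the ordering being proposed for the Cyrillic table.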
Thanks,
On 2018-10-05 13:29, Egor Kobylkin wrote:
> Keld,Marko,Rafal, other locale maintainers,
>
> this is all written with a minimal viable fix for this bug in mind, as
> soon as possible. I want to avoid wasting maintainers' time getting into
> fundamental discussions here (although for perfectly good reasons).
>
> I see three options:
> 1. those locale maintainers that are fine with using ISO
> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
> in their locales (see attached screenshot of the table).
> 2. those that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
>
> Does this make sense to you?
>
> Just to be super clear on this: the patch is a stopgap _ASCII_
> transliteration table. ASCII being AMERICAN Standard Code for
> Information Interchange, that is obviously orthogonal to any
> transliteration rule of other countries. As such it is not explicitly
> targeting transliteration standards of any country.
>
> The fact that the patch is reflecting Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of cyrillic users b) I have
> access to it including via being proficient in Russian.
>
> It is offered to all the respective locale maintainers as a stopgap
> solution. Stopgap in the sense that it is better to have some
> transliteration than not to have any at all and carry over the bug from
> 2006. That it may be a somewhat officially correct transliteration for
> ru_RU is a bonus. In that sense I would dub the discussion on the
> correctness for other languages "offtopic". Let me know if this is not OK.
>
> You are all correctly mentioning the deficiencies of this approach.
> However, I couldn't find a better straightforward approach as of yet.
> Happy to hear from you on how this could be handled.
>
> There is a danger of being caught in the web of language/country
> differences. I propose just pruning the locales that are not comfortable
> including this current table. We can address possible solutions in the
> second wave of patching.
>
> I am wary of getting into discussions on specific country variants just
> because of the sheer complexity of this topic. It is probably better
> addressed by respective maintainers of their locales. I do not see a
> "one fits all" solution in this first wave possible.
>
> I would like to have this "three options plan of action" vetted first
> and then we could go to the specific detail. (Like, for instance, what
> characters should be included in the table, and in which
> transliteration form.)
>
> I am looking forward to your reply,
> Egor Kobylkin
>
> P.S. specifically as to how to address languages other than Ru included in
> GOST_7.79_System_B: we can take the first option left to right from that
> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
> locales/languages but with errors where Ru supersedes their own variants.
>
>
> On 05.10.2018 11:20, Rafal Luzynski wrote:
>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>> Hi
>>>>
>>>> Please note that transliteration of Cyrillic to Latin is not universal.
>>>> There are different schemes for, for example, German, English and Danish, and
>>>> there is also an ISO standard for it.
>>>
>>> Thanks for your feedback, Keld!
>>>
>>> Could the locale maintainers that wouldn't like to include this patch
>>> explicitly state so here?
>>
>> I think it is about me so I must reply. I am sorry about that and the sole
>> reason is my lack of time. I'm just a volunteer here, that means it's not
>> my regular job to work on locale data nor anything in glibc nor in any other
>> open source project. I do these things only in my free time which I don't
>> have much. Of course you will see my contributions here and there but they
>> are either trivial or take me months to complete. Your patches are on my
>> radar but I can't tell any ETA for them. Of course, there are other people
>> around here and they are all welcome to come and join.
>>
>>> That is:
>>> - In the case that there is a different preferred cyrillic
>>> transliteration table for any specific locale their maintainers may want
>>> to point me to it so I can supply a separate table/patch.
>>> - Or they could state explicitly that for some reason they would like to
>>> exclude their locale from the patch for a default cyrillic
>>> transliteration altogether.
>>
>> As Keld wrote, there are probably separate rules for every language so
>> I don't think you should treat your rules as universal and include them
>> in every locale. At first sight, it seems to me they work only for English
>> (as a destination locale). Also, although it is called "transliteration
>> from Cyrillic" it seems that it covers only Russian alphabet. What about
>> other languages which use Cyrillic alphabet but add their own diacritic
>> characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
>> Mari, Ossetian, Yakut, Tatar, and more. What about languages which use
>> Cyrillic alphabet but transliterate their respective letters in a different
>> way than Russian? For example, Russian "Ъ" is (I think) usually skipped
>> in transliteration, I think you propose "``", but when transliterating from
>> Bulgarian they usually transliterate this as "ă".
>>
>> Few remarks:
>>
>> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
>> * You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
>> in Polish language "cz" is a correct transliteration of "ч".
>> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>> "y" be better in English?
>> * In case of "е": how will you know if it is correct to transliterate it
>> to "e" or "ie" or "je" or "ye"?
>>
>> These remarks are obviously incomplete, your patch deserves much more
>> attention to review.
>>
>> Best regards,
>>
>> Rafal
>>
>
--
Marko Myllynen
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-05 12:01 ` Marko Myllynen
@ 2018-10-05 12:21 ` Egor Kobylkin
2018-10-05 15:55 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-05 12:21 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi Marko,
I have chosen System B because it is ASCII compatible. System A is
not ASCII compatible (diacritics in target).
https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
"GOST 7.79 contains two transliteration tables.
System A
one Cyrillic character to one Latin character, some with diacritics
- identical to ISO 9:1995
System B
one Cyrillic character to one or many Latin characters without
diacritics
"
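As a rough illustration of the System B style (one Cyrillic character to one or more ASCII characters), the following Python sketch applies the first alternative of each entry in the proposed translit_cyrillic table. It is only a model of the table, not of glibc's iconv; the names `SYSTEM_B_LOWER`, `TABLE` and `to_ascii` are hypothetical, and the Russian input phrase is an assumption reconstructed from the expected output quoted in the original submission.

```python
# First-choice mappings for the lowercase letters, copied from the
# proposed translit_cyrillic table (multi-character results included).
SYSTEM_B_LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "x", "ц": "cz", "ч": "ch", "ш": "sh", "щ": "shh",
    "ъ": "``", "ы": "y'", "ь": "`", "э": "e`", "ю": "yu", "я": "ya",
}

# The capital-letter entries in the table are the upper-cased forms of the
# lowercase ones (for example SHA -> "SH"), so they can be derived here.
TABLE = dict(SYSTEM_B_LOWER)
TABLE.update({k.upper(): v.upper() for k, v in SYSTEM_B_LOWER.items()})


def to_ascii(text):
    """Replace each mapped Cyrillic letter; leave every other character as is."""
    return "".join(TABLE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    phrase = "Съешь ещё этих мягких французских булок, да выпей же чаю."
    print(to_ascii(phrase))
    # S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
```

Because every letter maps to a distinct ASCII sequence, the result avoids diacritics entirely, at the cost of using backticks and apostrophes for the hard sign, soft sign, YERU and E.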
Hope this helps,
Egor
On 05.10.2018 13:54, Marko Myllynen wrote:
> Hi,
>
> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
> possible and if not, then fall back to GOST 7.79 System B?
>
> Implementation-wise, the current translit_* files have a few examples where a
> non-ASCII transliteration is tried first before an ASCII fallback. These
> examples are from translit_neutral:
>
> % NARROW NO-BREAK SPACE
> <U202F> <U00A0>;<U0020>
> % REVERSED TRIPLE PRIME
> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>
> Thanks,
>
> On 2018-10-05 13:29, Egor Kobylkin wrote:
>> Keld,Marko,Rafal, other locale maintainers,
>>
>> this is all written with a minimal viable fix for this bug in mind, as
>> soon as possible. I want to avoid wasting maintainers' time getting into
>> fundamental discussions here (although for perfectly good reasons).
>>
>> I see three options:
>> 1. those locale maintainers that are fine with using ISO
>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
>> in their locales (see attached screenshot of the table).
>> 2. those that want to have a differing table can create their own
>> variety based on the spreadsheet I have prepared
>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
>> this patch.
>> 3. those that want to omit a cyrillic transliteration altogether for now
>> state so and just carry over the bug #2872 from the year 2006.
>>
>> Does this make sense to you?
>>
>> Just to be super clear on this: the patch is a stopgap _ASCII_
>> transliteration table. ASCII being AMERICAN Standard Code for
>> Information Interchange, that is obviously orthogonal to any
>> transliteration rule of other countries. As such it is not explicitly
>> targeting transliteration standards of any country.
>>
>> The fact that the patch is reflecting Russian variety of ISO
>> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
>> available and can be helpful to a majority of cyrillic users b) I have
>> access to it including via being proficient in Russian.
>>
>> It is offered to all the respective locale maintainers as a stopgap
>> solution. Stopgap in the sense that it is better to have some
>> transliteration than not to have any at all and carry over the bug from
>> 2006. That it may be a somewhat officially correct transliteration for
>> ru_RU is a bonus. In that sense I would dub the discussion on the
>> correctness for other languages "offtopic". Let me know if this is not OK.
>>
>> You are all correctly mentioning the deficiencies of this approach.
>> However, I couldn't find a better straightforward approach as of yet.
>> Happy to hear from you on how this could be handled.
>>
>> There is a danger of being caught in the web of language/country
>> differences. I propose just pruning the locales that are not comfortable
>> including this current table. We can address possible solutions in the
>> second wave of patching.
>>
>> I am wary of getting into discussions on specific country variants just
>> because of the sheer complexity of this topic. It is probably better
>> addressed by respective maintainers of their locales. I do not see a
>> "one fits all" solution in this first wave possible.
>>
>> I would like to have this "three options plan of action" vetted first
>> and then we could go to the specific detail. (Like, for instance, what
>> characters should be included in the table, and in which
>> transliteration form.)
>>
>> I am looking forward to your reply,
>> Egor Kobylkin
>>
>> P.S. specifically as to how to address languages other than Ru included in
>> GOST_7.79_System_B: we can take the first option left to right from that
>> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
>> locales/languages but with errors where Ru supersedes their own variants.
>>
>>
>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>> Hi
>>>>>
>>>>> Please note that transliteration of Cyrillic to Latin is not universal.
>>>>> There are different schemes for, for example, German, English and Danish, and
>>>>> there is also an ISO standard for it.
>>>>
>>>> Thanks for your feedback, Keld!
>>>>
>>>> Could the locale maintainers that wouldn't like to include this patch
>>>> explicitly state so here?
>>>
>>> I think it is about me so I must reply. I am sorry about that and the sole
>>> reason is my lack of time. I'm just a volunteer here, that means it's not
>>> my regular job to work on locale data nor anything in glibc nor in any other
>>> open source project. I do these things only in my free time which I don't
>>> have much. Of course you will see my contributions here and there but they
>>> are either trivial or take me months to complete. Your patches are on my
>>> radar but I can't tell any ETA for them. Of course, there are other people
>>> around here and they are all welcome to come and join.
>>>
>>>> That is:
>>>> - In the case that there is a different preferred cyrillic
>>>> transliteration table for any specific locale their maintainers may want
>>>> to point me to it so I can supply a separate table/patch.
>>>> - Or they could state explicitly that for some reason they would like to
>>>> exclude their locale from the patch for a default cyrillic
>>>> transliteration altogether.
>>>
>>> As Keld wrote, there are probably separate rules for every language so
>>> I don't think you should treat your rules as universal and include them
>>> in every locale. At first sight, it seems to me they work only for English
>>> (as a destination locale). Also, although it is called "transliteration
>>> from Cyrillic" it seems that it covers only Russian alphabet. What about
>>> other languages which use Cyrillic alphabet but add their own diacritic
>>> characters? Think about Belarusian, Ukrainian, Serbian, Chechen, Chuvash,
>>> Mari, Ossetian, Yakut, Tatar, and more. What about languages which use
>>> Cyrillic alphabet but transliterate their respective letters in a different
>>> way than Russian? For example, Russian "Ъ" is (I think) usually skipped
>>> in transliteration, I think you propose "``", but when transliterating from
>>> Bulgarian they usually transliterate this as "ă".
>>>
>>> Few remarks:
>>>
>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be better?
>>> * You transliterate "ц" as "cz", wouldn't "ts" be better? By the way,
>>> in Polish language "cz" is a correct transliteration of "ч".
>>> * You transliterate "й" as "j", this is fine in many languages but wouldn't
>>> "y" be better in English?
>>> * In case of "е": how will you know if it is correct to transliterate it
>>> to "e" or "ie" or "je" or "ye"?
>>>
>>> These remarks are obviously incomplete, your patch deserves much more
>>> attention to review.
>>>
>>> Best regards,
>>>
>>> Rafal
>>>
>>
>
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-05 12:21 ` Egor Kobylkin
@ 2018-10-05 15:55 ` Marko Myllynen
2018-10-08 10:42 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-05 15:55 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
The scheme I proposed would also be ASCII compatible; consider this example:
% CYRILLIC CAPITAL LETTER SHA
<U0428> "<U0160>";"<U0053><U0068>"
"printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv -f
ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
\\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as per
System B.
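
To make the rule syntax above concrete, here is a minimal Python sketch that
reads one such line into its source character and its ordered list of
alternatives. It is only an illustration of how the entry is laid out, not
how localedef actually parses these files; the names decode_symbols and
parse_rule are made up for this example.

```python
import re

# Matches one <Uxxxx> symbol as used in the translit_* files.
SYMBOL = re.compile(r"<U([0-9A-Fa-f]{4,8})>")

def decode_symbols(token):
    """Turn '<U0053><U0068>' (optionally double-quoted) into 'Sh'."""
    return "".join(chr(int(h, 16)) for h in SYMBOL.findall(token))

def parse_rule(line):
    """Split one rule line into (source, [alternatives in priority order])."""
    source, _, targets = line.strip().partition(" ")
    # Alternatives are separated by ';' and tried from left to right.
    alternatives = [decode_symbols(t) for t in targets.split(";")]
    return decode_symbols(source), alternatives

source, alternatives = parse_rule('<U0428> "<U0160>";"<U0053><U0068>"')
print(source, alternatives)
```

Read this way, the SHA entry has the System A form first and the ASCII-only
System B form second.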
Thanks,
On 2018-10-05 15:00, Egor Kobylkin wrote:
> Hi Marko,
>
> I have chosen System B because it is ASCII compatible. System A is
> not ASCII compatible (diacritics in target).
>
> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
> "GOST 7.79 contains two transliteration tables.
>
> System A
> one Cyrillic character to one Latin character, some with diacritics
> - identical to ISO 9:1995
>
> System B
> one Cyrillic character to one or many Latin characters without
> diacritics
> "
> Hope this helps,
> Egor
>
> On 05.10.2018 13:54, Marko Myllynen wrote:
>> Hi,
>>
>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>> possible and if not, then fall back to GOST 7.79 System B?
>>
>> Implementation-wise current translit_* files have few examples where a
>> non-ASCII transliteration is tried first before an ASCII fallback. These
>> examples are from translit_neutral:
>>
>> % NARROW NO-BREAK SPACE
>> <U202F> <U00A0>;<U0020>
>> % REVERSED TRIPLE PRIME
>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>
>> Thanks,
>>
>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>> Keld,Marko,Rafal, other locale maintainers,
>>>
>>> all of this is written with a minimal viable fix for this bug in mind,
>>> as soon as possible. I want to avoid wasting maintainers' time on
>>> fundamental discussions here (although they arise for perfectly good reasons).
>>>
>>> I see three options:
>>> 1. those locale maintainers that are fine with using ISO
>>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
>>> in their locales (see attached screenshot of the table).
>>> 2. those that want to have a differing table can create their own
>>> variety based on the spreadsheet I have prepared
>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
>>> this patch.
>>> 3. those that want to omit a cyrillic transliteration altogether for now
>>> state so and just carry over the bug #2872 from the year 2006.
>>>
>>> Does this make sense to you?
>>>
>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>> transliteration table. ASCII being AMERICAN Standard Code for
>>> Information Interchange, that is obviously orthogonal to any
>>> transliteration rule of other countries. As such it is not explicitly
>>> targeting transliteration standards of any country.
>>>
>>> The fact that the patch is reflecting Russian variety of ISO
>>> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
>>> available and can be helpful to a majority of cyrillic users b) I have
>>> access to it including via being proficient in Russian.
>>>
>>> It is offered to all the respective locale maintainers as a stopgap
>>> solution. Stopgap in the sense that it is better to have some
>>> transliteration than not to have any at all and carry over the bug from
>>> 2006. That it may be a somewhat officially correct transliteration for
>>> ru_RU is a bonus. In that sense I would dub the discussion on the
>>> correctness for other languages "offtopic". Let me know if this is not OK.
>>>
>>> You are all correctly mentioning the deficiencies of this approach.
>>> However, I couldn't find a better straightforward approach as of yet.
>>> Happy to hear from you on how this could be handled.
>>>
>>> There is a danger of being caught in the web of language/country
>>> differences. I propose just pruning the locales that are not comfortable
>>> including this current table. We can address possible solutions in the
>>> second wave of patching.
>>>
>>> I am wary of getting into discussions on specific country variants just
>>> because of the sheer complexity of this topic. It is probably better
>>> addressed by respective maintainers of their locales. I do not see a
>>> "one fits all" solution in this first wave possible.
>>>
>>> I would like to have this "three options plan of action" vetted first
>>> and then we could go to the specific detail. (Like, for instance, what
>>> characters should be included in the table, and in which
>>> transliteration form.)
>>>
>>> I am looking forward to your reply,
>>> Egor Kobylkin
>>>
>>> P.S. specifically as to how to address languages other than Ru included in
>>> GOST_7.79_System_B: we can take the first option left to right from that
>>> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
>>> locales/languages but with errors where Ru supersedes their own variants.
>>>
>>>
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-05 15:55 ` Marko Myllynen
@ 2018-10-08 10:42 ` Egor Kobylkin
2018-10-08 13:53 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-08 10:42 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
After some kind help from Marko in the offline discussion
I realized the multi/single character approach I originally took was
against the iconv(1) logic anyway. So there is no harm in
dropping it and adopting Marko's suggestion instead. I will do so and
will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
GOST 7.79 System B (for ASCII).
However, this doesn't resolve the issue of the ASCII part being different
for various locales. Again, I am asking the locale maintainers to let
me know if they want to 1) adopt the one I am supplying, 2) write their
own or 3) ignore the patch altogether. Your feedback is appreciated!
This is the relevant part that helped:
> The first part (ISO-8859-15 or ASCII) defines the target encoding for
> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
>
> If the string //TRANSLIT is appended to to-encoding, characters
> being converted are transliterated when needed and possible. This
> means that when a character cannot be represented in the target
> character set, it can be approximated through one or several
> similar looking characters. Characters that are outside of the
> target character set and cannot be transliterated are replaced
> with a question mark (?) in the output.
>
> So in the above examples, iconv(1) encounters the character U+0428,
> which is not part of either target encoding, and since
> //TRANSLIT is specified, iconv(1) tries transliteration according to
> the rules defined above. In the case of ASCII, U+0160 is not part of the
> target encoding, so the next alternative is used.
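
The selection logic described in that explanation can be modelled in a few
lines of Python. This is only a sketch of the behaviour as the man page
describes it, not of the glibc implementation; the RULES table is a
hypothetical one-entry stand-in for the real translit file, and Python's
own codecs stand in for the iconv target encodings.

```python
# Hypothetical rule table: source character -> alternatives in priority
# order, mirroring `<U0428> "<U0160>";"<U0053><U0068>"`.
RULES = {"\u0428": ["\u0160", "Sh"]}

def can_encode(text, encoding):
    """Return True if every character of text exists in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

def translit(text, encoding, rules=RULES):
    """Model //TRANSLIT: keep encodable characters, else use the first
    encodable alternative, else fall back to a question mark."""
    out = []
    for ch in text:
        if can_encode(ch, encoding):
            out.append(ch)
            continue
        for alt in rules.get(ch, []):
            if can_encode(alt, encoding):
                out.append(alt)
                break
        else:
            # No rule, or no alternative fits the target character set.
            out.append("?")
    return "".join(out)

print(translit("\u0428", "iso8859_15"))  # System A form fits ISO-8859-15
print(translit("\u0428", "ascii"))       # falls through to the ASCII form
```

With ISO-8859-15 as the target the first alternative is representable and is
used; with ASCII it is skipped and the second alternative is taken. A
character with no rule at all comes out as a question mark, which is the
behaviour the original bug report shows for Cyrillic text.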
Bests,
Egor Kobylkin
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 10:42 ` Egor Kobylkin
@ 2018-10-08 13:53 ` Marko Myllynen
2018-10-08 22:34 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-08 13:53 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
Thanks for the update. I have a few mostly cosmetic comments below;
hopefully we'll hear from others whether they agree with this direction.
- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespace and spaces after ;
- No duplicates:
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>
should become:
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
- There are a few issues with the definitions:
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"
I wonder whether it would be possible to automate generation of this file so
that issues like the above could be avoided. But perhaps that could be the
next step once this initial patch lands.
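
As a rough idea of what such automation might look like, the Python sketch
below renders entries from a mapping, drops repeated alternatives, and
writes no spaces after ";". The helper names sym and render_entry are
invented for this example, and quoting only multi-character strings is an
assumption about the preferred style, not a rule taken from the existing
translit_* files.

```python
def sym(text):
    """Spell a string as <Uxxxx> symbols, quoting multi-character strings."""
    body = "".join(f"<U{ord(ch):04X}>" for ch in text)
    return f'"{body}"' if len(text) > 1 else body

def render_entry(comment, source, alternatives):
    """Emit one entry with duplicate alternatives dropped and no spaces
    after ';'."""
    # dict.fromkeys keeps the first occurrence of each alternative in order.
    unique = list(dict.fromkeys(alternatives))
    targets = ";".join(sym(a) for a in unique)
    return f"% {comment}\n{sym(source)} {targets}"

# "e" listed twice collapses to a single alternative.
print(render_entry("CYRILLIC SMALL LETTER IE", "\u0435", ["e", "e"]))
print(render_entry("CYRILLIC CAPITAL LETTER SHA", "\u0428", ["\u0160", "Sh"]))
```

Generating the file from one table would make the duplicate and whitespace
issues listed above impossible by construction, though the table itself
would still need review.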
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-05 11:54 ` Egor Kobylkin
@ 2018-10-08 22:23 ` Rafal Luzynski
2018-10-08 23:20 ` Egor Kobylkin
` (2 more replies)
0 siblings, 3 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-08 22:23 UTC (permalink / raw)
To: Egor Kobylkin, Keld Simonsen, Marko Myllynen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> I see three options:
> 1. those locale maintainers that are fine with using ISO
> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru) include it
> in their locales. https://sourceware.org/bugzilla/attachment.cgi?id=11289
> 2. those that want to have a differing table can create their own
> variety based on the spreadsheet I have prepared
> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include it in
> this patch.
> 3. those that want to omit a cyrillic transliteration altogether for now
> state so and just carry over the bug #2872 from the year 2006.
>
> Does this make sense to you?
The problem is that we don't have a separate maintainer for each locale;
we have only 2 maintainers for about 200 locales and we must represent
them all. Sometimes a locale may happen to be our own native locale or
that of someone on this list, or it may be a language which we happen to
speak as a foreign language, or we may have friends who speak it.
Or it may be totally unknown to us and we still must handle it somehow.
I think that these transliteration rules should be included in multiple
locales on an "opt-in" basis rather than "opt-out". I mean, we should not
include them in all locales unless someone explicitly provides different
rules. Instead, I think we should add them (maybe with modifications)
only to those locales where we have a good reason to think they will work.
In particular, I think that those rules will not be helpful at all for
languages which use neither the Latin nor the Cyrillic alphabet.
> [...]
> The fact that the patch is reflecting Russian variety of ISO
> 9:1995/GOST_7.79_System_B is because a) ISO 9:1995/GOST_7.79_System_B is
> available and can be helpful to a majority of cyrillic users b) I have
> access to it including via being proficient in Russian.
I took a look at these standards. At first I doubted they could be
correct for the English language; now I understand they were created for
Russian users. Therefore I think it is correct to include them in the
Russian locale data. Will it be OK if we say that it is only for the
Russian language? Will that be satisfactory for you and/or your users?
> It is offered to all the respective locale maintainers as a stopgap
> solution. Stopgap in the sense that it is better to have some
> transliteration than not to have any at all and carry over the bug from
> 2006. That it may be a somewhat officially correct transliteration for
> ru_RU is a bonus. In that sense I would dub the discussion on the
> correctness for other languages "offtopic". Let me know if this is not OK.
If you are referring to languages other than Russian which also use the
Cyrillic alphabet but need different transliteration rules than Russian
for the same characters, then that is OK with me for now. I am afraid
that the iconv algorithm does not handle such a case. Of course, we should
add this missing feature eventually but I do not volunteer to do it now.
> [...]
> P.S. specifically as to how address languages other than Ru included in
> GOST_7.79_System_B: we can take the first option left to right from that
> table (Ru,By,Uk,Bg,Mk). Then it will technically work for all those
> locales/languages but with errors where Ru supersedes their own variants.
Makes sense, as long as we cannot select the source language now.
But, while we are at it, is there anything that stops us from adding
transliteration rules for additional Cyrillic characters not used in
Russian but used in other languages?
Regards,
Rafal
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 13:53 ` Marko Myllynen
@ 2018-10-08 22:34 ` Rafal Luzynski
2018-10-09 8:40 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-08 22:34 UTC (permalink / raw)
To: Marko Myllynen, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
> Hi,
>
> Thanks for the update. I have a few mostly cosmetic comments below,
> hopefully we'll hear from others whether they agree with this direction.
>
> - Please add the standard glibc locale header (see the existing
> translit_* files for reference)
> - Consider wrapping the header lines at or around column 70-72
> - Consider describing which characters, character ranges, or blocks are
> supported (perhaps also describe why some of those are not included, see
> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
> - Please remove trailing whitespaces and spaces after ;
Thanks for this, Marko. While we are at it, entries like this one in the
ChangeLog and in the commit message need two fixes:
* locales/aa_DJ: likewise
1. The path should be relative to the root directory of the glibc source,
that is: "* localedata/locales/aa_DJ".
2. "likewise" should be "Likewise." (starting with an uppercase letter and
ending with a dot).
> - No duplicates:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>; <U0065>
>
> should become:
>
> % CYRILLIC SMALL LETTER IE
> <U0435> <U0065>
>
> - There are a few issues with the definitions:
>
> % CYRILLIC CAPITAL LETTER U
> <U0423> <U0055>; <U0055>
> % CYRILLIC UNDEFINED
> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>
> % CYRILLIC SMALL LETTER U
> <U0443> <U0075>; <U0075>
> % CYRILLIC UNDEFINED
> <U0443><U0443> <U00FA>; "<U0075><U0060>"
Are the duplicates here because some Cyrillic letters may have multiple
Latin transliterations depending on the context, for example Cyrillic IE
must be transliterated sometimes as "e", sometimes as "ie", sometimes
as "ye" or "je"? Can we provide rules for groups of characters instead?
> I wonder would it be possible to automate generation of this file so
> that issues like the above could be avoided? But perhaps that could be the
> next step once this initial patch lands.
I agree with this.
Regards,
Rafal
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 22:23 ` Rafal Luzynski
@ 2018-10-08 23:20 ` Egor Kobylkin
2018-10-09 21:52 ` Rafal Luzynski
2018-10-08 23:23 ` Zack Weinberg
2018-10-09 16:22 ` Marko Myllynen
2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-08 23:20 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 6674 bytes --]
Hi Rafal,
> But, while at this, is there anything that stops are from adding
> transliteration rules for additional Cyrillic characters not used in
> Russian but used in other languages?
Just to make sure we are not talking at cross purposes: since your last
email on this topic I have, at Marko's suggestion, already implemented
ISO 9 transliteration for all the characters there are. This should
cover most if not all Slavic Cyrillic. You seem to have noticed and
replied to that email of Marko's just as I was writing mine.
Please also check the spreadsheet version I have just uploaded:
https://sourceware.org/bugzilla/attachment.cgi?id=11298
I am currently working through Marko's further suggestions and corrections
to that one and will get back for more discussion once done. I am
reading your suggestions and taking them to heart, be sure of that.
Two professional translators independently pointed out to me the
difference between transliteration and transcription. Transliteration is
normative (letter for letter) and transcription is phonetic: each letter
becomes whatever combination of Latin letters in the target language
sounds like it to a native speaker. While transliteration should be easy
to cover for all those languages via ISO 9, transcription is inherently
language specific. The problem is that we are (mis)using transcription as
transliteration to ASCII because the ASCII character set does not allow
for proper transcription. Another problem is that to be really useful
the ASCII transliteration should work outside of the source locale (i.e.
not only in ru_RU but in en_US, de_DE, en_DE, es_ES etc. or even just the
C locale).
In fact, for myself, I would commit to doing all the work needed to cover
at least C, en_US, ru_RU, de_DE in that order. ru_RU is a "courtesy": I
am not really using it, but I hope more contributors for locales may come
because of it and fix my bugs :-).
> The problem is that we don't have a separate maintainer for each
> locale, we have only 2 maintainers for about 200 locales and we must
> represent them all.
It was not clear to me that the glibc team cannot fall back on
individual locale maintainers to make the decision. But then it may make
the decision making even easier. If you guys have a list of requirements
(maybe implicit until now) could you please shoot them my way? We can
also certainly just keep this thread going and have all issues ironed out.
Anyway, hopefully with ISO 9 as the first column in translit_cyrillic
we now cover the issue of the completeness of the transliteration. What
we need to figure out is the transcription/transliteration to ASCII, the
second column.
Are we sharing the same view on this?
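To make the two-column idea concrete, here is a minimal sketch in Python.
It is not glibc code: the TABLE contents and the function names are made
up for illustration, and only the SHA mapping is taken from the example
quoted elsewhere in this thread. It assumes that //TRANSLIT tries the
candidates of a rule from left to right and uses the first one that fits
the target character set, as described in the iconv(1) excerpt quoted
below.

```python
# Hypothetical two-column table: for each Cyrillic character an ordered
# list of candidates, ISO 9 / System A first, ASCII (System B) second.
TABLE = {
    "\u0428": ["\u0160", "Sh"],  # CYRILLIC CAPITAL LETTER SHA
    "\u0435": ["e"],             # CYRILLIC SMALL LETTER IE
}


def representable(text, encoding):
    """Return True if text can be encoded in the target character set."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, encoding):
    """Replace each unrepresentable character by its first candidate
    that fits the target encoding, or by '?' if none does."""
    out = []
    for ch in text:
        if representable(ch, encoding):
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if representable(candidate, encoding):
                out.append(candidate)
                break
        else:
            # No rule, or no candidate fits the target character set.
            out.append("?")
    return "".join(out)
```

With this table, translit("\u0428", "iso-8859-15") picks the first column
and translit("\u0428", "ascii") falls through to the second one, which is
the behaviour we would want from a single shared file.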
Speaking of decision making: maybe I can get an officially certified
court translator to answer our questions. Would you care to put together
a list of questions you would like answered in order to make a decision
on the table and its inclusion into various locales?
Hope this helps,
Egor
On 09.10.2018 00:04, Rafal Luzynski wrote:
> 5.10.2018 12:36 Egor Kobylkin <egor@kobylkin.com> wrote:
>> [...] I see three options: 1. those locale maintainers that are
>> fine with using ISO 9:1995/GOST_7.79_System_B cyrillic
>> transliteration table (Ru) include it in their locales.
>> https://sourceware.org/bugzilla/attachment.cgi?id=11289 2. those
>> that want to have a differing table can create their own
>> variety based on the spreadsheet I have prepared
>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and include
>> it in this patch. 3. those that want to omit a cyrillic
>> transliteration altogether for now state so and just carry over the
>> bug #2872 from the year 2006.
>>
>> Does this make sense to you?
>
> The problem is that we don't have a separate maintainer for each
> locale, we have only 2 maintainers for about 200 locales and we must
> represent them all. Sometimes a locale may happen to be our own
> native locale or of someone in this list, or it may be a locale which
> we accidentally can speak as a foreign language, or we may have
> friends who can speak it. Or it may be totally unknown and we still
> must somehow handle it.
>
> I think that these transliteration rules should be included in
> multiple locales on "opt-in" basis rather than "opt-out". I mean, we
> should not include them in all locales unless someone explicitly
> provides a different rules. Instead, I think we should add them
> (maybe with modification) only to those locales where we have a good
> reason to think they will work.
>
> Particularly, I think that those rules will not be helpful at all
> for the languages which use neither Latin nor Cyrillic alphabet.
>
>> [...] The fact that the patch is reflecting Russian variety of ISO
>> 9:1995/GOST_7.79_System_B is because a) ISO
>> 9:1995/GOST_7.79_System_B is available and can be helpful to a
>> majority of cyrillic users b) I have access to it including via
>> being proficient in Russian.
>
> I took a look at these standards and as first I doubted they may be
> correct for English language now I understand they are created for
> Russian users. Therefore I think it is pretty correct to include
> them to Russian locale data. Will it be OK if we say that it is only
> for Russian language? Will it be satisfying for you and/or your
> users?
>
>> It is offered to all the respective locale maintainers as a
>> stopgap solution. Stopgap in the sense that it is better to have
>> some transliteration than not to have any at all and carry over the
>> bug from 2006. That it may be a somewhat officially correct
>> transliteration for ru_RU is a bonus. In that sense I would dub the
>> discussion on the correctness for other languages "offtopic". Let
>> me know if this is not OK.
>
> If you refer to other languages than Russian which also use the
> Cyrillic alphabet but need a different transliteration rules than
> Russian for the same characters then it is OK for me now. I am
> afraid that the iconv algorithm does not handle such case. Of
> course, we should add this missing feature eventually but I do not
> volunteer to do it now.
>
>> [...] P.S. specifically as to how address languages other than Ru
>> included in GOST_7.79_System_B: we can take the first option left
>> to right from that table (Ru,By,Uk,Bg,Mk). Then it will technically
>> work for all those locales/languages but with errors where Ru
>> supersedes their own variants.
>
> Makes sense, as long as we cannot select the source language now.
>
> But, while at this, is there anything that stops are from adding
> transliteration rules for additional Cyrillic characters not used in
> Russian but used in other languages?
>
> Regards,
>
> Rafal
>
[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 16713 bytes --]
From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Rafal Luzynski <digitalfreak@lingonborough.com>, Keld Simonsen <keld@keldix.com>
Cc: libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Mon, 8 Oct 2018 15:40:53 +0300
Message-ID: <f51992ad-008b-03a4-8880-4c12edced53b@redhat.com>
Hi,
Thanks for the update. I have a few mostly cosmetic comments below,
hopefully we'll hear from others whether they agree with this direction.
- Please add the standard glibc locale header (see the existing
translit_* files for reference)
- Consider wrapping the header lines at or around column 70-72
- Consider describing which characters, character ranges, or blocks are
supported (perhaps also describe why some of those are not included, see
e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
- Please remove trailing whitespaces and spaces after ;
- No duplicates:
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>; <U0065>
should become:
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
- There are a few issues with the definitions:
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>; <U0055>
% CYRILLIC UNDEFINED
<U0423><U0423> <U00DA>; "<U0055><U0060>"
% CYRILLIC SMALL LETTER U
<U0443> <U0075>; <U0075>
% CYRILLIC UNDEFINED
<U0443><U0443> <U00FA>; "<U0075><U0060>"
I wonder whether it would be possible to automate generation of this file
so that issues like the above could be avoided? But perhaps that could be
the next step once this initial patch lands.
Thanks,
On 2018-10-05 23:47, Egor Kobylkin wrote:
> After some kind help from Marko in the offline discussion
> I realized the multi/single character approach I originally took was
> against the iconv(1) logic anyway. So there is no harm in
> dropping it and adopting Marko's suggestion instead. I will do so and
> will resubmit the patch with ISO 9:1995/GOST 7.79 System A + fallback to
> GOST 7.79 System B (for ASCII).
>
> However this doesn't resolve the issue for ASCII part being different
> for various locales. Again, I am offering the locale maintainers to let
> me know if they want to 1) adopt the one I am supplying, 2) write their
> own or 3) ignore the patch altogether. Your feedback is appreciated!
>
> This is the relevant part that helped:
>> The first part (ISO-8859-15 or ASCII) defines the target encoding for
>> iconv(1). //TRANSLIT is described in the iconv(1) man page as:
>>
>> If the string //TRANSLIT is appended to to-encoding, characters
>> being converted are transliterated when needed and possible. This
>> means that when a character cannot be represented in the target
>> character set, it can be approximated through one or several
>> similar looking characters. Characters that are outside of the
>> target character set and cannot be transliterated are replaced
>> with a question mark (?) in the output.
>>
>> So in the above examples, iconv(1) encounters the character U+0428
>> which is not part of either of the target encoding and since
>> //TRANSLIT is specified, iconv(1) tries transliteration according to
>> the rules defined above, in case of ASCII U+0160 is not part of the
>> target encoding so the next alternative is used.
>
> Bests,
> Egor Kobylkin
>
> On 05.10.2018 14:21, Marko Myllynen wrote:
>> Hi,
>>
>> The scheme I proposed would also be ASCII compatible; consider this
>> example:
>>
>> % CYRILLIC CAPITAL LETTER SHA
>> <U0428> "<U0160>";"<U0053><U0068>"
>>
>> "printf \\u0428\\n | iconv -f UTF-8 -t ISO-8859-15//TRANSLIT | iconv
>> -f ISO-8859-15 -t UTF-8" would produce Š as per System A and "printf
>> \\u0428\\n | iconv -f UTF-8 -t ASCII//TRANSLIT" would produce Sh as
>> per System B.
>>
>> Thanks,
>>
>> On 2018-10-05 15:00, Egor Kobylkin wrote:
>>> Hi Marko,
>>>
>>> I have chosen System B because it is ASCII compatible. System
>>> A is not ASCII compatible (diacritics in target).
>>>
>>> https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
>>>
>>> "GOST 7.79 contains two transliteration tables.
>>>
>>> System A: one Cyrillic character to one Latin character, some with
>>> diacritics - identical to ISO 9:1995
>>>
>>> System B: one Cyrillic character to one or many Latin characters
>>> without diacritics"
>>>
>>> Hope this helps,
>>> Egor
>>>
>>> On 05.10.2018 13:54, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> Would it make sense to first use ISO 9:1995/GOST 7.79 System A if
>>>> possible and if not, then fall back to GOST 7.79 System B?
>>>>
>>>> Implementation-wise current translit_* files have few examples
>>>> where a non-ASCII transliteration is tried first before an ASCII
>>>> fallback. These examples are from translit_neutral:
>>>>
>>>> % NARROW NO-BREAK SPACE
>>>> <U202F> <U00A0>;<U0020>
>>>> % REVERSED TRIPLE PRIME
>>>> <U2037> "<U2035><U2035><U2035>";"<U0060><U0060><U0060>"
>>>>
>>>> Thanks,
>>>>
>>>> On 2018-10-05 13:29, Egor Kobylkin wrote:
>>>>> Keld,Marko,Rafal, other locale maintainers,
>>>>>
>>>>> this all is written with having in mind a minimal viable fix
>>>>> for this bug asap. I want to avoid wasting maintainers time
>>>>> getting into fundamental discussions here (although for
>>>>> perfectly good reasons).
>>>>>
>>>>> I see three options:
>>>>> 1. those locale maintainers that are fine with using ISO
>>>>> 9:1995/GOST_7.79_System_B cyrillic transliteration table (Ru)
>>>>> include it in their locales (see attached screenshot of the
>>>>> table).
>>>>> 2. those that want to have a differing table can create their
>>>>> own variety based on the spreadsheet I have prepared
>>>>> https://sourceware.org/bugzilla/attachment.cgi?id=8590 and
>>>>> include it in this patch.
>>>>> 3. those that want to omit a cyrillic transliteration
>>>>> altogether for now state so and just carry over the bug #2872
>>>>> from the year 2006.
>>>>>
>>>>> Does this make sense to you?
>>>>>
>>>>> Just to be super clear on this: the patch is a stopgap _ASCII_
>>>>> transliteration table. ASCII being AMERICAN Standard Code for
>>>>> Information Interchange, that is obviously orthogonal to any
>>>>> transliteration rule of other countries. As such it is not
>>>>> explicitly targeting transliteration standards of any country.
>>>>>
>>>>> The fact that the patch is reflecting Russian variety of ISO
>>>>> 9:1995/GOST_7.79_System_B is because a) ISO
>>>>> 9:1995/GOST_7.79_System_B is available and can be helpful to a
>>>>> majority of cyrillic users b) I have access to it including
>>>>> via being proficient in Russian.
>>>>>
>>>>> It is offered to all the respective locale maintainers as a
>>>>> stopgap solution. Stopgap in the sense that it is better to
>>>>> have some transliteration than not to have any at all and
>>>>> carry over the bug from 2006. That it may be a somewhat
>>>>> officially correct transliteration for ru_RU is a bonus. In
>>>>> that sense I would dub the discussion on the correctness for
>>>>> other languages "offtopic". Let me know if this is not OK.
>>>>>
>>>>> You are all correctly mentioning the deficiencies of this
>>>>> approach. However, I couldn't find a better straightforward
>>>>> approach as of yet. Happy to hear from you on how this
>>>>> could be handled.
>>>>>
>>>>> There is a danger of being caught in the web of
>>>>> language/country differences. I propose just pruning the
>>>>> locales that are not comfortable including this current table.
>>>>> We can address possible solutions in the second wave of
>>>>> patching.
>>>>>
>>>>> I am wary of getting into discussions on specific country
>>>>> variants just because of the sheer complexity of this topic.
>>>>> It is probably better addressed by respective maintainers of
>>>>> their locales. I do not see a "one fits all" solution in this
>>>>> first wave possible.
>>>>>
>>>>> I would like to have this "three options plan of action"
>>>>> vetted first and then we could go to the specific detail.
>>>>> (Like, for instance, what characters should be included in to
>>>>> the table, and in which transliteration form.)
>>>>>
>>>>> I am looking forward to your reply, Egor Kobylkin
>>>>>
>>>>> P.S. specifically as to how address languages other than Ru
>>>>> included in GOST_7.79_System_B: we can take the first option
>>>>> left to right from that table (Ru,By,Uk,Bg,Mk). Then it will
>>>>> technically work for all those locales/languages but with
>>>>> errors where Ru supersedes their own variants.
>>>>>
>>>>>
>>>>> On 05.10.2018 11:20, Rafal Luzynski wrote:
>>>>>> 3.10.2018 11:32 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>>>
>>>>>>> On 03.10.2018 11:19, Keld Simonsen wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Please note that transliteration of Cyrillic to Latin
>>>>>>>> is not universal. There are different schemes for, for
>>>>>>>> example, German, English and Danish, and there is also an
>>>>>>>> ISO standard for it.
>>>>>>>
>>>>>>> Thanks for your feedback, Keld!
>>>>>>>
>>>>>>> Could the locale maintainers that wouldn't like to include
>>>>>>> this patch explicitly state so here?
>>>>>>
>>>>>> I think it is about me so I must reply. I am sorry about
>>>>>> that and the sole reason is my lack of time. I'm just a
>>>>>> volunteer here, that means it's not my regular job to work
>>>>>> on locale data nor anything in glibc nor in any other open
>>>>>> source project. I do these things only in my free time
>>>>>> which I don't have much. Of course you will see my
>>>>>> contributions here and there but they are either trivial or
>>>>>> take me months to complete. Your patches are on my radar but
>>>>>> I can't tell any ETA for them. Of course, there are other
>>>>>> people around here and they are all welcome to come and
>>>>>> join.
>>>>>>
>>>>>>> That is:
>>>>>>> - In the case that there is a different preferred cyrillic
>>>>>>> transliteration table for any specific locale their
>>>>>>> maintainers may want to point me to it so I can supply a
>>>>>>> separate table/patch.
>>>>>>> - Or they could state explicitly that for some reason they
>>>>>>> would like to exclude their locale from the patch for a
>>>>>>> default cyrillic transliteration altogether.
>>>>>>
>>>>>> As Keld wrote, there are probably separate rules for every
>>>>>> language so I don't think you should treat your rules as
>>>>>> universal and include them in every locale. At first sight,
>>>>>> it seems to me they work only for English (as a destination
>>>>>> locale). Also, although it is called "transliteration from
>>>>>> Cyrillic" it seems that it covers only Russian alphabet. What
>>>>>> about other languages which use Cyrillic alphabet but add
>>>>>> their own diacritic characters? Think about Belarusian,
>>>>>> Ukrainian, Serbian, Chechen, Chuvash, Mari, Ossetian, Yakut,
>>>>>> Tatar, and more. What about languages which use Cyrillic
>>>>>> alphabet but transliterate their respective letters in a
>>>>>> different way than Russian? For example, Russian "Ъ" is (I
>>>>>> think) usually skipped in transliteration, I think you
>>>>>> propose "``", but when transliterating from Bulgarian they
>>>>>> usually transliterate this as "ă".
>>>>>>
>>>>>> Few remarks:
>>>>>>
>>>>>> * I think you transliterate "щ" as "shh", wouldn't "shch" be
>>>>>>   better?
>>>>>> * You transliterate "ц" as "cz", wouldn't "ts" be better? By
>>>>>>   the way, in Polish language "cz" is a correct
>>>>>>   transliteration of "ч".
>>>>>> * You transliterate "й" as "j", this is fine in many languages
>>>>>>   but wouldn't "y" be better in English?
>>>>>> * In case of "е": how will you know if it is correct to
>>>>>>   transliterate it to "e" or "ie" or "je" or "ye"?
>>>>>>
>>>>>> These remarks are obviously incomplete, your patch deserves
>>>>>> much more attention to review.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Rafal
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
--
Marko Myllynen
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 22:23 ` Rafal Luzynski
2018-10-08 23:20 ` Egor Kobylkin
@ 2018-10-08 23:23 ` Zack Weinberg
2018-10-09 16:10 ` Carlos O'Donell
2018-10-09 16:22 ` Marko Myllynen
2 siblings, 1 reply; 111+ messages in thread
From: Zack Weinberg @ 2018-10-08 23:23 UTC (permalink / raw)
To: Rafal Luzynski, GNU C Library
On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
<digitalfreak@lingonborough.com> wrote:
> The problem is that we don't have a separate maintainer for each locale,
> we have only 2 maintainers for about 200 locales and we must represent
> them all. Sometimes a locale may happen to be our own native locale or
> of someone in this list, or it may be a locale which we accidentally can
> speak as a foreign language, or we may have friends who can speak it.
> Or it may be totally unknown and we still must somehow handle it.
I just want to mention that this is also why most of the non-locale
maintainers tend to stay out of threads about locales. We know we're
even less expert on these issues than you are, and I think as a
general rule you should be assuming that the community is OK with what
you're doing unless someone speaks up to object.
zw
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 22:34 ` Rafal Luzynski
@ 2018-10-09 8:40 ` Egor Kobylkin
2018-10-09 14:19 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 8:40 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
On 09.10.2018 00:23, Rafal Luzynski wrote:
> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>> Hi,
>>
>> Thanks for the update. I have a few mostly cosmetic comments below,
>> hopefully we'll hear from others whether they agree with this direction.
>>
Yeah, the earlier we have feedback the more productive we are. I'd be
happy to get as much feedback on this as early as possible. So
everybody concerned, please chime in.
>
>> - No duplicates:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>; <U0065>
>>
>> should become:
>>
>> % CYRILLIC SMALL LETTER IE
>> <U0435> <U0065>
>>
>> - There are a few issues with the definitions:
>>
>> % CYRILLIC CAPITAL LETTER U
>> <U0423> <U0055>; <U0055>
>> % CYRILLIC UNDEFINED
>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>
>> % CYRILLIC SMALL LETTER U
>> <U0443> <U0075>; <U0075>
>> % CYRILLIC UNDEFINED
>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
>
> Are the duplicates here because some Cyrillic letters may have multiple
> Latin transliterations depending on the context, for example Cyrillic IE
> must be transliterated sometimes as "e", sometimes as "ie", sometimes
> as "ye" or "je"? Can we provide rules for groups of characters instead?
No, the duplicates are just an artifact of my line-generating logic. I
have fixed (removed) them. The varying transcription between
languages/locales cannot be handled in one file at all, as far as I
understand.
>
>> I wonder would it be possible to automate generation of this file so
>> that issues like the above could be avoided? But perhaps that could be the
>> next step once this initial patch lands.
I am generating the content part of translit_cyrillic from the
LibreOffice spreadsheet. Not sure if you have had time to view it yet?
https://sourceware.org/bugzilla/attachment.cgi?id=11299
Anyway, I have just fixed the issues identified by Marko above in that
spreadsheet. I will make the changes for the requests below and then
upload the new translit_cyrillic file to bugzilla.
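For illustration only, a generation step of this kind could be sketched in
Python as follows. This is not the actual spreadsheet logic; the function
names are hypothetical, and the quoting convention (bare <Uxxxx> for a
single character, double quotes around multi-character strings) is an
assumption based on the translit_neutral examples quoted in this thread.
The sketch drops repeated candidates and emits no spaces after ";".

```python
def ucode(text):
    """Spell each character of text in the <Uxxxx> notation."""
    return "".join("<U%04X>" % ord(ch) for ch in text)


def fmt(candidate):
    """Quote multi-character candidates, leave single characters bare."""
    code = ucode(candidate)
    return '"%s"' % code if len(candidate) > 1 else code


def rule_line(source, candidates):
    """Build one rule line, keeping only the first occurrence of each
    candidate so that 'e; e' style duplicates cannot be generated."""
    unique = []
    for candidate in candidates:
        if candidate not in unique:
            unique.append(candidate)
    return ucode(source) + " " + ";".join(fmt(c) for c in unique)
```

Called as rule_line("\u0435", ["e", "e"]) this yields a rule with a single
candidate, which is the form Marko asked for.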
>> - Please add the standard glibc locale header (see the existing
>> translit_* files for reference)
>> - Consider wrapping the header lines at or around column 70-72
>> - Consider describing which characters, character ranges, or blocks are
>> supported (perhaps also describe why some of those are not included, see
>> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
>> - Please remove trailing whitespaces and spaces after ;
>
> Thanks for this, Marko. While at this, in the ChangeLog and in the commit
> message these paths:
>
> * locales/aa_DJ: likewise
>
> 1. Should be a relative path starting in the root directory of glibc
> source, that is: "* localedata/locales/aa_DJ".
> 2. Should be "Likewise." (starting with an uppercase and ending with a
> dot).
Will do.
Bests,
Egor
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 8:40 ` Egor Kobylkin
@ 2018-10-09 14:19 ` Egor Kobylkin
2018-10-09 18:56 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 14:19 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 3786 bytes --]
Hi,
I have now implemented all the changes requested for the translit_cyrillic
file but have started hitting what seems like a bug:
- If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic,
the locale compilation fails, i.e. the command
    grep CYRILLIC < $testfile |
    LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
    iconv -f UTF-8 -t ASCII//TRANSLIT
hangs.
- If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic,
everything works; only the transliteration of <U0425> fails, as expected
(? is displayed).
- If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
line, the transliteration of <U0425> works again (the others show as ?).
Would you have any idea in which direction I should look? The new
translit_cyrillic is attached.
(<U0425> is % CYRILLIC CAPITAL LETTER HA)
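To narrow down a hang like this one without blocking the terminal, the
test command can be wrapped in a timeout. The following is a minimal
sketch in Python; the helper name run_with_timeout and its return values
are made up for illustration, and it says nothing about the cause of the
hang itself.

```python
import subprocess


def run_with_timeout(cmd, data, seconds, env=None):
    """Run cmd with data on stdin.

    Returns (status, stdout) where status is 'ok', 'failed' (non-zero
    exit code) or 'hang' (no exit within the given number of seconds).
    """
    try:
        proc = subprocess.run(cmd, input=data, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE, timeout=seconds,
                              env=env)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this.
        return "hang", b""
    return ("ok" if proc.returncode == 0 else "failed"), proc.stdout


# Example call (assumes iconv is on PATH; set LOCPATH and LC_ALL in env
# to point at the freshly compiled locale, as in the command above):
#
#   import os
#   env = dict(os.environ, LOCPATH="...", LC_ALL="ru_RU.UTF-8")
#   run_with_timeout(["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
#                    "\u0425\n".encode("utf-8"), 5, env=env)
```

Running this once per candidate version of translit_cyrillic (for example
with half of the rules removed each time) would show which combination of
lines triggers the 'hang' status.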
Best regards,
Egor
On 09.10.2018 01:35, Egor Kobylkin wrote:
> On 09.10.2018 00:23, Rafal Luzynski wrote:
>> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>>> Hi,
>>>
>>> Thanks for the update. I have a few mostly cosmetic comments below,
>>> hopefully we'll hear from others whether they agree with this direction.
>>>
>
> Yeah, the earlier we have feedback the more productive we are. I'd be
> happy to get much feedback on this as early as possible. So please
> everybody concerned please chime in.
>
>>
>>> - No duplicates:
>>>
>>> % CYRILLIC SMALL LETTER IE
>>> <U0435> <U0065>; <U0065>
>>>
>>> should become:
>>>
>>> % CYRILLIC SMALL LETTER IE
>>> <U0435> <U0065>
>>>
>>> - There are a few issues with the definitions:
>>>
>>> % CYRILLIC CAPITAL LETTER U
>>> <U0423> <U0055>; <U0055>
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>>
>>> % CYRILLIC SMALL LETTER U
>>> <U0443> <U0075>; <U0075>
>>> % CYRILLIC UNDEFINED
>>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
>>
>> Are the duplicates here because some Cyrillic letters may have multiple
>> Latin transliterations depending on the context, for example Cyrillic IE
>> must be transliterated sometimes as "e", sometimes as "ie", sometimes
>> as "ye" or "je"? Can we provide rules for groups of characters instead?
> No, the duplicates are just an artifact of my line-generating logic. I
> have fixed (removed) them. As far as I understand, the varying
> transcription between languages/locales cannot be handled in one file
> at all.
>
>>
>>> I wonder whether it would be possible to automate generation of this
>>> file so that issues like the above could be avoided? But perhaps that
>>> could be the next step once this initial patch lands.
>
> I am generating the content part of translit_cyrillic from a
> LibreOffice spreadsheet. Not sure if you have had time to view it yet?
> https://sourceware.org/bugzilla/attachment.cgi?id=11299
>
> Anyway I have just fixed the issues identified by Marko above in that
> spreadsheet. I will do the changes for the below request and then upload
> the new translit_cyrillic file to the bugzilla.
>
>
>>> - Please add the standard glibc locale header (see the existing
>>> translit_* files for reference)
>>> - Consider wrapping the header lines at or around column 70-72
>>> - Consider describing which characters, character ranges, or blocks are
>>> supported (perhaps also describe why some of those are not included, see
>>> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
>>> - Please remove trailing whitespaces and spaces after ;
>>
>> Thanks for this, Marko. While at this, in the ChangeLog and in the commit
>> message these paths:
>>
>> * locales/aa_DJ: likewise
>>
>> 1. Should be a relative path starting in the root directory of the
>> glibc source, that is: "* localedata/locales/aa_DJ".
>> 2. Should be "Likewise." (starting with an uppercase letter and ending
>> with a dot).
>
> will do.
>
> Bests,
> Egor
>
[-- Attachment #2: translit_cyrillic --]
[-- Type: text/plain, Size: 12688 bytes --]
escape_char /
comment_char %
% This file is part of the GNU C Library and contains locale data.
% The Free Software Foundation does not claim any copyright interest
% in the locale data contained in this file. The foregoing does not
% affect the license of the GNU C Library as a whole. It does not
% exempt you from the conditions of the license if your use would
% otherwise be governed by that license.
% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
% Inspired by ISO 9:1995 / GOST 7.79-2000.
% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
% It implements GOST 7.79 System A (Latin script) as the first
% option and System B (ASCII) as the second option. See
% https://en.wikipedia.org/wiki/ISO_9 for reference.
% System B is extended from GOST 7.79 for Russian using open sources
% of the transliteration mappings and the "h/`" diacritics logic.
% Usage examples:
% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
% | iconv -f ISO-8859-15 -t UTF-8 # System A
% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
% Contributions welcome for the rest of Cyrillic script in Unicode
% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
% Generated from UnicodeData.txt with
% https://sourceware.org/bugzilla/attachment.cgi?id=11300.
LC_CTYPE
translit_start
% CYRILLIC CAPITAL LETTER IO
<U0401> <U00CB>;"<U0059><U004F>"
% CYRILLIC CAPITAL LETTER DJE
<U0402> <U0110>;"<U0044><U004A>"
% CYRILLIC CAPITAL LETTER GJE
<U0403> <U01F4>;"<U0047><U0060>"
% CYRILLIC CAPITAL LETTER UKRAINIAN IE
<U0404> <U00CA>;"<U0059><U0065>"
% CYRILLIC CAPITAL LETTER DZE
<U0405> <U1E90>;"<U005A><U0060>"
% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
<U0406> <U00CC>;<U0049>
% CYRILLIC CAPITAL LETTER YI
<U0407> <U00CF>;"<U0059><U0069>"
% CYRILLIC CAPITAL LETTER JE
<U0408> "<U004A><U030C>";<U004A>
% CYRILLIC CAPITAL LETTER LJE
<U0409> "<U004C><U0302>";"<U004C><U0060>"
% CYRILLIC CAPITAL LETTER NJE
<U040A> "<U004E><U0302>";"<U004E><U0060>"
% CYRILLIC CAPITAL LETTER TSHE
<U040B> <U0106>;"<U0054><U0053><U0048>"
% CYRILLIC CAPITAL LETTER KJE
<U040C> <U1E30>;"<U004B><U0060>"
% CYRILLIC CAPITAL LETTER SHORT U
<U040E> <U016C>;"<U0055><U0060>"
% CYRILLIC CAPITAL LETTER DZHE
<U040F> "<U0044><U0302>";"<U0044><U0068>"
% CYRILLIC CAPITAL LETTER A
<U0410> <U0041>
% CYRILLIC CAPITAL LETTER BE
<U0411> <U0042>
% CYRILLIC CAPITAL LETTER VE
<U0412> <U0056>
% CYRILLIC CAPITAL LETTER GHE
<U0413> <U0047>
% CYRILLIC CAPITAL LETTER DE
<U0414> <U0044>
% CYRILLIC CAPITAL LETTER IE
<U0415> <U0045>
% CYRILLIC CAPITAL LETTER ZHE
<U0416> <U017D>;"<U005A><U0048>"
% CYRILLIC CAPITAL LETTER ZE
<U0417> <U005A>
% CYRILLIC CAPITAL LETTER I
<U0418> <U0049>
% CYRILLIC CAPITAL LETTER SHORT I
<U0419> <U004A>
% CYRILLIC CAPITAL LETTER KA
<U041A> <U004B>
% CYRILLIC CAPITAL LETTER EL
<U041B> <U004C>
% CYRILLIC CAPITAL LETTER EM
<U041C> <U004D>
% CYRILLIC CAPITAL LETTER EN
<U041D> <U004E>
% CYRILLIC CAPITAL LETTER O
<U041E> <U004F>
% CYRILLIC CAPITAL LETTER PE
<U041F> <U0050>
% CYRILLIC CAPITAL LETTER ER
<U0420> <U0052>
% CYRILLIC CAPITAL LETTER ES
<U0421> <U0053>
% CYRILLIC CAPITAL LETTER TE
<U0422> <U0054>
% CYRILLIC CAPITAL LETTER U
<U0423> <U0055>
% CYRILLIC UNDEFINED
"<U0423><U0301>" <U00DA>;"<U0055><U0060>"
% CYRILLIC CAPITAL LETTER EF
<U0424> <U0046>
% CYRILLIC CAPITAL LETTER HA
<U0425> <U0048>;<U0058>
% CYRILLIC CAPITAL LETTER TSE
<U0426> <U0043>;"<U0043><U005A>"
% CYRILLIC CAPITAL LETTER CHE
<U0427> <U010C>;"<U0043><U0048>"
% CYRILLIC CAPITAL LETTER SHA
<U0428> <U0160>;"<U0053><U0048>"
% CYRILLIC CAPITAL LETTER SHCHA
<U0429> <U015C>;"<U0053><U0048><U0048>"
% CYRILLIC CAPITAL LETTER HARD SIGN
<U042A> <U02BA>;"<U0060><U0060>"
% CYRILLIC CAPITAL LETTER YERU
<U042B> <U0059>;"<U0059><U0060>"
% CYRILLIC CAPITAL LETTER SOFT SIGN
<U042C> <U02B9>;<U0060>
% CYRILLIC CAPITAL LETTER E
<U042D> <U00C8>;"<U0045><U0060>"
% CYRILLIC CAPITAL LETTER YU
<U042E> <U00DB>;"<U0059><U0055>"
% CYRILLIC CAPITAL LETTER YA
<U042F> <U00C2>;"<U0059><U0041>"
% CYRILLIC SMALL LETTER A
<U0430> <U0061>
% CYRILLIC SMALL LETTER BE
<U0431> <U0062>
% CYRILLIC SMALL LETTER VE
<U0432> <U0076>
% CYRILLIC SMALL LETTER GHE
<U0433> <U0067>
% CYRILLIC SMALL LETTER DE
<U0434> <U0064>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC SMALL LETTER ZHE
<U0436> <U017E>;"<U007A><U0068>"
% CYRILLIC SMALL LETTER ZE
<U0437> <U007A>
% CYRILLIC SMALL LETTER I
<U0438> <U0069>
% CYRILLIC SMALL LETTER SHORT I
<U0439> <U006A>
% CYRILLIC SMALL LETTER KA
<U043A> <U006B>
% CYRILLIC SMALL LETTER EL
<U043B> <U006C>
% CYRILLIC SMALL LETTER EM
<U043C> <U006D>
% CYRILLIC SMALL LETTER EN
<U043D> <U006E>
% CYRILLIC SMALL LETTER O
<U043E> <U006F>
% CYRILLIC SMALL LETTER PE
<U043F> <U0070>
% CYRILLIC SMALL LETTER ER
<U0440> <U0072>
% CYRILLIC SMALL LETTER ES
<U0441> <U0073>
% CYRILLIC SMALL LETTER TE
<U0442> <U0074>
% CYRILLIC SMALL LETTER U
<U0443> <U0075>
% CYRILLIC UNDEFINED
"<U0443><U0301>" <U00FA>;"<U0075><U0060>"
% CYRILLIC SMALL LETTER EF
<U0444> <U0066>
% CYRILLIC SMALL LETTER HA
<U0445> <U0068>;<U0078>
% CYRILLIC SMALL LETTER TSE
<U0446> <U0063>;"<U0063><U007A>"
% CYRILLIC SMALL LETTER CHE
<U0447> <U010D>;"<U0063><U0068>"
% CYRILLIC SMALL LETTER SHA
<U0448> <U0161>;"<U0073><U0068>"
% CYRILLIC SMALL LETTER SHCHA
<U0449> <U015D>;"<U0073><U0068><U0068>"
% CYRILLIC SMALL LETTER HARD SIGN
<U044A> <U02BA>;"<U0060><U0060>"
% CYRILLIC SMALL LETTER YERU
<U044B> <U0079>;"<U0079><U0060>"
% CYRILLIC SMALL LETTER SOFT SIGN
<U044C> <U02B9>;<U0060>
% CYRILLIC SMALL LETTER E
<U044D> <U00E8>;"<U0065><U0060>"
% CYRILLIC SMALL LETTER YU
<U044E> <U00FB>;"<U0079><U0075>"
% CYRILLIC SMALL LETTER YA
<U044F> <U00E2>;"<U0079><U0061>"
% CYRILLIC SMALL LETTER IO
<U0451> <U00EB>;"<U0079><U006F>"
% CYRILLIC SMALL LETTER DJE
<U0452> <U0111>;"<U0064><U006A>"
% CYRILLIC SMALL LETTER GJE
<U0453> <U01F5>;"<U0067><U0060>"
% CYRILLIC SMALL LETTER UKRAINIAN IE
<U0454> <U00EA>;"<U0079><U0065>"
% CYRILLIC SMALL LETTER DZE
<U0455> <U1E91>;"<U007A><U0060>"
% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
<U0456> <U00EC>;<U0069>
% CYRILLIC SMALL LETTER YI
<U0457> <U00EF>;"<U0079><U0069>"
% CYRILLIC SMALL LETTER JE
<U0458> <U01F0>;<U006A>
% CYRILLIC SMALL LETTER LJE
<U0459> "<U006C><U0302>";"<U006C><U0060>"
% CYRILLIC SMALL LETTER NJE
<U045A> "<U006E><U0302>";"<U006E><U0060>"
% CYRILLIC SMALL LETTER TSHE
<U045B> <U0107>;"<U0074><U0073><U0068>"
% CYRILLIC SMALL LETTER KJE
<U045C> <U1E31>;"<U006B><U0060>"
% CYRILLIC SMALL LETTER SHORT U
<U045E> <U016D>;"<U0075><U0060>"
% CYRILLIC SMALL LETTER DZHE
<U045F> "<U0064><U0302>";"<U0064><U0068>"
% CYRILLIC CAPITAL LETTER BIG YUS
<U046A> <U01CD>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER BIG YUS
<U046B> <U01CE>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER FITA
<U0472> "<U0046><U0300>";"<U0046><U0068>"
% CYRILLIC SMALL LETTER FITA
<U0473> "<U0066><U0300>";"<U0066><U0068>"
% CYRILLIC CAPITAL LETTER IZHITSA
<U0474> <U1EF2>;"<U0059><U0068>"
% CYRILLIC SMALL LETTER IZHITSA
<U0475> <U1EF3>;"<U0079><U0068>"
% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
<U048C> <U011A>;"<U0045><U0060>"
% CYRILLIC SMALL LETTER SEMISOFT SIGN
<U048D> <U011B>;"<U0065><U0060>"
% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
<U0490> "<U0047><U0300>";"<U0047><U0060>"
% CYRILLIC SMALL LETTER GHE WITH UPTURN
<U0491> "<U0067><U0300>";"<U0067><U0060>"
% CYRILLIC CAPITAL LETTER GHE WITH STROKE
<U0492> <U0120>;"<U0047><U0048>"
% CYRILLIC SMALL LETTER GHE WITH STROKE
<U0493> <U0121>;"<U0067><U0068>"
% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
<U0494> <U011E>;"<U0047><U0048>"
% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
<U0495> <U011F>;"<U0067><U0068>"
% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
<U049A> <U0136>;"<U004B><U0060>"
% CYRILLIC SMALL LETTER KA WITH DESCENDER
<U049B> <U0137>;"<U006B><U0060>"
% CYRILLIC CAPITAL LETTER KA WITH STROKE
<U049E> "<U004B><U0304>";"<U004B><U0060>"
% CYRILLIC SMALL LETTER KA WITH STROKE
<U049F> "<U006B><U0304>";"<U006B><U0060>"
% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
<U04A2> <U1E46>;"<U004E><U0060>"
% CYRILLIC SMALL LETTER EN WITH DESCENDER
<U04A3> <U1E47>;"<U006E><U0060>"
% CYRILLIC CAPITAL LIGATURE EN GHE
<U04A4> <U1E44>;"<U004E><U0047>"
% CYRILLIC SMALL LIGATURE EN GHE
<U04A5> <U1E45>;"<U006E><U0067>"
% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
<U04A6> <U1E54>;"<U0050><U0060>"
% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
<U04A7> <U1E55>;"<U0070><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN HA
<U04A8> <U00D2>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN HA
<U04A9> <U00F2>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
<U04AA> <U00C7>;"<U0043><U0060>"
% CYRILLIC SMALL LETTER ES WITH DESCENDER
<U04AB> <U00E7>;"<U0063><U0060>"
% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
<U04AC> <U0162>;"<U0054><U0060>"
% CYRILLIC SMALL LETTER TE WITH DESCENDER
<U04AD> <U0163>;"<U0074><U0060>"
% CYRILLIC CAPITAL LETTER STRAIGHT U
<U04AE> <U00D9>;<U0055>
% CYRILLIC SMALL LETTER STRAIGHT U
<U04AF> <U00F9>;<U0075>
% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
<U04B2> <U1E28>;"<U0048><U0060>"
% CYRILLIC SMALL LETTER HA WITH DESCENDER
<U04B3> <U1E29>;"<U0068><U0060>"
% CYRILLIC CAPITAL LIGATURE TE TSE
<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
% CYRILLIC SMALL LIGATURE TE TSE
<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
% CYRILLIC CAPITAL LETTER SHHA
<U04BA> <U1E24>;"<U0053><U0048><U0060>"
% CYRILLIC SMALL LETTER SHHA
<U04BB> <U1E25>;"<U0073><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN CHE
<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
% CYRILLIC LETTER PALOCHKA
<U04C0> <U2021>;<U0069>
% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH BREVE
<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
<U04CB> <U00C7>;"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER KHAKASSIAN CHE
<U04CC> <U00E7>;"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER A WITH BREVE
<U04D0> <U0102>;"<U0041><U0060>"
% CYRILLIC SMALL LETTER A WITH BREVE
<U04D1> <U0103>;"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
<U04D2> <U00C4>;"<U0041><U0060>"
% CYRILLIC SMALL LETTER A WITH DIAERESIS
<U04D3> <U00E4>;"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER IE WITH BREVE
<U04D6> <U0114>;"<U0045><U0060>"
% CYRILLIC SMALL LETTER IE WITH BREVE
<U04D7> <U0115>;"<U0065><U0060>"
% CYRILLIC CAPITAL LETTER SCHWA
<U04D8> "<U0041><U030B>";"<U0041><U0060>"
% CYRILLIC SMALL LETTER SCHWA
<U04D9> "<U0061><U030B>";"<U0061><U0060>"
% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
<U04DE> "<U005A><U0308>";"<U005A><U0060>"
% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
<U04DF> "<U007A><U0308>";"<U007A><U0060>"
% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
<U04E0> <U0179>;"<U005A><U0060>"
% CYRILLIC SMALL LETTER ABKHASIAN DZE
<U04E1> <U017A>;"<U007A><U0060>"
% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
<U04E4> <U00CE>;"<U0049><U0060>"
% CYRILLIC SMALL LETTER I WITH DIAERESIS
<U04E5> <U00EE>;"<U0069><U0060>"
% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
<U04E6> <U00D6>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER O WITH DIAERESIS
<U04E7> <U00F6>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER BARRED O
<U04E8> <U00D4>;"<U004F><U0060>"
% CYRILLIC SMALL LETTER BARRED O
<U04E9> <U00F4>;"<U006F><U0060>"
% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
<U04F0> <U00DC>;"<U0055><U0060>"
% CYRILLIC SMALL LETTER U WITH DIAERESIS
<U04F1> <U00FC>;"<U0075><U0060>"
% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
<U04F2> <U0170>;"<U0055><U0060>"
% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
<U04F3> <U0171>;"<U0075><U0060>"
% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
<U04F8> <U0178>;"<U0059><U0060>"
% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
<U04F9> <U00FF>;"<U0079><U0060>"
% RIGHT SINGLE QUOTATION MARK
<U2019> <U2035>;<U0027>
translit_end
END LC_CTYPE
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 23:23 ` Zack Weinberg
@ 2018-10-09 16:10 ` Carlos O'Donell
2018-10-09 22:09 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2018-10-09 16:10 UTC (permalink / raw)
To: Zack Weinberg, Rafal Luzynski, GNU C Library
On 10/8/18 7:20 PM, Zack Weinberg wrote:
> On Mon, Oct 8, 2018 at 6:05 PM Rafal Luzynski
> <digitalfreak@lingonborough.com> wrote:
>> The problem is that we don't have a separate maintainer for each locale,
>> we have only 2 maintainers for about 200 locales and we must represent
>> them all. Sometimes a locale may happen to be our own native locale or
>> of someone in this list, or it may be a locale which we accidentally can
>> speak as a foreign language, or we may have friends who can speak it.
>> Or it may be totally unknown and we still must somehow handle it.
>
> I just want to mention that this is also why most of the non-locale
> maintainers tend to stay out of threads about locales. We know we're
> even less expert on these issues than you are, and I think as a
> general rule you should be assuming that the community is OK with what
> you're doing unless someone speaks up to object.
I agree with Zack here.
Rafal and Mike are localedata subsystem maintainers, and your best efforts
are the best we have right now in the community.
I also agree that a conservative position is always a good place to start,
but it sounds like Egor has added enough coverage to perhaps make all of
these transliterations opt-in by default.
I don't have a good sense of this though, and so I defer to you as the
subsystem maintainer to review and formulate a position. If you have any
specific questions, I can certainly help review.
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 22:23 ` Rafal Luzynski
2018-10-08 23:20 ` Egor Kobylkin
2018-10-08 23:23 ` Zack Weinberg
@ 2018-10-09 16:22 ` Marko Myllynen
2018-10-09 16:49 ` Egor Kobylkin
` (2 more replies)
2 siblings, 3 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-09 16:22 UTC (permalink / raw)
To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
On 2018-10-09 01:04, Rafal Luzynski wrote:
>
> Particularly, I think that those rules will not be helpful at all for
> the languages which use neither Latin nor Cyrillic alphabet.
This is certainly a very good point.
> If you refer to languages other than Russian which also use the Cyrillic
> alphabet but need different transliteration rules than Russian for
> the same characters, then it is OK for me now. I am afraid that the iconv
> algorithm does not handle such a case. Of course, we should add this missing
> feature eventually but I do not volunteer to do it now.
Yes, this would be needed for correct transliteration of different
languages, and this might be quite a bit of work. There's also the case
of transliteration and character sets, consider the transliteration
examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
Russian: Борис Николаевич Ельцин
Int'l: Boris Nikolaevič Elʹcin
Finnish: Boris Nikolajevitš Jeltsin
French: Boris Nikolaïevitch Ieltsine
Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
For French you'll get the correct transliteration with iconv by using -t
ISO-8859-1//TRANSLIT, and for Finnish with -t ISO-8859-15//TRANSLIT, but
it's not so obvious how to get the above kind of transliteration for
ISO 9 international or especially for the phonetic case.
One thing that might be helpful here could be something like:
$ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
ž
That is, force transliteration of each character (if defined) even if
it's part of the target character set. AFAICS this is not currently
possible.
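
To make the idea concrete, here is a minimal sketch in Python of what
such a "forced" mode could mean. It only illustrates the proposal and is
not how glibc iconv is implemented: TRANSLIT_FORCE does not exist today,
and RULES, encodable and translit are hypothetical names. The two rules
are taken from the attached translit_cyrillic.

```python
# Illustration only: a toy model of rule-based transliteration.
# Each source character maps to an ordered list of candidates.
RULES = {
    "\u0436": ["\u017e", "zh"],  # CYRILLIC SMALL LETTER ZHE
    "\u0448": ["\u0161", "sh"],  # CYRILLIC SMALL LETTER SHA
}


def encodable(text, charset):
    """True if text can be represented in the target character set."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset, force=False):
    out = []
    for ch in text:
        # Normal mode: characters the target can represent pass through.
        if not force and encodable(ch, charset):
            out.append(ch)
            continue
        # Otherwise take the first candidate the target can represent.
        for cand in RULES.get(ch, []):
            if encodable(cand, charset):
                out.append(cand)
                break
        else:
            # No usable rule: keep the character if possible, else "?".
            out.append(ch if encodable(ch, charset) else "?")
    return "".join(out)
```

With these definitions, translit("\u0436", "utf-8") leaves the letter
untouched, translit("\u0436", "utf-8", force=True) yields the first
candidate, and translit("\u0436", "ascii") falls back to "zh".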
> But, while at it, is there anything that stops us from adding transliteration
> rules for additional Cyrillic characters not used in Russian but used in
> other languages?
This would probably make sense.
FWIW, for Finnish the diff for Russian to be applied in the locale on
top of translit_cyrillic (ISO 9) rules would be something like below, I
still need to check whether there are rules needed for other languages
than Russian that could be added (I hope to submit a proper patch
against fi_FI shortly after translit_cyrillic has landed):
<U0446> "<U0074><U0073>"
<U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
<U0448> "<U0161>";"<U0073><U0068>"
<U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
<U044A> ""
<U044C> ""
<U044D> "<U0065>"
<U044E> "<U006A><U0075>"
<U044F> "<U006A><U0061>"
<U0451> "<U006A><U006F>"
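
The layering described above can be sketched roughly as follows, in
Python purely for illustration. This is an assumption about the intended
effect (locale-specific rules taking precedence over the shared table),
not a description of how localedef or iconv resolve it; BASE,
FI_OVERRIDES, layered and pick are hypothetical names, and the entries
are excerpts from the attached translit_cyrillic and from the rules
listed above.

```python
# Illustration only: locale-specific rules layered over a shared table.

# Excerpt of the shared table (System A first, System B second).
BASE = {
    "\u0446": ["c", "cz"],       # CYRILLIC SMALL LETTER TSE
    "\u0447": ["\u010d", "ch"],  # CYRILLIC SMALL LETTER CHE
    "\u044c": ["\u02b9", "`"],   # CYRILLIC SMALL LETTER SOFT SIGN
    "\u044f": ["\u00e2", "ya"],  # CYRILLIC SMALL LETTER YA
}

# Excerpt of the Finnish rules listed above.
FI_OVERRIDES = {
    "\u0446": ["ts"],
    "\u0447": ["t\u0161", "tsh"],
    "\u044c": [""],
    "\u044f": ["ja"],
}


def layered(base, overrides):
    """Return a new table in which overrides replace base entries."""
    table = dict(base)
    table.update(overrides)
    return table


def pick(candidates, charset):
    """Return the first candidate representable in charset, else "?"."""
    for cand in candidates:
        try:
            cand.encode(charset)
        except UnicodeEncodeError:
            continue
        return cand
    return "?"


fi = layered(BASE, FI_OVERRIDES)
```

For example, pick(fi["\u0447"], "iso8859-15") selects the first Finnish
candidate, while pick(fi["\u0447"], "ascii") falls through to "tsh".
With BASE alone, the same letter in ISO-8859-15 falls through to "ch",
because that character set has no LATIN SMALL LETTER C WITH CARON.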
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 16:22 ` Marko Myllynen
@ 2018-10-09 16:49 ` Egor Kobylkin
2018-10-09 17:04 ` Marko Myllynen
2018-10-09 22:18 ` Rafal Luzynski
2018-10-11 11:05 ` Marko Myllynen
2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 16:49 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
In the hope of being helpful: what you describe below from
https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
not transliteration.
Transliteration is what we have done with ISO 9 or GOST 7.79 System A,
and it could indeed be the same for all languages.
Transcription can be phonetic or serve other purposes, and it depends on
the target language or use case. We have used GOST 7.79 System B.
Egor
On 09.10.2018 18:10, Marko Myllynen wrote:
> Hi,
>
> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>
>> Particularly, I think that those rules will not be helpful at all for
>> the languages which use neither Latin nor Cyrillic alphabet.
>
> This is certainly a very good point.
>
>> If you refer to languages other than Russian which also use the Cyrillic
>> alphabet but need different transliteration rules than Russian for
>> the same characters, then it is OK for me now. I am afraid that the iconv
>> algorithm does not handle such a case. Of course, we should add this missing
>> feature eventually but I do not volunteer to do it now.
>
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
>
> For French you'll get the correct transliteration with iconv by using -t
> ISO-8859-1//TRANSLIT, and for Finnish with -t ISO-8859-15//TRANSLIT, but
> it's not so obvious how to get the above kind of transliteration for
> ISO 9 international or especially for the phonetic case.
>
> One thing that might be helpful here could be something like:
>
> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
> ž
>
> That is, force transliteration of each character (if defined) even if
> it's part of the target character set. AFAICS this is not currently
> possible.
>
>> But, while at it, is there anything that stops us from adding transliteration
>> rules for additional Cyrillic characters not used in Russian but used in
>> other languages?
>
> This would probably make sense.
>
> FWIW, for Finnish the diff for Russian to be applied in the locale on
> top of translit_cyrillic (ISO 9) rules would be something like below, I
> still need to check whether there are rules needed for other languages
> than Russian that could be added (I hope to submit a proper patch
> against fi_FI shortly after translit_cyrillic has landed):
>
> <U0446> "<U0074><U0073>"
> <U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
> <U0448> "<U0161>";"<U0073><U0068>"
> <U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
> <U044A> ""
> <U044C> ""
> <U044D> "<U0065>"
> <U044E> "<U006A><U0075>"
> <U044F> "<U006A><U0061>"
> <U0451> "<U006A><U006F>"
>
> Thanks,
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 16:49 ` Egor Kobylkin
@ 2018-10-09 17:04 ` Marko Myllynen
0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-09 17:04 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
To clarify, the page has a section explaining the differences between
transliteration and transcription and how the terminology is not
entirely unambiguous. It also explains that the national standard SFS
4900 overrides ISO 9, thus ISO 9 can't be used as-is in Finnish context.
Thanks,
On 2018-10-09 19:22, Egor Kobylkin wrote:
> In the hope of being helpful: what you describe below from
> https://fi.wikipedia.org/wiki/Siirtokirjoitus is called _transcription_,
> not transliteration.
>
> Transliteration is what we have done with ISO 9 or GOST 7.79 System A,
> and it could indeed be the same for all languages.
>
> Transcription can be phonetic or serve other purposes, and it depends on
> the target language or use case. We have used GOST 7.79 System B.
>
> Egor
>
> On 09.10.2018 18:10, Marko Myllynen wrote:
>> Hi,
>>
>> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>>
>>> Particularly, I think that those rules will not be helpful at all for
>>> the languages which use neither Latin nor Cyrillic alphabet.
>>
>> This is certainly a very good point.
>>
>>> If you refer to languages other than Russian which also use the Cyrillic
>>> alphabet but need different transliteration rules than Russian for
>>> the same characters, then it is OK for me now. I am afraid that the iconv
>>> algorithm does not handle such a case. Of course, we should add this missing
>>> feature eventually but I do not volunteer to do it now.
>>
>> Yes, this would be needed for correct transliteration of different
>> languages, and this might be quite a bit of work. There's also the case
>> of transliteration and character sets, consider the transliteration
>> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>>
>> Russian: Борис Николаевич Ельцин
>> Int'l: Boris Nikolaevič Elʹcin
>> Finnish: Boris Nikolajevitš Jeltsin
>> French: Boris Nikolaïevitch Ieltsine
>> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
>>
>> For French you'll get the correct transliteration with iconv by using -t
>> ISO-8859-1//TRANSLIT, and for Finnish with -t ISO-8859-15//TRANSLIT, but
>> it's not so obvious how to get the above kind of transliteration for
>> ISO 9 international or especially for the phonetic case.
>>
>> One thing that might be helpful here could be something like:
>>
>> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
>> ž
>>
>> That is, force transliteration of each character (if defined) even if
>> it's part of the target character set. AFAICS this is not currently
>> possible.
>>
>>> But, while at it, is there anything that stops us from adding transliteration
>>> rules for additional Cyrillic characters not used in Russian but used in
>>> other languages?
>>
>> This would probably make sense.
>>
>> FWIW, for Finnish the diff for Russian to be applied in the locale on
>> top of translit_cyrillic (ISO 9) rules would be something like below, I
>> still need to check whether there are rules needed for other languages
>> than Russian that could be added (I hope to submit a proper patch
>> against fi_FI shortly after translit_cyrillic has landed):
>>
>> <U0446> "<U0074><U0073>"
>> <U0447> "<U0074><U0161>";"<U0074><U0073><U0068>"
>> <U0448> "<U0161>";"<U0073><U0068>"
>> <U0449> "<U0161><U0074><U0161>";"<U0073><U0068><U0074><U0073><U0068>"
>> <U044A> ""
>> <U044C> ""
>> <U044D> "<U0065>"
>> <U044E> "<U006A><U0075>"
>> <U044F> "<U006A><U0061>"
>> <U0451> "<U006A><U006F>"
>>
>> Thanks,
>>
>
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 14:19 ` Egor Kobylkin
@ 2018-10-09 18:56 ` Egor Kobylkin
2018-10-09 22:31 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 18:56 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
The culprits were the double quotes around the source sequences
"<U0423><U0301>" (mapped to <U00DA>) and "<U0443><U0301>" (mapped to
<U00FA>).
It works now with
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
% CYRILLIC UNDEFINED
<U0443><U0301> <U00FA>;"<U0075><U0060>"
<U0301> is a combining character, and apparently the rule does not work
if it is enclosed in quotes together with the letter code point. Please
let me know if there is another explanation.
I will now make those changes and generate the patch itself.
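
A check for this pattern could be added to the generation step. The
following is only a rough sketch in Python, assuming that every rule
line has the shape "source, whitespace, alternatives" and that "%" is
the comment character, as in the attached file. quoted_source_keys is a
hypothetical helper, and whether quoted source keys are actually invalid
is my observation from the test above, not a documented rule.

```python
import re

COMMENT_CHAR = "%"

# A source key written as a double-quoted run of <Uxxxx> symbols.
QUOTED_KEY = re.compile(r'^"((?:<U[0-9A-Fa-f]{4,8}>)+)"\s')


def quoted_source_keys(lines):
    """Return (line number, key) pairs for rules whose source is quoted."""
    hits = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith(COMMENT_CHAR):
            continue
        match = QUOTED_KEY.match(stripped)
        if match:
            hits.append((number, match.group(1)))
    return hits


sample = [
    "% CYRILLIC CAPITAL LETTER U",
    "<U0423> <U0055>",
    "% CYRILLIC UNDEFINED",
    '"<U0423><U0301>" <U00DA>;"<U0055><U0060>"',
]
print(quoted_source_keys(sample))
```

Only the source column is checked; quoted multi-character alternatives
on the right-hand side, such as "<U0055><U0060>", are left alone.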
Egor
On 09.10.2018 15:18, Egor Kobylkin wrote:
> Hi,
>
> I have now implemented all the changes requested for the
> translit_cyrillic file, but I have started hitting what seems like a bug:
>
> - If the line <U0425> <U0048>;<U0058> is present in translit_cyrillic,
> the locale compilation fails, i.e. grep CYRILLIC < $testfile |
> LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
> iconv -f UTF-8 -t ASCII//TRANSLIT hangs.
>
> - If the line <U0425> <U0048>;<U0058> is absent from translit_cyrillic,
> everything works; only the transliteration of <U0425> fails, as expected
> (? is displayed).
>
> - If translit_cyrillic contains <U0425> <U0048>;<U0058> as the _only_
> line, the transliteration of <U0425> works again (all other letters are
> displayed as ?).
>
> Do you have any idea in which direction I should look? The new
> translit_cyrillic is attached.
>
> (<U0425> is % CYRILLIC CAPITAL LETTER HA)
>
> Best regards,
> Egor
>
> On 09.10.2018 01:35, Egor Kobylkin wrote:
>> On 09.10.2018 00:23, Rafal Luzynski wrote:
>>> 8.10.2018 14:40 Marko Myllynen <myllynen@redhat.com> wrote:
>>>> Hi,
>>>>
>>>> Thanks for the update. I have a few mostly cosmetic comments below,
>>>> hopefully we'll hear from others whether they agree with this direction.
>>>>
>>
>> Yes, the earlier we get feedback, the more productive we are. I'd be
>> happy to get as much feedback on this as early as possible, so would
>> everybody concerned please chime in.
>>
>>>
>>>> - No duplicates:
>>>>
>>>> % CYRILLIC SMALL LETTER IE
>>>> <U0435> <U0065>; <U0065>
>>>>
>>>> should become:
>>>>
>>>> % CYRILLIC SMALL LETTER IE
>>>> <U0435> <U0065>
>>>>
>>>> - There are a few issues with the definitions:
>>>>
>>>> % CYRILLIC CAPITAL LETTER U
>>>> <U0423> <U0055>; <U0055>
>>>> % CYRILLIC UNDEFINED
>>>> <U0423><U0423> <U00DA>; "<U0055><U0060>"
>>>>
>>>> % CYRILLIC SMALL LETTER U
>>>> <U0443> <U0075>; <U0075>
>>>> % CYRILLIC UNDEFINED
>>>> <U0443><U0443> <U00FA>; "<U0075><U0060>"
>>>
>>> Are the duplicates here because some Cyrillic letters may have multiple
>>> Latin transliterations depending on the context, for example Cyrillic IE
>>> must be transliterated sometimes as "e", sometimes as "ie", sometimes
>>> as "ye" or "je"? Can we provide rules for groups of characters instead?
>> No, the duplicates are just an artifact of my line-generating logic. I
>> have fixed (removed) them. As far as I understand, the varying
>> transcription between languages/locales cannot be handled in one file
>> at all.
>>
>>>
>>>> I wonder whether it would be possible to automate generation of this
>>>> file so that issues like the above could be avoided? But perhaps that
>>>> could be the next step once this initial patch lands.
>>
>> I am generating the content part of translit_cyrillic from a
>> LibreOffice spreadsheet. Not sure if you have had time to view it yet?
>> https://sourceware.org/bugzilla/attachment.cgi?id=11299
>>
>> Anyway I have just fixed the issues identified by Marko above in that
>> spreadsheet. I will do the changes for the below request and then upload
>> the new translit_cyrillic file to the bugzilla.
>>
>>
>>>> - Please add the standard glibc locale header (see the existing
>>>> translit_* files for reference)
>>>> - Consider wrapping the header lines at or around column 70-72
>>>> - Consider describing which characters, character ranges, or blocks are
>>>> supported (perhaps also describe why some of those are not included, see
>>>> e.g. https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode)
>>>> - Please remove trailing whitespaces and spaces after ;
>>>
>>> Thanks for this, Marko. While at this, in the ChangeLog and in the commit
>>> message these paths:
>>>
>>> * locales/aa_DJ: likewise
>>>
>>> 1. Should be a relative path starting in the root directory of glibc
>> source,
>>> that is: "* localedata/locales/aa_DJ".
>>> 2. Should be "Likewise." (starting with an uppercase and ending with a
>> dot).
>>
>> will do.
>>
>> Bests,
>> Egor
>>
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-08 23:20 ` Egor Kobylkin
@ 2018-10-09 21:52 ` Rafal Luzynski
0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 21:52 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
9.10.2018 00:52 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Just to make sure we are not talking at cross purposes. Since your last
> email on this topic on the suggestion from Marko I have already
> implemented ISO 9 transliteration for all characters there are. This
> should cover most if not all Slavic Cyrillic. You seem to have just
> noticed and replied to this email of Marko as I write mine.
That's great. I'm sorry for not noticing this before; as you can see,
this only confirms that I'm unable to give proper attention to your bug.
9.10.2018 01:35 Egor Kobylkin <egor@kobylkin.com> wrote:
> On 09.10.2018 00:23, Rafal Luzynski wrote:
> > Are the duplicates here because some Cyrillic letters may have multiple
> > Latin transliterations depending on the context, for example Cyrillic IE
> > must be transliterated sometimes as "e", sometimes as "ie", sometimes
> > as "ye" or "je"? Can we provide rules for groups of characters instead?
>
> No, the duplicates are just by design of my line generating logic. I
> have fixed (removed) them. The varying transcription between
> languages/locales can not be handled in one file at all as far as I
> understood.
No, I did not mean here different languages but that some letters may need
to be transliterated in a different way depending on the context. For
example, a letter "е" might be transliterated as "e" or "ie" or "je"
depending on whether it appears after "ж" or after another consonant
or after a vowel or a soft or hard sign etc. All within Russian language.
(Sorry if I'm messing that up; maybe what I wrote is wrong but may be correct
for another combination of letters.)
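To make the point concrete, here is a minimal Python sketch of such a context-dependent rule. The rule itself (when "е" becomes "je"), the small BASE table and the function name are hypothetical and only illustrate the idea; they are not taken from ISO 9, GOST 7.79 or glibc.

```python
# Hypothetical rule for illustration only (not from ISO 9, GOST 7.79 or glibc):
# "е" -> "je" at the start of a word, after a vowel, or after a hard/soft sign;
# "е" -> "e" after a consonant.
VOWELS = set("аеёиоуыэюя")
SIGNS = set("ъь")
# Tiny assumed base table, just enough for the examples below.
BASE = {"н": "n", "т": "t", "п": "p", "о": "o", "с": "s", "ъ": "``", "ь": "`"}


def transliterate(text: str) -> str:
    out = []
    prev = ""  # previous source character, "" at the start of the text
    for ch in text:
        if ch == "е":
            after_consonant = (
                prev.isalpha() and prev not in VOWELS and prev not in SIGNS
            )
            out.append("e" if after_consonant else "je")
        else:
            out.append(BASE.get(ch, ch))  # unmapped characters pass through
        prev = ch
    return "".join(out)


print(transliterate("нет"))   # "е" follows a consonant
print(transliterate("поет"))  # "е" follows a vowel
```

A flat one-character-to-one-string table cannot express this, because the output for "е" depends on the preceding character.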
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 16:10 ` Carlos O'Donell
@ 2018-10-09 22:09 ` Rafal Luzynski
0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 22:09 UTC (permalink / raw)
To: Carlos O'Donell, GNU C Library
9.10.2018 17:26 Carlos O'Donell <carlos@redhat.com> wrote:
> [...]
> but it sounds like Egor has added enough coverage to perhaps make all of
> these transliterations opt-in by default.
I think that it is correct if this transliteration is meant to be "Russian
language as if it used a Latin alphabet (even if it does not actually
except in some computer systems which do not support Cyrillic)"
but not if it is meant to be "Russian language to make sure it is comfortable
for reading by English speakers (assuming that everyone else should be fine
with English if their native language is not supported)".
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 16:22 ` Marko Myllynen
2018-10-09 16:49 ` Egor Kobylkin
@ 2018-10-09 22:18 ` Rafal Luzynski
2018-10-10 11:23 ` Marko Myllynen
2018-10-11 11:05 ` Marko Myllynen
2 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 22:18 UTC (permalink / raw)
To: Marko Myllynen, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
> On 2018-10-09 01:04, Rafal Luzynski wrote:
> > If you refer to other languages than Russian which also use the Cyrillic
> > alphabet but need a different transliteration rules than Russian for
> > the same characters then it is OK for me now. I am afraid that the iconv
> > algorithm does not handle such case. Of course, we should add this missing
> > feature eventually but I do not volunteer to do it now.
>
> Yes, this would be needed for correct transliteration of different
> languages, and this might be quite a bit of work. There's also the case
> of transliteration and character sets, consider the transliteration
> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>
> Russian: Борис Николаевич Ельцин
> Int'l: Boris Nikolaevič Elʹcin
> Finnish: Boris Nikolajevitš Jeltsin
> French: Boris Nikolaïevitch Ieltsine
> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
No, I did not mean the transcription using the rules of the destination
locale using Latin but that the rules of transliteration may be different
depending on the language of the source text. For example, consider
this Cyrillic string: "нъг" (I'm not saying that it is actually used
in any existing word but still must be handled). By our transliteration
rules it will be transliterated as "n``g". But this is fine for Russian;
if we knew that the source string is Ukrainian it would be transliterated
as "n``h"; if it was Bulgarian it would be transliterated as "năg".
Similarly, if you had to transliterate the Latin letters "sch" to Cyrillic
first you would have to ask what the source language was.
Unfortunately, I think that distinction of the source language is impossible
at the moment so let's assume that we fall back to Russian if there is
any ambiguity.
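A rough Python sketch of what source-language-aware lookup could look like, using only the three outcomes for "нъг" given above. The table layout, the language codes and the function name are assumptions for illustration; iconv has no such mechanism today.

```python
# Base table: the Russian fallback, as proposed above.
BASE = {"н": "n", "ъ": "``", "г": "g"}

# Hypothetical per-source-language overrides, limited to the cases in this mail.
OVERRIDES = {
    "ru": {},
    "uk": {"г": "h"},
    "bg": {"ъ": "ă"},
}


def transliterate(text: str, source_lang: str = "ru") -> str:
    # Unknown languages fall back to the Russian rules.
    table = {**BASE, **OVERRIDES.get(source_lang, {})}
    return "".join(table.get(ch, ch) for ch in text)


for lang in ("ru", "uk", "bg"):
    print(lang, transliterate("нъг", lang))
```

Without knowledge of the source language only the BASE table applies, which is the Russian fallback described above.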
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 18:56 ` Egor Kobylkin
@ 2018-10-09 22:31 ` Rafal Luzynski
2018-10-09 22:43 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-09 22:31 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
> "<U0443><U0301>" (<U00FA>).
> It works now with
> % CYRILLIC UNDEFINED
> <U0423><U0301> <U00DA>;"<U0055><U0060>"
> % CYRILLIC UNDEFINED
> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>
> [...]
I wonder why you need Cyrillic U with acute, and why you comment it
as "undefined" at all. I know that any Cyrillic vowel may appear with
an acute accent but "the diacritic is used only in dictionaries, children's
books, resources for foreign-language learners (...)". [1] So maybe
all vowels with an acute accent should be handled (which I think is fine)
rather than just U.
Regards,
Rafal
[1] https://en.wikipedia.org/wiki/Russian_alphabet#Diacritics
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 22:31 ` Rafal Luzynski
@ 2018-10-09 22:43 ` Egor Kobylkin
2018-10-10 4:16 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-09 22:43 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
On 10.10.2018 00:17, Rafal Luzynski wrote:
> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>> "<U0443><U0301>" (<U00FA>).
>> It works now with
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>> % CYRILLIC UNDEFINED
>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>
>> [...]
>
> I wonder why you need Cyrillic U with acute, and why you comment it
> as "undefined" at all. I know that any Cyrillic vowel may appear with
> an acute accent but "the diacritic is used only in dictionaries, children's
> books, resources for foreign-language learners (...)". [1] So maybe
> all vowels with an acute accent should be handled (which I think is fine)
> rather than just U.
I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
implemented it on Marko's suggestion. Personally I have no opinion on
what letters should be included and under what name. These funny Us just
happened to be in the ISO9 table.
There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
in Unicode. That's why its coming through that way from my worksheet as
it does a reverse lookup on the names based on the Unicode codepoints.
Manually we can change it to whatever you'd suggest in the
translit_cyrillic. I just don't know the right name.
On my side I think I have all outstanding tasks complete for the patch
https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
me know explicitly if you'd like anything changed there.
I was planning to rewrite just the commit message according to your
earlier feedback and resubmit sometime soon.
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 22:43 ` Egor Kobylkin
@ 2018-10-10 4:16 ` Egor Kobylkin
2018-10-10 12:12 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-10 4:16 UTC (permalink / raw)
To: Rafal Luzynski, Marko Myllynen
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
Ups, sorry, wrong link to the patch
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
On 10.10.2018 00:40, Egor Kobylkin wrote:
> On 10.10.2018 00:17, Rafal Luzynski wrote:
>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>> "<U0443><U0301>" (<U00FA>).
>>> It works now with
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>> % CYRILLIC UNDEFINED
>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>
>>> [...]
>>
>> I wonder why you need Cyrillic U with acute, and why you comment it
>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>> an acute accent but "the diacritic is used only in dictionaries, children's
>> books, resources for foreign-language learners (...)". [1] So maybe
>> all vowels with an acute accent should be handled (which I think is fine)
>> rather than just U.
>
> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
> implemented it on Marko's suggestion. Personally I have no opinion on
> what letters should be included and under what name. These funny Us just
> happened to be in the ISO9 table.
>
> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
> in Unicode. That's why its coming through that way from my worksheet as
> it does a reverse lookup on the names based on the Unicode codepoints.
>
> Manually we can change it to whatever you'd suggest in the
> translit_cyrillic. I just don't know the right name.
>
> On my side I think I have all outstanding tasks complete for the patch
> https://sourceware.org/bugzilla/attachment.cgi?id=11144. So please let
> me know explicitly if you'd like anything changed there.
correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>
> I was planning to rewrite just the commit message according to your
> earlier feedback and resubmit sometime soon.
>
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 22:18 ` Rafal Luzynski
@ 2018-10-10 11:23 ` Marko Myllynen
0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 11:23 UTC (permalink / raw)
To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
On 2018-10-10 01:08, Rafal Luzynski wrote:
> 9.10.2018 18:10 Marko Myllynen <myllynen@redhat.com> wrote:
>> On 2018-10-09 01:04, Rafal Luzynski wrote:
>>> If you refer to other languages than Russian which also use the Cyrillic
>>> alphabet but need a different transliteration rules than Russian for
>>> the same characters then it is OK for me now. I am afraid that the iconv
>>> algorithm does not handle such case. Of course, we should add this missing
>>> feature eventually but I do not volunteer to do it now.
>>
>> Yes, this would be needed for correct transliteration of different
>> languages, and this might be quite a bit of work. There's also the case
>> of transliteration and character sets, consider the transliteration
>> examples from https://fi.wikipedia.org/wiki/Siirtokirjoitus:
>>
>> Russian: Борис Николаевич Ельцин
>> Int'l: Boris Nikolaevič Elʹcin
>> Finnish: Boris Nikolajevitš Jeltsin
>> French: Boris Nikolaïevitch Ieltsine
>> Phonetic (IPA): [bɐˈrʲis nʲɪkɐˈlaɪvʲɪtɕ ˈjelʲtsɨn]
>
> No, I did not mean the transcription using the rules of the destination
> locale using Latin but that the rules of transliteration may be different
> depending on the language of the source text.
Yes, I mentioned this case in my earlier email:
https://sourceware.org/ml/libc-alpha/2018-10/msg00083.html
> this Cyrillic string: "нъг" (I'm not telling that it is actually used
> in any existing word but still must be handled). By our transliteration
> rules it will be transliterated as "n``g". But this is fine for Russian;
> if we knew that the source string is Ukrainian it would be transliterated
> as "n``h"; if it was Bulgarian it would be transliterated as "năg".
And according to SFS 4900, in fi_FI for this string we would see for
Russian ng, for Ukrainian nh, and for Bulgarian năg.
> Unfortunately, I think that distinction of the source language is impossible
> at the moment so let's assume that we fall back to Russian if there is
> any ambiguity.
Yeah, it's not optimal but probably the most decent compromise for now.
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-10 4:16 ` Egor Kobylkin
@ 2018-10-10 12:12 ` Marko Myllynen
2018-10-10 12:34 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 12:12 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
Hi,
On 2018-10-10 01:42, Egor Kobylkin wrote:
> Ups, sorry, wrong link to the patch
> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
Although I haven't checked every rule this in general looks very good
(but see below). Not sure do we want to add the few missing characters
mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
least initially the more exotic characters, like the historic ones,
though.) Perhaps filing a bug or two for these cases for separate
consideration would be ok.
> On 10.10.2018 00:40, Egor Kobylkin wrote:
>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>> "<U0443><U0301>" (<U00FA>).
>>>> It works now with
>>>> % CYRILLIC UNDEFINED
>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>> % CYRILLIC UNDEFINED
>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>
>>>> [...]
>>>
>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>> books, resources for foreign-language learners (...)". [1] So maybe
>>> all vowels with an acute accent should be handled (which I think is fine)
>>> rather than just U.
>>
>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>> implemented it on Marko's suggestion. Personally I have no opinion on
>> what letters should be included and under what name. These funny Us just
>> happened to be in the ISO9 table.
>>
>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>> in Unicode. That's why its coming through that way from my worksheet as
>> it does a reverse lookup on the names based on the Unicode codepoints.
>>
>> Manually we can change it to whatever you'd suggest in the
>> translit_cyrillic. I just don't know the right name.
I'm not sure this will work, no existing rule in translit_* files
contain two characters, I'd assume that the rule for U+0423 is applied
first and then the below rule is never used.
% CYRILLIC UNDEFINED
<U0423><U0301> <U00DA>;"<U0055><U0060>"
Perhaps this should be commented out or removed altogether if it's not
working as intended.
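To illustrate the assumption, here is a small Python sketch contrasting a lookup that consults the table one character at a time with one that tries the longest matching sequence first. Both functions and the two-entry table are hypothetical; this is not how glibc's iconv is implemented, only a model of why a two-codepoint rule could be shadowed.

```python
# Two rules, modelled on the entries under discussion:
# U+0423 alone, and U+0423 followed by U+0301 COMBINING ACUTE ACCENT.
RULES = {
    "\u0423": "U",
    "\u0423\u0301": "U`",
}


def per_character(text: str) -> str:
    # Looks up each code point on its own; a multi-character key can never match.
    return "".join(RULES.get(ch, ch) for ch in text)


def longest_match(text: str) -> str:
    # Tries the longest candidate sequence first, then shorter ones.
    max_len = max(len(key) for key in RULES)
    out, i = [], 0
    while i < len(text):
        for n in range(max_len, 0, -1):
            chunk = text[i:i + n]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unmapped characters pass through unchanged
            i += 1
    return "".join(out)


sample = "\u0423\u0301"
print(repr(per_character(sample)))
print(repr(longest_match(sample)))
```

In this model only the longest-match variant ever reaches the two-codepoint rule.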
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-10 12:12 ` Marko Myllynen
@ 2018-10-10 12:34 ` Egor Kobylkin
2018-10-10 16:23 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-10 12:34 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
On 10.10.2018 13:22, Marko Myllynen wrote:
>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>
> Although I haven't checked every rule this in general looks very good
> (but see below).
> Not sure do we want to add the few missing characters
> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
> least initially the more exotic characters, like the historic ones,
> though.) Perhaps filing a bug or two for these cases for separate
> consideration would be ok.
The question here is what should serve as their transliteration and
transcription?
They are covered neither by ISO9 nor by GOST 7.79. So maybe it would be
reasonable to assume there is no notable occurrence of those anywhere?
Anyway I am happy to include your specific suggestions for all and any
Unicode quartets in this form:
[Cyrillic Unicode
; ISO9 Latin Transliteration (System A) as Unicode
; Transcription (System B) as (multicharacter) ASCII
; name to put in %COMMENT
].
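For reference, a minimal Python sketch of how one such quartet could be turned into a translit_cyrillic entry. The real generator is the LibreOffice spreadsheet; the function names here are hypothetical, and the output format is only inferred from the entries quoted in this thread (single targets as <UXXXX>, multi-character targets in double quotes, no space after the semicolon, identical System A and System B targets written once).

```python
def ucode(text: str) -> str:
    # Render each code point in the <UXXXX> notation used by locale sources.
    return "".join(f"<U{ord(ch):04X}>" for ch in text)


def target(text: str) -> str:
    # Multi-character replacements are written as a quoted string.
    return ucode(text) if len(text) == 1 else f'"{ucode(text)}"'


def translit_entry(cyrillic: str, system_a: str, system_b: str, name: str) -> str:
    targets = [target(system_a)]
    if system_b != system_a:  # avoid emitting the same target twice
        targets.append(target(system_b))
    return f"% {name}\n{ucode(cyrillic)} {';'.join(targets)}"


print(translit_entry("\u0435", "e", "e", "CYRILLIC SMALL LETTER IE"))
print(translit_entry("\u0423\u0301", "\u00DA", "U`", "CYRILLIC UNDEFINED"))
```

The two calls reproduce the shape of the IE and "U with acute" entries discussed earlier in the thread.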
>
>> On 10.10.2018 00:40, Egor Kobylkin wrote:
>>> On 10.10.2018 00:17, Rafal Luzynski wrote:
>>>> 9.10.2018 20:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>>
>>>>> The culprits were the "" around the "<U0423><U0301>" (<U00DA>) and
>>>>> "<U0443><U0301>" (<U00FA>).
>>>>> It works now with
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>>> % CYRILLIC UNDEFINED
>>>>> <U0443><U0301> <U00FA>;"<U0075><U0060>"
>>>>>
>>>>> [...]
>>>>
>>>> I wonder why you need Cyrillic U with acute, and why you comment it
>>>> as "undefined" at all. I know that any Cyrillic vowel may appear with
>>>> an acute accent but "the diacritic is used only in dictionaries, children's
>>>> books, resources for foreign-language learners (...)". [1] So maybe
>>>> all vowels with an acute accent should be handled (which I think is fine)
>>>> rather than just U.
>>>
>>> I have just taken the https://en.wikipedia.org/wiki/ISO_9 table and
>>> implemented it on Marko's suggestion. Personally I have no opinion on
>>> what letters should be included and under what name. These funny Us just
>>> happened to be in the ISO9 table.
>>>
>>> There is no codepoint and no name for <U0423><U0301> and <U0443><U0301>
>>> in Unicode. That's why its coming through that way from my worksheet as
>>> it does a reverse lookup on the names based on the Unicode codepoints.
>>>
>>> Manually we can change it to whatever you'd suggest in the
>>> translit_cyrillic. I just don't know the right name.
>
> I'm not sure this will work, no existing rule in translit_* files
> contain two characters, I'd assume that the rule for U+0423 is applied
> first and then the below rule is never used.
>
> % CYRILLIC UNDEFINED
> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> Perhaps this should be commented out or removed altogether if it's not
> working as intended.
here is a result of my test on
https://sourceware.org/bugzilla/attachment.cgi?id=11304
U0423 0301-У́ -> U0423 0301-U
U0443 0301-у́ -> U0443 0301-u
So yes, they are not processed. I would drop them so as not to have special
cases. But I am also fine with keeping them because all work is done
already.
Result:
CYRILLIC RUSSIAN S``esh` eshhyo e`tih myagkih francuzskih bulok, da
vypej zhe chayu. SA`ESH` ESHHYO E`TIH MYAGKIH FRANCUZSKIH BULOK? DA
VYPEJ ZHE CHAYU!
CYRILLIC COMPLETE U0401-YO U0402-DJ U0403-G` U0404-Ye U0405-Z` U0406-I
U0407-Yi U0408-J U0409-L` U040A-N` U040B-TSH U040C-K` U040E-U` U040F-Dh
U0410-A U0411-B U0412-V U0413-G U0414-D U0415-E U0416-ZH U0417-Z U0418-I
U0419-J U041A-K U041B-L U041C-M U041D-N U041E-O U041F-P U0420-R U0421-S
U0422-T U0423-U U0423 0301-U U0424-F U0425-H U0426-C U0427-CH U0428-SH
U0429-SHH U042A-`` U042B-Y U042C-` U042D-E` U042E-YU U042F-YA U0430-a
U0431-b U0432-v U0433-g U0434-d U0435-e U0436-zh U0437-z U0438-i U0439-j
U043A-k U043B-l U043C-m U043D-n U043E-o U043F-p U0440-r U0441-s U0442-t
U0443-u U0443 0301-u U0444-f U0445-h U0446-c U0447-ch U0448-sh U0449-shh
U044A-A` U044B-y U044C-` U044D-e` U044E-yu U044F-ya U0451-yo U0452-dj
U0453-g` U0454-ye U0455-z` U0456-i U0457-yi U0458-j U0459-l` U045A-n`
U045B-tsh U045C-k` U045E-u` U045F-dh U046A-O` U046B-o` U0472-Fh U0473-fh
U0474-Yh U0475-yh U048C-E` U048D-e` U0490-G` U0491-g` U0492-GH U0493-gh
U0494-GH U0495-gh U0496-ZH` U0497-zh` U049A-K` U049B-k` U049E-K`
U049F-k` U04A2-N` U04A3-n` U04A4-NG U04A5-ng U04A6-P` U04A7-p` U04A8-O`
U04A9-o` U04AA-C` U04AB-C` U04AC-T` U04AD-t` U04AE-U U04AF-u U04B2-H`
U04B3-h` U04B4-TCZ U04B5-tcz U04BA-SH` U04BB-SH` U04BC-CH` U04BD-ch`
U04BE-CH` U04BF-ch` U04C0-i U04C1-ZH` U04C2-zh` U04CB-CH` U04CC-ch`
U04D0-A` U04D1-a` U04D2-A` U04D3-a` U04D6-E` U04D7-e` U04D8-A` U04D9-a`
U04DC-ZH` U04DD-zh` U04DE-Z` U04DF-z` U04E0-Z` U04E1-z` U04E4-I`
U04E5-i` U04E6-O` U04E7-o` U04E8-O` U04E9-o` U04F0-U` U04F1-u` U04F2-U`
U04F3-u` U04F4-CH` U04F5-ch` U04F8-Y` U04F9-y` U2019-'
Source:
CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же
чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ!
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І
U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А
U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й
U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т
U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ
U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в
U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л
U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443
0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы
U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ
U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ
U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ U0490-Ґ
U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ
U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ
U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ
U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ
U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ
U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ
U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ
U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-10 12:34 ` Egor Kobylkin
@ 2018-10-10 16:23 ` Marko Myllynen
0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-10 16:23 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski
Cc: Keld Simonsen, libc-alpha, libc-locales, Dmitry V. Levin,
Volodymyr Lisivka, Carlos O'Donell, Max Kutny, danilo
Hi,
On 2018-10-10 15:19, Egor Kobylkin wrote:
> On 10.10.2018 13:22, Marko Myllynen wrote:
>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>
>> Although I haven't checked every rule this in general looks very good
>> (but see below).
>
>> Not sure do we want to add the few missing characters
>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>> least initially the more exotic characters, like the historic ones,
>> though.) Perhaps filing a bug or two for these cases for separate
>> consideration would be ok.
>
> The question here is what should serve as their transliteration and
> transcription?
Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now, I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.
>> I'm not sure this will work, no existing rule in translit_* files
>> contain two characters, I'd assume that the rule for U+0423 is applied
>> first and then the below rule is never used.
>>
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>
>> Perhaps this should be commented out or removed altogether if it's not
>> working as intended.
>
> So yes, they are not processed. I would drop them so as not to have special
> cases. But I am also fine with keeping them because all work is done
> already.
I'd probably drop them but I don't feel strongly about this either way.
Thanks for your efforts, I don't have any further comments, I'll leave
this now for Rafal and Mike to provide additional feedback and hopefully
merge soon.
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
[not found] ` <20180412224352.GB2911@altlinux.org>
2018-07-17 19:34 ` SUBJECT: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
2018-08-06 19:00 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29 Egor Kobylkin
@ 2018-10-11 2:58 ` Egor Kobylkin
2018-10-11 10:10 ` Marko Myllynen
2018-10-11 12:13 ` Rafal Luzynski
2018-10-11 15:51 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
` (10 subsequent siblings)
13 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 2:58 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 66203 bytes --]
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=8591 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce and it does so after the patch applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
Root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on Cyrillic alphabet to an ASCII//TRANSLIT text.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].
The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically I have searched for all locales that have a
translit_start/end stanza and generated a patch for them.
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11302
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-11 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: add ISO 9:1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/C 2018-10-09 19:02:45.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-09 19:02:45.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-09 19:02:45.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-09 19:02:45.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-09 19:02:45.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-09 19:02:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-09 19:02:45.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-09 19:02:45.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-09 19:02:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-09 19:02:45.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-09 19:02:46.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-09 19:02:46.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-09 19:02:46.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-09 19:02:46.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-09 19:02:46.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-09 19:02:46.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-09 19:02:46.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-09 19:02:46.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-09 19:02:46.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-09 19:02:46.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-09 19:02:46.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-09 19:02:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-09 19:02:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-09 19:02:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-09 19:02:47.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-09 19:02:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-09 19:02:47.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-09 19:02:47.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-09 19:02:47.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-09 19:02:47.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-09 19:02:47.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-09 19:02:47.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-09 19:02:47.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-09 19:02:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-09 19:02:47.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-09 19:02:48.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-09 19:02:48.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-09 19:02:48.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-09 19:02:48.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-09 19:02:48.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-09 19:02:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-09 19:02:48.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-09 19:02:48.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-09 19:02:48.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-09 19:02:48.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-09 19:02:49.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-09 19:02:49.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-09 19:02:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-09 19:02:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-09 19:02:49.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-09 19:02:49.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-09 19:02:49.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-09 19:02:49.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-09 19:02:49.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-09 19:02:49.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-09 19:02:50.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-09 19:02:50.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-09 19:02:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-09 19:02:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-09 19:02:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-09 19:02:50.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-09 19:02:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-09 19:02:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-09 19:02:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-09 19:02:50.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-09 19:02:50.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-09 19:02:50.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-09 19:02:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-09 19:02:51.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-09 19:02:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-09 19:02:51.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-09 19:02:51.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-09 19:02:51.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-09 19:02:51.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-09 19:02:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-09 19:02:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-09 19:02:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-09 19:02:51.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-09 19:02:52.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-09 19:02:52.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-09 19:02:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-09 19:02:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-09 19:02:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-09 19:02:52.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-09 19:02:52.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-09 19:02:52.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-09 19:02:52.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-09 19:02:53.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-09 19:02:53.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-09 19:02:53.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 (Russian) using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-09 19:02:53.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-09 19:02:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-09 19:02:53.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-09 19:02:53.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-09 19:02:54.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-09 19:02:54.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-09 19:02:54.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
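
For reviewers who want to see how the two-option entries in
translit_cyrillic are meant to behave, here is a minimal Python sketch.
It is not glibc code. It assumes that the candidates after a code point
are tried left to right and that the first one representable in the
target charset wins, which is what the "System A first, System B second"
comment and the two iconv usage examples in the file header describe.
The names TABLE, representable and translit are hypothetical, and the
table holds only a few entries copied from the patch.

```python
# Illustration only: a tiny model of the ordered-fallback lookup,
# not the glibc implementation.

# A few entries copied from translit_cyrillic: System A first, System B second.
TABLE = {
    "\u0436": ["\u017e", "zh"],   # CYRILLIC SMALL LETTER ZHE
    "\u0448": ["\u0161", "sh"],   # CYRILLIC SMALL LETTER SHA
    "\u0447": ["\u010d", "ch"],   # CYRILLIC SMALL LETTER CHE
    "\u0430": ["a"],              # CYRILLIC SMALL LETTER A
    "\u0431": ["b"],              # CYRILLIC SMALL LETTER BE
}


def representable(text, codec):
    """Return True if text can be encoded in the target charset."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, codec):
    """Replace each unrepresentable character by its first usable candidate."""
    out = []
    for ch in text:
        if representable(ch, codec):
            out.append(ch)
            continue
        for candidate in TABLE.get(ch, []):
            if representable(candidate, codec):
                out.append(candidate)
                break
        else:
            # No entry, or no candidate fits the target charset.
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    word = "\u0436\u0430\u0431\u0430"        # Cyrillic "zhaba"
    print(translit(word, "iso-8859-15"))     # System A candidate is usable
    print(translit(word, "ascii"))           # falls back to System B
```

With ISO-8859-15 as the target, the System A candidate for ZHE (z with
caron) is encodable and is chosen. With ASCII it is not, so the lookup
falls through to "zh". A character with no usable candidate becomes "?",
which mirrors the question marks shown in the bug report.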
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56414 bytes --]
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/C 2018-10-09 19:02:45.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-09 19:02:45.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-09 19:02:45.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-09 19:02:45.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-09 19:02:12.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-09 19:02:45.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-09 19:02:45.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-09 19:02:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-09 19:02:45.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-09 19:02:45.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-09 19:02:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-09 19:02:45.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-09 19:02:46.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-09 19:02:46.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-09 19:02:46.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-09 19:02:46.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-09 19:02:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-09 19:02:46.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-09 19:02:46.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-09 19:02:46.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-09 19:02:46.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-09 19:02:46.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-09 19:02:46.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-09 19:02:46.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-09 19:02:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-09 19:02:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-09 19:02:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-09 19:02:46.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-09 19:02:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-09 19:02:47.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-09 19:02:47.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-09 19:02:47.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-09 19:02:47.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-09 19:02:47.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-09 19:02:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-09 19:02:47.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-09 19:02:47.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-09 19:02:47.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-09 19:02:47.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-09 19:02:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-09 19:02:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-09 19:02:47.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-09 19:02:47.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-09 19:02:48.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-09 19:02:48.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-09 19:02:48.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-09 19:02:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-09 19:02:48.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-09 19:02:48.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-09 19:02:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-09 19:02:48.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-09 19:02:48.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-09 19:02:48.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-09 19:02:48.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-09 19:02:49.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-09 19:02:49.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-09 19:02:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-09 19:02:49.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-09 19:02:49.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-09 19:02:49.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-09 19:02:49.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-09 19:02:49.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-09 19:02:49.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-09 19:02:49.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-09 19:02:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-09 19:02:50.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-09 19:02:50.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-09 19:02:50.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-09 19:02:50.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-09 19:02:50.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-09 19:02:50.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-09 19:02:50.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-09 19:02:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-09 19:02:50.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-09 19:02:50.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-09 19:02:50.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-09 19:02:50.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-09 19:02:50.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-09 19:02:50.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-09 19:02:51.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-09 19:02:51.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-09 19:02:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-09 19:02:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-09 19:02:51.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-09 19:02:51.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-09 19:02:51.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-09 19:02:51.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-09 19:02:51.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-09 19:02:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-09 19:02:51.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-09 19:02:51.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-09 19:02:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-09 19:02:51.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-09 19:02:52.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-09 19:02:52.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-09 19:02:19.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-09 19:02:52.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-09 19:02:52.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-09 19:02:52.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-09 19:02:52.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-09 19:02:52.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-09 19:02:52.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-09 19:02:52.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-09 19:02:52.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-09 19:02:52.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-09 19:02:53.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-09 19:02:20.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-09 19:02:53.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-09 19:02:53.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of cyrillic letters to latin and/or ascii symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-09 19:02:53.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-09 19:02:53.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-09 19:02:53.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-09 19:02:53.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-09 19:02:53.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-09 19:02:53.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-09 19:02:54.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-09 19:02:54.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-09 19:02:54.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-09 19:02:21.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-09 19:02:54.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 2:58 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
@ 2018-10-11 10:10 ` Marko Myllynen
2018-10-11 12:13 ` Rafal Luzynski
1 sibling, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11 10:10 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Rafal Luzynski
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
Hi,
It looks like there is one rule after all that might be debatable. I'll
just highlight it and let others comment and decide what to do with it.
On 2018-10-11 01:29, Egor Kobylkin wrote:
>
> +% RIGHT SINGLE QUOTATION MARK
> +<U2019> <U2035>;<U0027>
translit_neutral (which is included by i18n) has:
% RIGHT SINGLE QUOTATION MARK
<U2019> <U0027> % not <U00B4> because it's often used as an apostrophe
In practice the end result might well be the same (if U+2019 is not
available in the target character set then U+2035 probably is not
either, so both rules produce U+0027). However, given that
translit_cyrillic would be included in every locale, I'm not sure
whether this kind of minor discrepancy is OK.
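The argument above can be checked with a small model. This is only a sketch, assuming that the first candidate representable in the target character set wins; the helper first_encodable is hypothetical and is not glibc code, and Python codecs stand in for iconv target charsets.

```python
# Hypothetical model of "first representable candidate wins"; not glibc code.
def first_encodable(candidates, encoding):
    """Return the first candidate string that `encoding` can represent."""
    for cand in candidates:
        try:
            cand.encode(encoding)
            return cand
        except UnicodeEncodeError:
            continue
    return "?"  # placeholder when no candidate fits


# Candidate lists for U+2019, as written in the two files quoted above.
TRANSLIT_CYRILLIC = ["\u2035", "\u0027"]  # <U2019> <U2035>;<U0027>
TRANSLIT_NEUTRAL = ["\u0027"]             # <U2019> <U0027>

for target in ("ascii", "latin-1"):
    a = first_encodable(TRANSLIT_CYRILLIC, target)
    b = first_encodable(TRANSLIT_NEUTRAL, target)
    print(target, repr(a), repr(b), a == b)
```

For both targets the two candidate lists resolve to the same apostrophe. The lists would only diverge for a target that contains U+2035 but not U+2019.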
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
2018-10-09 16:22 ` Marko Myllynen
2018-10-09 16:49 ` Egor Kobylkin
2018-10-09 22:18 ` Rafal Luzynski
@ 2018-10-11 11:05 ` Marko Myllynen
2 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11 11:05 UTC (permalink / raw)
To: Rafal Luzynski, Egor Kobylkin, Keld Simonsen
Cc: libc-alpha, libc-locales, Dmitry V. Levin, Volodymyr Lisivka,
Carlos O'Donell, Max Kutny, danilo
Hi,
On 2018-10-09 19:10, Marko Myllynen wrote:
>
> One thing that might be helpful here could be something like:
>
> $ echo ж | LC_ALL=fi_FI.UTF-8 iconv -f UTF-8 -t UTF-8//TRANSLIT_FORCE
> ž
>
> That is, force transliteration of each character (if defined) even if
> it's part of the target character set. AFAICS this is not currently
> possible.
FWIW, this is currently not possible with iconv(1) but uconv(1) supports
this with -x (AFAICS it's using ICU not glibc locale data):
https://en.wikipedia.org/wiki/uconv
https://linux.die.net/man/1/uconv
https://github.com/unicode-org/icu/tree/master/icu4c/source/extra/uconv
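To make the difference between the two modes concrete, here is a minimal sketch. It assumes a hypothetical one-entry table and a hypothetical force flag; TRANSLIT_FORCE does not exist in iconv(1), and this is not how glibc or ICU implement transliteration.

```python
# Hypothetical one-entry table in the spirit of the patch:
# first candidate with diacritics, second candidate plain ASCII.
TABLE = {"\u0436": ["\u017e", "zh"]}  # CYRILLIC SMALL LETTER ZHE


def encodable(s, encoding):
    """True if `s` can be represented in `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, encoding, force=False):
    """Transliterate `text`; with force=True, table entries are applied
    even when the character is already representable in the target."""
    out = []
    for ch in text:
        if encodable(ch, encoding) and not (force and ch in TABLE):
            out.append(ch)
            continue
        for cand in TABLE.get(ch, []):
            if encodable(cand, encoding):
                out.append(cand)
                break
        else:
            out.append("?")  # no usable candidate
    return "".join(out)


print(translit("\u0436", "utf-8"))              # kept unchanged
print(translit("\u0436", "utf-8", force=True))  # first candidate applied
print(translit("\u0436", "ascii"))              # falls through to "zh"
```

Without the flag, a UTF-8 target never consults the table because every character is already representable, which is the limitation described above.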
Cheers,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 2:58 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
2018-10-11 10:10 ` Marko Myllynen
@ 2018-10-11 12:13 ` Rafal Luzynski
2018-10-11 13:32 ` Marko Myllynen
` (3 more replies)
1 sibling, 4 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-11 12:13 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
Thank you, Egor. I am looking at your patch and although I have
not yet finished, here are some remarks:
First of all, I think that such a large patch should also include
the tests. Please see how automatic tests are performed in locale
data and write your own.
11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> [...]
I think that eventually we would like to include your translit_cyrillic
in these locales as well, because I assume that your rules should work
well for them too and should cover more characters than the individual
language contributors took into account. This is similar to Mike's work
on collation: common rules were created and all locales include them,
adding their own language-specific modifications.
> [...]
> COMMIT MESSAGE:
> [...]
> I am excluding these locales from this proposed patch. I have written
> directly to locale maintainer emails listed in the files. Volodymyr
> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
I am not sure if we want Cyrillic text in the commit message. Shouldn't
it be, uhm, transliterated? :-)
"sr_CS" - I guess you meant "sr_RS".
"sr_YU" has been dropped, do we want to mention it?
> [...]
> [BZ #2872]
> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
instead?
> System A transliteration System B transcription table from Cyrillic to
> Latin/ASCII.
> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
> translit section.
Same, "Add" here.
> * localedata/locales/aa_DJ: Likewise.
Good (here and everywhere below).
> [...]
> diff -uNr a/localedata/locales/translit_cyrillic
> b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> +0000
> +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
> +0000
> @@ -0,0 +1,383 @@
> +escape_char /
> +comment_char %
> +
> +% This file is part of the GNU C Library and contains locale data.
> +% The Free Software Foundation does not claim any copyright interest
> +% in the locale data contained in this file. The foregoing does not
> +% affect the license of the GNU C Library as a whole. It does not
> +% exempt you from the conditions of the license if your use would
> +% otherwise be governed by that license.
> +
> +% Transliterations of cyrillic letters to latin and/or ascii symbols.
"cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
> +% Inspired by ISO 9.1995 / GOST 7.79-2000.
> +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
> +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
Typos:
"i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
"U4001" - I guess you meant "U0401"
"U4F9" -> "U04F9". I think that "U4F9" is not necessarily wrong but
let's be consistent.
Also I can see some gaps in the range. Are you going to fill them
or maybe for now just mention that they exist?
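The gaps can be listed mechanically. The following is only a sketch: it parses a short excerpt of rule lines copied from the patch rather than the whole file, and the helper names covered and gaps are hypothetical.

```python
import re

# Excerpt of rule lines copied from the patch (leading "+" removed).
EXCERPT = """
<U040B> <U0106>;"<U0054><U0053><U0048>"
<U040C> <U1E30>;"<U004B><U0060>"
<U040E> <U016C>;"<U0055><U0060>"
<U040F> "<U0044><U0302>";"<U0044><U0068>"
<U0410> <U0041>
"""


def covered(rules):
    """Collect the source code point of each single-character rule line."""
    found = set()
    for line in rules.splitlines():
        m = re.match(r"<U([0-9A-F]{4})>\s", line)
        if m:
            found.add(int(m.group(1), 16))
    return found


def gaps(points):
    """Code points between min and max that have no rule."""
    lo, hi = min(points), max(points)
    return [cp for cp in range(lo, hi + 1) if cp not in points]


missing = gaps(covered(EXCERPT))
print([f"U+{cp:04X}" for cp in missing])
```

Run against the full table, the same two functions would list every unassigned position in the covered range.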
> +% It implements the GOST_7.79 System A (Latin Script) as a first
> +% option and System B Cyrillic (ASCII) as a second option. Check
> +% https://en.wikipedia.org/wiki/ISO_9 for reference.
> +% The System B is extended from GOST_7.79-Russian using open sources
> +% of the transliteration mappings and the "h/`" diacritics logic.
What is "h/`" diacritics logic?
> +
> +% Usage examples:
> +% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> +% | iconv -f ISO-8859-15 -t UTF-8 # System A
> +% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
> +
> +% Contributions welcome for the rest of Cyrillic script in Unicode
Sure, I'm not going to stop you from pushing these changes just because
there are missing characters. I will consider adding them later.
> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
> +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
> +% Generated from UnicodeData.txt with
> +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
1. Is the file really generated with a script and not modified later?
If so, then maybe you should contribute the script instead? In that case,
you should also not post this file to libc-locales; maintainers and
developers should be able to regenerate it.
2. The link leads to a LibreOffice spreadsheet.
> +LC_CTYPE
> +
> +translit_start
> +
<U0400> is missing here. Are you going to leave it for now?
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> <U00CB>;"<U0059><U004F>"
> [...]
> +% CYRILLIC CAPITAL LETTER KJE
> +<U040C> <U1E30>;"<U004B><U0060>"
<U040D> is missing here. Can we add it already?
> +% CYRILLIC CAPITAL LETTER SHORT U
> +<U040E> <U016C>;"<U0055><U0060>"
> [...]
> +% CYRILLIC CAPITAL LETTER U
> +<U0423> <U0055>
> +% CYRILLIC UNDEFINED
> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
This still makes me wonder.
Does it work at all?
What if we remove this rule? Won't it be transliterated as
<U0423> => "U", with <U0301> left unchanged, so that "U" + <U0301>
will eventually produce "Ú"?
Why is it called "UNDEFINED"?
Do we need similar rules for other characters?
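The composition step in the question above can be verified in isolation. This sketch only checks that "U" followed by U+0301 is canonically equivalent to U+00DA; whether iconv actually leaves U+0301 untouched for a given target is an assumption that it does not test.

```python
import unicodedata

decomposed = "U\u0301"  # LATIN CAPITAL LETTER U + COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

# Two code points before normalization, one after.
print(len(decomposed), len(composed), composed == "\u00da")
```

The same check with "u\u0301" and "\u00fa" covers the lowercase rule.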
> [...]
> +% CYRILLIC SMALL LETTER U
> +<U0443> <U0075>
> +% CYRILLIC UNDEFINED
> +<U0443><U0301> <U00FA>;"<U0075><U0060>"
Same here.
> [...]
> +% CYRILLIC SMALL LETTER YA
> +<U044F> <U00E2>;"<U0079><U0061>"
Again <U0450> missing (because it is lowercase variant of <U0400>).
> +% CYRILLIC SMALL LETTER IO
> +<U0451> <U00EB>;"<U0079><U006F>"
> [...]
> +% CYRILLIC SMALL LETTER KJE
> +<U045C> <U1E31>;"<U006B><U0060>"
<U045D> missing (same reason as <U040D>).
> +% CYRILLIC SMALL LETTER SHORT U
> +<U045E> <U016D>;"<U0075><U0060>"
> +% CYRILLIC SMALL LETTER DZHE
> +<U045F> "<U0064><U0302>";"<U0064><U0068>"
More letters missing here. Is this because they are historic so we
don't want to include them now? Well, but "YUS" is also historic.
(Please, do not remove YUS for consistency).
> +% CYRILLIC CAPITAL LETTER BIG YUS
> +<U046A> <U01CD>;"<U004F><U0060>"
> +% CYRILLIC SMALL LETTER BIG YUS
> +<U046B> <U01CE>;"<U006F><U0060>"
> [...]
I will continue but, again, I don't give any ETA so other reviewers
are welcome here.
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 12:13 ` Rafal Luzynski
@ 2018-10-11 13:32 ` Marko Myllynen
2018-10-11 13:57 ` Volodymyr Lisivka
` (2 subsequent siblings)
3 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-10-11 13:32 UTC (permalink / raw)
To: Rafal Luzynski, Egor Kobylkin, libc-alpha, libc-locales, mfabian
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
Hi,
On 2018-10-11 14:04, Rafal Luzynski wrote:
>
> First of all, I think that such a large patch should also include
> the tests. Please see how automatic tests are performed in locale
> data and write your own.
>
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> Also I can see some gaps in the range. Are you going to fill them
> or maybe for now just mention that they exist?
>
> <U040D> is missing here. Can we add it already?
>
> Sure, I'm not going to stop you from pushing these changes just because
> there are missing characters. I will consider adding them later.
>
> <U0400> is missing here. Are you going to leave it for now?
See https://sourceware.org/ml/libc-alpha/2018-10/msg00160.html.
>> +% CYRILLIC CAPITAL LETTER U
>> +<U0423> <U0055>
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> This still makes me wonder.
>
> Does it work at all?
No, see the above link.
More importantly, I realized that ICU uconv(1) I mentioned earlier
should make a great reference for this data; output of the currently
included transliteration rules should match uconv(1) output. If that is
not the case, the patch or uconv(1) might have an issue. If the outputs
match, then we should be able to safely assume the patch is ok.
It could also be considered to use uconv(1) output as a reference for
how to handle the currently missing characters.
(uconv(1) is part of the icu package on Fedora/CentOS/RHEL/openSUSE.)
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 12:13 ` Rafal Luzynski
2018-10-11 13:32 ` Marko Myllynen
@ 2018-10-11 13:57 ` Volodymyr Lisivka
2018-10-11 15:05 ` Egor Kobylkin
2018-10-11 15:23 ` Egor Kobylkin
3 siblings, 0 replies; 111+ messages in thread
From: Volodymyr Lisivka @ 2018-10-11 13:57 UTC (permalink / raw)
To: digitalfreak
Cc: Egor Kobylkin, libc-alpha, libc-locales, mfabian, myllynen, ldv,
Max Kutny, danilo
Thu, 11 Oct 2018 at 14:05 Rafal Luzynski <digitalfreak@lingonborough.com> wrote:
>
> Thank you, Egor. I am looking at your patch and although I have
> not yet finished, here are some remarks:
>
> First of all, I think that such a large patch should also include
> the tests. Please see how automatic tests are performed in locale
> data and write your own.
>
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
> > [...]
> > From this patch I have excluded locales that already mention cyrillic or
> > have a transliteration table for it:
> > az_AZ
> > iso14651_t1_common
> > ky_KG
> > mn_MN
> > sr_RS
> > tg_TJ
> > tk_TM
> > tt_RU
> > uk_UA
> > uz_UZ
> > uz_UZ@cyrillic
> > [...]
>
> I think that eventually we would like to include your translit_cyrillic
> also in these locales because I assume that your rules should work good
> for them as well, also should include more characters than the individual
> language contributors took into account.
It's a very good idea. The transliteration in the Ukrainian locale predates
this work by about a decade. It is well tested. I also have automated test
cases, which I can adapt to the current standard. Let's drop the Russian
transliteration rules and replace them with the Ukrainian transliteration
rules. I assume that the Ukrainian rules should work well for them as
well.
The Ukrainian language is the oldest and most developed language in the Slavic
family: the last king of all Slavs, named Madzhak/Muzhik (Brave), leader
of the Volyniana union, lived in Western Ukraine in the Volyn` region.
After the capture of Madzhak, the kingdom was split into multiple
western parts and an eastern part, where 9 Slavic tribes were united by
the Rus` tribe, which had abandoned its city, now known as Old Russa,
because of an epidemic. IMHO, it would be fair to use the rules of the
oldest Slavic union.
> Similarly to Mike's work on
> collation: a common rules were created and all locales include them adding
> their own language specific modifications.
It's a good idea too. In our own locale we prefer that words in our
language come at the top of a sorted list. Currently, in the Ukrainian
locale it works as intended, but the Russian locale has the inverted order.
IMHO, the Russian locale should use the Ukrainian rules.
$ echo 'один два three four'| tr ' ' '\n' | LANG=uk_UA.utf8 sort
два
один
four
three
$ echo 'один два three four'| tr ' ' '\n' | LANG=ru_RU.utf8 sort
four
three
два
один
>
> > [...]
> > COMMIT MESSAGE:
> > [...]
> > I am excluding these locales from this proposed patch. I have written
> > directly to locale maintainer emails listed in the files. Volodymyr
> > Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
> > Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>
> I am not sure if we want Cyrillic text in the commit message. Shouldn't
> it be, uhm, tranlisterated? :-)
>
> "sr_CS" - I guess you meant "sr_RS".
>
> "sr_YU" has been dropped, do we want to mention it?
>
> > [...]
> > [BZ #2872]
> > * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
>
> Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
> instead?
>
> > System A transliteration System B transcription table from Cyrillic to
> > Latin/ASCII.
> > * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
> > translit section.
>
> Same, "Add" here.
>
> > * localedata/locales/aa_DJ: Likewise.
>
> Good (here and everywhere below).
>
> > [...]
> > diff -uNr a/localedata/locales/translit_cyrillic
> > b/localedata/locales/translit_cyrillic
> > --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> > +0000
> > +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
> > +0000
> > @@ -0,0 +1,383 @@
> > +escape_char /
> > +comment_char %
> > +
> > +% This file is part of the GNU C Library and contains locale data.
> > +% The Free Software Foundation does not claim any copyright interest
> > +% in the locale data contained in this file. The foregoing does not
> > +% affect the license of the GNU C Library as a whole. It does not
> > +% exempt you from the conditions of the license if your use would
> > +% otherwise be governed by that license.
> > +
> > +% Transliterations of cyrillic letters to latin and/or ascii symbols.
>
> "cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
>
> > +% Inspired by ISO 9.1995 / GOST 7.79-2000.
> > +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
> > +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
>
> Typos:
>
> "i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
> "U4001" - I guess you meant "U0401"
> "U4F9" -> "U04F9". I think that "U4F9" is not definitely bad but
> let's be consistent.
>
> Also I can see some gaps in the range. Are you going to fill them
> or maybe for now just mention that they exist?
>
> > +% It implements the GOST_7.79 System A (Latin Script) as a first
> > +% option and System B Cyrillic (ASCII) as a second option. Check
> > +% https://en.wikipedia.org/wiki/ISO_9 for reference.
> > +% The System B is extended from GOST_7.79-Russian using open sources
> > +% of the transliteration mappings and the "h/`" diacritics logic.
>
> What is "h/`" diacritics logic?
>
> > +
> > +% Usage examples:
> > +% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> > +% | iconv -f ISO-8859-15 -t UTF-8 # System A
> > +% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
> > +
> > +% Contributions welcome for the rest of Cyrillic script in Unicode
>
> Sure, I'm not going to stop you from pushing these changes just because
> there are missing characters. I will consider adding them later.
>
> > +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
> > +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
> > +% Generated from UnicodeData.txt with
> > +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
>
> 1. Is the file really generated with a script and not modified later?
> If yes then maybe you should contribute the script instead? In that case,
> you should also not post this file to libc-locale, maintainers and
> developers should be able to regenerate it.
> 2. The link leads to a LibreOffice spreadsheet.
>
> > +LC_CTYPE
> > +
> > +translit_start
> > +
>
> <U0400> is missing here. Are you going to leave it for now?
>
> > +% CYRILLIC CAPITAL LETTER IO
> > +<U0401> <U00CB>;"<U0059><U004F>"
> > [...]
> > +% CYRILLIC CAPITAL LETTER KJE
> > +<U040C> <U1E30>;"<U004B><U0060>"
>
> <U040D> is missing here. Can we add it already?
>
> > +% CYRILLIC CAPITAL LETTER SHORT U
> > +<U040E> <U016C>;"<U0055><U0060>"
> > [...]
> > +% CYRILLIC CAPITAL LETTER U
> > +<U0423> <U0055>
> > +% CYRILLIC UNDEFINED
> > +<U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> This still makes me wonder.
>
> Does it work at all?
> What if we remove this rule, won't it be transliterated as
> <U0423> => "U", <U0301> - left unchanged, so "U" + <U0301>"
> will eventually produce "Ú"?
> Why is it called "UNDEFINED"?
> Do we need similar rules for other characters?
>
> > [...]
> > +% CYRILLIC SMALL LETTER U
> > +<U0443> <U0075>
> > +% CYRILLIC UNDEFINED
> > +<U0443><U0301> <U00FA>;"<U0075><U0060>"
>
> Same here.
>
> > [...]
> > +% CYRILLIC SMALL LETTER YA
> > +<U044F> <U00E2>;"<U0079><U0061>"
>
> Again <U0450> missing (because it is lowercase variant of <U0400>).
>
> > +% CYRILLIC SMALL LETTER IO
> > +<U0451> <U00EB>;"<U0079><U006F>"
> > [...]
> > +% CYRILLIC SMALL LETTER KJE
> > +<U045C> <U1E31>;"<U006B><U0060>"
>
> <U045D> missing (same reason as <U040D>).
>
> > +% CYRILLIC SMALL LETTER SHORT U
> > +<U045E> <U016D>;"<U0075><U0060>"
> > +% CYRILLIC SMALL LETTER DZHE
> > +<U045F> "<U0064><U0302>";"<U0064><U0068>"
>
> More letters missing here. Is this because they are historic so we
> don't want to include them now? Well, but "YUS" is also historic.
> (Please, do not remove YUS for consistency).
>
> > +% CYRILLIC CAPITAL LETTER BIG YUS
> > +<U046A> <U01CD>;"<U004F><U0060>"
> > +% CYRILLIC SMALL LETTER BIG YUS
> > +<U046B> <U01CE>;"<U006F><U0060>"
> > [...]
>
> I will continue but, again, I don't give any ETA so other reviewers
> are welcome here.
>
> Regards,
>
> Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 12:13 ` Rafal Luzynski
2018-10-11 13:32 ` Marko Myllynen
2018-10-11 13:57 ` Volodymyr Lisivka
@ 2018-10-11 15:05 ` Egor Kobylkin
2018-10-11 21:33 ` Egor Kobylkin
2018-10-11 15:23 ` Egor Kobylkin
3 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 15:05 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales, mfabian, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 10345 bytes --]
Hi Rafal
On 11.10.2018 13:04, Rafal Luzynski wrote:
> Thank you, Egor. I am looking at your patch and although I have
> not yet finished, here are some remarks:
>
> First of all, I think that such a large patch should also include
> the tests. Please see how automatic tests are performed in locale
> data and write your own.
Could you please point me to the existing automatic tests?
Locally I am using the test suggested in glibc locales wiki.
From my commit message:
"The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
"
I am visually checking whether any iconv run fails for all those locales
but you must refer to some automated unit test with a boolean outcome,
right?
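A check with a boolean outcome could look roughly like the sketch below. It is not one of the existing glibc tests; the function name and the pass criterion are assumptions. The criterion is that the iconv output must be pure ASCII and must not contain more "?" characters than the input did, since "?" is what iconv currently emits for letters it cannot transliterate.

```python
def translit_ok(source_text, ascii_output):
    """Return True if the output looks like a complete ASCII transliteration.

    Assumed criterion: no non-ASCII characters remain, and no question
    marks appeared that were not already present in the source text.
    """
    if any(ord(ch) > 127 for ch in ascii_output):
        return False
    return ascii_output.count("?") <= source_text.count("?")
```

Running this over the iconv output for each locale would replace the visual inspection with a pass/fail result per locale.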
>
> 11.10.2018 00:29 Egor Kobylkin <egor@kobylkin.com> wrote:
>> [...]
>> From this patch I have excluded locales that already mention cyrillic or
>> have a transliteration table for it:
>> az_AZ
>> iso14651_t1_common
>> ky_KG
>> mn_MN
>> sr_RS
>> tg_TJ
>> tk_TM
>> tt_RU
>> uk_UA
>> uz_UZ
>> uz_UZ@cyrillic
>> [...]
>
> I think that eventually we would like to include your translit_cyrillic
> also in these locales because I assume that your rules should work good
> for them as well, also should include more characters than the individual
> language contributors took into account. Similarly to Mike's work on
> collation: a common rules were created and all locales include them adding
> their own language specific modifications.
This is fine with me. Should anybody supply translit_xxxxxxxxxxxx for
any of the mentioned locales we can include them as well. Wouldn't it be
easier to coordinate those as separate patches though?
>
>> [...]
>> COMMIT MESSAGE:
>> [...]
>> I am excluding these locales from this proposed patch. I have written
>> directly to locale maintainer emails listed in the files. Volodymyr
>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>
> I am not sure if we want Cyrillic text in the commit message. Shouldn't
> it be, uhm, tranlisterated? :-)
I do not see any Cyrillic text in the commit message.
The ?????? you see are the actual "?" symbols coming out of iconv now.
>
> "sr_CS" - I guess you meant "sr_RS".
>
> "sr_YU" has been dropped, do we want to mention it?
The list of locales and the patch itself is generated from the actual
locales - I do not hand pick them, only exclude the ones in the
exclusion list above.
>
>> [...]
>> [BZ #2872]
>> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
>
> Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
> instead?
>
>> System A transliteration System B transcription table from Cyrillic to
>> Latin/ASCII.
>> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
>> translit section.
>
> Same, "Add" here.
>
>> * localedata/locales/aa_DJ: Likewise.
>
> Good (here and everywhere below).
>
>> [...]
>> diff -uNr a/localedata/locales/translit_cyrillic
>> b/localedata/locales/translit_cyrillic
>> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
>> +0000
>> +++ b/localedata/locales/translit_cyrillic 2018-10-09 19:02:54.000000000
>> +0000
>> @@ -0,0 +1,383 @@
>> +escape_char /
>> +comment_char %
>> +
>> +% This file is part of the GNU C Library and contains locale data.
>> +% The Free Software Foundation does not claim any copyright interest
>> +% in the locale data contained in this file. The foregoing does not
>> +% affect the license of the GNU C Library as a whole. It does not
>> +% exempt you from the conditions of the license if your use would
>> +% otherwise be governed by that license.
>> +
>> +% Transliterations of cyrillic letters to latin and/or ascii symbols.
>
> "cyrillic" -> "Cyrillic"; "latin" -> "Latin"; "ascii" -> "ASCII".
>
>> +% Inspired by ISO 9.1995 / GOST 7.79-2000.
>> +% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
>> +% i.e [U4001-U4F9, U2019] but only the letters covered by ISO 9.1995
>
> Typos:
>
> "i.e" -> "i.e.," (somebody please fix me if I'm wrong here)
> "U4001" - I guess you meant "U0401"
> "U4F9" -> "U04F9". I think that "U4F9" is not definitely bad but
> let's be consistent.
These are all good catches. I will fix them and resubmit.
>
> Also I can see some gaps in the range. Are you going to fill them
> or maybe for now just mention that they exist?
>
No, we're not going to fill them; please see this:
On 10.10.2018 14:34, Marko Myllynen wrote:
> On 2018-10-10 15:19, Egor Kobylkin wrote:
>> On 10.10.2018 13:22, Marko Myllynen wrote:
>>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>> Although I haven't checked every rule this in general looks very good
>>> (but see below).
>>> Not sure do we want to add the few missing characters
>>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>>> least initially the more exotic characters, like the historic ones,
>>> though.) Perhaps filing a bug or two for these cases for separate
>>> consideration would be ok.
>> The question here is what should serve as their transliteration and
>> transcription?
> Not sure, so filing a separate bug about this once your patch is merged
> might be the most suitable action for now, I don't think we want to
> postpone merging your work further due to these non-ISO 9 cases.
>
>> +% It implements the GOST_7.79 System A (Latin Script) as a first
>> +% option and System B Cyrillic (ASCII) as a second option. Check
>> +% https://en.wikipedia.org/wiki/ISO_9 for reference.
>> +% The System B is extended from GOST_7.79-Russian using open sources
>> +% of the transliteration mappings and the "h/`" diacritics logic.
>
> What is "h/`" diacritics logic?
Basically some linguist mentioned that they have chosen "h" and "`" to
represent the diacritics for the transcription (i.e. GOST 7.79 System
B). This way there is some resemblance to the watertight transliteration
as per ISO 9 (System A) but it is still all in ASCII. We have decided
to extend GOST 7.79 to all the ISO 9 characters and so I have extended
it following that linguist's logic.
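To illustrate the "h/`" convention, here is a small sketch using only the System B (second option) values of the rules quoted in this mail. The table is deliberately partial and the function name is made up; letters without an entry are passed through unchanged.

```python
# System B values copied from the rules quoted in this thread (partial table).
SYSTEM_B = {
    "\u0401": "YO",  # CYRILLIC CAPITAL LETTER IO
    "\u040C": "K`",  # CYRILLIC CAPITAL LETTER KJE
    "\u040E": "U`",  # CYRILLIC CAPITAL LETTER SHORT U
    "\u0423": "U",   # CYRILLIC CAPITAL LETTER U
    "\u0443": "u",   # CYRILLIC SMALL LETTER U
    "\u044F": "ya",  # CYRILLIC SMALL LETTER YA
    "\u0451": "yo",  # CYRILLIC SMALL LETTER IO
    "\u045C": "k`",  # CYRILLIC SMALL LETTER KJE
    "\u045E": "u`",  # CYRILLIC SMALL LETTER SHORT U
    "\u045F": "dh",  # CYRILLIC SMALL LETTER DZHE
    "\u046A": "O`",  # CYRILLIC CAPITAL LETTER BIG YUS
    "\u046B": "o`",  # CYRILLIC SMALL LETTER BIG YUS
}


def to_system_b(text):
    """Replace each known Cyrillic letter by its ASCII System B value."""
    return "".join(SYSTEM_B.get(ch, ch) for ch in text)
```

Where System A would use a Latin letter with a diacritic, System B here appends "h" or "`" to a plain ASCII base letter, so the result stays within ASCII.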
>> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
>> +% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
>> +% Generated from UnicodeData.txt with
>> +% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
>
> 1. Is the file really generated with a script and not modified later?
> If yes then maybe you should contribute the script instead? In that case,
> you should also not post this file to libc-locale, maintainers and
> developers should be able to regenerate it.
> 2. The link leads to a LibreOffice spreadsheet.
No, I do not have a script. The "generated" means it is a result of
formulas in that spreadsheet. People are welcome to write a script; it
should be a straightforward implementation of the rules in those formulas.
>
>> +LC_CTYPE
>> +
>> +translit_start
>> +
>
> <U0400> is missing here. Are you going to leave it for now?
Yes, it is to be left out, not in ISO 9. See the exchange with Marko above.
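For whoever files the follow-up bug, the gaps can be listed mechanically. The sketch below assumes the set of code points covered by the table has already been extracted by some other means (that step is not shown), and the function name is hypothetical. It reports every assigned character in a range that is not in that set.

```python
import unicodedata


def uncovered(covered, first=0x0400, last=0x045F):
    """List assigned code points in [first, last] that are not in `covered`."""
    missing = []
    for code_point in range(first, last + 1):
        name = unicodedata.name(chr(code_point), None)
        if name is not None and code_point not in covered:
            missing.append((f"U+{code_point:04X}", name))
    return missing
```

The default range only spans the basic letters discussed so far; the bounds can be widened to the rest of the Cyrillic block.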
>
>> +% CYRILLIC CAPITAL LETTER IO
>> +<U0401> <U00CB>;"<U0059><U004F>"
>> [...]
>> +% CYRILLIC CAPITAL LETTER KJE
>> +<U040C> <U1E30>;"<U004B><U0060>"
>
> <U040D> is missing here. Can we add it already?
No, it is to be left out, as it is not in ISO 9. See the exchange with Marko above.
>
>> +% CYRILLIC CAPITAL LETTER SHORT U
>> +<U040E> <U016C>;"<U0055><U0060>"
>> [...]
>> +% CYRILLIC CAPITAL LETTER U
>> +<U0423> <U0055>
>> +% CYRILLIC UNDEFINED
>> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
>
> This still makes me wonder.
>
> Does it work at all?
> What if we remove this rule, won't it be transliterated as
> <U0423> => "U", <U0301> - left unchanged, so "U" + <U0301>"
> will eventually produce "Ú"?
> Why is it called "UNDEFINED"?
On 10.10.2018 14:34, Marko Myllynen wrote:
> On 2018-10-10 15:19, Egor Kobylkin wrote:
>> On 10.10.2018 13:22, Marko Myllynen wrote:
...
>>> I'm not sure this will work, no existing rule in translit_* files
>>> contain two characters, I'd assume that the rule for U+0423 is applied
>>> first and then the below rule is never used.
>>>
>>> % CYRILLIC UNDEFINED
>>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>>
>>> Perhaps this should be commented out or removed altogether if it's not
>>> working as intended.
>>
>> So yes, they are not processed. I would drop them to not to have special
>> cases. But I am also fine with keeping them because all work is done
>> already.
> I'd probably drop them but I don't feel strongly about this either way.
>
> Thanks for your efforts, I don't have any further comments, I'll leave
> this now for Rafal and Mike to provide additional feedback and hopefully
> merge soon.
Could you also please check the discussion with Marko on UNDEFINED and
other related topics? You were on To: or CC: for those emails.
The same for the other characters below.
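A toy model of why the two-character rules are never reached is sketched below. It is not glibc's lookup code; it only contrasts a per-character lookup (which is how the thread suggests the table behaves) with a longest-match lookup that would let the `<U0423><U0301>` rule fire. The function names and the tiny rule table are illustrative.

```python
# Two rules from the patch, reduced to their ASCII (System B) values.
RULES = {
    "\u0423": "U",         # CYRILLIC CAPITAL LETTER U
    "\u0423\u0301": "U`",  # U followed by COMBINING ACUTE ACCENT
}


def per_character(text, rules):
    """Look up one character at a time; multi-character keys never match."""
    return "".join(rules.get(ch, ch) for ch in text)


def longest_match(text, rules):
    """Try the longest key first at each position (hypothetical alternative)."""
    max_len = max((len(key) for key in rules), default=1)
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                out.append(rules[chunk])
                i += size
                break
        else:
            # No rule applies: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the per-character lookup, the input U+0423 U+0301 becomes "U" followed by the untouched combining accent, and the two-character rule is dead weight, which is the argument for dropping it.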
> Do we need similar rules for other characters?
>
>> [...]
>> +% CYRILLIC SMALL LETTER U
>> +<U0443> <U0075>
>> +% CYRILLIC UNDEFINED
>> +<U0443><U0301> <U00FA>;"<U0075><U0060>"
>
> Same here.
>
>> [...]
>> +% CYRILLIC SMALL LETTER YA
>> +<U044F> <U00E2>;"<U0079><U0061>"
>
> Again <U0450> missing (because it is lowercase variant of <U0400>).
>
>> +% CYRILLIC SMALL LETTER IO
>> +<U0451> <U00EB>;"<U0079><U006F>"
>> [...]
>> +% CYRILLIC SMALL LETTER KJE
>> +<U045C> <U1E31>;"<U006B><U0060>"
>
> <U045D> missing (same reason as <U040D>).
>
>> +% CYRILLIC SMALL LETTER SHORT U
>> +<U045E> <U016D>;"<U0075><U0060>"
>> +% CYRILLIC SMALL LETTER DZHE
>> +<U045F> "<U0064><U0302>";"<U0064><U0068>"
>
> More letters missing here. Is this because they are historic so we
> don't want to include them now? Well, but "YUS" is also historic.
> (Please, do not remove YUS for consistency).
>
>> +% CYRILLIC CAPITAL LETTER BIG YUS
>> +<U046A> <U01CD>;"<U004F><U0060>"
>> +% CYRILLIC SMALL LETTER BIG YUS
>> +<U046B> <U01CE>;"<U006F><U0060>"
>> [...]
>
> I will continue but, again, I don't give any ETA so other reviewers
> are welcome here.
>
> Regards,
>
> Rafal
>
Bests,
Egor
[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 7328 bytes --]
From: Marko Myllynen <myllynen@redhat.com>
To: Egor Kobylkin <egor@kobylkin.com>, Rafal Luzynski <digitalfreak@lingonborough.com>
Cc: Keld Simonsen <keld@keldix.com>, libc-alpha@sourceware.org, libc-locales@sourceware.org, "Dmitry V. Levin" <ldv@altlinux.org>, Volodymyr Lisivka <vlisivka@gmail.com>, Carlos O'Donell <carlos@redhat.com>, Max Kutny <mkutny@gmail.com>, danilo@gnome.org
Subject: Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] re-submission for 2.29
Date: Wed, 10 Oct 2018 15:34:26 +0300
Message-ID: <286bc20c-db97-5244-8c26-a3a95e989361@redhat.com>
Hi,
On 2018-10-10 15:19, Egor Kobylkin wrote:
> On 10.10.2018 13:22, Marko Myllynen wrote:
>>> correct link https://sourceware.org/bugzilla/attachment.cgi?id=11303
>>
>> Although I haven't checked every rule this in general looks very good
>> (but see below).
>
>> Not sure do we want to add the few missing characters
>> mentioned at https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode,
>> e.g., one instantly notices that U+0400 is missing. (I wouldn't add at
>> least initially the more exotic characters, like the historic ones,
>> though.) Perhaps filing a bug or two for these cases for separate
>> consideration would be ok.
>
> The question here is what should serve as their transliteration and
> transcription?
Not sure, so filing a separate bug about this once your patch is merged
might be the most suitable action for now, I don't think we want to
postpone merging your work further due to these non-ISO 9 cases.
>> I'm not sure this will work, no existing rule in translit_* files
>> contain two characters, I'd assume that the rule for U+0423 is applied
>> first and then the below rule is never used.
>>
>> % CYRILLIC UNDEFINED
>> <U0423><U0301> <U00DA>;"<U0055><U0060>"
>>
>> Perhaps this should be commented out or removed altogether if it's not
>> working as intended.
>
> So yes, they are not processed. I would drop them to not to have special
> cases. But I am also fine with keeping them because all work is done
> already.
I'd probably drop them but I don't feel strongly about this either way.
Thanks for your efforts, I don't have any further comments, I'll leave
this now for Rafal and Mike to provide additional feedback and hopefully
merge soon.
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 12:13 ` Rafal Luzynski
` (2 preceding siblings ...)
2018-10-11 15:05 ` Egor Kobylkin
@ 2018-10-11 15:23 ` Egor Kobylkin
3 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 15:23 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales, mfabian, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
On 11.10.2018 13:04, Rafal Luzynski wrote:
> Thank you, Egor. I am looking at your patch and although I have
> not yet finished, here are some remarks:
...
>> [...]
>> [BZ #2872]
>> * localedata/locales/translit_cyrillic: add ISO 9.1995, GOST 7.79
>
> Please start "Add" with an uppercase. BTW, shouldn't it be "New file"
> instead?
"New file or Add" - I don't know. You tell me.
>
>> System A transliteration System B transcription table from Cyrillic to
>> Latin/ASCII.
>> * localedata/locales/C: add include "translit_cyrillic";"" to LC_CTYPE
>> translit section.
>
> Same, "Add" here.
>
Same, please advise.
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3
[not found] ` <20180412224352.GB2911@altlinux.org>
` (2 preceding siblings ...)
2018-10-11 2:58 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2 Egor Kobylkin
@ 2018-10-11 15:51 ` Egor Kobylkin
2018-10-12 1:04 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
` (9 subsequent siblings)
13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 15:51 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 66208 bytes --]
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]
to localedata/locales/ and include it in all your locales going forward.
Patch included inline below.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and it does so after the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
Root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on Cyrillic alphabet to an ASCII//TRANSLIT text.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].
The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically I have searched for all locales that have a
translit_start/end stanza and generated a patch for them.
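The per-locale change can be sketched as a small text transformation. This is an illustration only, not the procedure actually used to produce the patch; the function name is hypothetical. It adds the include line right before `translit_end`, and leaves sources that already have the line, or have no translit section, untouched.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_include(locale_source):
    """Insert the translit_cyrillic include before each translit_end line."""
    lines = locale_source.splitlines()
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_source  # already included, nothing to do
    out = []
    for line in lines:
        if line.strip() == "translit_end":
            out.append(INCLUDE_LINE)
        out.append(line)
    trailing_newline = "\n" if locale_source.endswith("\n") else ""
    return "\n".join(out) + trailing_newline
```

Applied to a locale file and diffed against the original, this should give a one-line hunk of the same shape as those in the patch below.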
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-11 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliteration of Cyrillic letters to Latin and/or ASCII characters.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% Each entry lists GOST 7.79 System A (Latin with diacritics) as the
+% first option and System B (plain ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond the Russian GOST 7.79 table using openly
+% available transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U WITH COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U WITH COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
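
For readers unfamiliar with multi-candidate translit entries such as
<U0436> <U017E>;"<U007A><U0068>" in translit_cyrillic above, the following
is a rough Python model of the intended fallback: try the first option
(System A) and, if the target charset cannot represent it, fall back to the
second option (System B). This is only a sketch under stated assumptions and
not glibc's actual implementation; the TABLE excerpt, the transliterate()
helper, and the "?" placeholder for unmapped characters are all hypothetical
names chosen for illustration.

```python
# Hypothetical excerpt of the table: each Cyrillic character lists its
# candidates in order of preference (System A first, then System B).
TABLE = {
    "\u0436": ["\u017e", "zh"],   # CYRILLIC SMALL LETTER ZHE
    "\u0449": ["\u015d", "shh"],  # CYRILLIC SMALL LETTER SHCHA
    "\u0443": ["u"],              # CYRILLIC SMALL LETTER U
    "\u043a": ["k"],              # CYRILLIC SMALL LETTER KA
}


def can_encode(text, encoding):
    """Return True if every character of text exists in the target charset."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding):
    """Replace unencodable characters by their first encodable candidate."""
    out = []
    for ch in text:
        if can_encode(ch, encoding):
            out.append(ch)  # already representable, keep as-is
            continue
        for candidate in TABLE.get(ch, []):
            if can_encode(candidate, encoding):
                out.append(candidate)
                break
        else:
            out.append("?")  # assumed placeholder when no candidate fits
    return "".join(out)


if __name__ == "__main__":
    word = "\u0436\u0443\u043a"  # "zhuk" written in Cyrillic
    print(transliterate(word, "ascii"))
    print(transliterate(word, "iso-8859-15"))
```

With an ASCII target only the second option is usable, whereas ISO-8859-15
contains U+017E and so the first option is kept for ZHE. SHCHA still falls
back to "shh" there because U+015D is not part of ISO-8859-15.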
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56416 bytes --]
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 for Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v2
2018-10-11 15:05 ` Egor Kobylkin
@ 2018-10-11 21:33 ` Egor Kobylkin
0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-11 21:33 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales
Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
Max Kutny, danilo
On 11.10.2018 16:59, Egor Kobylkin wrote:
> On 11.10.2018 13:04, Rafal Luzynski wrote:
>>> COMMIT MESSAGE:
>>> [...]
>>> I am excluding these locales from this proposed patch. I have written
>>> directly to locale maintainer emails listed in the files. Volodymyr
>>> Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
>>> Данило Шеган <danilo@gnome.org> (sr_YU, sr_CS) have confirmed the
>>
>> I am not sure if we want Cyrillic text in the commit message. Shouldn't
>> it be, uhm, transliterated? :-)
>
> I do not see any Cyrillic text in the commit message.
> The ?????? you see are the actual "?" symbols that iconv currently outputs.
>
>>
>> "sr_CS" - I guess you meant "sr_RS".
>>
>> "sr_YU" has been dropped, do we want to mention it?
>
> The list of locales and the patch itself is generated from the actual
> locales - I do not hand pick them, only exclude the ones in the
> exclusion list above.
Ah, yes, that message above should read sr_RS. Will fix.
There is no sr_YU anymore indeed, so I will drop it. No changes to the
patch, just the commit message.
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4
[not found] ` <20180412224352.GB2911@altlinux.org>
` (3 preceding siblings ...)
2018-10-11 15:51 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v3 Egor Kobylkin
@ 2018-10-12 1:04 ` Egor Kobylkin
2018-10-12 14:08 ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
` (8 subsequent siblings)
13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-12 1:04 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski, Marko Myllynen
Cc: Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 66202 bytes --]
Dear locale maintainers,
this patch fixes glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
by adding the Cyrillic transliteration table file translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]
to localedata/locales/. Please include it in all your locales going forward.
The patch is included inline below.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on whether
and how to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
That is, it produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
Root problem and the fix:
The root problem is the missing transliteration table, which I am
supplying here. Furthermore, the table has to be included in the active
locale when that locale is compiled, so that iconv can use it.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in the Cyrillic alphabet to ASCII via //TRANSLIT.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9.1995.
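
The two iconv examples differ only in the target charset. A minimal
Python sketch of the idea follows. It assumes that, for each character
the target charset cannot represent, the first listed candidate that the
target can represent is used. The six-letter TABLE excerpt and the
function names are hypothetical; this only models the table's candidate
order and is not glibc's implementation.

```python
# Hypothetical excerpt: each Cyrillic letter maps to an ordered list of
# candidates (System A first, System B second), as in the table entries.
TABLE = {
    "\u0436": ["\u017e", "zh"],  # SMALL ZHE: z with caron, then "zh"
    "\u0447": ["\u010d", "ch"],  # SMALL CHE: c with caron, then "ch"
    "\u0430": ["a"],             # SMALL A
    "\u0439": ["j"],             # SMALL SHORT I
    "\u043a": ["k"],             # SMALL KA
    "\u0443": ["u"],             # SMALL U
}


def encodable(s, charset):
    """Return True if every character of s exists in the target charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Replace each unencodable character by its first encodable candidate."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        # Candidates are tried in table order; "?" mirrors the failure
        # output quoted in the bug description.
        candidates = TABLE.get(ch, [])
        out.append(next((c for c in candidates if encodable(c, charset)), "?"))
    return "".join(out)


print(translit("\u0436\u0443\u043a", "ascii"))       # ASCII target
print(translit("\u0436\u0443\u043a", "iso8859_15"))  # ISO-8859-15 target
```

In this model a letter whose first candidate is missing from ISO-8859-15
(for example c with caron) falls through to its ASCII candidate even
when the target is ISO-8859-15.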
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters yet
can still be read by a native speaker. Among other things, this is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, do not have the
corresponding fonts installed, or cannot handle UTF-8.
The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (the mapping) is based on the ISO 9.1995 standard [10] and
its derivative GOST 7.79-2000, whose official source is the Federal Agency
on Technical Regulating and Metrology of the Russian Federation [2]. In
practice, an independent but mostly identical source [3] was used, and the
mapping was prepared in a spreadsheet [6].
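
For readers unfamiliar with the <Uxxxx> notation used in the table, here
is a small Python sketch that decodes one entry into its source letter
and its ordered candidates. The helper names are hypothetical, and the
parsing is simplified: it assumes no ";" or escape characters occur
inside a quoted candidate, which holds for the entries in this file.

```python
import re

# Matches one <Uxxxx> code point reference as used in locale sources.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn '<U005A><U0048>' (optionally double-quoted) into 'ZH'."""
    token = token.strip().strip('"')
    return _UCS.sub(lambda m: chr(int(m.group(1), 16)), token)


def decode_entry(line):
    """Split one table line into (source, [candidates in priority order])."""
    line = line.lstrip("+").strip()  # tolerate the leading "+" of a diff line
    source, candidates = line.split(None, 1)
    return decode(source), [decode(c) for c in candidates.split(";")]


# CYRILLIC CAPITAL LETTER ZHE: System A candidate first, System B second.
print(decode_entry('<U0416> <U017D>;"<U005A><U0048>"'))
```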
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
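
The per-locale change can be sketched in Python as follows. This is not
the script that generated the patch. The function name add_include is
hypothetical, and the rule of skipping any locale text that already
mentions "cyrillic" is an assumption taken from the exclusion criterion
stated at the top of this message.

```python
INCLUDE = 'include "translit_cyrillic";""'


def add_include(locale_source):
    """Return the locale text with the include placed before translit_end.

    Text that has no translit_end line, or that already mentions
    "cyrillic" anywhere (case-insensitive), is returned unchanged.
    """
    if "cyrillic" in locale_source.lower():
        return locale_source
    lines = locale_source.split("\n")
    for i, line in enumerate(lines):
        if line.strip() == "translit_end":
            # Insert directly above translit_end, as the hunks in the patch do.
            return "\n".join(lines[:i] + [INCLUDE] + lines[i:])
    return locale_source


example = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_include(example))
```

Because the inserted line itself contains "cyrillic", running the
function a second time on its own output leaves the text unchanged.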
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-11 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995 / GOST 7.79
System A transliteration and System B transcription table from
Cyrillic to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT (unnamed sequence)
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT (unnamed sequence)
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
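
For readers who want to check the table logic without rebuilding the locales, here is a minimal Python sketch of how the two-alternative entries in translit_cyrillic are meant to resolve for an ASCII target. It assumes that the alternatives of an entry are tried left to right and that the first one representable in the target charset wins, which only loosely models what iconv //TRANSLIT does. The names ENTRIES, decode, parse and to_ascii are illustrative and not part of glibc; the entries are copied from the table above. Multi-character sources such as <U0423><U0301> are not handled.

```python
import re

# A handful of entries copied verbatim from translit_cyrillic above.
ENTRIES = """\
% CYRILLIC CAPITAL LETTER ES
<U0421> <U0053>
% CYRILLIC SMALL LETTER IE
<U0435> <U0065>
% CYRILLIC SMALL LETTER SHA
<U0448> <U0161>;"<U0073><U0068>"
% CYRILLIC SMALL LETTER SHCHA
<U0449> <U015D>;"<U0073><U0068><U0068>"
% CYRILLIC SMALL LETTER HARD SIGN
<U044A> <U02BA>;"<U0060><U0060>"
% CYRILLIC SMALL LETTER SOFT SIGN
<U044C> <U02B9>;<U0060>
% CYRILLIC SMALL LETTER IO
<U0451> <U00EB>;"<U0079><U006F>"
"""

UCODE = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(token):
    """Turn a sequence such as "<U0073><U0068>" into the string it names."""
    return "".join(chr(int(h, 16)) for h in UCODE.findall(token))


def parse(text):
    """Map each source character to its ordered list of alternatives."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        source, targets = line.split(None, 1)
        table[decode(source)] = [decode(t) for t in targets.split(";")]
    return table


def to_ascii(text, table):
    """Pick, per character, the first alternative that is pure ASCII."""
    out = []
    for ch in text:
        if ch in table:
            out.append(next((a for a in table[ch] if a.isascii()), "?"))
        else:
            out.append(ch if ch.isascii() else "?")
    return "".join(out)


table = parse(ENTRIES)
result = to_ascii("Съешь ещё", table)
print(result)
```

With the System A alternative listed first and the System B alternative second, an ASCII target falls through to System B for every letter that needs a diacritic in System A.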
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56417 bytes --]
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first option
+% and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 for Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
* [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (4 preceding siblings ...)
2018-10-12 1:04 ` [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] v4 Egor Kobylkin
@ 2018-10-12 14:08 ` Egor Kobylkin
2018-10-13 16:58 ` Rafal Luzynski
2018-10-17 14:20 ` [PATCH v6] " Egor Kobylkin
` (7 subsequent siblings)
13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-12 14:08 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski,
Marko Myllynen, Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
Dear locale maintainers,
Please fix glibc bug 2872, "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
by adding the Cyrillic transliteration table file translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317 [7]
to localedata/locales/ and including it in all your locales going forward.
The patch is included inline below.
I have excluded from this patch the locales that already mention Cyrillic
or have a transliteration table for it:
az_AZ
iso14651_t1_common
ky_KG
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on whether
and how to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as its test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
That is, it produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
The root problem and the fix:
The root problem is the missing transliteration table, which I am
supplying here. Furthermore, for iconv to use it, the table has to be
included in the active locale when that locale is compiled.
COMMIT MESSAGE:
The translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded Cyrillic text to ASCII via //TRANSLIT.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9:1995.
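As a rough illustration of the two iconv examples above, the following
Python sketch mimics how an ordered candidate list can be resolved against
a target charset. It is not glibc's implementation: RULES is a hand-copied
excerpt of the table, and the helper names encodable and transliterate are
made up for this example. The assumption is that candidates are tried left
to right and the first one representable in the target charset wins.

```python
# Hand-copied excerpt of translit_cyrillic (illustration only).
# Each Cyrillic letter maps to an ordered list of candidates:
# the ISO 9 style letter first, the ASCII fallback second.
RULES = {
    "\u0448": ["\u0161", "sh"],  # CYRILLIC SMALL LETTER SHA: s-caron, else "sh"
    "\u0447": ["\u010d", "ch"],  # CYRILLIC SMALL LETTER CHE: c-caron, else "ch"
    "\u0443": ["u"],             # CYRILLIC SMALL LETTER U
    "\u043c": ["m"],             # CYRILLIC SMALL LETTER EM
    "\u0438": ["i"],             # CYRILLIC SMALL LETTER I
    "\u043d": ["n"],             # CYRILLIC SMALL LETTER EN
}


def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Replace characters missing from charset by their first usable candidate."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)  # already representable, keep as is
            continue
        for candidate in RULES.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            out.append("?")  # no rule or no usable candidate
    return "".join(out)


print(transliterate("\u0448\u0443\u043c", "ascii"))       # ASCII target
print(transliterate("\u0448\u0443\u043c", "iso8859_15"))  # Latin-9 target
```

With an ASCII target only the second candidate of SHA is usable, while
ISO-8859-15 contains s-caron and so keeps the first one. CHE falls back to
"ch" in both cases because c-caron is not part of ISO-8859-15.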
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
transliteration/transcription contains only Latin/ASCII characters yet can
still be read by a native speaker. Among other things, this is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, do not have the
corresponding fonts installed, or cannot handle UTF-8.
The transliteration table itself is attached as the file translit_cyrillic
[7]. Its mapping is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. In
practice, an independent but mostly identical source [3] was used, and the
table was prepared in a spreadsheet [6].
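For readers unfamiliar with the table syntax, here is a minimal Python
sketch that decodes one entry of translit_cyrillic into plain strings. The
function names decode and parse_rule are hypothetical, and the sketch only
handles the <Uxxxx> notation used in this table, not the literal characters
that some other locale files mix in.

```python
import re

# One <Uxxxx> code point reference in locale-source notation.
_CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode(symbols):
    """Turn '<U0063><U0068>' (optionally double-quoted) into 'ch'."""
    return "".join(chr(int(h, 16)) for h in _CODEPOINT.findall(symbols))


def parse_rule(line):
    """Parse one table line into (source, [candidates]).

    Returns None for comments, blank lines and keywords such as
    translit_end. A leading '+' from a unified diff is tolerated.
    """
    line = line.lstrip("+").strip()
    if not line.startswith("<U"):
        return None
    source, _, targets = line.partition(" ")
    # Alternatives are separated by ';' and tried in the order given.
    return decode(source), [decode(t) for t in targets.split(";")]


print(parse_rule('+<U0447> <U010D>;"<U0063><U0068>"'))
```

The entry for CYRILLIC SMALL LETTER CHE decodes to the source letter plus
two candidates: LATIN SMALL LETTER C WITH CARON and the ASCII digraph "ch".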
According to the documentation [5], a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE:
http://man7.org/linux/man-pages/man5/locale.5.html
In practice, I searched for all locales that have a
translit_start/translit_end stanza and generated a patch for them.
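The mail does not say how the per-locale hunks were generated. One way such
an edit could be automated is sketched below in Python, under the assumption
that the include line goes directly before translit_end, as it does in the
hunks of this patch. The function name add_cyrillic_include is hypothetical.

```python
INCLUDE_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_source):
    """Return locale_source with the include added before the first translit_end.

    The text is returned unchanged if the include is already present or if
    the locale has no translit section at all.
    """
    lines = locale_source.splitlines()
    if any(line.strip() == INCLUDE_LINE for line in lines):
        return locale_source  # already included, leave the file untouched
    for index, line in enumerate(lines):
        if line.strip() == "translit_end":
            lines.insert(index, INCLUDE_LINE)
            return "\n".join(lines) + "\n"
    return locale_source  # no translit section: out of scope for this sketch


example = (
    "translit_start\n"
    'include "translit_combining";""\n'
    "translit_end\n"
    "END LC_CTYPE\n"
)
print(add_cyrillic_include(example))
```

Placing the new include last in the section means that rules defined
earlier in the same locale are listed before the generic Cyrillic table.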
Cyrillic transliteration of, e.g., Russian text may already have worked to
some extent in the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA locales, which
have their transliteration tables included inline.
I am excluding these locales from the proposed patch. I have written
directly to the maintainer email addresses listed in the locale files.
Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA) and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (only
available as a low-quality GIF)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11317
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-11 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995 / GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/C: Add include "translit_cyrillic";"" to LC_CTYPE
translit section.
* localedata/locales/aa_DJ: Likewise.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/sd_PK: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 (Russian) using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U with COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT (no precomposed code point)
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56418 bytes --]
diff -uNr a/localedata/locales/C b/localedata/locales/C
--- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
@@ -2293,6 +2293,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/aa_DJ 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/af_ZA 2018-10-11 15:10:43.000000000 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ak_GH 2018-10-11 15:10:43.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
@@ -1394,6 +1394,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-11 15:10:11.000000000 +0000
+++ b/localedata/locales/ar_EG 2018-10-11 15:10:43.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/be_BY 2018-10-11 15:10:43.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bem_ZM 2018-10-11 15:10:43.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_DZ 2018-10-11 15:10:43.000000000 +0000
@@ -165,6 +165,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/ber_MA 2018-10-11 15:10:44.000000000 +0000
@@ -85,6 +85,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bg_BG 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bi_VU 2018-10-11 15:10:44.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bn_BD 2018-10-11 15:10:44.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-11 15:10:12.000000000 +0000
+++ b/localedata/locales/bo_CN 2018-10-11 15:10:44.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ca_ES 2018-10-11 15:10:44.000000000 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/ce_RU 2018-10-11 15:10:44.000000000 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cmn_TW 2018-10-11 15:10:44.000000000 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cs_CZ 2018-10-11 15:10:44.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cv_RU 2018-10-11 15:10:44.000000000 +0000
@@ -108,6 +108,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/cy_GB 2018-10-11 15:10:44.000000000 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/da_DK 2018-10-11 15:10:44.000000000 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/de_DE 2018-10-11 15:10:44.000000000 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dv_MV 2018-10-11 15:10:44.000000000 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/dz_BT 2018-10-11 15:10:44.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-11 15:10:13.000000000 +0000
+++ b/localedata/locales/el_GR 2018-10-11 15:10:44.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_GB 2018-10-11 15:10:44.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_NG 2018-10-11 15:10:45.000000000 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/en_ZM 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_CU 2018-10-11 15:10:45.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-11 15:10:14.000000000 +0000
+++ b/localedata/locales/es_ES 2018-10-11 15:10:45.000000000 +0000
@@ -72,6 +72,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/et_EE 2018-10-11 15:10:45.000000000 +0000
@@ -112,6 +112,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fa_IR 2018-10-11 15:10:45.000000000 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ff_SN 2018-10-11 15:10:45.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fi_FI 2018-10-11 15:10:45.000000000 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/fr_FR 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/ga_IE 2018-10-11 15:10:45.000000000 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gd_GB 2018-10-11 15:10:45.000000000 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gu_IN 2018-10-11 15:10:45.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/gv_GB 2018-10-11 15:10:45.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/he_IL 2018-10-11 15:10:45.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hi_IN 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hif_FJ 2018-10-11 15:10:45.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-11 15:10:15.000000000 +0000
+++ b/localedata/locales/hr_HR 2018-10-11 15:10:45.000000000 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ht_HT 2018-10-11 15:10:45.000000000 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hu_HU 2018-10-11 15:10:46.000000000 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/hy_AM 2018-10-11 15:10:46.000000000 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/id_ID 2018-10-11 15:10:46.000000000 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/is_IS 2018-10-11 15:10:46.000000000 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/it_IT 2018-10-11 15:10:46.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ja_JP 2018-10-11 15:10:46.000000000 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kab_DZ 2018-10-11 15:10:46.000000000 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kk_KZ 2018-10-11 15:10:46.000000000 +0000
@@ -157,6 +157,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/km_KH 2018-10-11 15:10:46.000000000 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kn_IN 2018-10-11 15:10:46.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ko_KR 2018-10-11 15:10:47.000000000 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ks_IN 2018-10-11 15:10:47.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/kw_GB 2018-10-11 15:10:47.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lb_LU 2018-10-11 15:10:47.000000000 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lg_UG 2018-10-11 15:10:47.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/lij_IT 2018-10-11 15:10:47.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-11 15:10:16.000000000 +0000
+++ b/localedata/locales/ln_CD 2018-10-11 15:10:47.000000000 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lo_LA 2018-10-11 15:10:47.000000000 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lt_LT 2018-10-11 15:10:47.000000000 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/lv_LV 2018-10-11 15:10:47.000000000 +0000
@@ -110,6 +110,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mg_MG 2018-10-11 15:10:47.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mhr_RU 2018-10-11 15:10:47.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mk_MK 2018-10-11 15:10:47.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ml_IN 2018-10-11 15:10:47.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ms_MY 2018-10-11 15:10:48.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/mt_MT 2018-10-11 15:10:48.000000000 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-11 15:10:48.000000000 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nb_NO 2018-10-11 15:10:48.000000000 +0000
@@ -154,6 +154,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/ne_NP 2018-10-11 15:10:48.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nhn_MX 2018-10-11 15:10:48.000000000 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NU 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/niu_NZ 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nl_NL 2018-10-11 15:10:48.000000000 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/nr_ZA 2018-10-11 15:10:48.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/oc_FR 2018-10-11 15:10:48.000000000 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-11 15:10:17.000000000 +0000
+++ b/localedata/locales/om_KE 2018-10-11 15:10:48.000000000 +0000
@@ -138,6 +138,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/or_IN 2018-10-11 15:10:48.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/os_RU 2018-10-11 15:10:48.000000000 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_IN 2018-10-11 15:10:48.000000000 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pa_PK 2018-10-11 15:10:48.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pl_PL 2018-10-11 15:10:48.000000000 +0000
@@ -116,6 +116,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/pt_PT 2018-10-11 15:10:48.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/quz_PE 2018-10-11 15:10:48.000000000 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ro_RO 2018-10-11 15:10:49.000000000 +0000
@@ -143,6 +143,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/ru_RU 2018-10-11 15:10:49.000000000 +0000
@@ -73,6 +73,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/rw_RW 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sa_IN 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN 2018-10-11 15:10:49.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
--- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/se_NO 2018-10-11 15:10:49.000000000 +0000
@@ -204,6 +204,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sgs_LT 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/shn_MM 2018-10-11 15:10:49.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/si_LK 2018-10-11 15:10:49.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sk_SK 2018-10-11 15:10:49.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sl_SI 2018-10-11 15:10:49.000000000 +0000
@@ -90,6 +90,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sm_WS 2018-10-11 15:10:49.000000000 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/so_SO 2018-10-11 15:10:49.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-11 15:10:18.000000000 +0000
+++ b/localedata/locales/sq_AL 2018-10-11 15:10:49.000000000 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ss_ZA 2018-10-11 15:10:49.000000000 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/st_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sv_SE 2018-10-11 15:10:50.000000000 +0000
@@ -138,6 +138,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/sw_KE 2018-10-11 15:10:50.000000000 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ta_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/te_IN 2018-10-11 15:10:50.000000000 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/th_TH 2018-10-11 15:10:50.000000000 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/ti_ET 2018-10-11 15:10:50.000000000 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tn_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/to_TO 2018-10-11 15:10:50.000000000 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-11 15:10:19.000000000 +0000
+++ b/localedata/locales/tpi_PG 2018-10-11 15:10:50.000000000 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/tr_TR 2018-10-11 15:10:50.000000000 +0000
@@ -2423,6 +2423,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers the Unicode range https://www.unicode.org/charts/PDF/U0400.pdf,
+% i.e. [U0401-U04F9, U2019], but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended beyond GOST 7.79 for Russian using openly
+% available transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with
+% https://sourceware.org/bugzilla/attachment.cgi?id=11301.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0065>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0069>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0068>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0068>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0068>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ts_ZA 2018-10-11 15:10:50.000000000 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/unm_US 2018-10-11 15:10:51.000000000 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_IN 2018-10-11 15:10:51.000000000 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ur_PK 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/ve_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/vi_VN 2018-10-11 15:10:51.000000000 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wa_BE 2018-10-11 15:10:51.000000000 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/wo_SN 2018-10-11 15:10:51.000000000 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/xh_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yi_US 2018-10-11 15:10:51.000000000 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/yuw_PG 2018-10-11 15:10:51.000000000 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zh_CN 2018-10-11 15:10:51.000000000 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-11 15:10:20.000000000 +0000
+++ b/localedata/locales/zu_ZA 2018-10-11 15:10:51.000000000 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-10-12 14:08 ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-10-13 16:58 ` Rafal Luzynski
2018-10-13 21:16 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-13 16:58 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen,
Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
Egor,
Thank you for the update. I took a closer look at your patch so this
time my review is more complete than before although not yet fully complete.
As far as I understand, ISO-9 and its GOST variants are meant to be
universal rather than Russian-specific. Therefore it is correct to place
them in the external file, like translit_cyrillic, and then include this
file in other locales adding locale specific modifications, if required.
For example, if there are any Russian-specific rules not included in this
file, they should go to ru_RU.
The text of the ISO-9 standard is not publicly available. Have we got
anything better than the Wikipedia article?
Regarding the format of your commit message, I hesitate to say anything
more because there are more experienced maintainers around here. Please
take a look at the Contribution Checklist. [1]
While at it, what is your legal relationship with the glibc project? Have
you signed the FSF Copyright Assignment? It is not necessary for the locale
data but it might be necessary if you are going to contribute the testing code.
Regarding the tests, I think there is no complete transliteration test
suite at the moment. Probably the only test is localedata/bug-iconv-trans.c.
You can also see the collation tests placed in the same directory, they
use those multiple *.UTF-8.in files.
You can skip the tests for now.
Technical issue: Please either attach your patch to the email message or
paste it inline, not both. The patch as it is now is not applicable.
I had to edit it manually to apply.
12.10.2018 16:05 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> From this patch I have excluded locales that already mention cyrillic or
> have a transliteration table for it:
> az_AZ
> iso14651_t1_common
> ky_KG
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
I confirm that these locales are excluded and there are no other missing
locales.
> [...]
>
> diff -uNr a/localedata/locales/C b/localedata/locales/C
> --- a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000
> +++ b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
There is no such file. Where have you got the source code from? Are you
sure this is glibc? :-)
> [...]
> diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
> --- a/localedata/locales/am_ET 2018-10-11 15:10:11.000000000 +0000
> +++ b/localedata/locales/am_ET 2018-10-11 15:10:43.000000000 +0000
> @@ -1394,6 +1394,7 @@
> <U137A> <U0060><U0039><U0030>
> <U137B> <U0060><U0031><U0030><U0030>
> <U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
> +include "translit_cyrillic";""
> translit_end
> %
> END LC_CTYPE
Shouldn't “include "translit_cyrillic";""” be placed before the custom rules,
together with other includes? The same in more files, I will not mention
them all.
> [...]
> diff -uNr a/localedata/locales/sd_IN@devanagari
> b/localedata/locales/sd_IN@devanagari
> --- a/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:18.000000000
> +0000
> +++ b/localedata/locales/sd_IN@devanagari 2018-10-11 15:10:49.000000000
> +0000
Those 3 lines have been broken by the email agent, the patch is not applicable.
> [...]
> diff -uNr a/localedata/locales/sd_PK b/localedata/locales/sd_PK
> --- a/localedata/locales/sd_PK 2018-10-11 15:10:18.000000000 +0000
> +++ b/localedata/locales/sd_PK 2018-10-11 15:10:49.000000000 +0000
There is no such file in glibc.
> [...]
> diff -uNr a/localedata/locales/translit_cyrillic
> b/localedata/locales/translit_cyrillic
> --- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000
> +0000
> +++ b/localedata/locales/translit_cyrillic 2018-10-11 15:10:52.000000000
> +0000
Again 3 lines broken, the patch is not applicable.
> [...]
> +% Contributions welcome for the rest of Cyrillic script in Unicode
> +% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
I am still tempted to add more Cyrillic characters but I understand
that it must be clearly separated which transliteration rules come from
ISO-9 and which are our own invention. But that's not for now.
> [...]
> +translit_start
> +
> +% CYRILLIC CAPITAL LETTER IO
> +<U0401> <U00CB>;"<U0059><U004F>"
This says that for ASCII (GOST 7.79 System B) you would like to transliterate
"Ё" as "YO" but the table in Wikipedia says "Yo". I understand that one or
another may be correct depending on the context but we should be consistent
and also better let's stick with the standard.
> +% CYRILLIC CAPITAL LETTER DJE
> +<U0402> <U0110>;"<U0044><U004A>"
This says "DJ" but System B does not mention it. Where does it come from?
Also, I think it should be "Dj" rather than "DJ".
> +% CYRILLIC CAPITAL LETTER GJE
> +<U0403> <U01F4>;"<U0047><U0060>"
Correct, according to both systems.
> +% CYRILLIC CAPITAL LETTER UKRAINIAN IE
> +<U0404> <U00CA>;"<U0059><U0065>"
"Ye" - correct.
> +% CYRILLIC CAPITAL LETTER DZE
> +<U0405> <U1E90>;"<U005A><U0060>"
Correct.
> +% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
> +<U0406> <U00CC>;<U0049>
Correct. The table mentions an alternative transliteration "I`" but
says that it is "only before vowels for Old Russian and Old Bulgarian".
I think we can skip this other variant.
> +% CYRILLIC CAPITAL LETTER YI
> +<U0407> <U00CF>;"<U0059><U0069>"
"Yi" - correct.
> +% CYRILLIC CAPITAL LETTER JE
> +<U0408> "<U004A><U030C>";<U004A>
Correct.
> +% CYRILLIC CAPITAL LETTER LJE
> +<U0409> "<U004C><U0302>";"<U004C><U0060>"
Correct, according to the standard. If Serbian language requires "Lj"
then overrides should go to sr_RS file.
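The layering suggested here can be sketched as follows. This is only an illustration in Python, not how glibc stores its tables: the table names are made up, and the "Lj" override is the hypothetical sr_RS rule mentioned above, not something that exists in the patch.

```python
from collections import ChainMap

# Shared rules, copied from the proposed translit_cyrillic (ASCII fallbacks only).
SHARED_CYRILLIC = {
    "\u0409": "L`",  # CYRILLIC CAPITAL LETTER LJE
    "\u040a": "N`",  # CYRILLIC CAPITAL LETTER NJE
}

# Hypothetical locale-specific override, as it might be written for sr_RS.
SR_RS_OVERRIDES = {
    "\u0409": "Lj",
}

# ChainMap searches its maps left to right, so the locale's own rules win
# and everything else falls through to the shared table.
sr_rs_table = ChainMap(SR_RS_OVERRIDES, SHARED_CYRILLIC)

print(sr_rs_table["\u0409"])  # taken from the locale override
print(sr_rs_table["\u040a"])  # taken from the shared table
```

The point of the sketch is that the shared file can stay faithful to the standard while each locale carries only its own deviations.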
> +% CYRILLIC CAPITAL LETTER NJE
> +<U040A> "<U004E><U0302>";"<U004E><U0060>"
Correct, the same comment.
> +% CYRILLIC CAPITAL LETTER TSHE
> +<U040B> <U0106>;"<U0054><U0053><U0048>"
Where does "TSH" come from? It is not mentioned by the System B table.
Also I am afraid this is not correct.
> +% CYRILLIC CAPITAL LETTER KJE
> +<U040C> <U1E30>;"<U004B><U0060>"
Correct.
> +% CYRILLIC CAPITAL LETTER SHORT U
> +<U040E> <U016C>;"<U0055><U0060>"
"U`" - correct.
> +% CYRILLIC CAPITAL LETTER DZHE
> +<U040F> "<U0044><U0302>";"<U0044><U0068>"
"Dh" - correct.
> [...]
> +% CYRILLIC CAPITAL LETTER ZHE
> +<U0416> <U017D>;"<U005A><U0048>"
"ZH" - shouldn't be "Zh"?
> [...]
> +% CYRILLIC UNDEFINED
> +<U0423><U0301> <U00DA>;"<U0055><U0060>"
1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
2. OK, the System A table mentions this letter but System B does not.
Somehow we should handle it. I think that "U`" is the best we can
do for now.
3. It must be tested whether this actually works.
> [...]
> +% CYRILLIC CAPITAL LETTER HA
> +<U0425> <U0048>;<U0058>
I don't think "H" is unavailable in any encoding, so this letter will
always be transliterated as "H" and never as "X". We can't help it and
I don't think it is bad.
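The fallback behaviour described here can be sketched as below. It is a simplified Python model under the assumption that the alternatives of a rule are tried left to right and the first one representable in the target charset is used; `pick_candidate` is a made-up helper, not a glibc function.

```python
def pick_candidate(candidates, target_charset):
    """Return the first candidate that the target charset can encode."""
    for candidate in candidates:
        try:
            candidate.encode(target_charset)
        except UnicodeEncodeError:
            continue  # not representable, try the next alternative
        return candidate
    return None  # no alternative fits


# <U0425> <U0048>;<U0058>            (CYRILLIC CAPITAL LETTER HA)
HA_RULE = ["H", "X"]
# <U0401> <U00CB>;"<U0059><U004F>"   (CYRILLIC CAPITAL LETTER IO)
IO_RULE = ["\u00cb", "YO"]

results = [
    pick_candidate(HA_RULE, "ascii"),    # "H" always fits, "X" is never reached
    pick_candidate(IO_RULE, "ascii"),    # U+00CB does not fit, falls back to "YO"
    pick_candidate(IO_RULE, "latin-1"),  # U+00CB fits in Latin-1
]
print(ascii(results))
```

Under this model a second alternative is only useful when the first one contains a non-ASCII character, which is why the "X" in the HA rule is dead for every target charset.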
> +% CYRILLIC CAPITAL LETTER TSE
> +<U0426> <U0043>;"<U0043><U005A>"
1. "CZ" - maybe should be "Cz"?
2. Are we able to implement the rule: "c before i, e, y, j"?
> +% CYRILLIC CAPITAL LETTER CHE
> +<U0427> <U010C>;"<U0043><U0048>"
"CH" -> "Ch"?
> +% CYRILLIC CAPITAL LETTER SHA
> +<U0428> <U0160>;"<U0053><U0048>"
"SH" -> "Sh"?
> +% CYRILLIC CAPITAL LETTER SHCHA
> +<U0429> <U015C>;"<U0053><U0048><U0048>"
"SHH" -> "Shh"?
> +% CYRILLIC CAPITAL LETTER HARD SIGN
> +<U042A> <U02BA>;"<U0041><U0060>"
"A`" is only for Bulgarian and should go to bg_BG. How should
we transliterate an upper case hard sign to plain ASCII? I think
that just "``", same as lower case.
> +% CYRILLIC CAPITAL LETTER YERU
> +<U042B> <U0059>;"<U0059><U0060>"
Again, as "Y" is always available it will never be transliterated
as "Y`".
> +% CYRILLIC CAPITAL LETTER SOFT SIGN
> +<U042C> <U02B9>;<U0060>
OK, I like it to be transliterated to plain ASCII as "`".
> +% CYRILLIC CAPITAL LETTER E
> +<U042D> <U00C8>;"<U0045><U0060>"
OK
> +% CYRILLIC CAPITAL LETTER YU
> +<U042E> <U00DB>;"<U0059><U0055>"
"YU" -> "Yu"?
> +% CYRILLIC CAPITAL LETTER YA
> +<U042F> <U00C2>;"<U0059><U0041>"
"YA" -> "Ya"?
> [...]
I am sorry, this is of course incomplete but that's enough for tonight.
Regards,
Rafal
[1] https://sourceware.org/glibc/wiki/Contribution%20checklist
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-10-13 16:58 ` Rafal Luzynski
@ 2018-10-13 21:16 ` Egor Kobylkin
2018-10-15 11:15 ` Marko Myllynen
2018-10-24 0:12 ` Rafal Luzynski
0 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-13 21:16 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales
Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 7730 bytes --]
Hi Rafal,
Thanks for the thorough checking, it really helps.
On 13.10.2018 02:59, Rafal Luzynski wrote:
> Technical issue: Please either attach your patch to the email
> message or paste it inline, not both. The patch as it is now is not
> applicable. I had to edit it manually to apply.
>> diff -uNr a/localedata/locales/C b/localedata/locales/C ---
>> a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000 +++
>> b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
>
> There is no such file. Where have you got the source code from?
> Are you sure this is glibc? :-)
I was running my patch process against the Ubuntu 18.04 version of
localedata/locales. Now I have checked out the GitHub glibc source v2.28
and done the same. Please find the new patch attached. I am not
submitting it as a patch request because we have not yet addressed the
rest of your comments below. But at least this should be working as a
patch for you. Please let me know if there is any problem there still.
>> [...] From this patch I have excluded locales that already mention
>> cyrillic or have a transliteration table for it: az_AZ
>> iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ
>> uz_UZ@cyrillic
>
> I confirm that these locales are excluded and there are no other
> missing locales.
Because of the surprisingly different list of locales between Ubuntu and
glibc there is now a different list of excluded ones as well.
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
az_AZ, ky_KG are now included because they don't have cyrillic translit
in glibc. iso14651_t1_common is still implicitly excluded, because it
doesn't have 'translit_end' string.
Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
after the patch is applied (az_AZ explicitly includes tr_TR). I do not
see a reason, maybe you could check?
> Regarding the tests, I think there is no complete transliteration
> test suite at the moment. Probably the only test is
> localedata/bug-iconv-trans.c. You can also see the collation tests
> placed in the same directory, they use those multiple *.UTF-8.in
> files.
>
> You can skip the tests for now.
In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
change the list of the symbols we are now transliterating
const char str[] = "ÄäÖöÜüß";
const char expected[] = "AEaeOEoeUEuess";
like this
const char str[] =
"ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩÑЫÑÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑЪÑЬÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒ
ÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â"
const char expected[] =
"YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh
shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`
T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`
Y`y`'";
At first I thought they could just be added, but not all locales
transliterate umlauts, so just extending the current test won't do: it
would fail for those locales.
>> [...] diff -uNr a/localedata/locales/am_ET
>> b/localedata/locales/am_ET --- a/localedata/locales/am_ET
>> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET
>> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A>
>> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C>
>> <U0060><U0031><U0030><U0030><U0030><U0030> +include
>> "translit_cyrillic";"" translit_end % END LC_CTYPE
>
> Shouldn't “include "translit_cyrillic";""” be placed before the
> custom rules, together with other includes? The same in more files,
> I will not mention them all.
If I recall correctly it is because of the
"translit_end
END LC_CTYPE"
part at the end of the translit_cyrillic. This way it works for any
locale, regardless whether it has translit itself or not. And being at
the end it does not supersede any previous transliteration that may be
there for a reason.
As with some other comments, I am not super familiar with the formats of
glibc files. So if you have a definitive suggestion - pls. formulate it
as an imperative, not a question.
>> [...] +translit_start + +% CYRILLIC CAPITAL LETTER IO +<U0401>
>> <U00CB>;"<U0059><U004F>"
>
> This says that for ASCII (GOST 7.79 System B) you would like to
> transliterate "Ё" as "YO" but the table in Wikipedia says "Yo". I
> understand that one or another may be correct depending on the
> context but we should be consistent and also better let's stick with
> the standard.
The choice for YO, SH, YA, ZH etc. is to avoid naming collisions, for
example for "Сх" and "Ш" that would both transliterate to Sh:
With SH:"Схема"->"Shema" but "Шема"->"SHema"
With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
This is important e.g. for renaming files, grouping as in using uniq etc.
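The collision can be reproduced with a small Python sketch. The two tables are illustrative subsets written for this example (only the letters needed for the two words, using the ASCII values from the example above); they are not the full patch table.

```python
# Subset of the mapping with the all-caps digraph, as in the patch ("SH").
UPPER_DIGRAPH = {
    "С": "S", "х": "h", "Ш": "SH",
    "е": "e", "м": "m", "а": "a",
}

# Same subset, but with the title-case digraph ("Sh") from the Wikipedia table.
TITLE_DIGRAPH = dict(UPPER_DIGRAPH)
TITLE_DIGRAPH["Ш"] = "Sh"


def transliterate(text, table):
    """Map each character through the table; unknown characters become '?'."""
    return "".join(table.get(ch, "?") for ch in text)


words = ["Схема", "Шема"]

# With "SH" the two inputs stay distinct.
print([transliterate(w, UPPER_DIGRAPH) for w in words])

# With "Sh" both inputs map to the same ASCII string.
print([transliterate(w, TITLE_DIGRAPH) for w in words])
```

With the "Sh" variant, a tool such as uniq or a bulk file rename would treat the two different source names as one.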
>
>> +% CYRILLIC CAPITAL LETTER DJE +<U0402> <U0110>;"<U0044><U004A>"
>
> This says "DJ" but System B does not mention it. Where does it come
> from? Also, I think it should be "Dj" rather than "DJ".
I took the first two letters from its name.
>> [...] +% CYRILLIC UNDEFINED +<U0423><U0301>
>> <U00DA>;"<U0055><U0060>"
>
> 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> 2. OK, the System A table mentions this letter but System B does not.
> Somehow we should handle it. I think that "U`" is the best we can do
> for now. 3. It must be tested whether this actually works.
1. Let's do it just before you are ready to commit the patch, because it
breaks formulas in my worksheet and I would have to do it manually.
3. I have tested and it doesn't work/gets ignored. But if you were to
handle COMBINING it would work, wouldn't it?
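A small Python sketch of why this rule is special. It assumes, without having checked the glibc implementation, that the lookup which ignored the rule works one code point at a time; `translit_per_char` and `translit_longest` are made-up helpers for illustration only.

```python
import unicodedata

# CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
U_ACUTE = "\u0423\u0301"

# Unicode has no precomposed form for this pair, so NFC leaves two code points.
print(len(unicodedata.normalize("NFC", U_ACUTE)))

TABLE = {U_ACUTE: "U`", "\u0423": "U"}


def translit_per_char(text, table):
    """Look up one code point at a time; a two-code-point key never matches."""
    return "".join(table.get(ch, "?") for ch in text)


def translit_longest(text, table):
    """Try the longest key first so that multi-code-point keys can match."""
    max_len = max(len(key) for key in table)
    out, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += n
                break
        else:
            out.append("?")  # nothing matched at this position
            i += 1
    return "".join(out)


print(translit_per_char(U_ACUTE, TABLE))  # the combining accent is left over
print(translit_longest(U_ACUTE, TABLE))   # the whole sequence is matched
```

Whatever the real cause is, the rule can only take effect if the sequence is matched as a unit, because normalization cannot reduce it to a single character.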
>> [...] +% CYRILLIC CAPITAL LETTER HA +<U0425> <U0048>;<U0058>
>
> I don't think that "H" is unavailable in any encoding therefore it
> will always be transliterated as "H" and never as "X". We can't
> help it and I don't think it is bad.
>
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.
>> +% CYRILLIC CAPITAL LETTER TSE +<U0426> <U0043>;"<U0043><U005A>"
>
> 1. "CZ" - maybe should be "Cz"?
> 2. Are we able to implement the rule: "c before i, e, y, j"?
>
1. See my answer for CYRILLIC CAPITAL LETTER IO above.
2. I am not sure what you mean here, but I believe it's not
possible as per Marko's email.
>> +% CYRILLIC CAPITAL LETTER HARD SIGN +<U042A>
>> <U02BA>;"<U0041><U0060>"
>
> "A`" is only for Bulgarian and should go to bg_BG. How should we
> transliterate an upper case hard sign to plain ASCII? I think that
> just "``", same as lower case.
This is to avoid collision. Besides AFAIK e.g. in Russian there is no
capital hard sign because there are no words starting with it.
>
>> +% CYRILLIC CAPITAL LETTER YERU +<U042B> <U0059>;"<U0059><U0060>"
>
> Again, as "Y" is always available it will never be transliterated as
> "Y`".
>
But we can keep this for when/if there is a way to explicitly request
transcription instead of transliteration.
Bests,
Diego
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56410 bytes --]
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/aa_DJ 2018-10-13 16:52:32.666374687 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/af_ZA 2018-10-13 16:52:32.442373810 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ak_GH 2018-10-13 16:52:32.774375109 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/am_ET 2018-10-13 16:52:32.466373904 +0000
@@ -893,6 +893,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-13 16:52:32.218372934 +0000
+++ b/localedata/locales/ar_EG 2018-10-13 16:52:32.806375234 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/az_AZ b/localedata/locales/az_AZ
--- a/localedata/locales/az_AZ 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/az_AZ 2018-10-13 16:52:32.494374014 +0000
@@ -136,6 +136,7 @@
<U0259> "<U00E4>"
<U018F> "<U00C4>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/be_BY 2018-10-13 16:52:32.518374107 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bem_ZM 2018-10-13 16:52:32.674374718 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_DZ 2018-10-13 16:52:32.878375516 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/ber_MA 2018-10-13 16:52:32.858375438 +0000
@@ -83,6 +83,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-13 16:52:32.222372949 +0000
+++ b/localedata/locales/bg_BG 2018-10-13 16:52:32.446373826 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bi_VU 2018-10-13 16:52:32.786375156 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bn_BD 2018-10-13 16:52:32.766375078 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/bo_CN 2018-10-13 16:52:32.930375719 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ca_ES 2018-10-13 16:52:32.930375719 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/ce_RU 2018-10-13 16:52:32.490373998 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-13 16:52:32.226372965 +0000
+++ b/localedata/locales/cmn_TW 2018-10-13 16:52:32.670374702 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cs_CZ 2018-10-13 16:52:32.874375500 +0000
@@ -215,6 +215,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-13 16:52:32.238373012 +0000
+++ b/localedata/locales/cv_RU 2018-10-13 16:52:32.610374468 +0000
@@ -103,6 +103,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/cy_GB 2018-10-13 16:52:32.434373779 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/da_DK 2018-10-13 16:52:32.894375579 +0000
@@ -169,6 +169,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/de_DE 2018-10-13 16:52:32.898375594 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dv_MV 2018-10-13 16:52:32.842375375 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/dz_BT 2018-10-13 16:52:32.838375360 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/el_GR 2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_GB 2018-10-13 16:52:32.794375187 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_NG 2018-10-13 16:52:32.626374530 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-13 16:52:32.242373028 +0000
+++ b/localedata/locales/en_ZM 2018-10-13 16:52:32.454373857 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_CU 2018-10-13 16:52:32.886375547 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/es_ES 2018-10-13 16:52:32.426373748 +0000
@@ -107,6 +107,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/et_EE 2018-10-13 16:52:32.758375046 +0000
@@ -113,6 +113,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fa_IR 2018-10-13 16:52:32.446373826 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ff_SN 2018-10-13 16:52:32.466373904 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fi_FI 2018-10-13 16:52:32.846375391 +0000
@@ -177,6 +177,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/fr_FR 2018-10-13 16:52:32.522374123 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ga_IE 2018-10-13 16:52:32.906375626 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gd_GB 2018-10-13 16:52:32.894375579 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gu_IN 2018-10-13 16:52:32.802375218 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/gv_GB 2018-10-13 16:52:32.626374530 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/he_IL 2018-10-13 16:52:32.926375704 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hi_IN 2018-10-13 16:52:32.634374561 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hif_FJ 2018-10-13 16:52:32.642374593 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hr_HR 2018-10-13 16:52:32.870375485 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/ht_HT 2018-10-13 16:52:32.798375203 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hu_HU 2018-10-13 16:52:32.518374107 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/hy_AM 2018-10-13 16:52:32.766375078 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/id_ID 2018-10-13 16:52:32.522374123 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-13 16:52:32.246373043 +0000
+++ b/localedata/locales/is_IS 2018-10-13 16:52:32.606374452 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/it_IT 2018-10-13 16:52:32.770375093 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ja_JP 2018-10-13 16:52:32.754375031 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kab_DZ 2018-10-13 16:52:32.922375688 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kk_KZ 2018-10-13 16:52:32.866375469 +0000
@@ -99,6 +99,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/km_KH 2018-10-13 16:52:32.598374421 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kn_IN 2018-10-13 16:52:32.762375062 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ko_KR 2018-10-13 16:52:32.582374358 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ks_IN 2018-10-13 16:52:32.510374076 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/kw_GB 2018-10-13 16:52:32.790375171 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ky_KG b/localedata/locales/ky_KG
--- a/localedata/locales/ky_KG 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ky_KG 2018-10-13 16:52:32.410373685 +0000
@@ -82,6 +82,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lb_LU 2018-10-13 16:52:32.874375500 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lg_UG 2018-10-13 16:52:32.430373763 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lij_IT 2018-10-13 16:52:32.782375140 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ln_CD 2018-10-13 16:52:32.438373795 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lo_LA 2018-10-13 16:52:32.530374154 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lt_LT 2018-10-13 16:52:32.602374436 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/lv_LV 2018-10-13 16:52:32.794375187 +0000
@@ -125,6 +125,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mg_MG 2018-10-13 16:52:32.486373982 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mhr_RU 2018-10-13 16:52:32.866375469 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mk_MK 2018-10-13 16:52:32.598374421 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ml_IN 2018-10-13 16:52:32.610374468 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/ms_MY 2018-10-13 16:52:32.638374577 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/mt_MT 2018-10-13 16:52:32.890375563 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-13 16:52:32.530374154 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-13 16:52:32.262373106 +0000
+++ b/localedata/locales/nb_NO 2018-10-13 16:52:32.778375125 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ne_NP 2018-10-13 16:52:32.842375375 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nhn_MX 2018-10-13 16:52:32.766375078 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NU 2018-10-13 16:52:32.802375218 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/niu_NZ 2018-10-13 16:52:32.850375407 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nl_NL 2018-10-13 16:52:32.602374436 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/nr_ZA 2018-10-13 16:52:32.918375673 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/oc_FR 2018-10-13 16:52:32.818375281 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/om_KE 2018-10-13 16:52:32.918375673 +0000
@@ -156,6 +156,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/or_IN 2018-10-13 16:52:32.926375704 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/os_RU 2018-10-13 16:52:32.910375641 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_IN 2018-10-13 16:52:32.638374577 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pa_PK 2018-10-13 16:52:32.422373732 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pl_PL 2018-10-13 16:52:32.502374045 +0000
@@ -130,6 +130,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/pt_PT 2018-10-13 16:52:32.910375641 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/quz_PE 2018-10-13 16:52:32.470373920 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ro_RO 2018-10-13 16:52:32.646374608 +0000
@@ -142,6 +142,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ru_RU 2018-10-13 16:52:32.534374170 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/rw_RW 2018-10-13 16:52:32.814375265 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sa_IN 2018-10-13 16:52:32.790375171 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN 2018-10-13 16:52:32.770375093 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-13 16:52:32.818375281 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/se_NO 2018-10-13 16:52:32.634374561 +0000
@@ -221,6 +221,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sgs_LT 2018-10-13 16:52:32.810375250 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/shn_MM 2018-10-13 16:52:32.506374060 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/si_LK 2018-10-13 16:52:32.814375265 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sk_SK 2018-10-13 16:52:32.418373716 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sl_SI 2018-10-13 16:52:32.486373982 +0000
@@ -2120,6 +2120,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sm_WS 2018-10-13 16:52:32.498374029 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/so_SO 2018-10-13 16:52:32.414373701 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sq_AL 2018-10-13 16:52:32.798375203 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ss_ZA 2018-10-13 16:52:32.846375391 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/st_ZA 2018-10-13 16:52:32.906375626 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sv_SE 2018-10-13 16:52:32.630374546 +0000
@@ -173,6 +173,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/sw_KE 2018-10-13 16:52:32.590374389 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ta_IN 2018-10-13 16:52:32.586374374 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/te_IN 2018-10-13 16:52:32.642374593 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/th_TH 2018-10-13 16:52:32.902375610 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-13 16:52:32.266373121 +0000
+++ b/localedata/locales/ti_ET 2018-10-13 16:52:32.618374499 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tn_ZA 2018-10-13 16:52:32.882375532 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/to_TO 2018-10-13 16:52:32.822375297 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-13 16:52:32.270373137 +0000
+++ b/localedata/locales/tpi_PG 2018-10-13 16:52:32.454373857 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/tr_TR 2018-10-13 16:52:32.662374671 +0000
@@ -2538,6 +2538,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-13 16:52:32.942375766 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9:1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9:1995.
+% It implements GOST 7.79 System A (Latin script) as the first
+% option and System B (ASCII) as the second option. See
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% System B is extended from GOST 7.79 (Russian) using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions are welcome for the rest of the Cyrillic script in Unicode:
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet.
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U + COMBINING ACUTE ACCENT (two-code-point sequence)
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U + COMBINING ACUTE ACCENT (two-code-point sequence)
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0043><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0053><U0048><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ts_ZA 2018-10-13 16:52:32.806375234 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/unm_US 2018-10-13 16:52:32.782375140 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_IN 2018-10-13 16:52:32.762375062 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ur_PK 2018-10-13 16:52:32.510374076 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/ve_ZA 2018-10-13 16:52:32.854375422 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/vi_VN 2018-10-13 16:52:32.826375313 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wa_BE 2018-10-13 16:52:32.850375407 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/wo_SN 2018-10-13 16:52:32.886375547 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/xh_ZA 2018-10-13 16:52:32.858375438 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yi_US 2018-10-13 16:52:32.506374060 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-13 16:52:32.274373153 +0000
+++ b/localedata/locales/yuw_PG 2018-10-13 16:52:32.494374014 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zh_CN 2018-10-13 16:52:32.862375454 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-13 16:52:32.278373168 +0000
+++ b/localedata/locales/zu_ZA 2018-10-13 16:52:32.886375547 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-10-13 21:16 ` Egor Kobylkin
@ 2018-10-15 11:15 ` Marko Myllynen
2018-10-15 12:04 ` Egor Kobylkin
2018-10-24 0:12 ` Rafal Luzynski
1 sibling, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-10-15 11:15 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales
Cc: mfabian, Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
Hi,
On 2018-10-13 19:58, Egor Kobylkin wrote:
> On 13.10.2018 02:59, Rafal Luzynski wrote:
>
>> Regarding the tests, I think there is no complete transliteration
>> test suite at the moment. Probably the only test is
>> localedata/bug-iconv-trans.c. You can also see the collation tests
>> placed in the same directory, they use those multiple *.UTF-8.in
>> files.
>>
>> You can skip the tests for now.
>
> First I thought they could just be added, but not all locales
> transliterate umlauts, so just extending the current test won't do as it
> will fail for those locales.
I still think a one-time check against uconv(1) (part of Unicode's ICU
project) for discrepancies would be worthwhile.
>>> [...] diff -uNr a/localedata/locales/am_ET
>>> b/localedata/locales/am_ET --- a/localedata/locales/am_ET
>>> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET
>>> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A>
>>> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C>
>>> <U0060><U0031><U0030><U0030><U0030><U0030> +include
>>> "translit_cyrillic";"" translit_end % END LC_CTYPE
>>
>> Shouldn't 'include "translit_cyrillic";""' be placed before the
>> custom rules, together with other includes? The same in more files,
>> I will not mention them all.
>
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.
I suspect one problem would be that the latter rule wins, so if there
are some locale-specific rules then any translit_* inclusions would
override them unless they are included before the locale-specific rules.
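
To see which locale files this concern would apply to, something like the
following shell sketch could be used. It is only an illustration: the
function name include_follows_rules is hypothetical, and it assumes that
custom rules are the lines starting with <Uxxxx> between translit_start
and translit_end. It says nothing about which rule localedef actually
prefers; it only reports the ordering in the file.

```shell
# include_follows_rules FILE
# Succeeds (exit 0) if FILE has at least one custom <Uxxxx> rule that
# appears before the translit_cyrillic include in a translit section.
# Hypothetical helper, for illustration only.
include_follows_rules() {
    awk '
        /^translit_start/ { in_sec = 1; rules = 0; next }
        /^translit_end/   { in_sec = 0; next }
        in_sec && /^<U[0-9A-Fa-f]+>/ { rules = 1 }
        in_sec && /^include "translit_cyrillic"/ && rules { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Run over localedata/locales/*, this would list candidates such as ur_PK or
yi_US, where the include sits after the locale's own rules.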
Cheers,
--
Marko Myllynen
* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-10-15 11:15 ` Marko Myllynen
@ 2018-10-15 12:04 ` Egor Kobylkin
0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-15 12:04 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski, libc-alpha, libc-locales
Cc: mfabian, Dmitry V. Levin, Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 2342 bytes --]
On 15.10.2018 13:04, Marko Myllynen wrote:
> Hi,
>
> On 2018-10-13 19:58, Egor Kobylkin wrote:
>> On 13.10.2018 02:59, Rafal Luzynski wrote:
>>
>>> Regarding the tests, I think there is no complete transliteration
>>> test suite at the moment. Probably the only test is
>>> localedata/bug-iconv-trans.c. You can also see the collation tests
>>> placed in the same directory, they use those multiple *.UTF-8.in
>>> files.
>>>
>>> You can skip the tests for now.
>>
>> First I thought they could just be added, but not all locales
>> transliterate umlauts, so just extending the current test won't do as it
>> will fail for those locales.
>
> I still think a one-time check against uconv(1) (part of Unicode's ICU
> project) for discrepancies would be worthwhile.
Just an addition. I have changed a few constants to see whether
localedata/bug-iconv-trans.c could be made to test Cyrillic. Attached is
bug-iconv-trans-cyr.c, which passes in this form. I had to save
it as UTF-8 instead of the ISO-8859-15 used for localedata/bug-iconv-trans.c.
>>>> [...] diff -uNr a/localedata/locales/am_ET
>>>> b/localedata/locales/am_ET --- a/localedata/locales/am_ET
>>>> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET
>>>> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A>
>>>> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C>
>>>> <U0060><U0031><U0030><U0030><U0030><U0030> +include
>>>> "translit_cyrillic";"" translit_end % END LC_CTYPE
>>>
>>> Shouldn't 'include "translit_cyrillic";""' be placed before the
>>> custom rules, together with other includes? The same in more files,
>>> I will not mention them all.
>>
>> If I recall correctly it is because of the
>> "translit_end
>> END LC_CTYPE"
>> part at the end of the translit_cyrillic. This way it works for any
>> locale, regardless whether it has translit itself or not. And being at
>> the end it does not supersede any previous transliteration that may be
>> there for a reason.
>
> I suspect one problem would be that the latter rule wins, so if there
> are some locale-specific rules then any translit_* inclusions would
> override them unless they are included before the locale-specific rules.
What is the best way forward here? Can somebody make an explicit
suggestion on how to change the current approach if needed?
Best,
Egor
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: bug-iconv-trans-cyr.c --]
[-- Type: text/x-csrc; name="bug-iconv-trans-cyr.c", Size: 2556 bytes --]
#include <iconv.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
int
main (void)
{
iconv_t cd;
const char str[] = "CyrillicLetters_ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’";
const char expected[] = "CyrillicLetters_YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUUFHCCHSHSHHA`Y`E`YUYAabvgdezhzijklmnoprstuufhcchshshh``y`e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'";
char *inptr = (char *) str;
size_t inlen = strlen (str) + 1;
char outbuf[500];
char *outptr = outbuf;
size_t outlen = sizeof (outbuf);
int result = 0;
size_t n;
if (setlocale (LC_ALL, "de_DE.UTF-8") == NULL)
{
puts ("setlocale failed");
return 1;
}
cd = iconv_open ("ANSI_X3.4-1968//TRANSLIT", "UTF-8");
if (cd == (iconv_t) -1)
{
puts ("iconv_open failed");
return 1;
}
n = iconv (cd, &inptr, &inlen, &outptr, &outlen);
if (n != 174)
{
if (n == (size_t) -1)
printf ("iconv() returned error: %m\n");
else
printf ("iconv() returned %Zd, expected 174\n", n);
result = 1;
}
if (inlen != 0)
{
puts ("not all input consumed");
result = 1;
}
else if (inptr - str != strlen (str) + 1)
{
printf ("inptr wrong, advanced by %td\n", inptr - str);
result = 1;
}
if (memcmp (outbuf, expected, sizeof (expected)) != 0)
{
printf ("result wrong: \"%.*s\", expected: \"%s\"\n",
(int) (sizeof (outbuf) - outlen), outbuf, expected);
result = 1;
}
else if (outlen != sizeof (outbuf) - sizeof (expected))
{
printf ("outlen wrong: %Zd, expected %Zd\n", outlen,
sizeof (outbuf) - sizeof (expected));
result = 1;
}
else
printf ("output is \"%s\" which is OK\n", outbuf);
return result;
}
* [PATCH v6] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (5 preceding siblings ...)
2018-10-12 14:08 ` [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872] Egor Kobylkin
@ 2018-10-17 14:20 ` Egor Kobylkin
2018-11-01 22:52 ` [PATCH v7] " Egor Kobylkin
` (6 subsequent siblings)
13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-10-17 14:20 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski,
Marko Myllynen, Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 10029 bytes --]
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]
to localedata/locales/ and include it in all your locales going forward.
The patch is attached below.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
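
The check above can be wrapped in a small shell sketch so that it is easy
to repeat per locale. The function names to_ascii and
has_untransliterated are hypothetical, and the sketch assumes that iconv
substitutes '?' for characters it cannot transliterate, as in the ru_RU
output shown above. Input that already contains a literal '?' would be
reported as a false positive.

```shell
# to_ascii LOCALE TEXT
# Transliterate UTF-8 TEXT to ASCII under the given locale.
to_ascii() {
    printf '%s' "$2" | LC_ALL="$1" iconv -f UTF-8 -t ASCII//TRANSLIT
}

# has_untransliterated LOCALE TEXT
# Succeeds if the transliterated output contains '?', i.e. at least one
# character had no usable transliteration rule in that locale.
has_untransliterated() {
    case "$(to_ascii "$1" "$2")" in
        *\?*) return 0 ;;
        *)    return 1 ;;
    esac
}
```

For example, has_untransliterated ru_RU.UTF-8 "$(cat translit-test-input.txt)"
would be expected to succeed before the patch and fail after it, provided
the ru_RU.UTF-8 locale is installed.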
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in a Cyrillic alphabet to ASCII via //TRANSLIT.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce a Latin transliteration as per
ISO 9:1995.
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters but
can still be read by a native speaker. Among other things, it is useful
for processing Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have the
corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (mapping) is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. In
practice an independent but mostly identical source [3] was used, and the
table was prepared in a spreadsheet [6].
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice I have searched for all locales that have a
translit_start/end stanza and generated a patch for them.
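
The per-locale change can be sketched in shell roughly as follows. This is
an illustration of the idea and not the script that generated the attached
patch: the function name add_cyrillic_include is hypothetical, and it
assumes that the include line goes directly before the first translit_end,
which is where the hunks in this patch place it.

```shell
# add_cyrillic_include FILE
# Insert 'include "translit_cyrillic";""' before the first translit_end
# in FILE. Files without a translit section, or that already carry the
# include, are left untouched. Hypothetical helper, for illustration only.
add_cyrillic_include() {
    file=$1
    grep -q '^translit_end' "$file" || return 0
    grep -q '^include "translit_cyrillic"' "$file" && return 0
    tmp=$(mktemp) || return 1
    if awk '
        /^translit_end/ && !inserted {
            print "include \"translit_cyrillic\";\"\""
            inserted = 1
        }
        { print }
    ' "$file" > "$tmp"; then
        # Overwrite in place so the file keeps its permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Applied in a loop over localedata/locales/*, with the excluded locales
skipped, and followed by a diff -uNr against a pristine tree, this would
produce hunks of the same shape as the ones below.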
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA
locales, which have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-17 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995 / GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/az_AZ: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: locales.patch --]
[-- Type: text/x-patch; name="locales.patch", Size: 56410 bytes --]
diff -uNr a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
--- a/localedata/locales/aa_DJ 2018-10-17 13:52:01.871309540 +0000
+++ b/localedata/locales/aa_DJ 2018-10-17 13:52:02.415310947 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/af_ZA b/localedata/locales/af_ZA
--- a/localedata/locales/af_ZA 2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/af_ZA 2018-10-17 13:52:02.211310419 +0000
@@ -70,6 +70,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ak_GH b/localedata/locales/ak_GH
--- a/localedata/locales/ak_GH 2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/ak_GH 2018-10-17 13:52:02.519311216 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/am_ET b/localedata/locales/am_ET
--- a/localedata/locales/am_ET 2018-10-17 13:52:01.875309551 +0000
+++ b/localedata/locales/am_ET 2018-10-17 13:52:02.235310481 +0000
@@ -893,6 +893,7 @@
<U137A> <U0060><U0039><U0030>
<U137B> <U0060><U0031><U0030><U0030>
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/ar_EG b/localedata/locales/ar_EG
--- a/localedata/locales/ar_EG 2018-10-17 13:52:01.879309562 +0000
+++ b/localedata/locales/ar_EG 2018-10-17 13:52:02.551311298 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/az_AZ b/localedata/locales/az_AZ
--- a/localedata/locales/az_AZ 2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/az_AZ 2018-10-17 13:52:02.259310543 +0000
@@ -136,6 +136,7 @@
<U0259> "<U00E4>"
<U018F> "<U00C4>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/be_BY b/localedata/locales/be_BY
--- a/localedata/locales/be_BY 2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/be_BY 2018-10-17 13:52:02.283310605 +0000
@@ -91,6 +91,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
--- a/localedata/locales/bem_ZM 2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/bem_ZM 2018-10-17 13:52:02.423310967 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
--- a/localedata/locales/ber_DZ 2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/ber_DZ 2018-10-17 13:52:02.623311484 +0000
@@ -136,6 +136,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ber_MA b/localedata/locales/ber_MA
--- a/localedata/locales/ber_MA 2018-10-17 13:52:01.891309592 +0000
+++ b/localedata/locales/ber_MA 2018-10-17 13:52:02.603311433 +0000
@@ -83,6 +83,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bg_BG b/localedata/locales/bg_BG
--- a/localedata/locales/bg_BG 2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bg_BG 2018-10-17 13:52:02.215310430 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bi_VU b/localedata/locales/bi_VU
--- a/localedata/locales/bi_VU 2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bi_VU 2018-10-17 13:52:02.531311247 +0000
@@ -39,6 +39,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bn_BD b/localedata/locales/bn_BD
--- a/localedata/locales/bn_BD 2018-10-17 13:52:01.895309603 +0000
+++ b/localedata/locales/bn_BD 2018-10-17 13:52:02.511311195 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/bo_CN b/localedata/locales/bo_CN
--- a/localedata/locales/bo_CN 2018-10-17 13:52:01.899309613 +0000
+++ b/localedata/locales/bo_CN 2018-10-17 13:52:02.675311619 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ca_ES b/localedata/locales/ca_ES
--- a/localedata/locales/ca_ES 2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/ca_ES 2018-10-17 13:52:02.675311619 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ce_RU b/localedata/locales/ce_RU
--- a/localedata/locales/ce_RU 2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/ce_RU 2018-10-17 13:52:02.255310533 +0000
@@ -38,6 +38,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
--- a/localedata/locales/cmn_TW 2018-10-17 13:52:01.903309623 +0000
+++ b/localedata/locales/cmn_TW 2018-10-17 13:52:02.419310957 +0000
@@ -49,6 +49,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
--- a/localedata/locales/cs_CZ 2018-10-17 13:52:01.939309717 +0000
+++ b/localedata/locales/cs_CZ 2018-10-17 13:52:02.619311474 +0000
@@ -215,6 +215,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cv_RU b/localedata/locales/cv_RU
--- a/localedata/locales/cv_RU 2018-10-17 13:52:01.939309717 +0000
+++ b/localedata/locales/cv_RU 2018-10-17 13:52:02.359310802 +0000
@@ -103,6 +103,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/cy_GB b/localedata/locales/cy_GB
--- a/localedata/locales/cy_GB 2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/cy_GB 2018-10-17 13:52:02.207310409 +0000
@@ -65,6 +65,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/da_DK b/localedata/locales/da_DK
--- a/localedata/locales/da_DK 2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/da_DK 2018-10-17 13:52:02.635311515 +0000
@@ -169,6 +169,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/de_DE b/localedata/locales/de_DE
--- a/localedata/locales/de_DE 2018-10-17 13:52:01.943309726 +0000
+++ b/localedata/locales/de_DE 2018-10-17 13:52:02.639311526 +0000
@@ -78,6 +78,7 @@
% DOUBLE HIGH-REVERSED-9 QUOTATION MARK
<U201F> <U00AB>;<U0022>
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dv_MV b/localedata/locales/dv_MV
--- a/localedata/locales/dv_MV 2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/dv_MV 2018-10-17 13:52:02.587311391 +0000
@@ -51,6 +51,7 @@
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/dz_BT b/localedata/locales/dz_BT
--- a/localedata/locales/dz_BT 2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/dz_BT 2018-10-17 13:52:02.583311382 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/el_GR b/localedata/locales/el_GR
--- a/localedata/locales/el_GR 2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/el_GR 2018-10-17 13:52:02.607311443 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_GB b/localedata/locales/en_GB
--- a/localedata/locales/en_GB 2018-10-17 13:52:01.947309737 +0000
+++ b/localedata/locales/en_GB 2018-10-17 13:52:02.539311268 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_NG b/localedata/locales/en_NG
--- a/localedata/locales/en_NG 2018-10-17 13:52:01.951309747 +0000
+++ b/localedata/locales/en_NG 2018-10-17 13:52:02.379310854 +0000
@@ -49,6 +49,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/en_ZM b/localedata/locales/en_ZM
--- a/localedata/locales/en_ZM 2018-10-17 13:52:01.951309747 +0000
+++ b/localedata/locales/en_ZM 2018-10-17 13:52:02.227310461 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_CU b/localedata/locales/es_CU
--- a/localedata/locales/es_CU 2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/es_CU 2018-10-17 13:52:02.631311506 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/es_ES b/localedata/locales/es_ES
--- a/localedata/locales/es_ES 2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/es_ES 2018-10-17 13:52:02.195310378 +0000
@@ -107,6 +107,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/et_EE b/localedata/locales/et_EE
--- a/localedata/locales/et_EE 2018-10-17 13:52:01.955309758 +0000
+++ b/localedata/locales/et_EE 2018-10-17 13:52:02.503311174 +0000
@@ -113,6 +113,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fa_IR b/localedata/locales/fa_IR
--- a/localedata/locales/fa_IR 2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fa_IR 2018-10-17 13:52:02.219310440 +0000
@@ -78,6 +78,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ff_SN b/localedata/locales/ff_SN
--- a/localedata/locales/ff_SN 2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/ff_SN 2018-10-17 13:52:02.235310481 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fi_FI b/localedata/locales/fi_FI
--- a/localedata/locales/fi_FI 2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fi_FI 2018-10-17 13:52:02.595311412 +0000
@@ -177,6 +177,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/fr_FR b/localedata/locales/fr_FR
--- a/localedata/locales/fr_FR 2018-10-17 13:52:01.959309768 +0000
+++ b/localedata/locales/fr_FR 2018-10-17 13:52:02.287310616 +0000
@@ -58,6 +58,7 @@
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ga_IE b/localedata/locales/ga_IE
--- a/localedata/locales/ga_IE 2018-10-17 13:52:01.963309778 +0000
+++ b/localedata/locales/ga_IE 2018-10-17 13:52:02.651311557 +0000
@@ -53,6 +53,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gd_GB b/localedata/locales/gd_GB
--- a/localedata/locales/gd_GB 2018-10-17 13:52:01.963309778 +0000
+++ b/localedata/locales/gd_GB 2018-10-17 13:52:02.639311526 +0000
@@ -45,6 +45,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gu_IN b/localedata/locales/gu_IN
--- a/localedata/locales/gu_IN 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/gu_IN 2018-10-17 13:52:02.551311298 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/gv_GB b/localedata/locales/gv_GB
--- a/localedata/locales/gv_GB 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/gv_GB 2018-10-17 13:52:02.375310843 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/he_IL b/localedata/locales/he_IL
--- a/localedata/locales/he_IL 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/he_IL 2018-10-17 13:52:02.671311609 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hi_IN b/localedata/locales/hi_IN
--- a/localedata/locales/hi_IN 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hi_IN 2018-10-17 13:52:02.383310865 +0000
@@ -61,6 +61,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
--- a/localedata/locales/hif_FJ 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hif_FJ 2018-10-17 13:52:02.395310895 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hr_HR b/localedata/locales/hr_HR
--- a/localedata/locales/hr_HR 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/hr_HR 2018-10-17 13:52:02.615311464 +0000
@@ -61,6 +61,7 @@
% transliterate <U0111> {đ} into d + j
<U0111> "<U0064><U006A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ht_HT b/localedata/locales/ht_HT
--- a/localedata/locales/ht_HT 2018-10-17 13:52:01.967309789 +0000
+++ b/localedata/locales/ht_HT 2018-10-17 13:52:02.543311277 +0000
@@ -57,6 +57,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hu_HU b/localedata/locales/hu_HU
--- a/localedata/locales/hu_HU 2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/hu_HU 2018-10-17 13:52:02.279310595 +0000
@@ -476,6 +476,7 @@
<U00FC> "<U0075><U0308>";"<U0075><U00A8>";"<U0075><U003A>"
<U0171> "<U0075><U030B>";"<U0075><U02DD>";"<U0075><U0022>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/hy_AM b/localedata/locales/hy_AM
--- a/localedata/locales/hy_AM 2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/hy_AM 2018-10-17 13:52:02.515311205 +0000
@@ -75,6 +75,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/id_ID b/localedata/locales/id_ID
--- a/localedata/locales/id_ID 2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/id_ID 2018-10-17 13:52:02.283310605 +0000
@@ -54,6 +54,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/is_IS b/localedata/locales/is_IS
--- a/localedata/locales/is_IS 2018-10-17 13:52:01.971309799 +0000
+++ b/localedata/locales/is_IS 2018-10-17 13:52:02.359310802 +0000
@@ -149,6 +149,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/it_IT b/localedata/locales/it_IT
--- a/localedata/locales/it_IT 2018-10-17 13:52:01.987309840 +0000
+++ b/localedata/locales/it_IT 2018-10-17 13:52:02.519311216 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ja_JP b/localedata/locales/ja_JP
--- a/localedata/locales/ja_JP 2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/ja_JP 2018-10-17 13:52:02.503311174 +0000
@@ -1681,6 +1681,7 @@
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
--- a/localedata/locales/kab_DZ 2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kab_DZ 2018-10-17 13:52:02.663311589 +0000
@@ -41,6 +41,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
--- a/localedata/locales/kk_KZ 2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kk_KZ 2018-10-17 13:52:02.611311453 +0000
@@ -99,6 +99,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/km_KH b/localedata/locales/km_KH
--- a/localedata/locales/km_KH 2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/km_KH 2018-10-17 13:52:02.351310781 +0000
@@ -42,6 +42,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kn_IN b/localedata/locales/kn_IN
--- a/localedata/locales/kn_IN 2018-10-17 13:52:01.991309850 +0000
+++ b/localedata/locales/kn_IN 2018-10-17 13:52:02.507311185 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ko_KR b/localedata/locales/ko_KR
--- a/localedata/locales/ko_KR 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ko_KR 2018-10-17 13:52:02.339310751 +0000
@@ -6099,6 +6099,7 @@
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ks_IN b/localedata/locales/ks_IN
--- a/localedata/locales/ks_IN 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ks_IN 2018-10-17 13:52:02.275310585 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/kw_GB b/localedata/locales/kw_GB
--- a/localedata/locales/kw_GB 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/kw_GB 2018-10-17 13:52:02.535311257 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ky_KG b/localedata/locales/ky_KG
--- a/localedata/locales/ky_KG 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ky_KG 2018-10-17 13:52:02.171310317 +0000
@@ -82,6 +82,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lb_LU b/localedata/locales/lb_LU
--- a/localedata/locales/lb_LU 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lb_LU 2018-10-17 13:52:02.615311464 +0000
@@ -77,6 +77,7 @@
% LATIN SMALL LETTER E WITH CIRCUMFLEX
<U00EA> "e^"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lg_UG b/localedata/locales/lg_UG
--- a/localedata/locales/lg_UG 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lg_UG 2018-10-17 13:52:02.199310389 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lij_IT b/localedata/locales/lij_IT
--- a/localedata/locales/lij_IT 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/lij_IT 2018-10-17 13:52:02.527311236 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ln_CD b/localedata/locales/ln_CD
--- a/localedata/locales/ln_CD 2018-10-17 13:52:01.995309861 +0000
+++ b/localedata/locales/ln_CD 2018-10-17 13:52:02.211310419 +0000
@@ -39,6 +39,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lo_LA b/localedata/locales/lo_LA
--- a/localedata/locales/lo_LA 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lo_LA 2018-10-17 13:52:02.291310627 +0000
@@ -50,6 +50,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lt_LT b/localedata/locales/lt_LT
--- a/localedata/locales/lt_LT 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lt_LT 2018-10-17 13:52:02.355310792 +0000
@@ -163,6 +163,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/lv_LV b/localedata/locales/lv_LV
--- a/localedata/locales/lv_LV 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/lv_LV 2018-10-17 13:52:02.539311268 +0000
@@ -125,6 +125,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mg_MG b/localedata/locales/mg_MG
--- a/localedata/locales/mg_MG 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mg_MG 2018-10-17 13:52:02.255310533 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
--- a/localedata/locales/mhr_RU 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mhr_RU 2018-10-17 13:52:02.611311453 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mk_MK b/localedata/locales/mk_MK
--- a/localedata/locales/mk_MK 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/mk_MK 2018-10-17 13:52:02.351310781 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ml_IN b/localedata/locales/ml_IN
--- a/localedata/locales/ml_IN 2018-10-17 13:52:01.999309872 +0000
+++ b/localedata/locales/ml_IN 2018-10-17 13:52:02.363310812 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff -uNr a/localedata/locales/ms_MY b/localedata/locales/ms_MY
--- a/localedata/locales/ms_MY 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/ms_MY 2018-10-17 13:52:02.391310885 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/mt_MT b/localedata/locales/mt_MT
--- a/localedata/locales/mt_MT 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/mt_MT 2018-10-17 13:52:02.635311515 +0000
@@ -47,6 +47,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
--- a/localedata/locales/nan_TW@latin 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nan_TW@latin 2018-10-17 13:52:02.295310636 +0000
@@ -52,6 +52,7 @@
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nb_NO b/localedata/locales/nb_NO
--- a/localedata/locales/nb_NO 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nb_NO 2018-10-17 13:52:02.523311227 +0000
@@ -166,6 +166,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ne_NP b/localedata/locales/ne_NP
--- a/localedata/locales/ne_NP 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/ne_NP 2018-10-17 13:52:02.587311391 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
--- a/localedata/locales/nhn_MX 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/nhn_MX 2018-10-17 13:52:02.511311195 +0000
@@ -59,6 +59,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NU b/localedata/locales/niu_NU
--- a/localedata/locales/niu_NU 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/niu_NU 2018-10-17 13:52:02.547311288 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
--- a/localedata/locales/niu_NZ 2018-10-17 13:52:02.003309881 +0000
+++ b/localedata/locales/niu_NZ 2018-10-17 13:52:02.595311412 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nl_NL b/localedata/locales/nl_NL
--- a/localedata/locales/nl_NL 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/nl_NL 2018-10-17 13:52:02.355310792 +0000
@@ -56,6 +56,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
--- a/localedata/locales/nr_ZA 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/nr_ZA 2018-10-17 13:52:02.659311578 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/oc_FR b/localedata/locales/oc_FR
--- a/localedata/locales/oc_FR 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/oc_FR 2018-10-17 13:52:02.563311329 +0000
@@ -54,6 +54,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/om_KE b/localedata/locales/om_KE
--- a/localedata/locales/om_KE 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/om_KE 2018-10-17 13:52:02.663311589 +0000
@@ -156,6 +156,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/or_IN b/localedata/locales/or_IN
--- a/localedata/locales/or_IN 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/or_IN 2018-10-17 13:52:02.671311609 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/os_RU b/localedata/locales/os_RU
--- a/localedata/locales/os_RU 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/os_RU 2018-10-17 13:52:02.655311567 +0000
@@ -71,6 +71,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_IN b/localedata/locales/pa_IN
--- a/localedata/locales/pa_IN 2018-10-17 13:52:02.007309892 +0000
+++ b/localedata/locales/pa_IN 2018-10-17 13:52:02.387310874 +0000
@@ -60,6 +60,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pa_PK b/localedata/locales/pa_PK
--- a/localedata/locales/pa_PK 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pa_PK 2018-10-17 13:52:02.191310367 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pl_PL b/localedata/locales/pl_PL
--- a/localedata/locales/pl_PL 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pl_PL 2018-10-17 13:52:02.267310564 +0000
@@ -130,6 +130,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/pt_PT b/localedata/locales/pt_PT
--- a/localedata/locales/pt_PT 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/pt_PT 2018-10-17 13:52:02.651311557 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/quz_PE b/localedata/locales/quz_PE
--- a/localedata/locales/quz_PE 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/quz_PE 2018-10-17 13:52:02.239310492 +0000
@@ -55,6 +55,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ro_RO b/localedata/locales/ro_RO
--- a/localedata/locales/ro_RO 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/ro_RO 2018-10-17 13:52:02.399310906 +0000
@@ -142,6 +142,7 @@
<U0162> "<U021A>";"<U0054>"
<U0163> "<U021B>";"<U0074>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ru_RU b/localedata/locales/ru_RU
--- a/localedata/locales/ru_RU 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/ru_RU 2018-10-17 13:52:02.295310636 +0000
@@ -69,6 +69,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/rw_RW b/localedata/locales/rw_RW
--- a/localedata/locales/rw_RW 2018-10-17 13:52:02.011309902 +0000
+++ b/localedata/locales/rw_RW 2018-10-17 13:52:02.559311319 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sa_IN b/localedata/locales/sa_IN
--- a/localedata/locales/sa_IN 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sa_IN 2018-10-17 13:52:02.535311257 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN b/localedata/locales/sd_IN
--- a/localedata/locales/sd_IN 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sd_IN 2018-10-17 13:52:02.515311205 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
--- a/localedata/locales/sd_IN@devanagari 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sd_IN@devanagari 2018-10-17 13:52:02.563311329 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/se_NO b/localedata/locales/se_NO
--- a/localedata/locales/se_NO 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/se_NO 2018-10-17 13:52:02.387310874 +0000
@@ -221,6 +221,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
--- a/localedata/locales/sgs_LT 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sgs_LT 2018-10-17 13:52:02.555311309 +0000
@@ -58,6 +58,7 @@
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/shn_MM b/localedata/locales/shn_MM
--- a/localedata/locales/shn_MM 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/shn_MM 2018-10-17 13:52:02.271310574 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/si_LK b/localedata/locales/si_LK
--- a/localedata/locales/si_LK 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/si_LK 2018-10-17 13:52:02.559311319 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sk_SK b/localedata/locales/sk_SK
--- a/localedata/locales/sk_SK 2018-10-17 13:52:02.015309913 +0000
+++ b/localedata/locales/sk_SK 2018-10-17 13:52:02.187310357 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sl_SI b/localedata/locales/sl_SI
--- a/localedata/locales/sl_SI 2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sl_SI 2018-10-17 13:52:02.251310523 +0000
@@ -2120,6 +2120,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sm_WS b/localedata/locales/sm_WS
--- a/localedata/locales/sm_WS 2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sm_WS 2018-10-17 13:52:02.263310554 +0000
@@ -37,6 +37,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/so_SO b/localedata/locales/so_SO
--- a/localedata/locales/so_SO 2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/so_SO 2018-10-17 13:52:02.183310347 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sq_AL b/localedata/locales/sq_AL
--- a/localedata/locales/sq_AL 2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/sq_AL 2018-10-17 13:52:02.543311277 +0000
@@ -45,6 +45,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
--- a/localedata/locales/ss_ZA 2018-10-17 13:52:02.019309923 +0000
+++ b/localedata/locales/ss_ZA 2018-10-17 13:52:02.591311402 +0000
@@ -66,6 +66,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/st_ZA b/localedata/locales/st_ZA
--- a/localedata/locales/st_ZA 2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/st_ZA 2018-10-17 13:52:02.647311547 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sv_SE b/localedata/locales/sv_SE
--- a/localedata/locales/sv_SE 2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/sv_SE 2018-10-17 13:52:02.383310865 +0000
@@ -173,6 +173,7 @@
% LATIN SMALL LETTER O WITH STROKE -> "oe"
<U00F8> "<U006F><U0338>";"<U006F><U0065>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/sw_KE b/localedata/locales/sw_KE
--- a/localedata/locales/sw_KE 2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/sw_KE 2018-10-17 13:52:02.343310760 +0000
@@ -43,6 +43,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ta_IN b/localedata/locales/ta_IN
--- a/localedata/locales/ta_IN 2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/ta_IN 2018-10-17 13:52:02.339310751 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/te_IN b/localedata/locales/te_IN
--- a/localedata/locales/te_IN 2018-10-17 13:52:02.023309933 +0000
+++ b/localedata/locales/te_IN 2018-10-17 13:52:02.395310895 +0000
@@ -63,6 +63,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/th_TH b/localedata/locales/th_TH
--- a/localedata/locales/th_TH 2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/th_TH 2018-10-17 13:52:02.647311547 +0000
@@ -57,6 +57,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ti_ET b/localedata/locales/ti_ET
--- a/localedata/locales/ti_ET 2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/ti_ET 2018-10-17 13:52:02.371310834 +0000
@@ -864,6 +864,7 @@
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff -uNr a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
--- a/localedata/locales/tn_ZA 2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/tn_ZA 2018-10-17 13:52:02.623311484 +0000
@@ -67,6 +67,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/to_TO b/localedata/locales/to_TO
--- a/localedata/locales/to_TO 2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/to_TO 2018-10-17 13:52:02.567311340 +0000
@@ -36,6 +36,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
--- a/localedata/locales/tpi_PG 2018-10-17 13:52:02.027309944 +0000
+++ b/localedata/locales/tpi_PG 2018-10-17 13:52:02.223310450 +0000
@@ -44,6 +44,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/tr_TR b/localedata/locales/tr_TR
--- a/localedata/locales/tr_TR 2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/tr_TR 2018-10-17 13:52:02.415310947 +0000
@@ -2538,6 +2538,7 @@
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
--- a/localedata/locales/translit_cyrillic 1970-01-01 00:00:00.000000000 +0000
+++ b/localedata/locales/translit_cyrillic 2018-10-17 13:52:02.687311650 +0000
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0043><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0053><U0048><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff -uNr a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
--- a/localedata/locales/ts_ZA 2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/ts_ZA 2018-10-17 13:52:02.555311309 +0000
@@ -62,6 +62,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/unm_US b/localedata/locales/unm_US
--- a/localedata/locales/unm_US 2018-10-17 13:52:02.039309975 +0000
+++ b/localedata/locales/unm_US 2018-10-17 13:52:02.531311247 +0000
@@ -48,6 +48,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_IN b/localedata/locales/ur_IN
--- a/localedata/locales/ur_IN 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ur_IN 2018-10-17 13:52:02.507311185 +0000
@@ -46,6 +46,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ur_PK b/localedata/locales/ur_PK
--- a/localedata/locales/ur_PK 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ur_PK 2018-10-17 13:52:02.275310585 +0000
@@ -57,6 +57,7 @@
% Farsi yeh -> yeh
<U06CC> "<U064A>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
--- a/localedata/locales/ve_ZA 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/ve_ZA 2018-10-17 13:52:02.599311423 +0000
@@ -65,6 +65,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/vi_VN b/localedata/locales/vi_VN
--- a/localedata/locales/vi_VN 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/vi_VN 2018-10-17 13:52:02.571311351 +0000
@@ -57,6 +57,7 @@
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wa_BE b/localedata/locales/wa_BE
--- a/localedata/locales/wa_BE 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/wa_BE 2018-10-17 13:52:02.595311412 +0000
@@ -59,6 +59,7 @@
<U00C5> "A<U030A>";"A";"AU"
<U00E5> "a<U030A>";"a";"au"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/wo_SN b/localedata/locales/wo_SN
--- a/localedata/locales/wo_SN 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/wo_SN 2018-10-17 13:52:02.631311506 +0000
@@ -54,6 +54,7 @@
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
--- a/localedata/locales/xh_ZA 2018-10-17 13:52:02.043309985 +0000
+++ b/localedata/locales/xh_ZA 2018-10-17 13:52:02.603311433 +0000
@@ -64,6 +64,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yi_US b/localedata/locales/yi_US
--- a/localedata/locales/yi_US 2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/yi_US 2018-10-17 13:52:02.267310564 +0000
@@ -66,6 +66,7 @@
<U05F0> "<U05D5><U05D5>";"ww"
<U05F1> "<U05D5><U05D9>";"wj"
<U05F2> "<U05D9><U05D9>";"jj"
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
--- a/localedata/locales/yuw_PG 2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/yuw_PG 2018-10-17 13:52:02.259310543 +0000
@@ -40,6 +40,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff -uNr a/localedata/locales/zh_CN b/localedata/locales/zh_CN
--- a/localedata/locales/zh_CN 2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/zh_CN 2018-10-17 13:52:02.607311443 +0000
@@ -58,6 +58,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff -uNr a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
--- a/localedata/locales/zu_ZA 2018-10-17 13:52:02.047309996 +0000
+++ b/localedata/locales/zu_ZA 2018-10-17 13:52:02.627311495 +0000
@@ -68,6 +68,7 @@
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
* Re: [PATCH v5] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-10-13 21:16 ` Egor Kobylkin
2018-10-15 11:15 ` Marko Myllynen
@ 2018-10-24 0:12 ` Rafal Luzynski
1 sibling, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-10-24 0:12 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
Cc: mfabian, Marko Myllynen, Dmitry V. Levin, Volodymyr Lisivka,
Max Kutny, danilo
Hi Egor,
Thank you for your updates and again I'm sorry for my delayed response.
A general remark about this: if you are in a hurry and you need the
corrected transliteration rules for yourself or for your users then
you don't have to wait for the patch to be reviewed and accepted here.
You can make your own locale and use it, you don't need to rebuild glibc,
you don't even need root privileges to do it. The locale data subsystem
is designed to allow users to create and use their own locales.
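A minimal sketch of what I mean, assuming a glibc source tree with your
patch applied. The function name, the output directory and the example
paths are only placeholders, not anything glibc prescribes:

```sh
# Sketch only: GLIBC_SRC, the output directory and the locale name are
# placeholders; adjust them to your setup.
build_private_locale() {
    [ $# -eq 2 ] || {
        echo "usage: build_private_locale GLIBC_SRC LOCALE" >&2
        return 2
    }
    src=$1 name=$2
    out="$HOME/.locale"          # hypothetical per-user locale directory
    mkdir -p "$out" || return 1
    # I18NPATH tells localedef where to look for files pulled in by
    # "copy" and "include"; -c writes the output even if warnings occur.
    I18NPATH="$src/localedata" localedef -c -f UTF-8 \
        -i "$src/localedata/locales/$name" "$out/$name.UTF-8"
}

# Example use (not run here):
#   build_private_locale ~/src/glibc ru_RU
#   LOCPATH="$HOME/.locale" LC_ALL=ru_RU.UTF-8 \
#       iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
```

LOCPATH makes glibc look for compiled locales in the given directory, so
nothing has to be installed system-wide.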
I have seen and tested locally your newer patch [1] but I will reply
in this thread because I think it is easier to reply in context.
I would like to summarize the differences between v5 [2] and v6 to make
sure that I noticed them all and that you have not introduced any changes
inadvertently. (Yes, that means I have skipped another patch which you
sent between those two.)
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* You now consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Again I must say that I had a lot of technical difficulty applying the
patch and I had to rework it manually because it does not apply as it is
now. Below I explain how to make a technically correct patch:
13.10.2018 18:58 Egor Kobylkin <egor@kobylkin.com> wrote:
>
>
> Hi Rafal,
>
> Thanks for the thorough checking, it really helps.
>
> On 13.10.2018 02:59, Rafal Luzynski wrote:
> > Technical issue: Please either attach your patch to the email
> > message or paste it inline, not both. The patch as it is now is not
> > applicable. I had to edit it manually to apply.
> >> diff -uNr a/localedata/locales/C b/localedata/locales/C ---
> >> a/localedata/locales/C 2018-10-11 15:10:12.000000000 +0000 +++
> >> b/localedata/locales/C 2018-10-11 15:10:43.000000000 +0000
> >
> > There is no such file. Where have you got the source code from?
> > Are you sure this is glibc? :-)
>
> I was running my patch process against the Ubuntu 18.04 version of
> localedata/locales. Now I have checked out the GitHub glibc source v2.28
> and done the same. [...]
Remarks:
* Please use the repository at https://sourceware.org/git/?p=glibc.git
rather than a copy at GitHub.
* Please use the master branch rather than 2.28.
* Commit your work locally.
* Use "git format-patch" (e.g., "git format-patch HEAD^..HEAD") to generate
the patch, then you can email it to this list.
* You can email it inline or, if your email client breaks the lines and
inserts other unnecessary characters, send it as an attachment.
* Use "git pull --rebase" to keep your work up to date.
* Read the Contribution Checklist [3] for more details.
>
> >> [...] From this patch I have excluded locales that already mention
> >> cyrillic or have a transliteration table for it: az_AZ
> >> iso14651_t1_common ky_KG mn_MN sr_RS tg_TJ tk_TM tt_RU uk_UA uz_UZ
> >> uz_UZ@cyrillic
> >
> > I confirm that these locales are excluded and there are no other
> > missing locales.
>
> Because of the surprisingly different list of locales between Ubuntu and
> glibc there is now a different list of excluded ones as well.
>
> mn_MN
> sr_RS
> tg_TJ
> tk_TM
> tt_RU
> uk_UA
> uz_UZ
> uz_UZ@cyrillic
> uk_UA
>
> az_AZ, ky_KG are now included
As far as I can see, there are no other differences between those two
patches.
> because they don't have cyrillic translit
> in glibc. iso14651_t1_common is still implicitly excluded, because it
> doesn't have 'translit_end' string.
>
> Somehow az_AZ and tr_TR from glibc fail to transliterate Cyrillic even
> after the patch applied (az_AZ is explicitly including tr_TR). I do not
> see a reason, maybe you could check?
I noticed that az_AZ does not build at all: the localedef program reports
a "circular dependency" (if I recall correctly). I think that since az_AZ
contains “copy "tr_TR"” and tr_TR already contains (in your patch)
“include "translit_cyrillic";""” you should just remove
“include "translit_cyrillic";""” from az_AZ which effectively means that
there are no changes in az_AZ. Optionally, you can add a comment to az_AZ
to explain why it does not contain “include "translit_cyrillic";""” and to
make sure that if anyone removes “copy "tr_TR"” ever in the future, the
“include "translit_cyrillic";""” will be added at the same time. I have
verified that removing that line makes the locale data build without an
error but I have not yet verified that they work as expected.
> > Regarding the tests, I think there is no complete transliteration
> > test suite at the moment. Probably the only test is
> > localedata/bug-iconv-trans.c. You can also see the collation tests
> > placed in the same directory, they use those multiple *.UTF-8.in
> > files.
> >
> > You can skip the tests for now.
>
> In the copy of localedata/bug-iconv-trans.c lines 10-11 we could just
> change the list of the symbols we are now transliterating
>
> const char str[] = "ÄäÖöÜüß";
> const char expected[] = "AEaeOEoeUEuess";
>
> like this
>
> const char str[] =
> "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩъЫьЭЮЯабвгдежзийклмнопрстуу́фхцчшщЪыЬэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
> ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
> const char expected[] =
> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU`FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu`fxczchsh
> shh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`
> T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`
> Y`y`'";
>
> First I though they could just be added but not all locales
> transliterate Umlauts so just extending the current test won't do as it
> will fail for those locales.
I noticed that you pasted a patch in a Bugzilla comment. [4] If I understand
correctly, you suggest reworking the existing test case to test Cyrillic
transliteration instead of German. Please don't do it: the existing test
cases may be extended but must not be removed. I think we should rework
this test case to handle multiple locales and multiple transliteration
pairs; optionally we can add a new case instead. Currently I lean toward
reworking the existing test case.
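Just to illustrate the shape I have in mind (this is a shell sketch, not
the C test itself, and the function name and table format are my own
invention): one table of locale / input / expected triples, checked in a
loop.

```sh
# Reads lines of the form "LOCALE INPUT EXPECTED" from stdin
# (whitespace-separated, no spaces inside a field) and compares the
# output of iconv with the expected ASCII string.
run_translit_cases() {
    failures=0
    while read -r locale input expected; do
        [ -n "$locale" ] || continue
        actual=$(printf '%s' "$input" |
                 LC_ALL="$locale" iconv -f UTF-8 -t ASCII//TRANSLIT 2>/dev/null)
        if [ "$actual" != "$expected" ]; then
            echo "FAIL $locale: '$input' -> '$actual', expected '$expected'" >&2
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every pair matched.
    [ "$failures" -eq 0 ]
}

# Example use (not run here; needs the locales installed and, for the
# Cyrillic line, the patch applied):
#   run_translit_cases <<'EOF'
#   de_DE.UTF-8 ÄäÖöÜüß AEaeOEoeUEuess
#   ru_RU.UTF-8 Щука SHHuka
#   EOF
```

The German pair is the one from the existing test; the Russian pair is
what the rules in your current patch should produce. Adding a locale or
a pair is then one more line in the table, and nothing existing has to
be removed.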
> >> [...] diff -uNr a/localedata/locales/am_ET
> >> b/localedata/locales/am_ET --- a/localedata/locales/am_ET
> >> 2018-10-11 15:10:11.000000000 +0000 +++ b/localedata/locales/am_ET
> >> 2018-10-11 15:10:43.000000000 +0000 @@ -1394,6 +1394,7 @@ <U137A>
> >> <U0060><U0039><U0030> <U137B> <U0060><U0031><U0030><U0030> <U137C>
> >> <U0060><U0031><U0030><U0030><U0030><U0030> +include
> >> "translit_cyrillic";"" translit_end % END LC_CTYPE
> >
> > Shouldn't “include "translit_cyrillic";""” be placed before the
> > custom rules, together with other includes? The same in more files,
> > I will not mention them all.
>
> If I recall correctly it is because of the
> "translit_end
> END LC_CTYPE"
> part at the end of the translit_cyrillic. This way it works for any
> locale, regardless whether it has translit itself or not. And being at
> the end it does not supersede any previous transliteration that may be
> there for a reason.
>
> As with some other comments, I am not super familiar with the formats of
> glibc files. So if you have a definitive suggestion - pls. formulate it
> as an imperative, not a question.
I feel like a newcomer here so it was meant to be a question to other
more experienced maintainers but probably it's time to change this attitude.
So, also taking into account what Marko wrote, [5] please put the include
directive after all other include directives, or after the "translit_start"
directive if there are no other includes, rather than putting it just before
"translit_end", even if putting it at the end works sometimes or even
always. It is the same as putting #include directives near the top of the
file when writing a C program, even if you could sometimes put them anywhere
and they would still work. If you use a script to insert your include
directives then please rework it; if you insert them manually then just
move them manually.
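If it helps, here is a rough sketch of such a script. I don't know what
your script looks like, so the function name is hypothetical, and it
assumes that the existing include lines directly follow "translit_start"
with no comments or blank lines in between:

```sh
# Prints the locale file $1 with the new directive placed after the last
# include that follows translit_start, or directly after translit_start
# if there is no include.
add_cyrillic_include() {
    awk '
        # Emit the new directive once, right before the first line after
        # translit_start that is not itself an include directive.
        pending && !/^include[ \t]/ {
            print "include \"translit_cyrillic\";\"\""
            pending = 0
        }
        { print }
        /^translit_start/ { pending = 1 }
    ' "$1"
}

# Example use (not run here):
#   add_cyrillic_include localedata/locales/sv_SE > sv_SE.new
```

For files like sv_SE or tr_TR, which have their own rules before
"translit_end", this puts the include above those rules instead of
below them.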
> >> [...] +translit_start + +% CYRILLIC CAPITAL LETTER IO +<U0401>
> >> <U00CB>;"<U0059><U004F>"
> >
> > This says that for ASCII (GOST 7.79 System B) you would like to
> > transliterate "Ё" as "YO" but the table in Wikipedia says "Yo". I
> > understand that one or another may be correct depending on the
> > context but we should be consistent and also better let's stick with
> > the standard.
>
> The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
> example for "Сх" and "Ш" that would both transliterate to Sh:
> With SH:"Схема"->"Shema" but "Шема"->"SHema"
> With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
> This is important e.g. for renaming files, grouping as in using uniq etc.
I understand this idea. Is this part of any existing standard? I can't
see it regulated by GOST 7.79.
I'd rather not include transliteration rules which seem reasonable to
us (the developers) but are not known to, and therefore not accepted by,
the outside world.
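For the record, the collision you describe is easy to reproduce with a
toy mapping. This is only a sketch with sed covering the six letters of
your example (with "х" mapped to "h" as in the first option of your
table); it does not use iconv or the real table:

```sh
# $1 is the ASCII spelling chosen for CYRILLIC CAPITAL LETTER SHA.
translit() {
    sed -e "s/Ш/$1/g" -e 's/С/S/g' -e 's/х/h/g' \
        -e 's/е/e/g' -e 's/м/m/g' -e 's/а/a/g'
}

# Number of distinct results for the two input words.
count_distinct() {
    printf 'Схема\nШема\n' | translit "$1" | sort -u | wc -l | tr -d ' '
}

echo "Sh: $(count_distinct Sh) distinct result(s)"
echo "SH: $(count_distinct SH) distinct result(s)"
```

With "Sh" both words come out as "Shema", so only one distinct result
remains; with "SH" they stay apart as "Shema" and "SHema". So the
technical argument is clear to me, my concern is only whether any
standard backs it.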
>
> >
> >> +% CYRILLIC CAPITAL LETTER DJE +<U0402> <U0110>;"<U0044><U004A>"
> >
> > This says "DJ" but System B does not mention it. Where does it come
> > from? Also, I think it should be "Dj" rather than "DJ".
> I took the first two letters from its name.
As I said previously, I would like to add more Cyrillic letters even if
they are not regulated by any standard. But let's separate them and make
it clear which rules are based on GOST 7.79 and which are our own
invention (or come from another standard, etc.). I think that all these
rules may even be in the same file but in different parts of it.
> >> [...] +% CYRILLIC UNDEFINED +<U0423><U0301>
> >> <U00DA>;"<U0055><U0060>"
> >
> > 1. I think it should be named "CYRILLIC CAPITAL LETTER U WITH ACUTE".
> > 2. OK, the System A table mentions this letter but System B does not.
> > Somehow we should handle it. I think that "U`" is the best we can do
> > for now. 3. It must be tested whether this actually works.
> 1. Let's do it just before you are ready to commit the patch, because it
> breaks formulas in my worksheet and I will have to do it manually?
> 3. I have tested and it doesn't work/gets ignored. But if you were to
> handle COMBINING it would work, wouldn't it?
My guess is that since translit_combining just removes all those combining
diacritic characters and translit_combining is usually included before
translit_cyrillic then <U0301> is removed even before <U0423> is taken
into account. My other guess is that it might work well if you
just removed this rule: <U0423> would be translated to "U" and <U0301>
would remain unchanged and eventually those two characters would produce
"Ú". But, again, that's just a guess, I have not tested.
> >> [...] +% CYRILLIC CAPITAL LETTER HA +<U0425> <U0048>;<U0058>
> >
> > I don't think that "H" is unavailable in any encoding therefore it
> > will always be transliterated as "H" and never as "X". We can't
> > help it and I don't think it is bad.
> >
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.
Note that either it will make the test cases fail or we will have to
prepare the test cases to deliberately skip the translation of <U0425>
into "X" because "H" will always work. We can't force iconv
to choose the second transliteration rule if the first one works.
That means we will have a problem to construct the test cases.
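The selection behaviour described above can be modelled with a short Python
sketch. This is only a model of "take the first alternative the target
charset can represent", not glibc's implementation; pick_replacement is a
hypothetical helper and the rules are copied from the quoted patch lines.

```python
def pick_replacement(candidates, target_encoding):
    """Return the first candidate the target encoding can represent, else None."""
    for candidate in candidates:
        try:
            candidate.encode(target_encoding)
        except UnicodeEncodeError:
            continue  # not representable, try the next alternative
        return candidate
    return None


# CYRILLIC CAPITAL LETTER HA in the quoted patch: <U0048>;<U0058>
HA_RULE = ["H", "X"]
# CYRILLIC CAPITAL LETTER HARD SIGN in the quoted patch: <U02BA>;"<U0041><U0060>"
HARD_SIGN_RULE = ["\u02ba", "A`"]

for encoding in ("ascii", "iso8859-15", "utf-8"):
    print(encoding,
          pick_replacement(HA_RULE, encoding),
          pick_replacement(HARD_SIGN_RULE, encoding))
```

In this model "X" is never chosen for <U0425>, because "H" exists in every
target charset, while the hard sign rule does reach its second alternative
whenever <U02BA> is not representable. A test case can therefore exercise
the fallback of the second kind of rule but not of the first.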
> >> +% CYRILLIC CAPITAL LETTER TSE +<U0426> <U0043>;"<U0043><U005A>"
> >
> > 1. "CZ" - maybe should be "Cz"?
> > 2. Are we able to implement the rule: "c before i, e, y, j"?
> >
> 1. see for CYRILLIC CAPITAL LETTER IO
> 2. not sure what you are talking about in 2. but I believe it's not
> possible as per Marko's email.
Hm... I can't find a good example now. Maybe I was misled by the rules
of Cyrillic transliteration which I learned at school and which are not
necessarily universal and not necessarily useful for English readers.
> >> +% CYRILLIC CAPITAL LETTER HARD SIGN +<U042A>
> >> <U02BA>;"<U0041><U0060>"
> >
> > "A`" is only for Bulgarian and should go to bg_BG. How should we
> > transliterate an upper case hard sign to plain ASCII? I think that
> > just "``", same as lower case.
> This is to avoid collision.
What collision?
> Besides AFAIK e.g. in Russian there is no
> capital hard sign because there are no words starting with it.
True but it can be used in ALL UPPERCASE text. Therefore we need a clear
and correct transliteration rule for it.
>
> >
> >> +% CYRILLIC CAPITAL LETTER YERU +<U042B> <U0059>;"<U0059><U0060>"
> >
> > Again, as "Y" is always available it will never be transliterated as
> > "Y`".
> >
> But we can keep this for when/if there is a way to explicitly request
> transcription instead of transliteration.
Again, it will be difficult or impossible to construct a correct test case
and we must be aware of this.
Regards,
Rafal
[1] https://sourceware.org/ml/libc-alpha/2018-10/msg00300.html
[2] https://sourceware.org/ml/libc-alpha/2018-10/msg00213.html
[3] https://sourceware.org/glibc/wiki/Contribution%20checklist
[4] https://sourceware.org/bugzilla/show_bug.cgi?id=2872#c47
[5] https://sourceware.org/ml/libc-alpha/2018-10/msg00232.html
* [PATCH v7] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (6 preceding siblings ...)
2018-10-17 14:20 ` [PATCH v6] " Egor Kobylkin
@ 2018-11-01 22:52 ` Egor Kobylkin
2018-11-02 0:01 ` [PATCH v8] " Egor Kobylkin
` (5 subsequent siblings)
13 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-01 22:52 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski,
Marko Myllynen, Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
[-- Attachment #1: Type: text/plain, Size: 10911 bytes --]
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
'copy "tr_TR"'.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]
to localedata/locales/ and include it in all your locales going forward.
The patch is included inline below.
From this patch I have excluded locales that already mention Cyrillic or
have a transliteration table for it:
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on whether
and how to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore, it has to be included in the active locale
at compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce an ASCII
compatible transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce a Latin transliteration as per
ISO 9:1995.
While a UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription contains only Latin/ASCII characters but can
still be read by a native speaker. Among other things, it is useful for
processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
the corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is attached as the file translit_cyrillic
[7]. Its content (mapping) is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000, whose official source is the Federal Agency on
Technical Regulating and Metrology of the Russian Federation [2]. Technically,
an independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].
The documentation suggests that a transliteration table is included
by adding the string *include "translit_cyrillic";""* to the LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that already have an
'include .*translit.*;""' string and generated a patch for them.
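As a rough illustration of this step, the following Python sketch inserts
the new include line right after the last existing translit include in a
locale file's text. It is not the script used to generate the patch;
add_cyrillic_include and the regular expression (which is stricter than the
'include .*translit.*;""' pattern mentioned above) are assumptions made for
the example.

```python
import re

# Matches lines such as: include "translit_combining";""
# Assumption: stricter than the 'include .*translit.*;""' pattern in the text.
INCLUDE_RE = re.compile(r'^include\s+"translit_[^"]*";""\s*$')
NEW_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(text):
    """Insert NEW_LINE after the last translit include; otherwise return text as is."""
    lines = text.splitlines()
    matches = [i for i, line in enumerate(lines) if INCLUDE_RE.match(line)]
    already_present = NEW_LINE in (line.strip() for line in lines)
    if not matches or already_present:
        return text  # nothing to patch, or already patched
    lines.insert(matches[-1] + 1, NEW_LINE)
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")


sample = (
    "translit_start\n"
    'include "translit_combining";""\n'
    'include "translit_cjk_variants";""\n'
    "translit_end\n"
)
print(add_cyrillic_include(sample))
```

Locales without any translit include are left untouched, which corresponds
to patching only the locales that already include a transliteration table.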
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to the locale maintainers at the email addresses listed in the
files. Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA) and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-10-17 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic to
Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-v7-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch --]
[-- Type: text/x-patch; name="0001-v7-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch", Size: 45999 bytes --]
From 733055a6da290f32f508216519de715aa8b5b566 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Thu, 1 Nov 2018 23:46:03 +0100
Subject: [PATCH] v7 Locales: Cyrillic -> ASCII transliteration table [BZ
#2872]
---
localedata/locales/aa_DJ | 1 +
localedata/locales/af_ZA | 1 +
localedata/locales/ak_GH | 1 +
localedata/locales/am_ET | 1 +
localedata/locales/ar_EG | 1 +
localedata/locales/be_BY | 1 +
localedata/locales/bem_ZM | 1 +
localedata/locales/ber_DZ | 1 +
localedata/locales/ber_MA | 1 +
localedata/locales/bg_BG | 1 +
localedata/locales/bi_VU | 1 +
localedata/locales/bn_BD | 1 +
localedata/locales/bo_CN | 1 +
localedata/locales/ca_ES | 1 +
localedata/locales/ce_RU | 1 +
localedata/locales/cmn_TW | 1 +
localedata/locales/cs_CZ | 1 +
localedata/locales/cv_RU | 1 +
localedata/locales/cy_GB | 1 +
localedata/locales/da_DK | 1 +
localedata/locales/de_DE | 1 +
localedata/locales/dv_MV | 1 +
localedata/locales/dz_BT | 1 +
localedata/locales/el_GR | 1 +
localedata/locales/en_GB | 1 +
localedata/locales/en_NG | 1 +
localedata/locales/en_ZM | 1 +
localedata/locales/es_CU | 1 +
localedata/locales/es_ES | 1 +
localedata/locales/et_EE | 1 +
localedata/locales/fa_IR | 1 +
localedata/locales/ff_SN | 1 +
localedata/locales/fi_FI | 1 +
localedata/locales/fr_FR | 1 +
localedata/locales/ga_IE | 1 +
localedata/locales/gd_GB | 1 +
localedata/locales/gu_IN | 1 +
localedata/locales/gv_GB | 1 +
localedata/locales/he_IL | 1 +
localedata/locales/hi_IN | 1 +
localedata/locales/hif_FJ | 1 +
localedata/locales/hr_HR | 1 +
localedata/locales/ht_HT | 1 +
localedata/locales/hu_HU | 1 +
localedata/locales/hy_AM | 1 +
localedata/locales/id_ID | 1 +
localedata/locales/is_IS | 1 +
localedata/locales/it_IT | 1 +
localedata/locales/ja_JP | 1 +
localedata/locales/kab_DZ | 1 +
localedata/locales/kk_KZ | 1 +
localedata/locales/km_KH | 1 +
localedata/locales/kn_IN | 1 +
localedata/locales/ko_KR | 1 +
localedata/locales/ks_IN | 1 +
localedata/locales/kw_GB | 1 +
localedata/locales/ky_KG | 1 +
localedata/locales/lb_LU | 1 +
localedata/locales/lg_UG | 1 +
localedata/locales/lij_IT | 1 +
localedata/locales/ln_CD | 1 +
localedata/locales/lo_LA | 1 +
localedata/locales/lt_LT | 1 +
localedata/locales/lv_LV | 1 +
localedata/locales/mg_MG | 1 +
localedata/locales/mhr_RU | 1 +
localedata/locales/mk_MK | 1 +
localedata/locales/ml_IN | 1 +
localedata/locales/ms_MY | 1 +
localedata/locales/mt_MT | 1 +
localedata/locales/nan_TW@latin | 1 +
localedata/locales/nb_NO | 1 +
localedata/locales/ne_NP | 1 +
localedata/locales/nhn_MX | 1 +
localedata/locales/niu_NU | 1 +
localedata/locales/niu_NZ | 1 +
localedata/locales/nl_NL | 1 +
localedata/locales/nr_ZA | 1 +
localedata/locales/oc_FR | 1 +
localedata/locales/om_KE | 1 +
localedata/locales/or_IN | 1 +
localedata/locales/os_RU | 1 +
localedata/locales/pa_IN | 1 +
localedata/locales/pa_PK | 1 +
localedata/locales/pl_PL | 1 +
localedata/locales/pt_PT | 1 +
localedata/locales/quz_PE | 1 +
localedata/locales/ro_RO | 1 +
localedata/locales/ru_RU | 1 +
localedata/locales/rw_RW | 1 +
localedata/locales/sa_IN | 1 +
localedata/locales/sd_IN | 1 +
localedata/locales/sd_IN@devanagari | 1 +
localedata/locales/se_NO | 1 +
localedata/locales/sgs_LT | 1 +
localedata/locales/shn_MM | 1 +
localedata/locales/si_LK | 1 +
localedata/locales/sk_SK | 1 +
localedata/locales/sl_SI | 1 +
localedata/locales/sm_WS | 1 +
localedata/locales/so_SO | 1 +
localedata/locales/sq_AL | 1 +
localedata/locales/ss_ZA | 1 +
localedata/locales/st_ZA | 1 +
localedata/locales/sv_SE | 1 +
localedata/locales/sw_KE | 1 +
localedata/locales/ta_IN | 1 +
localedata/locales/te_IN | 1 +
localedata/locales/th_TH | 1 +
localedata/locales/ti_ET | 1 +
localedata/locales/tn_ZA | 1 +
localedata/locales/to_TO | 1 +
localedata/locales/tpi_PG | 1 +
localedata/locales/tr_TR | 1 +
localedata/locales/ts_ZA | 1 +
localedata/locales/unm_US | 1 +
localedata/locales/ur_IN | 1 +
localedata/locales/ur_PK | 1 +
localedata/locales/ve_ZA | 1 +
localedata/locales/vi_VN | 1 +
localedata/locales/wa_BE | 1 +
localedata/locales/wo_SN | 1 +
localedata/locales/xh_ZA | 1 +
localedata/locales/yi_US | 1 +
localedata/locales/yuw_PG | 1 +
localedata/locales/zh_CN | 1 +
localedata/locales/zu_ZA | 1 +
127 files changed, 127 insertions(+)
diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
space <U1361>
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% hoy-sadis followed by a vowel
<U1205><U12A0> <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts.
% LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% Historicaly we used ISO-8869-2 and wrote digraphs
% <U01C6> {ǆ}, <U01C9> {ǉ} and <U01CC> {ǌ}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
<U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
<U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts
% LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
%
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if t/scomma is not available, try first t/scedilla
<U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% A-bole -> A-circonflecse -> AU
<U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if digraphs are not available (this is the case with iso-8859-8)
% then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
--
2.17.1
* [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
From: Egor Kobylkin @ 2018-11-02 0:01 UTC (permalink / raw)
To: libc-alpha, libc-locales, mfabian, Rafal Luzynski,
Marko Myllynen, Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
Changelog v8:
* Re-added translit_cyrillic, which was missing from patch v7 (the
script lacked a "git add").
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' line now immediately follows the
last 'include "translit_XXX";""' line (previously it was inserted just
before translit_end).
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid a circular reference from tr_TR via
'copy "tr_TR"'.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
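
The uppercase rule from Changelog v6 can be illustrated with a minimal Python sketch. This is not how translit_cyrillic is generated; the mapping below is a small hypothetical excerpt (the Latin values follow the sample output and the "YI" example in this email), and the names LOWER, TABLE and translit are assumptions made for illustration only.

```python
# Hypothetical excerpt of a Cyrillic -> Latin mapping. The authoritative
# entries live in localedata/locales/translit_cyrillic and may differ.
LOWER = {
    "ж": "zh",
    "ч": "ch",
    "щ": "shh",
    "ю": "yu",
    "ї": "yi",
}

# Rule from Changelog v6: a single uppercase Cyrillic letter maps to an
# all-uppercase Latin sequence ("YI"), not a capitalized one ("Yi").
TABLE = dict(LOWER)
TABLE.update({cyr.upper(): lat.upper() for cyr, lat in LOWER.items()})


def translit(text: str) -> str:
    """Replace every character found in TABLE; leave the rest unchanged."""
    return "".join(TABLE.get(ch, ch) for ch in text)
```

With this rule the result for an uppercase letter does not depend on the case of the letters around it, which keeps the table a plain one-to-one lookup.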
Dear locale maintainers,
please fix glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]:
add the Cyrillic transliteration table file translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]
to localedata/locales/ and include it in all your locales going forward.
The patch is included inline below.
From this patch I have excluded locales that already mention Cyrillic or
have a transliteration table for it:
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on whether
and how to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
Currently it fails on Cyrillic text in most locales, including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu.
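
The difference between the failing and the fixed output can be checked mechanically. The following is only a sketch in Python; looks_untransliterated is a hypothetical helper, not part of glibc or its test suite, and it assumes that a failed transliteration shows up as question marks with no letters or digits left on the line.

```python
def looks_untransliterated(line: str, label: str = "CYRILLIC") -> bool:
    """Return True if the text after the label holds '?' but no letters or digits.

    This matches the failure described above, where iconv replaces every
    Cyrillic letter with a question mark.
    """
    body = line[len(label):] if line.startswith(label) else line
    has_alnum = any(ch.isalnum() for ch in body)
    return "?" in body and not has_alnum


before = "CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???."
after = "CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe chayu."

print(looks_untransliterated(before))
print(looks_untransliterated(after))
```

The two sample lines are copied from the iconv output quoted above; the check itself never calls iconv, so it can run on any system.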
The root problem and the fix:
The root problem is the missing transliteration table, which I am
supplying here. Furthermore, it has to be included in the active locale
at compile time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text in a Cyrillic alphabet to ASCII//TRANSLIT text.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT produces an ASCII-compatible
transcription, and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 produces a Latin transliteration as per
ISO 9.1995.
While UTF-8 encoded Cyrillic text requires Cyrillic fonts, the result of
transliteration/transcription contains only Latin/ASCII characters but
can still be read by a native speaker. Among other things, it is useful
for processing Cyrillic texts and filenames with programs or on systems
that are not specifically prepared to work with Cyrillic, do not have
the corresponding fonts installed, or cannot handle UTF-8.
The transliteration table itself is attached as the file
translit_cyrillic [7]. Its content (the mapping) is based on the ISO
9.1995 standard [10] and its derivative GOST 7.79-2000, whose official
source is the Federal Agency on Technical Regulating and Metrology of
the Russian Federation [2]. In practice, an independent but mostly
identical source [3] was used, and the table was prepared in a
spreadsheet [6].
The documentation suggests that a transliteration table is included by
adding the line *include "translit_cyrillic";""* to the translit_start
section of LC_CTYPE
http://man7.org/linux/man-pages/man5/locale.5.html [5]
In practice, I searched for all locales that already have an
'include .*translit.*;""' line and generated the patch for them.
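
The selection and insertion rule (patch only locales that already include a translit table, and place the new line right after the last such include) could be sketched as follows. This is not the script that produced the patch; the function name is hypothetical and the regular expression is a narrower assumption than the 'include .*translit.*;""' pattern quoted above.

```python
import re

# Assumed pattern: matches lines such as  include "translit_combining";""
INCLUDE_RE = re.compile(r'^\s*include\s+"translit_[^"]*";""\s*$')
NEW_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(text: str) -> str:
    """Insert NEW_LINE after the last translit include of a locale source.

    Text without any translit include, or text that already contains
    NEW_LINE, is returned unchanged.
    """
    lines = text.splitlines()
    if any(line.strip() == NEW_LINE for line in lines):
        return text
    last = None
    for i, line in enumerate(lines):
        if INCLUDE_RE.match(line):
            last = i
    if last is None:
        return text  # no existing translit include: leave the locale alone
    lines.insert(last + 1, NEW_LINE)
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")
```

Applied to a locale such as ko_KR, which includes both translit_combining and translit_hangul, the new line lands after translit_hangul, matching the hunks in this patch. The manual exclusions listed in this email would still have to be handled separately.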
The Cyrillic transliteration of e.g. Russian text may already have
worked to some extent for the mn_MN, sr_RS, tk_TM, uz_UZ and uk_UA
locales, which have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to the maintainer email addresses listed in the locale files.
Volodymyr Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com>
(uk_UA) and Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
---
2018-11-02 Egor Kobylkin <egor@kobylkin.com>
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9:1995, GOST 7.79
System A transliteration and System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
[-- Attachment #2: 0001-v8-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch --]
[-- Type: text/x-patch; name="0001-v8-Locales-Cyrillic-ASCII-transliteration-table-BZ-2.patch", Size: 59781 bytes --]
From efdde90219d25ecbdc762f113d357cf7de08fc94 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Fri, 2 Nov 2018 00:56:35 +0100
Subject: [PATCH] v8 Locales: Cyrillic -> ASCII transliteration table [BZ
#2872]
---
localedata/locales/aa_DJ | 1 +
localedata/locales/af_ZA | 1 +
localedata/locales/ak_GH | 1 +
localedata/locales/am_ET | 1 +
localedata/locales/ar_EG | 1 +
localedata/locales/be_BY | 1 +
localedata/locales/bem_ZM | 1 +
localedata/locales/ber_DZ | 1 +
localedata/locales/ber_MA | 1 +
localedata/locales/bg_BG | 1 +
localedata/locales/bi_VU | 1 +
localedata/locales/bn_BD | 1 +
localedata/locales/bo_CN | 1 +
localedata/locales/ca_ES | 1 +
localedata/locales/ce_RU | 1 +
localedata/locales/cmn_TW | 1 +
localedata/locales/cs_CZ | 1 +
localedata/locales/cv_RU | 1 +
localedata/locales/cy_GB | 1 +
localedata/locales/da_DK | 1 +
localedata/locales/de_DE | 1 +
localedata/locales/dv_MV | 1 +
localedata/locales/dz_BT | 1 +
localedata/locales/el_GR | 1 +
localedata/locales/en_GB | 1 +
localedata/locales/en_NG | 1 +
localedata/locales/en_ZM | 1 +
localedata/locales/es_CU | 1 +
localedata/locales/es_ES | 1 +
localedata/locales/et_EE | 1 +
localedata/locales/fa_IR | 1 +
localedata/locales/ff_SN | 1 +
localedata/locales/fi_FI | 1 +
localedata/locales/fr_FR | 1 +
localedata/locales/ga_IE | 1 +
localedata/locales/gd_GB | 1 +
localedata/locales/gu_IN | 1 +
localedata/locales/gv_GB | 1 +
localedata/locales/he_IL | 1 +
localedata/locales/hi_IN | 1 +
localedata/locales/hif_FJ | 1 +
localedata/locales/hr_HR | 1 +
localedata/locales/ht_HT | 1 +
localedata/locales/hu_HU | 1 +
localedata/locales/hy_AM | 1 +
localedata/locales/id_ID | 1 +
localedata/locales/is_IS | 1 +
localedata/locales/it_IT | 1 +
localedata/locales/ja_JP | 1 +
localedata/locales/kab_DZ | 1 +
localedata/locales/kk_KZ | 1 +
localedata/locales/km_KH | 1 +
localedata/locales/kn_IN | 1 +
localedata/locales/ko_KR | 1 +
localedata/locales/ks_IN | 1 +
localedata/locales/kw_GB | 1 +
localedata/locales/ky_KG | 1 +
localedata/locales/lb_LU | 1 +
localedata/locales/lg_UG | 1 +
localedata/locales/lij_IT | 1 +
localedata/locales/ln_CD | 1 +
localedata/locales/lo_LA | 1 +
localedata/locales/lt_LT | 1 +
localedata/locales/lv_LV | 1 +
localedata/locales/mg_MG | 1 +
localedata/locales/mhr_RU | 1 +
localedata/locales/mk_MK | 1 +
localedata/locales/ml_IN | 1 +
localedata/locales/ms_MY | 1 +
localedata/locales/mt_MT | 1 +
localedata/locales/nan_TW@latin | 1 +
localedata/locales/nb_NO | 1 +
localedata/locales/ne_NP | 1 +
localedata/locales/nhn_MX | 1 +
localedata/locales/niu_NU | 1 +
localedata/locales/niu_NZ | 1 +
localedata/locales/nl_NL | 1 +
localedata/locales/nr_ZA | 1 +
localedata/locales/oc_FR | 1 +
localedata/locales/om_KE | 1 +
localedata/locales/or_IN | 1 +
localedata/locales/os_RU | 1 +
localedata/locales/pa_IN | 1 +
localedata/locales/pa_PK | 1 +
localedata/locales/pl_PL | 1 +
localedata/locales/pt_PT | 1 +
localedata/locales/quz_PE | 1 +
localedata/locales/ro_RO | 1 +
localedata/locales/ru_RU | 1 +
localedata/locales/rw_RW | 1 +
localedata/locales/sa_IN | 1 +
localedata/locales/sd_IN | 1 +
localedata/locales/sd_IN@devanagari | 1 +
localedata/locales/se_NO | 1 +
localedata/locales/sgs_LT | 1 +
localedata/locales/shn_MM | 1 +
localedata/locales/si_LK | 1 +
localedata/locales/sk_SK | 1 +
localedata/locales/sl_SI | 1 +
localedata/locales/sm_WS | 1 +
localedata/locales/so_SO | 1 +
localedata/locales/sq_AL | 1 +
localedata/locales/ss_ZA | 1 +
localedata/locales/st_ZA | 1 +
localedata/locales/sv_SE | 1 +
localedata/locales/sw_KE | 1 +
localedata/locales/ta_IN | 1 +
localedata/locales/te_IN | 1 +
localedata/locales/th_TH | 1 +
localedata/locales/ti_ET | 1 +
localedata/locales/tn_ZA | 1 +
localedata/locales/to_TO | 1 +
localedata/locales/tpi_PG | 1 +
localedata/locales/tr_TR | 1 +
localedata/locales/translit_cyrillic | 383 +++++++++++++++++++++++++++
localedata/locales/ts_ZA | 1 +
localedata/locales/unm_US | 1 +
localedata/locales/ur_IN | 1 +
localedata/locales/ur_PK | 1 +
localedata/locales/ve_ZA | 1 +
localedata/locales/vi_VN | 1 +
localedata/locales/wa_BE | 1 +
localedata/locales/wo_SN | 1 +
localedata/locales/xh_ZA | 1 +
localedata/locales/yi_US | 1 +
localedata/locales/yuw_PG | 1 +
localedata/locales/zh_CN | 1 +
localedata/locales/zu_ZA | 1 +
128 files changed, 510 insertions(+)
create mode 100644 localedata/locales/translit_cyrillic
diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
space <U1361>
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% hoy-sadis followed by a vowel
<U1205><U12A0> <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts.
% LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% Historicaly we used ISO-8869-2 and wrote digraphs
% <U01C6> {ǆ}, <U01C9> {ǉ} and <U01CC> {ǌ}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
<U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
<U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts
% LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
%
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if t/scomma is not available, try first t/scedilla
<U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..073a138a6a
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC UNDEFINED
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC UNDEFINED
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% A-bole -> A-circonflecse -> AU
<U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if digraphs are not available (this is the case with iso-8859-8)
% then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
--
2.17.1
* Re: [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-02 0:01 ` [PATCH v8] " Egor Kobylkin
@ 2018-11-02 22:22 ` Rafal Luzynski
2018-11-02 23:27 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-11-02 22:22 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, mfabian, Marko Myllynen,
Dmitry V. Levin
Cc: Volodymyr Lisivka, Max Kutny, danilo
Hi Egor,
I have applied your patch locally and I am going to start reviewing it.
I can tell you already that it applies correctly but git reports these
warnings:
Applying: v8 Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
.git/rebase-apply/patch:1520: trailing whitespace.
% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
.git/rebase-apply/patch:1521: trailing whitespace.
% It implements the GOST_7.79 System A (Latin Script) as a first
.git/rebase-apply/patch:1523: trailing whitespace.
% https://en.wikipedia.org/wiki/ISO_9 for reference.
.git/rebase-apply/patch:1524: trailing whitespace.
% The System B is extended from GOST_7.79-Russian using open sources
.git/rebase-apply/patch:1535: trailing whitespace.
% Generated from UnicodeData.txt with a spreadsheet referenced
warning: 5 lines add whitespace errors.
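Warnings of this kind can be caught before a patch is sent. The following is a minimal Python sketch, not part of the patch or of git itself, that scans a patch for added lines ending in blanks. The function name and the sample text are made up for illustration; in particular, the trailing space in the sample is an assumption about what such a line looks like.

```python
def added_lines_with_trailing_whitespace(patch_text):
    """Return (patch_line_number, line) pairs for added lines ending in blanks."""
    hits = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Only lines added by the patch are of interest. "+++ b/path" is a
        # file header, so it is skipped (an added line whose own content
        # starts with "++" would be skipped too; acceptable for a sketch).
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if line != line.rstrip(" \t"):
            hits.append((number, line))
    return hits


# Made-up sample: the third line carries one trailing space.
sample = (
    "+++ b/localedata/locales/translit_cyrillic\n"
    "+% Inspired by ISO 9.1995 / GOST 7.79-2000.\n"
    "+% https://en.wikipedia.org/wiki/ISO_9 for reference. \n"
)

for number, line in added_lines_with_trailing_whitespace(sample):
    print(f"{number}: trailing whitespace: {line!r}")
```

The reported numbers are line numbers within the patch text, which is also how the git warnings above identify the offending lines.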
Also, the commit message is missing from your patch, probably because it is
missing from your local repository. Please re-add it, and remember
that it must contain a summary like this:
[BZ #2872]
* localedata/locales/translit_cyrillic: Add ISO 9.1995, GOST 7.79
System A transliteration System B transcription table from Cyrillic
to Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""' to
LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
Hm... as I look at this now I think it should rather be:
[BZ #2872]
* localedata/locales/translit_cyrillic: New file.
* localedata/locales/aa_DJ (LC_CTYPE): Add
'include "translit_cyrillic";""'.
* localedata/locales/af_ZA (LC_CTYPE): Likewise.
... and so on. Optionally you can use:
* localedata/locales/translit_cyrillic: New file. Supports
ISO 9.1995, GOST 7.79 System A transliteration System B
transcription table from Cyrillic to Latin/ASCII.
I would appreciate more hints from more experienced maintainers about
how to write the ChangeLog entry correctly.
2.11.2018 01:00 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
I confirm that this is the only relevant difference between v6 and v8.
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
Has this list changed, that is, has any locale been added or removed?
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> “copy "tr_TR"”.
True, this is another difference and I hope this is correct (I have not
yet tested).
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
Correct.
> * Consistently transliterate single uppercase Cyrillic letters
> to sequences of all uppercase Latin letters in all languages (whenever
> a Cyrillic letter is transliterated to more than one Latin letter),
> for example "Ї" is now transliterated as "YI" rather than "Yi".
I think you have not yet explained whether this is required by any existing
standard (please provide links) or whether this is your genuine idea to
distinguish between the cases like "Ш" transliterated to "Sh" and "Сх"
also transliterated to "Sh".
Again, I have not yet started reviewing and testing, this is just feedback
from applying the patch locally.
Regards,
Rafal
* Re: [PATCH v8] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-02 22:22 ` Rafal Luzynski
@ 2018-11-02 23:27 ` Egor Kobylkin
0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-02 23:27 UTC (permalink / raw)
To: libc-alpha, libc-locales
Moving everybody from To: and CC: to BCC. It seems at this stage it is
just Rafal and me. It is still going to libc-alpha and libc-locales. If
you would like to be put back on CC, please let me know.
On 02.11.18 23:22, Rafal Luzynski wrote:
>> * Consistently transliterate single uppercase Cyrillic letters to
>> sequences of all uppercase Latin letters in all languages
>> (whenever a Cyrillic letter is transliterated to more than one
>> Latin letter), for example "Ї" is now transliterated as "YI" rather
>> than "Yi".
>
> I think you have not yet explained whether this is required by any
> existing standard (please provide links) or whether this is your
> genuine idea to distinguish between the cases like "Ш" transliterated
> to "Sh" and "Сх" also transliterated to "Sh".
I remember seeing this form of capitalization in actual
transliterated texts a long time ago but can't find a formal description
as of now. I just don't want to claim this as my original idea.
>> The choice for YO, SH, YA, ZH etc. is to avoid naming collisions for
>> example for "Сх" and "Ш" that would both transliterate to Sh:
>> With SH:"Схема"->"Shema" but "Шема"->"SHema"
>> With Sh:"Схема"->"Shema" and "Шема"->"Shema". Collision!
>> This is important e.g. for renaming files, grouping as in using uniq
>> etc.
As for the users - I am a user and I have demonstrated the use cases
where the collisions due to "one symbol capitalization" would cause
irreversible damage to data. For a library like glibc this seems like a
relevant issue to consider.
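The collision described above can be reproduced with a toy table. This is only a sketch: the two dictionaries below are hypothetical six-letter mappings written for illustration, not the contents of translit_cyrillic, and the function name transliterate is likewise an assumption.

```python
# Toy mappings for illustration only (not the real translit_cyrillic table).
# Keys are written as escapes to avoid confusing Cyrillic and Latin lookalikes.
ONE_CAPITAL = {
    "\u0421": "S",   # С
    "\u0445": "h",   # х
    "\u0428": "Sh",  # Ш, "one symbol capitalization"
    "\u0435": "e",   # е
    "\u043c": "m",   # м
    "\u0430": "a",   # а
}
# Same table, but Ш becomes "SH" ("two symbol capitalization").
ALL_CAPITALS = {**ONE_CAPITAL, "\u0428": "SH"}

SKHEMA = "\u0421\u0445\u0435\u043c\u0430"  # Схема
SHEMA = "\u0428\u0435\u043c\u0430"         # Шема


def transliterate(word, table):
    """Replace each character using table, leaving unknown characters as-is."""
    return "".join(table.get(ch, ch) for ch in word)


for label, table in (("Sh", ONE_CAPITAL), ("SH", ALL_CAPITALS)):
    first = transliterate(SKHEMA, table)
    second = transliterate(SHEMA, table)
    print(label, first, second, "collision" if first == second else "distinct")
```

With the "Sh" table both words end up as the same ASCII string, so two differently named files would be renamed to one name; with the "SH" table they stay distinct.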
The "two symbol capitalization" on the other hand would prevent
collision and can be easily corrected in the userspace if needed
with something like
foo="SHema"
foo="${foo:0:1}$(tr '[:upper:]' '[:lower:]' <<<${foo:1})"
echo "$foo"
Shema
It looks like everyone really using transliteration for something
sensitive has already done it in userspace since at least 2006, when
this bug was first logged. So we won't break the official use cases
where the capitalization should be done in a certain way. But we will
prevent new bugs due to collisions if we use "two symbol capitalization".
Happy to hear arguments to the contrary.
Bests,
Egor Kobylkin
* [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (8 preceding siblings ...)
2018-11-02 0:01 ` [PATCH v8] " Egor Kobylkin
@ 2018-11-14 21:25 ` Egor Kobylkin
2018-11-16 22:17 ` Rafal Luzynski
2018-11-19 11:11 ` [PATCH v10] " Egor Kobylkin
` (3 subsequent siblings)
13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-14 21:25 UTC (permalink / raw)
To: libc-alpha, libc-locales
[-- Attachment #1: Type: text/plain, Size: 5901 bytes --]
Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.
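The placement rule from the v7 changelog can be sketched as follows. This is not the script that generated the patch; it is a minimal illustration, assuming a locale file is read into a string, and the names add_cyrillic_include, TRANSLIT_INCLUDE and NEW_LINE are hypothetical. Only the search pattern is taken from the changelog.

```python
import re

# Pattern quoted in the changelog: include .*translit.*;""
TRANSLIT_INCLUDE = re.compile(r'include .*translit.*;""')
NEW_LINE = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text):
    """Insert NEW_LINE right after the last translit include, if any."""
    lines = locale_text.splitlines()
    if NEW_LINE in lines:
        return locale_text  # already present, leave the text alone
    positions = [i for i, line in enumerate(lines)
                 if TRANSLIT_INCLUDE.search(line)]
    if not positions:
        return locale_text  # no translit include: this locale is not patched
    lines.insert(positions[-1] + 1, NEW_LINE)
    return "\n".join(lines) + "\n"


# Hypothetical minimal translit section.
sample = 'translit_start\ninclude "translit_combining";""\ntranslit_end\n'
print(add_cyrillic_include(sample), end="")
```

Locales without any existing translit include are returned unchanged, which mirrors the rule that only locales already having such a line are patched.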
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration table translit_cyrillic file
https://sourceware.org/bugzilla/attachment.cgi?id=11340 [7]
to localedata/locales/ and include it in all your locales going forward.
The patch is included inline below.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt | grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce and it does so after the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
Examples: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription and iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8 will produce Latin transliteration as per
ISO 9.1995.
While a UTF-encoded Cyrillic text requires Cyrillic fonts, the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is attached as a file translit_cyrillic
[7]. Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 official source (Federal Agency on Technical
Regulating and Metrology Of Russian Federation [2]). Technically an
independent but mostly identical source [3] was used and prepared in a
spreadsheet [6].
The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically I have searched for all locales that already have
'include .*translit.*;""' string and generated a patch for them.
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11301
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?id=11340
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
Best regards,
Egor Kobylkin
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch; name="0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch", Size: 64631 bytes --]
From a8ae30e0bf7484f4c0f034480110c81dd059b69e Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 14 Nov 2018 22:10:37 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* localedata/locales/translit_cyrillic: New file. Supports
ISO 9.1995, GOST 7.79 System A transliteration System B
transcription table from Cyrillic to Latin/ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""'
to LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
---
localedata/locales/aa_DJ | 1 +
localedata/locales/af_ZA | 1 +
localedata/locales/ak_GH | 1 +
localedata/locales/am_ET | 1 +
localedata/locales/ar_EG | 1 +
localedata/locales/be_BY | 1 +
localedata/locales/bem_ZM | 1 +
localedata/locales/ber_DZ | 1 +
localedata/locales/ber_MA | 1 +
localedata/locales/bg_BG | 1 +
localedata/locales/bi_VU | 1 +
localedata/locales/bn_BD | 1 +
localedata/locales/bo_CN | 1 +
localedata/locales/ca_ES | 1 +
localedata/locales/ce_RU | 1 +
localedata/locales/cs_CZ | 1 +
localedata/locales/cv_RU | 1 +
localedata/locales/cy_GB | 1 +
localedata/locales/da_DK | 1 +
localedata/locales/de_DE | 1 +
localedata/locales/dv_MV | 1 +
localedata/locales/dz_BT | 1 +
localedata/locales/el_GR | 1 +
localedata/locales/en_GB | 1 +
localedata/locales/en_NG | 1 +
localedata/locales/en_ZM | 1 +
localedata/locales/es_CU | 1 +
localedata/locales/es_ES | 1 +
localedata/locales/et_EE | 1 +
localedata/locales/fa_IR | 1 +
localedata/locales/ff_SN | 1 +
localedata/locales/fi_FI | 1 +
localedata/locales/fr_FR | 1 +
localedata/locales/ga_IE | 1 +
localedata/locales/gd_GB | 1 +
localedata/locales/gu_IN | 1 +
localedata/locales/gv_GB | 1 +
localedata/locales/he_IL | 1 +
localedata/locales/hi_IN | 1 +
localedata/locales/hif_FJ | 1 +
localedata/locales/hr_HR | 1 +
localedata/locales/ht_HT | 1 +
localedata/locales/hu_HU | 1 +
localedata/locales/hy_AM | 1 +
localedata/locales/id_ID | 1 +
localedata/locales/is_IS | 1 +
localedata/locales/it_IT | 1 +
localedata/locales/ja_JP | 1 +
localedata/locales/kab_DZ | 1 +
localedata/locales/kk_KZ | 1 +
localedata/locales/km_KH | 1 +
localedata/locales/kn_IN | 1 +
localedata/locales/ko_KR | 1 +
localedata/locales/ks_IN | 1 +
localedata/locales/kw_GB | 1 +
localedata/locales/ky_KG | 1 +
localedata/locales/lb_LU | 1 +
localedata/locales/lg_UG | 1 +
localedata/locales/lij_IT | 1 +
localedata/locales/ln_CD | 1 +
localedata/locales/lo_LA | 1 +
localedata/locales/lt_LT | 1 +
localedata/locales/lv_LV | 1 +
localedata/locales/mg_MG | 1 +
localedata/locales/mhr_RU | 1 +
localedata/locales/mk_MK | 1 +
localedata/locales/ml_IN | 1 +
localedata/locales/ms_MY | 1 +
localedata/locales/mt_MT | 1 +
localedata/locales/nan_TW@latin | 1 +
localedata/locales/nb_NO | 1 +
localedata/locales/ne_NP | 1 +
localedata/locales/nhn_MX | 1 +
localedata/locales/niu_NU | 1 +
localedata/locales/niu_NZ | 1 +
localedata/locales/nl_NL | 1 +
localedata/locales/nr_ZA | 1 +
localedata/locales/oc_FR | 1 +
localedata/locales/om_KE | 1 +
localedata/locales/or_IN | 1 +
localedata/locales/os_RU | 1 +
localedata/locales/pa_IN | 1 +
localedata/locales/pa_PK | 1 +
localedata/locales/pl_PL | 1 +
localedata/locales/pt_PT | 1 +
localedata/locales/quz_PE | 1 +
localedata/locales/ro_RO | 1 +
localedata/locales/ru_RU | 1 +
localedata/locales/rw_RW | 1 +
localedata/locales/sa_IN | 1 +
localedata/locales/sd_IN | 1 +
localedata/locales/sd_IN@devanagari | 1 +
localedata/locales/se_NO | 1 +
localedata/locales/sgs_LT | 1 +
localedata/locales/shn_MM | 1 +
localedata/locales/si_LK | 1 +
localedata/locales/sk_SK | 1 +
localedata/locales/sl_SI | 1 +
localedata/locales/sm_WS | 1 +
localedata/locales/so_SO | 1 +
localedata/locales/sq_AL | 1 +
localedata/locales/ss_ZA | 1 +
localedata/locales/st_ZA | 1 +
localedata/locales/sv_SE | 1 +
localedata/locales/sw_KE | 1 +
localedata/locales/ta_IN | 1 +
localedata/locales/te_IN | 1 +
localedata/locales/th_TH | 1 +
localedata/locales/ti_ET | 1 +
localedata/locales/tn_ZA | 1 +
localedata/locales/to_TO | 1 +
localedata/locales/tpi_PG | 1 +
localedata/locales/tr_TR | 1 +
localedata/locales/translit_cyrillic | 383 +++++++++++++++++++++++++++
localedata/locales/ts_ZA | 1 +
localedata/locales/unm_US | 1 +
localedata/locales/ur_IN | 1 +
localedata/locales/ur_PK | 1 +
localedata/locales/ve_ZA | 1 +
localedata/locales/vi_VN | 1 +
localedata/locales/wa_BE | 1 +
localedata/locales/wo_SN | 1 +
localedata/locales/xh_ZA | 1 +
localedata/locales/yi_US | 1 +
localedata/locales/yuw_PG | 1 +
localedata/locales/zh_CN | 1 +
localedata/locales/zu_ZA | 1 +
127 files changed, 509 insertions(+)
create mode 100644 localedata/locales/translit_cyrillic
diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
space <U1361>
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% hoy-sadis followed by a vowel
<U1205><U12A0> <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts.
% LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% Historicaly we used ISO-8869-2 and wrote digraphs
% <U01C6> {Ç}, <U01C9> {Ç} and <U01CC> {Ç}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
<U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
<U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts
% LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
%
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if t/scomma is not available, try first t/scedilla
<U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..82d9749e08
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,383 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transliterations of Cyrillic letters to Latin and/or ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
+% It implements the GOST_7.79 System A (Latin Script) as a first
+% option and System B Cyrillic (ASCII) as a second option. Check
+% https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+
+% Usage examples:
+% iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
+% | iconv -f ISO-8859-15 -t UTF-8 # System A
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bug's doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> <U00CB>;"<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> <U0110>;"<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> <U01F4>;"<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> <U00CA>;"<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> <U1E90>;"<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U00CC>;<U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> <U00CF>;"<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> "<U004A><U030C>";<U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0302>";"<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0302>";"<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> <U0106>;"<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> <U1E30>;"<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> <U016C>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0302>";"<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> <U017D>;"<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER U followed by COMBINING ACUTE ACCENT
+<U0423><U0301> <U00DA>;"<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0048>;<U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> <U0043>;"<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> <U010C>;"<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> <U0160>;"<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> <U015C>;"<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> <U0059>;"<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U02B9>;<U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> <U00C8>;"<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> <U00DB>;"<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> <U00C2>;"<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> <U017E>;"<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER U followed by COMBINING ACUTE ACCENT
+<U0443><U0301> <U00FA>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0068>;<U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> <U0063>;"<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> <U010D>;"<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> <U0161>;"<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> <U015D>;"<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> <U02BA>;"<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> <U0079>;"<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U02B9>;<U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> <U00E8>;"<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> <U00FB>;"<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> <U00E2>;"<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> <U00EB>;"<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> <U0111>;"<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> <U01F5>;"<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> <U00EA>;"<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> <U1E91>;"<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U00EC>;<U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> <U00EF>;"<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U01F0>;<U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0302>";"<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0302>";"<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> <U0107>;"<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> <U1E31>;"<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> <U016D>;"<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0302>";"<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> <U01CD>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> <U01CE>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0300>";"<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0300>";"<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> <U1EF2>;"<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> <U1EF3>;"<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> <U011A>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> <U011B>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0300>";"<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0300>";"<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> <U0120>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> <U0121>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> <U011E>;"<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> <U011F>;"<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U017D><U0327>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U017E><U0327>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> <U0136>;"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> <U0137>;"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0304>";"<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0304>";"<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> <U1E46>;"<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> <U1E47>;"<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> <U1E44>;"<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> <U1E45>;"<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> <U1E54>;"<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> <U1E55>;"<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> <U00D2>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> <U00F2>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> <U00C7>;"<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> <U00E7>;"<U0063><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> <U0162>;"<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> <U0163>;"<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U00D9>;<U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U00F9>;<U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> <U1E28>;"<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> <U1E29>;"<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0043><U0304>";"<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0063><U0304>";"<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> <U1E24>;"<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> <U1E25>;"<U0073><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U00C7><U0306>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U00E7><U0306>";"<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U2021>;<U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0306>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0306>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> <U00C7>;"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> <U00E7>;"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> <U0102>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> <U0103>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> <U00C4>;"<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> <U00E4>;"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> <U0114>;"<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> <U0115>;"<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U030B>";"<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U030B>";"<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0304>";"<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0304>";"<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0308>";"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0308>";"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> <U0179>;"<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> <U017A>;"<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> <U00CE>;"<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> <U00EE>;"<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> <U00D6>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> <U00F6>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> <U00D4>;"<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> <U00F4>;"<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> <U00DC>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> <U00FC>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> <U0170>;"<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> <U0171>;"<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0308>";"<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0308>";"<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> <U0178>;"<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> <U00FF>;"<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U2035>;<U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% A-bole -> A-circonflecse -> AU
<U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if digraphs are not available (this is the case with iso-8859-8)
% then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
--
2.17.1
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-14 21:25 ` [PATCH v9] " Egor Kobylkin
@ 2018-11-16 22:17 ` Rafal Luzynski
2018-11-17 18:35 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-11-16 22:17 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
Thank you for working on this, Egor.
Before I start reviewing I would like to summarize the things which
I think are blocking for this patch.
1. I think we need tests for transliteration. Currently there is only
one test program which is similar to what we need,
localedata/bug-iconv-trans.c. It is old and it is not quite clear
what bug it is trying to test. Therefore I think we need a new
framework to test transliteration. Is it a good idea to base the
test on the iconv(1) command line utility which is part of glibc?
2. I ran a few tests on the command line and it seems to me that the
transliteration from "З" to "Z" (+ lowercase as well) in uk_UA does
not work and has not been working for some time already because
I've checked some older systems as well and the result is always
the same. I think that the reason is that uk_UA defines multiple
transliteration rules for "З" depending on what is the letter following
it. It does not seem to work. AFAIK the reason is that the syntax of
transliteration rules says that a single non-Latin character may map to
one or more Latin strings, each consisting of one or more characters.
There cannot be a rule transliterating multiple source characters into
one or multiple destination characters. Is it a bug in transliteration
implementation? Or maybe in the specification, including POSIX standard?
The definition of transliteration says that it is one-to-one mapping
of graphemes while a grapheme may be one or multiple characters.
It does not have to be always mapping one-to-one character. Should we
fix this bug first, make uk_UA transliteration work, and only then
add a generic Cyrillic transliteration? Egor's patch already contains
transliteration of "У" + combining acute accent to "Ú" which most probably
will not work.
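To illustrate the limitation described in point 2, here is a minimal Python sketch (not glibc's implementation) contrasting a per-character lookup with a longest-match lookup that accepts multi-character source keys. The rule table and both function names are hypothetical and chosen only for illustration; the "зг" entry stands in for a context-dependent rule of the kind uk_UA is said to define.

```python
# Hypothetical rule table: one two-character source key plus single-character keys.
RULES = {
    "зг": "zgh",  # context-dependent rule (assumption, for illustration only)
    "з": "z",
    "г": "h",
    "а": "a",
}


def translit_per_char(text, rules):
    """Look up each source character on its own; multi-character keys never match."""
    return "".join(rules.get(ch, ch) for ch in text)


def translit_longest(text, rules):
    """Try the longest source key first, so multi-character rules can apply."""
    max_len = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule matched at this position: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, the per-character lookup ignores the "зг" rule entirely, while the longest-match lookup applies it whenever "з" is followed by "г".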
I still think that in the longer term all existing custom transliterations
of Cyrillic alphabets should be ported to a modification of your patch.
Egor, while at it I was thinking about your idea to transliterate letters
like "Ш" (uppercase) to "SH" (always uppercase) in order to distinguish
between "Шема" (-> "SHema") and "Схема" (-> "Shema" or "Sxema"). Also
you include a rule to transliterate "Х" to "H" or "X" depending on which
destination characters are available, which I told you already that will
not work because both "H" and "X" are always available and therefore only
the first rule will always be used. I still don't like the idea to
put two uppercase letters in a beginning of a word in titlecase only to
indicate that there was originally a single letter. What if we:
* drop the rule of transliterating "Х" to "H" and transliterate always to
"X",
* transliterate uppercase "Ш" to "Sh" (so it will work fine for titlecase
words)?
As a result the Latin letter "h" will only appear as part of a digraph and
never as a transliteration of "Х" and therefore will never cause a conflict.
Examples:
* "Шема" -> "Shema",
* "Схема" -> "Sxema".
Will this solve the problem?
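The collision and the proposed fix can be checked with a small Python sketch. The two tables below are hypothetical fragments that cover only the letters of the two example words; they are not the contents of translit_cyrillic, and the names H_RULES and X_RULES are made up for this illustration.

```python
# Letters shared by both example words (hypothetical fragment).
BASE = {"С": "S", "с": "s", "е": "e", "м": "m", "а": "a"}

# Variant 1: "Х" -> "H" together with "Ш" -> "Sh" (titlecase digraph).
H_RULES = {**BASE, "Ш": "Sh", "ш": "sh", "Х": "H", "х": "h"}

# Variant 2 (the proposal above): "Х" -> "X", so "h" only occurs in digraphs.
X_RULES = {**BASE, "Ш": "Sh", "ш": "sh", "Х": "X", "х": "x"}


def translit(text, table):
    """Replace each character by its table entry; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def collide(word_a, word_b, table):
    """Return True if two different source words produce the same output."""
    return word_a != word_b and translit(word_a, table) == translit(word_b, table)
```

Under these assumptions, collide("Шема", "Схема", H_RULES) is true while collide("Шема", "Схема", X_RULES) is false, which matches the two examples listed above.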
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-16 22:17 ` Rafal Luzynski
@ 2018-11-17 18:35 ` Egor Kobylkin
2018-11-19 7:14 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-17 18:35 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales, Marko Myllynen
Hi Rafal,
thanks for putting the SH/Sh problem into a clear issue statement. I'm
totally with you on this being a good thing to discuss. It is orthogonal
to the tests, so let me focus on the SH/Sh and System A/B problems here.
Looks like we have three issues:
1. lack of explicit control which transformation to use (System A or
System B) via //TRANSLIT
2. possibility of collision for System B if CAP/low transcription is used
for capital letters
3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
System B because its equivalent 'X'/'x' from System A is always present
and takes precedence.
As a solution shouldn't we only keep System B in a new file
transcribe_cyrillic and put it in place as the explicit ASCII
transcription for targeted locales (as opposed to transliteration)?
We would keep System A as translit_cyrillic but won't include it into
this patch. Once you have resolved an issue of having two conflicting
rule-sets but only one key //TRANSLIT you could add the System A back.
The SH/Sh question can be decided either way; it seems like an easy change anyway.
Please see more discussion on your excellent points below:
On 16.11.18 23:17, Rafal Luzynski wrote:
> Egor, while at this I was thinking about your idea to transliterate
> letters like "Ш" (uppercase) to "SH" (always uppercase) in order to
> distinguish between "Шема" (-> "SHema") and "Схема" (-> "Shema" or
> "Sxema").
to clarify, this SH/Sh collision issue relates only to iconv -f UTF-8 -t
ASCII//TRANSLIT (i.e. System B transcription).
But it's not only SH/Sh; the following combinations are used to
transcribe capital letters:
YO, DJ, YE, TSH, DH, ZH, CZ, CH, SH, SHH, YU, YA, FH, YH, GH, NG, TCZ
Arguably any of them (if not in that CAP/CAP form) could collide with
their CAP/low equivalent from a different word. (there may be language
grammar rules that in fact prevent some but we don't know for sure)
With transcription we are basically stripping information from the data,
mapping it into a smaller character set. The idea to keep them in
CAP/CAP is to try to preserve as much information as possible.
> Also you include a rule to transliterate "Х" to "H" or "X" depending
> on which destination characters are available, which I told you
> already that will not work because both "H" and "X" are always
> available and therefore only the first rule will always be used.
Just to have this here for reference, the idea was to have both rules in
one file so
iconv -f UTF-8 -t ASCII//TRANSLIT
will produce ASCII compatible _transcription_ (System B)
iconv -f UTF-8 -t ISO-8859-15//TRANSLIT |
iconv -f ISO-8859-15 -t UTF-8
will produce Latin _transliteration_ as per ISO 9.1995. (System A)
So in fact we have two rules for each letter in the same file (System A
and System B), where System A takes precedence.
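The "first representable candidate wins" behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not glibc's code: the candidate lists are hypothetical (System A style first, System B style second), and representability is approximated by whether Python can encode the string in the target charset.

```python
# Hypothetical candidate lists: System A style first, System B style second.
CANDIDATES = {
    "ш": ["š", "sh"],
    "ж": ["ž", "zh"],
    "х": ["h", "x"],
}


def representable(text, charset):
    """Return True if every character of text exists in the target charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Pick, per character, the first candidate the target charset can hold."""
    out = []
    for ch in text:
        if representable(ch, charset):
            out.append(ch)
            continue
        for candidate in CANDIDATES.get(ch, []):
            if representable(candidate, charset):
                out.append(candidate)
                break
        else:
            # No candidate fits: mimic the question-mark fallback.
            out.append("?")
    return "".join(out)
```

In this sketch "ш" becomes "š" for ISO-8859-15 and "sh" for ASCII, because "š" exists only in the former. For "х" both candidates are plain ASCII, so the second one can never be selected, which is the precedence problem discussed in this thread.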
I have a question then: isn't this more like a hack than a right thing
to do?
Shouldn't we have two explicit rules for transcription and
transliteration not dependent on a destination character set?
> I still don't like the idea to
> put two uppercase letters in a beginning of a word in titlecase only
> to indicate that there was originally a single letter. What if we:
>
> * drop the rule of transliterating "Х" to "H" and transliterate
> always to "X",
This would contradict ISO 9.1995. (System A).
System A was added on Marko's request (so adding him in To:). I am
neutral on keeping it or dropping it, just to be clear.
> * transliterate uppercase "Ш" to "Sh" (so it will work fine for
> titlecase words)?
>
> As a result the Latin letter "h" will only appear as part of a
> digraph and never as a transliteration of "Х" and therefore will
> never cause a conflict. Examples:
>
> * "Шема" -> "Shema", * "Схема" -> "Sxema".
>
> Will this solve the problem?
This particular rule with h/x would make sense on its own.
But again - it would contradict the standards.
On the other hand, for my personal needs I care less about standards than
about current functionality and the data loss caused by transcription
being missing altogether due to BZ #2872.
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-17 18:35 ` Egor Kobylkin
@ 2018-11-19 7:14 ` Marko Myllynen
2018-11-19 9:22 ` Egor Kobylkin
2018-12-01 22:09 ` Rafal Luzynski
0 siblings, 2 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-11-19 7:14 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales
Hi,
On 17/11/2018 20.34, Egor Kobylkin wrote:
>
> Looks like we have three issues:
> 1. lack of explicit control which transformation to use (System A or
> System B) via //TRANSLIT
> 2. possibility of collision for System B if used CAP/low transcription
> for capital letters
> 3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
> System B because its equivalent 'X'/'x' from System A is always present
> and takes precedence.
>
> As a solution shouldn't we only keep System B in a new file
> transcribe_cyrillic and put it in place as the explicit ASCII
> transcription for targeted locales (as opposed to transliteration)?
>
> We would keep System A as translit_cyrillic but won't include it into
> this patch. Once you have resolved an issue of having two conflicting
> rule-sets but only one key //TRANSLIT you could add the System A back.
>
> The SH/Sh can be decided on either way - seems like an easy change any way.
>
> I have a question then: isn't this more like a hack than a right thing
> to do?
>
> Shouldn't we have two explicit rules for transcription and
> transliteration not dependent on a destination character set?
>
> This would contradict ISO 9.1995. (System A).
> System A was added on Marko's request (so setting him on TO:) I am
> neutral on keeping it or dropping it, just to be clear.
>
> This particular rule with h/x would make sense on its own.
> But again - it would contradict the standards.
> On the other hand, for my personal needs I care less about standards but
> about current functionality and data loss because of missing
> transcription altogether due to the BZ #2872.
Given the number of questions above I think the way forward is to try to
follow the relevant standards as closely as possible and also check what
other implementations (e.g., uconv(1)) do. For example, checking the
earlier mentioned case may or may not give some hints:
$ echo Шема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
Šema
$ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
Shema
$ uconv -V
uconv v2.1 ICU 50.1.2
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-19 7:14 ` Marko Myllynen
@ 2018-11-19 9:22 ` Egor Kobylkin
2018-11-19 19:36 ` Marko Myllynen
2018-12-01 22:09 ` Rafal Luzynski
1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-19 9:22 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales
On 19.11.18 08:13, Marko Myllynen wrote:
> Hi,
>
> On 17/11/2018 20.34, Egor Kobylkin wrote:
>>
>> Shouldn't we have two explicit rules for transcription and
>> transliteration not dependent on a destination character set?
>>
>> This would contradict ISO 9.1995. (System A).
>> System A was added on Marko's request (so setting him on TO:) I am
>> neutral on keeping it or dropping it, just to be clear.
>>
>> This particular rule with h/x would make sense on its own.
>> But again - it would contradict the standards.
>> On the other hand, for my personal needs I care less about standards but
>> about current functionality and data loss because of missing
>> transcription altogether due to the BZ #2872.
>
> Given the number of questions above I think the way forward is to try to
> follow the relevant standards as closely as possible and also check what
> other implementations (e.g., uconv(1)) do. For example, checking the
> earlier mentioned case may or may not give some hints:
>
> $ echo Шема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Šema
> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Shema
> $ uconv -V
> uconv v2.1 ICU 50.1.2
Marko,
Your example only covers _transliteration_ to Latin with diacritics
iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
| iconv -f ISO-8859-15 -t UTF-8
while BZ #2872 is about _transcription_ to ASCII
iconv -f UTF-8 -t ASCII//TRANSLIT
The glibc wiki explicitly lists this use case (ASCII) as the test
example https://sourceware.org/glibc/wiki/Locales#Testing_Locales
So again, you are asking to have ISO 9.1995. System A but the bug is
about ISO 9.1995. System B (GOST 7.79-2000)
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (9 preceding siblings ...)
2018-11-14 21:25 ` [PATCH v9] " Egor Kobylkin
@ 2018-11-19 11:11 ` Egor Kobylkin
2018-12-08 0:02 ` Rafal Luzynski
2018-12-10 1:28 ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
` (2 subsequent siblings)
13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-11-19 11:11 UTC (permalink / raw)
To: libc-alpha, libc-locales
[-- Attachment #1: Type: text/plain, Size: 6611 bytes --]
Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics
Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
“copy "tr_TR"”.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration table translit_cyrillic file [7]
to localedata/locales/ and include it in all your locales going forward.
The patch is attached.
From this patch I have excluded locales that already mention cyrillic or
have a transliteration table for it:
mn_MN
sr_RS
tg_TJ
tk_TM
tt_RU
uk_UA
uz_UZ
uz_UZ@cyrillic
Their maintainers are requested to make an explicit decision on how and
whether at all to include this patch.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example
https://sourceware.org/glibc/wiki/Locales#Testing_Locales :
LC_ALL=$LOCALE.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt
currently it fails on Cyrillic texts in most locales including ru_RU [1]
[8] [9]:
LC_ALL=ru_RU.UTF-8 iconv -f UTF-8 -t ASCII//TRANSLIT <
translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- It produces a string of question marks and spaces.
This is what it should produce and it does so after the patch applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
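The before/after comparison above could be automated along the following lines. This is only a sketch in Python, assuming an iconv binary on PATH and an installed ru_RU.UTF-8 locale; the helper names iconv_translit and failing_cases are hypothetical and not part of glibc or its test suite.

```python
import os
import subprocess


def iconv_translit(text, locale="ru_RU.UTF-8"):
    """Run iconv -f UTF-8 -t ASCII//TRANSLIT under the given LC_ALL (assumed installed)."""
    env = dict(os.environ, LC_ALL=locale)
    result = subprocess.run(
        ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=env,
    )
    # The exit status is deliberately not checked here; a stricter harness may want to.
    return result.stdout.decode("ascii", errors="replace")


def failing_cases(transliterate, cases):
    """Return (source, expected, actual) for every case whose output differs."""
    failures = []
    for source, expected in cases:
        actual = transliterate(source)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures
```

A call such as failing_cases(iconv_translit, cases), with cases taken from translit-test-input.txt and its expected output, would then list the lines that still come out as question marks. Passing the transliteration function as a parameter keeps the comparison logic testable without iconv.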
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here. Furthermore it has to be referenced/included into the
active locale at the compilation time to be used by iconv.
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) of
UTF-8 encoded text written in a Cyrillic alphabet to ASCII text.
Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The transliteration table itself is in the file translit_cyrillic [7].
Its content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].
The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus the System B is chosen. [11]
The documentation suggests that the transliteration tables inclusion is
done by adding *include "translit_cyrillic";""* string into LC_CTYPE
translit_start section
http://man7.org/linux/man-pages/man5/locale.5.html [5]
Practically all locales that already have 'include .*translit.*;""'
string were identified and included into this patch.
The Cyrillic transliteration of e.g. Russian text may have already
worked to some extent for mn_MN, sr_RS, tk_TM, uz_UZ, uk_UA locales that
have their transliteration tables included inline.
I am excluding these locales from this proposed patch. I have written
directly to locale maintainer emails listed in the files. Volodymyr
Lisivka <vlisivka@gmail.com>, Max Kutny <mkutny@gmail.com> (uk_UA),
Данило Шеган <danilo@gnome.org> (sr_RS) have confirmed the
exclusion.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[7] translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#ISO_9:1995,_or_GOST_7.79_System_A
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
Best regards,
Egor Kobylkin
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch; name="0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch", Size: 63440 bytes --]
From ce25f26f21918147f6444dac0fa03096368e6494 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Mon, 19 Nov 2018 12:03:14 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* localedata/locales/translit_cyrillic: New file. Supports
ISO 9.1995, GOST 7.79 System B transcription table from Cyrillic
to ASCII.
* localedata/locales/aa_DJ: Add 'include "translit_cyrillic";""'
to LC_CTYPE translit section.
* localedata/locales/af_ZA: Likewise.
* localedata/locales/ak_GH: Likewise.
* localedata/locales/am_ET: Likewise.
* localedata/locales/ar_EG: Likewise.
* localedata/locales/be_BY: Likewise.
* localedata/locales/bem_ZM: Likewise.
* localedata/locales/ber_DZ: Likewise.
* localedata/locales/ber_MA: Likewise.
* localedata/locales/bg_BG: Likewise.
* localedata/locales/bi_VU: Likewise.
* localedata/locales/bn_BD: Likewise.
* localedata/locales/bo_CN: Likewise.
* localedata/locales/ca_ES: Likewise.
* localedata/locales/ce_RU: Likewise.
* localedata/locales/cmn_TW: Likewise.
* localedata/locales/cs_CZ: Likewise.
* localedata/locales/cv_RU: Likewise.
* localedata/locales/cy_GB: Likewise.
* localedata/locales/da_DK: Likewise.
* localedata/locales/de_DE: Likewise.
* localedata/locales/dv_MV: Likewise.
* localedata/locales/dz_BT: Likewise.
* localedata/locales/el_GR: Likewise.
* localedata/locales/en_GB: Likewise.
* localedata/locales/en_NG: Likewise.
* localedata/locales/en_ZM: Likewise.
* localedata/locales/es_CU: Likewise.
* localedata/locales/es_ES: Likewise.
* localedata/locales/et_EE: Likewise.
* localedata/locales/fa_IR: Likewise.
* localedata/locales/ff_SN: Likewise.
* localedata/locales/fi_FI: Likewise.
* localedata/locales/fr_FR: Likewise.
* localedata/locales/ga_IE: Likewise.
* localedata/locales/gd_GB: Likewise.
* localedata/locales/gu_IN: Likewise.
* localedata/locales/gv_GB: Likewise.
* localedata/locales/he_IL: Likewise.
* localedata/locales/hi_IN: Likewise.
* localedata/locales/hif_FJ: Likewise.
* localedata/locales/hr_HR: Likewise.
* localedata/locales/ht_HT: Likewise.
* localedata/locales/hu_HU: Likewise.
* localedata/locales/hy_AM: Likewise.
* localedata/locales/id_ID: Likewise.
* localedata/locales/is_IS: Likewise.
* localedata/locales/it_IT: Likewise.
* localedata/locales/ja_JP: Likewise.
* localedata/locales/kab_DZ: Likewise.
* localedata/locales/kk_KZ: Likewise.
* localedata/locales/km_KH: Likewise.
* localedata/locales/kn_IN: Likewise.
* localedata/locales/ko_KR: Likewise.
* localedata/locales/ks_IN: Likewise.
* localedata/locales/kw_GB: Likewise.
* localedata/locales/ky_KG: Likewise.
* localedata/locales/lb_LU: Likewise.
* localedata/locales/lg_UG: Likewise.
* localedata/locales/lij_IT: Likewise.
* localedata/locales/ln_CD: Likewise.
* localedata/locales/lo_LA: Likewise.
* localedata/locales/lt_LT: Likewise.
* localedata/locales/lv_LV: Likewise.
* localedata/locales/mg_MG: Likewise.
* localedata/locales/mhr_RU: Likewise.
* localedata/locales/mk_MK: Likewise.
* localedata/locales/ml_IN: Likewise.
* localedata/locales/ms_MY: Likewise.
* localedata/locales/mt_MT: Likewise.
* localedata/locales/nan_TW@latin: Likewise.
* localedata/locales/nb_NO: Likewise.
* localedata/locales/ne_NP: Likewise.
* localedata/locales/nhn_MX: Likewise.
* localedata/locales/niu_NU: Likewise.
* localedata/locales/niu_NZ: Likewise.
* localedata/locales/nl_NL: Likewise.
* localedata/locales/nr_ZA: Likewise.
* localedata/locales/oc_FR: Likewise.
* localedata/locales/om_KE: Likewise.
* localedata/locales/or_IN: Likewise.
* localedata/locales/os_RU: Likewise.
* localedata/locales/pa_IN: Likewise.
* localedata/locales/pa_PK: Likewise.
* localedata/locales/pl_PL: Likewise.
* localedata/locales/pt_PT: Likewise.
* localedata/locales/quz_PE: Likewise.
* localedata/locales/ro_RO: Likewise.
* localedata/locales/ru_RU: Likewise.
* localedata/locales/rw_RW: Likewise.
* localedata/locales/sa_IN: Likewise.
* localedata/locales/sd_IN: Likewise.
* localedata/locales/sd_IN@devanagari: Likewise.
* localedata/locales/se_NO: Likewise.
* localedata/locales/sgs_LT: Likewise.
* localedata/locales/shn_MM: Likewise.
* localedata/locales/si_LK: Likewise.
* localedata/locales/sk_SK: Likewise.
* localedata/locales/sl_SI: Likewise.
* localedata/locales/sm_WS: Likewise.
* localedata/locales/so_SO: Likewise.
* localedata/locales/sq_AL: Likewise.
* localedata/locales/ss_ZA: Likewise.
* localedata/locales/st_ZA: Likewise.
* localedata/locales/sv_SE: Likewise.
* localedata/locales/sw_KE: Likewise.
* localedata/locales/ta_IN: Likewise.
* localedata/locales/te_IN: Likewise.
* localedata/locales/th_TH: Likewise.
* localedata/locales/ti_ET: Likewise.
* localedata/locales/tn_ZA: Likewise.
* localedata/locales/to_TO: Likewise.
* localedata/locales/tpi_PG: Likewise.
* localedata/locales/tr_TR: Likewise.
* localedata/locales/ts_ZA: Likewise.
* localedata/locales/unm_US: Likewise.
* localedata/locales/ur_IN: Likewise.
* localedata/locales/ur_PK: Likewise.
* localedata/locales/ve_ZA: Likewise.
* localedata/locales/vi_VN: Likewise.
* localedata/locales/wa_BE: Likewise.
* localedata/locales/wo_SN: Likewise.
* localedata/locales/xh_ZA: Likewise.
* localedata/locales/yi_US: Likewise.
* localedata/locales/yuw_PG: Likewise.
* localedata/locales/zh_CN: Likewise.
* localedata/locales/zu_ZA: Likewise.
---
localedata/locales/aa_DJ | 1 +
localedata/locales/af_ZA | 1 +
localedata/locales/ak_GH | 1 +
localedata/locales/am_ET | 1 +
localedata/locales/ar_EG | 1 +
localedata/locales/be_BY | 1 +
localedata/locales/bem_ZM | 1 +
localedata/locales/ber_DZ | 1 +
localedata/locales/ber_MA | 1 +
localedata/locales/bg_BG | 1 +
localedata/locales/bi_VU | 1 +
localedata/locales/bn_BD | 1 +
localedata/locales/bo_CN | 1 +
localedata/locales/ca_ES | 1 +
localedata/locales/ce_RU | 1 +
localedata/locales/cmn_TW | 1 +
localedata/locales/cs_CZ | 1 +
localedata/locales/cv_RU | 1 +
localedata/locales/cy_GB | 1 +
localedata/locales/da_DK | 1 +
localedata/locales/de_DE | 1 +
localedata/locales/dv_MV | 1 +
localedata/locales/dz_BT | 1 +
localedata/locales/el_GR | 1 +
localedata/locales/en_GB | 1 +
localedata/locales/en_NG | 1 +
localedata/locales/en_ZM | 1 +
localedata/locales/es_CU | 1 +
localedata/locales/es_ES | 1 +
localedata/locales/et_EE | 1 +
localedata/locales/fa_IR | 1 +
localedata/locales/ff_SN | 1 +
localedata/locales/fi_FI | 1 +
localedata/locales/fr_FR | 1 +
localedata/locales/ga_IE | 1 +
localedata/locales/gd_GB | 1 +
localedata/locales/gu_IN | 1 +
localedata/locales/gv_GB | 1 +
localedata/locales/he_IL | 1 +
localedata/locales/hi_IN | 1 +
localedata/locales/hif_FJ | 1 +
localedata/locales/hr_HR | 1 +
localedata/locales/ht_HT | 1 +
localedata/locales/hu_HU | 1 +
localedata/locales/hy_AM | 1 +
localedata/locales/id_ID | 1 +
localedata/locales/is_IS | 1 +
localedata/locales/it_IT | 1 +
localedata/locales/ja_JP | 1 +
localedata/locales/kab_DZ | 1 +
localedata/locales/kk_KZ | 1 +
localedata/locales/km_KH | 1 +
localedata/locales/kn_IN | 1 +
localedata/locales/ko_KR | 1 +
localedata/locales/ks_IN | 1 +
localedata/locales/kw_GB | 1 +
localedata/locales/ky_KG | 1 +
localedata/locales/lb_LU | 1 +
localedata/locales/lg_UG | 1 +
localedata/locales/lij_IT | 1 +
localedata/locales/ln_CD | 1 +
localedata/locales/lo_LA | 1 +
localedata/locales/lt_LT | 1 +
localedata/locales/lv_LV | 1 +
localedata/locales/mg_MG | 1 +
localedata/locales/mhr_RU | 1 +
localedata/locales/mk_MK | 1 +
localedata/locales/ml_IN | 1 +
localedata/locales/ms_MY | 1 +
localedata/locales/mt_MT | 1 +
localedata/locales/nan_TW@latin | 1 +
localedata/locales/nb_NO | 1 +
localedata/locales/ne_NP | 1 +
localedata/locales/nhn_MX | 1 +
localedata/locales/niu_NU | 1 +
localedata/locales/niu_NZ | 1 +
localedata/locales/nl_NL | 1 +
localedata/locales/nr_ZA | 1 +
localedata/locales/oc_FR | 1 +
localedata/locales/om_KE | 1 +
localedata/locales/or_IN | 1 +
localedata/locales/os_RU | 1 +
localedata/locales/pa_IN | 1 +
localedata/locales/pa_PK | 1 +
localedata/locales/pl_PL | 1 +
localedata/locales/pt_PT | 1 +
localedata/locales/quz_PE | 1 +
localedata/locales/ro_RO | 1 +
localedata/locales/ru_RU | 1 +
localedata/locales/rw_RW | 1 +
localedata/locales/sa_IN | 1 +
localedata/locales/sd_IN | 1 +
localedata/locales/sd_IN@devanagari | 1 +
localedata/locales/se_NO | 1 +
localedata/locales/sgs_LT | 1 +
localedata/locales/shn_MM | 1 +
localedata/locales/si_LK | 1 +
localedata/locales/sk_SK | 1 +
localedata/locales/sl_SI | 1 +
localedata/locales/sm_WS | 1 +
localedata/locales/so_SO | 1 +
localedata/locales/sq_AL | 1 +
localedata/locales/ss_ZA | 1 +
localedata/locales/st_ZA | 1 +
localedata/locales/sv_SE | 1 +
localedata/locales/sw_KE | 1 +
localedata/locales/ta_IN | 1 +
localedata/locales/te_IN | 1 +
localedata/locales/th_TH | 1 +
localedata/locales/ti_ET | 1 +
localedata/locales/tn_ZA | 1 +
localedata/locales/to_TO | 1 +
localedata/locales/tpi_PG | 1 +
localedata/locales/tr_TR | 1 +
localedata/locales/translit_cyrillic | 378 +++++++++++++++++++++++++++
localedata/locales/ts_ZA | 1 +
localedata/locales/unm_US | 1 +
localedata/locales/ur_IN | 1 +
localedata/locales/ur_PK | 1 +
localedata/locales/ve_ZA | 1 +
localedata/locales/vi_VN | 1 +
localedata/locales/wa_BE | 1 +
localedata/locales/wo_SN | 1 +
localedata/locales/xh_ZA | 1 +
localedata/locales/yi_US | 1 +
localedata/locales/yuw_PG | 1 +
localedata/locales/zh_CN | 1 +
localedata/locales/zu_ZA | 1 +
128 files changed, 505 insertions(+)
create mode 100644 localedata/locales/translit_cyrillic
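Note on the per-locale hunks: each one adds a single include line inside the locale's translit_start/translit_end block, after the include lines already there. As a rough illustration only (not the tool used to generate this patch), the same edit could be sketched in Python as below. The helper name add_cyrillic_include and the "insert after the last include in the block" rule are assumptions inferred from the hunks in this patch.

```python
# Hypothetical helper: mirrors what each one-line hunk in this patch does.
CYRILLIC = 'include "translit_cyrillic";""'


def add_cyrillic_include(locale_text: str) -> str:
    """Return locale_text with the Cyrillic include added to its translit block.

    The line is inserted after the last `include` line found between
    translit_start and translit_end. Text that already contains the
    include, or that has no include inside a translit block, is
    returned unchanged.
    """
    lines = locale_text.split("\n")
    if any(line.strip() == CYRILLIC for line in lines):
        return locale_text  # already patched, keep the edit idempotent

    in_block = False
    last_include = None
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped == "translit_start":
            in_block = True
        elif stripped == "translit_end":
            in_block = False
        elif in_block and stripped.startswith("include "):
            last_include = index

    if last_include is None:
        return locale_text  # nothing to anchor on, leave the file alone

    lines.insert(last_include + 1, CYRILLIC)
    return "\n".join(lines)
```

Applied to a locale whose block contains only the translit_combining include, this yields the same result as the aa_DJ hunk; for ja_JP and ko_KR the line lands after the second include, as in their hunks.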
diff --git a/localedata/locales/aa_DJ b/localedata/locales/aa_DJ
index fcb9af8abc..533e5b714e 100644
--- a/localedata/locales/aa_DJ
+++ b/localedata/locales/aa_DJ
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/af_ZA b/localedata/locales/af_ZA
index 2f45ddad63..d16bbcf707 100644
--- a/localedata/locales/af_ZA
+++ b/localedata/locales/af_ZA
@@ -70,6 +70,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ak_GH b/localedata/locales/ak_GH
index 926e4df343..d743ba48c7 100644
--- a/localedata/locales/ak_GH
+++ b/localedata/locales/ak_GH
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/am_ET b/localedata/locales/am_ET
index e5fe88a4cd..bee494be0a 100644
--- a/localedata/locales/am_ET
+++ b/localedata/locales/am_ET
@@ -96,6 +96,7 @@ copy "i18n"
space <U1361>
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% hoy-sadis followed by a vowel
<U1205><U12A0> <U0068><U0027><U0065>
diff --git a/localedata/locales/ar_EG b/localedata/locales/ar_EG
index c8cb3180bf..f2584cd7ad 100644
--- a/localedata/locales/ar_EG
+++ b/localedata/locales/ar_EG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/be_BY b/localedata/locales/be_BY
index 324379b65a..4fb16d3540 100644
--- a/localedata/locales/be_BY
+++ b/localedata/locales/be_BY
@@ -91,6 +91,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bem_ZM b/localedata/locales/bem_ZM
index fa43ad1610..7a8c3c3b77 100644
--- a/localedata/locales/bem_ZM
+++ b/localedata/locales/bem_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_DZ b/localedata/locales/ber_DZ
index 79f3d289b1..137643873d 100644
--- a/localedata/locales/ber_DZ
+++ b/localedata/locales/ber_DZ
@@ -136,6 +136,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ber_MA b/localedata/locales/ber_MA
index b9bd64868c..fd79bf11d6 100644
--- a/localedata/locales/ber_MA
+++ b/localedata/locales/ber_MA
@@ -83,6 +83,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bg_BG b/localedata/locales/bg_BG
index 7a9cfa0a5d..504199a4d9 100644
--- a/localedata/locales/bg_BG
+++ b/localedata/locales/bg_BG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bi_VU b/localedata/locales/bi_VU
index 88bf70a61b..81d717b2f6 100755
--- a/localedata/locales/bi_VU
+++ b/localedata/locales/bi_VU
@@ -39,6 +39,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bn_BD b/localedata/locales/bn_BD
index 73efd1cbc3..bc82d611e0 100644
--- a/localedata/locales/bn_BD
+++ b/localedata/locales/bn_BD
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/bo_CN b/localedata/locales/bo_CN
index 90cbc7807b..7779d3d99b 100644
--- a/localedata/locales/bo_CN
+++ b/localedata/locales/bo_CN
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ca_ES b/localedata/locales/ca_ES
index 0ba74ccf33..af72a1ab86 100644
--- a/localedata/locales/ca_ES
+++ b/localedata/locales/ca_ES
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ce_RU b/localedata/locales/ce_RU
index 03e60f838a..75ef80498d 100644
--- a/localedata/locales/ce_RU
+++ b/localedata/locales/ce_RU
@@ -38,6 +38,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index cca7cc19af..3866f06004 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/cs_CZ b/localedata/locales/cs_CZ
index 41fbd2be93..9450d22f2f 100644
--- a/localedata/locales/cs_CZ
+++ b/localedata/locales/cs_CZ
@@ -215,6 +215,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cv_RU b/localedata/locales/cv_RU
index e9247b39f8..253cbd63af 100644
--- a/localedata/locales/cv_RU
+++ b/localedata/locales/cv_RU
@@ -103,6 +103,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/cy_GB b/localedata/locales/cy_GB
index 5f6fd7c87f..6d35d7c27e 100644
--- a/localedata/locales/cy_GB
+++ b/localedata/locales/cy_GB
@@ -65,6 +65,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/da_DK b/localedata/locales/da_DK
index 05a2681bef..1b38e8af17 100644
--- a/localedata/locales/da_DK
+++ b/localedata/locales/da_DK
@@ -147,6 +147,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/de_DE b/localedata/locales/de_DE
index eaa9f7ff8e..85793437a5 100644
--- a/localedata/locales/de_DE
+++ b/localedata/locales/de_DE
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts.
% LATIN CAPITAL LETTER A WITH DIAERESIS.
diff --git a/localedata/locales/dv_MV b/localedata/locales/dv_MV
index 0d7842f39f..f9c8de4a50 100644
--- a/localedata/locales/dv_MV
+++ b/localedata/locales/dv_MV
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/dz_BT b/localedata/locales/dz_BT
index 272fa7e78f..31d488ad0c 100644
--- a/localedata/locales/dz_BT
+++ b/localedata/locales/dz_BT
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/el_GR b/localedata/locales/el_GR
index 7362492fbd..994a4a913d 100644
--- a/localedata/locales/el_GR
+++ b/localedata/locales/el_GR
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..2f1cc5904b 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_NG b/localedata/locales/en_NG
index 109201c2fe..fa70ffe943 100644
--- a/localedata/locales/en_NG
+++ b/localedata/locales/en_NG
@@ -49,6 +49,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/en_ZM b/localedata/locales/en_ZM
index 8957d8e8aa..1fc5dfed65 100644
--- a/localedata/locales/en_ZM
+++ b/localedata/locales/en_ZM
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_CU b/localedata/locales/es_CU
index d37d452b0f..90c714ea18 100644
--- a/localedata/locales/es_CU
+++ b/localedata/locales/es_CU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/es_ES b/localedata/locales/es_ES
index aa919a2626..534152d0a8 100644
--- a/localedata/locales/es_ES
+++ b/localedata/locales/es_ES
@@ -107,6 +107,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/et_EE b/localedata/locales/et_EE
index f5c47149a6..51e6a4ab13 100644
--- a/localedata/locales/et_EE
+++ b/localedata/locales/et_EE
@@ -113,6 +113,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fa_IR b/localedata/locales/fa_IR
index 3714a30932..fdeaf6312e 100644
--- a/localedata/locales/fa_IR
+++ b/localedata/locales/fa_IR
@@ -78,6 +78,7 @@ map to_outpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ff_SN b/localedata/locales/ff_SN
index e4b18eba7b..32e2eb78d8 100644
--- a/localedata/locales/ff_SN
+++ b/localedata/locales/ff_SN
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fi_FI b/localedata/locales/fi_FI
index eeb278316b..57eda9bff1 100644
--- a/localedata/locales/fi_FI
+++ b/localedata/locales/fi_FI
@@ -177,6 +177,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/fr_FR b/localedata/locales/fr_FR
index a18c514f19..098be4906f 100644
--- a/localedata/locales/fr_FR
+++ b/localedata/locales/fr_FR
@@ -57,6 +57,7 @@ translit_start
% In France, accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ga_IE b/localedata/locales/ga_IE
index 782adbaa5c..d430028b74 100644
--- a/localedata/locales/ga_IE
+++ b/localedata/locales/ga_IE
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gd_GB b/localedata/locales/gd_GB
index 8d54593113..aaa41a0bda 100644
--- a/localedata/locales/gd_GB
+++ b/localedata/locales/gd_GB
@@ -45,6 +45,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gu_IN b/localedata/locales/gu_IN
index cd7e23a4be..00f00d4f8d 100644
--- a/localedata/locales/gu_IN
+++ b/localedata/locales/gu_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/gv_GB b/localedata/locales/gv_GB
index 473c043cba..3c6ba93629 100644
--- a/localedata/locales/gv_GB
+++ b/localedata/locales/gv_GB
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/he_IL b/localedata/locales/he_IL
index 52b5a6bff0..82a0760c10 100644
--- a/localedata/locales/he_IL
+++ b/localedata/locales/he_IL
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hi_IN b/localedata/locales/hi_IN
index a94365519f..12a44e6689 100644
--- a/localedata/locales/hi_IN
+++ b/localedata/locales/hi_IN
@@ -61,6 +61,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hif_FJ b/localedata/locales/hif_FJ
index 5433bb4a2a..005ac6d308 100644
--- a/localedata/locales/hif_FJ
+++ b/localedata/locales/hif_FJ
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hr_HR b/localedata/locales/hr_HR
index 029a3794e2..8222d73ff0 100644
--- a/localedata/locales/hr_HR
+++ b/localedata/locales/hr_HR
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% Historicaly we used ISO-8869-2 and wrote digraphs
% <U01C6> {ǆ}, <U01C9> {ǉ} and <U01CC> {ǌ}
diff --git a/localedata/locales/ht_HT b/localedata/locales/ht_HT
index 0e0a79d2f1..69688a401e 100644
--- a/localedata/locales/ht_HT
+++ b/localedata/locales/ht_HT
@@ -57,6 +57,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/hu_HU b/localedata/locales/hu_HU
index 9d6bb85022..5e19e5b689 100644
--- a/localedata/locales/hu_HU
+++ b/localedata/locales/hu_HU
@@ -455,6 +455,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
<U00C1> "<U0041><U0301>";"<U0041><U00B4>";"<U0041><U0027>"
<U00C9> "<U0045><U0301>";"<U0045><U00B4>";"<U0045><U0027>"
diff --git a/localedata/locales/hy_AM b/localedata/locales/hy_AM
index 74e1b77efb..5973c85f33 100644
--- a/localedata/locales/hy_AM
+++ b/localedata/locales/hy_AM
@@ -75,6 +75,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/id_ID b/localedata/locales/id_ID
index 3ddd8d07da..af36159ca6 100644
--- a/localedata/locales/id_ID
+++ b/localedata/locales/id_ID
@@ -54,6 +54,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/is_IS b/localedata/locales/is_IS
index 8d59b468d6..f614fea728 100644
--- a/localedata/locales/is_IS
+++ b/localedata/locales/is_IS
@@ -149,6 +149,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/it_IT b/localedata/locales/it_IT
index 8a10545de0..7d4cda7fc6 100644
--- a/localedata/locales/it_IT
+++ b/localedata/locales/it_IT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ja_JP b/localedata/locales/ja_JP
index 1fd2fee44b..34ed430947 100644
--- a/localedata/locales/ja_JP
+++ b/localedata/locales/ja_JP
@@ -1680,6 +1680,7 @@ translit_start
include "translit_combining";""
include "translit_cjk_variants";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/kab_DZ b/localedata/locales/kab_DZ
index a165f53f01..4cf468c6a5 100644
--- a/localedata/locales/kab_DZ
+++ b/localedata/locales/kab_DZ
@@ -41,6 +41,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kk_KZ b/localedata/locales/kk_KZ
index c29c84b46e..c4ceb28b27 100644
--- a/localedata/locales/kk_KZ
+++ b/localedata/locales/kk_KZ
@@ -99,6 +99,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/km_KH b/localedata/locales/km_KH
index 0d8c9ce78d..acd9291346 100644
--- a/localedata/locales/km_KH
+++ b/localedata/locales/km_KH
@@ -42,6 +42,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kn_IN b/localedata/locales/kn_IN
index b6443d12c8..cffa4e4544 100644
--- a/localedata/locales/kn_IN
+++ b/localedata/locales/kn_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ko_KR b/localedata/locales/ko_KR
index bd0d919218..31a8b105c5 100644
--- a/localedata/locales/ko_KR
+++ b/localedata/locales/ko_KR
@@ -6098,6 +6098,7 @@ translit_start
include "translit_combining";""
include "translit_hangul";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/ks_IN b/localedata/locales/ks_IN
index 9ab8707922..0c1572b8fd 100644
--- a/localedata/locales/ks_IN
+++ b/localedata/locales/ks_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/kw_GB b/localedata/locales/kw_GB
index c0433b3f07..1eb4cfd1c1 100644
--- a/localedata/locales/kw_GB
+++ b/localedata/locales/kw_GB
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ky_KG b/localedata/locales/ky_KG
index 871b8a818b..f46b6979e2 100644
--- a/localedata/locales/ky_KG
+++ b/localedata/locales/ky_KG
@@ -82,6 +82,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lb_LU b/localedata/locales/lb_LU
index 92f1e22e1a..992d0f677d 100644
--- a/localedata/locales/lb_LU
+++ b/localedata/locales/lb_LU
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% German umlauts
% LATIN CAPITAL LETTER A WITH DIAERESIS
diff --git a/localedata/locales/lg_UG b/localedata/locales/lg_UG
index 70dd1cad2e..57dd8c74e8 100644
--- a/localedata/locales/lg_UG
+++ b/localedata/locales/lg_UG
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lij_IT b/localedata/locales/lij_IT
index 2d6e5fcc5c..baec837196 100644
--- a/localedata/locales/lij_IT
+++ b/localedata/locales/lij_IT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ln_CD b/localedata/locales/ln_CD
index ed6404a1e5..a91441809c 100644
--- a/localedata/locales/ln_CD
+++ b/localedata/locales/ln_CD
@@ -39,6 +39,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lo_LA b/localedata/locales/lo_LA
index d60d157167..2abd680a6a 100644
--- a/localedata/locales/lo_LA
+++ b/localedata/locales/lo_LA
@@ -50,6 +50,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lt_LT b/localedata/locales/lt_LT
index e9834bd200..a58168dc45 100644
--- a/localedata/locales/lt_LT
+++ b/localedata/locales/lt_LT
@@ -163,6 +163,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/lv_LV b/localedata/locales/lv_LV
index a20cbdde46..e3fb992562 100644
--- a/localedata/locales/lv_LV
+++ b/localedata/locales/lv_LV
@@ -125,6 +125,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mg_MG b/localedata/locales/mg_MG
index 266ff17e7d..ee1ed56fed 100644
--- a/localedata/locales/mg_MG
+++ b/localedata/locales/mg_MG
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/mhr_RU b/localedata/locales/mhr_RU
index 85ac21b35a..b936253ebc 100644
--- a/localedata/locales/mhr_RU
+++ b/localedata/locales/mhr_RU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mk_MK b/localedata/locales/mk_MK
index 87bae1dc7c..210cfce05c 100644
--- a/localedata/locales/mk_MK
+++ b/localedata/locales/mk_MK
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ml_IN b/localedata/locales/ml_IN
index d7a8f43f1e..794d59f923 100644
--- a/localedata/locales/ml_IN
+++ b/localedata/locales/ml_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
%
diff --git a/localedata/locales/ms_MY b/localedata/locales/ms_MY
index 66b5dd98e9..4fa53adbc3 100644
--- a/localedata/locales/ms_MY
+++ b/localedata/locales/ms_MY
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/mt_MT b/localedata/locales/mt_MT
index a6ab7b1dad..4b6a08f4e1 100644
--- a/localedata/locales/mt_MT
+++ b/localedata/locales/mt_MT
@@ -47,6 +47,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nan_TW@latin b/localedata/locales/nan_TW@latin
index d4579a4cdf..99e2bd80ab 100644
--- a/localedata/locales/nan_TW@latin
+++ b/localedata/locales/nan_TW@latin
@@ -51,6 +51,7 @@ translit_start
% accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/nb_NO b/localedata/locales/nb_NO
index a8675b6104..4c90307366 100644
--- a/localedata/locales/nb_NO
+++ b/localedata/locales/nb_NO
@@ -144,6 +144,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/ne_NP b/localedata/locales/ne_NP
index eb80eabbd8..3aecda7fd7 100644
--- a/localedata/locales/ne_NP
+++ b/localedata/locales/ne_NP
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nhn_MX b/localedata/locales/nhn_MX
index 88a89765e8..a5e286bc4c 100644
--- a/localedata/locales/nhn_MX
+++ b/localedata/locales/nhn_MX
@@ -59,6 +59,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NU b/localedata/locales/niu_NU
index 553c5d9edc..e34f33e0c6 100644
--- a/localedata/locales/niu_NU
+++ b/localedata/locales/niu_NU
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/niu_NZ b/localedata/locales/niu_NZ
index 560101b447..85acd3bc44 100644
--- a/localedata/locales/niu_NZ
+++ b/localedata/locales/niu_NZ
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nl_NL b/localedata/locales/nl_NL
index 1ab3277aa0..6284728fe7 100644
--- a/localedata/locales/nl_NL
+++ b/localedata/locales/nl_NL
@@ -56,6 +56,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/nr_ZA b/localedata/locales/nr_ZA
index 7de6420a6b..caf2aba2e4 100644
--- a/localedata/locales/nr_ZA
+++ b/localedata/locales/nr_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/oc_FR b/localedata/locales/oc_FR
index 707927ee26..f347c8c4d8 100644
--- a/localedata/locales/oc_FR
+++ b/localedata/locales/oc_FR
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/om_KE b/localedata/locales/om_KE
index 66cdcf5c45..a75a623053 100644
--- a/localedata/locales/om_KE
+++ b/localedata/locales/om_KE
@@ -156,6 +156,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/or_IN b/localedata/locales/or_IN
index ef28b58895..5c7b9cf8ef 100644
--- a/localedata/locales/or_IN
+++ b/localedata/locales/or_IN
@@ -62,6 +62,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/os_RU b/localedata/locales/os_RU
index 9a4ce037cd..7ab0b7a9bc 100644
--- a/localedata/locales/os_RU
+++ b/localedata/locales/os_RU
@@ -71,6 +71,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_IN b/localedata/locales/pa_IN
index ca28f21162..93e17fa848 100644
--- a/localedata/locales/pa_IN
+++ b/localedata/locales/pa_IN
@@ -60,6 +60,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pa_PK b/localedata/locales/pa_PK
index 1f49bdc90d..7782adb5d8 100644
--- a/localedata/locales/pa_PK
+++ b/localedata/locales/pa_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/pl_PL b/localedata/locales/pl_PL
index 4c1b2a869d..8caa5e8579 100644
--- a/localedata/locales/pl_PL
+++ b/localedata/locales/pl_PL
@@ -130,6 +130,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/pt_PT b/localedata/locales/pt_PT
index 6225036edf..d52ac3ac26 100644
--- a/localedata/locales/pt_PT
+++ b/localedata/locales/pt_PT
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/quz_PE b/localedata/locales/quz_PE
index f6b1956b93..018cd9a7e5 100644
--- a/localedata/locales/quz_PE
+++ b/localedata/locales/quz_PE
@@ -55,6 +55,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ro_RO b/localedata/locales/ro_RO
index 39c4d09a07..6443d66d6a 100644
--- a/localedata/locales/ro_RO
+++ b/localedata/locales/ro_RO
@@ -129,6 +129,7 @@ copy "i18n"
%
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if t/scomma is not available, try first t/scedilla
<U0218> "<U015E>";"<U0053>"
diff --git a/localedata/locales/ru_RU b/localedata/locales/ru_RU
index fdb2059fe7..1f6d2c6935 100644
--- a/localedata/locales/ru_RU
+++ b/localedata/locales/ru_RU
@@ -69,6 +69,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/rw_RW b/localedata/locales/rw_RW
index e0bc763c5a..e12a3d83a3 100644
--- a/localedata/locales/rw_RW
+++ b/localedata/locales/rw_RW
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sa_IN b/localedata/locales/sa_IN
index 4eaf6fe1fe..6ebb5e4f90 100644
--- a/localedata/locales/sa_IN
+++ b/localedata/locales/sa_IN
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN b/localedata/locales/sd_IN
index e5ab80b062..23b7424d3b 100644
--- a/localedata/locales/sd_IN
+++ b/localedata/locales/sd_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sd_IN@devanagari b/localedata/locales/sd_IN@devanagari
index d57cea639b..0a122b95ac 100644
--- a/localedata/locales/sd_IN@devanagari
+++ b/localedata/locales/sd_IN@devanagari
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/se_NO b/localedata/locales/se_NO
index b50001139a..b423d93531 100644
--- a/localedata/locales/se_NO
+++ b/localedata/locales/se_NO
@@ -221,6 +221,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sgs_LT b/localedata/locales/sgs_LT
index 6b6ab1cac9..561c43b651 100644
--- a/localedata/locales/sgs_LT
+++ b/localedata/locales/sgs_LT
@@ -58,6 +58,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/shn_MM b/localedata/locales/shn_MM
index 4212c50ec5..079506dafc 100644
--- a/localedata/locales/shn_MM
+++ b/localedata/locales/shn_MM
@@ -58,6 +58,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/si_LK b/localedata/locales/si_LK
index dc4a9eb04d..4d2fc8b3f0 100644
--- a/localedata/locales/si_LK
+++ b/localedata/locales/si_LK
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sk_SK b/localedata/locales/sk_SK
index 94e6e12bb2..086499bb7e 100644
--- a/localedata/locales/sk_SK
+++ b/localedata/locales/sk_SK
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sl_SI b/localedata/locales/sl_SI
index 6157b26d4f..dd9b516111 100644
--- a/localedata/locales/sl_SI
+++ b/localedata/locales/sl_SI
@@ -2120,6 +2120,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sm_WS b/localedata/locales/sm_WS
index 6058fbdc38..b9954ae30e 100644
--- a/localedata/locales/sm_WS
+++ b/localedata/locales/sm_WS
@@ -37,6 +37,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/so_SO b/localedata/locales/so_SO
index 713bf79608..9ed4d68ce9 100644
--- a/localedata/locales/so_SO
+++ b/localedata/locales/so_SO
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sq_AL b/localedata/locales/sq_AL
index b16a459c56..d9154d7f9e 100644
--- a/localedata/locales/sq_AL
+++ b/localedata/locales/sq_AL
@@ -45,6 +45,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ss_ZA b/localedata/locales/ss_ZA
index 7532a1940b..31c45321ce 100644
--- a/localedata/locales/ss_ZA
+++ b/localedata/locales/ss_ZA
@@ -66,6 +66,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/st_ZA b/localedata/locales/st_ZA
index 706ef3e50a..b62f478f5f 100644
--- a/localedata/locales/st_ZA
+++ b/localedata/locales/st_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/sv_SE b/localedata/locales/sv_SE
index aa28c23776..7443ee277c 100644
--- a/localedata/locales/sv_SE
+++ b/localedata/locales/sv_SE
@@ -151,6 +151,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% LATIN CAPITAL LETTER A WITH DIAERESIS -> "AE"
<U00C4> "<U0041><U0308>";"<U0041><U0045>"
diff --git a/localedata/locales/sw_KE b/localedata/locales/sw_KE
index 6c303da983..1e3f848e1d 100644
--- a/localedata/locales/sw_KE
+++ b/localedata/locales/sw_KE
@@ -43,6 +43,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ta_IN b/localedata/locales/ta_IN
index 5a083d2658..ec08739ebd 100644
--- a/localedata/locales/ta_IN
+++ b/localedata/locales/ta_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/te_IN b/localedata/locales/te_IN
index b70f320051..99ffb43bf5 100644
--- a/localedata/locales/te_IN
+++ b/localedata/locales/te_IN
@@ -63,6 +63,7 @@ map to_inpunct; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/th_TH b/localedata/locales/th_TH
index 7a10376e80..148a1c632b 100644
--- a/localedata/locales/th_TH
+++ b/localedata/locales/th_TH
@@ -57,6 +57,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ti_ET b/localedata/locales/ti_ET
index 6c387604e9..2c2e32a702 100644
--- a/localedata/locales/ti_ET
+++ b/localedata/locales/ti_ET
@@ -864,6 +864,7 @@ translit_start
<U137C> <U0060><U0031><U0030><U0030><U0030><U0030>
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
%
END LC_CTYPE
diff --git a/localedata/locales/tn_ZA b/localedata/locales/tn_ZA
index 8473426eab..274336c8d3 100644
--- a/localedata/locales/tn_ZA
+++ b/localedata/locales/tn_ZA
@@ -67,6 +67,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/to_TO b/localedata/locales/to_TO
index 7abe8685df..09e5e093d5 100644
--- a/localedata/locales/to_TO
+++ b/localedata/locales/to_TO
@@ -36,6 +36,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tpi_PG b/localedata/locales/tpi_PG
index 3315c27633..e625543fcb 100644
--- a/localedata/locales/tpi_PG
+++ b/localedata/locales/tpi_PG
@@ -44,6 +44,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index f7c13ddf4b..c751dc696a 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -2535,6 +2535,7 @@ class "combining_level3"; /
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% TURKISH LIRA SIGN
<U20BA> "<U0054><U004C>"
diff --git a/localedata/locales/translit_cyrillic b/localedata/locales/translit_cyrillic
new file mode 100644
index 0000000000..253f5c9618
--- /dev/null
+++ b/localedata/locales/translit_cyrillic
@@ -0,0 +1,378 @@
+escape_char /
+comment_char %
+
+% This file is part of the GNU C Library and contains locale data.
+% The Free Software Foundation does not claim any copyright interest
+% in the locale data contained in this file. The foregoing does not
+% affect the license of the GNU C Library as a whole. It does not
+% exempt you from the conditions of the license if your use would
+% otherwise be governed by that license.
+
+% Transcription of Cyrillic letters to ASCII symbols.
+% Inspired by ISO 9.1995 / GOST 7.79-2000 System B.
+% Covers Unicode Range https://www.unicode.org/charts/PDF/U0400.pdf
+% i.e. [U0401-U04F9, U2019] but only the letters covered by ISO 9.1995
+% Check https://en.wikipedia.org/wiki/ISO_9 for reference.
+% The System B is extended from GOST_7.79-Russian using open sources
+% of the transliteration mappings and the "h/`" diacritics logic.
+% Capital Cyrillic letters that are transcribed with two ASCII letters
+% combination get both ASCII letters capitalized to avoid collisions.
+
+
+% Usage examples:
+% iconv -f UTF-8 -t ASCII//TRANSLIT # System B.
+
+% Contributions welcome for the rest of Cyrillic script in Unicode
+% https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode.
+% Bugfix for https://sourceware.org/bugzilla/show_bug.cgi?id=2872.
+% Generated from UnicodeData.txt with a spreadsheet referenced
+% in that bugs doclet
+
+LC_CTYPE
+
+translit_start
+
+% CYRILLIC CAPITAL LETTER IO
+<U0401> "<U0059><U004F>"
+% CYRILLIC CAPITAL LETTER DJE
+<U0402> "<U0044><U004A>"
+% CYRILLIC CAPITAL LETTER GJE
+<U0403> "<U0047><U0060>"
+% CYRILLIC CAPITAL LETTER UKRAINIAN IE
+<U0404> "<U0059><U0045>"
+% CYRILLIC CAPITAL LETTER DZE
+<U0405> "<U005A><U0060>"
+% CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0406> <U0049>
+% CYRILLIC CAPITAL LETTER YI
+<U0407> "<U0059><U0049>"
+% CYRILLIC CAPITAL LETTER JE
+<U0408> <U004A>
+% CYRILLIC CAPITAL LETTER LJE
+<U0409> "<U004C><U0060>"
+% CYRILLIC CAPITAL LETTER NJE
+<U040A> "<U004E><U0060>"
+% CYRILLIC CAPITAL LETTER TSHE
+<U040B> "<U0054><U0053><U0048>"
+% CYRILLIC CAPITAL LETTER KJE
+<U040C> "<U004B><U0060>"
+% CYRILLIC CAPITAL LETTER SHORT U
+<U040E> "<U0055><U0060>"
+% CYRILLIC CAPITAL LETTER DZHE
+<U040F> "<U0044><U0048>"
+% CYRILLIC CAPITAL LETTER A
+<U0410> <U0041>
+% CYRILLIC CAPITAL LETTER BE
+<U0411> <U0042>
+% CYRILLIC CAPITAL LETTER VE
+<U0412> <U0056>
+% CYRILLIC CAPITAL LETTER GHE
+<U0413> <U0047>
+% CYRILLIC CAPITAL LETTER DE
+<U0414> <U0044>
+% CYRILLIC CAPITAL LETTER IE
+<U0415> <U0045>
+% CYRILLIC CAPITAL LETTER ZHE
+<U0416> "<U005A><U0048>"
+% CYRILLIC CAPITAL LETTER ZE
+<U0417> <U005A>
+% CYRILLIC CAPITAL LETTER I
+<U0418> <U0049>
+% CYRILLIC CAPITAL LETTER SHORT I
+<U0419> <U004A>
+% CYRILLIC CAPITAL LETTER KA
+<U041A> <U004B>
+% CYRILLIC CAPITAL LETTER EL
+<U041B> <U004C>
+% CYRILLIC CAPITAL LETTER EM
+<U041C> <U004D>
+% CYRILLIC CAPITAL LETTER EN
+<U041D> <U004E>
+% CYRILLIC CAPITAL LETTER O
+<U041E> <U004F>
+% CYRILLIC CAPITAL LETTER PE
+<U041F> <U0050>
+% CYRILLIC CAPITAL LETTER ER
+<U0420> <U0052>
+% CYRILLIC CAPITAL LETTER ES
+<U0421> <U0053>
+% CYRILLIC CAPITAL LETTER TE
+<U0422> <U0054>
+% CYRILLIC CAPITAL LETTER U
+<U0423> <U0055>
+% CYRILLIC CAPITAL LETTER EF
+<U0424> <U0046>
+% CYRILLIC CAPITAL LETTER HA
+<U0425> <U0058>
+% CYRILLIC CAPITAL LETTER TSE
+<U0426> "<U0043><U005A>"
+% CYRILLIC CAPITAL LETTER CHE
+<U0427> "<U0043><U0048>"
+% CYRILLIC CAPITAL LETTER SHA
+<U0428> "<U0053><U0048>"
+% CYRILLIC CAPITAL LETTER SHCHA
+<U0429> "<U0053><U0048><U0048>"
+% CYRILLIC CAPITAL LETTER HARD SIGN
+<U042A> "<U0041><U0060>"
+% CYRILLIC CAPITAL LETTER YERU
+<U042B> "<U0059><U0060>"
+% CYRILLIC CAPITAL LETTER SOFT SIGN
+<U042C> <U0060>
+% CYRILLIC CAPITAL LETTER E
+<U042D> "<U0045><U0060>"
+% CYRILLIC CAPITAL LETTER YU
+<U042E> "<U0059><U0055>"
+% CYRILLIC CAPITAL LETTER YA
+<U042F> "<U0059><U0041>"
+% CYRILLIC SMALL LETTER A
+<U0430> <U0061>
+% CYRILLIC SMALL LETTER BE
+<U0431> <U0062>
+% CYRILLIC SMALL LETTER VE
+<U0432> <U0076>
+% CYRILLIC SMALL LETTER GHE
+<U0433> <U0067>
+% CYRILLIC SMALL LETTER DE
+<U0434> <U0064>
+% CYRILLIC SMALL LETTER IE
+<U0435> <U0065>
+% CYRILLIC SMALL LETTER ZHE
+<U0436> "<U007A><U0068>"
+% CYRILLIC SMALL LETTER ZE
+<U0437> <U007A>
+% CYRILLIC SMALL LETTER I
+<U0438> <U0069>
+% CYRILLIC SMALL LETTER SHORT I
+<U0439> <U006A>
+% CYRILLIC SMALL LETTER KA
+<U043A> <U006B>
+% CYRILLIC SMALL LETTER EL
+<U043B> <U006C>
+% CYRILLIC SMALL LETTER EM
+<U043C> <U006D>
+% CYRILLIC SMALL LETTER EN
+<U043D> <U006E>
+% CYRILLIC SMALL LETTER O
+<U043E> <U006F>
+% CYRILLIC SMALL LETTER PE
+<U043F> <U0070>
+% CYRILLIC SMALL LETTER ER
+<U0440> <U0072>
+% CYRILLIC SMALL LETTER ES
+<U0441> <U0073>
+% CYRILLIC SMALL LETTER TE
+<U0442> <U0074>
+% CYRILLIC SMALL LETTER U
+<U0443> <U0075>
+% CYRILLIC SMALL LETTER EF
+<U0444> <U0066>
+% CYRILLIC SMALL LETTER HA
+<U0445> <U0078>
+% CYRILLIC SMALL LETTER TSE
+<U0446> "<U0063><U007A>"
+% CYRILLIC SMALL LETTER CHE
+<U0447> "<U0063><U0068>"
+% CYRILLIC SMALL LETTER SHA
+<U0448> "<U0073><U0068>"
+% CYRILLIC SMALL LETTER SHCHA
+<U0449> "<U0073><U0068><U0068>"
+% CYRILLIC SMALL LETTER HARD SIGN
+<U044A> "<U0060><U0060>"
+% CYRILLIC SMALL LETTER YERU
+<U044B> "<U0079><U0060>"
+% CYRILLIC SMALL LETTER SOFT SIGN
+<U044C> <U0060>
+% CYRILLIC SMALL LETTER E
+<U044D> "<U0065><U0060>"
+% CYRILLIC SMALL LETTER YU
+<U044E> "<U0079><U0075>"
+% CYRILLIC SMALL LETTER YA
+<U044F> "<U0079><U0061>"
+% CYRILLIC SMALL LETTER IO
+<U0451> "<U0079><U006F>"
+% CYRILLIC SMALL LETTER DJE
+<U0452> "<U0064><U006A>"
+% CYRILLIC SMALL LETTER GJE
+<U0453> "<U0067><U0060>"
+% CYRILLIC SMALL LETTER UKRAINIAN IE
+<U0454> "<U0079><U0065>"
+% CYRILLIC SMALL LETTER DZE
+<U0455> "<U007A><U0060>"
+% CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+<U0456> <U0069>
+% CYRILLIC SMALL LETTER YI
+<U0457> "<U0079><U0069>"
+% CYRILLIC SMALL LETTER JE
+<U0458> <U006A>
+% CYRILLIC SMALL LETTER LJE
+<U0459> "<U006C><U0060>"
+% CYRILLIC SMALL LETTER NJE
+<U045A> "<U006E><U0060>"
+% CYRILLIC SMALL LETTER TSHE
+<U045B> "<U0074><U0073><U0068>"
+% CYRILLIC SMALL LETTER KJE
+<U045C> "<U006B><U0060>"
+% CYRILLIC SMALL LETTER SHORT U
+<U045E> "<U0075><U0060>"
+% CYRILLIC SMALL LETTER DZHE
+<U045F> "<U0064><U0068>"
+% CYRILLIC CAPITAL LETTER BIG YUS
+<U046A> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER BIG YUS
+<U046B> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER FITA
+<U0472> "<U0046><U0048>"
+% CYRILLIC SMALL LETTER FITA
+<U0473> "<U0066><U0068>"
+% CYRILLIC CAPITAL LETTER IZHITSA
+<U0474> "<U0059><U0048>"
+% CYRILLIC SMALL LETTER IZHITSA
+<U0475> "<U0079><U0068>"
+% CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+<U048C> "<U0045><U0060>"
+% CYRILLIC SMALL LETTER SEMISOFT SIGN
+<U048D> "<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+<U0490> "<U0047><U0060>"
+% CYRILLIC SMALL LETTER GHE WITH UPTURN
+<U0491> "<U0067><U0060>"
+% CYRILLIC CAPITAL LETTER GHE WITH STROKE
+<U0492> "<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH STROKE
+<U0493> "<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+<U0494> "<U0047><U0048>"
+% CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+<U0495> "<U0067><U0068>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+<U0496> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+<U0497> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+<U049A> "<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH DESCENDER
+<U049B> "<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER KA WITH STROKE
+<U049E> "<U004B><U0060>"
+% CYRILLIC SMALL LETTER KA WITH STROKE
+<U049F> "<U006B><U0060>"
+% CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+<U04A2> "<U004E><U0060>"
+% CYRILLIC SMALL LETTER EN WITH DESCENDER
+<U04A3> "<U006E><U0060>"
+% CYRILLIC CAPITAL LIGATURE EN GHE
+<U04A4> "<U004E><U0047>"
+% CYRILLIC SMALL LIGATURE EN GHE
+<U04A5> "<U006E><U0067>"
+% CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+<U04A6> "<U0050><U0060>"
+% CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+<U04A7> "<U0070><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN HA
+<U04A8> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN HA
+<U04A9> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+<U04AA> "<U0043><U0060>"
+% CYRILLIC SMALL LETTER ES WITH DESCENDER
+<U04AB> "<U0043><U0060>"
+% CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+<U04AC> "<U0054><U0060>"
+% CYRILLIC SMALL LETTER TE WITH DESCENDER
+<U04AD> "<U0074><U0060>"
+% CYRILLIC CAPITAL LETTER STRAIGHT U
+<U04AE> <U0055>
+% CYRILLIC SMALL LETTER STRAIGHT U
+<U04AF> <U0075>
+% CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+<U04B2> "<U0048><U0060>"
+% CYRILLIC SMALL LETTER HA WITH DESCENDER
+<U04B3> "<U0068><U0060>"
+% CYRILLIC CAPITAL LIGATURE TE TSE
+<U04B4> "<U0054><U0043><U005A>"
+% CYRILLIC SMALL LIGATURE TE TSE
+<U04B5> "<U0074><U0063><U007A>"
+% CYRILLIC CAPITAL LETTER SHHA
+<U04BA> "<U0053><U0048><U0060>"
+% CYRILLIC SMALL LETTER SHHA
+<U04BB> "<U0053><U0048><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+<U04BC> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE
+<U04BD> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BE> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+<U04BF> "<U0063><U0068><U0060>"
+% CYRILLIC LETTER PALOCHKA
+<U04C0> <U0069>
+% CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+<U04C1> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH BREVE
+<U04C2> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+<U04CB> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER KHAKASSIAN CHE
+<U04CC> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH BREVE
+<U04D0> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH BREVE
+<U04D1> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+<U04D2> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER A WITH DIAERESIS
+<U04D3> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER IE WITH BREVE
+<U04D6> "<U0045><U0060>"
+% CYRILLIC SMALL LETTER IE WITH BREVE
+<U04D7> "<U0065><U0060>"
+% CYRILLIC CAPITAL LETTER SCHWA
+<U04D8> "<U0041><U0060>"
+% CYRILLIC SMALL LETTER SCHWA
+<U04D9> "<U0061><U0060>"
+% CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+<U04DC> "<U005A><U0048><U0060>"
+% CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+<U04DD> "<U007A><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+<U04DE> "<U005A><U0060>"
+% CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+<U04DF> "<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+<U04E0> "<U005A><U0060>"
+% CYRILLIC SMALL LETTER ABKHASIAN DZE
+<U04E1> "<U007A><U0060>"
+% CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+<U04E4> "<U0049><U0060>"
+% CYRILLIC SMALL LETTER I WITH DIAERESIS
+<U04E5> "<U0069><U0060>"
+% CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+<U04E6> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER O WITH DIAERESIS
+<U04E7> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER BARRED O
+<U04E8> "<U004F><U0060>"
+% CYRILLIC SMALL LETTER BARRED O
+<U04E9> "<U006F><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+<U04F0> "<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DIAERESIS
+<U04F1> "<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+<U04F2> "<U0055><U0060>"
+% CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+<U04F3> "<U0075><U0060>"
+% CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+<U04F4> "<U0043><U0048><U0060>"
+% CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+<U04F5> "<U0063><U0068><U0060>"
+% CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+<U04F8> "<U0059><U0060>"
+% CYRILLIC SMALL LETTER YERU WITH DIAERESIS
+<U04F9> "<U0079><U0060>"
+% RIGHT SINGLE QUOTATION MARK
+<U2019> <U0027>
+
+translit_end
+
+END LC_CTYPE
diff --git a/localedata/locales/ts_ZA b/localedata/locales/ts_ZA
index 0256e42979..8e16fc02ae 100644
--- a/localedata/locales/ts_ZA
+++ b/localedata/locales/ts_ZA
@@ -62,6 +62,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/unm_US b/localedata/locales/unm_US
index 1e62c60443..66cb4f7210 100644
--- a/localedata/locales/unm_US
+++ b/localedata/locales/unm_US
@@ -48,6 +48,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_IN b/localedata/locales/ur_IN
index 062cbf0937..38675b8c6b 100644
--- a/localedata/locales/ur_IN
+++ b/localedata/locales/ur_IN
@@ -46,6 +46,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/ur_PK b/localedata/locales/ur_PK
index aaf47fceb5..4ea9c56100 100644
--- a/localedata/locales/ur_PK
+++ b/localedata/locales/ur_PK
@@ -49,6 +49,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% those two lettes are not in cp1256...
diff --git a/localedata/locales/ve_ZA b/localedata/locales/ve_ZA
index 6b80455c98..1964162cc4 100644
--- a/localedata/locales/ve_ZA
+++ b/localedata/locales/ve_ZA
@@ -65,6 +65,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/vi_VN b/localedata/locales/vi_VN
index 7fac1fbbcc..8eac6f3ba9 100644
--- a/localedata/locales/vi_VN
+++ b/localedata/locales/vi_VN
@@ -53,6 +53,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% dong sign -> d// -> dd
<U20AB> "<U0111>";"<U0064><U0064>"
diff --git a/localedata/locales/wa_BE b/localedata/locales/wa_BE
index e97493089e..6349142ef7 100644
--- a/localedata/locales/wa_BE
+++ b/localedata/locales/wa_BE
@@ -54,6 +54,7 @@ LC_CTYPE
copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% A-bole -> A-circonflecse -> AU
<U00C5> "A<U030A>";"A";"AU"
diff --git a/localedata/locales/wo_SN b/localedata/locales/wo_SN
index 47263d2eab..bd466d934a 100644
--- a/localedata/locales/wo_SN
+++ b/localedata/locales/wo_SN
@@ -53,6 +53,7 @@ translit_start
% Accents are simply omitted if they cannot be represented.
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
diff --git a/localedata/locales/xh_ZA b/localedata/locales/xh_ZA
index 4564137e85..5bd3d5bd3c 100644
--- a/localedata/locales/xh_ZA
+++ b/localedata/locales/xh_ZA
@@ -64,6 +64,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/yi_US b/localedata/locales/yi_US
index 95963830fc..edd55f77e9 100644
--- a/localedata/locales/yi_US
+++ b/localedata/locales/yi_US
@@ -60,6 +60,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
% if digraphs are not available (this is the case with iso-8859-8)
% then use the single letters
diff --git a/localedata/locales/yuw_PG b/localedata/locales/yuw_PG
index 0cb3cadf4a..b9e393d354 100644
--- a/localedata/locales/yuw_PG
+++ b/localedata/locales/yuw_PG
@@ -40,6 +40,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
diff --git a/localedata/locales/zh_CN b/localedata/locales/zh_CN
index 62a46415c1..00f2332dde 100644
--- a/localedata/locales/zh_CN
+++ b/localedata/locales/zh_CN
@@ -58,6 +58,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
class "hanzi"; /
diff --git a/localedata/locales/zu_ZA b/localedata/locales/zu_ZA
index cf93a63009..ab37a145b2 100644
--- a/localedata/locales/zu_ZA
+++ b/localedata/locales/zu_ZA
@@ -68,6 +68,7 @@ copy "i18n"
translit_start
include "translit_combining";""
+include "translit_cyrillic";""
translit_end
END LC_CTYPE
--
2.17.1
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-19 9:22 ` Egor Kobylkin
@ 2018-11-19 19:36 ` Marko Myllynen
0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2018-11-19 19:36 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
Hi,
On 19/11/2018 11.21, Egor Kobylkin wrote:
> On 19.11.18 08:13, Marko Myllynen wrote:
>> On 17/11/2018 20.34, Egor Kobylkin wrote:
>
> Your example only covers _transliteration_ to Latin Diacritics
> iconv -f UTF-8 -t ISO-8859-15//TRANSLIT \
> | iconv -f ISO-8859-15 -t UTF-8
>
> while BZ #2872 is about _transcription_ to ASCII
> iconv -f UTF-8 -t ASCII//TRANSLIT
AFAICS v9 (unlike v10) supported both of the above cases.
> The glibc wiki explicitly lists this use case (ASCII) as the test
> example https://sourceware.org/glibc/wiki/Locales#Testing_Locales
I wrote that section and I certainly wasn't considering Cyrillic aspects
at that time (IIRC it was written even before Mike did the major update
for transliteration rules at the end of 2015). The context back then was
mostly about handling accented Latin letters such as Å.
> So again, you are asking to have ISO 9.1995. System A but the bug is
> about ISO 9.1995. System B (GOST 7.79-2000)
We certainly can decide here what's the best course of action, we do not
have to slavishly follow some old bug report when deciding the direction
for the implementation. But I think I've made my position clear by now
so I'm not going to repeat it anymore.
In any case once your patch lands I'm going to submit a follow-up patch
for fi_FI to make it compliant with the applicable national standard
(SFS 4900) which defines how to do Cyrillic transliteration /
transcription in the context of Finnish.
Thanks,
--
Marko Myllynen
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-19 7:14 ` Marko Myllynen
2018-11-19 9:22 ` Egor Kobylkin
@ 2018-12-01 22:09 ` Rafal Luzynski
2018-12-01 22:53 ` Egor Kobylkin
2018-12-03 22:19 ` Egor Kobylkin
1 sibling, 2 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-01 22:09 UTC (permalink / raw)
To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales
19.11.2018 08:13 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> Given the amount of questions above I think the way forward is to try
> to follow the relevant standards as closely as possible and also check
> what the other implementations (i.e., uconv(1)) do. For example, checking
> the earlier mentioned case may or may not give some hints:
>
> $ echo Шема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Šema
> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Shema
> $ uconv -V
> uconv v2.1 ICU 50.1.2
I've played a little with uconv and unfortunately it does not look good
to me.
It does not have any fallback transliteration to plain ASCII. When it says
that 'Ш' is transliterated to 'Š' then it always uses 'Š' and if the target
charset does not have this character then crashes:
$ echo Шема | uconv -f UTF-8 -t ASCII -x cyrillic-latin
Conversion from Unicode to codepage failed at output byte position 0.
Unicode: 0160 Error: Invalid character found
$ echo Шема | uconv -f UTF-8 -t ISO-8859-1 -x cyrillic-latin
Conversion from Unicode to codepage failed at output byte position 0.
Unicode: 0160 Error: Invalid character found
$ echo Шема | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin
�ema
$ echo Шема | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin \
| uconv -f ISO-8859-2 -t UTF-8
Šema
It seems to follow ISO 9 (GOST 7.79) System A. However, the transliteration
of the hard sign is rather strange:
$ echo нъе | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
nʺe
The above was correct but:
$ echo НЪЕ | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
Nʺ̱E
$ echo Ъ | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
ʺ̱
$ echo Ъ | uconv -f UTF-8 -t UTF-16 -x cyrillic-latin| hexdump -x
0000000 feff 02ba 0331 000a
0000008
So this generates:
02BA MODIFIER LETTER DOUBLE PRIME
0331 COMBINING MACRON BELOW
There are more transliteration methods, for example Russian-Latin/BGN:
$ echo Шема | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Shema
$ echo Схема | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Skhema
Converting 'х' to 'kh' seems to be common in English transliteration but
it does not follow any ISO standard.
$ echo ХА ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
KHA kha
This means that the choice whether a digraph in the output should be
all uppercase or maybe upper+lower is context based, something which we
probably cannot implement. But definitely a good thing.
Two more tests:
$ echo Ещё | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
Yeshchë
$ echo Ещё | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
Conversion from Unicode to codepage failed at output byte position 6.
Unicode: 00eb Error: Invalid character found
So the output is not plain ASCII.
$ echo е же ле не | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
ye zhe le ne
Again this means that transliteration of 'е' is context based:
it is 'ye' in the beginning of a word and 'e' otherwise.
The version which I've tested:
$ uconv -V
uconv v2.1 ICU 60.2
It seems that uconv will not be a good hint about transliterating
to plain ASCII.
Also, the difference between uconv and iconv is that we can provide
multiple transliterations for any source character but we can't group
them into standards so we can't tell iconv to use this or another
system. It will just choose the best fitting the current output
character set and the only thing we can choose is the locale.
This makes me think: should we add a locale like ru_RU@SystemA or
ru_RU@SystemB?
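To illustrate the "best fitting" selection described above, here is a
minimal Python sketch. It is not glibc code: the CANDIDATES table, the
helper names and the "?" fallback are assumptions made for illustration
only. Each source character has an ordered list of candidates, and the
first one that the target character set can represent wins.

```python
# Hypothetical candidate lists, ordered by preference, mimicking the
# ";"-separated alternatives of a translit entry. Not taken from glibc.
CANDIDATES = {
    "Ш": ["Š", "SH"],
    "е": ["e"],
    "м": ["m"],
    "а": ["a"],
}


def encodable(text, charset):
    """Return True if text can be represented in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, charset):
    """Pick, per character, the first candidate the charset can hold."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)  # representable as-is, nothing to do
            continue
        for candidate in CANDIDATES.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            out.append("?")  # no usable candidate for this charset
    return "".join(out)


print(translit("Шема", "iso8859_2"))  # Latin-2 contains Š
print(translit("Шема", "ascii"))      # falls through to the digraph
```

Note that the only input besides the text is the target charset; there is
no place to name a system such as A or B, which is the limitation
discussed above.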
Regards,
Rafal
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-01 22:09 ` Rafal Luzynski
@ 2018-12-01 22:53 ` Egor Kobylkin
2018-12-03 22:19 ` Egor Kobylkin
1 sibling, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-01 22:53 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales
[-- Attachment #1: Type: text/plain, Size: 3439 bytes --]
On 01.12.18 23:07, Rafal Luzynski wrote:
>
> Also, the difference between uconv and iconv is that we can provide
> multiple transliterations for any source character but we can't group
> them into standards so we can't tell iconv to use this or another
> system. It will just choose the best fitting the current output
> character set and the only thing we can choose is the locale.
>
> This makes me think: should we add a locale like ru_RU@SystemA or
> ru_RU@SystemB?
Wouldn't that require creating 3 versions of every locale that would
include the translit_cyrillic file then? I.e. en_US + en_US@SystemA,
en_US@SystemB etc.?
This in turn will make two of them optional (as cyrillic fonts are at
the moment). The highest value is in having the default locale being
able to transliterate, isn't it? So putting the transliteration to
optional locales kind of defeats the purpose.
An example from my experience as a user - a networked device or host
would often have the en_US as the default (only?) locale with no viable
way to change it or install cyrillic fonts. Anyway, this is the most
dire situation where the ASCII transliteration certainly helps most.
Having en_US@SystemA or en_US@SystemB theoretically available but not
compiled by the distributor wouldn't help here, would it?
So the only useful scenario here would be to ship your locales with the
transliteration already included by default in en_US. This way the
distributor won't have to get active to include transliteration as
en_US@SystemA or en_US@SystemB.
From my (however limited) point of view it is better to get System B
in first, then see if some code needs to be changed to accommodate
the System A/System B problem. Again, System B is _transcription_ to
ASCII and System A _transliteration_ to Latin with different use cases.
It's insightful to see your comparison of the uconv vs. iconv!
Similar to your checks this is what I was using to see whether any
locale fails the transliteration for any cyrillic letter:
echo
"ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩЪЫЬÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒ
ÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â"|
LOCPATH=$workdir/compiled_locales/"$locale"/ LC_ALL="$locale".UTF-8
iconv -f UTF-8 -t ASCII//TRANSLIT
should give (can be asserted with bash string comparison):
AaOoUussYODJG`YeZ`IYiJL`N`TSHK`U`DhABVGDEZHZIJKLMNOPRSTUUFHCCHSHSHHA`Y`E`YUYAabvgdezhzijklmnoprstuufhcchshshh``y`e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FhfhYhyhE`e`
G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`SH`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'
And I am attaching another file that has the Unicode code points next to
the letters for easier identification of failures (like "U0401-Ё
U0402-Ђ U0403-Ѓ" etc.). Hope it will be helpful in creating the tests.
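
A minimal Python sketch of how such a labelled line could be generated
and checked. The function names are hypothetical, and it assumes (as in
the failing output quoted in this thread) that an untransliterated
letter comes back as '?'. Entries with two code points, such as
"U0423 0301", are not handled by this sketch.

```python
def labelled_line(codepoints):
    """Build 'U0401-Ё U0402-Ђ ...' so that a failed letter can be found by its label."""
    return " ".join(f"U{cp:04X}-{chr(cp)}" for cp in codepoints)

def failed_labels(converted_line):
    """Return the labels whose value contains '?' after conversion."""
    failures = []
    for item in converted_line.split():
        label, _, value = item.partition("-")
        if "?" in value:
            failures.append(label)
    return failures

print(labelled_line([0x0401, 0x0402, 0x0403]))        # U0401-Ё U0402-Ђ U0403-Ѓ
print(failed_labels("U0401-YO U0402-? U0403-G`"))     # ['U0402']
```

The second call uses a made-up converted line in which only U+0402 has
failed, to show how the label points at the offending letter.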
Best regards,
Egor Kobylkin
[-- Attachment #2: translit-test-input.txt --]
[-- Type: text/plain, Size: 2249 bytes --]
CYRILLIC RUSSIAN Съешь ещё этих мягких французских булок, да выпей же чаю. СЪЕШЬ ЕЩЁ ЭТИХ МЯГКИХ ФРАНЦУЗСКИХ БУЛОК? ДА ВЫПЕЙ ЖЕ ЧАЮ!
CYRILLIC COMPLETE U0401-Ё U0402-Ђ U0403-Ѓ U0404-Є U0405-Ѕ U0406-І U0407-Ї U0408-Ј U0409-Љ U040A-Њ U040B-Ћ U040C-Ќ U040E-Ў U040F-Џ U0410-А U0411-Б U0412-В U0413-Г U0414-Д U0415-Е U0416-Ж U0417-З U0418-И U0419-Й U041A-К U041B-Л U041C-М U041D-Н U041E-О U041F-П U0420-Р U0421-С U0422-Т U0423-У U0423 0301-У́ U0424-Ф U0425-Х U0426-Ц U0427-Ч U0428-Ш U0429-Щ U042A-ъ U042B-Ы U042C-ь U042D-Э U042E-Ю U042F-Я U0430-а U0431-б U0432-в U0433-г U0434-д U0435-е U0436-ж U0437-з U0438-и U0439-й U043A-к U043B-л U043C-м U043D-н U043E-о U043F-п U0440-р U0441-с U0442-т U0443-у U0443 0301-у́ U0444-ф U0445-х U0446-ц U0447-ч U0448-ш U0449-щ U044A-Ъ U044B-ы U044C-Ь U044D-э U044E-ю U044F-я U0451-ё U0452-ђ U0453-ѓ U0454-є U0455-ѕ U0456-і U0457-ї U0458-ј U0459-љ U045A-њ U045B-ћ U045C-ќ U045E-ў U045F-џ U046A-Ѫ U046B-ѫ U0472-Ѳ U0473-ѳ U0474-Ѵ U0475-ѵ U048C-Ҍ U048D-ҍ U0490-Ґ U0491-ґ U0492-Ғ U0493-ғ U0494-Ҕ U0495-ҕ U0496-Җ U0497-җ U049A-Қ U049B-қ U049E-Ҟ U049F-ҟ U04A2-Ң U04A3-ң U04A4-Ҥ U04A5-ҥ U04A6-Ҧ U04A7-ҧ U04A8-Ҩ U04A9-ҩ U04AA-Ҫ U04AB-ҫ U04AC-Ҭ U04AD-ҭ U04AE-Ү U04AF-ү U04B2-Ҳ U04B3-ҳ U04B4-Ҵ U04B5-ҵ U04BA-Һ U04BB-һ U04BC-Ҽ U04BD-ҽ U04BE-Ҿ U04BF-ҿ U04C0-Ӏ U04C1-Ӂ U04C2-ӂ U04CB-Ӌ U04CC-ӌ U04D0-Ӑ U04D1-ӑ U04D2-Ӓ U04D3-ӓ U04D6-Ӗ U04D7-ӗ U04D8-Ә U04D9-ә U04DC-Ӝ U04DD-ӝ U04DE-Ӟ U04DF-ӟ U04E0-Ӡ U04E1-ӡ U04E4-Ӥ U04E5-ӥ U04E6-Ӧ U04E7-ӧ U04E8-Ө U04E9-ө U04F0-Ӱ U04F1-ӱ U04F2-Ӳ U04F3-ӳ U04F4-Ӵ U04F5-ӵ U04F8-Ӹ U04F9-ӹ U2019-’
GREEK Ελληνικό Ίδρυμα Ευρωπαϊκής και Εξωτερικής.
GERMAN Zwölf Boxkämpfer jagen Victor quer über den großen Sylter Deich.
FRENCH Dès Noël où un zéphyr haï me vêt de glaçons würmiens je dîne d’exquis rôtis de bœuf au kir à l’aÿ d’âge mûr \& cætera.
SPANISH El veloz murciélago hindú comía feliz cardillo y kiwi, la cigüeña tocaba el saxofón detrás del palenque de paja.
END
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-01 22:09 ` Rafal Luzynski
2018-12-01 22:53 ` Egor Kobylkin
@ 2018-12-03 22:19 ` Egor Kobylkin
2018-12-08 12:37 ` Rafal Luzynski
1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-03 22:19 UTC (permalink / raw)
To: libc-alpha, libc-locales; +Cc: Marko Myllynen
Rafal,
Just to touch base on this, what is the best way forward? Did you get
any input/feedback on your questions below? Are you expecting input from
anyone but myself?
On the blocking issue #2: I really don't see the connection to the uk_UA
locale that has its transliteration table inline and is explicitly
excluded from my patch. It may be revealing another issue you have with
glibc but wouldn't that be better addressed in a new bug?
Again, in the v10 of my patch I have removed multicharacter source
graphemes, so that issue is moot there.
If you'd like to overhaul the glibc translit system wouldn't it be
better to commit the simple text file with the Cyrillic
translit(transcription) table first, fix the bug from the year 2006 and
then proceed from there all due diligence?
The same with having both System A and System B. Initially I went along
with the suggestion to include the system A but it is clear now that it
doesn't make fixing [BZ #2872] more straightforward. So I'd also propose
to set it aside for the moment and use the v10 without the system A.
That is the whole reason I have submitted it, to be superclear on that.
Now you saw that uconv is transcribing «ХА» as KHA (cap/cap/cap) that
should mitigate your concern about that issue too (somewhat, anyway).
Making it context based would also be about adding new code, see above.
Let me know if there's anything I can help with getting more progress
with the decision.
Bests,
Egor
On 16.11.18 23:17, Rafal Luzynski wrote:
> 2. I made few tests in the command line and it seems to me that the
> transliteration from "З" to "Z" (+ lowercase as well) in uk_UA does
> not work and has not been working for some time already because I've
> checked some older systems as well and the result is always the same.
> I think that the reason is that uk_UA defines multiple
> transliteration rules for "З" depending on what is the letter
> following it. It does not seem to work. AFAIK the reason is that
> the syntax of transliteration rules says that a single non-Latin
> character may map one or more Latin strings, each consisting of one
> or more characters. There cannot be a rule transliterating multiple
> source characters into one or multiple destination characters. Is it
> a bug in transliteration implementation? Or maybe in the
> specification, including POSIX standard?
> The definition of transliteration says that it is one-to-one mapping
> of graphemes while a grapheme may be one or multiple characters. It
> does not have to be always mapping one-to-one character. Should we
> fix this bug first, make uk_UA transliteration work, and only then
> add a generic Cyrillic transliteration? Egor's patch already
> contains transliteration of "У" + combining acute accent to "Ú" which
> most probably will not work.
>
> I still think that in the longer term all existing custom
> transliterations of Cyrillic alphabets should be ported to a
> modification of your patch.
On 01.12.18 23:07, Rafal Luzynski wrote:
> 19.11.2018 08:13 Marko Myllynen <myllynen@redhat.com> wrote:
>> [...]
>> Given the amount of questions above I think the way forward is to try
>> follow the relevant standards as closely as possible and also check what
>> the other implementations (i.e., uconv(1)) do. For example, checking the
>> case earlier mentioned case may or may not give some hints:
>>
>> $ echo Шема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
>> Šema
>> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
>> Shema
>> $ uconv -V
>> uconv v2.1 ICU 50.1.2
>
> I've played a little with uconv and unfortunately it does not look good
> to me.
>
> It does not have any fallback transliteration to plain ASCII. When it says
> that 'Ш' is transliterated to 'Š' then it always uses 'Š' and if the target
> charset does not have this character then crashes:
>
> $ echo Шема | uconv -f UTF-8 -t ASCII -x cyrillic-latin
> Conversion from Unicode to codepage failed at output byte position 0.
> Unicode: 0160 Error: Invalid character found
> $ echo Шема | uconv -f UTF-8 -t ISO-8859-1 -x cyrillic-latin
> Conversion from Unicode to codepage failed at output byte position 0.
> Unicode: 0160 Error: Invalid character found
> $ echo Шема | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin
> �ema
> $ echo Шема | uconv -f UTF-8 -t ISO-8859-2 -x cyrillic-latin | uconv -f
> ISO-8859-2 -t UTF-8
> Šema
>
> It seems to follow ISO 9 (GOST 7.79) System A. However, the transliteration
> of the hard sign is rather strange:
>
> $ echo нъе | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> nʺe
>
> The above was correct but:
>
> $ echo НЪЕ | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> Nʺ̱E
> $ echo Ъ | uconv -f UTF-8 -t UTF-8 -x cyrillic-latin
> ʺ̱
> $ echo Ъ | uconv -f UTF-8 -t UTF-16 -x cyrillic-latin| hexdump -x
> 0000000 feff 02ba 0331 000a
> 0000008
>
> So this generates:
> 02BA MODIFIER LETTER DOUBLE PRIME
> 0331 COMBINING MACRON BELOW
>
> There are more transliteration methods, for example Russian-Latin/BGN:
>
> $ echo Шема | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> Shema
> $ echo Схема | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> Skhema
>
> Converting 'х' to 'kh' seems to be common in English transliteration but
> it does not follow any ISO standard.
>
> $ echo ХА ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> KHA kha
>
> This means that the choice whether a digraph in the output should be
> all uppercase or maybe upper+lower is context based, something which we
> probably cannot implement. But definitely a good thing.
>
> Two more tests:
>
> $ echo Ещё | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> Yeshchë
> $ echo Ещё | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
> Conversion from Unicode to codepage failed at output byte position 6.
> Unicode: 00eb Error: Invalid character found
>
> So the output is not plain ASCII.
>
> $ echo е же ле не | uconv -f UTF-8 -t ASCII -x Russian-Latin/BGN
> ye zhe le ne
>
> Again this means that transliteration of 'е' is context based:
> it is 'ye' in the beginning of a word and 'e' otherwise.
>
> The version which I've tested:
>
> $ uconv -V
> uconv v2.1 ICU 60.2
>
> It seems that uconv will not be a good hint about transliterating
> to plain ASCII.
>
> Also, the difference between uconv and iconv is that we can provide
> multiple transliterations for any source character but we can't group
> them into standards so we can't tell iconv to use this or another
> system. It will just choose the best fitting the current output
> character set and the only thing we can choose is the locale.
>
> This makes me think: should we add a locale like ru_RU@SystemA or
> ru_RU@SystemB?
>
> Regards,
>
> Rafal
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-11-19 11:11 ` [PATCH v10] " Egor Kobylkin
@ 2018-12-08 0:02 ` Rafal Luzynski
2018-12-08 22:17 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-08 0:02 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales
19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
I'm in favor of implementing System A and dropping System B instead.
If I understand correctly, System A is actually ISO 9, therefore it is
international, universal, and neutral, while System B is a GOST standard
and therefore used mainly in Russia (although adopted in several other
countries as well).
It's true that we can't handle both System A and System B. What we
would like to have is:
          System A
        /============> OUTPUT: Latin with diacritics
INPUT <   System B
        \============> OUTPUT: Plain ASCII (fallback)
That means: use one system but if the output can't handle it then switch
to another system.
But what we can actually have is either:
       System A                                      Fallback
INPUT ============> OUTPUT: Latin with diacritics ============> Plain ASCII
or:
       System B
INPUT ============> OUTPUT: Plain ASCII
That means, we can only provide a fallback for individual characters,
we can't provide a fallback algorithm (that is, we can't switch to
transliterating 'Х' as 'X' instead of 'H' just because we can't
transliterate 'Ш' as 'Š' and switch to 'SH' instead).
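
The difference between a per-character fallback and a fallback
algorithm can be sketched in Python as follows. This only models the
idea, it is not glibc code; the table PER_CHAR and both function names
are hypothetical, and the letter assignments are the ones used in the
paragraph above.

```python
def encodable(text, charset):
    """True if every character of text exists in the target charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# Each source letter lists its candidates in priority order.
PER_CHAR = {"Ш": ["Š", "SH"], "Х": ["H"], "е": ["e"], "м": ["m"], "а": ["a"]}

def translit(text, charset):
    """Pick, for each letter independently, the first candidate that fits the charset."""
    out = []
    for ch in text:
        options = PER_CHAR.get(ch, [ch])
        out.append(next((o for o in options if encodable(o, charset)), "?"))
    return "".join(out)

print(translit("Шема", "iso8859-2"))  # Šema
print(translit("Шема", "ascii"))      # SHema
print(translit("Ха", "ascii"))        # Ha
```

The last line shows the limitation: 'Х' stays 'H' for ASCII output
because its choice does not depend on whether 'Ш' had to fall back.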
Wouldn't it be better to implement ISO 9 (System A) instead and provide
a fallback ASCII transliteration which could be similar but not identical
to System B? Is it necessary to provide plain ASCII transliteration
conforming to System B even if that means that we would have not to
implement System A? If yes, would it be correct to provide System B
for ru_RU (and maybe few more locales) but include System A in all other
locales (except few which we exclude already)?
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
OK, thank you, I like this change.
> [...]
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes converting Cyrillic to
> ASCII and thus the System B is chosen. [11]
I'm not sure it should be actually called transcription. IIUC,
transcription reflects pronunciation, something we can't easily
implement in glibc.
As long as we convert letters to letters (or group of letters to group
of letters) without taking pronunciation into account it should be
called transliteration. OTOH, I agree that it is rather uncommon in
Russian language to find an example where pronunciation is not perfectly
reflected in spelling.
> +% Generated from UnicodeData.txt with a spreadsheet referenced
> +% in that bugs doclet
The previous versions of your patch had "in that bug's doclet" here
which I think is correct.
I like the version 9 of your patch more so I'm going to write a more
thorough review of it.
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-03 22:19 ` Egor Kobylkin
@ 2018-12-08 12:37 ` Rafal Luzynski
2018-12-10 21:29 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-08 12:37 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales; +Cc: Marko Myllynen
17.11.2018 19:34 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Looks like we have three issues:
> 1. lack of explicit control which transformation to use (System A or
> System B) via //TRANSLIT
> 2. possibility of collision for System B if used CAP/low transcription
> for capital letters
> 3. Cyrillic 'Х'/'х' (ha) never transcribes to 'H'/'h' as it should per
> System B because its equivalent 'X'/'x' from System A is always present
> and takes precedence.
True.
> As a solution shouldn't we only keep System B in a new file
> transcribe_cyrillic and put it in place as the explicit ASCII
> transcription for targeted locales (as opposed to transliteration)?
>
> We would keep System A as translit_cyrillic but won't include it into
> this patch. Once you have resolved an issue of having two conflicting
> rule-sets but only one key //TRANSLIT you could add the System A back.
Sounds like a good idea to provide those two files:
* translit_cyrillic_system_a,
* translit_cyrillic_system_b,
(or any other pair of names) and let the individual locales choose whether
they want to include System A or System B. For optimization, system_b
file could include system_a and modify it.
> The SH/Sh can be decided on either way - seems like an easy change any
> way.
I'm in favor of "Sh" because it will work fine for titlecased words
(where only the first letter is uppercase) but I'm aware it would be
a problem for uppercased words. Unfortunately, I think we are unable
to satisfy both cases.
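
The trade-off can be illustrated with a small Python sketch. The tables
TITLE and UPPER are hypothetical and contain only the letters needed for
the "Шема"/"Схема" example discussed in this thread.

```python
# 'Ш' as "Sh": looks natural in titlecased words.
TITLE = {"Ш": "Sh", "С": "S", "х": "h", "е": "e", "м": "m", "а": "a"}
# 'Ш' as "SH": keeps the two source words distinguishable.
UPPER = {**TITLE, "Ш": "SH"}

def translit(word, table):
    """Plain per-character lookup, '?' for letters missing from the table."""
    return "".join(table.get(ch, "?") for ch in word)

print(translit("Шема", TITLE), translit("Схема", TITLE))  # Shema Shema
print(translit("Шема", UPPER), translit("Схема", UPPER))  # SHema Shema
```

With "Sh" the two different Cyrillic words collide; with "SH" they stay
distinct but the titlecased word starts with two capitals.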
> On 16.11.18 23:17, Rafal Luzynski wrote:
>
> > Egor, while at this I was thinking about your idea to transliterate
> > letters like "Ш" (uppercase) to "SH" (always uppercase) in order to
> > distinguish between "Шема" (-> "SHema") and "Схема" (-> "Shema" or
> > "Sxema").
>
> to clarify, this SH/Sh collision issue relates only to iconv -f UTF-8 -t
> ASCII//TRANSLIT (i.e. System B transcription).
True.
> But it's not only SH/Sh, there are following combinations used to
> transcribe capital letters:
>
> YO, DJ, YE, TSH, DH, ZH, CZ, CH, SH, SHH, YU, YA, FH, YH, GH, NG, TCZ
Absolutely true.  I skip the whole list only for brevity: if we
find a solution for one letter the same solution will work fine for
all others.
> [...]
> With transcription we are basically stripping information from the data,
> mapping it into a smaller character set. The idea to keep them in
> CAP/CAP is to try to preserve as much information as possible.
I'm only afraid that things like "TWo CApitals" or "CamelCase" are
common among us computer geeks while they do not look great when
working with natural language and when displaying them to regular users
and even non-computer people.
> [...]
> So in fact we have two rules for each letter in the same file (System A
> and System B), where System A takes precedence.
>
> I have a question then: isn't this more like a hack than a right thing
> to do?
>
> Shouldn't we have two explicit rules for transcription and
> transliteration not dependent on a destination character set?
It's impossible with the current API of iconv.  Maybe it will become
possible in the future but that's a greater amount of work than what
we are doing here now.  Again, for now different set of rules = different
locale.
I have another question: is it really a job of transliteration to preserve
all original information, to ensure no collisions and have the ability to
restore the original text? I'm afraid that as long as plain ASCII is the
destination charset whatever system we provide it will always be possible
to provide a malicious combination of the Cyrillic characters proving that
the system generates collisions.
> > I still don't like the idea to
> > put two uppercase letters in a beginning of a word in titlecase only
> > to indicate that there was originally a single letter. What if we:
> >
> > * drop the rule of transliterating "Х" to "H" and transliterate
> > always to "X",
> This would contradict ISO 9.1995. (System A).
Yes, it would.  I'm trying to find a solution here since I think we have
proved that we can't implement a system which will handle System A,
System B, and ensure no collisions at the same time. At least one
requirement must be dropped (at least partially).
> System A was added on Marko's request (so setting him on TO:) I am
> neutral on keeping it or dropping it, just to be clear.
I don't think I saw this request from Marko but I'm in favor of keeping
System A, too.
Marko, it would be good to hear your opinion about System A vs. System B
again.
> [...]
> On the other hand, for my personal needs I care less about standards but
> about current functionality and data loss because of missing
> transcription altogether due to the BZ #2872.
I read this that you are open to a solution which is inspired by some
standards but does not implement them fully due to our technical
limitations.
19.11.2018 10:21 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Marko,
>
> Your example only covers _transliteration_ to Latin Diacritics
> [...]
> while BZ #2872 is about _transcription_ to ASCII
> [...]
>
> So again, you are asking to have ISO 9.1995. System A but the bug is
> about ISO 9.1995. System B (GOST 7.79-2000)
It's hard to say what the original bug reporter meant but I think that the
problem is that there is no transliteration from Cyrillic to any variant of
Latin, except in few locales. If System A was implemented but System B was
not then at least some characters would be handled correctly. Currently no
Cyrillic characters are handled.
19.11.2018 20:35 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> In any case once your patch lands I'm going to submit a follow-up patch
> for fi_FI to make it compliant with the applicable national standard
> (SFS 4900) which defines how to do Cyrillic transliteration /
> transcription in the context of Finnish.
I totally agree. As far as I can see, SFS 4900 is more similar to
System A (ISO 9) rather than System B, that is, it transliterates to Latin
characters with diacritics rather than plain ASCII. Marko, what is your
opinion about possible implementation of SFS 4900 in these cases:
* When the destination charset does not contain required Latin diacritic
characters (e.g., it is plain ASCII)?
* When the output is ambiguous, that means, when two different Cyrillic
strings produce the same Latin (or ASCII) output?
At the moment I am not curious about SFS 4900 but we are facing the same
problems now with ISO 9 and GOST 7.79.
1.12.2018 23:07 Rafal Luzynski <digitalfreak@lingonborough.com> wrote:
> [...]
> $ echo ХА ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
> KHA kha
>
> This means that the choice whether a digraph in the output should be
> all uppercase or maybe upper+lower is context based, something which we
> probably cannot implement. But definitely a good thing.
I forgot to include this test which is really interesting:
$ echo ХА Ха ха | uconv -f UTF-8 -t UTF-8 -x Russian-Latin/BGN
KHA Kha kha
which again confirms that the choice of all uppercase or just the first
letter uppercased is context based, a thing which we can't implement now.
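
A minimal Python sketch of what "context based" means here. It only
mimics the three outputs observed above and is not ICU's actual rule
set; the table DIGRAPH, the function name and the "any other uppercase
letter in the word" heuristic are assumptions.

```python
# Hypothetical lowercase table for the two letters in the test.
DIGRAPH = {"х": "kh", "а": "a"}

def translit_word(word):
    """Uppercase letters become 'KH' inside an all-caps context, 'Kh' otherwise."""
    out = []
    for i, ch in enumerate(word):
        latin = DIGRAPH.get(ch.lower(), "?")
        if ch.isupper():
            neighbours = word[:i] + word[i + 1:]
            if any(c.isupper() for c in neighbours):
                latin = latin.upper()        # ХА -> KHA
            else:
                latin = latin.capitalize()   # Ха -> Kha
        out.append(latin)
    return "".join(out)

print(" ".join(translit_word(w) for w in "ХА Ха ха".split()))  # KHA Kha kha
```

The lookup of `neighbours` is exactly the kind of information a
one-character-at-a-time table has no access to.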
1.12.2018 23:53 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> On 01.12.18 23:07, Rafal Luzynski wrote:
> >
> > [...]
> > This makes me think: should we add a locale like ru_RU@SystemA or
> > ru_RU@SystemB?
>
> Wouldn't it require to create 3 versions of every locale that would
> include the translit_cyrillic file then? I.e. en_US + en_US@SystemA,
> en_US@SystemB etc.?
OK, please read this as another brainstorming idea and let's just
forget it.
> [...]
> An example from my experience as a user - a networked device or host
> would often have the en_US as the default (only?) locale with no viable
> way to change it or install cyrillic fonts. Anyway, this is the most
> dire situation where the ASCII transliteration certainly helps most.
> Having en_US@SystemA or en_US@SystemB theoretically available but not
> compiled by the distributor wouldn't help here, would it?
>
> So the only useful scenario here would be to ship your locales with the
> transliteration already included by default in en_US. This way the
> distributor won't have to get active to include transliteration as
> en_US@SystemA or en_US@SystemB.
Having the idea of "@SystemA" and "@SystemB" dropped I don't think
implementing any solution in glibc would be helpful for your use case.
Three reasons:
1. I believe that sooner or later someone will develop a transliteration
system for en_US which will follow English transliteration of Russian
instead of any standard we are discussing here. That means, it would
transliterate 'Х' as 'Kh' rather than 'H' or 'X'.
2. Currently there is a trend not to install even en_US locales and leave
only C which is hardcoded into glibc binaries. OTOH, I wouldn't mind
if ISO 9 was hardcoded into C as well.
3. That's beyond Russian language but transliteration according to Serbian
or Bulgarian or Ukrainian or Kazakh rules still requires installing their
proper locales. I think that requiring ru_RU to be installed could be
reasonable especially if we end up with ru_RU somehow differing from
the default "translit_cyrillic".
BTW you don't need Cyrillic fonts to be installed on your server in order
to process the Cyrillic text correctly unless your server renders the text.
3.12.2018 23:19 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> Rafal,
>
> Just to touch base on this, what is the best way forward? Did you get
> any input/feedback on your questions below? Are you expecting input from
> anyone but myself?
Yes, I expected some input from more experienced maintainers about whether
and how to write the tests but I'd rather start another thread about it
because this one is too long already.
> On the blocking issue #2: I really don’t see the connection to the uk_UA
> locale that has its transliteration table inline and is explicitly
> excluded from my patch. It may be revealing another issue you have with
> glibc but wouldn’t that be better addressed in a new bug?
OK, I was not precise enough (I'm sorry about it) so I'd like to explain
here:
1. In the long term goal I would like to convert those excluded locales
to use your translit_cyrillic as well.
2. In order to ensure that change is not destructive for them I will need
automatic tests to prove that their transliteration rules work the
same good before the change and after the change.
3. It does not matter that converting those other locales is in a distant
future because we need the same tests for Russian language now.
4. Even though I have not started writing any tests I can see they
will be failing for uk_UA. The reason is that glibc transliteration
rules can handle transliterating single characters into single
characters, single characters into multiple characters but not
multiple characters into multiple (or even single) characters.
5. We can ignore uk_UA but we will face the same case in ru_RU where
you had a case of 'У́ ' ('У' + 'COMBINING ACUTE ACCENT').
6. So the question was: how (and whether) to write the tests if we
already know they would be failing? Skip them? Resolve the other
issue first? Mark them as XFAIL?
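
The limitation described in items 4 and 5 can be sketched in Python.
This models the behaviour as described in this message, not the glibc
implementation; the table RULES and both function names are
hypothetical.

```python
# The second key consists of two code points: 'У' + COMBINING ACUTE ACCENT.
RULES = {"У": "U", "У\u0301": "U"}

def per_codepoint(text):
    """Look up one code point at a time; multi-character keys are never matched."""
    return "".join(RULES.get(ch, "?") for ch in text)

def longest_match(text):
    """Try the longest source sequence first, then shorter ones."""
    out, i = [], 0
    longest = max(len(key) for key in RULES)
    while i < len(text):
        for n in range(longest, 0, -1):
            chunk = text[i:i + n]
            if chunk in RULES:
                out.append(RULES[chunk])
                i += n
                break
        else:
            out.append("?")   # nothing matched at this position
            i += 1
    return "".join(out)

print(per_codepoint("У\u0301"))   # U?
print(longest_match("У\u0301"))   # U
```

A test for the two-code-point rule would fail under the first lookup
strategy, which is why the question of skipping it or marking it XFAIL
comes up.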
In the meantime, you have removed the controversial conversion rule
of 'У' with the acute accent:
> Again, in the v10 of my patch I have removed multicharacter source
> graphemes, so that issue is moot there.
so we can move to the next step.
> If you’d like to overhaul the glibc translit system wouldn’t it be
> better to commit the simple text file with the Cyrillic
> translit(transcription) table first, fix the bug from the year 2006 and
> then proceed from there all due diligence?
I agree and we are now one step forward.
> The same with having both System A and System B. Initially I went along
> with the suggestion to include the system A but it is clear now that it
> doesn’t make fixing [BZ #2872] more straightforward. So I’d also propose
> to set it aside for the moment and use the v10 without the system A.
> That is the whole reason I have submitted it, to be superclear on that.
OK, I think that now I understand your reason to drop System A better.
But still I'd like to rethink implementing System A somehow and drop
(or rather: implement only partially) System B.
> Now you saw that uconv is transcribing «ХА» as KHA (cap/cap/cap) that
> should mitigate your concern about that issue too (somewhat, anyway).
> Making it context based would also be about adding new code, see above.
It would also require the changes in the syntax of the source code
of locale data and possibly breaking the POSIX compatibility which
I think would be unacceptable.
> Let me know if there’s anything I can help with getting more progress
> with the decision
I'm afraid you can't help more. I'd like to hear some feedback from other
people. Due to some minor obstacles we can't resolve this issue being only
two here.
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-08 0:02 ` Rafal Luzynski
@ 2018-12-08 22:17 ` Egor Kobylkin
2018-12-19 22:48 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-08 22:17 UTC (permalink / raw)
To: libc-alpha, libc-locales, Dmitry V. Levin, Marko Myllynen, mfabian
Rafal, Dmitry, Marko, Mike
On 08.12.18 00:35, Rafal Luzynski wrote:
> 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
>> (transliteration to Latin with diacritics) as conflicting with
>> System B within glibc mechanics and not solving BZ #2872
>
> I'm in favor of implementing System A and dropping System B instead.
The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
fails". The ISO 9 System A does not map to ASCII so it is not a solution
to BZ #2872 at all.
I was scratching my head as to how can we avoid the explosion of the
scope for this patch. And then it appeared to me that it was wrong to
target all the present locales for the ASCII translit. This seems to be
the root cause for this prolonged A vs. B discussions. The proper target
for my table is actually the C locale translit file
(locale/C-translit.h.in). I will submit a proper patch shortly.
If anyone wants to keep working on the implementation of the Latin
Diacritics transliteration of the Cyrillic letters (System A) you are
welcome to use the tables I have submitted before (v9). That would be a
new feature for glibc as per my understanding. Let's just make super
clear the distinction of the System A (Latin with Diacritics, non-ASCII)
to the ASCII translit as mentioned in BZ #2872 (System B).
My focus is super sharp on helping with Cyrillic -> ASCII translit
availability for a default installation with glibc.
Hope this helps,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (10 preceding siblings ...)
2018-11-19 11:11 ` [PATCH v10] " Egor Kobylkin
@ 2018-12-10 1:28 ` Egor Kobylkin
2018-12-19 23:31 ` Egor Kobylkin
2019-01-02 18:39 ` [PATCH v12] " Egor Kobylkin
2019-03-19 10:39 ` ping " Egor Kobylkin
13 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-10 1:28 UTC (permalink / raw)
To: libc-alpha, libc-locales, Marko Myllynen, mfabian, Dmitry V. Levin
[-- Attachment #1: Type: text/plain, Size: 5673 bytes --]
Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.
Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics
Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
'copy "tr_TR"'.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration rows to locale/C-translit.h.in.
The patch is attached.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:
iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- it produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here.
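
A minimal sketch of this effect, assuming a simplified model of the lookup
(the dict SAMPLE_ROWS and the function to_ascii are hypothetical names and
are not glibc code; the five rows are copied from the table in this patch):

```python
# Illustrative subset of the rows this patch adds (System B style output).
# SAMPLE_ROWS and to_ascii are hypothetical names, not part of glibc.
SAMPLE_ROWS = {
    "\u0436": "zh",  # CYRILLIC SMALL LETTER ZHE
    "\u0435": "e",   # CYRILLIC SMALL LETTER IE
    "\u0447": "ch",  # CYRILLIC SMALL LETTER CHE
    "\u0430": "a",   # CYRILLIC SMALL LETTER A
    "\u044e": "yu",  # CYRILLIC SMALL LETTER YU
}


def to_ascii(text, table):
    """Keep ASCII as-is; map other characters via the table, or to '?'."""
    return "".join(ch if ord(ch) < 128 else table.get(ch, "?") for ch in text)


sample = "\u0436\u0435 \u0447\u0430\u044e"  # the last two words of the test line
print(to_ascii(sample, {}))           # no rows available: ?? ???
print(to_ascii(sample, SAMPLE_ROWS))  # rows available:    zhe chayu
```

With an empty table every Cyrillic letter falls through to "?", which is
the behaviour reported in the bug; with the rows present the same input
becomes readable ASCII.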
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The patch content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].
The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus System B is chosen [11]. To be super clear, System A has
nothing to do with this bug, regardless of it being a transliteration.
Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
Best regards,
Egor Kobylkin
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 11581 bytes --]
From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* locale/C-translit.h.in: Add Cyrillic transliteration.
---
locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 170 insertions(+)
diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
Copyright (C) 2000-2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+ 0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
"\x02cd" "_" /* <U02CD> MODIFIER LETTER LOW MACRON */
"\x02d0" ":" /* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
"\x02dc" "~" /* <U02DC> SMALL TILDE */
+"\x0401" "YO" /* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402" "DJ" /* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403" "G`" /* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404" "YE" /* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405" "Z`" /* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406" "I" /* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407" "YI" /* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408" "J" /* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409" "L`" /* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a" "N`" /* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b" "TSH" /* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c" "K`" /* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e" "U`" /* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f" "DH" /* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410" "A" /* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411" "B" /* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412" "V" /* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413" "G" /* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414" "D" /* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415" "E" /* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416" "ZH" /* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417" "Z" /* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418" "I" /* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419" "J" /* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a" "K" /* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b" "L" /* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c" "M" /* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d" "N" /* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e" "O" /* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f" "P" /* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420" "R" /* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421" "S" /* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422" "T" /* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423" "U" /* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424" "F" /* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425" "X" /* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426" "CZ" /* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427" "CH" /* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428" "SH" /* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429" "SHH" /* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a" "``" /* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b" "Y`" /* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c" "`" /* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d" "E`" /* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e" "YU" /* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f" "YA" /* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430" "a" /* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431" "b" /* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432" "v" /* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433" "g" /* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434" "d" /* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435" "e" /* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436" "zh" /* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437" "z" /* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438" "i" /* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439" "j" /* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a" "k" /* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b" "l" /* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c" "m" /* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d" "n" /* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e" "o" /* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f" "p" /* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440" "r" /* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441" "s" /* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442" "t" /* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443" "u" /* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444" "f" /* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445" "x" /* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446" "cz" /* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447" "ch" /* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448" "sh" /* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449" "shh" /* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a" "``" /* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b" "y`" /* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c" "`" /* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d" "e`" /* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e" "yu" /* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f" "ya" /* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451" "yo" /* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452" "dj" /* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453" "g`" /* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454" "ye" /* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455" "z`" /* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456" "i" /* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457" "yi" /* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458" "j" /* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459" "l`" /* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a" "n`" /* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b" "tsh" /* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c" "k`" /* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e" "u`" /* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f" "dh" /* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a" "O`" /* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b" "o`" /* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472" "FH" /* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473" "fh" /* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474" "YH" /* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475" "yh" /* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c" "E`" /* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d" "e`" /* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490" "G`" /* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491" "g`" /* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492" "GH" /* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493" "gh" /* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494" "GH" /* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495" "gh" /* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496" "ZH`" /* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497" "zh`" /* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a" "K`" /* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b" "k`" /* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e" "K`" /* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f" "k`" /* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2" "N`" /* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3" "n`" /* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4" "NG" /* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5" "ng" /* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6" "P`" /* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7" "p`" /* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8" "O`" /* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9" "o`" /* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa" "C`" /* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab" "c`" /* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac" "T`" /* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad" "t`" /* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae" "U" /* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af" "u" /* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2" "H`" /* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3" "h`" /* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4" "TCZ" /* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5" "tcz" /* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba" "SH`" /* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb" "sh`" /* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc" "CH`" /* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd" "ch`" /* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be" "CH`" /* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf" "ch`" /* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0" "i" /* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1" "ZH`" /* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2" "zh`" /* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb" "CH`" /* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc" "ch`" /* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0" "A`" /* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1" "a`" /* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2" "A`" /* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3" "a`" /* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6" "E`" /* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7" "e`" /* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8" "A`" /* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9" "a`" /* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc" "ZH`" /* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd" "zh`" /* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de" "Z`" /* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df" "z`" /* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0" "Z`" /* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1" "z`" /* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4" "I`" /* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5" "i`" /* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6" "O`" /* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7" "o`" /* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8" "O`" /* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9" "o`" /* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0" "U`" /* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1" "u`" /* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2" "U`" /* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3" "u`" /* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4" "CH`" /* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5" "ch`" /* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8" "Y`" /* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9" "y`" /* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
"\x2002" " " /* <U2002> EN SPACE */
"\x2003" " " /* <U2003> EM SPACE */
"\x2004" " " /* <U2004> THREE-PER-EM SPACE */
--
2.17.1
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-08 12:37 ` Rafal Luzynski
@ 2018-12-10 21:29 ` Marko Myllynen
2018-12-19 22:42 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2018-12-10 21:29 UTC (permalink / raw)
To: Rafal Luzynski, Egor Kobylkin, libc-alpha, libc-locales
Cc: Mike Fabian, Carlos O'Donell
Hi,
On 08/12/2018 03.15, Rafal Luzynski wrote:
> 17.11.2018 19:34 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> The SH/Sh can be decided on either way - seems like an easy change any
>> way.
>
> I'm in favor of "Sh" because it will work fine for titlecased words
> (where only the first letter is uppercase) but I'm aware it would be
> a problem for uppercased words. Unfortunately, I think we are unable
> to satisfy both cases.
I think I'm in favor of "Sh" as well, although not perfect I'd assume
it's probably going to be correct in more cases than SH.
>> System A was added on Marko's request (so setting him on TO:) I am
>> neutral on keeping it or dropping it, just to be clear.
>
> I think I didn't see this Marko's request but I'm in favor of keeping
> System A, too.
>
> Marko, it would be good to hear your opinion about System A vs. System B
> again.
I think System A is a better option as it should be the same as ISO 9
and perhaps also produces results in some cases which are more expected
than with System B (if the Wikipedia ISO 9 article is to be believed).
Wrt BZ #2872 I think it's good to keep it in mind but IMHO we can also
deviate from it if needed, however with System A + ASCII fallback
definitions the RFE should be satisfied as well?
> 19.11.2018 20:35 Marko Myllynen <myllynen@redhat.com> wrote:
>> [...]
>> In any case once your patch lands I'm going to submit a follow-up patch
>> for fi_FI to make it compliant with the applicable national standard
>> (SFS 4900) which defines how to do Cyrillic transliteration /
>> transcription in the context of Finnish.
>
> I totally agree. As far as I can see, SFS 4900 is more similar to
> System A (ISO 9) rather than System B, that is, it transliterates to Latin
> characters with diacritics rather than plain ASCII. Marko, what is your
> opinion about possible implementation of SFS 4900 in these cases:
>
> * When the destination charset does not contain required Latin diacritic
> characters (e.g., it is plain ASCII)?
This would be according to http://jkorpela.fi/iso9.html8 so for example
instead of ž -> zh and instead of štš -> shtsh.
> * When the output is ambiguous, that means, when two different Cyrillic
> strings produce the same Latin (or ASCII) output?
This is a good point and one I haven't considered but I'm not sure is
there anything we can do about this (at least without major locale
system internals work)? Do you have any rough idea how frequently this
could happen or is this more a theoretical issue? (Sorry if I've missed
earlier comments about this, it's been a long thread.)
>> The same with having both System A and System B. Initially I went along
>> with the suggestion to include the system A but it is clear now that it
>> doesn't make fixing [BZ #2872] more straightforward. So I'd also propose
>> to set it aside for the moment and use the v10 without the system A.
>> That is the whole reason I have submitted it, to be superclear on that.
>
> OK, I think that now I understand your reason to drop System A better.
> But still I'd like to rethink implementing System A somehow and drop
> (or rather: implement only partially) System B.
Yes, I also think System A AKA ISO 9 would be a better choice but I'll
leave the final decision for you two (and others who might weigh in).
Thanks,
--
Marko Myllynen
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-10 21:29 ` Marko Myllynen
@ 2018-12-19 22:42 ` Rafal Luzynski
2018-12-19 22:56 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-19 22:42 UTC (permalink / raw)
To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales
10.12.2018 22:20 Marko Myllynen <myllynen@redhat.com> wrote:
>
> Hi,
>
> On 08/12/2018 03.15, Rafal Luzynski wrote:
> > [...]
> > Marko, it would be good to hear your opinion about System A vs. System B
> > again.
>
> I think System A is a better option as it should be the same as ISO 9
> and perhaps also produces results in some cases which are more expected
> than with System B (if the Wikipedia ISO 9 article is to be believed).
>
> Wrt BZ #2872 I think it's good to keep it in mind but IMHO we can also
> deviate from it if needed, however with System A + ASCII fallback
> definitions the RFE should be satisfied as well?
That's exactly what I meant (sorry if it was not clear before).
> > [...] Marko, what is your
> > opinion about possible implementation of SFS 4900 in these cases:
> >
> > * When the destination charset does not contain required Latin diacritic
> > characters (e.g., it is plain ASCII)?
>
> This would be according to http://jkorpela.fi/iso9.html8 so for example
> instead of ž -> zh and instead of štš -> shtsh.
Agree.
> > * When the output is ambiguous, that means, when two different Cyrillic
> > strings produce the same Latin (or ASCII) output?
>
> This is a good point and one I haven't considered but I'm not sure is
> there anything we can do about this (at least without major locale
> system internals work)?
I agree with the suggestion that we can't do much about it. I mean,
there are possibly solutions (like using more punctuation characters)
but they don't look natural to me.
> Do you have any rough idea how frequently this
> could happen or is this more a theoretical issue? (Sorry if I've missed
> earlier comments about this, it's been a long thread.)
Yes, Egor provided this example many times:
"схема" -> "shema" (if "с" -> "s" and "х" -> "h")
"шема" -> "shema" (if "ш" -> "sh")
I don't think it matters how frequent these cases are. I think the
question is whether ambiguity is a bug, because if it is, then even one
corner case proves that the solution is wrong.
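
The collision can be reproduced with a small sketch. The two tables and the
function name below are illustrative assumptions: NAIVE uses "h" for the
letter HA as in the example above, while PATCH_STYLE uses "x" as in the
v11 table.

```python
# Hypothetical mini-tables; only the letters needed for the example.
NAIVE = {"с": "s", "х": "h", "ш": "sh", "е": "e", "м": "m", "а": "a"}
PATCH_STYLE = {**NAIVE, "х": "x"}  # v11 maps CYRILLIC SMALL LETTER HA to "x"


def translit(word, table):
    """Map each character through the table, leaving unknown ones as-is."""
    return "".join(table.get(ch, ch) for ch in word)


for name, table in (("naive", NAIVE), ("patch-style", PATCH_STYLE)):
    first = translit("схема", table)
    second = translit("шема", table)
    print(name, first, second, "ambiguous" if first == second else "distinct")
```

With "h" for HA both words collapse to "shema"; with "x" they stay
distinct ("sxema" and "shema"). This only shows the one pair discussed
here and does not prove that the table as a whole is free of collisions.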
> [...]
> Yes, I also think System A AKA ISO 9 would be a better choice but I'll
> leave the final decision for you two (and others who might weigh in).
Egor is a native speaker so I respect his opinion even if I'm not fully
convinced for technical reasons. Sadly, nobody else provides any opinion
which could weigh. I am going to write a separate email about it.
Regards,
Rafal
* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-08 22:17 ` Egor Kobylkin
@ 2018-12-19 22:48 ` Rafal Luzynski
2018-12-19 23:16 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-19 22:48 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Dmitry V. Levin,
Marko Myllynen, mfabian
8.12.2018 22:51 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> Rafal, Dmitry, Marko, Mike
>
> On 08.12.18 00:35, Rafal Luzynski wrote:
> > 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
> >>
> >> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
> >> (transliteration to Latin with diacritics) as conflicting with
> >> System B within glibc mechanics and not solving BZ #2872
> >
> > I'm in favor of implementing System A and dropping System B instead.
>
> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
> fails". The ISO 9 System A does not map to ASCII so it is not a solution
> to BZ #2872 at all.
I did not mean implementing System A and nothing more. I meant implementing
System A and a fallback for ASCII which can be similar to System B but
we wouldn't be able to call it "System B" because it would differ in
few cases.
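
One possible shape of such a fallback, purely as an assumption for
illustration (nothing in this thread specifies how the fallback would be
built): decompose the System A letter and drop the combining marks. The
function name ascii_fallback is hypothetical; the U+0401 -> U+00CB pair is
the System A entry quoted elsewhere in this thread.

```python
import unicodedata


def ascii_fallback(latin_text):
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", latin_text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


system_a = "\u00cb"  # LATIN CAPITAL LETTER E WITH DIAERESIS, System A for U+0401
system_b = "YO"      # System B output for the same Cyrillic letter

print(system_a.isascii())             # False: System A output is not ASCII
print(ascii_fallback(system_a))       # E
print(ascii_fallback(system_a) == system_b)  # False: differs from System B
```

Under this assumption the fallback for U+0401 is "E" rather than "YO",
which is one of the cases where the result could no longer be called
System B.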
> I was scratching my head as to how can we avoid the explosion of the
> scope for this patch. And then it appeared to me that it was wrong to
> target all the present locales for the ASCII translit. This seems to be
> the root cause for this prolonged A vs. B discussions. The proper target
> for my table is actually the C locale translit file
> (locale/C-translit.h.in). I will submit a proper patch shortly.
I saw your patch v11 and now I must say I'm sorry for making noise because
it was me who said that I didn't mind adding Cyrillic -> ASCII
transliteration to C locale. I said so before taking a look at the current
contents of transliteration in C locale. When I looked at this I realized
that it does not support any national characters, even from modified Latin
alphabets (like used in most of western European languages). It only
contains mathematical, physical, commercial, diacritical etc. characters.
So I'm no longer sure it should support Cyrillic -> ASCII. But maybe again
I'm wrong, maybe it should support but just nobody implemented it yet.
> If anyone wants to keep working on the implementation of the Latin
> Diacritics transliteration of the Cyrillic letters (System A) you are
> welcome to use the tables I have submitted before (v9). That would be a
> new feature for glibc as per my understanding. Let's just make super
> clear the distinction of the System A (Latin with Diacritics, non-ASCII)
> to the ASCII translit as mentioned in BZ #2872 (System B).
I liked your v9 patch more. I really appreciate your work and I'm not
going to ask you to provide more patches because I think that so far you
have provided all possible versions. I hope that your work will not be
lost.
> My focus is super sharp on helping with Cyrillic -> ASCII translit
> availability for a default installation with glibc.
I understand your aim and I agree to support ASCII. Our disagreements are:
* whether to support conversion Cyrillic -> extended Latin as well,
* which standard to implement,
* what to do if the standard is ambiguous or if some details cannot be
implemented for technical reasons.
Regards,
Rafal
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-19 22:42 ` Rafal Luzynski
@ 2018-12-19 22:56 ` Egor Kobylkin
2018-12-20 0:06 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 22:56 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales
On 19.12.18 23:25, Rafal Luzynski wrote:
> 10.12.2018 22:20 Marko Myllynen <myllynen@redhat.com> wrote:
>
>> [...]
>> Yes, I also think System A AKA ISO 9 would be a better choice but I'll
>> leave the final decision for you two (and others who might weigh in).
>
> Egor is a native speaker so I respect his opinion even if I'm not fully
> convinced for technical reasons. Sadly, nobody else provides any opinion
> which could weigh. I am going to write a separate email about it.
>
> Regards,
>
> Rafal
>
It's not about which letter should be used for a particular
transliteration. I couldn't care less about that just to be clear.
Maybe I am missing something; could you tell me how exactly you want to
fit System A to ASCII?
Let's take the very first example from the table:
CyrillicUnicode  CyrillicLetter  CyrillicUnicodeName         LatinUnicode  System A Latin Letter  System B ASCII Letter
0401             Ё               CYRILLIC CAPITAL LETTER IO  00CB          Ë                      YO
so:
Cyrillic Ё U0401
System A - Ë U00CB - _not_ ASCII
System B - YO (or Yo) "<U0059><U004F>" - ASCII
Could you explain how we can make System A "Ë" be displayed or
processed somehow in a C locale? Or in a locale or program that doesn't
have "Ë" U00CB?
Bests,
Egor
* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-19 22:48 ` Rafal Luzynski
@ 2018-12-19 23:16 ` Egor Kobylkin
2018-12-20 0:14 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 23:16 UTC (permalink / raw)
To: libc-alpha, libc-locales, Dmitry V. Levin, Marko Myllynen, mfabian
On 19.12.18 23:41, Rafal Luzynski wrote:
> 8.12.2018 22:51 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Rafal, Dmitry, Marko, Mike
>>
>> On 08.12.18 00:35, Rafal Luzynski wrote:
>>> 19.11.2018 12:10 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v10: * Removed ISO 9.1995 GOST 7.79-2000 System A
>>>> (transliteration to Latin with diacritics) as conflicting with
>>>> System B within glibc mechanics and not solving BZ #2872
>>>
>>> I'm in favor of implementing System A and dropping System B instead.
>>
>> The BZ #2872 bug name is explicitly "Transliteration Cyrillic -> ASCII
>> fails". The ISO 9 System A does not map to ASCII so it is not a solution
>> to BZ #2872 at all.
>
> I did not mean implementing System A and nothing more. I meant implementing
> System A and a fallback for ASCII which can be similar to System B but
> we wouldn't be able to call it "System B" because it would differ in
> few cases.
Just for the record, I have no objection on my side to that (Using A as
a basis for ASCII as well).
But I'm not sure anymore that inserting a translit table into every
locale is the right solution for ASCII problem. Especially because
distributions may not include any locale but C.
>
>> I was scratching my head as to how can we avoid the explosion of the
>> scope for this patch. And then it appeared to me that it was wrong to
>> target all the present locales for the ASCII translit. This seems to be
>> the root cause for this prolonged A vs. B discussions. The proper target
>> for my table is actually the C locale translit file
>> (locale/C-translit.h.in). I will submit a proper patch shortly.
>
> I saw your patch v11 and now I must say I'm sorry for making noise because
> it was me who said that I didn't mind adding Cyrillic -> ASCII
> transliteration to C locale. I said so before taking a look at the current
> contents of transliteration in C locale. When I looked at this I realized
> that it does not support any national characters, even from modified Latin
> alphabets (like used in most of western European languages). It only
> contains mathematical, physical, commercial, diacritical etc. characters.
> So I'm no longer sure it should support Cyrillic -> ASCII. But maybe again
> I'm wrong, maybe it should support but just nobody implemented it yet.
Actually there are quite a few letters already transliterated in
locale/C-translit.h.in. (Note the CAPCAP transliteration style for the
capitals, i.e. LATIN CAPITAL LETTER AE is mapped to AE, not to Ae.)
"\x00c6" "AE" /* <U00C6> LATIN CAPITAL LETTER AE */
"\x00d7" "x" /* <U00D7> MULTIPLICATION SIGN */
"\x00df" "ss" /* <U00DF> LATIN SMALL LETTER SHARP S */
"\x00e6" "ae" /* <U00E6> LATIN SMALL LETTER AE */
"\x0132" "IJ" /* <U0132> LATIN CAPITAL LIGATURE IJ */
"\x0133" "ij" /* <U0133> LATIN SMALL LIGATURE IJ */
"\x0149" "'n" /* <U0149> LATIN SMALL LETTER N PRECEDED BY APOSTROPHE */
"\x0152" "OE" /* <U0152> LATIN CAPITAL LIGATURE OE */
"\x0153" "oe" /* <U0153> LATIN SMALL LIGATURE OE */
"\x017f" "s" /* <U017F> LATIN SMALL LETTER LONG S */
"\x01c7" "LJ" /* <U01C7> LATIN CAPITAL LETTER LJ */
"\x01c8" "Lj" /* <U01C8> LATIN CAPITAL LETTER L WITH SMALL LETTER J */
"\x01c9" "lj" /* <U01C9> LATIN SMALL LETTER LJ */
"\x01ca" "NJ" /* <U01CA> LATIN CAPITAL LETTER NJ */
"\x01cb" "Nj" /* <U01CB> LATIN CAPITAL LETTER N WITH SMALL LETTER J */
"\x01cc" "nj" /* <U01CC> LATIN SMALL LETTER NJ */
"\x01f1" "DZ" /* <U01F1> LATIN CAPITAL LETTER DZ */
"\x01f2" "Dz" /* <U01F2> LATIN CAPITAL LETTER D WITH SMALL LETTER Z */
"\x01f3" "dz" /* <U01F3> LATIN SMALL LETTER DZ */
>> My focus is super sharp on helping with Cyrillic -> ASCII translit
>> availability for a default installation with glibc.
>
> I understand your aim and I agree to support ASCII. Our disagreements are:
>
> * whether to support conversion Cyrillic -> extended Latin as well,
no contest on my side
> * which standard to implement,
no contest on my side
> * what to do if the standard is ambiguous or if some details cannot be
> implemented for technical reasons.
no contest on my side either
I just think we can sidestep all of those decisions with a smaller, pure
ASCII patch first (which is also more useful if it covers the C locale).
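
The rows quoted from locale/C-translit.h.in above all share one simple
shape, so they can be read mechanically. What follows is only a minimal
Python sketch, assuming each row is a quoted \xXXXX key followed by exactly
one quoted replacement and an optional comment. The names parse_rows, ROW
and SAMPLE are made up for this illustration; this is not the tooling glibc
itself uses to process the file.

```python
import re

# Assumed row shape: "\xXXXX" "replacement" /* optional comment */
# A leading "+" (as found in a diff) is tolerated.
ROW = re.compile(r'^\+?"\\x([0-9a-fA-F]{4})"\s+"([^"]*)"')


def parse_rows(text):
    """Return {character: replacement} for every line that matches ROW."""
    table = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            table[chr(int(match.group(1), 16))] = match.group(2)
    return table


# Three rows copied from the listing above.
SAMPLE = r'''
"\x00c6" "AE" /* <U00C6> LATIN CAPITAL LETTER AE */
"\x00df" "ss" /* <U00DF> LATIN SMALL LETTER SHARP S */
"\x01c8" "Lj" /* <U01C8> LATIN CAPITAL LETTER L WITH SMALL LETTER J */
'''

table = parse_rows(SAMPLE)
print(table)
```

Lines that do not match the assumed shape (comments, blank lines, diff
headers) are skipped silently, which keeps the sketch short but would hide
malformed rows in real use.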
* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2018-12-10 1:28 ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
@ 2018-12-19 23:31 ` Egor Kobylkin
2018-12-26 12:14 ` Siddhesh Poyarekar
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-19 23:31 UTC (permalink / raw)
To: libc-alpha, libc-locales, Marko Myllynen, Carlos O'Donell
[-- Attachment #1: Type: text/plain, Size: 6423 bytes --]
Freeze ping.
I'd like to ping the list on this patch and to have some discussion on
moving ASCII transliteration to locale/C-translit.h.in before the freeze.
The wiki page for 2.29 [12] is set as "immutable" for newly registered
users; I am not sure whether that is intended. I could not add this patch
there as "desired".
I have added 2.29 keyword to the bug entry.
Bests,
Egor Kobylkin
[12] https://sourceware.org/glibc/wiki/Release/2.29
On 08.12.18 23:28, Egor Kobylkin wrote:
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
>
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
>
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
>
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
>
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> 'copy "tr_TR"'.
>
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
> to sequences of all uppercase Latin letters in all languages (whenever
> a Cyrillic letter is transliterated to more than one Latin letter),
> for example "Ї" is now transliterated as "YI" rather than "Yi".
>
> Dear locale maintainers,
>
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>
> add the Cyrillic transliteration rows to locale/C-translit.h.in.
>
> The patch is attached.
>
>
> Current bug effect:
>
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
>
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>
> - it produces a string of question marks and spaces.
>
> This is what it should produce, and it does so after the patch is applied:
>
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
>
>
> The root problem and the fix:
>
> The root problem is the missing transliteration table that I am
> supplying here.
>
>
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from a
> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
>
> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> a transliteration/transcription has only Latin/ASCII codes but still can
> be read by a native speaker. Among other things it is useful for
> processing the Cyrillic texts and filenames by programs or on systems
> that are not specifically prepared to work with Cyrillic, don't have
> corresponding fonts installed or can't handle UTF-8.
>
> The patch content (mapping) is based on ISO 9.1995 standard [10] and its
> derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology Of Russian Federation [2]).
> Technically an independent but mostly identical source [3] was used and
> prepared in a spreadsheet [6].
>
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes when converting Cyrillic
> to ASCII, and thus System B is chosen [11]. To be clear, System A has
> nothing to do with this bug, even though it is a transliteration.
>
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with Diacritic as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
>
> Links:
>
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
>
> Best regards,
> Egor Kobylkin
>
>
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 11581 bytes --]
From b9cd550028ecf7c875c9d7250c8598433b1fc474 Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Sat, 8 Dec 2018 22:08:59 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* locale/C-translit.h.in: Add Cyrillic transliteration.
---
locale/C-translit.h.in | 170 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 170 insertions(+)
diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index e27f39e8fe..bd64edc609 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -2,6 +2,7 @@
Copyright (C) 2000-2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
Contributed by Ulrich Drepper <drepper@redhat.com>, 2000.
+ 0401-04f9 contributed by Egor Kobylkin <Egor@Kobylkin.com>, 2018.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
@@ -56,6 +57,175 @@
"\x02cd" "_" /* <U02CD> MODIFIER LETTER LOW MACRON */
"\x02d0" ":" /* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
"\x02dc" "~" /* <U02DC> SMALL TILDE */
+"\x0401" "YO" /* <U0401> CYRILLIC CAPITAL LETTER IO */
+"\x0402" "DJ" /* <U0402> CYRILLIC CAPITAL LETTER DJE */
+"\x0403" "G`" /* <U0403> CYRILLIC CAPITAL LETTER GJE */
+"\x0404" "YE" /* <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE */
+"\x0405" "Z`" /* <U0405> CYRILLIC CAPITAL LETTER DZE */
+"\x0406" "I" /* <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0407" "YI" /* <U0407> CYRILLIC CAPITAL LETTER YI */
+"\x0408" "J" /* <U0408> CYRILLIC CAPITAL LETTER JE */
+"\x0409" "L`" /* <U0409> CYRILLIC CAPITAL LETTER LJE */
+"\x040a" "N`" /* <U040A> CYRILLIC CAPITAL LETTER NJE */
+"\x040b" "TSH" /* <U040B> CYRILLIC CAPITAL LETTER TSHE */
+"\x040c" "K`" /* <U040C> CYRILLIC CAPITAL LETTER KJE */
+"\x040e" "U`" /* <U040E> CYRILLIC CAPITAL LETTER SHORT U */
+"\x040f" "DH" /* <U040F> CYRILLIC CAPITAL LETTER DZHE */
+"\x0410" "A" /* <U0410> CYRILLIC CAPITAL LETTER A */
+"\x0411" "B" /* <U0411> CYRILLIC CAPITAL LETTER BE */
+"\x0412" "V" /* <U0412> CYRILLIC CAPITAL LETTER VE */
+"\x0413" "G" /* <U0413> CYRILLIC CAPITAL LETTER GHE */
+"\x0414" "D" /* <U0414> CYRILLIC CAPITAL LETTER DE */
+"\x0415" "E" /* <U0415> CYRILLIC CAPITAL LETTER IE */
+"\x0416" "ZH" /* <U0416> CYRILLIC CAPITAL LETTER ZHE */
+"\x0417" "Z" /* <U0417> CYRILLIC CAPITAL LETTER ZE */
+"\x0418" "I" /* <U0418> CYRILLIC CAPITAL LETTER I */
+"\x0419" "J" /* <U0419> CYRILLIC CAPITAL LETTER SHORT I */
+"\x041a" "K" /* <U041A> CYRILLIC CAPITAL LETTER KA */
+"\x041b" "L" /* <U041B> CYRILLIC CAPITAL LETTER EL */
+"\x041c" "M" /* <U041C> CYRILLIC CAPITAL LETTER EM */
+"\x041d" "N" /* <U041D> CYRILLIC CAPITAL LETTER EN */
+"\x041e" "O" /* <U041E> CYRILLIC CAPITAL LETTER O */
+"\x041f" "P" /* <U041F> CYRILLIC CAPITAL LETTER PE */
+"\x0420" "R" /* <U0420> CYRILLIC CAPITAL LETTER ER */
+"\x0421" "S" /* <U0421> CYRILLIC CAPITAL LETTER ES */
+"\x0422" "T" /* <U0422> CYRILLIC CAPITAL LETTER TE */
+"\x0423" "U" /* <U0423> CYRILLIC CAPITAL LETTER U */
+"\x0424" "F" /* <U0424> CYRILLIC CAPITAL LETTER EF */
+"\x0425" "X" /* <U0425> CYRILLIC CAPITAL LETTER HA */
+"\x0426" "CZ" /* <U0426> CYRILLIC CAPITAL LETTER TSE */
+"\x0427" "CH" /* <U0427> CYRILLIC CAPITAL LETTER CHE */
+"\x0428" "SH" /* <U0428> CYRILLIC CAPITAL LETTER SHA */
+"\x0429" "SHH" /* <U0429> CYRILLIC CAPITAL LETTER SHCHA */
+"\x042a" "A`" /* <U042A> CYRILLIC CAPITAL LETTER HARD SIGN */
+"\x042b" "Y`" /* <U042B> CYRILLIC CAPITAL LETTER YERU */
+"\x042c" "`" /* <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN */
+"\x042d" "E`" /* <U042D> CYRILLIC CAPITAL LETTER E */
+"\x042e" "YU" /* <U042E> CYRILLIC CAPITAL LETTER YU */
+"\x042f" "YA" /* <U042F> CYRILLIC CAPITAL LETTER YA */
+"\x0430" "a" /* <U0430> CYRILLIC SMALL LETTER A */
+"\x0431" "b" /* <U0431> CYRILLIC SMALL LETTER BE */
+"\x0432" "v" /* <U0432> CYRILLIC SMALL LETTER VE */
+"\x0433" "g" /* <U0433> CYRILLIC SMALL LETTER GHE */
+"\x0434" "d" /* <U0434> CYRILLIC SMALL LETTER DE */
+"\x0435" "e" /* <U0435> CYRILLIC SMALL LETTER IE */
+"\x0436" "zh" /* <U0436> CYRILLIC SMALL LETTER ZHE */
+"\x0437" "z" /* <U0437> CYRILLIC SMALL LETTER ZE */
+"\x0438" "i" /* <U0438> CYRILLIC SMALL LETTER I */
+"\x0439" "j" /* <U0439> CYRILLIC SMALL LETTER SHORT I */
+"\x043a" "k" /* <U043A> CYRILLIC SMALL LETTER KA */
+"\x043b" "l" /* <U043B> CYRILLIC SMALL LETTER EL */
+"\x043c" "m" /* <U043C> CYRILLIC SMALL LETTER EM */
+"\x043d" "n" /* <U043D> CYRILLIC SMALL LETTER EN */
+"\x043e" "o" /* <U043E> CYRILLIC SMALL LETTER O */
+"\x043f" "p" /* <U043F> CYRILLIC SMALL LETTER PE */
+"\x0440" "r" /* <U0440> CYRILLIC SMALL LETTER ER */
+"\x0441" "s" /* <U0441> CYRILLIC SMALL LETTER ES */
+"\x0442" "t" /* <U0442> CYRILLIC SMALL LETTER TE */
+"\x0443" "u" /* <U0443> CYRILLIC SMALL LETTER U */
+"\x0444" "f" /* <U0444> CYRILLIC SMALL LETTER EF */
+"\x0445" "x" /* <U0445> CYRILLIC SMALL LETTER HA */
+"\x0446" "cz" /* <U0446> CYRILLIC SMALL LETTER TSE */
+"\x0447" "ch" /* <U0447> CYRILLIC SMALL LETTER CHE */
+"\x0448" "sh" /* <U0448> CYRILLIC SMALL LETTER SHA */
+"\x0449" "shh" /* <U0449> CYRILLIC SMALL LETTER SHCHA */
+"\x044a" "``" /* <U044A> CYRILLIC SMALL LETTER HARD SIGN */
+"\x044b" "y`" /* <U044B> CYRILLIC SMALL LETTER YERU */
+"\x044c" "`" /* <U044C> CYRILLIC SMALL LETTER SOFT SIGN */
+"\x044d" "e`" /* <U044D> CYRILLIC SMALL LETTER E */
+"\x044e" "yu" /* <U044E> CYRILLIC SMALL LETTER YU */
+"\x044f" "ya" /* <U044F> CYRILLIC SMALL LETTER YA */
+"\x0451" "yo" /* <U0451> CYRILLIC SMALL LETTER IO */
+"\x0452" "dj" /* <U0452> CYRILLIC SMALL LETTER DJE */
+"\x0453" "g`" /* <U0453> CYRILLIC SMALL LETTER GJE */
+"\x0454" "ye" /* <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE */
+"\x0455" "z`" /* <U0455> CYRILLIC SMALL LETTER DZE */
+"\x0456" "i" /* <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
+"\x0457" "yi" /* <U0457> CYRILLIC SMALL LETTER YI */
+"\x0458" "j" /* <U0458> CYRILLIC SMALL LETTER JE */
+"\x0459" "l`" /* <U0459> CYRILLIC SMALL LETTER LJE */
+"\x045a" "n`" /* <U045A> CYRILLIC SMALL LETTER NJE */
+"\x045b" "tsh" /* <U045B> CYRILLIC SMALL LETTER TSHE */
+"\x045c" "k`" /* <U045C> CYRILLIC SMALL LETTER KJE */
+"\x045e" "u`" /* <U045E> CYRILLIC SMALL LETTER SHORT U */
+"\x045f" "dh" /* <U045F> CYRILLIC SMALL LETTER DZHE */
+"\x046a" "O`" /* <U046A> CYRILLIC CAPITAL LETTER BIG YUS */
+"\x046b" "o`" /* <U046B> CYRILLIC SMALL LETTER BIG YUS */
+"\x0472" "FH" /* <U0472> CYRILLIC CAPITAL LETTER FITA */
+"\x0473" "fh" /* <U0473> CYRILLIC SMALL LETTER FITA */
+"\x0474" "YH" /* <U0474> CYRILLIC CAPITAL LETTER IZHITSA */
+"\x0475" "yh" /* <U0475> CYRILLIC SMALL LETTER IZHITSA */
+"\x048c" "E`" /* <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN */
+"\x048d" "e`" /* <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN */
+"\x0490" "G`" /* <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
+"\x0491" "g`" /* <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN */
+"\x0492" "GH" /* <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE */
+"\x0493" "gh" /* <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE */
+"\x0494" "GH" /* <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK */
+"\x0495" "gh" /* <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK */
+"\x0496" "ZH`" /* <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER */
+"\x0497" "zh`" /* <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER */
+"\x049a" "K`" /* <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER */
+"\x049b" "k`" /* <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER */
+"\x049e" "K`" /* <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE */
+"\x049f" "k`" /* <U049F> CYRILLIC SMALL LETTER KA WITH STROKE */
+"\x04a2" "N`" /* <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER */
+"\x04a3" "n`" /* <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER */
+"\x04a4" "NG" /* <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE */
+"\x04a5" "ng" /* <U04A5> CYRILLIC SMALL LIGATURE EN GHE */
+"\x04a6" "P`" /* <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK */
+"\x04a7" "p`" /* <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK */
+"\x04a8" "O`" /* <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA */
+"\x04a9" "o`" /* <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA */
+"\x04aa" "C`" /* <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER */
+"\x04ab" "C`" /* <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER */
+"\x04ac" "T`" /* <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER */
+"\x04ad" "t`" /* <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER */
+"\x04ae" "U" /* <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U */
+"\x04af" "u" /* <U04AF> CYRILLIC SMALL LETTER STRAIGHT U */
+"\x04b2" "H`" /* <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER */
+"\x04b3" "h`" /* <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER */
+"\x04b4" "TCZ" /* <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE */
+"\x04b5" "tcz" /* <U04B5> CYRILLIC SMALL LIGATURE TE TSE */
+"\x04ba" "SH`" /* <U04BA> CYRILLIC CAPITAL LETTER SHHA */
+"\x04bb" "SH`" /* <U04BB> CYRILLIC SMALL LETTER SHHA */
+"\x04bc" "CH`" /* <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE */
+"\x04bd" "ch`" /* <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE */
+"\x04be" "CH`" /* <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04bf" "ch`" /* <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER */
+"\x04c0" "i" /* <U04C0> CYRILLIC LETTER PALOCHKA */
+"\x04c1" "ZH`" /* <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE */
+"\x04c2" "zh`" /* <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE */
+"\x04cb" "CH`" /* <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE */
+"\x04cc" "ch`" /* <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE */
+"\x04d0" "A`" /* <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE */
+"\x04d1" "a`" /* <U04D1> CYRILLIC SMALL LETTER A WITH BREVE */
+"\x04d2" "A`" /* <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS */
+"\x04d3" "a`" /* <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS */
+"\x04d6" "E`" /* <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE */
+"\x04d7" "e`" /* <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE */
+"\x04d8" "A`" /* <U04D8> CYRILLIC CAPITAL LETTER SCHWA */
+"\x04d9" "a`" /* <U04D9> CYRILLIC SMALL LETTER SCHWA */
+"\x04dc" "ZH`" /* <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS */
+"\x04dd" "zh`" /* <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS */
+"\x04de" "Z`" /* <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS */
+"\x04df" "z`" /* <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS */
+"\x04e0" "Z`" /* <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE */
+"\x04e1" "z`" /* <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE */
+"\x04e4" "I`" /* <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS */
+"\x04e5" "i`" /* <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS */
+"\x04e6" "O`" /* <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS */
+"\x04e7" "o`" /* <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS */
+"\x04e8" "O`" /* <U04E8> CYRILLIC CAPITAL LETTER BARRED O */
+"\x04e9" "o`" /* <U04E9> CYRILLIC SMALL LETTER BARRED O */
+"\x04f0" "U`" /* <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS */
+"\x04f1" "u`" /* <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS */
+"\x04f2" "U`" /* <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE */
+"\x04f3" "u`" /* <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE */
+"\x04f4" "CH`" /* <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS */
+"\x04f5" "ch`" /* <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS */
+"\x04f8" "Y`" /* <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS */
+"\x04f9" "y`" /* <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS */
"\x2002" " " /* <U2002> EN SPACE */
"\x2003" " " /* <U2003> EM SPACE */
"\x2004" " " /* <U2004> THREE-PER-EM SPACE */
--
2.17.1
* Re: [PATCH v9] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-19 22:56 ` Egor Kobylkin
@ 2018-12-20 0:06 ` Rafal Luzynski
0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-20 0:06 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales
19.12.2018 23:48 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> Maybe I am missing something; could you tell me how exactly you want to
> fit System A to ASCII?
>
> Let's take the very first example from the table:
> CyrillicUnicode CyrillicLetter CyrillicUnicodeName LatinUnicode System A
> Latin Letter System B ASCII Letter
> 0401 Ё CYRILLIC CAPITAL LETTER IO 00CB Ë YO
>
> so:
> Cyrillic Ё U0401
> System A - Ë U00CB - _not_ ASCII
> System B - YO (or Yo) "<U0059><U004F>" - ASCII
>
> Could you explain how we can make the System A "Ë" be displayed or
> processed somehow in a C locale? Or in a locale or program that doesn't
> have "Ë" U00CB?
It should be "YO" (or "Yo"). Exactly as you provided in your previous
patches.
I am afraid that my description "Cyrillic -> Latin -> ASCII" was too
ambiguous; I am sorry about that. It is actually a priority list which says:
convert Cyrillic "Ё" into Latin "Ë" if possible, otherwise into "YO" ("Yo").
We can drop the "Cyrillic -> Latin -> ASCII" picture as too ambiguous
and invent a better one.
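
The "try Latin first, otherwise ASCII" idea can be pictured as an ordered
list of candidates per character, where the first candidate that the target
character set can represent wins. The following is a minimal Python sketch
under that assumption. CANDIDATES, encodable and transliterate are
hypothetical names for this illustration, Python codec names stand in for
the target charset, and nothing here is glibc code.

```python
# Hypothetical candidate list: most preferred replacement first.
CANDIDATES = {
    "\u0401": ["\u00cb", "YO"],  # Ё -> Ë if representable, otherwise YO
}


def encodable(text, encoding):
    """True if text can be represented in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, encoding):
    """Keep encodable characters; otherwise use the first candidate that fits."""
    out = []
    for ch in text:
        if encodable(ch, encoding):
            out.append(ch)
            continue
        for candidate in CANDIDATES.get(ch, []):
            if encodable(candidate, encoding):
                out.append(candidate)
                break
        else:
            out.append("?")  # no candidate fits the target encoding
    return "".join(out)


print(transliterate("\u0401", "latin-1"))  # Latin-1 can hold the first candidate
print(transliterate("\u0401", "ascii"))    # ASCII falls through to the digraph
```

With this ordering a single table serves both targets: an extended Latin
charset picks the letter with the diacritic, and pure ASCII falls back to
the two-letter form.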
Regards,
Rafal
* Re: [PATCH v10] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
2018-12-19 23:16 ` Egor Kobylkin
@ 2018-12-20 0:14 ` Rafal Luzynski
0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-20 0:14 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Dmitry V. Levin,
Marko Myllynen, mfabian
20.12.2018 00:02 Egor Kobylkin <egor@kobylkin.com> wrote:
> [...]
> But I'm not sure anymore that inserting a translit table into every
> locale is the right solution for ASCII problem. Especially because
> distributions may not include any locale but C.
My question (and my doubt) is whether they want to support Cyrillic
transliteration in that case. If yes, then maybe they want more
transliterations as well. I'm not saying we will include them now;
I just wonder why they have not yet been included in C.
> [...]
> Actually there are quite a few letters already transliterated in
> locale/C-translit.h.in.
Sure, my list was not complete and I did not mean there are no Latin
characters supported. But there is nothing from the long list of
á, à, ä, ã, ǎ, å, ā, ą, ạ, ȧ, ć, ĉ, ç, é, è, ë, ...
> (Note the CAPCAP transliteration style for the
> capitals, i.e. LATIN CAPITAL LETTER AE is mapped to AE, not to Ae.)
Sure, because they are ligatures: "A" + "E", not "A" + "e". Note that
where three variants of ligatures exist, like "LJ", "Lj", "lj" then
all three are supported.
> [ cut the list ]
>
> > [...]
> > I understand your aim and I agree to support ASCII. Our disagreements
> > are:
> >
> > * whether to support conversion Cyrillic -> extended Latin as well,
> no contest on my side
> > * which standard to implement,
> no contest on my side
> > * what to do if the standard is ambiguous or if some details cannot be
> > implemented for technical reasons.
> no contest on my side either
Good, three steps forward.
Regards,
Rafal
* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2018-12-19 23:31 ` Egor Kobylkin
@ 2018-12-26 12:14 ` Siddhesh Poyarekar
2018-12-26 14:55 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2018-12-26 12:14 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Marko Myllynen,
Carlos O'Donell, digitalfreak
On 20/12/18 4:46 AM, Egor Kobylkin wrote:
> Freeze ping.
>
> I'd like to ping the list on this patch and to have some discussion on
> moving ASCII transliteration to locale/C-translit.h.in before the freeze.
>
> The wiki page for 2.29 [12] is set as "immutable" for newly registered
> users, not sure it is so desired. I could not add this patch there as
> "desired".
> I have added 2.29 keyword to the bug entry.
>
> Bests,
> Egor Kobylkin
>
>
> [12] https://sourceware.org/glibc/wiki/Release/2.29
cc'd Rafal since I am not equipped to review this. Only nit I can point
out is that you need to remove the "Contributed by" line that you added;
we don't do that any more. You can remove the earlier contributed by
line too since it's no longer part of our process.
Also, if you'd like edit access to the wiki then please tell me your
username (assuming you've created an account on the wiki, please do if
you haven't) and I'll add you to the editor group. It's a measure we
added to counter the high amounts of spam we faced on the wiki.
Thanks,
Siddhesh
* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2018-12-26 12:14 ` Siddhesh Poyarekar
@ 2018-12-26 14:55 ` Egor Kobylkin
2018-12-27 1:47 ` Siddhesh Poyarekar
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2018-12-26 14:55 UTC (permalink / raw)
To: Siddhesh Poyarekar, libc-alpha, libc-locales, Marko Myllynen,
Carlos O'Donell, digitalfreak
On 26.12.18 11:07, Siddhesh Poyarekar wrote:
> On 20/12/18 4:46 AM, Egor Kobylkin wrote:
>> Freeze ping.
>>
>> I'd like to ping the list on this patch and to have some discussion on
>> moving ASCII transliteration to locale/C-translit.h.in before the freeze.
>>
>> The wiki page for 2.29 [12] is set as "immutable" for newly registered
>> users, not sure it is so desired. I could not add this patch there as
>> "desired".
>> I have added 2.29 keyword to the bug entry.
>>
>> Bests,
>> Egor Kobylkin
>>
>>
>> [12] https://sourceware.org/glibc/wiki/Release/2.29
>
> cc'd Rafal since I am not equipped to review this. Only nit I can point
> out is that you need to remove the "Contributed by" line that you added;
> we don't do that any more. You can remove the earlier contributed by
> line too since it's no longer part of our process.
>
> Also, if you'd like edit access to the wiki then please tell me your
> username (assuming you've created an account on the wiki, please do if
> you haven't) and I'll add you to the editor group. It's a measure we
> added to counter the high amounts of spam we faced on the wiki.
>
> Thanks,
> Siddhesh
Thanks, Siddhesh, yes, please could you add my username EgorKobylkin to
the editors group.
Rafal has requested help and guidance about this patch in another email
to this list [1]. I hope other members would chime in on that in time
for 2.29. I understand we need input from those involved in C locale
that is compiled into the libc binaries (as opposed to the rest of
locales that are shipped in plain text, not compiled).
@Rafal - I know you have asked to drop your email from To: as you are
getting them through your list subscription and so twice. But I guess
To: is still helpful to see who is involved. I am not subscribed to the
list myself, so I would like my email to be kept on To: or CC: for this.
Bests,
Egor
[1] https://sourceware.org/ml/libc-alpha/2018-12/msg00787.html
* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2018-12-26 14:55 ` Egor Kobylkin
@ 2018-12-27 1:47 ` Siddhesh Poyarekar
2018-12-27 11:36 ` Rafal Luzynski
0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2018-12-27 1:47 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Marko Myllynen,
Carlos O'Donell, digitalfreak
On 26/12/18 5:43 PM, Egor Kobylkin wrote:
> Thanks, Siddhesh, yes, please could you add my username EgorKobylkin to
> the editors group.
Done. Here's a weird statistic: you're the first user on that wiki with
name starting with E!
> Rafal has requested help and guidance about this patch in another email
> to this list [1]. I hope other members would chime in on that in time
> for 2.29. I understand we need input from those involved in C locale
> that is compiled into the libc binaries (as opposed to the rest of
> locales that are shipped in plain text, not compiled).
Ah OK, I missed that email. It'll have to wait for more input, though,
because as I said, I don't have enough experience in locales to make
an intelligent comment, definitely not for Cyrillic.
> @Rafal - I know you have asked to drop your email from To: as you are
> getting them through your list subscription and so twice. But I guess
> To: is still helpful to see who is involved. I am not subscribed to the
> list myself, so I would like my email to be kept on To: or CC: for this.
I added @Rafal because it's kinda standard practice to do that to get an
individual's attention since otherwise an email could get lost in the
traffic. @Rafal, I'll remove it if you object.
Siddhesh
* Re: [PATCH v11] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2018-12-27 1:47 ` Siddhesh Poyarekar
@ 2018-12-27 11:36 ` Rafal Luzynski
0 siblings, 0 replies; 111+ messages in thread
From: Rafal Luzynski @ 2018-12-27 11:36 UTC (permalink / raw)
To: Siddhesh Poyarekar, Egor Kobylkin, libc-alpha, libc-locales,
Marko Myllynen, Carlos O'Donell
27.12.2018 02:30 Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
>
> On 26/12/18 5:43 PM, Egor Kobylkin wrote:
> [...]
> > Rafal has requested help and guidance about this patch in another email
> > to this list [1]. I hope other members would chime in on that in time
> > for 2.29. I understand we need input from those involved in C locale
> > that is compiled into the libc binaries (as opposed to the rest of
> > locales that are shipped in plain text, not compiled).
>
> Ah OK, I missed that email. It'll have to wait for more inputs though
> because like I said, I don't have enough experience in locales to make
> an intelligent comment, definitely not for Cyrillic.
My email is here:
https://sourceware.org/ml/libc-alpha/2018-12/msg00787.html
My questions are not specific to Cyrillic; they are about how
transliteration should be implemented in general. You may replace "Cyrillic"
with any other script you know and ask yourself "how would I implement
transliteration from Foo Alphabet to ASCII".
I think that so far there has been no transliteration common to all
locales except translit_combine, which just removes the combining
diacritic characters.
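
To illustrate the effect of removing combining diacritics, here is a
minimal Python sketch. It uses Unicode NFD decomposition as a stand-in and
is not how translit_combine itself is written; strip_combining is a name
made up for this example.

```python
import unicodedata


def strip_combining(text):
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_combining("\u00e1 \u00e8 \u00e7"))  # input is: á è ç
```

Letters without a canonical decomposition, such as the ligature "æ", pass
through unchanged, so they still need explicit table rows.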
Can we have a live meeting, for example on IRC? I think that we could get
more questions answered in direct conversation. By email we manage
little more than one question and answer per day.
> > @Rafal - I know you have asked to drop your email from To: as you are
> > getting them through your list subscription and so twice. But I guess
> > To: is still helpful to see who is involved. I am not subscribed to the
> > list myself, so I would like my email to be kept on To: or CC: for this.
>
> I added @Rafal because it's kinda standard practice to do that to get an
> individual's attention since otherwise an email could get lost in the
> traffic. @Rafal, I'll remove it if you object.
I don't object here. Previously I was complaining about large patches
which arrive in two copies and tend to exceed my email quota. Regular
conversation does not cause much problem for me.
Regards,
Rafal
* [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (11 preceding siblings ...)
2018-12-10 1:28 ` [PATCH v11] Locales: Cyrillic -> ASCII transliteration " Egor Kobylkin
@ 2019-01-02 18:39 ` Egor Kobylkin
2019-01-05 14:36 ` Rafal Luzynski
2019-04-09 1:27 ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
2019-03-19 10:39 ` ping " Egor Kobylkin
13 siblings, 2 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-02 18:39 UTC (permalink / raw)
To: libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Marko Myllynen, mfabian
[-- Attachment #1: Type: text/plain, Size: 5987 bytes --]
Changelog v12:
* Adjusted to the new comment style suddenly appearing in the target
file locale/C-translit.h.in (the original file changed on the master
branch from /* style to # style since v11)
* Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to
"sh`" instead of erroneous "SH`" in v11
Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.
Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics
Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
'copy "tr_TR"'.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
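The uppercase rule from v6 can be checked mechanically. The following is only a Python sketch, not part of the patch: the regular expression assumes the row format used in the attached diff, `case_problems` is a hypothetical helper name, and the last sample row is deliberately wrong so that the check has something to report.

```python
import re

# Row format used by the patch: "\xNNNN" "ASCII" # <UNNNN> UNICODE NAME
ROW = re.compile(r'^\+?"\\x([0-9a-fA-F]{4})"\s+"([^"]*)"\s+#\s+<U[0-9A-F]{4}>\s+(.*)$')

# Sample rows; the SHCHA row is intentionally miscased for the demonstration.
SAMPLE = r'''
"\x0407" "YI" # <U0407> CYRILLIC CAPITAL LETTER YI
"\x0457" "yi" # <U0457> CYRILLIC SMALL LETTER YI
"\x0416" "ZH" # <U0416> CYRILLIC CAPITAL LETTER ZHE
"\x0436" "zh" # <U0436> CYRILLIC SMALL LETTER ZHE
"\x0429" "Shh" # <U0429> CYRILLIC CAPITAL LETTER SHCHA
'''


def case_problems(text):
    """Return names of rows whose replacement case disagrees with the letter case."""
    bad = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skip blank lines and anything that is not a table row
        _, repl, name = m.groups()
        if " CAPITAL " in name and repl != repl.upper():
            bad.append(name)
        elif " SMALL " in name and repl != repl.lower():
            bad.append(name)
    return bad


print(case_problems(SAMPLE))  # ['CYRILLIC CAPITAL LETTER SHCHA']
```

Running the same function over the added lines of a patch file would list any capital row that is not all uppercase and any small row that is not all lowercase.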
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration rows to locale/C-translit.h.in.
The patch is attached.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:
iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- it produces a string of question marks and spaces.
This is what it should produce, and does produce once the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here.
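To make the effect concrete, here is a minimal Python sketch of a table lookup. It is not the glibc code path: the dictionary copies only a handful of rows from the attached patch, and `translit` is a hypothetical helper that emits "?" for any non-ASCII character without a row, which mimics the question marks shown above.

```python
# A handful of rows copied from the attached table (code point -> ASCII string).
ROWS = {
    "\u0421": "S",    # CYRILLIC CAPITAL LETTER ES
    "\u044a": "``",   # CYRILLIC SMALL LETTER HARD SIGN
    "\u0435": "e",    # CYRILLIC SMALL LETTER IE
    "\u0448": "sh",   # CYRILLIC SMALL LETTER SHA
    "\u044c": "`",    # CYRILLIC SMALL LETTER SOFT SIGN
    "\u0449": "shh",  # CYRILLIC SMALL LETTER SHCHA
    "\u0451": "yo",   # CYRILLIC SMALL LETTER IO
}

# First two words of the test sentence, written with escapes to stay ASCII-safe.
WORDS = "\u0421\u044a\u0435\u0448\u044c \u0435\u0449\u0451"


def translit(text, table=ROWS):
    """Replace characters found in the table; keep ASCII; use '?' otherwise."""
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            out.append("?")  # no row available for this character
    return "".join(out)


print(translit(WORDS))      # S``esh` eshhyo
print(translit(WORDS, {}))  # ????? ???
```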
COMMIT MESSAGE:
This Cyrillic transliteration table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The patch content (mapping) is based on the ISO 9:1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].
The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes converting Cyrillic to
ASCII and thus System B is chosen [11]. To be clear, System A has
nothing to do with this bug, even though it is a transliteration.
Those interested in implementing System A for transliteration of
Cyrillic to Latin with Diacritic as a new feature are welcome to use the
spreadsheet in [6] as a starting point.
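The difference between the two systems can be seen on a few letters. This is a Python sketch under stated assumptions: the System B values are rows from this patch, while the System A values are taken from ISO 9:1995 as commonly tabulated and are not part of this patch; `convert` and `is_ascii` are hypothetical helper names.

```python
# Illustrative subset only.
# System A: one Latin letter with a diacritic per Cyrillic letter (assumed from ISO 9:1995).
SYSTEM_A = {
    "\u0436": "\u017e",  # zhe  -> z with caron
    "\u0447": "\u010d",  # che  -> c with caron
    "\u0448": "\u0161",  # sha  -> s with caron
    "\u044e": "\u00fb",  # yu   -> u with circumflex
    "\u044f": "\u00e2",  # ya   -> a with circumflex
}
# System B: ASCII letter sequences (rows from the attached patch).
SYSTEM_B = {
    "\u0436": "zh",
    "\u0447": "ch",
    "\u0448": "sh",
    "\u044e": "yu",
    "\u044f": "ya",
}


def convert(text, table):
    """Map each character through the table, leaving unknown characters as-is."""
    return "".join(table.get(ch, ch) for ch in text)


def is_ascii(text):
    return all(ord(ch) < 128 for ch in text)


word = "\u0436\u0447\u0448\u044e\u044f"  # zhe, che, sha, yu, ya
a = convert(word, SYSTEM_A)
b = convert(word, SYSTEM_B)
print(is_ascii(a), len(a))  # False 5
print(is_ascii(b), b)       # True zhchshyuya
```

System A keeps one output character per input character but leaves ASCII, which is why it cannot serve an ASCII//TRANSLIT target.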
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
Best regards,
Egor Kobylkin
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 10494 bytes --]
From 46e0d0e3d07805ec853fdd72dc3793995cb5593c Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* locale/C-translit.h.in: Add Cyrillic transliteration.
---
locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 169 insertions(+)
diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
"\x02cd" "_" # <U02CD> MODIFIER LETTER LOW MACRON
"\x02d0" ":" # <U02D0> MODIFIER LETTER TRIANGULAR COLON
"\x02dc" "~" # <U02DC> SMALL TILDE
+"\x0401" "YO" # <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402" "DJ" # <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403" "G`" # <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404" "YE" # <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405" "Z`" # <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406" "I" # <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407" "YI" # <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408" "J" # <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409" "L`" # <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a" "N`" # <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b" "TSH" # <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c" "K`" # <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e" "U`" # <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f" "DH" # <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410" "A" # <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411" "B" # <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412" "V" # <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413" "G" # <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414" "D" # <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415" "E" # <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416" "ZH" # <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417" "Z" # <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418" "I" # <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419" "J" # <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a" "K" # <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b" "L" # <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c" "M" # <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d" "N" # <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e" "O" # <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f" "P" # <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420" "R" # <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421" "S" # <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422" "T" # <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423" "U" # <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424" "F" # <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425" "X" # <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426" "CZ" # <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427" "CH" # <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428" "SH" # <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429" "SHH" # <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a" "``" # <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b" "Y`" # <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c" "`" # <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d" "E`" # <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e" "YU" # <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f" "YA" # <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430" "a" # <U0430> CYRILLIC SMALL LETTER A
+"\x0431" "b" # <U0431> CYRILLIC SMALL LETTER BE
+"\x0432" "v" # <U0432> CYRILLIC SMALL LETTER VE
+"\x0433" "g" # <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434" "d" # <U0434> CYRILLIC SMALL LETTER DE
+"\x0435" "e" # <U0435> CYRILLIC SMALL LETTER IE
+"\x0436" "zh" # <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437" "z" # <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438" "i" # <U0438> CYRILLIC SMALL LETTER I
+"\x0439" "j" # <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a" "k" # <U043A> CYRILLIC SMALL LETTER KA
+"\x043b" "l" # <U043B> CYRILLIC SMALL LETTER EL
+"\x043c" "m" # <U043C> CYRILLIC SMALL LETTER EM
+"\x043d" "n" # <U043D> CYRILLIC SMALL LETTER EN
+"\x043e" "o" # <U043E> CYRILLIC SMALL LETTER O
+"\x043f" "p" # <U043F> CYRILLIC SMALL LETTER PE
+"\x0440" "r" # <U0440> CYRILLIC SMALL LETTER ER
+"\x0441" "s" # <U0441> CYRILLIC SMALL LETTER ES
+"\x0442" "t" # <U0442> CYRILLIC SMALL LETTER TE
+"\x0443" "u" # <U0443> CYRILLIC SMALL LETTER U
+"\x0444" "f" # <U0444> CYRILLIC SMALL LETTER EF
+"\x0445" "x" # <U0445> CYRILLIC SMALL LETTER HA
+"\x0446" "cz" # <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447" "ch" # <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448" "sh" # <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449" "shh" # <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a" "``" # <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b" "y`" # <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c" "`" # <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d" "e`" # <U044D> CYRILLIC SMALL LETTER E
+"\x044e" "yu" # <U044E> CYRILLIC SMALL LETTER YU
+"\x044f" "ya" # <U044F> CYRILLIC SMALL LETTER YA
+"\x0451" "yo" # <U0451> CYRILLIC SMALL LETTER IO
+"\x0452" "dj" # <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453" "g`" # <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454" "ye" # <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455" "z`" # <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456" "i" # <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457" "yi" # <U0457> CYRILLIC SMALL LETTER YI
+"\x0458" "j" # <U0458> CYRILLIC SMALL LETTER JE
+"\x0459" "l`" # <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a" "n`" # <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b" "tsh" # <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c" "k`" # <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e" "u`" # <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f" "dh" # <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a" "O`" # <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b" "o`" # <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472" "FH" # <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473" "fh" # <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474" "YH" # <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475" "yh" # <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c" "E`" # <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d" "e`" # <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490" "G`" # <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491" "g`" # <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492" "GH" # <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493" "gh" # <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494" "GH" # <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495" "gh" # <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496" "ZH`" # <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497" "zh`" # <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a" "K`" # <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b" "k`" # <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e" "K`" # <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f" "k`" # <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2" "N`" # <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3" "n`" # <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4" "NG" # <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5" "ng" # <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6" "P`" # <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7" "p`" # <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8" "O`" # <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9" "o`" # <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa" "C`" # <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab" "c`" # <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac" "T`" # <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad" "t`" # <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae" "U" # <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af" "u" # <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2" "H`" # <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3" "h`" # <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4" "TCZ" # <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5" "tcz" # <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba" "SH`" # <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb" "sh`" # <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc" "CH`" # <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd" "ch`" # <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be" "CH`" # <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf" "ch`" # <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0" "i" # <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1" "ZH`" # <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2" "zh`" # <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb" "CH`" # <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc" "ch`" # <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0" "A`" # <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1" "a`" # <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2" "A`" # <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3" "a`" # <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6" "E`" # <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7" "e`" # <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8" "A`" # <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9" "a`" # <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc" "ZH`" # <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd" "zh`" # <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de" "Z`" # <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df" "z`" # <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0" "Z`" # <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1" "z`" # <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4" "I`" # <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5" "i`" # <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6" "O`" # <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7" "o`" # <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8" "O`" # <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9" "o`" # <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0" "U`" # <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1" "u`" # <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2" "U`" # <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3" "u`" # <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4" "CH`" # <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5" "ch`" # <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8" "Y`" # <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9" "y`" # <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
"\x2002" " " # <U2002> EN SPACE
"\x2003" " " # <U2003> EM SPACE
"\x2004" " " # <U2004> THREE-PER-EM SPACE
--
2.17.1
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-02 18:39 ` [PATCH v12] " Egor Kobylkin
@ 2019-01-05 14:36 ` Rafal Luzynski
2019-01-05 21:13 ` Egor Kobylkin
2019-04-09 1:27 ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] Carlos O'Donell
1 sibling, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2019-01-05 14:36 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar
Cc: Marko Myllynen, mfabian
2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>
> Changelog v12:
> [...]
>
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
> [...]
I have tested this and, unfortunately, now this transliteration
works *only* in C locale, that is, only when no locale is set or when
it is explicitly set to C (C.UTF8, POSIX). It does not work when locale
is set to anything different, including en_US, ru_RU, etc.
I'm sorry for confusing you. I think that either we should revert back
to the older versions of your patch to make all locales supported or
merge those two versions to make the transliteration work both in
C and in all (almost all) other locales. Unfortunately, C locale is
not a base for all other locales and is not included, it is only a fallback
when a locale does not provide its own data (that is, when it does not
provide any transliteration table at all).
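The behaviour described above can be pictured with a toy model. This Python sketch is an illustration of the reported behaviour only: the names and the table contents are invented and are not glibc internals.

```python
# Toy model; all names and data here are invented for illustration.
C_TABLE = {"\u0436": "zh"}         # stands in for the rows in locale/C-translit.h.in
LOCALE_TABLES = {
    "C": None,                     # built-in locale: no table of its own
    "ru_RU": {"\u00ab": "<<"},     # has its own table, but no Cyrillic rows
}


def lookup(locale, ch):
    """Use the locale's own table if it has one; the C table is only a fallback."""
    table = LOCALE_TABLES.get(locale)
    if table is None:
        table = C_TABLE            # fallback only when the locale has no table at all
    return table.get(ch, "?")


print(lookup("C", "\u0436"))      # zh
print(lookup("ru_RU", "\u0436"))  # ?
```

In this model a row added only to the C table never reaches a locale that ships any table of its own, which matches the test result.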
Regards,
Rafal
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-05 14:36 ` Rafal Luzynski
@ 2019-01-05 21:13 ` Egor Kobylkin
2019-01-07 20:37 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-05 21:13 UTC (permalink / raw)
To: Rafal Luzynski, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar
Cc: Marko Myllynen, mfabian
On 05.01.19 15:35, Rafal Luzynski wrote:
> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>
>> Changelog v12:
>> [...]
>>
>> Changelog v11:
>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>> file for the ASCII translit table.
>> * Correspondingly the patch now only contains the additional
>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>> The 'include "translit_cyrillic";""' directives are not necessary in the
>> locale files and they are now all left intact.
>> * Also the file translit_cyrillic is no longer needed and is omitted.
>> * Edited below email, commit message.
>> [...]
>
> I have tested this and, unfortunately, now this transliteration
> works *only* in C locale, that is, only when no locale is set or when
> it is explicitly set to C (C.UTF8, POSIX). It does not work when locale
> is set to anything different, including en_US, ru_RU, etc.
>
> I'm sorry for confusing you. I think that either we should revert back
> to the older versions of your patch to make all locales supported or
> merge those two versions to make the transliteration work both in
> C and in all (almost all) other locales. Unfortunately, C locale is
> not a base for all other locales and is not included, it is only a fallback
> when a locale does not provide its own data (that is, when it does not
> provide any transliteration table at all).
Good catch! Should we maybe split this into two patches, one for C and
the other for "country" locales? They have different codes and
functionality so it looks like it would be easier to keep focus.
My understanding is that locale/C-translit.h.in is still the proper
locale for the sole ASCII translit table. It is also the only solution
for many use cases where there is no locale available (not compiled or
not set).
"Country" locales in localedata/locales/ can then have the exact same
translit table included or they can have any other flavor - I don't see
a problem here.
Best regards,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-05 21:13 ` Egor Kobylkin
@ 2019-01-07 20:37 ` Marko Myllynen
2019-01-09 0:46 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-01-07 20:37 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar
Cc: Mike Fabian
Hi,
On 05/01/2019 23.12, Egor Kobylkin wrote:
> On 05.01.19 15:35, Rafal Luzynski wrote:
>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>
>>> Changelog v12:
>>> [...]
>>>
>>> Changelog v11:
>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>> file for the ASCII translit table.
>>> * Correspondingly the patch now only contains the additional
>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>> locale files and they are now all left intact.
>>> * Also the file translit_cyrillic is no longer needed and is omitted.
>>> * Edited below email, commit message.
>>> [...]
>>
>> I have tested this and, unfortunately, now this transliteration
>> works *only* in C locale, that is, only when no locale is set or when
>> it is explicitly set to C (C.UTF8, POSIX). It does not work when locale
>> is set to anything different, including en_US, ru_RU, etc.
>
> Good catch! Should we maybe split this into two patches, one for C and
> the other for "country" locales? They have different codes and
> functionality so it looks like it would be easier to keep focus.
That would probably make sense, the standard C/POSIX locale won't
support System A so it also narrows down solution alternatives with it.
(If the C.UTF-8 locale (see
https://sourceware.org/bugzilla/show_bug.cgi?id=17318) materializes one
day, I'm not sure whether transliteration would be applicable in that context.)
> My understanding is that locale/C-translit.h.in is still the proper
> locale for the sole ASCII translit table. It is also the only solution
> for many use cases where there is no locale available (not compiled or
> not set).
Correct, as Siddhesh mentioned those rules will end up in the built-in
C/POSIX locale which is ASCII and will be used if no other locales are
available or set properly. The translit_* files won't affect it.
> "Country" locales in localedata/locales/ can then have the exact same
> translit table included or they can have any other flavor - I don't see
> a problem here.
Indeed, and since those files are not limited to ASCII, perhaps we could
now reconsider the v9 approach for them, i.e., prefer System A if
possible, otherwise use System B / ASCII (just need to make sure that
the ASCII fall-back for them will match the built-in C ASCII rule)?
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-07 20:37 ` Marko Myllynen
@ 2019-01-09 0:46 ` Egor Kobylkin
2019-01-09 20:03 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-01-09 0:46 UTC (permalink / raw)
To: Marko Myllynen, Rafal Luzynski, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar
Cc: Mike Fabian
On 07.01.19 21:37, Marko Myllynen wrote:
> Hi,
>
> On 05/01/2019 23.12, Egor Kobylkin wrote:
>> On 05.01.19 15:35, Rafal Luzynski wrote:
>>> 2.01.2019 19:38 Egor Kobylkin <egor@kobylkin.com> wrote:
>>>>
>>>> Changelog v12:
>>>> [...]
>>>>
>>>> Changelog v11:
>>>> * Re-targeted the patch against locale/C-translit.h.in as the proper
>>>> file for the ASCII translit table.
>>>> * Correspondingly the patch now only contains the additional
>>>> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
>>>> The 'include "translit_cyrillic";""' directives are not necessary in the
>>>> locale files and they are now all left intact.
>>>> * Also the file translit_cyrillic is no longer needed and is omitted.
>>>> * Edited below email, commit message.
>>>> [...]
>>>
>>> I have tested this and, unfortunately, now this transliteration
>>> works *only* in C locale, that is, only when no locale is set or when
>>> it is explicitly set to C (C.UTF8, POSIX). It does not work when locale
>>> is set to anything different, including en_US, ru_RU, etc.
>>
>> Good catch! Should we maybe split this into two patches, one for C and
>> the other for "country" locales? They have different codes and
>> functionality so it looks like it would be easier to keep focus.
>
> That would probably make sense, the standard C/POSIX locale won't
> support System A so it also narrows down solution alternatives with it.
>
[SNIP]
>> "Country" locales in localedata/locales/ can then have the exact same
>> translit table included or they can have any other flavor - I don't see
>> a problem here.
>
> Indeed, and since those files are not limited to ASCII, perhaps we could
> now reconsider the v9 approach for them, i.e., prefer System A if
> possible, otherwise use System B / ASCII (just need to make sure that
> the ASCII fall-back for them will match the built-in C ASCII rule)?
>
Happy to hear the split seems to be a clear cut one.
How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
(number) and title for clarity in communication?
The bug report for [PATCH v9] ("Countries" locales) should then ideally
have your (and others) explicit requirements as to the GOST System A/B
fall-back, which countries to include etc. Again, myself I have no other
req. here but just to have _any_ translit in place.
This way it would probably be easier to have the decision making process
tied up for both patches (separately). We may want to get the v12 POSIX
out of the door in 2.30 then and can take all the time we need to set up
the rules for "Countries" locales as you need them to be.
Bests,
Egor
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-09 0:46 ` Egor Kobylkin
@ 2019-01-09 20:03 ` Marko Myllynen
2019-02-04 7:14 ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-01-09 20:03 UTC (permalink / raw)
To: Egor Kobylkin, Rafal Luzynski, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar
Cc: Mike Fabian
Hi,
On 09/01/2019 02.46, Egor Kobylkin wrote:
> On 07.01.19 21:37, Marko Myllynen wrote:
>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>
>>> Good catch! Should we maybe split this into two patches, one for C and
>>> the other for "country" locales? They have different codes and
>>> functionality so it looks like it would be easier to keep focus.
>>
>> That would probably make sense, the standard C/POSIX locale won't
>> support System A so it also narrows down solution alternatives with it.
>>
>>> "Country" locales in localedata/locales/ can then have the exact same
>>> translit table included or they can have any other flavor - I don't see
>>> a problem here.
>>
>> Indeed, and since those files are not limited to ASCII, perhaps we could
>> now reconsider the v9 approach for them, i.e., prefer System A if
>> possible, otherwise use System B / ASCII (just need to make sure that
>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>
> Happy to hear the split seems to be a clear cut one.
> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
> (number) and title for clarity in communication?
I'm not sure a new BZ is really needed for such an addition, perhaps a
NEWS entry might be more appropriate (with the full details explained in
the commit messages of course) but I'll leave this to others to decide.
> This way it would probably be easier to have the decision making process
> tied up for both patches (separately). We may want to get the v12 POSIX
> out of the door in 2.30 then and can take all the time we need to set up
> the rules for "Countries" locales as you need them to be.
Perhaps Rafal or Carlos have better suggestions but I would think we
could have a patch series where the patch 1/3 adds the C/POSIX locale
part (that would be what you posted as v12), then patch 2/3 adds
translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
System A and GOST 7.79 System B as a fall-back (which would match the
C/POSIX rules)), and finally the patch 3/3 updates locales to use
translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
alternative suggestions so it might be best to wait for their feedback
before doing anything yet (it's unfortunate you've had to do so many
iterations around this already but I think we've all learned something
during the process and the end result will be more correct than any of
the earlier versions).
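The "System A first, System B as a fall-back" idea for patch 2/3 could behave roughly as in the Python sketch below. It is only an illustration under stated assumptions: the candidate lists are a two-letter subset, the System A values are assumed from ISO 9:1995, `translit` is a hypothetical helper, and Python codecs stand in for the target character set.

```python
# Each source character lists candidate replacements in order of preference:
# System A first, System B as the ASCII fall-back (illustrative subset).
CANDIDATES = {
    "\u0436": ["\u017e", "zh"],  # CYRILLIC SMALL LETTER ZHE
    "\u0448": ["\u0161", "sh"],  # CYRILLIC SMALL LETTER SHA
}


def translit(text, charset):
    """Emit the first candidate (or the character itself) the charset can encode."""
    out = []
    for ch in text:
        for cand in [ch] + CANDIDATES.get(ch, []):
            try:
                cand.encode(charset)
            except UnicodeEncodeError:
                continue             # not representable, try the next candidate
            out.append(cand)
            break
        else:
            out.append("?")          # nothing representable in the target charset
    return "".join(out)


print(translit("\u0436\u0448", "ascii"))                   # zhsh
print(translit("\u0436\u0448", "iso8859-2") == "\u017e\u0161")  # True
```

With an ASCII target the result is the System B string, so it would match what the built-in C/POSIX rules produce.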
Thanks,
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-01-09 20:03 ` Marko Myllynen
@ 2019-02-04 7:14 ` Egor Kobylkin
2019-02-14 16:48 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-02-04 7:14 UTC (permalink / raw)
To: libc-alpha, libc-locales, Carlos O'Donell
Cc: Marko Myllynen, Rafal Luzynski, Siddhesh Poyarekar, Mike Fabian
Carlos,
would you be comfortable picking this up again this month?
I would really love to have a reliable action plan to get this committed
for 2.30. Maybe cut out a subset that is undisputed and commit only that
first. It looks kinda like an eternal moving target otherwise.
for your reference:
https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
Bests,
Egor Kobylkin
On 09.01.19 21:03, Marko Myllynen wrote:
> Hi,
>
> On 09/01/2019 02.46, Egor Kobylkin wrote:
>> On 07.01.19 21:37, Marko Myllynen wrote:
>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>
>>>> Good catch! Should we maybe split this into two patches, one for C and
>>>> the other for "country" locales? They have different codes and
>>>> functionality so it looks like it would be easier to keep focus.
>>>
>>> That would probably make sense, the standard C/POSIX locale won't
>>> support System A so it also narrows down solution alternatives with it.
>>>
>>>> "Country" locales in localedata/locales/ can then have the exact same
>>>> translit table included or they can have any other flavor - I don't see
>>>> a problem here.
>>>
>>> Indeed, and since those files are not limited to ASCII, perhaps we could
>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>> possible, otherwise use System B / ASCII (just need to make sure that
>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>
>> Happy to hear the split seems to be a clear cut one.
>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>> (number) and title for clarity in communication?
>
> I'm not sure a new BZ is really needed for such an addition, perhaps a
> NEWS entry might be more appropriate (with the full details explained in
> the commit messages of course) but I'll leave this to others to decide.
>
>> This way it would probably be easier to have the decision making process
>> tied up for both patches (separately). We may want to get the v12 POSIX
>> out of the door in 2.30 then and can take all the time we need to set up
>> the rules for "Countries" locales as you need them to be.
>
> Perhaps Rafal or Carlos have better suggestions but I would think we
> could have a patch series where the patch 1/3 adds the C/POSIX locale
> part (that would be what you posted as v12), then patch 2/3 adds
> translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
> System A and GOST 7.79 System B as a fall-back (which would match the
> C/POSIX rules)), and finally the patch 3/3 updates locales to use
> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
> alternative suggestions so it might be best to wait for their feedback
> before doing anything yet (it's unfortunate you've had to do so many
> iterations around this already but I think we've all learned something
> during the process and the end result will be more correct than any of
> the earlier versions).
>
> Thanks,
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-02-04 7:14 ` [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30 Egor Kobylkin
@ 2019-02-14 16:48 ` Marko Myllynen
2019-03-04 22:12 ` Egor Kobylkin
2019-04-20 0:20 ` Rafal Luzynski
0 siblings, 2 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-02-14 16:48 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell
Cc: Rafal Luzynski, Siddhesh Poyarekar, Mike Fabian
Hi Carlos, Mike, Rafal,
It seems clear that you all are currently too busy to have a look at
this but would you have any estimate when you might be able to review
this so that we could consider merging?
FWIW, I chatted with Egor off-list and we're on the same page wrt the
following, hopefully this gives you a bit of a jump start for this
subject when you have time to dig deeper:
1) Built-in C locale doesn't read/use any translit_* files and it can't
have any fallback mechanisms and it only supports ASCII so using GOST
7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
be the appropriate way to implement Cyrillic transliteration for the
built-in C locale (it adds some 8KB to the binary).
2) Other locales read/use translit_* files and with them fallbacks and
non-ASCII are possible so it would seem preferable to first try ISO 9 /
GOST 7.79 System A and only if that fails then use GOST 7.79 System B
(in which case the end result should match with the built-in C locale).
For this the translit_cyrillic file should be added (as per patch v9 +
changes mentioned in patches v10 and v12).
3) Individual locale files can then be updated to use translit_cyrillic
as appropriate (see patch v9) and language/national specific conventions
(e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
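
To make point 2) concrete, here is a minimal Python sketch of the "prefer
System A, fall back to System B" idea. It is not glibc code: the names
SYSTEM_A, SYSTEM_B, can_encode and translit are hypothetical, the two tables
hold only a handful of letters chosen for illustration, and the sketch
assumes the fallback is decided per character by whether the target
character set can represent the candidate.

```python
# Hypothetical excerpts of the two tables; a real table covers far more letters.
SYSTEM_A = {"ж": "ž", "ч": "č", "ш": "š", "у": "u", "к": "k"}       # Latin with diacritics
SYSTEM_B = {"ж": "zh", "ч": "ch", "ш": "sh", "у": "u", "к": "k"}    # ASCII only


def can_encode(text, encoding):
    """Return True if every character of text exists in the target encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def translit(text, encoding):
    """Convert text for the target encoding, trying System A before System B."""
    out = []
    for ch in text:
        if can_encode(ch, encoding):
            out.append(ch)          # already representable, keep as is
            continue
        for table in (SYSTEM_A, SYSTEM_B):
            candidate = table.get(ch)
            if candidate is not None and can_encode(candidate, encoding):
                out.append(candidate)
                break
        else:
            out.append("?")         # no usable rule in either table
    return "".join(out)


print(translit("жук", "iso-8859-2"))  # žuk  (System A fits into Latin-2)
print(translit("жук", "ascii"))       # zhuk (falls back to System B)
```

With an ASCII target the result is the System B form, which is what the
built-in C locale from point 1) would be expected to produce as well.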
Thanks,
On 04/02/2019 09.14, Egor Kobylkin wrote:
> Carlos,
> are you comfortable to pick this up again this month?
>
> I would really love to have a reliable action plan to get this committed
> for 2.30. Maybe cut out a subset that is undisputed and commit only that
> first. It looks kinda like an eternal moving target otherwise.
>
> for your reference:
> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>
> Bests,
> Egor Kobylkin
>
> On 09.01.19 21:03, Marko Myllynen wrote:
>> Hi,
>>
>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>
>>>>> Good catch! Should we maybe split this into two patches, one for C and
>>>>> the other for "country" locales? They have different codes and
>>>>> functionality so it looks like it would be easier to keep focus.
>>>>
>>>> That would probably make sense, the standard C/POSIX locale won't
>>>> support System A so it also narrows down solution alternatives with it.
>>>>
>>>>> "Country" locales in localedata/locales/ can then have the exact same
>>>>> translit table included or they can have any other flavor - I don't
>>>>> see
>>>>> a problem here.
>>>>
>>>> Indeed, and since those files are not limited to ASCII, perhaps we
>>>> could
>>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>>> possible, otherwise use System B / ASCII (just need to make sure that
>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>
>>> Happy to hear the split seems to be a clear cut one.
>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>> (number) and title for clarity in communication?
>>
>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>> NEWS entry might be more appropriate (with the full details explained in
>> the commit messages of course) but I'll leave this to others to decide.
>>
>>> This way it would probably be easier to have the decision making process
>>> tied up for both patches (separately). We may want to get the v12 POSIX
>>> out of the door in 2.30 then and can take all the time we need to set up
>>> the rules for "Countries" locales as you need them to be.
>>
>> Perhaps Rafal or Carlos have better suggestions but I would think we
>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>> part (that would be what you posted as v12), then patch 2/3 adds
>> translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
>> System A and GOST 7.79 System B as a fall-back (which would match the
>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
>> alternative suggestions so it might be best to wait for their feedback
>> before doing anything yet (it's unfortunate you've had to do so many
>> iterations around this already but I think we've all learned something
>> during the process and the end result will be more correct than any of
>> the earlier versions).
>>
>> Thanks,
>>
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-02-14 16:48 ` Marko Myllynen
@ 2019-03-04 22:12 ` Egor Kobylkin
2019-03-11 13:59 ` PING " Egor Kobylkin
2019-04-20 0:20 ` Rafal Luzynski
1 sibling, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-04 22:12 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
Rafal Luzynski, Mike Fabian
Cc: Siddhesh Poyarekar, Dmitry V. Levin
ping
On 14.02.19 17:48, Marko Myllynen wrote:
> Hi Carlos, Mike, Rafal,
>
> It seems clear that you all are currently too busy to have a look at
> this but would you have any estimate when you might be able to review
> this so that we could consider merging?
>
> FWIW, I chatted with Egor off-list and we're on the same page wrt the
> following, hopefully this gives you a bit of a jump start for this
> subject when you have time to dig deeper:
>
> 1) Built-in C locale doesn't read/use any translit_* files and it can't
> have any fallback mechanisms and it only supports ASCII so using GOST
> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
> be the appropriate way to implement Cyrillic transliteration for the
> built-in C locale (it adds some 8KB to the binary).
>
> 2) Other locales read/use translit_* files and with them fallbacks and
> non-ASCII are possible so it would seem preferable to first try ISO 9 /
> GOST 7.79 System A and only if that fails then use GOST 7.79 System B
> (in which case the end result should match with the built-in C locale).
> For this the translit_cyrillic file should be added (as per patch v9 +
> changes mentioned in patches v10 and v12).
>
> 3) Individual locale files can then be updated to use translit_cyrillic
> as appropriate (see patch v9) and language/national specific conventions
> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
>
> Thanks,
>
> On 04/02/2019 09.14, Egor Kobylkin wrote:
>> Carlos,
>> are you comfortable to pick this up again this month?
>>
>> I would really love to have a reliable action plan to get this committed
>> for 2.30. Maybe cut out a subset that is undisputed and commit only that
>> first. It looks kinda like an eternal moving target otherwise.
>>
>> for your reference:
>> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
>> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>>
>> Bests,
>> Egor Kobylkin
>>
>> On 09.01.19 21:03, Marko Myllynen wrote:
>>> Hi,
>>>
>>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>>
>>>>>> Good catch! Should we maybe split this into two patches, one for C and
>>>>>> the other for "country" locales? They have different codes and
>>>>>> functionality so it looks like it would be easier to keep focus.
>>>>>
>>>>> That would probably make sense, the standard C/POSIX locale won't
>>>>> support System A so it also narrows down solution alternatives with it.
>>>>>
>>>>>> "Country" locales in localedata/locales/ can then have the exact same
>>>>>> translit table included or they can have any other flavor - I don't
>>>>>> see
>>>>>> a problem here.
>>>>>
>>>>> Indeed, and since those files are not limited to ASCII, perhaps we
>>>>> could
>>>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>>>> possible, otherwise use System B / ASCII (just need to make sure that
>>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>>
>>>> Happy to hear the split seems to be a clear cut one.
>>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>>> (number) and title for clarity in communication?
>>>
>>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>>> NEWS entry might be more appropriate (with the full details explained in
>>> the commit messages of course) but I'll leave this to others to decide.
>>>
>>>> This way it would probably be easier to have the decision making process
>>>> tied up for both patches (separately). We may want to get the v12 POSIX
>>>> out of the door in 2.30 then and can take all the time we need to set up
>>>> the rules for "Countries" locales as you need them to be.
>>>
>>> Perhaps Rafal or Carlos have better suggestions but I would think we
>>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>>> part (that would be what you posted as v12), then patch 2/3 adds
>>> translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
>>> System A and GOST 7.79 System B as a fall-back (which would match the
>>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
>>> alternative suggestions so it might be best to wait for their feedback
>>> before doing anything yet (it's unfortunate you've had to do so many
>>> iterations around this already but I think we've all learned something
>>> during the process and the end result will be more correct than any of
>>> the earlier versions).
>>>
>>> Thanks,
>>>
>
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-03-04 22:12 ` Egor Kobylkin
@ 2019-03-11 13:59 ` Egor Kobylkin
2019-03-14 19:49 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-11 13:59 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
Rafal Luzynski, Mike Fabian
Cc: Siddhesh Poyarekar, Dmitry V. Levin
On 04.03.19 23:11, Egor Kobylkin wrote:
> ping
>
> On 14.02.19 17:48, Marko Myllynen wrote:
>> Hi Carlos, Mike, Rafal,
>>
>> It seems clear that you all are currently too busy to have a look at
>> this but would you have any estimate when you might be able to review
>> this so that we could consider merging?
>>
>> FWIW, I chatted with Egor off-list and we're on the same page wrt the
>> following, hopefully this gives you a bit of a jump start for this
>> subject when you have time to dig deeper:
>>
>> 1) Built-in C locale doesn't read/use any translit_* files and it can't
>> have any fallback mechanisms and it only supports ASCII so using GOST
>> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
>> be the appropriate way to implement Cyrillic transliteration for the
>> built-in C locale (it adds some 8KB to the binary).
>>
>> 2) Other locales read/use translit_* files and with them fallbacks and
>> non-ASCII are possible so it would seem preferable to first try ISO 9 /
>> GOST 7.79 System A and only if that fails then use GOST 7.79 System B
>> (in which case the end result should match with the built-in C locale).
>> For this the translit_cyrillic file should be added (as per patch v9 +
>> changes mentioned in patches v10 and v12).
>>
>> 3) Individual locale files can then be updated to use translit_cyrillic
>> as appropriate (see patch v9) and language/national specific conventions
>> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
>>
>> Thanks,
>>
>> On 04/02/2019 09.14, Egor Kobylkin wrote:
>>> Carlos,
>>> are you comfortable to pick this up again this month?
>>>
>>> I would really love to have a reliable action plan to get this committed
>>> for 2.30. Maybe cut out a subset that is undisputed and commit only that
>>> first. It looks kinda like an eternal moving target otherwise.
>>>
>>> for your reference:
>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>>>
>>> Bests,
>>> Egor Kobylkin
>>>
>>> On 09.01.19 21:03, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>>>
>>>>>>> Good catch! Should we maybe split this into two patches, one for
>>>>>>> C and
>>>>>>> the other for "country" locales? They have different codes and
>>>>>>> functionality so it looks like it would be easier to keep focus.
>>>>>>
>>>>>> That would probably make sense, the standard C/POSIX locale won't
>>>>>> support System A so it also narrows down solution alternatives
>>>>>> with it.
>>>>>>
>>>>>>> "Country" locales in localedata/locales/ can then have the exact
>>>>>>> same
>>>>>>> translit table included or they can have any other flavor - I don't
>>>>>>> see
>>>>>>> a problem here.
>>>>>>
>>>>>> Indeed, and since those files are not limited to ASCII, perhaps we
>>>>>> could
>>>>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>>>>> possible, otherwise use System B / ASCII (just need to make sure that
>>>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>>>
>>>>> Happy to hear the split seems to be a clear cut one.
>>>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>>>> (number) and title for clarity in communication?
>>>>
>>>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>>>> NEWS entry might be more appropriate (with the full details
>>>> explained in
>>>> the commit messages of course) but I'll leave this to others to decide.
>>>>
>>>>> This way it would probably be easier to have the decision making
>>>>> process
>>>>> tied up for both patches (separately). We may want to get the v12
>>>>> POSIX
>>>>> out of the door in 2.30 then and can take all the time we need to
>>>>> set up
>>>>> the rules for "Countries" locales as you need them to be.
>>>>
>>>> Perhaps Rafal or Carlos have better suggestions but I would think we
>>>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>>>> part (that would be what you posted as v12), then patch 2/3 adds
>>>> translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
>>>> System A and GOST 7.79 System B as a fall-back (which would match the
>>>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>>>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may have
>>>> alternative suggestions so it might be best to wait for their feedback
>>>> before doing anything yet (it's unfortunate you've had to do so many
>>>> iterations around this already but I think we've all learned something
>>>> during the process and the end result will be more correct than any of
>>>> the earlier versions).
>>>>
>>>> Thanks,
>>>>
>>
>>
^ permalink raw reply [flat|nested] 111+ messages in thread
* PING Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-03-11 13:59 ` PING " Egor Kobylkin
@ 2019-03-14 19:49 ` Egor Kobylkin
0 siblings, 0 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-14 19:49 UTC (permalink / raw)
To: libc-alpha, libc-locales, Carlos O'Donell
Cc: Marko Myllynen, Rafal Luzynski, Mike Fabian, Siddhesh Poyarekar,
Dmitry V. Levin
On 11.03.19 14:59, Egor Kobylkin wrote:
>
>
> On 04.03.19 23:11, Egor Kobylkin wrote:
>> ping
>>
>> On 14.02.19 17:48, Marko Myllynen wrote:
>>> Hi Carlos, Mike, Rafal,
>>>
>>> It seems clear that you all are currently too busy to have a look at
>>> this but would you have any estimate when you might be able to review
>>> this so that we could consider merging?
>>>
>>> FWIW, I chatted with Egor off-list and we're on the same page wrt the
>>> following, hopefully this gives you a bit of a jump start for this
>>> subject when you have time to dig deeper:
>>>
>>> 1) Built-in C locale doesn't read/use any translit_* files and it can't
>>> have any fallback mechanisms and it only supports ASCII so using GOST
>>> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
>>> be the appropriate way to implement Cyrillic transliteration for the
>>> built-in C locale (it adds some 8KB to the binary).
>>>
>>> 2) Other locales read/use translit_* files and with them fallbacks and
>>> non-ASCII are possible so it would seem preferable to first try ISO 9 /
>>> GOST 7.79 System A and only if that fails then use GOST 7.79 System B
>>> (in which case the end result should match with the built-in C locale).
>>> For this the translit_cyrillic file should be added (as per patch v9 +
>>> changes mentioned in patches v10 and v12).
>>>
>>> 3) Individual locale files can then be updated to use translit_cyrillic
>>> as appropriate (see patch v9) and language/national specific conventions
>>> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
>>>
>>> Thanks,
>>>
>>> On 04/02/2019 09.14, Egor Kobylkin wrote:
>>>> Carlos,
>>>> are you comfortable to pick this up again this month?
>>>>
>>>> I would really love to have a reliable action plan to get this
>>>> committed
>>>> for 2.30. Maybe cut out a subset that is undisputed and commit only
>>>> that
>>>> first. It looks kinda like an eternal moving target otherwise.
>>>>
>>>> for your reference:
>>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00036.html
>>>> https://sourceware.org/ml/libc-alpha/2019-01/msg00040.html
>>>>
>>>> Bests,
>>>> Egor Kobylkin
>>>>
>>>> On 09.01.19 21:03, Marko Myllynen wrote:
>>>>> Hi,
>>>>>
>>>>> On 09/01/2019 02.46, Egor Kobylkin wrote:
>>>>>> On 07.01.19 21:37, Marko Myllynen wrote:
>>>>>>> On 05/01/2019 23.12, Egor Kobylkin wrote:
>>>>>>>>
>>>>>>>> Good catch! Should we maybe split this into two patches, one for
>>>>>>>> C and
>>>>>>>> the other for "country" locales? They have different codes and
>>>>>>>> functionality so it looks like it would be easier to keep focus.
>>>>>>>
>>>>>>> That would probably make sense, the standard C/POSIX locale won't
>>>>>>> support System A so it also narrows down solution alternatives
>>>>>>> with it.
>>>>>>>
>>>>>>>> "Country" locales in localedata/locales/ can then have the exact
>>>>>>>> same
>>>>>>>> translit table included or they can have any other flavor - I don't
>>>>>>>> see
>>>>>>>> a problem here.
>>>>>>>
>>>>>>> Indeed, and since those files are not limited to ASCII, perhaps we
>>>>>>> could
>>>>>>> now reconsider the v9 approach for them, i.e., prefer System A if
>>>>>>> possible, otherwise use System B / ASCII (just need to make sure
>>>>>>> that
>>>>>>> the ASCII fall-back for them will match the built-in C ASCII rule)?
>>>>>>
>>>>>> Happy to hear the split seems to be a clear cut one.
>>>>>> How about I rename the "[PATCH v12]...[BZ #2872]" to "[PATCH v1]...
>>>>>> C/POSIX [BZ #2872]" and the "[PATCH v9]" gets its own bug-report
>>>>>> (number) and title for clarity in communication?
>>>>>
>>>>> I'm not sure a new BZ is really needed for such an addition, perhaps a
>>>>> NEWS entry might be more appropriate (with the full details
>>>>> explained in
>>>>> the commit messages of course) but I'll leave this to others to
>>>>> decide.
>>>>>
>>>>>> This way it would probably be easier to have the decision making
>>>>>> process
>>>>>> tied up for both patches (separately). We may want to get the v12
>>>>>> POSIX
>>>>>> out of the door in 2.30 then and can take all the time we need to
>>>>>> set up
>>>>>> the rules for "Countries" locales as you need them to be.
>>>>>
>>>>> Perhaps Rafal or Carlos have better suggestions but I would think we
>>>>> could have a patch series where the patch 1/3 adds the C/POSIX locale
>>>>> part (that would be what you posted as v12), then patch 2/3 adds
>>>>> translit_cyrillic (based on your v9 so supports ISO 9.1995 / GOST 7.79
>>>>> System A and GOST 7.79 System B as a fall-back (which would match the
>>>>> C/POSIX rules)), and finally the patch 3/3 updates locales to use
>>>>> translit_cyrillic as appropriate. But as said, Rafal or Carlos may
>>>>> have
>>>>> alternative suggestions so it might be best to wait for their feedback
>>>>> before doing anything yet (it's unfortunate you've had to do so many
>>>>> iterations around this already but I think we've all learned something
>>>>> during the process and the end result will be more correct than any of
>>>>> the earlier versions).
>>>>>
>>>>> Thanks,
>>>>>
>>>
>>>
^ permalink raw reply [flat|nested] 111+ messages in thread
* ping [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
[not found] ` <20180412224352.GB2911@altlinux.org>
` (12 preceding siblings ...)
2019-01-02 18:39 ` [PATCH v12] " Egor Kobylkin
@ 2019-03-19 10:39 ` Egor Kobylkin
2019-03-28 16:20 ` [PING^4][PATCH " Marko Myllynen
` (2 more replies)
13 siblings, 3 replies; 111+ messages in thread
From: Egor Kobylkin @ 2019-03-19 10:39 UTC (permalink / raw)
To: libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Marko Myllynen, mfabian
[-- Attachment #1: Type: text/plain, Size: 5991 bytes --]
Changelog v12:
* Adjusted to the new comment style suddenly appearing in the target
file locale/C-translit.h.in (the original file changed on the master
branch from /* style to # style since v11)
* Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to
"sh`" instead of erroneous "SH`" in v11
Changelog v11:
* Re-targeted the patch against locale/C-translit.h.in as the proper
file for the ASCII translit table.
* Correspondingly the patch now only contains the additional
Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
The 'include "translit_cyrillic";""' directives are not necessary in the
locale files and they are now all left intact.
* Also the file translit_cyrillic is no longer needed and is omitted.
* Edited below email, commit message.
Changelog v10:
* Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
with diacritics) as conflicting with System B within glibc mechanics and
not solving BZ #2872
* Edited below email, commit message, comment in translit_cyrillic to
reflect System A removal
* Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
using composition) as composing is not covered by current glibc
conversion mechanics
Changelog v9:
* Fixed formatting (trailing spaces etc.)
* Put commit summary in the patch file, now it is generated completely
by git format-patch
Changelog v8:
* Re-added missing translit_cyrillic in patch v7 (due to missing "git
add" in the script).
Changelog v7:
* Generated against git://sourceware.org/git/glibc.git master with git
format-patch.
* The 'include "translit_cyrillic";""' now immediately follows last
'include "translit_XXX";""' string (was inserted just before
translit_end previously.)
* Only the locales already having 'include .*translit.*;""' are patched
(see the list for manual exclusions below, full list of included locales
at the end of the email in the commit section.)
* Excluded az_AZ completely to avoid circular reference from tr_TR via
'copy "tr_TR"'.
Changelog v6:
* Locales removed from the patch: C and sd_PK.
* Added locales: az_AZ and ky_KG.
* Consistently transliterate single uppercase Cyrillic letters
to sequences of all uppercase Latin letters in all languages (whenever
a Cyrillic letter is transliterated to more than one Latin letter),
for example "Ї" is now transliterated as "YI" rather than "Yi".
Dear locale maintainers,
fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
add the Cyrillic transliteration rows to locale/C-translit.h.in.
The patch is attached.
Current bug effect:
The glibc wiki explicitly lists this use case as the test example and
currently it fails on Cyrillic texts [1] [8] [9]:
iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
- it produces a string of question marks and spaces.
This is what it should produce, and it does so after the patch is applied:
CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
chayu.
The root problem and the fix:
The root problem is the missing transliteration table that I am
supplying here.
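
For illustration only, the effect of the missing table can be imitated with a
small Python sketch. This is not how glibc implements it: the names
C_TRANSLIT_SUBSET and to_ascii are hypothetical, and the dictionary holds just
a dozen entries copied from the attached patch, so any character outside this
subset also comes out as "?".

```python
# A dozen rows copied from the attached patch: code point -> ASCII replacement.
C_TRANSLIT_SUBSET = {
    0x0430: "a",   0x0431: "b",   0x0435: "e",  0x0436: "zh",
    0x043A: "k",   0x043B: "l",   0x043E: "o",  0x0443: "u",
    0x0447: "ch",  0x0449: "shh", 0x044E: "yu", 0x0451: "yo",
}


def to_ascii(text, table=C_TRANSLIT_SUBSET):
    """Replace each non-ASCII character via the table, or with '?' if absent."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)                       # plain ASCII passes through
        else:
            out.append(table.get(ord(ch), "?"))  # no rule -> question mark
    return "".join(out)


print(to_ascii("булок", table={}))  # ?????  (no Cyrillic rules at all)
print(to_ascii("булок"))            # bulok  (rules present)
```

An empty table reproduces the string of question marks shown above, while the
same input with the rules in place gives the readable ASCII form.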
COMMIT MESSAGE:
This translit_cyrillic table enables conversion (e.g. with iconv) from a
UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
compatible transcription.
While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
a transliteration/transcription has only Latin/ASCII codes but still can
be read by a native speaker. Among other things it is useful for
processing the Cyrillic texts and filenames by programs or on systems
that are not specifically prepared to work with Cyrillic, don't have
corresponding fonts installed or can't handle UTF-8.
The patch content (mapping) is based on ISO 9.1995 standard [10] and its
derivative GOST 7.79-2000 System B official source (Federal Agency on
Technical Regulating and Metrology Of Russian Federation [2]).
Technically an independent but mostly identical source [3] was used and
prepared in a spreadsheet [6].
The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
System B represents what is actually called transcription (preserving
phonemes), while System A is the transliteration (preserving graphemes).
There is no meaningful way to preserve graphemes when converting Cyrillic
to ASCII, and thus System B is chosen [11]. To be clear, System A has
nothing to do with this bug, even though it is a transliteration.
Those interested in implementing System A for transliteration of
Cyrillic to Latin with diacritics as a new feature are welcome to use the
spreadsheet in [6] as a starting point.
Links:
[1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
[2] GOST 7.79-2000 official source
http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
available in low quality gif format)
[3] http://transliteration.ru/gost-7-79-2000/ and
http://www.yfermer.ru/specifications/285821.html
[4] Wikipedia article on Cyrillic transliteration with Latin alphabet
https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
[5] http://man7.org/linux/man-pages/man5/locale.5.html
[6] Spreadsheet for generating translit_cyrillic
https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
[8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
[9] translit-test-input.txt
https://sourceware.org/bugzilla/attachment.cgi?id=11304
[10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
[11]
https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
Best regards,
Egor Kobylkin
[-- Attachment #2: 0001-Locales-Cyrillic-ASCII-transliteration-table-BZ-2872.patch --]
[-- Type: text/x-patch, Size: 10495 bytes --]
From 46e0d0e3d07805ec853fdd72dc3793995cb5593c Mon Sep 17 00:00:00 2001
From: Egor Kobylkin <egor@kobylkin.com>
Date: Wed, 2 Jan 2019 05:50:13 +0100
Subject: [PATCH] Locales: Cyrillic -> ASCII transliteration table [BZ #2872]
[BZ #2872]
* locale/C-translit.h.in: Add Cyrillic transliteration.
---
locale/C-translit.h.in | 169 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 169 insertions(+)
diff --git a/locale/C-translit.h.in b/locale/C-translit.h.in
index d5f00df0f3..758171c394 100644
--- a/locale/C-translit.h.in
+++ b/locale/C-translit.h.in
@@ -56,6 +56,175 @@
"\x02cd" "_" # <U02CD> MODIFIER LETTER LOW MACRON
"\x02d0" ":" # <U02D0> MODIFIER LETTER TRIANGULAR COLON
"\x02dc" "~" # <U02DC> SMALL TILDE
+"\x0401" "YO" # <U0401> CYRILLIC CAPITAL LETTER IO
+"\x0402" "DJ" # <U0402> CYRILLIC CAPITAL LETTER DJE
+"\x0403" "G`" # <U0403> CYRILLIC CAPITAL LETTER GJE
+"\x0404" "YE" # <U0404> CYRILLIC CAPITAL LETTER UKRAINIAN IE
+"\x0405" "Z`" # <U0405> CYRILLIC CAPITAL LETTER DZE
+"\x0406" "I" # <U0406> CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0407" "YI" # <U0407> CYRILLIC CAPITAL LETTER YI
+"\x0408" "J" # <U0408> CYRILLIC CAPITAL LETTER JE
+"\x0409" "L`" # <U0409> CYRILLIC CAPITAL LETTER LJE
+"\x040a" "N`" # <U040A> CYRILLIC CAPITAL LETTER NJE
+"\x040b" "TSH" # <U040B> CYRILLIC CAPITAL LETTER TSHE
+"\x040c" "K`" # <U040C> CYRILLIC CAPITAL LETTER KJE
+"\x040e" "U`" # <U040E> CYRILLIC CAPITAL LETTER SHORT U
+"\x040f" "DH" # <U040F> CYRILLIC CAPITAL LETTER DZHE
+"\x0410" "A" # <U0410> CYRILLIC CAPITAL LETTER A
+"\x0411" "B" # <U0411> CYRILLIC CAPITAL LETTER BE
+"\x0412" "V" # <U0412> CYRILLIC CAPITAL LETTER VE
+"\x0413" "G" # <U0413> CYRILLIC CAPITAL LETTER GHE
+"\x0414" "D" # <U0414> CYRILLIC CAPITAL LETTER DE
+"\x0415" "E" # <U0415> CYRILLIC CAPITAL LETTER IE
+"\x0416" "ZH" # <U0416> CYRILLIC CAPITAL LETTER ZHE
+"\x0417" "Z" # <U0417> CYRILLIC CAPITAL LETTER ZE
+"\x0418" "I" # <U0418> CYRILLIC CAPITAL LETTER I
+"\x0419" "J" # <U0419> CYRILLIC CAPITAL LETTER SHORT I
+"\x041a" "K" # <U041A> CYRILLIC CAPITAL LETTER KA
+"\x041b" "L" # <U041B> CYRILLIC CAPITAL LETTER EL
+"\x041c" "M" # <U041C> CYRILLIC CAPITAL LETTER EM
+"\x041d" "N" # <U041D> CYRILLIC CAPITAL LETTER EN
+"\x041e" "O" # <U041E> CYRILLIC CAPITAL LETTER O
+"\x041f" "P" # <U041F> CYRILLIC CAPITAL LETTER PE
+"\x0420" "R" # <U0420> CYRILLIC CAPITAL LETTER ER
+"\x0421" "S" # <U0421> CYRILLIC CAPITAL LETTER ES
+"\x0422" "T" # <U0422> CYRILLIC CAPITAL LETTER TE
+"\x0423" "U" # <U0423> CYRILLIC CAPITAL LETTER U
+"\x0424" "F" # <U0424> CYRILLIC CAPITAL LETTER EF
+"\x0425" "X" # <U0425> CYRILLIC CAPITAL LETTER HA
+"\x0426" "CZ" # <U0426> CYRILLIC CAPITAL LETTER TSE
+"\x0427" "CH" # <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0428" "SH" # <U0428> CYRILLIC CAPITAL LETTER SHA
+"\x0429" "SHH" # <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x042a" "A`" # <U042A> CYRILLIC CAPITAL LETTER HARD SIGN
+"\x042b" "Y`" # <U042B> CYRILLIC CAPITAL LETTER YERU
+"\x042c" "`" # <U042C> CYRILLIC CAPITAL LETTER SOFT SIGN
+"\x042d" "E`" # <U042D> CYRILLIC CAPITAL LETTER E
+"\x042e" "YU" # <U042E> CYRILLIC CAPITAL LETTER YU
+"\x042f" "YA" # <U042F> CYRILLIC CAPITAL LETTER YA
+"\x0430" "a" # <U0430> CYRILLIC SMALL LETTER A
+"\x0431" "b" # <U0431> CYRILLIC SMALL LETTER BE
+"\x0432" "v" # <U0432> CYRILLIC SMALL LETTER VE
+"\x0433" "g" # <U0433> CYRILLIC SMALL LETTER GHE
+"\x0434" "d" # <U0434> CYRILLIC SMALL LETTER DE
+"\x0435" "e" # <U0435> CYRILLIC SMALL LETTER IE
+"\x0436" "zh" # <U0436> CYRILLIC SMALL LETTER ZHE
+"\x0437" "z" # <U0437> CYRILLIC SMALL LETTER ZE
+"\x0438" "i" # <U0438> CYRILLIC SMALL LETTER I
+"\x0439" "j" # <U0439> CYRILLIC SMALL LETTER SHORT I
+"\x043a" "k" # <U043A> CYRILLIC SMALL LETTER KA
+"\x043b" "l" # <U043B> CYRILLIC SMALL LETTER EL
+"\x043c" "m" # <U043C> CYRILLIC SMALL LETTER EM
+"\x043d" "n" # <U043D> CYRILLIC SMALL LETTER EN
+"\x043e" "o" # <U043E> CYRILLIC SMALL LETTER O
+"\x043f" "p" # <U043F> CYRILLIC SMALL LETTER PE
+"\x0440" "r" # <U0440> CYRILLIC SMALL LETTER ER
+"\x0441" "s" # <U0441> CYRILLIC SMALL LETTER ES
+"\x0442" "t" # <U0442> CYRILLIC SMALL LETTER TE
+"\x0443" "u" # <U0443> CYRILLIC SMALL LETTER U
+"\x0444" "f" # <U0444> CYRILLIC SMALL LETTER EF
+"\x0445" "x" # <U0445> CYRILLIC SMALL LETTER HA
+"\x0446" "cz" # <U0446> CYRILLIC SMALL LETTER TSE
+"\x0447" "ch" # <U0447> CYRILLIC SMALL LETTER CHE
+"\x0448" "sh" # <U0448> CYRILLIC SMALL LETTER SHA
+"\x0449" "shh" # <U0449> CYRILLIC SMALL LETTER SHCHA
+"\x044a" "``" # <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x044b" "y`" # <U044B> CYRILLIC SMALL LETTER YERU
+"\x044c" "`" # <U044C> CYRILLIC SMALL LETTER SOFT SIGN
+"\x044d" "e`" # <U044D> CYRILLIC SMALL LETTER E
+"\x044e" "yu" # <U044E> CYRILLIC SMALL LETTER YU
+"\x044f" "ya" # <U044F> CYRILLIC SMALL LETTER YA
+"\x0451" "yo" # <U0451> CYRILLIC SMALL LETTER IO
+"\x0452" "dj" # <U0452> CYRILLIC SMALL LETTER DJE
+"\x0453" "g`" # <U0453> CYRILLIC SMALL LETTER GJE
+"\x0454" "ye" # <U0454> CYRILLIC SMALL LETTER UKRAINIAN IE
+"\x0455" "z`" # <U0455> CYRILLIC SMALL LETTER DZE
+"\x0456" "i" # <U0456> CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+"\x0457" "yi" # <U0457> CYRILLIC SMALL LETTER YI
+"\x0458" "j" # <U0458> CYRILLIC SMALL LETTER JE
+"\x0459" "l`" # <U0459> CYRILLIC SMALL LETTER LJE
+"\x045a" "n`" # <U045A> CYRILLIC SMALL LETTER NJE
+"\x045b" "tsh" # <U045B> CYRILLIC SMALL LETTER TSHE
+"\x045c" "k`" # <U045C> CYRILLIC SMALL LETTER KJE
+"\x045e" "u`" # <U045E> CYRILLIC SMALL LETTER SHORT U
+"\x045f" "dh" # <U045F> CYRILLIC SMALL LETTER DZHE
+"\x046a" "O`" # <U046A> CYRILLIC CAPITAL LETTER BIG YUS
+"\x046b" "o`" # <U046B> CYRILLIC SMALL LETTER BIG YUS
+"\x0472" "FH" # <U0472> CYRILLIC CAPITAL LETTER FITA
+"\x0473" "fh" # <U0473> CYRILLIC SMALL LETTER FITA
+"\x0474" "YH" # <U0474> CYRILLIC CAPITAL LETTER IZHITSA
+"\x0475" "yh" # <U0475> CYRILLIC SMALL LETTER IZHITSA
+"\x048c" "E`" # <U048C> CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+"\x048d" "e`" # <U048D> CYRILLIC SMALL LETTER SEMISOFT SIGN
+"\x0490" "G`" # <U0490> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+"\x0491" "g`" # <U0491> CYRILLIC SMALL LETTER GHE WITH UPTURN
+"\x0492" "GH" # <U0492> CYRILLIC CAPITAL LETTER GHE WITH STROKE
+"\x0493" "gh" # <U0493> CYRILLIC SMALL LETTER GHE WITH STROKE
+"\x0494" "GH" # <U0494> CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+"\x0495" "gh" # <U0495> CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK
+"\x0496" "ZH`" # <U0496> CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+"\x0497" "zh`" # <U0497> CYRILLIC SMALL LETTER ZHE WITH DESCENDER
+"\x049a" "K`" # <U049A> CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+"\x049b" "k`" # <U049B> CYRILLIC SMALL LETTER KA WITH DESCENDER
+"\x049e" "K`" # <U049E> CYRILLIC CAPITAL LETTER KA WITH STROKE
+"\x049f" "k`" # <U049F> CYRILLIC SMALL LETTER KA WITH STROKE
+"\x04a2" "N`" # <U04A2> CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+"\x04a3" "n`" # <U04A3> CYRILLIC SMALL LETTER EN WITH DESCENDER
+"\x04a4" "NG" # <U04A4> CYRILLIC CAPITAL LIGATURE EN GHE
+"\x04a5" "ng" # <U04A5> CYRILLIC SMALL LIGATURE EN GHE
+"\x04a6" "P`" # <U04A6> CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+"\x04a7" "p`" # <U04A7> CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK
+"\x04a8" "O`" # <U04A8> CYRILLIC CAPITAL LETTER ABKHASIAN HA
+"\x04a9" "o`" # <U04A9> CYRILLIC SMALL LETTER ABKHASIAN HA
+"\x04aa" "C`" # <U04AA> CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+"\x04ab" "C`" # <U04AB> CYRILLIC SMALL LETTER ES WITH DESCENDER
+"\x04ac" "T`" # <U04AC> CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+"\x04ad" "t`" # <U04AD> CYRILLIC SMALL LETTER TE WITH DESCENDER
+"\x04ae" "U" # <U04AE> CYRILLIC CAPITAL LETTER STRAIGHT U
+"\x04af" "u" # <U04AF> CYRILLIC SMALL LETTER STRAIGHT U
+"\x04b2" "H`" # <U04B2> CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+"\x04b3" "h`" # <U04B3> CYRILLIC SMALL LETTER HA WITH DESCENDER
+"\x04b4" "TCZ" # <U04B4> CYRILLIC CAPITAL LIGATURE TE TSE
+"\x04b5" "tcz" # <U04B5> CYRILLIC SMALL LIGATURE TE TSE
+"\x04ba" "SH`" # <U04BA> CYRILLIC CAPITAL LETTER SHHA
+"\x04bb" "sh`" # <U04BB> CYRILLIC SMALL LETTER SHHA
+"\x04bc" "CH`" # <U04BC> CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+"\x04bd" "ch`" # <U04BD> CYRILLIC SMALL LETTER ABKHASIAN CHE
+"\x04be" "CH`" # <U04BE> CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04bf" "ch`" # <U04BF> CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER
+"\x04c0" "i" # <U04C0> CYRILLIC LETTER PALOCHKA
+"\x04c1" "ZH`" # <U04C1> CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+"\x04c2" "zh`" # <U04C2> CYRILLIC SMALL LETTER ZHE WITH BREVE
+"\x04cb" "CH`" # <U04CB> CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+"\x04cc" "ch`" # <U04CC> CYRILLIC SMALL LETTER KHAKASSIAN CHE
+"\x04d0" "A`" # <U04D0> CYRILLIC CAPITAL LETTER A WITH BREVE
+"\x04d1" "a`" # <U04D1> CYRILLIC SMALL LETTER A WITH BREVE
+"\x04d2" "A`" # <U04D2> CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+"\x04d3" "a`" # <U04D3> CYRILLIC SMALL LETTER A WITH DIAERESIS
+"\x04d6" "E`" # <U04D6> CYRILLIC CAPITAL LETTER IE WITH BREVE
+"\x04d7" "e`" # <U04D7> CYRILLIC SMALL LETTER IE WITH BREVE
+"\x04d8" "A`" # <U04D8> CYRILLIC CAPITAL LETTER SCHWA
+"\x04d9" "a`" # <U04D9> CYRILLIC SMALL LETTER SCHWA
+"\x04dc" "ZH`" # <U04DC> CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+"\x04dd" "zh`" # <U04DD> CYRILLIC SMALL LETTER ZHE WITH DIAERESIS
+"\x04de" "Z`" # <U04DE> CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+"\x04df" "z`" # <U04DF> CYRILLIC SMALL LETTER ZE WITH DIAERESIS
+"\x04e0" "Z`" # <U04E0> CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+"\x04e1" "z`" # <U04E1> CYRILLIC SMALL LETTER ABKHASIAN DZE
+"\x04e4" "I`" # <U04E4> CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+"\x04e5" "i`" # <U04E5> CYRILLIC SMALL LETTER I WITH DIAERESIS
+"\x04e6" "O`" # <U04E6> CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+"\x04e7" "o`" # <U04E7> CYRILLIC SMALL LETTER O WITH DIAERESIS
+"\x04e8" "O`" # <U04E8> CYRILLIC CAPITAL LETTER BARRED O
+"\x04e9" "o`" # <U04E9> CYRILLIC SMALL LETTER BARRED O
+"\x04f0" "U`" # <U04F0> CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+"\x04f1" "u`" # <U04F1> CYRILLIC SMALL LETTER U WITH DIAERESIS
+"\x04f2" "U`" # <U04F2> CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+"\x04f3" "u`" # <U04F3> CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE
+"\x04f4" "CH`" # <U04F4> CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+"\x04f5" "ch`" # <U04F5> CYRILLIC SMALL LETTER CHE WITH DIAERESIS
+"\x04f8" "Y`" # <U04F8> CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+"\x04f9" "y`" # <U04F9> CYRILLIC SMALL LETTER YERU WITH DIAERESIS
"\x2002" " " # <U2002> EN SPACE
"\x2003" " " # <U2003> EM SPACE
"\x2004" " " # <U2004> THREE-PER-EM SPACE
--
2.17.1
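
For readers who want to inspect rows like the ones above programmatically, here is a minimal Python sketch. It assumes every row has the shape `"\xNNNN" "LATIN" # comment` exactly as it appears in this patch; the names `ROW`, `SAMPLE` and `parse_rows` are illustrative and are not part of glibc.

```python
import re

# A few rows copied from the patch above, including the leading "+" of the diff.
SAMPLE = r'''
+"\x0427" "CH" # <U0427> CYRILLIC CAPITAL LETTER CHE
+"\x0429" "SHH" # <U0429> CYRILLIC CAPITAL LETTER SHCHA
+"\x044a" "``" # <U044A> CYRILLIC SMALL LETTER HARD SIGN
+"\x0451" "yo" # <U0451> CYRILLIC SMALL LETTER IO
'''

# Optional "+", a quoted \xNNNN code point, whitespace, a quoted ASCII replacement.
ROW = re.compile(r'^\+?"\\x([0-9a-fA-F]{4})"\s+"([^"]*)"')


def parse_rows(text):
    """Return {character: replacement} for every row that matches ROW."""
    table = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            table[chr(int(match.group(1), 16))] = match.group(2)
    return table


table = parse_rows(SAMPLE)
for char, latin in table.items():
    print(f"U+{ord(char):04X} -> {latin}")
```

Lines that do not match the pattern (blank lines, diff headers) are skipped rather than reported, which is a simplification made for this sketch.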
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PING^4][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-03-19 10:39 ` ping " Egor Kobylkin
@ 2019-03-28 16:20 ` Marko Myllynen
2019-04-04 19:44 ` [PING^5][PATCH " Egor Kobylkin
2019-04-16 9:59 ` [PING^6][PATCH " Marko Myllynen
2 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-03-28 16:20 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
Ping?
On 19/03/2019 12.39, Egor Kobylkin wrote:
> Changelog v12:
> * Adjusted to the new comment style suddenly appearing in the target
> file locale/C-translit.h.in (the original file changed on the master
> branch from /* style to # style since v11)
> * Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to
> "sh`" instead of erroneous "SH`" in v11
>
> Changelog v11:
> * Re-targeted the patch against locale/C-translit.h.in as the proper
> file for the ASCII translit table.
> * Correspondingly the patch now only contains the additional
> Cyrillic-ASCII strings in the format of locale/C-translit.h.in table.
> The 'include "translit_cyrillic";""' directives are not necessary in the
> locale files and they are now all left intact.
> * Also the file translit_cyrillic is no longer needed and is omitted.
> * Edited below email, commit message.
>
> Changelog v10:
> * Removed ISO 9.1995 GOST 7.79-2000 System A (transliteration to Latin
> with diacritics) as conflicting with System B within glibc mechanics and
> not solving BZ #2872
> * Edited below email, commit message, comment in translit_cyrillic to
> reflect System A removal
> * Removed <U0423><U0301> and <U0443><U0301> (Cyrillic U with acute,
> using composition) as composing is not covered by current glibc
> conversion mechanics
>
> Changelog v9:
> * Fixed formatting (trailing spaces etc.)
> * Put commit summary in the patch file, now it is generated completely
> by git format-patch
>
> Changelog v8:
> * Re-added missing translit_cyrillic in patch v7 (due to missing "git
> add" in the script).
>
> Changelog v7:
> * Generated against git://sourceware.org/git/glibc.git master with git
> format-patch.
> * The 'include "translit_cyrillic";""' now immediately follows last
> 'include "translit_XXX";""' string (was inserted just before
> translit_end previously.)
> * Only the locales already having 'include .*translit.*;""' are patched
> (see the list for manual exclusions below, full list of included locales
> at the end of the email in the commit section.)
> * Excluded az_AZ completely to avoid circular reference from tr_TR via
> ‘copy "tr_TR"’.
>
> Changelog v6:
> * Locales removed from the patch: C and sd_PK.
> * Added locales: az_AZ and ky_KG.
> * Consistently transliterate single uppercase Cyrillic letters
>   to sequences of all uppercase Latin letters in all languages (whenever
>   a Cyrillic letter is transliterated to more than one Latin letter),
>   for example "Ї" is now transliterated as "YI" rather than "Yi".
>
> Dear locale maintainers,
>
> fix the glibc bug 2872 "Transliteration Cyrillic -> ASCII fails"
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=2872 [1]
>
> add the Cyrillic transliteration rows to locale/C-translit.h.in.
>
> The patch is attached.
>
>
> Current bug effect:
>
> The glibc wiki explicitly lists this use case as the test example and
> currently it fails on Cyrillic texts [1] [8] [9]:
>
> iconv -f UTF-8 -t ASCII//TRANSLIT < translit-test-input.txt |grep CYRILLIC
>
> CYRILLIC ????? ??? ???? ?????? ??????????? ?????, ?? ????? ?? ???.
>
> - it produces a string of question marks and spaces.
>
> This is what it should produce and it does so after the patch is applied:
>
> CYRILLIC S``esh` eshhyo e`tix myagkix franczuzskix bulok, da vy'pej zhe
> chayu.
>
>
> The root problem and the fix:
>
> The root problem is the missing transliteration table that I am
> supplying here.
>
>
> COMMIT MESSAGE:
> This translit_cyrillic table enables conversion (e.g. with iconv) from a
> UTF-8 encoded text based on the Cyrillic alphabet to an ASCII//TRANSLIT text.
>
> Example: iconv -f UTF-8 -t ASCII//TRANSLIT will produce ASCII
> compatible transcription.
>
> While a UTF-encoded Cyrillic text requires Cyrillic fonts the result of
> a transliteration/transcription has only Latin/ASCII codes but still can
> be read by a native speaker. Among other things it is useful for
> processing the Cyrillic texts and filenames by programs or on systems
> that are not specifically prepared to work with Cyrillic, don't have
> corresponding fonts installed or can't handle UTF-8.
>
> The patch content (mapping) is based on ISO 9.1995 standard [10] and its
> derivative GOST 7.79-2000 System B official source (Federal Agency on
> Technical Regulating and Metrology Of Russian Federation [2]).
> Technically an independent but mostly identical source [3] was used and
> prepared in a spreadsheet [6].
>
> The transliteration of Cyrillic to ASCII according to GOST 7.79-2000
> System B represents what is actually called transcription (preserving
> phonemes), while System A is the transliteration (preserving graphemes).
> There is no meaningful way to preserve graphemes converting Cyrillic to
> ASCII and thus System B is chosen [11]. To be clear, System
> A has nothing to do with this bug, regardless of it being a transliteration.
>
> Those interested in implementing System A for transliteration of
> Cyrillic to Latin with Diacritic as a new feature are welcome to use the
> spreadsheet in [6] as a starting point.
>
> Links:
>
> [1] This bug entry https://sourceware.org/bugzilla/show_bug.cgi?id=2872
> [2] GOST 7.79-2000 official source
> http://protect.gost.ru/document.aspx?control=7&id=130715 (is only
> available in low quality gif format)
> [3] http://transliteration.ru/gost-7-79-2000/ and
> http://www.yfermer.ru/specifications/285821.html
> [4] Wikipedia article on Cyrillic transliteration with Latin alphabet
> https://ru.wikipedia.org/wiki/%D0%A2%D1%80%D0%B0%D0%BD%D1%81%D0%BB%D0%B8%D1%82%D0%B5%D1%80%D0%B0%D1%86%D0%B8%D1%8F_%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%BE%D0%B3%D0%BE_%D0%B0%D0%BB%D1%84%D0%B0%D0%B2%D0%B8%D1%82%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%B8%D1%86%D0%B5%D0%B9
>
> [5] http://man7.org/linux/man-pages/man5/locale.5.html
> [6] Spreadsheet for generating translit_cyrillic
> https://sourceware.org/bugzilla/attachment.cgi?bugid=2872&action=viewall&hide_obsolete=1
>
> [8] https://sourceware.org/glibc/wiki/Locales#Testing_Locales
> [9] translit-test-input.txt
> https://sourceware.org/bugzilla/attachment.cgi?id=11304
> [10] https://en.wikipedia.org/wiki/ISO_9#GOST_7.79_System_B
> [11]
> https://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=gslmka8xq3
>
>
> Best regards,
> Egor Kobylkin
>
>
>
>
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PING^5][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-03-19 10:39 ` ping " Egor Kobylkin
2019-03-28 16:20 ` [PING^4][PATCH " Marko Myllynen
@ 2019-04-04 19:44 ` Egor Kobylkin
2019-04-06 20:15 ` Siddhesh Poyarekar
2019-04-16 9:59 ` [PING^6][PATCH " Marko Myllynen
2 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-04 19:44 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
Ping?
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PING^5][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-04 19:44 ` [PING^5][PATCH " Egor Kobylkin
@ 2019-04-06 20:15 ` Siddhesh Poyarekar
0 siblings, 0 replies; 111+ messages in thread
From: Siddhesh Poyarekar @ 2019-04-06 20:15 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Rafal Luzynski
Cc: Mike Fabian
On 05/04/19 1:14 AM, Egor Kobylkin wrote:
> Ping?
>
I'm committing to looking at this on Monday if nobody gets to it over
the weekend.
Siddhesh
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-01-02 18:39 ` [PATCH v12] " Egor Kobylkin
2019-01-05 14:36 ` Rafal Luzynski
@ 2019-04-09 1:27 ` Carlos O'Donell
1 sibling, 0 replies; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-09 1:27 UTC (permalink / raw)
To: Egor Kobylkin, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Marko Myllynen, mfabian
On 1/2/19 1:38 PM, Egor Kobylkin wrote:
> Changelog v12:
> * Adjusted to the new comment style suddenly appearing in the target file locale/C-translit.h.in (the original file changed on the master branch from /* style to # style since v11)
> * Fixed a typo for <U04BB> CYRILLIC SMALL LETTER SHHA to be mapped to "sh`" instead of erroneous "SH`" in v11
I have installed this patch and I'm testing some transliterations.
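
One kind of check that fits the typo fixed in v12 is a letter-case consistency test: a SMALL LETTER should map to a lowercase string and a CAPITAL LETTER to an uppercase one. The sketch below is only an illustration of that idea, not the test anyone in this thread ran; `ROWS` holds four rows copied from the v12 table and `case_mismatches` is a hypothetical helper.

```python
import unicodedata

# Four rows copied from the v12 table (code point -> ASCII replacement).
ROWS = {
    0x04AA: "C`",   # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
    0x04AB: "C`",   # CYRILLIC SMALL LETTER ES WITH DESCENDER
    0x04BA: "SH`",  # CYRILLIC CAPITAL LETTER SHHA
    0x04BB: "sh`",  # CYRILLIC SMALL LETTER SHHA
}


def case_mismatches(rows):
    """Return code points whose replacement's case disagrees with the Unicode name."""
    bad = []
    for code_point, latin in sorted(rows.items()):
        name = unicodedata.name(chr(code_point), "")
        if "SMALL" in name and latin != latin.lower():
            bad.append(code_point)
        elif "CAPITAL" in name and latin != latin.upper():
            bad.append(code_point)
    return bad


print([f"U+{cp:04X}" for cp in case_mismatches(ROWS)])
```

On these four rows the helper reports U+04AB, whose row in the table above maps a small letter to the uppercase string "C`"; whether that is intended is for the patch author to confirm.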
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 111+ messages in thread
* [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-03-19 10:39 ` ping " Egor Kobylkin
2019-03-28 16:20 ` [PING^4][PATCH " Marko Myllynen
2019-04-04 19:44 ` [PING^5][PATCH " Egor Kobylkin
@ 2019-04-16 9:59 ` Marko Myllynen
2019-04-16 13:39 ` Carlos O'Donell
2 siblings, 1 reply; 111+ messages in thread
From: Marko Myllynen @ 2019-04-16 9:59 UTC (permalink / raw)
To: libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian, Egor Kobylkin
Ping?
--
Marko Myllynen
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 9:59 ` [PING^6][PATCH " Marko Myllynen
@ 2019-04-16 13:39 ` Carlos O'Donell
2019-04-16 17:32 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 13:39 UTC (permalink / raw)
To: Marko Myllynen, libc-alpha, libc-locales, Carlos O'Donell,
Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian, Egor Kobylkin
On 4/16/19 3:15 AM, Marko Myllynen wrote:
> Ping?
I have this patch applied locally and I'm working through some
comparisons for the transliteration.
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 13:39 ` Carlos O'Donell
@ 2019-04-16 17:32 ` Egor Kobylkin
2019-04-16 18:07 ` Carlos O'Donell
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-16 17:32 UTC (permalink / raw)
To: Carlos O'Donell, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
Just FYI, this is what I was testing: ./testrun.sh /usr/bin/iconv -f UTF-8
-t ASCII//TRANSLIT <<<
"ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ
ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
And this is the expected result ("" added by myself):
"YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`
G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'"
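
The per-character lookup behind this expected result can be sketched in a few lines of Python. This is only an illustration, not how glibc's iconv is implemented: `TABLE` is a subset of the patch's rows, `translit` is a hypothetical helper, and falling back to "?" for unmapped characters is an assumption modelled on the question marks shown in the bug report.

```python
# Subset of the patch's table, keyed by code point as in the "\xNNNN" rows.
TABLE = {
    0x0421: "S", 0x0430: "a", 0x0431: "b", 0x0433: "g", 0x0435: "e",
    0x0437: "z", 0x0438: "i", 0x043A: "k", 0x043B: "l", 0x043C: "m",
    0x043D: "n", 0x043E: "o", 0x0440: "r", 0x0441: "s", 0x0442: "t",
    0x0443: "u", 0x0444: "f", 0x0445: "x", 0x0446: "cz", 0x0448: "sh",
    0x0449: "shh", 0x044A: "``", 0x044C: "`", 0x044D: "e`", 0x044F: "ya",
    0x0451: "yo",
}


def translit(text):
    """Replace each non-ASCII character via TABLE; unmapped ones become '?'."""
    out = []
    for char in text:
        if ord(char) < 0x80:
            out.append(char)  # ASCII passes through unchanged
        else:
            out.append(TABLE.get(ord(char), "?"))
    return "".join(out)


sample = "Съешь ещё этих мягких французских булок"
print(translit(sample))
```

With these rows the sample comes out as "S``esh` eshhyo e`tix myagkix franczuzskix bulok", matching the first part of the pangram quoted earlier in the thread.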
Bests,
Egor Kobylkin
On 16.04.19 15:17, Carlos O'Donell wrote:
> On 4/16/19 3:15 AM, Marko Myllynen wrote:
>> Ping?
>
> I have this patch applied locally and I'm working through some
> comparisons for the transliteration.
>
>
^ permalink raw reply [flat|nested] 111+ messages in thread
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 17:32 ` Egor Kobylkin
@ 2019-04-16 18:07 ` Carlos O'Donell
2019-04-16 19:06 ` Egor Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 18:07 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
On 4/16/19 1:06 PM, Egor Kobylkin wrote:
> Just FYI, this is what I was testing: ./testrun.sh /usr/bin/iconv -f UTF-8 -t ASCII//TRANSLIT <<< "ЁЂЃЄЅІЇЈЉЊЋЌЎЏАБВГДЕЖЗИЙКЛМНОПРСТУУ́ФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуу́фхцчшщъыьэюяёђѓєѕіїјљњћќўџѪѫѲѳѴѵҌҍ ҐґҒғҔҕҖҗҚқҞҟҢңҤҥҦҧҨҩҪҫҬҭҮүҲҳҴҵҺһҼҽҾҿӀӁӂӋӌӐӑӒӓӖӗӘәӜӝӞӟӠӡӤӥӦӧӨөӰӱӲӳӴӵӸӹ’"
>
> And this is the expected result ("" added by myself):
> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e` G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'"
Thanks.
I was using CyrTranslit (a Python transliterator) to review other work done in this area,
but it wasn't very fruitful.
$ python3
Python 3.7.3 (default, Mar 27 2019, 13:36:35)
[GCC 9.0.1 20190227 (Red Hat 9.0.1-0.8)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cyrtranslit
>>> cyrtranslit.supported()
dict_keys(['sr', 'me', 'mk', 'ru'])
>>> cyrtranslit.to_latin("ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩЪЫЬÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â")
'ÐÄÐÐÐ
ÐÐJLjNjÄÐÐDžABVGDEŽZIÐKLMNOPRSTUUÌFHCÄŠЩЪЫЬÐЮЯabvgdežziйklmnoprstuuÌfhcÄÅ¡ÑÑÑÑÑÑÑÑÄÑÑÑÑÑjljnjÄÑÑdžѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â'
>>>
Which doesn't give a good transliteration.
But the table is better:
https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L138-L155
Ё -> YO.
Which is a good cross-check for me.
--
Cheers,
Carlos.
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 18:07 ` Carlos O'Donell
@ 2019-04-16 19:06 ` Egor Kobylkin
2019-04-16 20:56 ` Carlos O'Donell
0 siblings, 1 reply; 111+ messages in thread
From: Egor Kobylkin @ 2019-04-16 19:06 UTC (permalink / raw)
To: Carlos O'Donell, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
On 16.04.19 19:58, Carlos O'Donell wrote:
> On 4/16/19 1:06 PM, Egor Kobylkin wrote:
>> Just FYI, this what I was testing: ./testrun.sh /usr/bin/iconv -f
>> UTF-8 -t ASCII//TRANSLIT <<<
>> "ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩЪЫЬÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒ
>> ÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â"
>>
>> And this is the expected result ("" added by myself):
>> "YODJG`YEZ`IYIJL`N`TSHK`U`DHABVGDEZHZIJKLMNOPRSTUU?FXCZCHSHSHHA`Y``E`YUYAabvgdezhzijklmnoprstuu?fxczchshshh``y``e`yuyayodjg`yez`iyijl`n`tshk`u`dhO`o`FHfhYHyhE`e`
>> G`g`GHghGHghZH`zh`K`k`K`k`N`n`NGngP`p`O`o`C`C`T`t`UuH`h`TCZtczSH`sh`CH`ch`CH`ch`iZH`zh`CH`ch`A`a`A`a`E`e`A`a`ZH`zh`Z`z`Z`z`I`i`O`o`O`o`U`u`U`u`CH`ch`Y`y`'"
>>
>
> Thanks.
>
> I was using CyrTranslit (python translater) to review other work done in
> this area,
> but it wasn't very fruitful.
>
> $ python3
> Python 3.7.3 (default, Mar 27 2019, 13:36:35)
> [GCC 9.0.1 20190227 (Red Hat 9.0.1-0.8)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import cyrtranslit
>>>> cyrtranslit.supported()
> dict_keys(['sr', 'me', 'mk', 'ru'])
>>>> cyrtranslit.to_latin("ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩЪЫЬÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â")
>>>>
> 'ÐÄÐÐÐ
ÐÐJLjNjÄÐÐDžABVGDEŽZIÐKLMNOPRSTUUÌFHCÄŠЩЪЫЬÐЮЯabvgdežziйklmnoprstuuÌfhcÄÅ¡ÑÑÑÑÑÑÑÑÄÑÑÑÑÑjljnjÄÑÑdžѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â'
>
>>>>
>
> "ÐÐÐÐÐ
ÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐÐРСТУУÌФХЦЧШЩЪЫЬÐЮЯабвгдежзийклмнопÑÑÑÑÑÌÑÑ
ÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑÑѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â"
>
> 'ÐÄÐÐÐ
ÐÐJLjNjÄÐÐDžABVGDEŽZIÐKLMNOPRSTUUÌFHCÄŠЩЪЫЬÐЮЯabvgdežziйklmnoprstuuÌfhcÄÅ¡ÑÑÑÑÑÑÑÑÄÑÑÑÑÑjljnjÄÑÑdžѪѫѲѳѴѵÒÒÒÒÒÒÒÒÒÒÒÒÒÒÒ¢Ò£Ò¤Ò¥Ò¦Ò§Ò¨Ò©ÒªÒ«Ò¬ÒÒ®Ò¯Ò²Ò³Ò´ÒµÒºÒ»Ò¼Ò½Ò¾Ò¿ÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓÓ Ó¡Ó¤Ó¥Ó¦Ó§Ó¨Ó©Ó°Ó±Ó²Ó³Ó´ÓµÓ¸Ó¹â'
>
>
> Which doesn't give a good transliteration.
I guess the reason is that it uses the first key from your list, 'sr',
which stands for Serbian. Serbian doesn't have the characters that were
left untransliterated ("Щ", for example).
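The guess above can be illustrated with a minimal sketch. The tables and the to_latin helper below are hypothetical miniatures written only for this illustration; they are not cyrtranslit's actual data or code. The one assumption carried over from the session is that the language defaults to 'sr' when none is given.

```python
# Hypothetical per-language tables (tiny excerpts, for illustration only).
TABLES = {
    "sr": {"Ш": "Š", "Ж": "Ž"},                 # Serbian has no "Щ"
    "ru": {"Ш": "SH", "Щ": "SHH", "Ж": "ZH"},   # Russian, ASCII digraphs
}


def to_latin(text, lang="sr"):
    """Map each character through the chosen table; unknown ones pass through."""
    table = TABLES[lang]
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(to_latin("ШЩ"))        # default table: "Щ" is left as-is
    print(to_latin("ШЩ", "ru"))  # Russian table covers both letters
```

With the default table, any letter outside the Serbian alphabet is copied through unchanged, which matches the untransliterated runs in the output quoted above.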
> But the table is better:
> https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L138-L155
>
>
> Ё -> YO.
>
> Which is a good cross-check for me.
Yet the closest one from that codebase should be this
https://github.com/opendatakosovo/cyrillic-transliteration/blob/master/cyrtranslit/mapping.py#L88
It is exactly the reason we had 12 iterations on this patch - we wanted
to cover the most complete yet workable standard for the table. What we
reference in the bug memo is the actual accepted standard. It is
coalesced with the extended standard for further outdated cyrillic letters.
Bests,
Egor Kobylkin
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 19:06 ` Egor Kobylkin
@ 2019-04-16 20:56 ` Carlos O'Donell
2019-05-10 12:19 ` Marko Myllynen
0 siblings, 1 reply; 111+ messages in thread
From: Carlos O'Donell @ 2019-04-16 20:56 UTC (permalink / raw)
To: Egor Kobylkin, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
On 4/16/19 2:41 PM, Egor Kobylkin wrote:
> It is exactly the reason we had 12 iterations on this patch - we
> wanted to cover the most complete yet workable standard for the
> table. What we reference in the bug memo is the actual accepted
> standard. It is coalesced with the extended standard for further
> outdated cyrillic letters.
I agree, and this is what makes review complicated and time
consuming. I'm relying on you as the expert, and my goal is only
to spot check for any inconsistencies.
--
Cheers,
Carlos.
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-02-14 16:48 ` Marko Myllynen
2019-03-04 22:12 ` Egor Kobylkin
@ 2019-04-20 0:20 ` Rafal Luzynski
[not found] ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
1 sibling, 1 reply; 111+ messages in thread
From: Rafal Luzynski @ 2019-04-20 0:20 UTC (permalink / raw)
To: Marko Myllynen, Egor Kobylkin, libc-alpha, libc-locales,
Carlos O'Donell
Cc: Siddhesh Poyarekar, Mike Fabian
Thank you Siddhesh and Carlos for your involvement in testing this
patch, and I apologize to Egor, Marko, and everyone else who needs this
patch pushed for my limited involvement. I'd like to reply to
this email from Marko because it summarizes all the issues. I also hope
to explain the problems that got me stuck.
14.02.2019 17:48 Marko Myllynen <myllynen@redhat.com> wrote:
> [...]
> 1) Built-in C locale doesn't read/use any translit_* files and it can't
> have any fallback mechanisms and it only supports ASCII so using GOST
> 7.79 System B in locale/C-translit.h.in (as per patch v12) would seem to
> be the appropriate way to implement Cyrillic transliteration for the
> built-in C locale (it adds some 8KB to the binary).
This sounds like a good idea.
Also, C locale is probably a good way to enforce the plain ASCII
transliteration without any fallback.
> 2) Other locales read/use translit_* files and with them fallbacks and
> non-ASCII are possible so it would seem preferable to first try ISO 9 /
> GOST 7.79 System A
OK, we agree here.
> and only if that fails then use GOST 7.79 System B
> (in which case the end result should match with the built-in C locale).
This is impossible because of the following case. System A transliterates the Cyrillic
"Х" to Latin "H", System B transliterates it to Latin "X". Transliteration
as implemented in glibc supports a simple fallback algorithm: transliterate
the letter "X" to "YY" but if it is not available then to "ZZ". It can't
support the complex algorithm which we need here: transliterate "X" to "YY"
but if "Q" cannot be transliterated to "RR" then transliterate "X" to "ZZ".
In our case we would like to transliterate "Х" to "X" if "Ш" cannot be
transliterated to "Š". The only thing we can implement is a fallback
transliteration which is similar to System B but not 100% compatible.
This is not the case if we are going to implement only System B in C locale
because we know already that "Š" is unavailable so we have to transliterate
"Х" to "X" always.
> For this the translit_cyrillic file should be added (as per patch v9 +
> changes mentioned in patches v10 and v12).
>
> 3) Individual locale files can then be updated to use translit_cyrillic
> as appropriate (see patch v9) and language/national specific conventions
> (e.g., SFS 4900 for fi_FI) can be applied on per-locale basis.
Sometimes I wonder whether really any other locale than a language which
uses the Cyrillic script should want to have a Cyrillic transliteration
but on the other hand - why not.
Also I'd like to reiterate other disagreements which we have here:
1. How to handle upper/lower case in System B? Should we transliterate
"Ш" to "SH" or "Sh"? Should we maybe implement a smart context based
casing algorithm first? I mean the algorithm which would detect if
an uppercase letter appears as the first letter of otherwise lowercase
word so should be transliterated as "Sh", or maybe it's in a context
of a fully uppercase word so should be transliterated as "SH".
I think that uconv implements this algorithm.
2. How to handle ambiguous transliterations like "Схема" -> "Shema"
vs. "Шема" -> "Shema"? "SHema"?
3. How to handle the characters which are proper letters in Cyrillic
and have an upper and lower case like a hard and soft sign but are
transliterated to punctuation characters (grave accent "`")?
Should we transliterate upper and lower case to the same character
or should we mark them somehow? uconv adds Unicode combining low
line to the grave accent (so the output is "`̲") if the original
Cyrillic character was uppercase. But this is unavailable if
our target charset is ASCII.
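The context-based casing idea from point 1 could look roughly like the following Python sketch. This is only an illustration under stated assumptions: TABLE_B is a tiny hypothetical excerpt of a System B style mapping, the neighbor rule (look at the next character, else the previous one) is one possible heuristic, and nothing here is claimed to match what uconv actually does.

```python
# Lowercase excerpt of a System B style table (hypothetical, illustration only).
TABLE_B = {"ш": "sh", "к": "k", "о": "o", "л": "l", "а": "a"}


def neighbor(text, i):
    """Return the following character, or the preceding one at the end of text."""
    if i + 1 < len(text):
        return text[i + 1]
    if i > 0:
        return text[i - 1]
    return ""


def translit_cased(text):
    """Transliterate, choosing "Sh" or "SH" for uppercase letters by context."""
    out = []
    for i, ch in enumerate(text):
        latin = TABLE_B.get(ch.lower())
        if latin is None:
            out.append(ch)  # not in the table: copy through unchanged
        elif not ch.isupper():
            out.append(latin)
        elif neighbor(text, i).islower():
            out.append(latin.capitalize())  # "Школа" -> "Shkola"
        else:
            out.append(latin.upper())       # "ШКОЛА" -> "SHKOLA"
    return "".join(out)


if __name__ == "__main__":
    print(translit_cased("Школа"), translit_cased("ШКОЛА"))
```

A rule like this needs to look at neighboring characters, which a plain one-character-to-string table cannot express; that is why the question of "SH" versus "Sh" is listed as an open disagreement.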
Regarding the test cases which I mentioned the other day: I discussed
this with Dmitry and he convinced me that requiring them sets
the bar too high, so I agree we don't need to require them yet.
Regards,
Rafal
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
[not found] ` <5ELixS9SQ0DW4mlvswp96ASpLobBabU9KQ6zOTH-Udrb34mABhcqiPERpBZfPWZ9F77s8XNmiLIAq9UWu0AjLFFdjOz_FZVU5_xF-SiQkrw=@kobylkin.com>
@ 2019-04-27 7:34 ` Siddhesh Poyarekar
2019-04-27 9:35 ` Diego (Egor) Kobylkin
0 siblings, 1 reply; 111+ messages in thread
From: Siddhesh Poyarekar @ 2019-04-27 7:34 UTC (permalink / raw)
To: Diego (Egor) Kobylkin, Rafal Luzynski, Marko Myllynen,
libc-alpha, libc-locales, Carlos O'Donell
Cc: Mike Fabian, Dmitry V. Levin
On 27/04/19 4:19 AM, Diego (Egor) Kobylkin wrote:
> Dear all,
> I think Rafal is making good points again. And the best thing is that
> we actually seem to have full consensus from everyone involved about
> current limited ASCII patch V12 (GOST 7.79 System B in
> locale/C-translit.h.in).
> So let's just for the time being concentrate on getting this committed?
>
> We can get to further issues in the next release and having a base to
> start with will make them much clearer by the contrast of what's already
> in.
>
> Please let me know if you see any entanglement between the V12 patch
> content and other issues listed below. I believe Carlos can test the
> patch in isolation and hopefully have it approved for the next release.
Please put it as a release blocker:
https://sourceware.org/glibc/wiki/Release/2.30
Siddhesh
* Re: [PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872] ping for 2.30
2019-04-27 7:34 ` Siddhesh Poyarekar
@ 2019-04-27 9:35 ` Diego (Egor) Kobylkin
0 siblings, 0 replies; 111+ messages in thread
From: Diego (Egor) Kobylkin @ 2019-04-27 9:35 UTC (permalink / raw)
To: Siddhesh Poyarekar
Cc: Rafal Luzynski, Marko Myllynen, libc-alpha, libc-locales,
Carlos O'Donell, Mike Fabian, Dmitry V. Levin
Thanks, Siddhesh, it's in.
Bests,
Egor Kobylkin
P.S. just for the historians: I have noticed that my quoted message below didn't go to the lists because it was in html format. But I believe all involved have received it directly.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, April 27, 2019 4:51 AM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> On 27/04/19 4:19 AM, Diego (Egor) Kobylkin wrote:
> > current limited ASCII patch V12 (GOST 7.79 System B in
> > locale/C-translit.h.in).
> > So let's just for the time being concentrate on getting this committed?
>
> Please put it as a release blocker:
>
> https://sourceware.org/glibc/wiki/Release/2.30
>
> Siddhesh
* Re: [PING^6][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
2019-04-16 20:56 ` Carlos O'Donell
@ 2019-05-10 12:19 ` Marko Myllynen
0 siblings, 0 replies; 111+ messages in thread
From: Marko Myllynen @ 2019-05-10 12:19 UTC (permalink / raw)
To: Carlos O'Donell, Egor Kobylkin, libc-alpha, libc-locales,
Carlos O'Donell, Siddhesh Poyarekar, Rafal Luzynski
Cc: Mike Fabian
Hi Carlos,
On 16/04/2019 22.06, Carlos O'Donell wrote:
> On 4/16/19 2:41 PM, Egor Kobylkin wrote:
>> It is exactly the reason we had 12 iterations on this patch - we
>> wanted to cover the most complete yet workable standard for the
>> table. What we reference in the bug memo is the actual accepted
>> standard. It is coalesced with the extended standard for further
>> outdated cyrillic letters.
>
> I agree, and this is what makes review complicated and time
> consuming. I'm relying on you as the expert, and my goal is only
> to spot check for any inconsistencies.
I know you've been very busy with everything else, but did you happen to
have a chance to check this further? Shall we still wait for your
results, or how would you suggest we proceed?
Thanks,
--
Marko Myllynen