[glibc/vineet/arc-port-latest] Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]

public inbox for glibc-cvs@sourceware.org
help / color / mirror / Atom feed

* [glibc/vineet/arc-port-latest] Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
@ 2020-07-07 22:03 Vineet Gupta
  0 siblings, 0 replies; only message in thread
From: Vineet Gupta @ 2020-07-07 22:03 UTC (permalink / raw)
  To: glibc-cvs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="us-ascii", Size: 6834 bytes --]

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6e540caa21616d5ec5511fafb22819204525138e

commit 6e540caa21616d5ec5511fafb22819204525138e
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Tue Jun 16 08:29:40 2020 +0200

    Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120]
    
    Reviewed-by: Carlos O'Donell <carlos@redhat.com>

Diff:
---
 localedata/charmaps/UTF-8              | 2 ++
 localedata/locales/i18n_ctype          | 2 +-
 localedata/locales/tr_TR               | 2 +-
 localedata/locales/translit_circle     | 2 +-
 localedata/locales/translit_cjk_compat | 2 +-
 localedata/locales/translit_combining  | 2 +-
 localedata/locales/translit_compat     | 2 +-
 localedata/locales/translit_font       | 2 +-
 localedata/locales/translit_fraction   | 2 +-
 localedata/unicode-gen/utf8_gen.py     | 9 ++++++++-
 10 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/localedata/charmaps/UTF-8 b/localedata/charmaps/UTF-8
index 14c5d4fa33..8cce47cd97 100644
--- a/localedata/charmaps/UTF-8
+++ b/localedata/charmaps/UTF-8
@@ -48920,6 +48920,8 @@ WIDTH
 <UABE8>	0
 <UABED>	0
 <UAC00>...<UD7A3>	2
+<UD7B0>...<UD7C6>	0
+<UD7CB>...<UD7FB>	0
 <UF900>...<UFA6D>	2
 <UFA70>...<UFAD9>	2
 <UFB1E>	0
diff --git a/localedata/locales/i18n_ctype b/localedata/locales/i18n_ctype
index 6f078a101d..c63e0790fc 100644
--- a/localedata/locales/i18n_ctype
+++ b/localedata/locales/i18n_ctype
@@ -26,7 +26,7 @@ fax       ""
 language  ""
 territory "Earth"
 revision  "13.0.0"
-date      "2020-04-14"
+date      "2020-06-25"
 category  "i18n:2012";LC_CTYPE
 END LC_IDENTIFICATION
 
diff --git a/localedata/locales/tr_TR b/localedata/locales/tr_TR
index d5785ceca1..7dbb923228 100644
--- a/localedata/locales/tr_TR
+++ b/localedata/locales/tr_TR
@@ -43,7 +43,7 @@ fax        ""
 language   "Turkish"
 territory  "Turkey"
 revision   "1.0"
-date       "2020-04-14"
+date       "2020-06-25"
 
 category "i18n:2012";LC_IDENTIFICATION
 category "i18n:2012";LC_CTYPE
diff --git a/localedata/locales/translit_circle b/localedata/locales/translit_circle
index 0f1e81541c..5c07b44532 100644
--- a/localedata/locales/translit_circle
+++ b/localedata/locales/translit_circle
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 
 % Transliterations of encircled characters.
-% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_circle.py on 2020-06-25 for Unicode 13.0.0.
 
 LC_CTYPE
 
diff --git a/localedata/locales/translit_cjk_compat b/localedata/locales/translit_cjk_compat
index 17b74134fc..ee0d7f83c6 100644
--- a/localedata/locales/translit_cjk_compat
+++ b/localedata/locales/translit_cjk_compat
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 
 % Transliterations of CJK compatibility characters.
-% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_cjk_compat.py on 2020-06-25 for Unicode 13.0.0.
 
 LC_CTYPE
 
diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining
index d5c8bbfe8f..36128f097a 100644
--- a/localedata/locales/translit_combining
+++ b/localedata/locales/translit_combining
@@ -10,7 +10,7 @@ comment_char %
 
 % Transliterations that remove all combining characters (accents,
 % pronounciation marks, etc.).
-% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_combining.py on 2020-06-25 for Unicode 13.0.0.
 
 LC_CTYPE
 
diff --git a/localedata/locales/translit_compat b/localedata/locales/translit_compat
index ff18b02ea3..ac24c4e938 100644
--- a/localedata/locales/translit_compat
+++ b/localedata/locales/translit_compat
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 
 % Transliterations of compatibility characters and ligatures.
-% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_compat.py on 2020-06-25 for Unicode 13.0.0.
 
 LC_CTYPE
 
diff --git a/localedata/locales/translit_font b/localedata/locales/translit_font
index e79b0d83f5..680c4ed426 100644
--- a/localedata/locales/translit_font
+++ b/localedata/locales/translit_font
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 
 % Transliterations of font equivalents.
-% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_font.py on 2020-06-25 for Unicode 13.0.0.
 
 LC_CTYPE
 
diff --git a/localedata/locales/translit_fraction b/localedata/locales/translit_fraction
index 197d57a644..b52244969e 100644
--- a/localedata/locales/translit_fraction
+++ b/localedata/locales/translit_fraction
@@ -9,7 +9,7 @@ comment_char %
 % otherwise be governed by that license.
 
 % Transliterations of fractions.
-% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-04-14 for Unicode 13.0.0.
+% Generated automatically from UnicodeData.txt by gen_translit_fraction.py on 2020-06-25 for Unicode 13.0.0.
 % The replacements have been surrounded with spaces, because fractions are
 % often preceded by a decimal number and followed by a unit or a math symbol.
 
diff --git a/localedata/unicode-gen/utf8_gen.py b/localedata/unicode-gen/utf8_gen.py
index 17b99ee88d..11c906b92f 100755
--- a/localedata/unicode-gen/utf8_gen.py
+++ b/localedata/unicode-gen/utf8_gen.py
@@ -258,7 +258,13 @@ def process_width(outfile, ulines, elines, plines):
         if key in width_dict:
             del width_dict[key] # default width is 1
     for key in list(range(0x1160, 0x1200)):
-        width_dict[key] = 0
+        # Hangul jungseong and jongseong:
+        if key in unicode_utils.UNICODE_ATTRIBUTES:
+            width_dict[key] = 0
+    for key in list(range(0xD7B0, 0xD800)):
+        # Hangul jungseong and jongseong:
+        if key in unicode_utils.UNICODE_ATTRIBUTES:
+            width_dict[key] = 0
     for key in list(range(0x3248, 0x3250)):
         # These are “A” which means we can decide whether to treat them
         # as “W” or “N” based on context:
@@ -327,6 +333,7 @@ if __name__ == "__main__":
         help='The Unicode version of the input files used.')
     ARGS = PARSER.parse_args()
 
+    unicode_utils.fill_attributes(ARGS.unicode_data_file)
     with open(ARGS.unicode_data_file, mode='r') as UNIDATA_FILE:
         UNICODE_DATA_LINES = UNIDATA_FILE.readlines()
     with open(ARGS.east_asian_with_file, mode='r') as EAST_ASIAN_WIDTH_FILE:


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2020-07-07 22:03 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-07 22:03 [glibc/vineet/arc-port-latest] Set width of JUNGSEONG/JONGSEONG characters from UD7B0 to UD7FB to 0 [BZ #26120] Vineet Gupta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).