From: "bruno at clisp dot org" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug locale/30243] GB18030-2022 is not supported!
Date: Sat, 20 May 2023 22:52:10 +0000
Message-ID: <bug-30243-131-MnqhSw6HiE@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-30243-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=30243

Bruno Haible <bruno at clisp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bruno at clisp dot org

--- Comment #2 from Bruno Haible <bruno at clisp dot org> ---
Created attachment 14890
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14890&action=edit
mapping tables

The official GB18030-2022 mapping table can be downloaded from
http://www.nits.org.cn/index/article/4034 (two data files).

The difference between GB18030-2005 and GB18030-2022, regarding the mapping
tables, is that GB18030-2022 drops the PUA (private use area) mappings of
some characters that were not part of Unicode in 2005 but have been added to
Unicode since. In other words, these PUA mappings are now considered obsolete.
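
For orientation: the private use area of the Basic Multilingual Plane spans
U+E000..U+F8FF. Below is a minimal Python sketch that tells PUA code points
apart from regular ones. The helper name is_bmp_pua is hypothetical and is not
part of glibc or GNU libiconv; the sample code points are taken from the lists
further down in this comment.

```python
# Hypothetical helper, for illustration only (not a glibc or libiconv API).
BMP_PUA = range(0xE000, 0xF8FF + 1)  # Private Use Area of the BMP


def is_bmp_pua(code_point: int) -> bool:
    """Return True if code_point lies in the BMP Private Use Area."""
    return code_point in BMP_PUA


# Sample code points from the Unicode to multibyte lists in this comment.
for cp in (0xE81E, 0xE78D, 0x20087, 0x241FE):
    kind = "PUA" if is_bmp_pua(cp) else "not PUA"
    print(f"U+{cp:04X}: {kind}")
```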

Find attached a tar file with
1) The current mapping tables from glibc (extracted from glibc 2.35, but they
haven't changed since then),
2) The mapping tables from GNU libiconv.
The <encoding>.TXT files describe the multibyte to Unicode conversion
direction; the <encoding>.INVERSE.TXT files describe the Unicode to multibyte
conversion direction.

What needs to be done in glibc?

* For the multibyte to Unicode conversion direction: Look at
  "diff -u glibc-2.35-iconv/GB18030.TXT libiconv/GB18030-2022.TXT"
  - Mappings for 0x82359037..0x82359134 and 0x84318236..0x84318335 need to
    be added.
  - The mappings of 0xFE51, 0xFE52, 0xFE53, 0xFE6C, 0xFE76, 0xFE91 need to
    be changed.

* For the Unicode to multibyte conversion direction: Look at
  "diff -u glibc-2.35-iconv/GB18030.INVERSE.TXT libiconv/GB18030-2022.INVERSE.TXT"
  - Mappings for U+E81E, U+E826, U+E82B, U+E82C, U+E832, U+E843, U+E854,
    U+E864, U+E78D..U+E796 need to be added.
  - The mappings of U+20087, U+20089, U+200CC, U+215D7, U+2298F, U+241FE
    need to be changed.
  - Mappings for U+E816, U+E817, U+E818, U+E831, U+E83B, U+E855 need to be
    added.
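
As an alternative to reading the raw "diff -u" output, the two tables can be
compared programmatically. The following Python sketch classifies entries as
added, removed, or changed. It assumes that each non-comment line holds two
whitespace-separated hexadecimal columns (source value, then target value) and
that "#" starts a comment; the actual layout of the attached files may differ.
The function names parse_table and compare_tables are hypothetical, and the
sample data is made up and does not represent real GB18030 mappings.

```python
def parse_table(text):
    """Parse 'key value' hex pairs, skipping blank lines and '#' comments.

    Assumed format (not verified against the attached files):
        0x<source> 0x<target>
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, value = line.split()[:2]
        table[int(key, 16)] = int(value, 16)
    return table


def compare_tables(old, new):
    """Return (added, removed, changed) between two mapping dicts."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Toy data for illustration only; these are NOT real GB18030 mappings.
old_table = parse_table("0x01 0x0041\n0x02 0x0042\n")
new_table = parse_table("0x01 0x0041\n0x02 0x0062\n0x03 0x0043\n")

added, removed, changed = compare_tables(old_table, new_table)
print("added:  ", {hex(k): hex(v) for k, v in added.items()})
print("removed:", {hex(k): hex(v) for k, v in removed.items()})
print("changed:", {hex(k): (hex(a), hex(b)) for k, (a, b) in changed.items()})
```

To run this on the attached files, the two inline strings would be replaced by
the contents of glibc-2.35-iconv/GB18030.TXT and libiconv/GB18030-2022.TXT (or
the corresponding .INVERSE.TXT pair), provided the files follow the assumed
two-column layout.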

