From: "bruno at clisp dot org"
To: glibc-bugs@sourceware.org
Subject: [Bug locale/30243] GB18030-2022 is not supported!
Date: Sat, 20 May 2023 22:52:10 +0000
X-Bugzilla-Product: glibc
X-Bugzilla-Component: locale
X-Bugzilla-Version: 2.39
X-Bugzilla-Severity: critical
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Changed-Fields: cc attachments.created

https://sourceware.org/bugzilla/show_bug.cgi?id=30243

Bruno Haible changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bruno at clisp dot org

--- Comment #2 from Bruno Haible ---
Created attachment 14890
  --> https://sourceware.org/bugzilla/attachment.cgi?id=14890&action=edit
mapping tables

The official GB18030-2022 mapping table can be downloaded from
http://www.nits.org.cn/index/article/4034 (two data files).

The difference between GB18030-2005 and GB18030-2022, regarding the mapping
tables, is that GB18030-2022 gets rid of the PUA (private use area) mappings
of some characters that were not part of Unicode in 2005 but are in Unicode
nowadays. In other words, these PUA mappings are considered obsolete.

Find attached a tar file with
  1) The current mapping tables from glibc (extracted from glibc 2.35, but
     they haven't changed since then),
  2) The mapping tables from GNU libiconv.
The .TXT files describe the multibyte to Unicode conversion direction; the
.INVERSE.TXT files describe the Unicode to multibyte conversion direction.

What needs to be done in glibc?

* For the multibyte to Unicode conversion direction:
  Look at "diff -u glibc-2.35-iconv/GB18030.TXT libiconv/GB18030-2022.TXT"
  - Mappings for 0x82359037..0x82359134 and 0x84318236..0x84318335 need to
    be added.
  - The mappings of 0xFE51, 0xFE52, 0xFE53, 0xFE6C, 0xFE76, 0xFE91 need to
    be changed.

* For the Unicode to multibyte conversion direction:
  Look at "diff -u glibc-2.35-iconv/GB18030.INVERSE.TXT
  libiconv/GB18030-2022.INVERSE.TXT"
  - Mappings for U+E81E, U+E826, U+E82B, U+E82C, U+E832, U+E843, U+E854,
    U+E864, U+E78D..U+E796 need to be added.
  - The mappings of U+20087, U+20089, U+200CC, U+215D7, U+2298F, U+241FE
    need to be changed.
  - Mappings for U+E816, U+E817, U+E818, U+E831, U+E83B, U+E855 need to be
    added.
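
As a quick sanity check on the ranges quoted above, the following Python sketch
counts how many four-byte codes lie in 0x82359037..0x82359134 and
0x84318236..0x84318335, and how many code points the first Unicode bullet
names. It relies on the general layout of four-byte GB18030 sequences (bytes 1
and 3 in 0x81..0xFE, bytes 2 and 4 in 0x30..0x39). The helper names are
hypothetical and are not part of glibc or libiconv. Reading the two bullets as
describing the same set of characters is an assumption on my part; the code
only shows that the counts agree.

```python
def gb18030_four_byte_index(code):
    """Return the linear position of a four-byte GB18030 code such as 0x82359037."""
    b1, b2, b3, b4 = code.to_bytes(4, "big")
    # Bytes 1 and 3 range over 0x81..0xFE (126 values),
    # bytes 2 and 4 range over 0x30..0x39 (10 values).
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a four-byte GB18030 code: %#010x" % code)
    return (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)


def four_byte_range_size(first, last):
    """Number of four-byte codes in the inclusive range first..last."""
    return gb18030_four_byte_index(last) - gb18030_four_byte_index(first) + 1


def expand_code_points(spec):
    """Expand a list like 'U+E81E, U+E78D..U+E796' into integer code points."""
    points = []
    for item in spec.split(","):
        item = item.strip()
        if ".." in item:
            first, last = item.split("..")
            points.extend(range(int(first[2:], 16), int(last[2:], 16) + 1))
        else:
            points.append(int(item[2:], 16))
    return points


if __name__ == "__main__":
    print(four_byte_range_size(0x82359037, 0x82359134))  # 8
    print(four_byte_range_size(0x84318236, 0x84318335))  # 10
    added = expand_code_points(
        "U+E81E, U+E826, U+E82B, U+E82C, U+E832, U+E843, U+E854, U+E864, "
        "U+E78D..U+E796"
    )
    print(len(added))  # 18
```

The two byte ranges hold 8 and 10 codes, 18 in total, which matches the 8
individual code points plus the 10 in U+E78D..U+E796 from the first Unicode
bullet.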