From: "maiku.fabian at gmail dot com"
To: libc-locales@sourceware.org
Subject: [Bug localedata/17588] Update UTF-8 charmap and width to Unicode 7.0.0
Date: Wed, 03 Dec 2014 08:46:00 -0000
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Assigned-To: pravin.d.s at gmail dot com

https://sourceware.org/bugzilla/show_bug.cgi?id=17588

--- Comment #9 from Mike FABIAN ---
I built glibc with the patch from comment#8.

It produces some FAILs in “make check”:

FAIL: localedata/cs_CZ.UTF-8/LC_CTYPE
... similar FAILs ...

Shortly after starting “make check” one sees:

./charmaps/UTF-8:42734: unknown character `U00009FCD'
... similar messages ...

All the above problems are caused by ranges of reserved code points which
are listed in EastAsianWidth.txt like this:

9FCD..9FFF;W     # Cn    [51] <reserved-9FCD>..<reserved-9FFF>

and these code points are not in UnicodeData.txt.
Therefore, they are not generated into the CHARMAP section of glibc’s
UTF-8 file, and generating them into the WIDTH section of that file
causes the above problems.

This can be fixed by not generating reserved code points into the WIDTH
section, i.e. by ignoring the reserved code points mentioned in
EastAsianWidth.txt.

Patch for utf8-gen.py:

diff --git a/utf8-gen.py b/utf8-gen.py
index 57875b6..20b68bb 100755
--- a/utf8-gen.py
+++ b/utf8-gen.py
@@ -218,6 +218,8 @@ if __name__ == "__main__":
         write_comments(outfile, 1)
         elines = []
         for line in easta_file.readlines():
+            if re.match(r'.*<reserved-.+>\.\.<reserved-.+>.*', line):
+                continue
             if re.match(r'^[^;]*;[WF]', line):
                 elines.append(line.strip())
         process_width(outfile, flines, elines)
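
For illustration, the filtering step can be tried outside the generator script. The following is only a sketch, not the code in utf8-gen.py: the helper name collect_wide_lines and the sample lines are hypothetical, and it assumes that reserved ranges are marked as <reserved-XXXX>..<reserved-YYYY> in the trailing comment of EastAsianWidth.txt.

```python
import re

# Assumption: reserved ranges carry "<reserved-XXXX>..<reserved-YYYY>"
# in the trailing comment of the EastAsianWidth.txt line.
RESERVED_RANGE = re.compile(r'.*<reserved-.+>\.\.<reserved-.+>.*')
# Lines whose East Asian Width property is W (wide) or F (fullwidth).
WIDE_OR_FULL = re.compile(r'^[^;]*;[WF]')


def collect_wide_lines(lines):
    """Return stripped W/F lines, skipping reserved ranges.

    Hypothetical helper for illustration; it mirrors the loop in the
    patch but is not part of utf8-gen.py.
    """
    elines = []
    for line in lines:
        if RESERVED_RANGE.match(line):
            # Unassigned code points are not in the CHARMAP section,
            # so they must not reach the WIDTH section either.
            continue
        if WIDE_OR_FULL.match(line):
            elines.append(line.strip())
    return elines


# Illustrative sample lines in the format of EastAsianWidth.txt.
sample = [
    "1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER\n",
    "9FCD..9FFF;W     # Cn    [51] <reserved-9FCD>..<reserved-9FFF>\n",
    "FF01..FF03;F     # Po     [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN\n",
    "0020;Na          # Zs         SPACE\n",
]

for kept in collect_wide_lines(sample):
    print(kept)
```

With this sample, only the assigned 1100..115F and FF01..FF03 ranges are kept. The reserved range is dropped before the W/F check, and the narrow SPACE line never matches it.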