* [Bug localedata/32168] New: Update locale data to Unicode 16.0.0
@ 2024-09-12 13:00 maiku.fabian at gmail dot com
2024-09-12 13:00 ` [Bug localedata/32168] " maiku.fabian at gmail dot com
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-12 13:00 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
Bug ID: 32168
Summary: Update locale data to Unicode 16.0.0
Product: glibc
Version: 2.41
Status: NEW
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: maiku.fabian at gmail dot com
CC: libc-locales at sourceware dot org
Target Milestone: ---
Now that Unicode 16.0.0 is released, the locale data that was updated for
Unicode 15.1.0 (bug 30854) should be updated to 16.0.0.
http://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug localedata/32168] Update locale data to Unicode 16.0.0
From: maiku.fabian at gmail dot com @ 2024-09-12 13:00 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |maiku.fabian at gmail dot com
Assignee|unassigned at sourceware dot org |maiku.fabian at gmail dot com
* [Bug localedata/32168] Update locale data to Unicode 16.0.0
From: maiku.fabian at gmail dot com @ 2024-09-16 12:07 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Patch:
https://patchwork.sourceware.org/project/glibc/patch/20240913172844.120480-1-mfabian@redhat.com/
* [Bug localedata/32168] Update locale data to Unicode 16.0.0
From: maiku.fabian at gmail dot com @ 2024-09-23 9:28 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |2.41
* [Bug localedata/32168] Update locale data to Unicode 16.0.0
From: cvs-commit at gcc dot gnu.org @ 2024-09-27 14:12 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
--- Comment #2 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Mike Fabian <mfabian@sourceware.org>:
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6
commit a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6
Author: Mike FABIAN <mfabian@redhat.com>
Date: Thu Sep 12 15:02:55 2024 +0200
Update to Unicode 16.0.0 [BZ #32168]
Unicode 16.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 16.0.0, using
the generator scripts contributed by Mike FABIAN (Red Hat).
Changes in CHARMAP and WIDTH:
Total added characters in newly generated CHARMAP: 5185
Total removed characters in newly generated WIDTH: 1
Total added characters in newly generated WIDTH: 170
The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
It changed like this:
UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;
EastAsianWidth.txt 15.1.0: 1171D..1171F ; N # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
EastAsianWidth.txt 16.0.0: 1171E ; N # Mc AHOM CONSONANT SIGN MEDIAL RA
I.e., it changed from Mn (Mark, Nonspacing) to Mc (Mark, Spacing
Combining). So it should now have width 1 instead of 0. It is therefore
OK that it was removed from WIDTH, because characters not in WIDTH get
width 1 by default.
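The reasoning above can be sketched in a few lines of Python. This is only an illustration, not the glibc generator code: `parse_unicode_data_line` and `default_width` are hypothetical helpers, and the rule "Mn gets width 0, everything else gets width 1" is a deliberate simplification of whatever the real script does.

```python
# Illustrative sketch only; helper names are hypothetical and the width
# rule is simplified to the single case discussed in the commit message.

def parse_unicode_data_line(line):
    """Return (code point, name, general category) from a UnicodeData.txt line."""
    fields = line.split(";")
    return int(fields[0], 16), fields[1], fields[2]

def default_width(general_category):
    """Assumed rule: nonspacing marks (Mn) are zero width, all else width 1."""
    return 0 if general_category == "Mn" else 1

# The two UnicodeData.txt entries quoted in the commit message.
OLD = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;"
NEW = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;"

for version, line in (("15.1.0", OLD), ("16.0.0", NEW)):
    code_point, name, category = parse_unicode_data_line(line)
    print(f"{version}: U+{code_point:04X} {name} ({category}) "
          f"-> width {default_width(category)}")
```

Under that assumed rule the 15.1.0 entry yields width 0 and the 16.0.0 entry yields width 1, which is consistent with the character no longer needing an explicit WIDTH entry.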
Nothing suspicious when browsing the list of the 170 added characters.
Changes in ctype:
alpha: Added 4452 characters in new ctype which were not in old ctype
combining: Added 51 characters in new ctype which were not in old ctype
combining_level3: Added 43 characters in new ctype which were not in old ctype
graph: Added 5185 characters in new ctype which were not in old ctype
lower: Added 25 characters in new ctype which were not in old ctype
print: Added 5185 characters in new ctype which were not in old ctype
punct: Missing 33 characters of old ctype in new ctype
punct: Added 766 characters in new ctype which were not in old ctype
tolower: Added 27 characters in new ctype which were not in old ctype
totitle: Added 27 characters in new ctype which were not in old ctype
toupper: Added 27 characters in new ctype which were not in old ctype
upper: Added 27 characters in new ctype which were not in old ctype
Nothing suspicious in the additions.
About the 33 characters removed from `punct`:
U+0363 - U+036F are identical in UnicodeData.txt. Difference in
DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: not there.
DerivedCoreProperties.txt 16.0.0: 0363..036F ; Alphabetic # Mn [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X
So that's the reason why they are added to `alpha` and removed from
`punct`.
Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there
is a difference in DerivedCoreProperties.txt:
DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4 ; Alphabetic # Mn [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4 ; Alphabetic # Mn [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
So they became `Alphabetic` and were thus added to `alpha` and removed from
`punct`.
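As a quick cross-check of the count of 33, the Alphabetic ranges quoted above can be compared directly. The sketch below is not part of the glibc scripts; `parse_range` is a hypothetical helper that only handles the `XXXX..YYYY ; Property` form shown in the DerivedCoreProperties.txt excerpts, with the trailing comments dropped.

```python
# Illustrative sketch only; parse_range is a hypothetical helper.

def parse_range(entry):
    """Parse 'XXXX..YYYY ; Property' or 'XXXX ; Property' into a range of code points."""
    code_points = entry.split(";")[0].strip()
    first, _, last = code_points.partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start
    return range(start, end + 1)

# Alphabetic ranges as quoted in the commit message.
new_0363 = set(parse_range("0363..036F ; Alphabetic"))  # absent in 15.1.0
old_1dxx = set(parse_range("1DE7..1DF4 ; Alphabetic"))  # 15.1.0
new_1dxx = set(parse_range("1DD3..1DF4 ; Alphabetic"))  # 16.0.0

# Code points that are Alphabetic in 16.0.0 but were not in 15.1.0.
newly_alphabetic = new_0363 | (new_1dxx - old_1dxx)

print(len(new_0363), len(new_1dxx - old_1dxx), len(newly_alphabetic))
```

The two groups contain 13 and 20 code points (U+0363 to U+036F and U+1DD3 to U+1DE6), which together account for the 33 characters reported as missing from `punct`.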
Resolves: BZ #32168
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* [Bug localedata/32168] Update locale data to Unicode 16.0.0
From: maiku.fabian at gmail dot com @ 2024-09-27 14:13 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=32168
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #3 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Fixed in glibc master branch.