public inbox for libc-locales@sourceware.org
* [Bug localedata/32168] New: Update locale data to Unicode 16.0.0
@ 2024-09-12 13:00 maiku.fabian at gmail dot com
  2024-09-12 13:00 ` [Bug localedata/32168] " maiku.fabian at gmail dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-12 13:00 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

            Bug ID: 32168
           Summary: Update locale data to Unicode 16.0.0
           Product: glibc
           Version: 2.41
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: maiku.fabian at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

Now that Unicode 16.0.0 is released, the locale data that was updated for
Unicode 15.1.0 (bug 30854) should be updated to 16.0.0.

http://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug localedata/32168] Update locale data to Unicode 16.0.0
  2024-09-12 13:00 [Bug localedata/32168] New: Update locale data to Unicode 16.0.0 maiku.fabian at gmail dot com
@ 2024-09-12 13:00 ` maiku.fabian at gmail dot com
  2024-09-16 12:07 ` maiku.fabian at gmail dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-12 13:00 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maiku.fabian at gmail dot com
           Assignee|unassigned at sourceware dot org   |maiku.fabian at gmail dot com


* [Bug localedata/32168] Update locale data to Unicode 16.0.0
  2024-09-12 13:00 [Bug localedata/32168] New: Update locale data to Unicode 16.0.0 maiku.fabian at gmail dot com
  2024-09-12 13:00 ` [Bug localedata/32168] " maiku.fabian at gmail dot com
@ 2024-09-16 12:07 ` maiku.fabian at gmail dot com
  2024-09-23  9:28 ` maiku.fabian at gmail dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-16 12:07 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Patch: 

https://patchwork.sourceware.org/project/glibc/patch/20240913172844.120480-1-mfabian@redhat.com/


* [Bug localedata/32168] Update locale data to Unicode 16.0.0
  2024-09-12 13:00 [Bug localedata/32168] New: Update locale data to Unicode 16.0.0 maiku.fabian at gmail dot com
  2024-09-12 13:00 ` [Bug localedata/32168] " maiku.fabian at gmail dot com
  2024-09-16 12:07 ` maiku.fabian at gmail dot com
@ 2024-09-23  9:28 ` maiku.fabian at gmail dot com
  2024-09-27 14:12 ` cvs-commit at gcc dot gnu.org
  2024-09-27 14:13 ` maiku.fabian at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-23  9:28 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.41


* [Bug localedata/32168] Update locale data to Unicode 16.0.0
  2024-09-12 13:00 [Bug localedata/32168] New: Update locale data to Unicode 16.0.0 maiku.fabian at gmail dot com
                   ` (2 preceding siblings ...)
  2024-09-23  9:28 ` maiku.fabian at gmail dot com
@ 2024-09-27 14:12 ` cvs-commit at gcc dot gnu.org
  2024-09-27 14:13 ` maiku.fabian at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-09-27 14:12 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

--- Comment #2 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Mike Fabian <mfabian@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6

commit a7b5eb821d48b0cb14d0c0d2706410d4f7838cf6
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Thu Sep 12 15:02:55 2024 +0200

    Update to Unicode 16.0.0 [BZ #32168]

    Unicode 16.0.0 Support: Character encoding, character type info, and
    transliteration tables are all updated to Unicode 16.0.0, using
    the generator scripts contributed by Mike FABIAN (Red Hat).

    Changes in CHARMAP and WIDTH:

        Total added characters in newly generated CHARMAP: 5185
        Total removed characters in newly generated WIDTH: 1
        Total added characters in newly generated WIDTH: 170

    The removed character from WIDTH is U+1171E AHOM CONSONANT SIGN MEDIAL RA.
    It changed like this:

    UnicodeData.txt 15.1.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;
    UnicodeData.txt 16.0.0: 1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;

    EastAsianWidth.txt 15.1.0: 1171D..1171F   ; N  # Mn     [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
    EastAsianWidth.txt 16.0.0: 1171E          ; N  # Mc         AHOM CONSONANT SIGN MEDIAL RA

    I.e., it changed from Mn (Mark, Nonspacing) to Mc (Mark, Spacing
    Combining), so it should now have width 1 instead of 0. Its removal
    from WIDTH is therefore correct: characters not listed in WIDTH get
    width 1 by default.
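
The width reasoning above can be sketched in a few lines of Python. This is only an illustration under a simplified assumption (categories Mn, Me and Cf get width 0, everything else falls back to the default width 1); it is not the glibc generator script, and it ignores East Asian wide characters. The names `general_category`, `sketch_width` and `ZERO_WIDTH_CATEGORIES` are hypothetical.

```python
# Simplified illustration only; this is NOT the glibc generator script.
# Assumption: non-spacing categories get width 0, everything else width 1.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def general_category(unicode_data_line):
    """Return the General_Category (third field) of a UnicodeData.txt line."""
    return unicode_data_line.split(";")[2]


def sketch_width(unicode_data_line):
    """Width 0 for the assumed zero-width categories, else the default 1."""
    if general_category(unicode_data_line) in ZERO_WIDTH_CATEGORIES:
        return 0
    return 1


# The two UnicodeData.txt entries for U+1171E quoted in the commit message.
LINE_15_1 = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mn;0;NSM;;;;;N;;;;;"
LINE_16_0 = "1171E;AHOM CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;"

print(sketch_width(LINE_15_1), sketch_width(LINE_16_0))  # prints "0 1"
```

Under this rule the 15.1.0 entry needs an explicit width 0 entry, while the 16.0.0 entry needs no entry at all, which matches the single removal reported for WIDTH.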

    Nothing suspicious when browsing the list of the 170 added characters.

    Changes in ctype:

        alpha: Added 4452 characters in new ctype which were not in old ctype
        combining: Added 51 characters in new ctype which were not in old ctype
        combining_level3: Added 43 characters in new ctype which were not in old ctype
        graph: Added 5185 characters in new ctype which were not in old ctype
        lower: Added 25 characters in new ctype which were not in old ctype
        print: Added 5185 characters in new ctype which were not in old ctype
        punct: Missing 33 characters of old ctype in new ctype
        punct: Added 766 characters in new ctype which were not in old ctype
        tolower: Added 27 characters in new ctype which were not in old ctype
        totitle: Added 27 characters in new ctype which were not in old ctype
        toupper: Added 27 characters in new ctype which were not in old ctype
        upper: Added 27 characters in new ctype which were not in old ctype

    Nothing suspicious in the additions.

    About the 33 characters removed from `punct`:

    U+0363 - U+036F are identical in UnicodeData.txt. Difference in
    DerivedCoreProperties.txt:

    DerivedCoreProperties.txt 15.1.0: not there.
    DerivedCoreProperties.txt 16.0.0: 0363..036F    ; Alphabetic # Mn  [13] COMBINING LATIN SMALL LETTER A..COMBINING LATIN SMALL LETTER X

    So that's the reason why they are added to `alpha` and removed from
    `punct`.

    Same for U+1DD3 - U+1DE6, they are identical in UnicodeData.txt but there
    is a difference in DerivedCoreProperties.txt:

    DerivedCoreProperties.txt 15.1.0: 1DE7..1DF4    ; Alphabetic # Mn  [14] COMBINING LATIN SMALL LETTER ALPHA..COMBINING LATIN SMALL LETTER U WITH DIAERESIS
    DerivedCoreProperties.txt 16.0.0: 1DD3..1DF4    ; Alphabetic # Mn  [34] COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE..COMBINING LATIN SMALL LETTER U WITH DIAERESIS

    So they became `Alphabetic` and were thus added to `alpha` and removed from
    `punct`.
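
As a cross-check of the count of 33, the ranges quoted above can be compared directly. The sketch below uses only the three range lines cited in this message (the real DerivedCoreProperties.txt files contain many more entries), and the helper name `parse_range` is hypothetical.

```python
def parse_range(line):
    """Parse 'XXXX..YYYY ; Property # comment' into a set of code points."""
    field = line.split(";")[0].strip()
    first, _, last = field.partition("..")
    last = last or first  # a single code point has no '..' part
    return set(range(int(first, 16), int(last, 16) + 1))


# Only the Alphabetic ranges quoted in the commit message, not the full files.
ALPHA_15_1 = parse_range("1DE7..1DF4    ; Alphabetic # Mn  [14]")
ALPHA_16_0 = (
    parse_range("0363..036F    ; Alphabetic # Mn  [13]")
    | parse_range("1DD3..1DF4    ; Alphabetic # Mn  [34]")
)

# Code points that are Alphabetic in 16.0.0 but were not in 15.1.0.
newly_alphabetic = ALPHA_16_0 - ALPHA_15_1
print(len(newly_alphabetic))  # prints 33
```

The 13 code points U+0363 - U+036F plus the 20 code points U+1DD3 - U+1DE6 add up to the 33 characters reported as missing from `punct`.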

    Resolves: BZ #32168

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>


* [Bug localedata/32168] Update locale data to Unicode 16.0.0
  2024-09-12 13:00 [Bug localedata/32168] New: Update locale data to Unicode 16.0.0 maiku.fabian at gmail dot com
                   ` (3 preceding siblings ...)
  2024-09-27 14:12 ` cvs-commit at gcc dot gnu.org
@ 2024-09-27 14:13 ` maiku.fabian at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-09-27 14:13 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=32168

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Fixed in glibc master branch.

