public inbox for glibc-bugs@sourceware.org
* [Bug localedata/17588] New: Update UTF-8 charmap and width to Unicode 7.0.0
@ 2014-11-12 10:11 pravin.d.s at gmail dot com
  2014-11-12 11:19 ` [Bug localedata/17588] " pravin.d.s at gmail dot com
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: pravin.d.s at gmail dot com @ 2014-11-12 10:11 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17588

            Bug ID: 17588
           Summary: Update UTF-8 charmap and width to Unicode 7.0.0
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: pravin.d.s at gmail dot com
                CC: libc-locales at sourceware dot org

Forked from #14094. It is good to have separate bugs for the UTF-8 and i18n file
updates; tracking changes and issues will be clearer in the long term.
*************************************************************
 Joseph Myers 2012-05-10 20:27:32 UTC

The Unicode locale data - character map and LC_CTYPE information - should be
updated to Unicode 6.1 (the character map is currently based on 6.0, and
LC_CTYPE is currently based on 5.0).  This should be done with proper
automation, and wiki documentation should be added on how to do future
updates.  I identified the following tasks at
<http://sourceware.org/ml/libc-alpha/2012-05/msg00590.html>:

* Ensure the character type data in localedata/locales/i18n can be
  properly reproduced from Unicode 5.0 data using gen-unicode-ctype.c,
  adapting gen-unicode-ctype.c as needed to replicate any changes that
  may have been made not using that program.

* Update the character type data to Unicode 6.1, removing any local
  hacks from gen-unicode-ctype.c that are no longer needed.
  (10646:2012, corresponding to Unicode 6.1, appears to be in
  publication stage so should be out very soon.)

* Ensure the character data in localedata/charmaps/UTF-8 can be
  reproduced in some automated fashion from Unicode 6.0, locating any
  previously used automation for this or creating some new automation
  if any previous automation can't be found.

* Update the character data to Unicode 6.1, removing any local hacks
  in the automation from the previous step.

* Document thoroughly on the wiki how the automation works and how to
  do updates to new Unicode versions.
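
As a rough illustration of the charmap-regeneration task in the list above, the
following Python sketch turns one UnicodeData.txt record into a line shaped like
the entries in localedata/charmaps/UTF-8. The function names, the column
spacing, and the choice of 4 versus 8 hex digits in the symbolic name are
assumptions made for illustration; this is not the automation that glibc
actually uses.

```python
def utf8_escape(cp):
    """Return the UTF-8 bytes of code point cp in /xNN notation."""
    return "".join("/x%02x" % b for b in chr(cp).encode("utf-8"))


def charmap_line(record):
    """Convert one semicolon-separated UnicodeData.txt record to a charmap-style line.

    Returns None for surrogate code points, which Python refuses to
    encode as UTF-8.
    """
    fields = record.split(";")
    cp = int(fields[0], 16)
    name = fields[1]
    if 0xD800 <= cp <= 0xDFFF:
        return None
    # Assumption: 4 hex digits inside the BMP, 8 hex digits beyond it.
    symbol = "<U%04X>" % cp if cp <= 0xFFFF else "<U%08X>" % cp
    return "%-11s %-12s %s" % (symbol, utf8_escape(cp), name)


if __name__ == "__main__":
    print(charmap_line("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"))
```

Generating every line from the Unicode data file in this way is what would make
the result reproducible, so a diff against the existing charmap could expose any
hand edits.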

Comment 1 Rich Felker 2012-05-11 03:25:47 UTC

One of the major "local hacks" can be fixed, fixing many other problems at the
same time, by switching to using the Unicode "Alphabetic" property (from
DerivedCoreProperties.txt) instead of just categories L* for class alpha. Right
now there are many languages whose letters are considered non-alphabetic by
glibc because they're in category Mn or Mc or even Cf. There are "local hacks"
to fix this for maybe one or two languages, but using the right Unicode
property would fix it for all languages.
*******************************************************
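
The difference described in Comment 1 can be seen with a small Python sketch. It
parses lines in the DerivedCoreProperties.txt format to collect the code points
that carry the Alphabetic property, and compares that with a test based only on
general categories L*. The two sample lines are written in that file's format
for illustration, and the helper names are hypothetical; gen-unicode-ctype.c
itself is a C program and is not reproduced here.

```python
import unicodedata

# Illustrative lines in the DerivedCoreProperties.txt format (not the full file).
SAMPLE = """\
0041..005A    ; Alphabetic # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
093E..0940    ; Alphabetic # Mc   [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
"""


def parse_property(text, prop):
    """Collect the set of code points that carry property prop."""
    result = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        parts = data.split(";")
        rng, name = parts[0].strip(), parts[1].strip()
        if name != prop:
            continue
        first, _, last = rng.partition("..")
        lo = int(first, 16)
        hi = int(last or first, 16)  # single code point if there is no ".."
        result.update(range(lo, hi + 1))
    return result


def is_alpha_by_category(cp):
    """Category-only test: true when the general category starts with L."""
    return unicodedata.category(chr(cp)).startswith("L")


alphabetic = parse_property(SAMPLE, "Alphabetic")
cp = 0x093E  # DEVANAGARI VOWEL SIGN AA, general category Mc
print(is_alpha_by_category(cp), cp in alphabetic)
```

For U+093E the category-only test is false while the property-based lookup is
true, which is the kind of character the comment says is wrongly treated as
non-alphabetic.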

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From: pravin.d.s at gmail dot com @ 2014-11-12 10:13 UTC
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=17588

Pravin S <pravin.d.s at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |14094

Thread overview: 13+ messages
2014-11-12 10:11 [Bug localedata/17588] New: Update UTF-8 charmap and width to Unicode 7.0.0 pravin.d.s at gmail dot com
2014-11-12 11:19 ` [Bug localedata/17588] " pravin.d.s at gmail dot com
2014-11-12 11:22 ` pravin.d.s at gmail dot com
2014-11-21  6:27 ` pravin.d.s at gmail dot com
2014-11-21  7:35 ` maiku.fabian at gmail dot com
2014-11-21 16:49 ` maiku.fabian at gmail dot com
2014-11-24 16:34 ` pravin.d.s at gmail dot com
2014-12-01 11:49 ` maiku.fabian at gmail dot com
2014-12-01 11:54 ` pravin.d.s at gmail dot com
2014-12-03  7:17 ` maiku.fabian at gmail dot com
2014-12-12 11:31 ` pravin.d.s at gmail dot com
2015-02-20 22:36 ` cvs-commit at gcc dot gnu.org
2015-02-21  0:06 ` aoliva at sourceware dot org
