From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <glibc-bugs-return-25775-listarch-glibc-bugs=sources.redhat.com@sourceware.org>
Received: (qmail 10212 invoked by alias); 4 Jul 2014 09:13:39 -0000
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <glibc-bugs.sourceware.org>
List-Subscribe: <mailto:glibc-bugs-subscribe@sourceware.org>
List-Post: <mailto:glibc-bugs@sourceware.org>
List-Help: <mailto:glibc-bugs-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: glibc-bugs-owner@sourceware.org
Received: (qmail 10103 invoked by uid 48); 4 Jul 2014 09:13:27 -0000
From: "pravin.d.s at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug localedata/14094] Update locale data to Unicode 7.0.0
Date: Fri, 04 Jul 2014 09:13:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: 2.15
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: pravin.d.s at gmail dot com
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: pravin.d.s at gmail dot com
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: security-
X-Bugzilla-Changed-Fields: attachments.created
Message-ID: <bug-14094-131-5NpSSjUuSu@http.sourceware.org/bugzilla/>
In-Reply-To: <bug-14094-131@http.sourceware.org/bugzilla/>
References: <bug-14094-131@http.sourceware.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-07/txt/msg00558.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=14094
--- Comment #13 from Pravin S <pravin.d.s at gmail dot com> ---
Created attachment 7679
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7679&action=edit
Patch to update UTF-8 CHARMAP to unicode 7.0

I have worked on updating the UTF-8 file to Unicode 7.0. Please note the
following points before reviewing this patch.

  1. The present patch covers only CHARMAP; a patch updating WIDTH will be
available soon.
  2. utf8-gen.py: new script that generates the UTF-8 file.
  3. The patch was created ignoring whitespace changes (-w).
  4.
   ''' Where the UnicodeData.txt file gives characters as a range
    Example:
    3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
    4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;

    The UTF-8 file lists such a range in chunks of 0x40 code points: each
    line runs from a start code point to that code point plus 0x3F, and the
    final line ends at the Last Unicode character.
    Example:
    <U3400>..<U343F>     /xe3/x90/x80         <CJK Ideograph Extension A>
    .
    .
    <U4D80>..<U4DB5>     /xe4/xb6/x80         <CJK Ideograph Extension A>

    Note: No idea why the Hangul syllables AC00..D7A3 were not expanded in
    the Unicode 5.0 UTF-8 file. For consistency, we expand Hangul as well.
    '''

  5. Names have changed in UnicodeData.txt in some cases.
    ''' Some characters have <control> as their name, so the "Unicode 1.0
    Name" is used instead.
     Characters U+0080, U+0081, U+0084 and U+0099 have "<control>" as their
    name and no "Unicode 1.0 Name" (10th field) in UnicodeData.txt either.
     We can write code to take their alternate names from NameAliases.txt '''
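
To illustrate point 4, here is a minimal sketch of how a First/Last range
could be expanded into 0x40-sized charmap lines. It is not the code from
utf8-gen.py; the function names (ucs_symbol, utf8_escaped, expand_range), the
column widths and the 8-digit symbol form for code points above U+FFFF are
assumptions made for this example.

```python
def ucs_symbol(code_point):
    """Return a charmap symbol such as <U3400>.

    Assumption: code points above U+FFFF use an 8-digit form.
    """
    if code_point < 0x10000:
        return '<U{:04X}>'.format(code_point)
    return '<U{:08X}>'.format(code_point)


def utf8_escaped(code_point):
    """Return the UTF-8 bytes of a code point in /xNN notation.

    Note: surrogate code points cannot be encoded this way in Python.
    """
    encoded = chr(code_point).encode('utf-8')
    return ''.join('/x{:02x}'.format(byte) for byte in encoded)


def expand_range(first, last, name, step=0x40):
    """Split first..last into lines that each cover at most `step` code points.

    Only the bytes of the first code point of each chunk are written,
    as in the charmap lines quoted in point 4.
    """
    lines = []
    for start in range(first, last + 1, step):
        end = min(start + step - 1, last)  # the last chunk may be shorter
        symbols = '{}..{}'.format(ucs_symbol(start), ucs_symbol(end))
        lines.append('{:<20s} {:<20s} <{}>'.format(
            symbols, utf8_escaped(start), name))
    return lines


if __name__ == '__main__':
    for line in expand_range(0x3400, 0x4DB5, 'CJK Ideograph Extension A'):
        print(line)
```

For 3400..4DB5 this gives lines starting at <U3400>..<U343F> and ending at
<U4D80>..<U4DB5>, matching the example in point 4.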
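
For point 5, the name fallback could look roughly like the sketch below. This
is only an assumption about how it might be done, not what the attached patch
does: the helper names (parse_name_aliases, charmap_name) are hypothetical, and
taking the first alias listed for a code point is an arbitrary choice. The
sample lines follow the semicolon-separated layouts of UnicodeData.txt and
NameAliases.txt.

```python
def parse_name_aliases(lines):
    """Map code point -> first alias found in NameAliases.txt-style lines."""
    aliases = {}
    for line in lines:
        line = line.split('#', 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, alias, _kind = line.split(';')
        # Assumption: keep the first alias listed for each code point.
        aliases.setdefault(int(code, 16), alias)
    return aliases


def charmap_name(unicode_data_line, aliases):
    """Pick a name: Name, else Unicode 1.0 Name, else an alias."""
    fields = unicode_data_line.strip().split(';')
    code_point = int(fields[0], 16)
    name = fields[1]
    unicode1_name = fields[10]
    if name != '<control>':
        return name
    if unicode1_name:
        return unicode1_name
    # Neither name is usable; fall back to NameAliases.txt if possible.
    return aliases.get(code_point, name)


if __name__ == '__main__':
    aliases = parse_name_aliases([
        '0080;PADDING CHARACTER;figment',
        '0080;PAD;abbreviation',
    ])
    print(charmap_name('0000;<control>;Cc;0;BN;;;;;N;NULL;;;;', aliases))
    print(charmap_name('0080;<control>;Cc;0;BN;;;;;N;;;;;', aliases))
```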

-- 
You are receiving this mail because:
You are on the CC list for the bug.