From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 51430 invoked by alias); 10 Sep 2015 05:52:26 -0000
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: libc-locales-owner@sourceware.org
Received: (qmail 50866 invoked by uid 55); 10 Sep 2015 05:51:50 -0000
From: "keld at keldix dot com"
To: libc-locales@sourceware.org
Subject: [Bug localedata/18943] Collation of NFD strings
Date: Thu, 10 Sep 2015 05:52:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: localedata
X-Bugzilla-Version: 2.22
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: keld at keldix dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-q3/txt/msg00179.txt.bz2

https://sourceware.org/bugzilla/show_bug.cgi?id=18943

--- Comment #1 from keld at keldix dot com ---
On Wed, Sep 09, 2015 at 07:46:02PM +0000, egmont at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=18943
>
> Bug ID: 18943
> Summary: Collation of NFD strings
> Product: glibc
> Version: 2.22
> Status: NEW
> Severity: enhancement
> Priority: P2
> Component: localedata
> Assignee: unassigned at sourceware dot org
> Reporter: egmont at gmail dot com
> CC: libc-locales at sourceware dot org
> Target Milestone: ---
>
> Forking off from bug 18927 comment 8 & 11:
>
> Collate definitions currently assume the input to be in NFC.
> If the available UTF-8 unittests are converted to NFD (the localedata/*.in
> files which have UTF-8 in Makefile's test-input) then they fail.
>
> It would be nice to automatically make normalization the lowest priority factor
> when deciding on collation, so that different normalizations of the same word
> are as close to each other as possible. That is, to implement it once (e.g. in
> iso14651_common) without having to modify individual locale definitions.

Both NFC and NFD data should collate as expected. And you can mix them as
you like; you do not need to normalize them.

Best regards
keld

--
You are receiving this mail because:
You are on the CC list for the bug.